The objective of the data processing community is to enable the transformation of research data so that it can be interpreted more readily for a specific purpose.
The roles and behaviour of the data processing community can be summarized as follows:
A service provider offers a processor service that can be instantiated, configured and deployed. The process consumer requests an instance of the processor and configures it by specifying the data source(s) and the transformation parameters. The process consumer then requests and receives the result.
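The interaction summarized above can be sketched in a few lines of Python. This is only an illustrative sketch, not a prescribed implementation: the class names `ServiceProvider` and `Processor`, the methods `request_instance`, `configure` and `result`, and the single numeric `parameter` are all assumptions introduced for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class Processor:
    """Hypothetical processor instance: applies one transformation to its data sources."""
    transform: Callable[[float, float], float]
    sources: List[Iterable[float]] = field(default_factory=list)
    parameter: Optional[float] = None

    def configure(self, sources: List[Iterable[float]], parameter: float) -> None:
        # The consumer specifies the data source(s) and the transformation parameter.
        self.sources = list(sources)
        self.parameter = parameter

    def result(self) -> List[float]:
        # The consumer requests the result; an unconfigured processor refuses.
        if self.parameter is None:
            raise RuntimeError("processor must be configured before use")
        return [self.transform(x, self.parameter) for src in self.sources for x in src]


class ServiceProvider:
    """Hypothetical provider that hands out fresh processor instances."""

    def __init__(self, transform: Callable[[float, float], float]) -> None:
        self._transform = transform

    def request_instance(self) -> Processor:
        return Processor(transform=self._transform)


# Process consumer side: request an instance, configure it, then fetch the result.
provider = ServiceProvider(lambda value, factor: value * factor)
processor = provider.request_instance()
processor.configure(sources=[[1.0, 2.0], [3.0]], parameter=10.0)
output = processor.result()
```

Keeping configuration separate from instantiation mirrors the text: the same processor service can be instantiated many times, and each instance is bound to its own data sources and parameters.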
The following roles are identified in the Data Processing Community:
Data Processing subsystem (passive role): the community component representing the data processing community. It responds to a wide range of data processing requests, such as analysing, aggregating or visualising data.
Process Designer (active role): an agent who builds, maintains and configures a processor.
Processor (passive role): a system and/or algorithm used by the Process Consumer that transforms data.
Process Consumer (passive/active role): an agent that is responsible for triggering the (possibly compound) Service of processing data, and for receiving and ultimately analysing the output of the Service.
The following behaviour is identified in the Data Processing Community:
Deploy Process, the behaviour performed by the process consumer, the process designer and the processor whereby the processor is requested, instantiated, configured (including, among other things, the specification of the data source) and deployed.
Collect Data, the behaviour performed by the processor and the data target whereby the data target requests and receives the resulting data from the processor. The resulting data can be a visualisation.
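The two behaviours can be illustrated with a small Python sketch in which each participating role is a separate object. Everything named here is hypothetical and chosen only for illustration: the classes `ProcessDesigner`, `Processor` and `DataTarget`, the function `deploy_process`, and the word-count algorithm standing in for an arbitrary transformation.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional


class Processor:
    """Hypothetical passive role: transforms data once it has been deployed."""

    def __init__(self, algorithm: Callable[[List[str]], Dict[str, int]]) -> None:
        self.algorithm = algorithm
        self.source: Optional[List[str]] = None
        self.deployed = False

    def run(self) -> Dict[str, int]:
        if not self.deployed or self.source is None:
            raise RuntimeError("processor has not been deployed")
        return self.algorithm(self.source)


class ProcessDesigner:
    """Hypothetical active role: builds the processor (here, a word counter)."""

    def build(self) -> Processor:
        return Processor(algorithm=lambda words: dict(Counter(words)))


class DataTarget:
    """Hypothetical recipient of the resulting data."""

    def __init__(self) -> None:
        self.received: Optional[Dict[str, int]] = None

    def collect(self, processor: Processor) -> Dict[str, int]:
        # Collect Data: the data target requests and receives the result.
        self.received = processor.run()
        return self.received


def deploy_process(designer: ProcessDesigner, source: List[str]) -> Processor:
    # Deploy Process: the processor is requested and instantiated,
    # configured with its data source, and then marked as deployed.
    processor = designer.build()
    processor.source = list(source)
    processor.deployed = True
    return processor


deployed = deploy_process(ProcessDesigner(), ["a", "b", "a"])
target = DataTarget()
result = target.collect(deployed)
```

In this sketch the data target is modelled separately from the process consumer, since the text names it as a distinct participant in Collect Data; in practice the two may coincide.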
Communities that typically collaborate with the data processing community are: the data creation community, e.g. for cleaning data or for named-entity recognition (which creates new data); the data management community, e.g. for providing format transformations; the data provisioning community, for acting as a data source; the data identification community, e.g. for detecting references or fetching remote data sources; and the user authentication community, for authenticating the process consumer for access to the service and the data source.