"The IDSA Rulebook serves several purposes regarding the development and operation of data spaces. The aim is to describe clearly which rules are mandatory and which are optional guidelines. This governance framework includes functional, technical, operational, and legal dimensions:
Guidelines for the functionality of common services are presented as well as the definition, processes, and services of specific roles.
Guidelines on how to implement or use a technical artifact of the IDSA.
Guidelines for the work and collaboration within data spaces.
Guidelines for the legal basis in compliance with the regulatory environment to ensure trust and security."
"The rulebook model for a fair data economy is a guide for creators of fair data economy data spaces. Agreement templates and other tools make it easier to build and join new data spaces, while highlighting transparency in data sharing.
The rulebook model contains:
extensive instructions for setting up a data space
a glossary
data space canvas
business, governance, legal, and technical checklists with a range of control questions
an ethical maturity model for defining a code of conduct
Rolebook and Servicebook tools to define roles and services in the data space
agreement templates."
Initial rulebook overview, steps to implementation and requirements
Rulebook overview
A dataspace rulebook is a collection of documents that establish a governance framework and general terms and conditions for fair, secure and legally sound multilateral data sharing among participants within a dataspace and optionally with other dataspaces.
Dataspaces do not provide a simple plug-and-play solution. Those embarking on a dataspace need to make some fundamental design choices and agree on the evolving legal and operational frameworks that the dataspace will operate under.
A rulebook provides a comprehensive structured approach for anyone looking to build or participate in data ecosystems where maintaining data sovereignty is a priority. It provides concrete guidance on how to establish governance (Dataspace Governance Authority - DSGA), defining the roles, responsibilities and interactions of participants.
The IDSA rulebook grounds the technical and organisational framework within the evolving European context, and therefore requires considerable adaptation for the Australian context. An adapted Australian rulebook will provide a set of blueprints to enable trusted data sharing while keeping control of your data, addressing both the technical and non-technical challenges like governance, trust, and importantly Australian legal compliance.
Steps to creating a dataspace from scratch (many of these components are described below)
Define its purpose
Understand the needs of intended participants (this guides the architecture you choose: centralised, federated, or hybrid) and design all the functional elements:
a. Do participants need maximum autonomy and sovereignty?
b. Is some level of central control required or desired?
c. What kind of technical maturity do participants need?
d. What data will be shared?
Is it going to be a closed invitation only group, or open to anyone who meets the criteria?
Will these be centralised, federated or de-centralised (data, catalogue, & clearing house)?
Establish the DSGA, publish the DSSD (Dataspace Self-Description)
Establish method to find the dataspace
Build registration service – application, reviewing and issuing credentials
Four layers of Governance (from foundational = 1, to specific = 4)
1. Soft infrastructure – Foundational generic building blocks, common legal basis, framework documents
2. Dataspace domain governance – Sector-specific principles (e.g. rules for biosecurity data), semantic & legal interoperability (allowing for some variation but maximising connectivity)
3. Dataspace ecosystem governance – Defines the rules for one specific instance of a dataspace, rules for trust within the group and how it might interact with other dataspaces.
4. Dataspace instance governance – The actual execution layer, the operational part that implements and enforces the rules for the specific dataspace.
The concepts required for any dataspace to operate
I. How you establish trust between participants
A. Who are dataspace participants
Data consumer – the participant that will use the data
Data provider – the entity that holds the data and has the rights to access and share those data
Service providers – offer optional value enhancements to the dataspace. Some are services mandated by the DSGA, such as identity providers or trust services; others are intermediary services that help establish relationships for data sharing (brokerage services, regulated intermediaries, value-added services) or that process the data, analyse it, or provide additional services using the dataspace functions, e.g. a data analysis platform
B. How do you become a dataspace participant? (a structured process to meet membership policies)
First discover the dataspace (central registry, website, word of mouth)
Read the Dataspace Self-Description (it describes the rules, the required attributes, the accepted trust anchors [entities that can reliably certify an attribute], and the technical setup needed)
Self-evaluate whether you meet the requirements
If you think you meet the requirements, you apply for membership through a membership service endorsed by the DSGA, which requests and verifies participant attributes (credentials, technical readiness, legal audit, background checks etc.)
Once DSGA is satisfied, you are issued with a membership credential for that specific dataspace
Only then does the participant set up the required technical components (e.g. a connector) to interact with the dataspace
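The onboarding steps above hinge on a comparison between the dataspace's published requirements and an applicant's attributes. The sketch below illustrates that self-evaluation step; all field names are hypothetical, not taken from any IDSA specification.

```python
# Hypothetical sketch: comparing an applicant's attributes against the
# requirements published in a Dataspace Self-Description (DSSD).
# Field names (iso27001_certified, jurisdiction, connector) are invented.

def missing_attributes(dssd_required: dict, applicant_attrs: dict) -> list:
    """Return the required attributes the applicant does not satisfy."""
    missing = []
    for attr, required_value in dssd_required.items():
        if applicant_attrs.get(attr) != required_value:
            missing.append(attr)
    return missing

dssd = {"iso27001_certified": True,
        "jurisdiction": "AU",
        "connector": "dataspace-protocol"}
applicant = {"iso27001_certified": True, "jurisdiction": "AU"}

print(missing_attributes(dssd, applicant))  # → ['connector']
```

An empty result would mean the applicant can proceed to the formal membership application; here the missing connector entry signals the technical setup still to be done.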
C. Attribute-based trust: attributes such as name or profession; verified attributes such as ISO 27001 certification or accredited-researcher status
How are attributes listed? Through self-descriptions. The Dataspace Self-Description (DSSD) lists the rules, the attributes participants are required to have, the vocabularies used, and technical details
When you want to join, you provide a participant self-description, listing your attributes in a structured way that matches the DSSD requirements
How are attributes verified? Trust anchors and trust frameworks
a. Trust anchor = an entity that can reliably certify an attribute, e.g. a government agency issuing business licenses, or an industry body issuing professional certifications
b. Trust framework = the set of rules or processes that a trust anchor follows to establish that certification
c. The Dataspace governance authority (DSGA) decides which trust anchors it trusts
d. Based on verification against these trusted sources, the DSGA can issue participants verifiable credentials (VCs) – digital proofs of your attributes, like a secure digital badge. That badge sets what you are authorised to do in that space
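A minimal sketch of the verification pattern described above: a trust anchor certifies a claim, and the DSGA accepts the credential only if the issuer is one of its configured trust anchors and the claim has not been tampered with. The HMAC stands in for a real verifiable-credential signature suite, and all anchor names and keys are invented for the example.

```python
# Sketch of attribute verification via trust anchors. The HMAC is a
# stand-in for a real VC signature scheme; names/keys are illustrative.
import hashlib
import hmac

# The set of trust anchors this DSGA has decided to trust (assumption).
ANCHOR_KEYS = {"gov-business-registry": b"anchor-secret"}

def issue_credential(anchor: str, subject: str, attribute: str) -> dict:
    """Trust anchor certifies that `subject` holds `attribute`."""
    claim = f"{subject}:{attribute}".encode()
    sig = hmac.new(ANCHOR_KEYS[anchor], claim, hashlib.sha256).hexdigest()
    return {"issuer": anchor, "subject": subject,
            "attribute": attribute, "sig": sig}

def dsga_accepts(credential: dict) -> bool:
    """DSGA checks the issuer is a trusted anchor and the claim is intact."""
    key = ANCHOR_KEYS.get(credential["issuer"])
    if key is None:
        return False  # unknown issuer: not a trusted anchor here
    claim = f"{credential['subject']}:{credential['attribute']}".encode()
    expected = hmac.new(key, claim, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, credential["sig"])

vc = issue_credential("gov-business-registry", "acme-pty-ltd",
                      "registered-business")
print(dsga_accepts(vc))  # True
```

Real deployments would use asymmetric signatures (so anchors never share keys with the DSGA), but the trust decision, "is this issuer on my anchor list, and does the proof check out?", has the same shape.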
D. Match attributes against policies and rules; the trust level is based on this matching
Types of policies:
a. Access policies: control who can see data offers available in the catalogue
Policies to find data
i. Membership policies – basic criteria for participant attributes (what are the entry requirements?)
ii. Access policies (based on participant attributes, filter which data offers you are allowed to see)
(a) Time based
(b) Location based
iii. An auditor might see everything that is available, while some participants would not know about all the offerings
b. Contract policies (govern the terms and the usage of the data after a contract is agreed upon)
i. Can permit actions
(a) Does the participant meet the requirements for that specific agreement? They might need a specific technical capability to receive the data, or proof that they have signed a legal contract offline.
(b) Can be automated, or handled ad hoc by the contracting parties depending on complexity or exceptions
ii. Can require certain obligations
Usage policies dictate what you are allowed to do with the data after you receive it
(a) Can only use data for a specific stated purpose
(b) Only keep it for a limited time
(c) Can only combine it with certain approved datasets
(d) Only process it within a highly secure technical environment
(e) Enforcement varies a lot
(i) Technical enforcement varies by sensitivity and risk; the protection needs of the data fundamentally change the enforcement criteria & technical requirements
(ii) Highly sensitive data might use secure enclaves (TREs) or specific connectors
(iii) Low-risk data might just be logged and audited after the fact (e.g. a stated obligation not to resell)
iii. Complex rules often combine these elements. Sorting out attribute-based trust means evaluating participant attributes against these layered policies; the dataspace assesses the risk and determines the appropriate dynamic, context-aware trust level for sharing that specific data asset with that specific participant
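The layered evaluation just described can be sketched as a cascade: membership policies gate entry, access policies gate visibility, and contract policies gate the terms of a specific exchange. The policy predicates and trust-level labels below are assumptions chosen for illustration, not IDSA-defined values.

```python
# Illustrative sketch of layered, attribute-based policy evaluation.
# Each policy is a predicate over participant attributes; the resulting
# trust level depends on which layers pass. Labels are invented.

def evaluate(policies: dict, attributes: dict) -> str:
    """Return a coarse trust level for a participant's attribute set."""
    if not all(check(attributes) for check in policies["membership"]):
        return "denied"            # fails the entry requirements
    if not all(check(attributes) for check in policies["access"]):
        return "member-no-access"  # a member, but cannot see this offer
    if all(check(attributes) for check in policies["contract"]):
        return "full"              # meets the contract-specific terms too
    return "restricted"           # can see the offer, terms not yet met

policies = {
    "membership": [lambda a: a.get("member_credential", False)],
    "access":     [lambda a: a.get("region") == "AU"],
    "contract":   [lambda a: a.get("secure_enclave", False)],
}

print(evaluate(policies, {"member_credential": True, "region": "AU"}))
# → restricted (member with access, but lacks the secure enclave required
#   by this particular contract policy)
```

A real engine would also weigh context (time, location, data sensitivity) when assigning the trust level, as the time-based and location-based policy examples above suggest.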
II. How you enable data discoverability
Catalogue structure (required once you have more than two participants)
A. Metadata about the data assets, uses agreed vocabulary, descriptions, format, perhaps quality
B. Catalogues provide Data Contract Offers (DCOs) – initial proposals that outline the terms under which the data can be shared.
C. Might include federated searches across multiple catalogues
D. Access control policies are crucial in the catalogue context: the catalogue must enforce attribute-based access control and only show the DCOs you are authorised to see.
E. Centralised vs federated catalogues
Centralised catalogue: simple for the DSGA to administer, since everything is in one place, BUT it can become a performance bottleneck, a single point of failure, and an attractive target for security attacks, and it reduces participant autonomy
Federated: spreads the catalogue among trusted nodes and allows for regional control, with partitioning by region or data type, BUT participant sovereignty is still somewhat limited, as the operators of federated nodes have more control, and technical implementation is harder than a centralised design because synchronisation can be hard
Decentralised: aims for the highest participant autonomy, with each participant running their own catalogue, or each publishing DCOs to a shared registry that everyone can query. No single point of failure, scales well, and participants have maximal control over publishing their own offers; HOWEVER, this shifts complexity to the participants, who need to administer their own catalogue node and need mechanisms to discover other participants' catalogues
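The federated option above combines two ideas: fanning a query out across catalogue nodes, and applying attribute-based access control so each requester only sees authorised DCOs. A toy sketch, with invented offer fields and an invented clearance attribute; real nodes would be remote services reached over the Dataspace Protocol rather than in-process objects.

```python
# Toy sketch of a federated catalogue search with attribute-based
# filtering. Offer fields ("title", "min_clearance") are invented.

class CatalogueNode:
    """One node in the federation, holding its own Data Contract Offers."""
    def __init__(self, offers: list):
        self.offers = offers

    def search(self, keyword: str, requester_attrs: dict) -> list:
        # Only return offers the requester is authorised to see.
        return [o for o in self.offers
                if keyword in o["title"]
                and requester_attrs.get("clearance", 0) >= o["min_clearance"]]

def federated_search(nodes: list, keyword: str, requester_attrs: dict) -> list:
    results = []
    for node in nodes:  # in practice: parallel remote calls, then merge
        results.extend(node.search(keyword, requester_attrs))
    return results

nodes = [
    CatalogueNode([{"title": "soil moisture", "min_clearance": 0}]),
    CatalogueNode([{"title": "soil biosecurity incidents", "min_clearance": 2}]),
]
print([o["title"] for o in federated_search(nodes, "soil", {"clearance": 1})])
# → ['soil moisture']  (the sensitive offer stays invisible)
```

Note that the filtering happens at each node, which is what preserves the "you cannot even see offers you are not entitled to" property from section D.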
III. How you negotiate data contracts
A. Data Contract Agreement (DCA), based on a Data Contract Offer (DCO)
B. Evaluate the specific policies detailed in the DCO (contract policies and usage policies) against your participant attributes
C. Need to verify the participant meets all requirements for this specific transaction, e.g. check verifiable credentials (confirm with trust anchors)
D. Meeting the DCA requirements does not necessarily result in immediate data transfer: execution can happen later (allowing for complex workflows or scheduled transfers), or further approvals may be needed after the agreement is technically in place.
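The negotiation flow above can be sketched as: check the consumer's verified attributes against the DCO's requirements, and if they hold, form an agreement in an "agreed" state, explicitly separate from the later transfer step. All field names and states here are assumptions for illustration.

```python
# Hedged sketch of DCO -> DCA negotiation. Field names ("requirements",
# "usage_policies") and the "agreed" state are invented for the example.
from datetime import datetime, timezone

def negotiate(dco, consumer_attrs):
    """Return a DCA if the consumer satisfies the offer, else None."""
    unmet = [r for r, v in dco["requirements"].items()
             if consumer_attrs.get(r) != v]
    if unmet:
        return None  # negotiation fails; no agreement is formed
    return {
        "asset": dco["asset"],
        "usage_policies": dco["usage_policies"],  # carried into the DCA
        "state": "agreed",  # transfer executes later, possibly much later
        "agreed_at": datetime.now(timezone.utc).isoformat(),
    }

dco = {"asset": "rainfall-2024",
       "requirements": {"connector": True},
       "usage_policies": ["research-only"]}
dca = negotiate(dco, {"connector": True})
print(dca["state"])  # agreed
```

Separating "agreed" from "executed" is what allows the scheduled transfers, extra approvals, and policy re-evaluation described in the next section.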
IV. The actual data sharing and usage process
When data-sharing execution happens, policies might be re-evaluated if a significant amount of time has passed since the requirements were met.
A. Data provider can push data to the consumer
B. Or the consumer can pull the data
C. Technology depends heavily on the data type, security level, established trust between the parties, & underlying infrastructure
D. Technical orchestration of the transfer happens through the DS connector, which starts and stops the transfer, monitors the flow, and potentially enforces usage policies during or after the transfer
E. Code to data: you can send your processing code to run where the data resides, typically within a secure environment; only the results of the computation are sent back. Useful for massive or highly sensitive datasets.
F. Transfer mechanisms include file-by-file copy over HTTPS, API streaming, or code-to-data environments
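The code-to-data pattern in E is worth a concrete sketch: the computation travels to the provider, and only the aggregate leaves. Everything below is hypothetical; a real deployment would sandbox and vet the submitted code before execution.

```python
# Illustrative "code to data" sketch: the consumer's computation runs
# inside the provider's environment; raw records never leave it.
# Entirely hypothetical — real systems would sandbox and vet the code.

def run_at_provider(dataset: list, computation) -> float:
    """Execute the consumer-supplied computation where the data resides."""
    return computation(dataset)  # only the return value crosses the boundary

# These records stay with the provider (e.g. inside a TRE).
sensitive_records = [12.0, 15.5, 9.2, 14.3]

# The consumer submits an aggregate computation and receives one number.
mean = run_at_provider(sensitive_records, lambda d: sum(d) / len(d))
print(round(mean, 2))  # → 12.75
```

The same shape underlies trusted research environments (TREs): enforcement comes from the execution boundary, not from trusting the consumer's handling of a copied dataset.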
V. How do you ensure observability for compliance and accountability as specified in DCA?
A. Mechanisms are needed to observe and record what happens during data sharing and usage, perhaps facilitated by a trusted third-party clearing house. Why needed:
Legal compliance – demonstrate to a regulator that policies were followed
Business reasons – e.g. calculating payments in a marketplace
B. How is observability implemented? A centralised approach may be riskier; federated or decentralised approaches might be technically harder but less risky, so the chosen architecture matters. In a decentralised design, each node must log the DCAs and the execution of those DCAs. Trusted auditors then verify the logs; these auditors are approved (possibly certified) by the DSGA, and auditor requests are also logged.
C. Services such as payment clearing systems, notary services, and reporting can build on the logging infrastructure
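One common way to make per-node logs auditable, sketched below, is a hash chain: each entry commits to its predecessor, so an auditor can detect after-the-fact tampering with recorded DCA executions. This is a simplification for illustration, not a clearing-house implementation.

```python
# Sketch of a tamper-evident, append-only log that observability
# services could build on. Event fields are invented for the example.
import hashlib
import json

def append_entry(log: list, event: dict) -> None:
    """Append an event, chaining it to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else "genesis"
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    log.append({"event": event, "prev": prev_hash,
                "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify_chain(log: list) -> bool:
    """Auditor's check: recompute every hash along the chain."""
    prev_hash = "genesis"
    for entry in log:
        payload = json.dumps({"event": entry["event"], "prev": prev_hash},
                             sort_keys=True)
        if entry["prev"] != prev_hash or \
           entry["hash"] != hashlib.sha256(payload.encode()).hexdigest():
            return False
        prev_hash = entry["hash"]
    return True

log = []
append_entry(log, {"dca": "rainfall-2024", "action": "transfer-started"})
append_entry(log, {"dca": "rainfall-2024", "action": "transfer-completed"})
print(verify_chain(log))   # True
log[0]["event"]["action"] = "deleted"
print(verify_chain(log))   # False — tampering breaks the chain
```

Payment clearing, notary, and reporting services can then consume the verified log rather than trusting each node's raw records.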
VI. The use of vocabularies and semantic models
A. Machine actionable vocabularies are needed for the data and the process of requesting data, making offers, negotiating contracts etc.
B. DSGA provides core mandatory policy and data asset vocabularies as part of the DSSD, and participants might agree on additional ones
C. Semantic models for the shared data assets
Technically these are optional, and some dataspaces keep data harmonisation levels low to start; as the data starts moving, it becomes clear where further harmonisation efforts would be most beneficial.
In some cases, it will be mandatory to fully describe the data meaning
D. Semantic models for policies are mandatory
Participants must agree on what usage policy, attribute, data asset, purpose, limitation, etc. mean
Without these shared definitions, contract or rule enforcement is not possible
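A toy illustration of why the policy vocabulary is mandatory: an enforcement engine can only act on terms it recognises, and an unrecognised term makes enforcement impossible rather than merely imprecise. The terms and their semantics below are invented for the example.

```python
# Toy sketch: enforcement requires a shared, machine-actionable policy
# vocabulary. The terms and context fields here are invented.

POLICY_VOCABULARY = {
    # term -> agreed meaning, as a check over the usage context
    "purpose-limitation": lambda ctx: ctx["purpose"] in ctx["allowed_purposes"],
    "retention-limit":    lambda ctx: ctx["days_held"] <= ctx["max_days"],
}

def enforce(policy_terms: list, context: dict) -> bool:
    """Evaluate agreed policy terms; unknown terms cannot be enforced."""
    for term in policy_terms:
        if term not in POLICY_VOCABULARY:
            raise ValueError(f"unknown policy term {term!r}: "
                             "enforcement impossible without shared semantics")
        if not POLICY_VOCABULARY[term](context):
            return False
    return True

ctx = {"purpose": "research", "allowed_purposes": ["research"],
       "days_held": 30, "max_days": 90}
print(enforce(["purpose-limitation", "retention-limit"], ctx))  # True
```

In practice this role is played by standard policy models (ODRL is a common choice) published by the DSGA as part of the DSSD, so every connector interprets "purpose", "limitation", and similar terms identically.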
VII. Scaling interoperability (without it, dataspaces just become larger data silos and cannot deliver cross-domain value). A shared responsibility: the DSGA sets baseline standards, and participants are responsible for ensuring they can meet those standards and operate under them for specific use cases
A. Within dataspaces – all participants within a dataspace can technically connect and semantically understand the rules and data as defined by the DSGA
B. Across dataspaces – be able to request and potentially combine data from multiple dataspaces
C. Technical interoperability – physical and logical connections, i.e. protocols (the Dataspace Protocol is foundational), interfaces, and data formats; syntactic interoperability means making sure data is structured in a way that all participants can parse
D. Organisational interoperability – aligning business processes, workflows, responsibilities, and governance structures (following the same processes for joining, negotiating etc.)
E. Legal interoperability - ensuring legal concepts, contractual clauses, and regulatory requirements are understood, accepted and applied consistently even when participants operate under slightly different legal frameworks or interpretations.
F. Trust frameworks and anchors can be shared services and they also need clear standards to be interoperable across multiple dataspaces
VIII. IDSA tools and frameworks for implementation
A. Magic triangle - core technical and operational framework with three interconnected pillars -
Reference Architecture Model (RAM) – a high-level, abstract blueprint and conceptual foundation (describing how to build a dataspace), designed to be technology-agnostic, describing concepts, functions, processes, and roles across five layers (business, functional, process, information, and system), inclusive of three cross-cutting perspectives (security, certification & governance).
International Data Spaces Specifications (technical rulebook) – open-source building blocks, including resources hosted on GitHub (the open-source IDSA information model, component specifications for a DS connector, communication protocols like the Dataspace Protocol, and guidance on how usage control should be implemented). These specifications take the abstract concepts from the RAM and bind them to concrete technical concepts and standards, providing the detailed documentation and specifications needed to actually build dataspace-compliant solutions.
Certification scheme – the operational framework for ensuring trust and verifying that components, operational environments, and participants meet IDSA specifications correctly and securely. Certification involves evaluation facilities and rules & criteria for assessment at different trust and assurance levels.
B. IDS testbed – provides tools and environments for evaluating IDS components against certification criteria, ensuring they conform to the specifications and pass certain security tests, some of which can be automated. Used to help ensure technical interoperability between components from different vendors.
IX. Necessary legal and operational components to make a dataspace work
A. Relationship of certification to operational framework - practical example of how functional requirements translate into operational processes.
The DSGA selects the trust frameworks it will rely on (which could include IDSA certification, just one kind of acceptable trust framework) and specifies in the DSSD the minimum participant attributes required to establish trust (e.g. must have IDSA certification at Assurance Level 2 and Trust Level 2).
Participants then prove they meet these trust levels in their participant self description (PSD)
The DSSD also lists the acceptable trust anchors required to verify these attributes (for IDSA certifications, the trust anchor would be the certification authority)
B. Legal Agreements - how do they fit into the framework given the legal landscape is fragmented
Legislation typically only provides the general legal framework
Dataspaces need a specific contractual framework for interactions between participants that define rules of engagement, liabilities, IP rights etc for that specific dataspace
IDSA is drafting components and modules designed to supplement other materials, e.g. contractual building blocks such as terms relevant for particular domains, clauses addressing international data sharing, or privacy-enhancing requirements
Adopting and consolidating existing best practices – adapting Sitra's rulebook
a. Practical tools like template clauses, contracts, control questions to guide agreements, and a code of conduct to help organisations build FAIR and transparent data sharing
b. Well aligned with IDSA goals - providing sovereignty through clear terms of use and trust
c. Clauses including auditing rights, security requirements and ethical considerations