The project will use different types of data, and will follow different guidance on their preservation, exploitation and curation, depending on the discipline and use-case.
Data types and preservation:
For data preservation and sharing, the HEP-1 and HEP-3 use-cases will follow the guidelines established by the ATLAS policy, documented in J.Phys.Conf.Ser. 1085 (2018) no.4, 042011. For data management, collection and standards, HEP-2 will follow the widely used LNF data policy (http://www.lnf.infn.it/computing/regolamento/politica_e.html). The MED-1 use-case will use only publicly available data. MED-2 will use experimental rheological data taken and stored at the Polytechnic University of Bucharest, publicly available data and, subject to approval by the ethics committee, data from Hospital S. Croce and Carle, Cuneo, Italy. The medical images will be de-identified in order to comply with the definition of “anonymous data” referred to in Recital 26 of the GDPR. The NS1 use-case will deal with data previously recorded by Sapienza in behaving rhesus macaques (Macaca mulatta).
Data exploitation:
Where possible, all data will be made publicly available. For the HEP-1 use-case, publicly available Monte Carlo generators and open-source detector emulation kits will be used, as described in Section 2.2. For all use-cases, the trained models, saliency maps and algorithms developed will be shared through GitLab platforms, including BALTIG, the GitLab instance managed by INFN. For the MED-2 use-case, all information will be made publicly available wherever proprietary aspects (for example, those concerning MedLea-related technology) are not affected. As a partnering member of the EC Flagship Human Brain Project (HBP), Sapienza will soon include some of the data in the KnowledgeSpace facility (https://knowledge-space.org).
The algorithms developed within XAI-TOOLS, both procedures and code, will be made publicly available on the GitHub platform under the GNU General Public License v3.0, a free software license.
Data curation and preservation (including costs):
Curation and preservation of the data will be the responsibility of the partners and/or their respective collaborations, where established guidelines exist. No cost has been charged to this project for data archiving, as the amount of data generated for long-term retention is not expected to exceed the capacity provided free of charge by the institutions in this consortium. The cost of curating the NS1 data in the KnowledgeSpace facility will be covered by HBP. On reasonable request, these data will also be made available to other groups.