We have realised we can publish an estimate of the scale and growth rate of data under management across the university sector.
──────────────────
This immediately raises the desire to understand what the estimate means.
However, a first attempt to ‘dig into’ that graph to reveal any important substructure has shown that we have neither easy access to such information nor certainty about which substructure is important.
Yin implies all data is inherently valuable and should be retained until demonstrably not valuable.
Yang implies we start with a minimum retention period followed by deletion, and extend retention for understood reasons.
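The contrast between the two defaults can be made concrete with a minimal sketch. Everything named here is a hypothetical illustration rather than an agreed policy: the `Dataset` fields, the five-year `MIN_RETENTION`, and the rule that each understood reason extends retention by a fixed `EXTENSION` are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List

# Hypothetical periods, chosen only for illustration.
MIN_RETENTION = timedelta(days=5 * 365)
EXTENSION = timedelta(days=5 * 365)


@dataclass
class Dataset:
    name: str
    created: date
    # Yin: has the data been demonstrated to have no further value?
    shown_not_valuable: bool = False
    # Yang: understood reasons for extending retention.
    extension_reasons: List[str] = field(default_factory=list)


def yin_keep(ds: Dataset, today: date) -> bool:
    """Yin default: retain until demonstrably not valuable (age is ignored)."""
    return not ds.shown_not_valuable


def yang_keep(ds: Dataset, today: date) -> bool:
    """Yang default: delete after a minimum period unless reasons extend it."""
    deadline = ds.created + MIN_RETENTION + EXTENSION * len(ds.extension_reasons)
    return today <= deadline
```

Under these assumptions, a dataset created on 1 January 2015 with no recorded reasons is still retained in 2024 under Yin but is due for deletion under Yang; recording one reason extends its Yang retention past that date.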
──────────────────
Both approaches are appropriate to some data.
We don’t know what intermediate states there are or how to ‘categorise’ data into any appropriate state.
All of the RDCC participants are providing services at scale to support a diversity of research data, including commercial options. This ecosystem did not exist when the RDMP process was created and current research data practice was established.
──────────────────
Research data life cycle support now has access to mechanisms that did not previously exist.
We observed that we didn’t have a clear understanding of ‘what causes data to have life cycles’, or indeed of what causes ‘different data to have different life cycles’. What are the drivers?
──────────────────
We proposed attempting to answer this question on a discipline-specific basis, for the disciplines that impose high costs on our data support solutions.
At a national research level, we have invested much more strongly in Yin than in Yang over the last 15 years, energised by NCRIS.
──────────────────
We are building a cultural agreement around FAIR, but we don’t have a cultural agreement that supports acting on limits to resourcing, ‘baking in’ the treatment of sensitivity, or accepting that much data has an end-of-life.
There is no connection between the content of RDMPs and subsequent decision making (i.e. what actually happens).
──────────────────
We converted this observation into a plan to automate decision making based on some form of ‘new’ RDMP-2.0.