CSDM-Demo-1

SMH-Digital transformation project 2020-2022

Common Semantic Data Model (CSDM) Mission

Focusing on Digital Transformation and Service Infrastructure Development

In this practice section we explain how to create an interconnected graph that visualises healthcare data, from the initial development of the data through to connecting it with the open source community. The entire process comprises seven main action steps, six of which we outline in this guidance document. The activity between Step 4 and Step 5 cannot be effectively demonstrated and is therefore not discussed in detail here. The remaining six steps are briefly outlined, signposted, and accompanied by diagrams.

Step 1: The first step in the process relates to creating a demonstrator for Data Collection. The focus of this initial step is to design electronic forms that can capture healthcare data in a structured way, moving from a Model of Use (a paper-based form) to a Model of Meaning (a demonstrator cloud-based system which offers a framework for machine-to-machine communication). For instance, agreed information is broken into single data fields rather than clustered into a text box, and the attributes that make up an address are presented on different rows. For example, the address format on Amazon is split into Street Name, Apartment Name, and Eircode, each entered in a separate field or row. While designing the form for data collection it is also important to establish which terms are core, which terms are common, and finally which local terms must be used within the organization; these are called context-specific terms. Once the data has been agreed and structured for collection in the demonstrator, the system is tested by entering synthesised data through an online form presented on a mobile device. Once this process is complete, you can proceed to checking the quality of the entered data in Step 2.

  1. Data Collection: Data Collection is the process of gathering and measuring information on variables of interest, in an established systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes.

WHAT SMH-Input Form

WHOM Service provider or staff (FOR ALL)

FOR WHAT Data Collection

STANDARD NA, but follow a well-structured online form (e.g. the Amazon customer address fields)

Tool: https://survey123.arcgis.com/ [Developer Account]

WHAT Access Point

HOW Scan QR Code

WHAT SYSTEM Distributed System

WHOM Service provider or staff

ACCESS LOCATION All Service Centre
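The move from a free-text box to single data fields can be sketched in a few lines. This is an illustrative example only: the field names below mirror the Amazon-style address split described above and are not the actual demonstrator schema.

```python
from dataclasses import dataclass, asdict

# Model of Use: one clustered free-text box, hard for a machine to parse.
unstructured_address = "Apt 4, 12 Main Street, D02 X285"

# Model of Meaning: each agreed attribute is a single, named data field,
# entered on its own row of the form (field names are hypothetical).
@dataclass
class Address:
    street_name: str
    apartment_name: str
    eircode: str

structured_address = Address(
    street_name="12 Main Street",
    apartment_name="Apt 4",
    eircode="D02 X285",
)

# A structured record serialises to labelled key/value pairs,
# ready for machine-to-machine exchange.
print(asdict(structured_address))
```

The point of the sketch is that the structured record carries its own labels, so a receiving system does not have to guess which part of the string is the Eircode.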

Step 2: Checking the quality of the data stored in your demonstrator database involves reviewing how your data is stored. While the programme team applied a number of conformance rules initially, there is a need to check that any rules deployed are working according to the demonstrator system requirements, and to revise them where they are not. This may involve an iterative set of processes that you may wish to review and revise at the point of data entry. Step 2 involves signing off on the processes which underpin the data you are collecting in your demonstrator database. Specific activities include reviewing summary reports to ascertain whether there are any missing values or anomalies within the data you have agreed to collect. Checking the summarised stored data for errors is important, and this can be done using the admin login and password provided through the administrator authentication and permissions system. Data at this stage is viewed through the summarised table view; remember this view is restricted to staff who are monitoring the collated data. Step 2 also provides an opportunity to see metadata (right part of the Figure below), that is, the data which describes the data collected in the demonstrator database (e.g. Submitted By and Submitted Time).

2. Data Storage: Data storage refers to the use of recording media to retain data using computers or other devices. The most prevalent forms of data storage are file storage, block storage, and object storage, with each being ideal for different purposes.

WHAT Data Storage

WHOM Service Monitoring Staff

WHAT SYSTEM Cloud-Based Centralised System

ACCESS LOCATION Service Head Office

WHAT DS-Details View
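The missing-value review described in Step 2 can be sketched as a small summary report over submitted records. The records and field names here are made up for illustration; in the demonstrator this view would be produced by the monitoring interface.

```python
from collections import Counter

# Hypothetical submitted records; None marks a missing value.
records = [
    {"name": "A. Byrne", "eircode": "D02 X285", "submitted_by": "staff01"},
    {"name": "C. Walsh", "eircode": None,       "submitted_by": "staff02"},
    {"name": None,       "eircode": "T12 AB34", "submitted_by": "staff01"},
]

def missing_value_report(rows):
    """Count missing (None or empty) values per field, as a reviewer
    would see them in the summarised table view."""
    missing = Counter()
    for row in rows:
        for field, value in row.items():
            if value in (None, ""):
                missing[field] += 1
    return dict(missing)

print(missing_value_report(records))  # {'eircode': 1, 'name': 1}
```

A report like this makes it immediately visible which agreed fields are not being filled in at the point of data entry.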


Step 3: The third step in this process relates to translating the agreed information from the original Model of Use (data collected in everyday practice in paper format) to the agreed data for the Model of Meaning (data to be collected and transferred from machine to machine). The process involves a significant set of mappings across different tools, which requires translating and aligning the agreed information, now presented as formalised terms, into a machine-readable language represented as named entity types and properties. This facilitates the computer-to-computer interactions which support semantic meaning, and it requires a defined ontological schema. This translation process is an important step in interoperability development, but one which is complex and specialised. If you are interested in developing capabilities and skills in this field, we recommend additional reading on the topic, which is available from a set of separate links (see Ontology and A Practical Guide to Building OWL Ontologies).

In the diagram below we present the demonstrator represented as an ontological schema. The upper part of the figure, shown in the black box, illustrates the classes or entity types associated with the schema, whereas in the bottom part you can see, in table format, all the terms gathered during Steps 1 and 2. The arrow in the figure represents the direct mapping between terms. This mapping is a semi-automated process: it must be carried out under the supervision of a human expert.

3. Translation: Model of Use -> Model of Meaning

STANDARD: HL7 FHIR4 (Schema) and W3C OWL (Formal Language)
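A minimal sketch of the term-mapping stage, using plain (subject, predicate, object) tuples rather than a full OWL toolchain. The namespace, class, and property names below are illustrative stand-ins, not the actual demonstrator schema or FHIR terms; in practice the mapping table itself is what the human expert supervises.

```python
# Hypothetical namespace for the demonstrator's formalised terms.
EX = "http://example.org/smh/"

# One row of the table gathered during Steps 1 and 2.
row = {"person": "p001", "name": "A. Byrne", "hospital": "SMH"}

# Human-supervised mapping: table column -> ontology property.
COLUMN_MAP = {
    "name": EX + "hasName",
    "hospital": EX + "treatedAt",
}

def row_to_triples(row):
    """Translate one table row into subject-predicate-object triples,
    typing the subject as a named entity of class Person."""
    subject = EX + row["person"]
    triples = [(subject, "rdf:type", EX + "Person")]
    for column, prop in COLUMN_MAP.items():
        triples.append((subject, prop, row[column]))
    return triples

for triple in row_to_triples(row):
    print(triple)
```

The COLUMN_MAP dictionary plays the role of the arrow in the figure: a direct, expert-approved mapping from each collected term to its formal property.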

The 4th step is closely connected with the previous step, but its main purpose is to examine the data more carefully and find out whether there are any anomalies and whether all collected values are correct. For example: are there any null values or empty columns? What about the date format: is it the same throughout? Are ICD-10 codes properly entered and stored? The green bar above each data field (see the Figure below) shows the distribution of the data available in the respective column.

4. Data Integration: Data integration is the combination of technical and business processes used to combine data from disparate sources into meaningful and valuable information.

WHAT Top Part: Schema and Bottom Part: Data Collected from Input Form
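The anomaly questions listed above can be expressed as simple per-record checks. This is a sketch under assumptions: the field names `icd10` and `visit_date` are hypothetical, the date format is taken to be YYYY-MM-DD, and the regular expression captures only the basic shape of an ICD-10 code (letter, two digits, optional decimal extension), not the full classification rules.

```python
import re
from datetime import datetime

# Basic shape of an ICD-10 code, e.g. J45.0 (letter U is excluded here
# as it is reserved for special purposes in the classification).
ICD10_PATTERN = re.compile(r"^[A-TV-Z]\d{2}(\.\d{1,4})?$")

def check_record(record):
    """Return a list of the problems found in one record."""
    problems = []
    if not record.get("icd10"):
        problems.append("missing ICD-10 code")
    elif not ICD10_PATTERN.match(record["icd10"]):
        problems.append("malformed ICD-10 code")
    try:
        datetime.strptime(record.get("visit_date", ""), "%Y-%m-%d")
    except ValueError:
        problems.append("date not in YYYY-MM-DD format")
    return problems

print(check_record({"icd10": "J45.0", "visit_date": "2021-03-15"}))  # []
print(check_record({"icd10": "45J",   "visit_date": "15/03/2021"}))
```

Running such checks over every column is what produces the kind of distribution summary the green bars in the figure visualise.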

The 5th step is all about Data Visualization, more precisely data visualization through a Knowledge Graph; more about different data visualization techniques and knowledge graphs can be found in the FAQ section. Here you can see how one named entity is related to another named entity using a directed arrow. A node represents the class or type (i.e. Person, Hospital, Location, etc.) of a named entity, and different colours are associated with the different classes. This kind of graph visualization shows how the different elements of a healthcare system are interconnected, and it makes it intuitive to find new information otherwise hidden in the silos of different hospital systems or databases.

5. Data Visualization: Data visualization is part of many business-intelligence tools and key to advanced analytics. It helps people make sense of all the information, or data, generated today. With data visualization, information is represented in graphical form, as a pie chart, graph, or another type of visual presentation.

Node -> Named Entity, Arrow -> Relation, Info Box on Right Side

Tool: Ontotext GraphDB [Free Edition]

The 6th step shows how you can link or connect your own organizational data with data that is available in the open source community, following Tim Berners-Lee's vision of a connected web. In this example we demonstrate how we connected our data with an external data source based on a common value, in our case Ireland. As Ireland is a unique name, we are able to link it with the UN Geopolitical ontology data which is available online. This way we can enrich our datasets, which finally helps us to draw further analytics. We may also connect our dataset with Ordnance Survey Ireland (OSI) datasets to do analysis at county level. This step is also associated with the FAIR principles of data (findable, accessible, interoperable and reusable).

6. Data Linking: Data linking is the creation of links between records from different sources based on common features present in those sources. Also known as 'data linkage' or 'data matching', it combines data at the unit record or micro level.

Links between local SMH data and UN Geopolitical data based on a common point (i.e. the country name Ireland)
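The linking on a shared value can be sketched as follows. Note the assumptions: the local triples, the `owl:sameAs`-style link, and the external URI for Ireland are all illustrative placeholders, not the real UN Geopolitical ontology identifiers.

```python
# Local organizational data as (subject, predicate, object) triples.
local_triples = [
    ("smh:patient001", "smh:countryOfResidence", "Ireland"),
    ("smh:patient002", "smh:countryOfResidence", "France"),
]

# External open dataset, keyed by the common value (hypothetical URIs).
external_dataset = {
    "Ireland": "http://example.org/un-geo/IRL",
}

def link(triples, lookup):
    """Emit owl:sameAs-style links wherever a local value matches an
    entry in the external dataset, enriching the local graph."""
    links = []
    for subj, pred, obj in triples:
        if obj in lookup:
            links.append((obj, "owl:sameAs", lookup[obj]))
    return links

print(link(local_triples, external_dataset))
```

Only the records sharing the common point are linked; the France row stays unlinked until a matching external identifier is found, which is exactly how enrichment proceeds dataset by dataset.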