When you are first testing the data from a new data provider you will need to do some basic validations just to ensure the data they are sending is what you are expecting. The import engine will do validations on individual data elements but you also need to do some checks on what they are sending. A lot of this is just manually querying and looking at the data. Below is a start at a list of SQL commands you can use to help validate the data. Please submit more validations and helper SQL statements.
IMPORTANT: The data release and validation process is iterative. You should ask them to send you a small number of requests, approximately 20 requests, just validate the file format and the data. When it looks like those records are okay you can then accept a larger number, approximately 1000 requests. After you validate the larger quantity you should be able to let them send all their data. Because of this iterative process you might find it best to remove the older data if it was not sent correctly. BE CAREFUL DOING DELETES ON THE PRODUCTION DATABASE. This is why it's good to be doing the initial test against a test database restored with a recent production backup.
-Overall data review
-RequestIDs
-FacilityCodes
-Test codes
-Dates
These validations assume you will be able to limit the data to review by using the left six characters of the RequestID. For example, for DisaLab in Mozambique = 'MZDISA'. For the SQL below replace the 'CCLIMS' with the appropriate code for the country and data provider you are validating.
The first thing to do is to just look at the full set of data for a screen or two of records. Take a look at the data and make sure there are no obvious errors in the RequestID, FacilityCodes, Dates, Results, etc.
-- Make sure the RequestID is formatted correctly and both Request and Result data is being sent
SELECT * FROM Requests WHERE LEFT (RequestID, 6) = 'CCLIMS';
SELECT * FROM LabResults WHERE LEFT (RequestID, 6) = 'CCLIMS';
-- This query is useful to ensure they have sent the data correctly for joining from Requests to LabResults
SELECT req.*, res.*
FROM Requests AS req LEFT JOIN LabResults AS res ON req.RequestID = res.RequestID AND req.OBRSetID = res.OBRSetID
WHERE LEFT (req.RequestID, 6) = 'CCLIMS'
-- You can also look to see if they have included specific data like the extra registration information
-- VIRAL might need changed to the PanelCode your implementation is using to store registration information.
SELECT top 100 req.*, res.*
FROM Requests AS req LEFT JOIN LabResults AS res ON req.RequestID = res.RequestID AND req.OBRSetID = res.OBRSetID
WHERE LEFT (req.RequestID, 6) = 'CCLIMS'
AND LIMSPanelCode = 'VIRAL'
ORDER BY req.RequestID
-- Next you can check on more specific data elements
-- This give you an idea of what kinds of panels and results they are sending it and you can check
-- to see if they are correct
SELECT req.LIMSPanelCode, req.LIMSPanelDesc, res.LIMSObservationCode, res.LIMSObservationDesc, count(1)
FROM Requests AS req INNER JOIN LabResults AS res ON req.RequestID = res.RequestID AND req.OBRSetID = res.OBRSetID
WHERE LEFT (req.RequestID, 6) = 'CCLIMS'
GROUP BY req.LIMSPanelCode, req.LIMSPanelDesc, res.LIMSObservationCode, res.LIMSObservationDesc
ORDER BY req.LIMSPanelCode, req.LIMSPanelDesc, res.LIMSObservationCode, res.LIMSObservationDesc
-- This is good because you can see the values there are actually sending for results and make sure you area accounting for all of them in your reports
SELECT req.LIMSPanelCode, req.LIMSPanelDesc, res.LIMSObservationCode, res.LIMSObservationDesc, res.LIMSRptResult, count(1)
FROM Requests AS req INNER JOIN LabResults AS res ON req.RequestID = res.RequestID AND req.OBRSetID = res.OBRSetID
WHERE LEFT (req.RequestID, 6) = 'CCLIMS'
GROUP BY req.LIMSPanelCode, req.LIMSPanelDesc, res.LIMSObservationCode, res.LIMSObservationDesc, res.LIMSRptResult
ORDER BY req.LIMSPanelCode, req.LIMSPanelDesc, res.LIMSObservationCode, res.LIMSObservationDesc, res.LIMSRptResult
-- Check to see if their facility codes look okay
SELECT DISTINCT RequestingFacilityCode, ReceivingFacilityCode, TestingFacilityCode, LIMSFacilityCode
FROM Requests
WHERE LEFT(RequestId,6) = 'CCLIMS'
-- Check to see if the facility codes they have entered actually made to FacilityCodes in the Facilities table.
-- In these queries if the Description is empty you should do more investigation as to whether the actual code is missing or just the Description (both need fixed)
SELECT DISTINCT req.ReceivingFacilityCode, recfc.Description
FROM Requests AS req LEFT JOIN [OpenLDRDict].[dbo].Facilities AS rfc ON req.RequestingFacilityCode = rfc.FacilityCode
WHERE LEFT(req.RequestId,6) = 'CCLIMS';
SELECT DISTINCT req.RequestingFacilityCode, recfc.Description
FROM Requests AS req LEFT JOIN [OpenLDRDict].[dbo].Facilities AS recfc ON req.ReceivingFacilityCode = recfc.FacilityCode
WHERE LEFT(req.RequestId,6) = 'CCLIMS';
SELECT DISTINCT req.RequestingFacilityCode, tfc.Description
FROM Requests AS req LEFT JOIN [OpenLDRDict].[dbo].Facilities AS tfc ON req.TestingFacilityCode = tfc.FacilityCode
WHERE LEFT(req.RequestId,6) = 'CCLIMS';
SELECT DISTINCT req.RequestingFacilityCode, lfc.Description
FROM Requests AS req LEFT JOIN [OpenLDRDict].[dbo].Facilities AS lfc ON req.LIMSFacilityCode = lfc.FacilityCode
WHERE LEFT(req.RequestId,6) = 'CCLIMS';
-- Which dates have they sent and do they make sense?
SELECT req.LIMSDateTimeStamp, req.SpecimenDateTime, req.RegisteredDateTime, req.ReceivedDateTime, req.AnalysisDateTime, req.AuthorisedDateTime
FROM Requests AS req INNER JOIN LabResults AS res ON req.RequestID = res.RequestID AND req.OBRSetID = res.OBRSetID
WHERE LEFT(req.RequestId,6) = 'CCLIMS