Institute for Systems Biology and University of California, San Francisco
The proposed product, EVIDARA, distills knowledge graphs (KGs) representing “medical knowledge” for the ARS. EVIDARA deemphasizes “reasoning”, which remains challenging and may produce spurious or contradictory returns (inconsistent graphs). Instead, it uses empirical evidence to rank and filter the returned KG nodes to increase their real-world relevance. To evaluate KGs for empirical support, EVIDARA uses information derived from raw data sets of large clinical and multiomics cohorts that do not yet exist as formal knowledge sources (KS), and combines these observational data with its internal “awareness” of all factoid relationships extracted from public KS, as embodied by the vast SPOKE network. Thus, EVIDARA exposes “all” preexisting knowledge of relationships (factoids stored in SPOKE) together with freshly observed relationships. The source of the empirical evidence (cohort, study) is also relayed to the ARS. In practice, using raw data from studies, EVIDARA maps a layer of empirical weights (for a given query context Q) onto all nodes in the SPOKE network and also computes empirical relationships between the nodes.
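The rank-and-filter step described above can be illustrated with a minimal sketch. It assumes that each node already carries a single empirical weight for the query context Q and that pruning is a simple threshold on that weight; the `Edge` type, the `prune_by_evidence` function, the threshold rule, and the example node names are all hypothetical and are not taken from EVIDARA or SPOKE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """A factoid relationship between two nodes (hypothetical structure)."""
    source: str
    target: str
    provenance: str  # knowledge source the factoid was extracted from


def prune_by_evidence(edges, weights, threshold):
    """Rank nodes by empirical weight and drop weakly supported ones.

    `weights` maps node -> empirical weight for one query context Q.
    Nodes without a weight are treated as having weight 0.0 (an assumption).
    """
    nodes = {e.source for e in edges} | {e.target for e in edges}
    kept = {n for n in nodes if weights.get(n, 0.0) >= threshold}
    # Highest weight first; ties are broken alphabetically for determinism.
    ranked = sorted(kept, key=lambda n: (-weights.get(n, 0.0), n))
    # An edge survives only if both of its endpoints survive.
    kept_edges = [e for e in edges if e.source in kept and e.target in kept]
    return ranked, kept_edges


edges = [
    Edge("geneA", "diseaseX", "KS1"),
    Edge("geneB", "diseaseX", "KS2"),
    Edge("geneC", "diseaseX", "KS1"),
]
weights = {"geneA": 0.9, "diseaseX": 0.7, "geneB": 0.1}
ranked, kept_edges = prune_by_evidence(edges, weights, threshold=0.5)
```

In this toy example only `geneA` and `diseaseX` pass the threshold, so a single edge remains, and its `provenance` field is retained in the pruned graph.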
Importantly, since SPOKE, the KN at the core of EVIDARA, already integrates many KS (currently >25) and since its edge attributes capture the provenance of each relationship in the original KS, the pruned KG returned by EVIDARA contains information that can guide the ARS in identifying more specialized KS (provided by Knowledge Providers) to refine the query. Facilitating this horizontal interaction with other ARAs and Knowledge Providers will be taken into account in the development of EVIDARA.
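As a hedged illustration of how provenance in a pruned KG might point the ARS toward specialized KS, the sketch below tallies the knowledge sources behind the retained edges. The edge representation (a `(source, target, provenance)` tuple) and the function name `knowledge_sources` are assumptions made for this example only.

```python
from collections import Counter


def knowledge_sources(pruned_edges):
    """Count how many retained edges each knowledge source contributed.

    `pruned_edges` is an iterable of (source, target, provenance) tuples,
    a hypothetical stand-in for edges carrying a provenance attribute.
    """
    counts = Counter(provenance for _, _, provenance in pruned_edges)
    # Most frequently cited knowledge sources first.
    return counts.most_common()


pruned = [
    ("geneA", "diseaseX", "KS1"),
    ("geneC", "diseaseX", "KS1"),
    ("drugD", "geneA", "KS2"),
]
summary = knowledge_sources(pruned)
```

A consumer of the pruned KG could use such a tally to decide which Knowledge Provider to query next, although the actual selection logic is outside the scope of this sketch.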