2021-09-08 SEP

Journal Club

Monjoy

Combining mechanistic and machine learning models for predictive engineering and optimization of tryptophan metabolism (2020) Nature Communications https://www.nature.com/articles/s41467-020-17910-1 , pdf (open)

...

Discussion points (anyone)

Recall metabolic control in Fig1

Contrast mechanistic modeling for optimized metabolic production (predictive engineering) with model-free epidemiological forecasting, which can cope with novelty gracefully.

Can we go ML all the way with ODEs? Can we leap to numerical differentiation from there?

"CRISPR/Cas9-mediated genome editing was a vital enabling technology for this project" - wow ...

Hackathon

Identity services

What identity flows does Cloud ID support? Of those, which ones make sense for Connect?

https://cloud.google.com/identity-platform/docs

Bhaumik (if no third party oauth2 service is being used): password-less + no shared logical devices is the logical solution.

De-arraying TMAs

The ideal solution would engage a tile server by first de-arraying the low resolution TMA spots, and then extracting the high resolution TMA images for autonomous governance and classification. That doesn't yet exist outside Praful's head :-) so a review of pre-web solutions from the last decade may be in order. Intriguingly, none uses AI:

QuPath

https://www.nature.com/articles/s41598-017-17204-5 (2017)

https://github.com/qupath/qupath

ATMAD

https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2111-8 (2018)

Core segmentation and Registration for fast TMA De-arraying

https://ieeexplore.ieee.org/document/7164147 (2015)

TMAmap

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3189244 (2011)

Others are even older; I think it is safe to stick to the last decade.
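The two-step idea above (de-array at low resolution, then extract at high resolution) can be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing tool: it assumes the low resolution image has already been thresholded into a binary mask, that spots do not touch each other, and that the high resolution image is an exact integer multiple (`SCALE`, a made-up value) of the low resolution one. The function names and the tiny mask are hypothetical.

```python
from collections import deque


def find_spots(mask):
    """Return end-exclusive bounding boxes (row0, col0, row1, col1) of the
    4-connected foreground regions in a binary low resolution mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from the first unseen foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            r0, c0, r1, c1 = r, c, r, c
            while queue:
                y, x = queue.popleft()
                r0, c0 = min(r0, y), min(c0, x)
                r1, c1 = max(r1, y), max(c1, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((r0, c0, r1 + 1, c1 + 1))
    return boxes


def to_high_res(box, scale):
    """Map a low resolution bounding box to high resolution pixel coordinates."""
    return tuple(v * scale for v in box)


# Hypothetical 4x5 low resolution mask containing three spots.
LOW_RES_MASK = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
]
SCALE = 16  # assumed ratio between high and low resolution

spots = find_spots(LOW_RES_MASK)
regions = [to_high_res(box, SCALE) for box in spots]
```

Each entry in `regions` is the pixel window one would request from a tile server for a single core. Real TMAs would additionally need grid fitting to handle missing or merged cores, which the tools listed above address.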

Connect's API "QAQC"

What is checked in each API call?

Do we all agree to reserve QAQC for what an API call checks, and Consistency Check for dictionary checks, pre-analytical data wrangling, or anything that calls an external service (in the old days this was sometimes described as "database enrichment")?

Discussion:

  • QAQC embeds Firestore (data ingestion, API level),

  • Consistency embeds BigQuery,

  • Vijay to check on the asynchronous "QC" dashboard: can it all be moved to an in-browser solution (as we had suggested for QAQC ...)?
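As a reference point for "what is checked in each API call", here is a minimal sketch of granular, ingestion-level QAQC: the body must parse as JSON, variable names are normalized to lower case, and critical variables must have values. It is not Connect's actual API; the variable names in `REQUIRED_VARIABLES` and the function name `api_qaqc` are hypothetical.

```python
import json

# Hypothetical critical variables; the real list would come from the API spec.
REQUIRED_VARIABLES = ("participant_id", "site")


def api_qaqc(raw_body):
    """Return (record, errors) for one submitted request body.

    record is None when the body cannot be used at all.
    """
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(payload, dict):
        return None, ["expected a JSON object"]

    # Fix case-sensitive variable names by normalizing keys to lower case.
    record = {key.lower(): value for key, value in payload.items()}

    # Critical variables must be present and non-empty.
    errors = [
        f"missing value for '{name}'"
        for name in REQUIRED_VARIABLES
        if record.get(name) in (None, "")
    ]
    return record, errors


ok_record, ok_errors = api_qaqc('{"Participant_ID": "p1", "SITE": "A"}')
bad_record, bad_errors = api_qaqc('{"Participant_ID": "p1", "SITE": ""}')
broken_record, broken_errors = api_qaqc('{not json')
```

Nothing here touches the data dictionary or an external service, which is the practical line between this kind of check and a Consistency Check.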

Nicole's list:

https://nih.app.box.com/file/818145170951 let's comb through it

Jonas Almeida @jonasalmeida Sep 01 16:30

@/all - Today we had a good technical discussion at the end of the data systems meeting. I thought Steven made an important point that we may want to clarify in our own descriptions of the data workflow: that we have been calling fastidious pre-analytical integrity checks “QA”. The defining distinction is very practical and is the source of their repeated questions about why we don’t make “QAQC” part of our data ingestion process, i.e. at the API level. Indeed, in most people’s minds QAQC at the API level is something much more granular, such as checking that critical variables have values, fixing case-sensitive variable names, or making sure the JSON structure can be parsed.

To avoid that confusion, I’d like to propose that we use “Pre-analytical Consistency Checks” (Consistency Checks for short) for running code that verifies consistency with the data dictionary. Since this is now safely decoupled from the API checks, it may also be a good place for true statistical checks, including generating summary statistics for batch submissions, flagging value distributions outside a reasonable range, etc., which may be of interest to the sites in their own right.
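A minimal sketch of what such a Consistency Check could look like, run on a batch after ingestion rather than inside the API call. The dictionary format, variable names, and ranges below are invented for illustration and do not reflect the actual Connect data dictionary.

```python
from statistics import mean

# Hypothetical data dictionary: variable name -> expected type and constraints.
DATA_DICTIONARY = {
    "age": {"type": (int, float), "min": 0, "max": 120},
    "smoker": {"type": str, "allowed": {"yes", "no"}},
}


def consistency_check(batch):
    """Return a list of (record_index, variable, problem) tuples."""
    issues = []
    for i, record in enumerate(batch):
        for name, value in record.items():
            rule = DATA_DICTIONARY.get(name)
            if rule is None:
                issues.append((i, name, "not in dictionary"))
            elif not isinstance(value, rule["type"]):
                issues.append((i, name, "wrong type"))
            elif "allowed" in rule and value not in rule["allowed"]:
                issues.append((i, name, "value not allowed"))
            elif "min" in rule and not rule["min"] <= value <= rule["max"]:
                issues.append((i, name, "out of range"))
    return issues


def summarize(batch, name):
    """Summary statistics for one numeric variable across a batch."""
    values = [r[name] for r in batch if isinstance(r.get(name), (int, float))]
    if not values:
        return {"n": 0}
    return {"n": len(values), "min": min(values),
            "max": max(values), "mean": mean(values)}


# Made-up batch submission for illustration only.
batch = [
    {"age": 40, "smoker": "no"},
    {"age": 150, "smoker": "maybe"},
    {"age": 50, "bmi": 22},
]
issues = consistency_check(batch)
age_summary = summarize(batch, "age")
```

The `issues` list and the per-variable summaries are the sort of output that could be returned to the sites for each batch submission.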


...