2020-07-15 JUL
Daniel, AI on cellular data
notes/questions from presentation:
Non-linear dimensionality reduction: UMAP and t-SNE (thank you Wendy!). Here's UMAP.js, put to good use in Observable.
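For reference, a minimal Python sketch of non-linear dimensionality reduction using scikit-learn's t-SNE. It runs on random data, not the cellular data from the talk, and the matrix size and perplexity are arbitrary assumptions chosen only so the example runs quickly.

```python
import numpy as np
from sklearn.manifold import TSNE

# Hypothetical input: 30 samples with 10 features each (random, for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 10))

# Project to 2 dimensions. Perplexity must be smaller than the number of samples.
tsne = TSNE(n_components=2, perplexity=5.0, random_state=0)
embedding = tsne.fit_transform(X)

print(embedding.shape)
```

Each row of `embedding` is the 2-D position of the corresponding input row, ready for a scatter plot.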
Daniel promised ref material about handling sparse data :-)
# Python function to convert a sparse matrix to a sparse tensor; it also
# separates the data into Train/Dev/Test.
#   sparseMatrix: your SciPy sparse matrix (CSR or CSC, not COO)
#   ds_list: a list that specifies, for each row of the matrix, whether it
#            is in the "Train", "Dev", or "Test" dataset
#   ds_type: one of "Train", "Dev", or "Test" [values in ds_list]
import numpy as np
import sklearn.preprocessing
import tensorflow as tf

def make_log_maxn_tensor(sparseMatrix, ds_list, ds_type):
    # filter the sparse matrix by row (note: COO matrices do not support indexing)
    indx = np.asarray(ds_list, dtype=object) == ds_type
    x0 = sparseMatrix[indx, :]
    del indx
    # log transform the stored (non-zero) values: log(1 + x)
    x0.data = np.log1p(x0.data)
    # normalize each row to its max; keep the return value, since the
    # in-place update is only guaranteed for CSR input
    x0 = sklearn.preprocessing.normalize(x0, norm="max", axis=1, copy=False)
    # convert the filtered matrix to COO format
    x0 = x0.tocoo()
    # (row, col) index pairs, shape [nnz, 2]; replaces the deprecated np.mat
    indices = np.stack([x0.row, x0.col], axis=1).astype(np.int64)
    # make a sparse tensor and re-order it into canonical row-major order
    x0 = tf.SparseTensor(indices, x0.data, x0.shape)
    x0 = tf.sparse.reorder(x0)
    return x0
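A minimal sketch of the filter, log, and max-normalize steps from the function above, applied to a toy matrix. It stops before the TensorFlow conversion so it needs only NumPy, SciPy, and scikit-learn. The toy counts, the split labels, and the helper name `log_maxn_rows` are hypothetical and only for illustration.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.preprocessing import normalize

# Hypothetical toy count matrix: 4 rows (e.g. cells) x 3 columns (e.g. genes).
counts = sp.csr_matrix(np.array([
    [0, 1, 3],
    [7, 0, 0],
    [0, 0, 0],
    [1, 1, 0],
], dtype=np.float64))

# Hypothetical split labels, one per row of the matrix.
ds_list = ["Train", "Dev", "Train", "Test"]


def log_maxn_rows(sparse_matrix, ds_list, ds_type):
    """Select the rows labelled ds_type, apply log1p, scale each row by its max."""
    mask = np.asarray(ds_list) == ds_type
    x = sparse_matrix[mask, :]      # row indexing returns a new matrix
    x.data = np.log1p(x.data)       # only stored (non-zero) values are touched
    return normalize(x, norm="max", axis=1)


train = log_maxn_rows(counts, ds_list, "Train")
print(train.toarray())
```

The first Train row becomes [0, 0.5, 1] because log(2)/log(4) = 0.5, and the all-zero row stays all zero. The original `counts` matrix is left unchanged because row indexing makes a copy.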
epiBox - final review by authors
https://docs.google.com/document/d/1O9geu57wnKAKG3CPPF5vfn_ku208pkuLPItVevhofnY/edit (original proposal)
https://nih.app.box.com/file/659496546706 (manuscript)
Plotly surgery, anyone?
Mortality Tracker - distill
Application Note goes out first: https://docs.google.com/document/d/1zcFAAd1SKrRbpxwAiQyKnViAHB17E-g47ZISvJm0uiw/edit
https://episphere.github.io/firstwave (manuscript)
Epidemiology Commons
https://docs.google.com/document/d/1Vq53jcJlCPF8lEv1IxJLDU4cThBVMeWk-lrIK4tQw_U/edit (manuscript)
Concept ID
https://docs.google.com/document/d/1-9tBQMs7SItPZ7WzlPH-hvNUUlrB5UnD4fcZymlNjMo/edit (rationale)
GitHub Apps help?
Quest
https://docs.google.com/document/d/1LKYaegz1C2tp5xZhw8VJmV6a6eyHdli_CD9fvv1A1No/edit (manuscript)
Using Novelty to select AI models and identify cohorts
https://docs.google.com/document/d/12QRQPjQcbpG9IAsLeidJbHjo_I4064HraxxEvCn0Ej0/edit
CDC Wonder (conception/evaluation stage)
https://wonder.cdc.gov (study) --> https://episphere.github.io/mortalitytracker/board/
discuss cross-filter binding: Erika, how far can we go in Plotly?
Computational Genomics interest group,
Wendy leading
Jonas, Jeya, Gus --> Thurs 9:00
NIH Data Science Showcase
did everybody get the scheduling?