2020-07-15 JUL

Daniel, AI on cellular data

notes/questions from presentation:

GTEx (API) vs(?) GXA

SRA

Non-linear dimensionality reduction: UMAP and t-SNE (thank you Wendy!). Here's UMAP.js, put to good use in Observable.

L1 and L2 regularization

Daniel promised ref material about handling sparse data :-)


Python function that converts a sparse matrix to a sparse tensor, keeping only the rows that belong to one of the Train/Dev/Test datasets


sparseMatrix -- is your sparse matrix (a SciPy format that supports row indexing, so not COO)

ds_list -- is a list with one entry per matrix row, naming the dataset ("Train", "Dev", or "Test") that the row belongs to

ds_type -- is the dataset to extract: "Train", "Dev", or "Test" [values in ds_list]


import numpy as np
import sklearn.preprocessing
import tensorflow as tf

def make_log_maxn_tensor(sparseMatrix, ds_list, ds_type):
    # filter the sparse matrix... (note: cannot be a COO formatted matrix)
    indx = np.asarray(ds_list) == ds_type
    x0 = sparseMatrix[indx, :]
    del indx

    # get the log transform, log(1 + x), of the stored values...
    x0.data = np.log1p(x0.data)

    # normalize each row to its max; keep the returned matrix, because the
    # in-place update (copy=False) is not guaranteed for every input
    x0 = sklearn.preprocessing.normalize(x0, norm="max", axis=1, copy=False)

    # convert the filtered matrix to COO format to get (row, col) index pairs
    x0 = x0.tocoo()
    indices = np.column_stack([x0.row, x0.col]).astype(np.int64)

    # make a sparse tensor and re-order it (row-major index order)...
    x0 = tf.SparseTensor(indices, x0.data, x0.shape)
    x0 = tf.sparse.reorder(x0)
    return x0
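
To sanity-check the function above on a small input, it may help to have a dependency-free reference for the same steps (row filter, log1p, divide each row by its max, then list the non-zero entries as index/value pairs). The sketch below is only an illustration: the helper name log_maxn_coo_reference and the toy data are hypothetical, it works on a dense list of lists rather than a SciPy matrix, and it assumes non-negative counts.

```python
import math

def log_maxn_coo_reference(rows, ds_list, ds_type):
    """Plain-Python reference for the filter / log1p / max-normalize steps.

    Hypothetical helper for checking small cases; assumes non-negative counts.
    rows    -- dense list of lists, one inner list per matrix row
    ds_list -- one label per row ("Train", "Dev", or "Test")
    ds_type -- the label whose rows are kept
    Returns (indices, values, shape), entries listed in row-major order.
    """
    # keep only the rows labelled ds_type
    kept = [row for row, label in zip(rows, ds_list) if label == ds_type]

    indices, values = [], []
    for i, row in enumerate(kept):
        logged = [math.log1p(v) for v in row]   # log(1 + x)
        row_max = max(logged, default=0.0)      # per-row maximum
        for j, v in enumerate(logged):
            if v != 0.0:                        # only non-zero entries are listed
                indices.append([i, j])
                values.append(v / row_max)

    n_cols = len(rows[0]) if rows else 0
    return indices, values, [len(kept), n_cols]

# toy data: three rows, two of them labelled "Train"
rows = [[1, 3, 0], [5, 0, 0], [0, 0, 7]]
labels = ["Train", "Dev", "Train"]
indices, values, shape = log_maxn_coo_reference(rows, labels, "Train")
print(indices, values, shape)
```

After the transform, the largest entry in every non-empty row is 1.0, which is a quick property to check against the tensor returned by make_log_maxn_tensor (for example via tf.sparse.to_dense).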




Batch normalization

epiBox - final review by authors

https://docs.google.com/document/d/1O9geu57wnKAKG3CPPF5vfn_ku208pkuLPItVevhofnY/edit (original proposal)

https://nih.app.box.com/file/659496546706 (manuscript)

Plotly surgery, anyone?

Mortality Tracker - distill

Application Note goes out first: https://docs.google.com/document/d/1zcFAAd1SKrRbpxwAiQyKnViAHB17E-g47ZISvJm0uiw/edit

https://episphere.github.io/firstwave (manuscript)

Epidemiology Commons -

https://docs.google.com/document/d/1Vq53jcJlCPF8lEv1IxJLDU4cThBVMeWk-lrIK4tQw_U/edit (manuscript)

Concept ID

https://docs.google.com/document/d/1-9tBQMs7SItPZ7WzlPH-hvNUUlrB5UnD4fcZymlNjMo/edit (rationale)

GitHub Apps help?

Quest

https://docs.google.com/document/d/1LKYaegz1C2tp5xZhw8VJmV6a6eyHdli_CD9fvv1A1No/edit (manuscript)

Using Novelty to select AI models and identify cohorts

https://docs.google.com/document/d/12QRQPjQcbpG9IAsLeidJbHjo_I4064HraxxEvCn0Ej0/edit

CDC Wonder (conception/evaluation stage)

https://wonder.cdc.gov (study) --> https://episphere.github.io/mortalitytracker/board/,

discuss cross-filter binding: Erika, how far can we go in Plotly?

Computational Genomics interest group,

Wendy leading

Jonas, Jeya, Gus --> Thurs 9:00

NIH Data Science Showcase

did everybody get the scheduling?