2021-11-10 NOV
Journal Club
Costless canonical transition spaces, from sequences to time series
Biological Sequences
Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids
A tutorial introduction to hidden Markov models and other probabilistic modelling approaches in computational sequence analysis.
Richard Durbin, Sean Eddy, Anders Krogh, and Graeme Mitchison.
Cambridge University Press, 1998.
ISBN 0-521-62041-4 (hardback)
http://eddylab.org/cupbook.html
Analysis of genomic sequences by Chaos Game Representation
Jonas S. Almeida, João A. Carriço, António Maretzek, Peter A. Noble, Madilyn Fletcher
Bioinformatics, Volume 17, Issue 5, May 2001, Pages 429–437,
https://doi.org/10.1093/bioinformatics/17.5.429
Sequence analysis by iterated maps, a review.
Jonas S. Almeida
Briefings in Bioinformatics. 2014 15(3):369-75 [PMID:24162172].
Time Series
- False neighbors and false strands: A reliable minimum embedding dimension algorithm
Matthew B. Kennel and Henry D. I. Abarbanel
Phys. Rev. E 66, 026209 – Published 23 August 2002
https://journals.aps.org/pre/abstract/10.1103/PhysRevE.66.026209
- Nearest neighbor embedding with different time delays
Sara P. Garcia and Jonas S. Almeida
Phys. Rev. E 71, 037204 – Published 29 March 2005
https://journals.aps.org/pre/abstract/10.1103/PhysRevE.71.037204
Succession
no, not the HBO series ... :-D
It's about things that happen one after another, in space or time, like DNA sequences and time series.
Canonical spaces
Standard way of presenting a mathematical object such that the similarity of two objects can be easily assessed by the similarity of their canonical forms.
Borrowing from Jeya and Lee's arguments for interactive interpretation: Jeya argues that the encoded space should satisfy the canonical expectation.
... opening the discussion, led by Jeya
Proposed definition (Jeya :D): a canonical space is a low-dimensional representation of the input feature space in which two perceptually similar objects are quantitatively similar, and two objects that are perceptually relatively different are also quantitatively relatively different.
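To make the definition concrete for sequences, here is a minimal sketch of the Chaos Game Representation from the Almeida et al. reference above. The corner assignment, the starting point, the helper names (`cgr`, `chebyshev`) and the toy sequences are all assumptions made for illustration, not taken from the papers. "Similar" is read narrowly here as "ending in the same symbols": each shared trailing symbol halves the gap between the two final points.

```python
# Minimal Chaos Game Representation (CGR) sketch for DNA.
# Assumption: this corner layout is one common choice, not the only one.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr(seq, start=(0.5, 0.5)):
    """Return the CGR coordinate reached after each symbol of seq."""
    x, y = start
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the symbol's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def chebyshev(p, q):
    """Largest coordinate-wise gap between two points."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


# Hypothetical toy sequences.
a = cgr("GATTACAGGT")[-1]
b = cgr("CCCCACAGGT")[-1]  # same last 6 symbols as the first sequence
c = cgr("GATTACAGGA")[-1]  # same first 9 symbols, different last symbol

print(chebyshev(a, b))  # at most 2**-6, because 6 trailing symbols are shared
print(chebyshev(a, c))  # 0.5, because the final symbols point at different corners
```

Whether suffix similarity is the "perceptual" similarity we care about is exactly the open question for the discussion; the sketch only shows that the map makes that particular similarity measurable as a distance.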
Canonical representations in different modalities:
1. Natural language
Word2vec: https://arxiv.org/abs/1301.3781
GloVe: https://nlp.stanford.edu/projects/glove/
2. Vision
Translation-, rotation-, viewpoint-, size-, and illumination-invariance.
Scale-invariance in object detection: https://arxiv.org/abs/1711.08189
AlexNet: https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf
Data Augmentation: https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0197-0
3. Dynamic systems
4. Genomics
Signatures of mutational processes in human cancer: https://www.nature.com/articles/nature12477
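The dynamic systems item in the list above has no link yet. One candidate, in the spirit of the two Time Series papers listed earlier, is a time-delay embedding. The sketch below is only an illustration under stated assumptions: the function name `delay_embed`, the toy sine signal, and the hand-picked dimension and delay are hypothetical. It uses a single fixed delay and does not implement the false-neighbors criterion or the different-time-delays scheme from those papers.

```python
import math


def delay_embed(series, dim, tau):
    """Map a scalar series to vectors (x[t], x[t+tau], ..., x[t+(dim-1)*tau])."""
    span = (dim - 1) * tau
    if dim < 1 or tau < 1 or span >= len(series):
        raise ValueError("series too short for this dim and tau")
    return [
        tuple(series[t + k * tau] for k in range(dim))
        for t in range(len(series) - span)
    ]


# Hypothetical toy signal: a sine wave sampled 40 times per period.
signal = [math.sin(2 * math.pi * t / 40) for t in range(200)]

# A delay of a quarter period pairs sin with cos, so the embedded
# points trace the unit circle.
vectors = delay_embed(signal, dim=2, tau=10)

print(len(vectors))                              # 190 embedded points
print(max(abs(math.hypot(*v) - 1.0) for v in vectors))  # close to 0
```

In the embedded space, two stretches of the signal at the same phase land on the same point of the circle regardless of when they occurred, which is the kind of property the proposed definition asks for.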
Loose references:
TF Projector: https://projector.tensorflow.org/
https://douglasduhaime.com/posts/visualizing-latent-spaces.html <-- canonical space to represent faces?
Variational AutoEncoders: https://www.jeremyjordan.me/variational-autoencoders/
https://www.compthree.com/blog/autoencoder/
Transformers: https://transformer.huggingface.co/
Hackathon
epiVerse
The statistician's perspective - let's take our time ...
(Nicole) - https://observablehq.com/@episphere/nicole
Observable Communication with Code
"Ideas are fragile, communicate with code and be rewarded with better examples for better code, etc." Full discussion at:
https://javascriptjabber.com/d3-and-data-visualization-ft-ian-johnson-jsj-507
Ian Johnson is a former Google UX engineer and data visualization engineer with ObservableHQ building data visualizations with JavaScript. He works on both the tools and the visualizations built with D3 on the web. He discusses how to use tools like D3 to tell a story using your data.
Advertising Praful: https://observablehq.com/@episphere/brazwebinar
Lung Cancer data
(Praful)
Recent conferences and summits
ML Community day: https://www.youtube.com/watch?v=-atRNuVuJFw
CRDC End-to-End Analysis: report at http://bit.ly/crdc-e2e (unFAIR? discuss, compare with Observable); integration: https://linkml.io
Chrome Dev Summit: https://developer.chrome.com/devsummit
GitHub Universe: https://www.githubuniverse.com - note Codespaces and Copilot. Note also how GitHub now describes itself as a Cloud ...