2021-11-10 NOV

Journal Club

Costless canonical transition spaces, from sequences to time series

Biological Sequences

  • Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids
    A tutorial introduction to hidden Markov models and other probabilistic modelling approaches in computational sequence analysis.
    Richard Durbin, Sean Eddy, Anders Krogh, and Graeme Mitchison.
    Cambridge University Press, 1998.
    ISBN 0-521-62041-4 (hardback)

    http://eddylab.org/cupbook.html

  • Analysis of genomic sequences by Chaos Game Representation
    Jonas S. Almeida, João A. Carriço, António Maretzek, Peter A. Noble, Madilyn Fletcher
    Bioinformatics, Volume 17, Issue 5, May 2001, Pages 429–437.

    https://doi.org/10.1093/bioinformatics/17.5.429

  • Sequence analysis by iterated maps, a review.
    Jonas S. Almeida
    Briefings in Bioinformatics, 2014, 15(3):369-75 [PMID:24162172].
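
For readers new to the Chaos Game Representation named in the references above, here is a minimal sketch of the basic iterated map for DNA. It is an illustration only, not the implementation from either paper: the corner assignment (A, C, G, T on the unit square), the starting point at the centre, and the function name cgr are assumptions made for this example.

```python
# Minimal Chaos Game Representation (CGR) sketch for DNA.
# Assumption: corners of the unit square are assigned as below; other
# assignments are equally valid and only change the orientation of the map.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr(sequence, start=(0.5, 0.5)):
    """Return one (x, y) point per base, moving halfway to that base's corner."""
    x, y = start
    points = []
    for base in sequence.upper():
        cx, cy = CORNERS[base]  # raises KeyError on symbols other than A, C, G, T
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


# Two sequences with different beginnings but the same 4-base suffix.
p1 = cgr("GGGACGT")[-1]
p2 = cgr("TTTACGT")[-1]
print(p1, p2)
```

Each step halves the distance between the two trajectories, so sequences that share a suffix of length k end within 2^-k of each other on each axis. That is the sense in which the map needs no training: proximity in the unit square reflects shared recent history.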

Time Series


Succession

no, not the HBO series ... :-D

It's about things that follow one another, in space or time, like the bases in DNA or the observations in a time series.

Canonical spaces

A canonical form is a standard way of presenting a mathematical object, such that the similarity of two objects can be easily assessed from the similarity of their canonical forms.

Borrowing from Jeya and Lee's arguments for interactive interpretation: Jeya argues that the encoded space should satisfy the canonical expectation.

... opening the discussion, led by Jeya


Proposed definition (Jeya :D): a canonical space is a low-dimensional representation of the input feature space in which two perceptually similar objects are quantitatively similar, and two perceptually different objects are, in relative terms, quantitatively different.
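
One way to read the proposed definition operationally is as an ordering check: in the encoded space, a pair judged similar should score higher than a pair judged different. The sketch below uses cosine similarity as the quantitative measure. The three-dimensional vectors are made up for illustration (they do not come from any trained model), and the choice of cosine similarity is an assumption, not part of the definition.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical encodings, invented for this example only.
encoded = {
    "cat": (0.9, 0.8, 0.1),
    "dog": (0.8, 0.9, 0.2),
    "car": (0.1, 0.2, 0.9),
}

similar_pair = cosine_similarity(encoded["cat"], encoded["dog"])
different_pair = cosine_similarity(encoded["cat"], encoded["car"])

# The canonical expectation: the perceptually similar pair scores higher.
print(similar_pair > different_pair)
```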


Canonical representations in different modalities:

1. Natural language

Word embeddings

Google ML Course: Embeddings

Word2vec: https://arxiv.org/abs/1301.3781

GloVe: https://nlp.stanford.edu/projects/glove/

2. Vision

Translation-, rotation-, viewpoint-, size-, and illumination-invariance.

Scale-invariance in object detection: https://arxiv.org/abs/1711.08189

AlexNet: https://proceedings.neurips.cc/paper/2012/file/c399862d3b9d6b76c8436e924a68c45b-Paper.pdf

Data Augmentation: https://journalofbigdata.springeropen.com/articles/10.1186/s40537-019-0197-0

3. Dynamical systems

Phase space

4. Genomics

Mutational signatures.

Signatures of mutational processes in human cancer: https://www.nature.com/articles/nature12477
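
For the phase space entry under item 3, a common way to obtain a phase-space-like representation from a single observed time series is time-delay embedding. The sketch below is a minimal illustration under assumptions: the function name delay_embed and the parameters dim and lag are chosen here for the example, and picking good values for them is a separate problem not addressed in these notes.

```python
def delay_embed(series, dim=2, lag=1):
    """Map a 1-D series to points (x[i], x[i+lag], ..., x[i+(dim-1)*lag])."""
    n_points = len(series) - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("series too short for this dim and lag")
    return [
        tuple(series[i + k * lag] for k in range(dim))
        for i in range(n_points)
    ]


# A short toy series; each embedded point pairs a value with its successor.
points = delay_embed([1, 2, 3, 4, 5], dim=2, lag=1)
print(points)
```

As with the iterated maps for sequences, the coordinates are computed directly from the succession of values, with no fitting step.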


Loose references:

TF Projector: https://projector.tensorflow.org/

https://douglasduhaime.com/posts/visualizing-latent-spaces.html <-- canonical space to represent faces?

Variational AutoEncoders: https://www.jeremyjordan.me/variational-autoencoders/

https://www.compthree.com/blog/autoencoder/

Transformers: https://transformer.huggingface.co/

Hackathon

epiVerse

The statistician's perspective - let's take our time ...

(Nicole) - https://observablehq.com/@episphere/nicole

Observable Communication with Code

"Ideas are fragile, communicate with code and be rewarded with better examples for better code", etc. Full discussion at:

https://javascriptjabber.com/d3-and-data-visualization-ft-ian-johnson-jsj-507
Ian Johnson is a former Google UX engineer and data visualization engineer with ObservableHQ building data visualizations with JavaScript. He works on both the tools and the visualizations built with D3 on the web. He discusses how to use tools like D3 to tell a story using your data.

Advertising Praful: https://observablehq.com/@episphere/brazwebinar

Lung Cancer data

(Praful)

Recent conferences and summits