You can learn a lot about someone by the books they read. Here's a taste of what I've been thinking about lately. You can also scroll further down if you're curious about research publications I've been reading.
Updated Apr. 27, 2021
The End of Alzheimer's: The First Program to Prevent and Reverse Cognitive Decline || Dale Bredesen (2017)
Alzheimer's Disease (AD) is the most prevalent form of dementia, affecting roughly 1 in 9 Americans over the age of 65. Due to demographic trends, a tsunami of AD cases is expected by 2050, with some 160 million people affected worldwide. The dominant theory of AD stipulates that the build-up of plaques composed of amyloid-beta protein leads to neurotoxicity and, eventually, to cognitive decline. This so-called amyloid hypothesis has served as the basis for AD drug development for decades. Despite the resounding success of these drug candidates in pre-clinical studies based on animal models, the much lower success rate of 0.4% in human clinical trials has dampened many researchers' enthusiasm for the amyloid hypothesis as the guiding paradigm for AD drug development.
This apparent failure of the amyloid hypothesis comes as little surprise to UCLA professor Dale Bredesen. Simply put, the build-up of amyloid plaques is not the causal driver of AD, but rather a by-product of the disease. The question naturally becomes: then what ARE the causal drivers of AD? In this book, Bredesen offers a useful categorisation for thinking about AD etiology: Type 1 "hot" AD characterised by excessive neuro-inflammation, Type 2 "cold" AD due to brain atrophy, and Type 3 "toxic" AD resulting from exposure to toxic substances (and even a Type 1.5 "glucotoxic" AD). The point is not to consider these types as mutually exclusive, but rather to associate genetic predisposition and lifestyle with typical patterns of AD pathology. Indeed, the unfolding of AD within a patient is so complex that no single drug represents a suitable remedy. In fact, Bredesen compares treating AD to patching a roof with up to 36 holes, each requiring its own patch. In other words, there are 36 known mechanisms which contribute to AD pathology, and a whole multi-faceted therapeutic program is required. This is where the ReCODE (Reversal of COgnitive DEcline) program steps in.
Bredesen proposes ReCODE as the first program to effectively reverse AD from a programmatic precision medicine paradigm. It involves identifying which of the 36 holes need to be patched for each person using laboratory results. These lab values, in combination with patient phenotype, then dictate the patient-specific ReCODE regimen. The treatment might involve anything from vitamin and hormonal supplements to a fasting-based ketoflex diet, cognitive training and sleep tracking (there are up to 36 holes in that roof, after all). Identifying potential exposure to toxins like mercury or mold is also crucial. This might all sound a bit complicated, but what's the bottom line? Of the over 200 subjects who've followed ReCODE, those who have been systematic in following the program and adjusting based on lab values have been able to reverse cognitive decline and biomarkers of AD. Reading the testimonials and case studies in the book makes one very hopeful!
I very much enjoyed reading this book as it provides a larger-view holistic perspective on the disease and its treatment. ReCODE does have the disadvantage of requiring active patient participation over at least the 3-month test period. This would obviously be well worth the effort considering the potential of actually reversing cognitive decline, which is the whole point in the first place. However, ReCODE may be difficult to implement in patients with late-stage cognitive decline who have difficulty understanding instructions, processing feedback and communicating uncertainties. Perhaps Bredesen's follow-up book (published 2020) can provide guidance for the most difficult cases.
Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again || Eric J. Topol (2019)
Physician Eric Topol explains the current state of medicine, where a lost physician-patient connection is costing not only lives, but also trust in modern medical practices altogether. However, despite artificial intelligence (AI) often being regarded as dehumanising, the author promotes AI in medicine as an avenue for relieving physicians and clinical staff of easily automatable tasks, freeing doctors to engage in more meaningful relationships with their patients. Topol provides insightful accounts of how different AI algorithms are demonstrating improved performance over current practices, and of the fields of medicine most impacted by advances in AI. I thought Chapter 12 neatly summarised the potential of AI in medicine with an example of how a user may interact with his/her own Virtual Medical Assistant. The exchanges between user and Assistant are eerie and, even better, possible in a not-so-distant future. A nice touch.
The Molecule of More: How a Single Chemical in Your Brain Drives Love, Sex, and Creativity -- and Will Determine the Fate of the Human Race || Daniel Z. Lieberman & Michael E. Long (2018)
Motivation is key to success. It might thus come as no surprise that our brain contains neural circuits underlying motivation and reward anticipation. In this book, dopamine is highlighted as a key ingredient, coursing through mesolimbic and mesocortical dopaminergic circuits. The so-called Molecule of More thus blesses us with a drive towards accomplishment, although its circuits are susceptible to being hijacked by substance abuse or thwarted in neuropsychiatric disorders. The book is a short and straightforward read which blends findings in neuroscience with prose-like descriptions of how dopamine drives our behaviour in everyday life.
Why We Sleep: Unlocking the Power of Sleep and Dreams || Matthew Walker (2017)
This was a must-read for me. The study of sleep is a prime example of the overlap between electrophysiology, human health and altered states of consciousness (yes, I've got my eyes on you, psychedelics). It was also a great continuation to The Brain from Inside Out (see below) as it addresses the role of different electrophysiological events (e.g. K-complexes, delta waves, sleep spindles, etc.) in promoting sleep. Beyond electrophysiology though, sleep is argued to be the most neglected pillar of health. Sleep neglect has such wide-reaching consequences that one (i.e. me) cannot help but feel a sense of "over-generalisability": every health problem seems to have a source in sleep neglect. Paradoxically, this left me less appreciative of the importance of putting in the required 7-9 hours of quality sleep. But Prof. Walker reminds you that that's the whole point. DON'T miss out on sleep, because it DOES have long-term effects on diabetes, hypertension, Alzheimer's disease and, of course, being tired. That's not to say that Walker shies away from specifics. One of my favourite segments from the book discusses the elevated concentrations of norepinephrine during sleep in PTSD patients. This neurotransmitter mitigates REM sleep's ability to promote recovery from traumatic experiences. Walker describes experiments using adrenergic antagonists to re-establish healthy sleep in PTSD patients by giving them back their much-needed REM sleep for emotional coping. As a result of reading this book, I've started limiting my exposure to blue light before bedtime; I'm sure anyone reading this book will find easy solutions to enhance their sleep quality and quantity.
The Brain from Inside Out || György Buzsáki (2019)
This one's for the brain-wave geeks and embodied cognition phenomenologists. Buzsáki invites readers to suspend their "outside-in" views of the brain. Our empiricist inclination to think of brain activity as responsive to external stimuli misses out on our actively predictive and outwards-oriented mental machinery. Sensory perceptions do not simply arise from exposure to perceptible objects -- as common sense would have it -- but rather our brain seeks out perceptions by way of coordinated motor commands. From the "inside-out" point of view, perception thus results from action and movement. This thesis, which comes in different flavours (the Fristonian Free Energy Principle variant being my favourite), does not originate from Buzsáki's work, however. The book's mind-shifting material comes from the set of neuroscientific observations which led the author to arrive at the "inside-out" perspective of the brain rather independently from other thinkers. The book is neuroscience-heavy regarding these observations, so I'll only mention one aspect of what makes the brain operate inside-out-edly (ouff...). Even a brain at rest is an active one. At all times, oscillations exhibit coordinated communication according to a neural syntax, a song and dance maintaining the brain in a state of self-organised spontaneous activity. The brain's inherent spontaneity keeps it in constant planning and motion, searching for cues from the external environment to incorporate into its self-pacing drive as an adaptive coping mechanism. Maybe that wasn't all too clear, but if you've had the slightest inkling as to what this might mean for how the brain works, then this book is for you.
Gene regulatory networks (GRNs) are dynamic, meaning that the interactions between the molecular elements of the network evolve over time. This review article explains some of the main trends for inferring these mechanisms from multi-omic data. Three general classes of approaches for reconstructing regulatory networks are described: correlation-based methods, regression-based methods and methods based on probabilistic graphical models. The latter approach often makes use of Hidden Markov Models (HMMs) for inferring graph structure and its evolution over time. Moreover, despite the dynamic nature of these regulatory networks, static information can be incorporated into the different models for enrichment.
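To make the first of these three classes concrete, here's a minimal sketch of correlation-based inference as I understand it: correlate gene expression profiles across samples and keep the strongest pairs as putative edges. The matrix sizes, gene names and threshold are my own placeholders, not from the review.

```python
import numpy as np

# Toy expression matrix: rows are samples, columns are genes.
rng = np.random.default_rng(0)
expression = rng.normal(size=(50, 6))          # 50 samples, 6 hypothetical genes
expression[:, 1] = expression[:, 0] + 0.3 * rng.normal(size=50)  # a co-regulated pair
gene_names = [f"gene_{i}" for i in range(6)]   # placeholder names

# Pairwise Pearson correlations between gene expression profiles.
corr = np.corrcoef(expression, rowvar=False)

# Threshold |r| to obtain a putative co-expression network; the cutoff is
# arbitrary here and would normally be set via permutation testing or FDR control.
adjacency = (np.abs(corr) > 0.5) & ~np.eye(len(gene_names), dtype=bool)

for i, j in zip(*np.nonzero(np.triu(adjacency))):
    print(f"{gene_names[i]} -- {gene_names[j]}  (r = {corr[i, j]:.2f})")
```

Regression-based and probabilistic graphical model approaches build on the same starting point but model directionality and temporal evolution, which plain correlation cannot.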
The underlying pathophysiology of Alzheimer's Disease (AD) usually develops up to decades in advance of clinical symptoms. Early diagnosis and stratification of mild cognitive impairment (MCI) and AD are thus essential for disease management. Although there are promising avenues in this respect using brain imaging, a more accessible and cost-effective means is warranted. Mobile and wearable devices represent one type of solution as these are already abundant and available as consumer products. Moreover, the ability to embed biosensing technologies provides physiological grounds for longitudinal monitoring of MCI and AD. In this article, the authors review the vast array of devices which can track behavioural, biological, executive and language-related variables. One factor which caught my particular attention was the set of modalities capable of tracking the integrity of the Autonomic Nervous System (ANS). This approach is grounded in the relation between AD and deficiencies in acetylcholine, a neurotransmitter directly involved in the regulation of the ANS. Deficiencies in acetylcholine lead to dysregulation of the ANS which can, for instance, compromise the interaction between the brain and heart mediated by the vagus nerve. A hallmark of this process is a reduction in Heart Rate Variability (HRV), which can be sensed with an electrocardiogram (ECG) or photoplethysmography (PPG).
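As a rough illustration of that last point, here's a small sketch computing two standard time-domain HRV measures from inter-beat intervals. The R-peak times are made up, and a real pipeline would first detect those peaks from the raw ECG or PPG waveform.

```python
import numpy as np

# Hypothetical R-peak times (seconds) extracted from an ECG or PPG trace.
r_peak_times = np.array([0.00, 0.81, 1.64, 2.43, 3.27, 4.05, 4.90, 5.68])

# R-R (inter-beat) intervals in milliseconds.
rr_ms = np.diff(r_peak_times) * 1000.0

# Two standard time-domain HRV measures:
sdnn = np.std(rr_ms, ddof=1)                  # overall variability
rmssd = np.sqrt(np.mean(np.diff(rr_ms)**2))   # beat-to-beat variability

print(f"SDNN:  {sdnn:.1f} ms")
print(f"RMSSD: {rmssd:.1f} ms")
# Reduced SDNN/RMSSD would be consistent with the blunted vagal (cholinergic)
# regulation of the heart that the review links to AD.
```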
A new classification strategy for Alzheimer's Disease is gaining popularity, based on the presence of Amyloid-beta (A), Tau proteins (T) and signs of neurodegeneration (N) - i.e. the AT(N) classification. The AT(N) classification thus leaves out vascular burden (V), a significant contributor to AD pathology. The authors propose a revision to the AT(N) classification - AT(N)(V) - and provide a short review of the evidence and brain imaging techniques for assessing brain vascular burden. The main hallmarks, such as white-matter hyperintensities, lacunes and microbleeds, are all detectable with MRI sequences and thus accessible to many research and clinical institutions.
Pathology related to Alzheimer's Disease (AD) may arise years before any clinical symptoms. It thus becomes vital to properly diagnose AD at its earliest stage based on biological markers. Although brain imaging techniques such as PET and MRI already exist to surveil the presence of Tau and Amyloid-beta, these modalities are quite expensive (and even invasive in the case of PET). Spinal taps are another invasive option to detect AD-related proteins in CSF. Fortunately, research from the last few years demonstrates the feasibility of detecting phosphorylated Tau species (pTau) and Amyloid-beta using blood tests. Blood tests for AD diagnosis bear the potential to diagnose AD years before the manifestation of clinical symptoms.
Variational Deep Embedding (VaDE) employs Variational auto-encoders (VAEs) with Mixture of Gaussians (MoG) priors to infer clusters from the learned embedding (see the following entry for a description of VaDE). VaDE was shown to be effective on MNIST images of hand-written digits, which are static, meaning that these images hold no information regarding temporal sequences (I mean, they're images after all). However, it is often the case that the data to be clustered is temporal in nature, such as videos or even longitudinal clinical measures. Hence, some representation of such sequential data needs to be captured in order to feed it to the VaDE algorithm. This is exactly what the authors of this article propose, using LSTM networks to learn this representation of sequential data. Owing to the ability of LSTMs to learn representations based on recurrence within the data, they name their technique Variational Deep Embedding with Recurrence - or VaDER. The authors demonstrate the merits of VaDER when applied to simulated and benchmark sequential data for validation purposes. They then use VaDER to learn clusters of clinical patient trajectories for Alzheimer's Disease (AD) and Parkinson's Disease (PD) based on longitudinal clinical assessment scores. As a result, the authors were able to suggest the stratification of AD and PD patients into three sub-categories, or clusters. The results are compelling, considering that the VaDER model was only trained on clinical scores and thus ignored other relevant disease biomarkers derived e.g. from brain imaging and spinal taps. Another innovation of theirs is to incorporate data imputation within model training, allowing missing data to be learned in parallel to the embedding clusters. I will definitely read more material from this group.
Variational auto-encoders (VAEs) are often leveraged to learn low-dimensional embeddings of data. Importantly, Bayesian priors are used to constrain these embeddings such that new embedding vectors can be sampled. Gaussian priors are the most common in VAEs and result in elegant closed-form solutions to the loss function, i.e. the evidence lower bound (ELBO) on the posterior distribution, which encapsulates a reconstruction loss and a latent loss. The latent loss tracks how (dis)similar the embedding space is to the prior and is thus responsible for shaping the embedding space. Moreover, since VAEs learn easy-to-sample embeddings, they can be used to generate new data. In this paper, the authors propose an efficient way of constraining VAEs such that they learn an embedding space which allows clustering of observable data. Specifically, a Mixture of Gaussians (MoG) prior is used such that each Gaussian distribution is associated with a cluster. This Variational Deep Embedding algorithm - or VaDE - is tested against MNIST data (a collection of images of hand-written digits) with good results. Thus, the rather simple innovation of replacing standard Gaussian priors with a MoG goes a long way, making it possible to cluster data using VAEs and even generate new data belonging to these learned clusters.
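To make the reconstruction-plus-latent decomposition of the ELBO concrete, here's a minimal PyTorch sketch of a vanilla VAE with the standard N(0, I) prior (layer sizes are arbitrary). This is the baseline that VaDE builds on, not VaDE itself; swapping in the MoG prior changes the latent term, which then no longer reduces to this simple closed form.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyVAE(nn.Module):
    """Minimal VAE with a standard Gaussian prior; VaDE's innovation is to
    replace this prior with a Mixture of Gaussians, one component per cluster."""
    def __init__(self, x_dim=784, z_dim=10):
        super().__init__()
        self.enc = nn.Linear(x_dim, 128)
        self.mu, self.logvar = nn.Linear(128, z_dim), nn.Linear(128, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, 128), nn.ReLU(),
                                 nn.Linear(128, x_dim))

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterisation
        return self.dec(z), mu, logvar

def elbo_loss(x, x_hat, mu, logvar):
    # Reconstruction term: how well the decoder reproduces the input.
    recon = F.binary_cross_entropy_with_logits(x_hat, x, reduction="sum")
    # Latent term: closed-form KL divergence to the N(0, I) prior;
    # this is the part that shapes the embedding space.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```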
Alzheimer's Disease (AD) involves a variety of biological factors such as brain accumulation of misfolded proteins, disruption of cerebrovasculature, altered brain metabolism and excessive neuro-inflammation; AD can thus be said to be multifactorial. Often implicit within a multifactorial view of AD is its corollary multiscale aspect, whereby some biological factors evolve over microscopic scales whereas others evolve over macroscopic scales - and anything in between. Biological factors interact throughout these spatial scales, the complexity of these interactions significantly contributing to the burden of AD in terms of pathology and treatment outcomes. The Neuroinformatics in Precision Medicine Lab (NeuroPM) had already published their Multifactorial Causal Modelling (MCM) framework for quantifying and predicting interactions between biological factors based mostly on brain imaging features. However, it is unfeasible to fully capture AD's multiscale nature with brain imaging alone, thus requiring an extension to MCM.
In this manuscript, MCM is extended by incorporating gene expression data (i.e. transcriptomics) as modulator parameters within MCM. This Gene Expression MCM (GE-MCM) variant thus represents an important step in closing the gap across spatial scales of AD pathology. It aims at describing which transcriptomic signatures drive inter-factor coupling from a longitudinal, patient-specific standpoint, with downstream effects on cognitive capabilities. Unearthing genes whose expression mediates multifactorial interactions then allows one to dig into the pathways associated with these genes. The main pathways found in this study were related to apoptosis, cholecystokinin receptor signalling, inflammation mediated by chemokines and cytokines, and gonadotropin-releasing hormone receptors. The authors provide insightful figures for linking gene expression to biological factors (themselves driving alterations in other biological factors, but, granted, that sounds a bit convoluted). It is also argued that their results are consistent with previous findings, lending credence to GE-MCM as a methodology which successfully "decodes the genetic mediators of spatiotemporal macroscopic brain alterations with aging and disease progression".
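To give a feel for what "gene expression as a modulator of inter-factor coupling" could look like, here's a deliberately toy numpy sketch of my own; it is not the authors' model or equations, and every number in it is made up.

```python
import numpy as np

# Toy illustration of the GE-MCM idea: each biological factor's rate of
# change depends on the other factors, with couplings scaled by hypothetical
# gene-expression values.
factors = np.array([0.2, 0.1, 0.05])       # e.g. amyloid, tau, neurodegeneration
A = np.array([[0.0, 0.0, 0.0],             # base inter-factor couplings
              [0.4, 0.0, 0.0],             # amyloid promotes tau
              [0.1, 0.5, 0.0]])            # both promote neurodegeneration
gene_expr = np.array([1.2, 0.8, 1.0])      # hypothetical modulators, one per factor

dt, steps = 0.1, 50
trajectory = [factors.copy()]
for _ in range(steps):
    # Expression of modulator i scales all influences onto factor i.
    coupling = A * gene_expr[:, None]
    factors = factors + dt * coupling @ factors
    trajectory.append(factors.copy())

print(np.array(trajectory)[-1])  # factor levels at the end of the toy run
```

In the actual GE-MCM framework, the coupling and modulator parameters are fitted per patient from longitudinal imaging and transcriptomic data rather than fixed by hand as above.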
The ability to sequence RNA at the single-cell level (scRNA-Seq) is redefining the precision medicine landscape. However, non-biological factors - or batch effects - can hinder our capacity to leverage scRNA-Seq for studying disease-related genetics. The authors propose single-cell GAN (scGAN) to learn useful embeddings of scRNA-Seq data while also controlling for batch effects. The scGAN model is composed of a variational autoencoder (VAE) network to learn such embeddings and a discriminator network which models batch effects using the embeddings as input. Embeddings can then be used for clustering, as shown in this study with mouse and human pancreatic cell datasets, as well as a Major Depressive Disorder (MDD) single-nucleus RNA-Seq dataset. The authors also suggest using scGAN as a pretraining routine before performing classification of disease phenotype. First, additional NN layers are introduced to fine-tune the pretrained weights of scGAN in a supervised fashion. The partial derivatives obtained from backpropagation are then used as indicators of which genes are most related to the classification performance of disease phenotype. The authors show this approach to be well-suited to the small MDD dataset (17 cases vs. 17 controls), out-performing other state-of-the-art methods. As with my previous reads, a great article for learning about molecular biology, genetics and ML/DL methods.
Neurodegenerative disorders such as Alzheimer's Disease (AD) are multifactorial and thus involve a complex diversity of genetic, molecular and cellular alterations. To resolve such complexity, single-cell RNA sequencing (scRNA-seq) provides a detailed window into the interplay between brain cell types, transcriptomics and AD-related phenotypes. This study demonstrates the sort of insight that scRNA-seq may deliver, such as understanding AD sex differences, the evolution from early to late pathology, and alterations in regulatory pathways such as myelination, inflammation, neuronal survival and proteostasis. In brief, the implications of scRNA-seq for AD are boundless. It is not surprising that this article has been cited over 400 times, despite only being published in mid-2019.
In biology, interpretability is key. It thus becomes crucial that the deployment of neural networks which capture non-linear relations within data be accompanied by a useful representation of the results. In this paper, an Embedded Topic Model is proposed for embedding single-cell transcriptomic data into separate groupings, or topics. How the genes expressed in a given cell are mixed across topics is addressed with a variational autoencoder, giving rise to a probabilistic model of gene and topic embeddings as well as the topic mixture for each cell. Practically, what this provides is an ability to aggregate combinations of genes as a function of phenotype, e.g. disease. I'll stop here, as I might need to re-read this one.
We typically think of messenger RNA (mRNA) as being produced within the nucleus and directly trafficked towards the ribosomes for protein production. In reality, mRNA can be shuttled towards other parts of the cytoplasm and even to the membrane to be transported out of the cell altogether. The central question that this study addresses is: can the subcellular localization of mRNA be predicted from its sequence information alone? Here, an end-to-end network is employed which incorporates convolutional layers, a bidirectional LSTM and an attention layer. The authors reveal some of the short sequences, called zipcodes, thought to mediate this localization through interactions with RNA binding proteins. A good read for those seeking to learn more about molecular biology and Machine Learning.
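For the curious, here's a rough PyTorch sketch of that kind of CNN-to-BiLSTM-to-attention pipeline. The layer sizes, kernel width and the four localization classes are my own placeholders, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class LocalizationNet(nn.Module):
    """Sketch of a CNN -> BiLSTM -> attention pipeline for sequence-based
    localization prediction (dimensions are illustrative)."""
    def __init__(self, n_classes=4, hidden=64):
        super().__init__()
        # Input: one-hot encoded RNA sequence, shape (batch, 4, length).
        self.conv = nn.Conv1d(4, 32, kernel_size=9, padding=4)
        self.lstm = nn.LSTM(32, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)   # scores each sequence position
        self.out = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):
        h = torch.relu(self.conv(x)).transpose(1, 2)   # (batch, length, 32)
        h, _ = self.lstm(h)                            # (batch, length, 2*hidden)
        # Attention pooling: weight positions (candidate "zipcodes" should
        # receive high weights) and sum into a single sequence embedding.
        w = torch.softmax(self.attn(h), dim=1)
        pooled = (w * h).sum(dim=1)
        return self.out(pooled)

# Example: batch of 2 random one-hot sequences of length 100.
seqs = torch.eye(4)[torch.randint(0, 4, (2, 100))].transpose(1, 2)
logits = LocalizationNet()(seqs)  # (2, n_classes) localization scores
```

Inspecting the learned attention weights is one simple way to see which subsequences the model considers most informative for localization.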
Graph neural representational learning of RNA secondary structures for predicting RNA-protein interactions || Zichao Yan, William L Hamilton and Mathieu Blanchette, Bioinformatics, Volume 36, Issue Supplement_1, July 2020, Pages i276–i284
As I'm currently trying to learn more about graph representation learning, DL and molecular biology and have yet to adequately master these fields, feel free to completely disregard my attempt at summarising this study. Wish me luck.
Nucleotides which compose a given RNA molecule exhibit a variety of interactions. These pushing and pulling forces cause RNA to bend into interesting shapes, resulting in RNA's secondary structure. The secondary structure of RNA is important since it dictates how RNA interacts with RNA binding proteins (RBPs). As the first sentence of the article's Introduction states: "RNA-protein interactions mediate all post-transcriptional regulatory events, including RNA splicing, capping, nuclear export, degradation, subcellular localisation and translation (Stefl et al., 2005)". So yeah, RNA-protein interactions and the underlying RNA secondary structures are a big deal. In this paper, Graph Neural Networks (GNNs) are used to create node embeddings of nucleotides within a larger graph embedding of the RNA molecule. The graph representation is learned via message passing, combining a convolution operator to capture information spreading along local covalent bonds with an LSTM to update the graph representation with recurrence. The final embeddings of the graph representation are then pooled using another LSTM network to obtain a global representation of the whole RNA molecule. In the process (somewhere), integrated gradients are used to assign attribution scores that emphasise the motifs contributing most to protein binding.
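Here's a bare-bones PyTorch sketch of one round of message passing over a toy RNA graph, just to illustrate the flavour of neighbourhood aggregation. It's my own simplification (dense adjacency, naive mean pooling), not the authors' convolution-plus-LSTM architecture.

```python
import torch
import torch.nn as nn

class MessagePassingLayer(nn.Module):
    """One round of neighbourhood aggregation over an RNA graph whose edges
    represent covalent (backbone) bonds and base pairs."""
    def __init__(self, dim):
        super().__init__()
        self.lin_self = nn.Linear(dim, dim)
        self.lin_neigh = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # h: (n_nodes, dim) nucleotide embeddings; adj: (n_nodes, n_nodes).
        messages = adj @ self.lin_neigh(h)      # sum messages from neighbours
        return torch.relu(self.lin_self(h) + messages)

# Toy RNA "graph": 5 nucleotides, backbone chain plus one base pair (0-4).
adj = torch.zeros(5, 5)
for i in range(4):                  # covalent backbone edges
    adj[i, i + 1] = adj[i + 1, i] = 1.0
adj[0, 4] = adj[4, 0] = 1.0         # a base-pairing edge from secondary structure

h = torch.randn(5, 16)              # initial nucleotide features
layer = MessagePassingLayer(16)
h = layer(layer(h, adj), adj)       # two rounds of message passing
graph_embedding = h.mean(dim=0)     # naive pooling; the paper uses an LSTM instead
```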
Hopefully, some of the fancy concepts I've just put out without any real explanation are nonetheless an enticement to take a look at this paper. I'll end by saying that Figure 5 lays out the representative sequences and secondary structure motifs for four RBPs when using GNNs. If you have the visualisation skills and graphical design capabilities to render the secondary structures shown in Figure 5, I'd be curious to see how those look.
Multiway Graph Signal Processing on Tensors: Integrative analysis of irregular geometries || J. S. Stanley, E. C. Chi and G. Mishne, IEEE Signal Processing Magazine, vol. 37, no. 6, pp. 160-173, Nov. 2020
I had this one waiting in my endless should-read list of research publications. This work provides avenues for extending graph signal processing to multiway datasets, which are often formatted as tensors. For me, the main insight was the Kronecker product as a way to "chain" linear operators together into tensor spaces (hopefully I got that right). The resulting multilinear transforms can then be applied to tensor-formatted data to process its different factors, modes or ways, e.g. for data compression. This approach is especially useful when viewing the data at hand as a product graph, which is basically a description of how the graphs describing the irregular geometries of individual factors can be embedded into one another. Kinda like a graph representation of Russian matryoshka dolls, where the graph from one factor is nested within the nodes of another. The paper also expands on using tensor representation-based penalty terms for optimisation. Hard to say if I enjoyed reading this one, although I did pick up on some new concepts previously unknown to me.
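Here's a tiny numpy sketch of the Kronecker trick as I understood it: the Laplacian of a Cartesian product graph is the Kronecker sum of the factor Laplacians, and applying it to vectorised data matches applying each factor's operator along its own mode. The graphs and data are made up for illustration.

```python
import numpy as np

# Two small graph Laplacians: one for the rows (a 3-node path graph) and
# one for the columns (a 2-node graph) of a 3x2 data matrix.
L_row = np.array([[ 1, -1,  0],
                  [-1,  2, -1],
                  [ 0, -1,  1]], dtype=float)
L_col = np.array([[ 1, -1],
                  [-1,  1]], dtype=float)

# Kronecker sum (Cartesian product graph): one Laplacian acting on the
# whole tensor space at once.
L_product = np.kron(L_row, np.eye(2)) + np.kron(np.eye(3), L_col)

# Applying the product-graph operator to vectorised data...
X = np.arange(6, dtype=float).reshape(3, 2)
y_flat = L_product @ X.flatten()

# ...is equivalent to applying each factor's operator along its own mode.
y_modes = L_row @ X + X @ L_col.T
assert np.allclose(y_flat, y_modes.flatten())
```

The matryoshka-doll picture corresponds to the Kronecker terms above: each factor's graph acts within the nodes of the other.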