"Even in dreaming, man shapes reality."
Sigmund Freud
Recently, I had a great chance to attend a lecture by Prof. Michal Irani from the Weizmann Institute of Science. She is a prominent researcher in computer vision and computer science, and her recent research aims to understand the human brain via machine learning techniques. She gave a talk about how to recover original images from fMRI signals measured from the human brain, including recent progress and interesting future topics. In particular, the main technical challenges are that (i) brain signals differ from person to person, and (ii) only scarce data is available for training a machine learning model; she addressed these obstacles with techniques motivated by self-supervised learning and a clever use of foundation models. She has an ambitious plan to construct a database of human fMRI signals, aiming to extract common features from fMRI signals measured from different people. After listening to her lecture, a question occurred to me: "If we can convert a brain signal to the corresponding image, then can we catch some signals from our imagination and visualize them through your algorithm?" Although she did not provide a full answer, she found the question interesting, and her research team is preparing to investigate this direction. In this essay, we discuss how to tackle my question and several of its interesting consequences.
We can think of what Prof. Irani did as solving an inverse problem, a classical yet important problem in signal processing. The key idea is simple: our brain is a black-box oracle that takes an image as input and returns a brain signal, e.g., an fMRI signal. For example, if a person in an MRI machine sees an apple, then we can capture a subtle change in the brain signal associated with seeing an image of an apple. Of course, this argument implicitly assumes that the human brain has only a single static state; we already know this is hardly true, because our mind and memory change over time.
Nevertheless, such a viewpoint motivates us to design the following somewhat insane and ethically problematic experiment: let me strap a person inside an MRI machine. We will keep his eyes open throughout the whole experiment with the help of some artificial tears. We also give him a special drug that blocks all his feelings, e.g., happiness, anger, or fright. This prevents additional noise during the experiment. Now, this person will see every single image in the ImageNet dataset, which consists of more than 10M images, and we will measure the fMRI signal for each image. Since we now have all the image-signal pairs corresponding to the ImageNet dataset, we can approximate the black-box oracle inside his brain by some function fitted to those pairs. Furthermore, as in NbNet (TPAMI 2018) or Vec2Face (CVPR 2020), we can find an (approximate) inverse of such a black-box oracle. Hence, with this approach, we can sidestep the data scarcity issue faced by Prof. Irani! Of course, proposing this experiment would result in a strong rejection from the IRB (Institutional Review Board), and we would end up being marked as *mad scientists*...
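To make the "fit the oracle, then invert it" idea concrete, here is a minimal toy sketch. Everything in it is an assumption made for illustration: the "brain" is a hypothetical linear map from a 2-pixel "image" to a 3-voxel "signal" with a little Gaussian noise, and the names `oracle`, `fit_forward`, and `decode` are invented here. It is not Prof. Irani's method, nor what NbNet or Vec2Face do; real fMRI decoding is far from linear.

```python
import random

def inv2(m):
    """Invert a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))

def gram(rows):
    """Return R^T R for a list of 2-component rows."""
    return ((sum(r[0] * r[0] for r in rows), sum(r[0] * r[1] for r in rows)),
            (sum(r[1] * r[0] for r in rows), sum(r[1] * r[1] for r in rows)))

def oracle(image, rng, noise=0.05):
    """Hypothetical 'brain': fixed linear map (assumed) plus Gaussian noise."""
    A = ((1.0, 0.5), (-0.3, 2.0), (0.8, -1.0))
    return [a0 * image[0] + a1 * image[1] + rng.gauss(0.0, noise) for a0, a1 in A]

def fit_forward(images, signals):
    """Least-squares estimate of the 3x2 forward matrix from (image, signal) pairs."""
    g = inv2(gram(images))  # (X^T X)^{-1}
    A_hat = []
    for k in range(3):  # one row of the matrix per voxel
        b0 = sum(x[0] * s[k] for x, s in zip(images, signals))
        b1 = sum(x[1] * s[k] for x, s in zip(images, signals))
        A_hat.append((g[0][0] * b0 + g[0][1] * b1, g[1][0] * b0 + g[1][1] * b1))
    return A_hat

def decode(A_hat, signal):
    """Approximate inverse: the image x minimizing ||A_hat x - signal||."""
    g = inv2(gram(A_hat))  # (A^T A)^{-1}
    b0 = sum(r[0] * s for r, s in zip(A_hat, signal))
    b1 = sum(r[1] * s for r, s in zip(A_hat, signal))
    return (g[0][0] * b0 + g[0][1] * b1, g[1][0] * b0 + g[1][1] * b1)

rng = random.Random(0)
# "Show every image": collect many (image, signal) pairs from the oracle.
train_images = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(2000)]
train_signals = [oracle(x, rng) for x in train_images]
A_hat = fit_forward(train_images, train_signals)

# Decode a signal produced by an image that was never shown during fitting.
unseen = (0.3, -0.7)
recovered = decode(A_hat, oracle(unseen, rng))
error = max(abs(recovered[0] - unseen[0]), abs(recovered[1] - unseen[1]))
print(recovered, error)
```

With thousands of pairs the fitted map is close to the true one, and the decoded image lands near the unseen input, up to the measurement noise. The point of the toy is only that abundant paired data turns the inverse problem into ordinary regression, which is exactly what the (unethical) experiment would buy us.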
Unfortunately, alongside the ethical issues, several technical issues remain in this approach. First, the inverse problem becomes extremely hard if there is measurement noise. Even when the black-box oracle is in fact a linear function with simple Gaussian noise, we cannot recover it exactly from finitely many measurements, and some constrained variants of the problem (e.g., recovering a sparse solution) are known to be NP-hard. Of course, we can find an "approximate" solution, but as Prof. Irani said, fMRI signals themselves contain a huge amount of noise, which was one of the technical challenges in her work. In addition, if we now assume that our brain is indeed a "state machine", a point we have ignored so far, then the problem becomes different. Every animal, including humans, is sensitive to environmental changes because they affect its survival. And humans are capable of catching such "changes" because they can remember. Hence, we can expect that even for a single person, the order in which the images are shown will change the entire set of fMRI signals. Of course, as in the movie Memento, if this person loses his memory each time he sees an image, or if we can put him in that state *physically*, then we no longer need to treat the brain as a "state machine"; in this case, we would get an even stronger rejection from the IRB. Maybe this is the reason why Prof. Irani has focused on discovering common features in fMRI signals from multiple people rather than on such an unethical experiment.
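The order dependence can be illustrated with an equally small toy. The sketch below assumes a hypothetical scalar "brain" whose response mixes the current image with a decaying trace of the images seen before; the class `StatefulOracle`, its `memory` parameter, and the numeric "images" are all invented for illustration and are not a model of real neural activity.

```python
class StatefulOracle:
    """Hypothetical 'brain' with memory: the response depends on past images."""

    def __init__(self, memory=0.5):
        self.memory = memory  # how strongly the past leaks into the present (assumed)
        self.trace = 0.0      # decaying summary of everything seen so far

    def respond(self, image):
        signal = image + self.memory * self.trace
        self.trace = self.memory * self.trace + image  # update the memory trace
        return signal

def record(order):
    """Show the images in the given order to a fresh oracle and log each signal."""
    oracle = StatefulOracle()
    return {name: oracle.respond(value) for name, value in order}

images = [("apple", 1.0), ("banana", 2.0), ("unicorn", 3.0)]
forward = record(images)
backward = record(list(reversed(images)))

print(forward)   # {'apple': 1.0, 'banana': 2.5, 'unicorn': 4.25}
print(backward)  # {'unicorn': 3.0, 'banana': 3.5, 'apple': 2.75}
```

The same apple produces a signal of 1.0 when shown first and 2.75 when shown last, so a table of image-signal pairs collected in one order does not describe the oracle in another order. Setting `memory=0` recovers the stateless, Memento-like case, where the order no longer matters.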
Of course, solving an inverse problem for a black-box "state machine" is also an interesting topic. One naive idea is to branch over the timeline; an algorithm based on this approach would terminate within a finite time. Since this is not the major focus of our discussion, let us just assume that there is an efficient algorithm to do so. In this case, what can we do, or what should we do? For example, we can upload our mind to the digital world in the form of the function returned by our *fancy* algorithm. Or, for a somewhat criminal mind, we can hijack fMRI signals from authorities and try to recover confidential information. Such questions are also interesting, but we will focus on the very first question: what if we can recover what we imagine? If the algorithm is accurate and efficient enough, then we can represent our mind in a visual form, from a snapshot to a video. That is, not only the objects in our mind, whether concrete ones such as an apple, a banana, or a masterpiece, or abstract ones such as money, power, or humor, but also our subjective feelings, e.g., joy, beauty, or fright, can be represented in an objective, visually perceivable form. This functionality is quite powerful and intriguing. For example, today I can only describe the object in my mind in words, e.g., a unicorn (a white horse-like creature with a horn on its head); once that description is delivered and reconstructed by others, the resulting images differ from each other and are highly affected by prior knowledge. Bronies may think of a horse similar to one of the characters in My Little Pony, and someone who has never seen a horse before would imagine something completely different from what I had in mind. With our algorithm, I could instead show the image itself.
Analyzing the human mind has long been a fascinating topic for humanity. Dr. Freud, who appeared at the top of this essay, proposed a systematic method to analyze the human mind through dreams, viewing dreams as a window into the unconscious part of ourselves. The dream has been considered an unexplored and mysterious realm of our lives and has been a popular subject in the arts, from Shakespeare's "A Midsummer Night's Dream" and Salvador Dali's "The Persistence of Memory" to modern media such as Inception (2010), Arrival (2016), or Rick and Morty (2013). How can our *fancy* algorithm contribute here? Since we visually perceive objects inside a dream, we can expect to capture signals similar to those measured by fMRI while awake; hence, we can expect that our algorithm is applicable! Moreover, if we have powerful enough computers, then we can make a snapshot or even a QHD 144 FPS video of our dream. This will excite many researchers and lucid dreamers; perhaps we can study, from an objective viewpoint, how lucid dreamers wake up inside their dreams.
From this perspective, we design the following (thought) experiment. Disclaimer: This is too unethical to conduct in reality; the IRB would probably suggest mental care! Let us assume that there is a person in a deep sleep. We connect our fancy machine, running our fancy algorithm, to visualize his dream. Since he is in a deep sleep, he cannot perceive any outside stimulation. Now, we have an appliance that gives him a quick death, e.g., a sharp guillotine to cut his head off or a high-dose sleeping pill. Then, what does he feel inside his dream? Does he fall into "Limbo" as in the movie Inception, or will his world become dark all of a sudden? From an outsider's viewpoint, we will observe the latter: he will die at an absolute time stamp, and a (biologically) dead man cannot dream.
How about the insider's viewpoint? In our setting, the insider cannot perceive "death". However, at an absolute time stamp, he will definitely die. From this irony, we can make a hypothesis: the insider feels that time freezes, and he will be stuck in the dream forever. This resembles throwing the insider into a black hole while the outsider tries to observe him. The insider and the outsider of the dream perceive time differently; all of us have experienced this before, and it is a cliche in literature about dreams. Here, we distinguish between what the dreamer perceives in the dream and how his body responds to outside stimulation. In addition, some studies have reported that right before we face death, our brain releases large amounts of neurotransmitters such as dopamine. This results in intense signals in our brain, and it is considered one of the reasons people experience their life flashing before their eyes. If the insider were a lucid dreamer, then I feel sorry for him... From the outsider's viewpoint, our machine will show us an extremely fast video and stop once it can no longer receive a signal. One frustrating thing is that we still cannot verify such a hypothesis; we have neither a volunteer nor such a fancy machine now. Nevertheless, as Dr. Freud said, if our dreams represent our inner selves, then we could use our (thought) experiment as a medium to understand death and related phenomena.
Someday, we may be able to demystify our mind, dreams, and death. We believe that the above discussion can broaden our understanding and open new directions. Moreover, with the advancement of technology and devotion like Prof. Irani's, I personally look forward to conducting these experiments without IRB issues :)