Kurzgesagt – In a Nutshell 

Sources – Impossible Machines


Thanks to our expert —

Georgia State University

- In terms of numbers, they’re mostly filled up with water molecules - the grains of sand. Water gives a cell’s insides the consistency of soft jelly and enables other things to move around easily. 


We have to simplify a lot here, so we use the mass fraction of water in a female body (about 50%; sources for the fractions of water, fat, etc. in female and male bodies are given further below). 


If a cell weighs 1 nanogram (ng), then it contains 0.5 ng of water. So first you have to calculate how many molecules are in one gram of water (with the help of the Avogadro constant, 6.023 x 10^23) and then scale it down to nanograms. 


18 grams of H2O contain 6.023 x 10^23 molecules. We use 18 because it is the molecular weight of water.

That means that 1 gram contains (6.023 x 10^23) / 18 = 3.346111 x 10^22 molecules.

1 nanogram is one billionth of a gram, so it contains (3.346111 x 10^22) / (1 x 10^9) = 3.346111 x 10^13 molecules, and 0.5 ng contains (3.346111 x 10^13) / 2 = 1.673056 x 10^13 = 16,730,560,000,000 molecules.
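The arithmetic above can be checked with a few lines of Python. This is a minimal sketch under the same assumptions as the text (a 1 ng cell, 50% water by mass, a molar mass of 18 g/mol). The function and constant names are illustrative only, and the sketch uses the exact Avogadro constant 6.02214076 x 10^23 instead of the rounded 6.023 x 10^23, so the result differs slightly in the fourth significant digit.

```python
AVOGADRO = 6.02214076e23      # molecules per mole (exact SI value)
WATER_MOLAR_MASS_G = 18.0     # grams per mole, rounded as in the text

def water_molecules_per_cell(cell_mass_ng=1.0, water_fraction=0.5):
    """Estimate the number of water molecules in a cell of the given mass."""
    water_mass_g = cell_mass_ng * 1e-9 * water_fraction  # nanograms -> grams
    moles = water_mass_g / WATER_MOLAR_MASS_G
    return moles * AVOGADRO

# Roughly 1.67 x 10^13, i.e. about 17 trillion water molecules
print(f"{water_molecules_per_cell():.3e}")
```

Changing `water_fraction` to 0.6 would give the corresponding estimate for the roughly 60% water content quoted for male bodies further below.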


#Bianconi E, et al. (2013): An estimation of the number of cells in the human body. Annals of human biology, Vol. 40 (6)

https://pubmed.ncbi.nlm.nih.gov/23829164/ 

Quote: “Therefore, since the mean weight of a mammalian cell has been estimated to be 1 ng (Makarieva et al., 2008), for a standard body weight of 70 kg (Irving, 2007), there would be 7x10^13 cells.”


#Encyclopedia Britannica (2022): Avogadro’s number

https://www.britannica.com/science/Avogadros-number 

Quote: “Avogadro’s number, number of units in one mole of any substance (defined as its molecular weight in grams), equal to 6.02214076 × 10^23.”



- Almost all the other things, the rice and fruit, are proteins. Several billion in total, more than 10,000 different kinds – depending on the function of the cell. Your cells are basically protein robots, as is all life really.


#Milo, R. (2013): What is the total number of protein molecules per cell volume? A call to rethink some published values. BioEssays : news and reviews in molecular, cellular and developmental biology, Vol. 35 (12)

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3910158/ 

Quote: “We can now use characteristic volumes to reach the number of proteins per cell. For an E. coli cell of 1 μm^3 volume (average values often vary between 0.5 and 2 μm^3 depending on growth rate and conditions), the estimates give a range of 3–4 million proteins per cell. For a haploid budding yeast cell of characteristic volume 40 μm^3, the two estimates give a range of 100–150 million proteins per cell. Applying the estimated protein densities to mammalian cell volumes yields a value of about 10^10 proteins per cell for cell lines with characteristic volumes of 2,000–4,000 μm^3. Yet because cell volume can change several fold under different growth conditions, it is usually much more accurate to use values per unit volume rather than a total protein count per cell.”


#Beck, M. et al. (2011): The quantitative proteome of a human cell line. Molecular systems biology, Vol. 7 (549)

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3261713/ 

Quote: “From the identified peptides, we inferred 10 006 proteins (Supplementary Table S1, raw data available at https://proteomecommons.org), which is to our knowledge the by far most comprehensive proteome map of a mammalian cell line, with earlier studies reaching, e.g. 5399 proteins in U2OS (Lundberg et al, 2010) and 2859 proteins in HeLa cells (Wisniewski et al, 2009).

(...)

In this study, we determined the so far most extensively measured human cell proteome. We identified >10 000 proteins expressed in the commonly used human tissue culture cell line U2OS and demonstrate that protein discovery has reached saturation under the experimental conditions used, i.e., that further measurements of the same type would not be expected to identify additional proteins.”



- In fact all solid, nonfat parts of your body are mostly made out of protein – even your bones.


The percentage of protein in the total mass (BM = body mass) is 14 to 16%. The percentage of fat can vary greatly, from 5% in athletes to 50% in severely overweight people. The largest proportion is water, at about 60%.


#Duda, K. et al. (2019): Chapter 1- Human Body Composition and Muscle Mass. In: Muscle and Exercise Physiology

https://www.sciencedirect.com/science/article/pii/B9780128145937000013?via%3Dihub 


Quote: “Total body protein (TBPro) accounts for about 14%-16% of BM, that is, ~11 kg in men and 9 kg in women.

(...)

Body fat is one of the most changeable elements of body composition. It can account for 7%-10% of BM in well-trained endurance athletes and in some extremely well-trained marathon runners can account for less than 5% of BM (Costill, 1986; Noakes, 2003). On the other hand in case of pathological obesity, body fat can constitute up to 50% of BM (Alemán et al., 2017).

(...)

At the chemical level, the two largest compartments of the system are water (approximately 60% of BM) and anhydrous fat (20%-30% of BM). Mean values of TBW have been reported to range from 38 to 50 L in men (~60% of BM), whereas in women, it is between 26 and 40 L (~50% of BM), (Chumlea et al., 2001). Women and elderly individuals have less body water, due to greater adiposity and lower muscle mass. TBW decreases with age. For instance, in individuals around 60 years of age, it comprises 55% of BM in case of males, and 45% in females.”



- This is how this language works in a nutshell: It all begins with amino acids, tiny organic molecules. They’re the alphabet of the language of life. There are 21 different ones, like different letters. Amino acid a, amino acid b, c and so on.


We are talking about the amino acids that are found in the human body. There are actually 22, but the 22nd amino acid pyrrolysine has so far only been found in bacteria and archaea. 


The 21st amino acid, selenocysteine, is a special case. It doesn’t have its own dedicated DNA code. Instead, it is encoded via a detour: by UGA, which normally acts as a stop codon. Stop codons ensure that the production of a protein from the “copy of the DNA” (mRNA) stops at a certain point.


#Turanov, A. A. et al. (2011): Biosynthesis of Selenocysteine, the 21st Amino Acid in the Genetic Code, and a Novel Pathway for Cysteine Biosynthesis. Advances in Nutrition, Vol. 2 (2)

https://academic.oup.com/advances/article/2/2/122/4644532 

Quote: “Selenocysteine (Sec) is the 21st amino acid in the genetic code and this selenium containing amino acid is cotranslationally incorporated into selenium-containing proteins, designated selenoproteins, in response to the codon, UGA (1–3). Although UGA is normally a termination codon that dictates the cessation of protein synthesis, it is also used as a Sec codon by numerous organisms in each of the 3 domains of life: eubacteria, archaea, and eukaryotes."



- If you put around 50 amino acids together, they form a protein, which in the language of life is a word. 


There is no strict lower limit on the number of amino acids above which a chain is called a "protein". However, a guideline of about 50 amino acids has become established. Chains shorter than that are called peptides. 


#Alberts, B. et al. (2002): The Shape and Structure of Proteins. Chapter 3. Proteins. Molecular Biology of the Cell, 4th edition

https://www.ncbi.nlm.nih.gov/books/NBK26830/#:~:text=Proteins%20come%20in%20a%20wide,and%202000%20amino%20acids%20long 

Quote: “Proteins come in a wide variety of shapes, and they are generally between 50 and 2000 amino acids long.”


#University of Queensland (2017): Explainer: Peptides vs proteins - what's the difference?

https://imb.uq.edu.au/article/2017/11/explainer-peptides-vs-proteins-whats-difference#:~:text=Both%20peptides%20and%20proteins%20are,of%20amino%20acids%20than%20proteins 

Quote: “Both peptides and proteins are made up of strings of the body’s basic building blocks – amino acids – and held together by peptide bonds. In basic terms, the difference is that peptides are made up of smaller chains of amino acids than proteins.

But the definition, and the way scientists use each term, is a little loose. As a general rule, a peptide contains two or more amino acids. And just to make it a little more complicated, you will often hear scientists refer to polypeptides – a chain of 10 or more amino acids.

Dr Mark Blaskovich from the Institute for Molecular Bioscience (IMB) at The University of Queensland in Australia says approximately 50-100 amino acids is the cut-off between a peptide and a protein. But most peptides found in the human body are much shorter than that – chains of around 20 amino acids.”



- And if you put many of these protein words together, you get a sentence, called a biological pathway.


#Ross, L. N. (2018): Causal Concepts in Biology: How Pathways Differ from Mechanisms and Why It Matters. The British Journal for the Philosophy of Science, Vol. 72 (1)

https://www.journals.uchicago.edu/doi/full/10.1093/bjps/axy078#_i3 

Quote: “The pathway concept is commonly found in the biological sciences. Biologists refer to gene expression pathways, cell-signalling pathways, metabolic pathways, developmental pathways, circulatory pathways, neural pathways, and ecological pathways, just to name a few. In all of these cases the notion of a pathway refers to a sequence of causal steps that string together an upstream cause to a set of causal intermediates to some downstream outcome. For example, gene expression pathways track causal connections from genes, to their intermediate products to a final phenotype of interest.9 Signal transduction pathways track causal connections from an upstream signal, through intermediate transduction steps, to some final effect (Figure 1). Metabolic pathways capture sequences of steps in the chemical conversion of an initial metabolic substrate into some final downstream product (Figure 2) (Kaushansky [2006]).” 



- In reality this language of life is so complex that it defies imagination. 


The image below shows a section of an overview map of human metabolic pathways. You can see this as an example of how complex the pathway system is when visualized as a "map." In this interactive representation, you can click on individual elements or entire pathways and get detailed information on the substances, interactions and sequences involved. 


#Kyoto Encyclopedia of Genes and Genomes (KEGG) (2022): Metabolic pathways - Reference pathway

https://www.genome.jp/pathway/map01100

- You need to know about 8000 words to speak a human language really well.


We refer to English here. A study looked at how large a vocabulary you need to understand certain texts, such as novels or newspapers, without help. Assuming that 98% of the words in a text must be known, it found that you need a vocabulary of 8,000 to 9,000 word families for written text and 6,000 to 7,000 for spoken text. 

A word family comprises a word and its derivations, e.g. "play" together with "played" and "playing". 


#Nation, I.S.P. (2006): How Large a Vocabulary Is Needed For Reading and Listening? The Canadian Modern Language Review/La Revue canadienne des langues vivantes, Vol. 63 (1)

https://www.lextutor.ca/cover/papers/nation_2006.pdf

Quote: “This article has two goals: to report on the trialling of fourteen 1,000 word-family lists made from the British National Corpus, and to use these lists to see what vocabulary size is needed for unassisted comprehension of written and spoken English. The trialling showed that the lists were properly sequenced and there were no glaring omissions from the lists. If 98% coverage of a text is needed for unassisted comprehension, then a 8,000 to 9,000 word-family vocabulary is needed for comprehension of written text and a vocabulary of 6,000 to 7,000 for spoken text.

(...)

The range, frequency, and dispersion data that were used for the division of the words into lists is thus based on lemmas and not on word-families. For example, the word-family of abbreviate contains the following members: abbreviate, abbreviates, abbreviated, abbreviating, abbreviation, abbreviations. This family consists of two lemmas: the abbreviate lemma with four members and the abbreviation lemma with two members.”



- But in the language of life there are an estimated 20,000.


Here, it is assumed that "one gene = one protein". In fact, however, there can be numerous different modifications and variants of a protein. If one were to add all these modifications and variants, the number would increase drastically. 


#Aebersold, R. et al. (2018): How many human proteoforms are there? Nature chemical biology, Vol. 14 (3)

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5837046/#:~:text=Thus%2C%20if%20a%20single%20representative,%2Dcoding%20genes5%2C6 

Quote: “Thanks to the human genome project, we can now estimate the number of protein-coding genes to be in the range of 19,587–20,245 (refs.1,3,4). Thus, if a single representative protein from every gene is used as the definition of the proteome, the estimated size is just ~20,000. This number may decrease somewhat, as it has been difficult to find an expressed protein encoded by some of these putative protein-coding genes5,6. However, if one considers that many genes are transcribed with splice variants, the number of human proteins increases to ~70,000 (per Ensembl3). In addition, many human proteins undergo PTMs that can strongly influence their function or activity. These PTMs include glycosylation, phosphorylation and acetylation, among a few hundred others (Fig. 1a), giving rise to many hundreds of thousands of additional protein variants5; furthermore, though many proteins are unmodified, some fraction of proteins are already annotated with multiple modifications (Fig. 1b). Finally, selected genes for proteins like immunoglobulins and T-cell receptors undergo somatic recombination to increase the number of potential protein variants into the billions in certain cell types across one’s lifetime7,8.”


#Ponomarenko, E. A. et al. (2016): The Size of the Human Proteome: The Width and Depth. International journal of analytical chemistry, Vol. 2016

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4889822/ 

Quote: “Following the hypothesis of “one gene = one protein,” there should be at least ~20,000 nonmodified (canonical) human proteins. Taking into account products of alternative splicing (AS), those containing single amino acid polymorphisms (SAPs) arising from nonsynonymous single-nucleotide polymorphisms (nsSNPs), and those that undergo PTMs [4, 5], as many as 100 different proteins can potentially be produced from a single gene. Of the many different terms proposed to describe protein variants [6], here, we chose “protein species” [7] or “proteoforms” [6].”



- And while the average English word has 5 letters, human proteins have an average of 375 amino acids.


One study explored average word length by analyzing data from the Google Ngram Viewer, a tool that tracks the frequency of any search term in publications between 1500 and 2019. It found that the average word length has increased over the centuries and is now 5.1 letters.


#Bochkarev, V. et al. (2012): Average word length dynamics as indicator of cultural changes in society. Social Evolution and History, Vol. 14 (2), pp. 153-175

https://www.researchgate.net/publication/230764201_Average_word_length_dynamics_as_indicator_of_cultural_changes_in_society

#Brocchieri, L. & Karlin, S. (2005): Protein length in eukaryotic and prokaryotic proteomes. Nucleic Acids Research, Vol. 33 (10)

https://www.researchgate.net/publication/7790262_Protein_length_in_eukaryotic_and_prokaryotic_proteomes

- The longest protein has more than 30,000! 


The protein consists of 34,350 amino acids and is called "titin". It is located in the muscles and can be thought of as a “rubber band” that organizes the proteins actin and myosin. In a nutshell, the interaction of myosin and actin is what allows the muscle to move. 


#Cooper, G. M. (2000): The Cell: A Molecular Approach. Actin, Myosin, and Cell Movement. 

https://www.ncbi.nlm.nih.gov/books/NBK9961/ 

Quote: “Two additional proteins (titin and nebulin) also contribute to sarcomere structure and stability (Figure 11.20). Titin is an extremely large protein (3000 kd), and single titin molecules extend from the M line to the Z disc. These long molecules of titin are thought to act like springs that keep the myosin filaments centered in the sarcomere and maintain the resting tension that allows a muscle to snap back if overextended. Nebulin filaments are associated with actin and are thought to regulate the assembly of actin filaments by acting as rulers that determine their length.”

“Molecules of titin extend from the Z disc to the M line and act as springs to keep myosin filaments centered in the sarcomere. Molecules of nebulin extend from the Z disc and are thought to determine the length of associated actin filaments.”


#SIB (retrieved 2022): ProtParam TITIN_HUMAN (Q8WZ42)

https://web.expasy.org/cgi-bin/protparam/protparam1?Q8WZ42@1-34350@

Quote: “The computation has been carried out on the complete sequence (34350 amino acids).”



- For the average protein length of a human cell of 375 amino acids, you get a stunning 6.8 x 10^495 possible proteins your cells can make.


21^375 = 6.8 x 10^495

#WolframAlpha.com

https://www.wolframalpha.com/input?i=21%5E375 
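The same figure can be reproduced without an online calculator. The sketch below assumes, as the text does, an alphabet of 21 amino acids and a chain length of 375. The helper names are hypothetical. Because the result is far larger than the biggest floating-point number, it is computed with exact integers and formatted from its digit string.

```python
ALPHABET_SIZE = 21     # amino acids, as in the text
AVERAGE_LENGTH = 375   # average human protein length, as in the text

def count_sequences(alphabet_size, length):
    """Number of distinct chains of `length` letters drawn from the alphabet."""
    return alphabet_size ** length   # exact: Python integers have no size limit

def scientific(n, digits=1):
    """Format a huge positive integer as 'm x 10^e' without using float(n)."""
    s = str(n)
    exponent = len(s) - 1
    # Take a few leading digits and place the decimal point after the first one.
    mantissa = int(s[:digits + 2]) / 10 ** (digits + 1)
    return f"{mantissa:.{digits}f} x 10^{exponent}"

total = count_sequences(ALPHABET_SIZE, AVERAGE_LENGTH)
print(scientific(total))   # 6.8 x 10^495
```

Using 20 instead of 21 for `ALPHABET_SIZE` would show how sensitive the count is to the size of the alphabet.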



- Quadrillion googol googol googol googol times more than there are atoms in the universe.


One way to estimate the number of atoms is to extrapolate from the density of matter in the universe. The average density in the universe is known from cosmological surveys and is equivalent to about 5-6 protons per cubic meter (measured by WMAP, the Wilkinson Microwave Anisotropy Probe, a NASA Explorer mission launched in 2001).


#NASA (2014): What is the Universe Made Of?

https://wmap.gsfc.nasa.gov/universe/uni_matter.html 

Quote: “By making accurate measurements of the cosmic microwave background fluctuations, WMAP is able to measure the basic parameters of the Big Bang model including the density and composition of the universe. WMAP measures the relative density of baryonic and non-baryonic matter to an accuracy of better than a few percent of the overall density. It is also able to determine some of the properties of the non-baryonic matter: the interactions of the non-baryonic matter with itself, its mass and its interactions with ordinary matter all affect the details of the cosmic microwave background fluctuation spectrum.

WMAP determined that the universe is flat, from which it follows that the mean energy density in the universe is equal to the critical density (within a 0.5% margin of error). This is equivalent to a mass density of 9.9 x 10^-30 g/cm^3, which is equivalent to only 5.9 protons per cubic meter.”

It is assumed that about 80% of the matter in the universe is dark matter, which would mean that only about 1 of these protons per cubic meter corresponds to ordinary matter. 


#University of Oslo (2020): Strategic Dark Matter Initiative (SDI). Searching for the nature of Dark Matter combining Astro-, Astroparticle- and Particle Physics.

https://www.mn.uio.no/fysikk/english/research/projects/darkmatter/ 

Quote: “Cosmological observations show that over 80% of the matter in the Universe is made of a component that is not directly visible, coined dark matter, that most likely consists of a so far unknown type of elementary particle.”


If we now assume that 90% of all atoms in the universe are hydrogen (which has only one proton), then we can roughly say that there is one atom per cubic meter in the observable universe. With a radius of the observable universe of around 45 billion light-years, this gives a volume on the order of 10^80 cubic meters and thus about 10^80 atoms in the universe.


#Grochala, W. (2015): First there was hydrogen. Nature Chemistry Vol. 7 (264)

https://www.nature.com/articles/nchem.2186#change-history 

Quote: “Today hydrogen is estimated to account for 90% of all atoms in the universe, and it is essential to the material world. That includes ourselves: close to two-thirds of the atoms in our bodies are hydrogen.”
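The estimate of the number of atoms, and the comparison with the 6.8 x 10^495 possible proteins, can be sketched as follows. The sketch assumes a radius of about 45 billion light-years for the observable universe and roughly one atom per cubic meter, as derived above. The constant and function names are illustrative, and the light-year conversion factor is rounded.

```python
import math

LIGHT_YEAR_M = 9.4607e15   # meters in one light-year (rounded)
RADIUS_LY = 45e9           # assumed radius of the observable universe, in light-years
ATOMS_PER_M3 = 1.0         # rough average from the reasoning in the text

def atoms_in_observable_universe():
    """Volume of a sphere with the assumed radius, times atoms per cubic meter."""
    radius_m = RADIUS_LY * LIGHT_YEAR_M
    volume_m3 = (4.0 / 3.0) * math.pi * radius_m ** 3
    return volume_m3 * ATOMS_PER_M3

def split_into_googols(exponent):
    """Split 10^exponent into (number of googol factors, leftover exponent)."""
    return divmod(exponent, 100)   # one googol is 10^100

atoms = atoms_in_observable_universe()
atoms_exponent = math.floor(math.log10(atoms))   # order of magnitude of the atom count
ratio_exponent = 495 - atoms_exponent            # possible proteins divided by atoms
googols, leftover = split_into_googols(ratio_exponent)

print(f"atoms ~ 10^{atoms_exponent}")
print(f"ratio ~ 10^{leftover} x googol^{googols}")
```

The leftover factor of 10^15 is a quadrillion, which is where the phrase "quadrillion googol googol googol googol" comes from.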



- If you untangled a cell’s DNA, it would be about two meters long. All of your body’s DNA combined into one long string, would reach to the sun and back over 20 times!


#Piovesan, A. et al. (2019): On the length, weight and GC content of the human genome. BMC Research Notes, Vol. 12 (106)

https://bmcresnotes.biomedcentral.com/articles/10.1186/s13104-019-4137-z 

Quote: “Considering a mean length in a diploid cell of 206.62 cm and the latest estimation of a mean of 3 × 10^12 nucleated cells for a reference human being [38, 39], the total extension in length of all nuclear DNA molecules present in a single human individual is of about 6.20 billion km (6.20 × 10^12 m) and is sufficient to cover the Earth-Sun distance (https://cneos.jpl.nasa.gov/glossary/au.html) more than 41 times. Considering a mean weight in a diploid cell of 6.46 pg, the genome weight summed across nucleated human cells would be about 19.39 g, almost the weight of 100 carats (https://sizes.com/units/carat.htm).”
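As a quick check of how "more than 41 times the Earth-Sun distance" becomes "to the sun and back over 20 times", here is a small sketch using the figures from the quote. The variable names are illustrative, and the value of the astronomical unit (149,597,870,700 m) is the standard definition rather than a number taken from the paper.

```python
DNA_PER_CELL_M = 2.0662          # mean DNA length per diploid cell (206.62 cm)
NUCLEATED_CELLS = 3e12           # nucleated cells in a reference human
EARTH_SUN_M = 1.495978707e11     # one astronomical unit in meters

total_dna_m = DNA_PER_CELL_M * NUCLEATED_CELLS   # about 6.2 x 10^12 m
one_way_trips = total_dna_m / EARTH_SUN_M        # Earth-to-Sun distances covered
round_trips = one_way_trips / 2                  # "to the sun and back"

print(f"total length: {total_dna_m:.2e} m")
print(f"one-way trips: {one_way_trips:.1f}")
print(f"round trips: {round_trips:.1f}")
```

Only nucleated cells are counted here, which is why the cell number is lower than the 7 x 10^13 total cells cited earlier.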


- Around 1% of your DNA is made up of genes – which are basically protein dictionaries that contain all the words of the language of life your cells speak. But genes are also the building manuals for all the proteins your cells need to function. The rest of your DNA is probably not useless but basically like a set of rules.


In a nutshell: proteins are produced by making a kind of copy of certain genes. This messenger ribonucleic acid (mRNA) serves as a template for a specific protein in the ribosomes. 

In addition to DNA that codes for proteins, there are also stretches of DNA that don’t contain a "protein code", i.e. "non-coding DNA" ("ncDNA"; the RNA copied from it is correspondingly called "non-coding RNA", "ncRNA").


A few years ago, ncDNA was still thought to be "junk". It is now known that it has a major influence on the production of proteins. The last source (Figure 1) shows an example of such a network: while two proteins are translated from only one gene (gene A), the ncRNAs of the other genes (miRNA, lncRNA and circRNA are different types of ncRNA; lncRNA, for example, stands for long non-coding RNA) regulate and interact with the production process of the two proteins.


#National Institute of General Medical Sciences (2012): Genetics by the Numbers

https://nigms.nih.gov/education/Inside-Life-Science/Pages/Genetics-by-the-Numbers.aspx 

Quote: “More than 98 percent of our genome is noncoding DNA—DNA that doesn't contain information to make proteins. As it turns out, some of this "junk DNA" has other jobs. So far, scientists have learned that it can help organize the DNA within the nucleus and help turn on or off the genes that do code for proteins.”


#Boland, C. R. (2017): Non-coding RNA: It’s Not Junk. Digestive Diseases and Sciences, Vol. 62 (5)

Quote: “It came as a major surprise even to those working on the HGP that only ~1.5% of the human genome encodes ~21,000 distinct protein-coding genes[3]. So, what is the rest of our DNA doing? Is it simply packaging material, like wrapping paper and Styrofoam? Some initially referred to the non-coding DNA as “junk”, a concept that triggered skepticism by many observers. Many of the non-coding sequences are repeated transposable (i.e., moveable) elements that facilitate genomic rearrangements, and are of evolutionary importance.”


#Slack, F. J. &  Chinnaiyan, A. M. (2019): The Role of Non-coding RNAs in Oncology. Cell. Vol. 179 (5)

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7347159/ 

Quote: “For decades, the miniscule protein-coding portion of the genome was the primary focus of medical research. The sequencing of the human genome showed that only about 2% of our genes ultimately code for proteins, and many in the scientific community believed that the remaining 98% was simply non-functional “junk” (Mattick and Makunin, 2006; Slack, 2006). However, the ENCODE project revealed that the non-protein coding portion of the genome is copied into thousands of RNA molecules (Djebali et al., 2012; Gerstein et al., 2012) that not only regulate fundamental biological processes such as growth, development, and organ function, but also appear to play a critical role in the whole spectrum of human disease, notably cancer (for recent reviews, see (Adams et al., 2017; Deveson et al., 2017; Rupaimoole and Slack, 2017)).”


#Adams, B. D. et al. (2017): Targeting noncoding RNAs in disease. The Journal of clinical investigation, Vol. 127(3)

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5330746/ 

- The 21 different amino acids all have slightly different charges. Some are more negative, others more positive. When your cells build proteins, they put different amino acids together in chains, basically long strings. Now, because of the different charges of the amino acids used, these strings begin to fold in on themselves. This folding process is so complex that we still haven’t completely understood how exactly it works. But in a nutshell, 1D strings become 3D structures.  


#Lodish, H. et al. (2005): Molecular Cell Biology Fifth Edition. Chapter 3 - Protein structure and function

https://cdn.preterhuman.net/texts/science_and_technology/nature_and_biology/Cell_and_Molecular_Biology/Molecular%20Cell%20Biology%205th%20ed%20-%20Lodish%20et%20al.pdf 

Quote: “A protein chain folds into a unique shape that is stabilized by noncovalent interactions between regions in the linear sequence of amino acids. This spatial organization of a protein—its shape in three dimensions—is a key to understanding its function. Only when a protein is in its correct three-dimensional structure, or conformation, is it able to function efficiently. A key concept in understanding how proteins work is that function is derived from three-dimensional structure, and three-dimensional structure is specified by amino acid sequence.

(...)

The primary structure of a protein is simply the linear arrangement, or sequence, of the amino acid residues that compose it.

(...)

The second level in the hierarchy of protein structure consists of the various spatial arrangements resulting from the folding of localized parts of a polypeptide chain; these arrangements are referred to as secondary structures. A polypeptide may exhibit multiple types of secondary structure depending on its sequence. In the absence of stabilizing noncovalent interactions, a polypeptide assumes a random-coil structure. However, when stabilizing hydrogen bonds form between certain residues, parts of the backbone fold into one or more well-defined periodic structures: the alpha (𝛂) helix, the beta (𝛃) sheet, or a short U-shaped turn.

(...)

Tertiary structure refers to the overall conformation of a polypeptide chain—that is, the three-dimensional arrangement of all its amino acid residues. In contrast with secondary structures, which are stabilized by hydrogen bonds, tertiary structure is primarily stabilized by hydrophobic interactions between the nonpolar side chains, hydrogen bonds between polar side chains, and peptide bonds.

(...)

This variation in structure has important consequences in the function and regulation of proteins. Different ways of depicting the conformation of proteins convey different types of information.”


- Proteins are basically 3D puzzle pieces, with a very specific shape. In the world of proteins, shape is everything. Because its 3D shape determines which areas of a protein are charged in which way, and this determines how it can interact with other proteins! 


#Lodish, H. et al. (2005): Molecular Cell Biology Fifth Edition. Chapter 3 - Protein structure and function

https://cdn.preterhuman.net/texts/science_and_technology/nature_and_biology/Cell_and_Molecular_Biology/Molecular%20Cell%20Biology%205th%20ed%20-%20Lodish%20et%20al.pdf 

Quote: “However, none of these three ways of representing protein structure convey much information about the protein surface, which is of interest because it is where other molecules bind to a protein. Computer analysis can identify the surface atoms that are in contact with the watery environment. On this water-accessible surface, regions having a common chemical character (hydrophobicity or hydrophilicity) and electrical character (basic or acidic) can be mapped. Such models reveal the topography of the protein surface and the distribution of charge, both important features of binding sites, as well as clefts in the surface where small molecules often bind (Figure 3-5d). This view represents a protein as it is “seen” by another molecule.”