Diary of an Insane Cell Mechanic

A psychologist's descent into molecular biology

Dedicated to my father: Dr. M. W. Shrager, M.D.

Copyright (c) 2000-2003 Jeff Shrager

In 1990 Pat Langley and I wrote, in the introduction to our collection on computational models of scientific discovery and theory formation, that "...an important source for models of science will come from the developmental psychology of socialization, which studies the way that a child learns to become part of his or her culture. Insights into this process may provide hypotheses about the paths through which graduate students and junior scientists become members of their scientific community -- mastering the ways of thinking, operating, and communicating that constitute the institution of science."

Around 1995 I decided that since I'd spent nearly twenty years studying the most environmentally damaging organism, I wanted to spend the next twenty studying the most environmentally positive one: phytoplankton. Phytoplankton form the base of the ocean food chain, create half of the oxygen atmosphere, and sink half of the CO2 greenhouse gases. If they go, we go!

In 1997 I left my last cognitive / neuroscience position, at the University of Pittsburgh, and joined Afferent Systems, a drug discovery start up, with the clear goals of making enough money to fund my way into environmental phytoplankton biology, and to learn biochemistry and molecular biology, since I'd never taken a biology or organic chemistry course in my life. I picked up all the standard textbooks, and read every day on CalTrain to and from San Francisco. In 1999 I took the Biochemistry, Cell, and Molecular Biology GRE Advanced Test (all one test, thank god), and did well enough to feel that I was ready to take the plunge into molecular biology. I found a lab at Stanford (actually, at the Carnegie Inst. of Washington Dept. of Plant Biology, which is conveniently located at Stanford) doing what I wanted to do: environmental algal biology, and offered to trade: my time and computer skills for free in exchange for training. I worked in the lab half time in 1999 and started full time in June of 2000.

Now, although I had (and have) in mind to leave people and cognition behind me, I realized that this was a unique opportunity to fulfill my own prediction (with Pat), as above, by conducting a participant-observer study on what it's like to be becoming a molecular biologist. I was encouraged in this by my friends and colleagues in the Psychology of Science, Mike Gorman, who was holding a workshop in March of 2001 on The Psychology of Science, and Kevin Dunbar, who had observed molecular biology labs (at Stanford, as it happens, although not the lab that I joined). As a professional cognitive psychologist and psychologist of science (to whatever extent it means to be a professional in these fields....), I could use my experiences to think about the psychology of science in a new way: From the point of view of a scientist in the making, a sort of graduate student experience in molecular biology.

So between May of 2000, more-or-less when I started full time at the lab, and December 2000, I kept a cognitive diary.

Please note:

There are about 75 entries up to 20001215 that I am slowly converting to HTML. If you would like an update when there are major new additions, send me email. (The small (u) after a date indicates that this entry has not been examined except to do cursory HTML conversion, and so may contain typographical errors, etc.)

Except for the occasional "borrowed" image, all of this text and my personal images are Copyright (c) 2000-2001 by Jeff Shrager. This work was presented publicly for the first time at the Workshop on Cognitive Studies of Science and Technology, organized by Mike Gorman and held March 24-27, 2001 at the University of Virginia. If you would like to cite any of this material, please refer to this page, and/or to:

Shrager, J. (2003) On Being and Becoming a Molecular Biologist: Notes from the Diary of an Insane Cell Mechanic. To appear in Gorman, et al. (Eds.) [working title]. [download pdf version].

And, if you use the contents of this web page, please cite its URL.

(Aside from correcting spelling and grammar, and removing some "unprintable" material, this is complete and uncut. Where I have had later thoughts on a topic, I've added separate notes with the date of the comment. Also, for reasons that I don't recall, the dates of the first and second entries are reversed. Maybe I recorded them wrong.)

Here, then, is the Diary of an Insane Cell Mechanic:


ATP is a real thing. You can pipette it out of a vial. In fact, it's used all the time in various biochemical reactions, but I mainly knew it, to this point, as merely the famous carrier of life's energy. It was an icon in a textbook, connected to a bunch of other icons by lines and arrows. But in order to do PCR reactions, and others, you pick up a vial of dATP and pipette some of it into your reaction vessel. This was for me like meeting a famous actor on the street, or something. More, even, like meeting the president, and finding out that he's just this regular guy, you know, who you can take out to lunch, and chat with, and then pipette into your reaction vessel to make PCR go.

The scale of these reactions -- both large and small at the same time -- is hard to comprehend. I am constantly pipetting single microliter quantities, and sometimes a quarter of that! You can hardly see a microliter, much less .25 microliters! It amazes me that the pipette can actually do it -- precision machining, I guess. At the same time, a single microliter of dATP, for example, at 100mM is, let's see, 1e-7 mole, which is 6e23 * 1e-7 or 6e16. That's, let's see, 16 orders of magnitude, or about sixty million billion ATP molecules. Even though Bill Gates has several dollars for every year that the earth has existed, he doesn't have even a penny per ATP molecule in the drop that I can hardly see on the end of my carefully machined pipette.
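The drop arithmetic is easy to check with a few lines of Python. This is only a sketch: it assumes that "100mM" means 100 millimoles per liter, it uses the standard value of Avogadro's number, and the function name is made up for illustration. Under those assumptions a one-microliter drop holds about 6e16 molecules.

```python
AVOGADRO = 6.022e23  # molecules per mole

def molecules_in_drop(conc_mM: float, volume_uL: float) -> float:
    """Number of molecules in a drop of the given concentration and volume."""
    # Convert mM to mol/L and microliters to liters, then multiply.
    moles = (conc_mM * 1e-3) * (volume_uL * 1e-6)
    return moles * AVOGADRO

count = molecules_in_drop(100, 1)
print(f"{count:.2e}")  # 6.02e+16
```

The same function gives the count for the quarter-microliter case by calling `molecules_in_drop(100, 0.25)`.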

Not only did I do a PCR reaction today, but it worked! Well, one of the ten that I ran worked. Who knows what happened to the others, but at least there's some confirmation that I'm doing the thing generally right. This is one big difference between computing and lab science, esp. molecular biology, and I guess chemistry would be like this as well: There's hardly anything like an interim error check. You do three days of hard work, and in the end you run a gel, and either you get a result or you don't. If you don't, who knows what went wrong! You go back and try to remember, but it's all a haze, and your notes only tell you what you meant to be doing, not what you actually did (at least not in terms of mistakes!). So having one of ten reactions work out is actually an excellent result (for me, anyhow). If none of them had worked, I wouldn't have had any idea whether I was making a serious general error, or I was just unlucky, whereas now I know that what I'm doing is, generally speaking, the right thing, and that I'm probably just some combination of unlucky and sloppy.

So, what worked, you might well ask? One (out of the ten) gene recovery operations that I did on my mutants worked. The idea is that when you find a mutant that you think is interesting, you want to know what mutation it has. As it turns out, we made these mutants by inserting a transposon (actually, -LZ- made them) randomly into the genetic sequence of the cyanobacteria. This interrupts the function of one (or sometimes more than one) gene. To find out what gene that is, we first cut the genome of the bacteria up into little pieces using restriction enzymes, then use another enzyme to splice the pieces back together, sort of at random. Then we use a process called "inverse PCR" to amplify anything that is now a loop and that has the transposon in it. Next we isolate and purify the amplified loops (if there are any -- this is the point at which I only had one of ten to do) and then sequence them -- which means, send them to a machine that gives you back the sequence of GACTs in a day or so. The loops will have a little bit of the gene that was interrupted in them, so we can tell, by looking in the gene database for cyanobacteria, which gene was interrupted to cause whatever effect it was that we observed that made us pick out this bacterium as a mutant to begin with. Sounds complex, and it is. There's a lot of luck involved, although with that many molecules in play, it's a pretty sure thing that you'll get at least one of most combinations, although there are a lot of possible combinations as well!
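The logic of the gene recovery can be sketched as a toy string exercise in Python. Everything here is a simplification and an assumption for illustration: the sequences and function names are made up, the "enzyme" cuts just before its recognition site rather than leaving real sticky ends, and the transposon is assumed not to contain that site. The point is only that once the piece carrying the transposon is closed into a loop, primers pointing outward from the transposon amplify the unknown DNA on either side of it.

```python
def digest(dna: str, site: str) -> list[str]:
    """Cut linear DNA just before every occurrence of the recognition site."""
    pieces, start = [], 0
    pos = dna.find(site, 1)
    while pos != -1:
        pieces.append(dna[start:pos])
        start = pos
        pos = dna.find(site, pos + 1)
    pieces.append(dna[start:])
    return pieces

def inverse_pcr_product(genome: str, transposon: str, site: str):
    """Return the flanking DNA amplified from the loop holding the transposon."""
    for piece in digest(genome, site):
        i = piece.find(transposon)
        if i != -1:
            left = piece[:i]
            right = piece[i + len(transposon):]
            # In the circularized piece, reading outward from the transposon
            # runs through the right flank, across the join, into the left flank.
            return right + left
    return None  # the transposon was not found intact in any piece

TRANSPOSON = "TTGGCCAATT"   # hypothetical insert
SITE = "GAATTC"             # EcoRI recognition sequence
genome = "AAACCC" + SITE + "ACGTAC" + TRANSPOSON + "CATTGA" + SITE + "GGGTTT"

product = inverse_pcr_product(genome, TRANSPOSON, SITE)
print(product)  # CATTGAGAATTCACGTAC
```

The printed product contains both bits of the interrupted gene ("CATTGA" and "ACGTAC"), which is what would then go off for sequencing and database lookup.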

Are there any cognitive lessons that I can draw at this point that are interesting? It's hard. There's so much going on, and a lot of it I only barely understand. How about major view transformations? Maybe the one about dATP being a real live thing. Also, seeing PCR really work, whereas I've only read about it in books until today. There are buffers everywhere which I only barely understand. You add them to water when you're doing a reaction, and I guess that they make the reaction go better. What do they contain? Probably different things in different cases. The lysis buffer smells really nasty. Interestingly, this is the only thing that has had a substantial smell so far. Except for my Sodium Acetate, which had to be pH'ed to 5.1 by adding a massive amount of acetic acid, and so smells like vinegar, which is pretty much what it is at this point.

[Note added 20010504: Until sometime after this entry was written, I didn't actually understand what a "buffer" was. To me they were just some mix that you did another reaction in. This is partly true, and molecular biologists tend to use the term buffer in this way, whereas in reality, the buffering property of buffers is generally quite important because they maintain the pH, or some other property, of a mixture. Buffers are generally (although not always) mostly non-toxic, although I didn't know this at the time either. There is a funny scene that takes place someplace before this entry was written where I am in a lab coat, gloves and safety glasses, carrying "lysis buffer" carefully across the lab at arm's length. Not knowing what "buffer" meant, I assumed at the time that this actually did the lysis, which was presumably dangerous! Well, I don't know what other folks must have thought, but lysis buffer, as it turns out, is pretty much just soapy water!]


To my amazement, I understand DNA extraction. The most amazing thing about this is that I only did it once before, and on that occasion I was completely lost; nearly not understanding anything that I was doing. What happened in the interim of a month or so between not understanding at all, and understanding nearly completely what I am doing -- without practice? One thing that I have practiced is the intervening physical steps: Pipetting, etc. whose performance was a chore and distraction the first couple of times through the protocol. I've done a bunch of pipetting and making up chemicals in other contexts, so that when I got back to the DNA extraction protocol, those things weren't distractions. Another thing that I have been exposed to in the intervening month is multiple repetitions of the base terms: Restriction enzymes, etc. which, although I knew them from my textbooks before coming to the lab, I've seen multiply now in multiple contexts, and either seen in talks, or overheard in snippets of conversation in the lab the use of these protocol micro-steps over and over, although I couldn't tell you one specific instance! There is, also, the fact that -LZ- has re-explained the whole protocol to me once more, and this is a very slightly different protocol than the one I used to run (though just as complex -- slightly more so, actually).

What does it mean that I "understand" the protocol now? There are several aspects. First, I'm comfortable with it now; I'm not afraid of the pieces. Second, I can monkey with the parameters a little to get things to work out a little better; Third, I understand (although usually don't think about) the mechanisms, for the most part, that go into each step. For example, I understand the odd sequence in which you start with Phenol, then use Phenol/Chloroform, then Chloroform+IsoAmyl Alcohol. The phenol is doing the work of removing the proteins, and the Chloroform is doing the work of removing the remaining phenol (because chloroform binds phenol). But I still don't understand the IsoAmyl Alcohol. I think that I could understand a textbook's explanation, though, which is more than I could have said a month ago. Another thing that is different, now, is that I made up all my own reagents. Before there was an embarrassing borrowing of reagents, mostly from -LZ-, but some from others in the lab. This is expected, and accepted when one first starts out, but -LZ- made it clear that I should prepare my own reagents this time. Why should this have anything to do with anything? I'm not sure, although by preparing my own reagents I get a better understanding of how they work. In part this understanding is imparted through the recipe books, which tell you, for example, a little at a very general level of why something has to be at a certain pH (e.g., because it won't bind the proteins otherwise -- that phrase "bind the proteins" gives me a snippet of what's going on with this reagent). Another thing about having my own reagents is that I feel like I sort of "own" the protocol now. I can monkey with pieces of it freely, although I don't do that very much.

Here's one example of a place that I did monkey with it, and what I found. It turns out that the very first extraction step either gets a lot of DNA or very little. In this step, you literally beat the cells to death with a set of glass beads in a blender, along with phenol, then you spin the whole thing down and take out the liquid on the top, which the phenol has generously left us containing just DNA and RNA. So, if you don't get very much liquid on top (the "supernatant") then, I theorized, maybe there were some cells not broken up, and so beating the whole thing again would yield some more. So I did so. And, not to my great shock, it didn't work. All you get the second time 'round is a bunch of protein muck. Apparently the work is pretty much accomplished on the first go-around. (The next time I try this, actually, I'll probably add some TE as well - there might not have been enough liquid for it to work. See! Reasoning!) Now, this is pretty shallow reasoning, but it's a lot more than I could have done a week ago!

[Note added 20010504: Phenol is actually quite dangerous, and the bead beater really shakes the hell out of the tube with cells and phenol in it. I have learned to be extremely careful about tightening the top of this tube, and putting it in the bead beater quite securely or else you get phenol all over the place. The beater has a top, but you really don't want to be cleaning phenol off of it, and the bench. Also, I never touch the bead beater without gloves on! Who knows what someone else has busted in it!]


There is something very SOAR-like about the relationship between my carrying out procedures and my understanding of them. The first time through, or the first few times through, any new procedure, I don't understand it at all -- I am, in fact, at a loss, and quite stressed about having to think at all about what I am doing, much less anything like its larger implications. All I know is the highest-level goal (e.g., inactivate a gene), and an enormous number of tiny steps: pipette .25ul of A into B. I have nothing at all in between those two. As a result I find it nearly impossible to keep the whole thing on track as I can't see the tracks more than a few inches in front of my nose. -LZ- has to tell me every step. Even when there is a kit involved, which has pretty explicit instructions, I have to ask -LZ- about nearly every step: It says to centrifuge. What speed? It says to use a tube that does not bind DNA -- do we have tubes like that? -LZ- is always helpful, but seems a little short about it, as though I'm asking whether to use a spoon, knife, or fork to eat soup with. I don't think that she is actually upset at me, it's just the way she is, and a little bit of a language problem.

But once I get through the procedure the first time, or maybe the first couple of times, pieces of it start coming together into what the SOAR folks would probably call "chunks." I have a bunch of chunks that I know deeply all the time very easily. For example, spinning down a set of tiny quantities (order 1ul) that I have pipetted onto the walls of a tube, so that they actually mix at the bottom. I do this all the time now, automatically. This is actually one of those procedures that is so obvious (once you know it) that they don't even tell it to you in the recipes (like holding the spoon open-side up), and I think that I now understand it well enough to be able to deploy it when I need it and correctly. And there are more complex ones, like setting up and programming a PCR reaction, that are like this. I'll get back to some of these more complex ones in a bit, because the way that I learned them has interesting aspects. The point, though, is that once I have these unit tasks compiled, I start to understand what I'm doing a little better. It's hard to distinguish how much of this is merely exposure to the overall activity, v. gaining facility with unit tasks, because they are obviously confounded, but it seems to me that by unloading some of the attention that it takes to keep myself on track, by virtue of compiling down these unit tasks, and then sets of unit tasks, etc., I've given myself both the attentional resources and calmed myself down enough, to be able to think about something in between the exact thing that I am doing right now, and the overall goal -- I can start to see the tracks through the fog a little.

Procedures I can now do easily, which are used all over:

* basic PCR (not so easily)

* Running a basic Gel

* DNA extraction (also not so easily, but not too bad)

* setting up for sequencing

* getting the sequence results and looking up the gene in the database (there is a lot of conceptual subtlety to this that I still don't understand, but the mechanics of it are compiled.)

* restriction (cutting DNA)

Another thing that helps greatly in my coming to understand what's up is making mistakes. When I make a mistake, getting myself out of the problem, or, as is usually the case, -LZ- or someone getting me out of the problem, helps my understanding greatly. There are many examples of this. In one recent one, I was running a protocol from a kit, whose purpose is to clone genes. -LZ- said to select a restriction enzyme that will cut the vector, but not my gene. She suggested EcoRI. I had to figure out how to figure this out, but I had in hand my gene's sequence, so I figured that there must be some program on the web to figure this out, and, in fact, there is and it's pretty easy to find. So I put my gene into that program and then looked to make sure that EcoRI would not cut my gene, which it doesn't. Okay, so I did the whole thing, and then was supposed to check the results by running a gel.
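What such a web program does can be approximated in a few lines of Python. This is a minimal sketch, not the program I actually used: the three recognition sequences are the standard ones for these enzymes, but the gene sequence and the function names are made up for illustration, and a real tool would know hundreds of enzymes.

```python
# Recognition sequences for three common restriction enzymes.
RECOGNITION_SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}

def cut_positions(seq: str, enzyme: str) -> list[int]:
    """Return every index at which the enzyme's recognition site starts."""
    site = RECOGNITION_SITES[enzyme]
    seq = seq.upper()
    positions = []
    pos = seq.find(site)
    while pos != -1:
        positions.append(pos)
        pos = seq.find(site, pos + 1)
    return positions

def non_cutters(seq: str) -> list[str]:
    """Enzymes from the table that would leave this sequence intact."""
    return [name for name in RECOGNITION_SITES if not cut_positions(seq, name)]

gene = "ATGGCTAGCAAGCTTGGATCCTAA"  # hypothetical gene sequence
print(cut_positions(gene, "HindIII"))  # [9]
print(non_cutters(gene))               # ['EcoRI']
```

All three of these sites are palindromic (each reads the same on the opposite strand), so searching one strand is enough for them.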

Now, although I can run gels now easily, I am not yet sure what in all, or even in most cases I'm supposed to get as a result. So I asked -LZ-, and she showed me what to expect. HOWEVER, in showing me how to do this, she asked me which vector I was using, and I told her that I was using the T-Gem kit. "But which vector?" she persisted. Is there more than one?! I had only the vaguest idea of what she was talking about. So she showed me in the manual that there are two vectors, one of which is cut by EcoRI and one of which isn't. Which one had I used? Now, I had studied the manual for this kit VERY VERY CAREFULLY before beginning into this process. But it was suddenly clear to me that I hadn't understood a word that it was saying! There are two vectors?! Um, well, whatever one you handed me. As it turns out, I was lucky and had used the vector that is, in fact, cut by EcoRI, and the gel actually worked -- or, more precisely, the restriction worked -- or, even more precisely, one of the four copies of the restriction worked (I've learned to do everything in four copies because three of them won't work. I'm hoping against hope that after a few years I'll be good enough at this that I won't have to do this all the time.) So this time it happened to work out, and I learned a small but important fact about the use of this kit, which is that you have to keep track of which vector you're using.

But that isn't the interesting part of this story. What is interesting is that nearly at the moment at which -LZ- explained to me about the two vectors, something much larger clicked into place for me. I don't know quite how this happened, but somehow I had all the pieces of the puzzle (well, this local puzzle anyhow) in hand and identified, but hadn't put them into the frame. When -LZ- showed me the picture in the manual of the two vectors, with their various restriction sites, that was the frame for the whole procedure, and all the pieces fell right into it, and I very suddenly -- literally in a matter of a few seconds -- "saw" what I had been doing for the past day: I could see why we were cutting the vector and amplifying the gene, and ligating them together and why I had to use EcoRI. And then I understood, all in that same perceptual unit, how to figure out what to expect from the gel. Maybe this was just the first time I had actually had time to think, as opposed to feverishly cooking and being lost, but it doesn't feel like that. I think that I've been trying to think all the way along, but there just wasn't enough material to think with, or there were crucial pieces missing, or the frame was missing, or something.

There should be some way of talking about this cognitively, but I don't know that anyone at all has looked at it. It's sort of like all of the View Application steps happened at once -- they were all just sitting there, and just all fell together at once. Perhaps we need a new way of thinking about this sort of process. It's more like parts finding one another automatically when their correct ends all finally line up right. Like one of those tent poles that you can fold up and has a bungee down the middle. Until you pull it out straight, it's all disconnected, but once you get it aligned, the whole thing connects at once. I always found those to be very elegant devices. I hate to use the evil chaos dynamics and attractors metaphor, but maybe it's actually appropriate in this case? Yuk!


There are many things in the molecular biologist's toolkit that seem weird until you begin to think of the toolkit more like the toolbox carried by a plumber than like a doctor's medical bag. We assume that a medical bag will contain sensitive instruments, but not so a plumber's toolbox, which is mostly full of grimy wrenches and hammers. One of the weirdest yet simplest MolBio tools is gel purification. Suppose that you have a bunch of various DNA all mixed together. You might have gotten this by using a non-selective restriction enzyme to cut a long chain of DNA into a bunch of smaller pieces, and now you need to get the pieces apart. Usually you want to do this because you're only after one of the many pieces left. The way this is done is to run a gel electrophoresis of the mixture. DNA of different sizes moves at different speeds along an electrical field, so if you stick a bunch of random-length DNA at one end of a gelatinous surface, and then run a current along the gel, the DNA will chase the current. Since the gel is sort of viscous, it impedes the DNA, impeding the smaller fragments less than the larger ones, so that the smaller ones run fastest.

After a while you end up with the DNA spread out along the gel like so many runners who all start the race in a bunch near the starting line, but end up spread out all over the course in accord with how fast they run. This is only step one in purification: Spreading the fragments of DNA out from one another. Now, supposing that you know the size (in terms of molecular weight, usually) of the fragment you want to purify. How do you get it out of the gel? Here's the weird part: With a razor blade! That's right; you take a razor blade and literally cut out the slice of the gel that you think has the DNA of the desired size. Then there's a gel purification kit that you can use to purify DNA that's been cut out of a gel.
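The sorting-and-slicing idea can be put as a small Python sketch. It is only an illustration under stated assumptions: fragments are represented by their lengths in base pairs, the fragment sizes and the 50 bp tolerance are made up, and the function names are hypothetical. A real gel is read against a size ladder by eye, not by exact length.

```python
def run_gel(fragment_lengths: list[int]) -> list[int]:
    """Order fragments from farthest travelled (smallest) to least (largest)."""
    return sorted(fragment_lengths)

def band_to_cut(fragment_lengths: list[int], target_bp: int, tolerance_bp: int = 50):
    """Pick the band closest to the target size, or None if nothing is near it."""
    candidates = [n for n in fragment_lengths if abs(n - target_bp) <= tolerance_bp]
    if not candidates:
        return None
    return min(candidates, key=lambda n: abs(n - target_bp))

fragments = [2300, 450, 1200, 800]    # hypothetical digest, in base pairs
print(run_gel(fragments))             # [450, 800, 1200, 2300]
print(band_to_cut(fragments, 1150))   # 1200
print(band_to_cut(fragments, 5000))   # None
```

The tolerance stands in for the width of the razor-blade slice: a wider slice is more likely to catch the fragment you want, and also more likely to bring its neighbors along.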

Now, I don't know about you, but using a razor blade to cut chunks out of a gel in order to purify the essence of life itself seems to me a bit crude. But MolBio is full of tools like this, some of which I'll talk about in the next note.


In his 1985 book on Science, Harry Collins tells the story of the TEA (Transversely Excited Atmospheric) Laser, demonstrating that it is often difficult to get an experiment to work, even for experts. Collins gives an account of someone's attempt to construct two versions of the laser (Collins 1985, pp. 51-78). The laser physicist in question had previous experience with many sorts of lasers, and was in contact with experts in the field, yet he had great difficulty in getting the TEA laser to work. Ultimately he found errors in his apparatus, and once they were fixed the laser worked. Collins says: "The ability of the laser to vaporize concrete, or whatever, comprised a universally agreed criterion of experimental quality. There was never any doubt that the laser ought to be able to work, and never any doubt about when one was working and when it was not." (Collins 1985, p. 84) Similarly, Paul Rabinow, in his oral history of the discovery and perfection of PCR (Making PCR, 1996), tells of the problem wherein one has to do a protocol a number of times before getting it to work.

What's going on here? I want to give an example from my own experiences with PCR, in fact. The story goes like this. My PCRs were only sort of working, even though I was meticulously following -LZ-'s protocols, and -LZ- had even checked my materials prep and PCR programming. A funny thing was happening, which was that the PCR process, which involves cycling the temperature of the reaction mixture repeatedly, was causing the liquids in the reaction vessel to evaporate! It couldn't get very far because it's in a sealed container, but when I pulled the reaction out of the cycler, the liquid was gone -- dispersed into the air of the reaction chamber, and some had recondensed on the walls or top, but anyway, it wasn't where it should have been, which is in the bottom of the vessel. -LZ- had never seen this before, and no one else I spoke to had seen it. The reaction was sort of working, but not very well, as one might expect if the thing had cycled a few times before evaporating the water. In fact, the effect was, at first, not a major anomaly to me, and so I didn't think to really push it among my labmates. On the one hand, it seemed very obvious that if you repeatedly heat and cool a liquid, some would evaporate, and at the same time it was obvious that the liquid evaporating in the cycler would have an adverse effect on the reaction. Was it right or wrong?

A clue came from someplace - I don't remember where. It used to be, I have read in books on PCR, that, presumably in order to avoid this exact problem, you put a drop of mineral oil on top of the PCR reaction, but no one has done that for years, as, somehow, the technology had gone beyond that. I assumed that something about the physical structure of the cycler had made this advance, as had, as it turns out, almost everyone else. Another clue came from the fact that early on I literally evaporated the liquid completely out of my reactions, and out of the reaction vessel by not cranking down the top of the PCR cycler, which has a sort of screw-down cap. This cap is unfortunately sort of artistically done - it doesn't look anything like a screw-down cap -- so that if you don't know it's there, you wouldn't know to do it, which I didn't. But even after I figured out the cap, the reaction mixture would vaporize; it just wouldn't leave the vessel, so I had taken to ice cooling my reactions after the run so that I'd get the liquid back - A private procedure that made up for an error. I should point out that I did mention this problem in passing to -LZ- and -DB-, and they said that it seemed abnormal, but they didn't immediately click on the problem.

So, I know that you're sitting on the edge of your seat wondering what was wrong. Okay, I won't keep you in suspense much longer. I was walking with -DB- one day, to a lab meeting, bemoaning the problem to her, which she had never heard of either, when she sort of mused about the heated lid. Heated lid? Yeah, in fact, I think that that was turned off. Aha! She was now adamant that that was the problem, and, in fact, she was right. You have to enable the heated lid. I asked -LZ-, and she, and everyone else said, "Of course! Of course you have to heat the lid. That's the whole principle!" So, what's with the heated lid? Why wasn't I using it? What could it have to do "in principle" with PCR? When you start the PCR program, which is the only part of the process that -LZ- had NOT watched me through, it asks: "Enable heated lid?", where the default is "Yes". I had been consistently saying "No." Why? I don't know. No one told me to use the heated lid, and the reaction wasn't in the lid. I guess I just assumed that the heated lid was for something special and that it was just the way the machine was left, or something, so I'd been saying "No", don't heat the lid, thank you.

If you look on the web, you can right away find out that this is a mistake:

From: www.hybaid.co.uk/tcsprint.html: Auto-Adjust Heated Lid: The auto-adjust heated lid enables the user to undertake oil free thermal cycling without having to set the lid height to suit the tube in use.

From: www.alkami.com/reviews/prdmtube.htm: Thermal cyclers can be fitted with a heated lid, which increases the sealing ability of the cap and reduces the need for sealers like an oil overlay.

From: www.eppendorf.com/tubes/introduct.html: for manual handling in thermal cyclers with a heated lid. The design of these plates guarantees optimal contact with the lid and thus maximum evaporation protection.

From: nunc.nalgenunc.com/resource/technical/nag/DP0032.htm: In order to prevent evaporation it is crucial to have a heated lid that can apply an even pressure on the inserted PCR vessels. If there is no heated lid, or the pressure is not evenly distributed, oil must be used.

See, once you know what you're looking for, it's easy to find. Anyhow, so once I started selecting "enable heated lid", or, more correctly, NOT DE-selecting it, my PCR reactions started to go just fine and dandy! Now, what's this got to do with getting things to work by doing them three times? Well, one big difference is that you figure out little things that no one told you about, that you were doing wrong. Another is that once I've done it a few times, I've got most of the "unit tasks" down pretty well, and the little mistakes that one makes get worked out. For example, it takes some practice to put the multiple 1-to-3ul droplets of liquid all into the bottom of the PCR vessel w/o doing any number of wrong things: getting materials on the pipette tip that will transfer to another vessel, flinging the rack of pipette tips all over the place by putting the pipette body into it at the wrong angle w/o holding it at the same time, and, my most common error: getting the tip of the pipette caught on the rim of the vessel as I'm removing the pipette, which causes the vessel to bounce slightly, flinging its contents out all over the table top. This last is an example of how completely uninteresting - at least scientifically-speaking -- damaging errors can be. Other errors of this sort include getting your gloves caught in the top of the tube when you close the cap (which, thankfully usually doesn't end up in a spill), and, my favorite of all of these: the horror of autoclave tape!

Autoclave tape is a kind of masking tape that has a bunch of white lines across it which turn black when they are heated to the right temperature and time, so they tell you that the object that has been taped has been autoclaved. It's a simple and brilliant means of marking things as sterile (or not), but it has one drawback, which is that it has a nearly magnetic attraction to lab gloves! Even when removed, the slime trail left by autoclave tape will stick fast to any lab glove that brushes against it. I have already broken two bottles of (thank god, cheap and non-dangerous) chemicals by trying to put them on a shelf and then, although my fingers have let go of them, my gloves and the autoclave tape slime trail were still in amorous tryst, not to be separated, causing the bottle to follow my, now open, hand off the shelf and down several feet to either the (slate) lab bench or the (distant) floor -- in either case, far enough to cause a very spectacular crash, at which point, of course, everyone in the lab freezes long enough to find out if I'm going to run to an emergency shower, and when I don't, they each need to ask me if I'm okay, and come see the spectacular mess that I'm busily cleaning up.


-CA- has this idea that one thing that I learn to be able to do is to parse protocols, or procedures, better. I'm not completely sure what she means by this, but it might have to do with being able to "see" unit tasks that I have already learned when they appear, either in watching someone else do them, or in the manual.

There is some truth to the former of these ideas; I can, in fact, see unit tasks better when, for example, I watch -LZ- do something. And this helps me a little when she is showing me how to do something because I don't have to take either mental or physical note of every little step. But even in this case, it can also hurt me, because there can be variability in the unit tasks based upon conditions that I don't know, or she can be doing something that I "see" as a unit task, but which really isn't the one I think it is (or any unit task at all!) In both of these cases, parsing skill doesn't replace understanding, whatever that is, and can hurt my performance instead of helping it.

The other sense in which I don't think that parsing is very useful is on running protocols from a text. There's almost no parsing at all that I have to do in this case since the protocols are nearly entirely written out in detail. One thing that does help, of course, is knowing the unit tasks, and also in knowing what things are NOT made explicit in the protocols (like the spinning down the reaction mixture that I mentioned above). DNA cutting by restriction, for example, is a unit task that is written out in detail, but which I can do perfectly fine w/o the details at this point. Gel running also, and PCR probably as well, although I'm less sure of these because I've had problems with them. ((Note the issues of visibility v. invisibility of problems that differentiates how hard it is to run a task.)) And there are places where there is a procedure that isn't even stated in the protocol, but which is obviously required, such as spinning small quantities of reactants down to the bottom of a tube in order to combine them when they are stuck on the walls. But being able to parse really isn't a skill that is deployed during protocol execution. The protocols are sometimes VERY VERY explicit, telling you exactly how, for example, you have to modify a unit task in order to make the protocol work well. For example, there are various ways to mix reactants. The "typical" (default) way is to use an agitator (called a vortexer), but this can cut up single-stranded DNA, which isn't very strong, and can hurt live cells, so sometimes a protocol will say something like "mix by gently tapping" or by "inverting the tube", or some such kinder, gentler method.

Now, whether by parsing or not, certainly being able to understand the protocol in chunks is a prerequisite to attaching meaning to it. There is some psychological process that assigns meaning to named things -- like words -- or tasks with names, or with some sort of separate existence whether or not they are named. This is a very interesting constraint, if true. Moreover, this need to name and objectify things leads, I think, to a significant bias in terms of thinking of things as static unit objects. For example, I found out in a recent talk that trans-membrane proteins actually spin (not necessarily around, but at least move on their axes) quite fast in the membrane. If you watch the swirling of the rainbow pattern on the surface of a bubble, you'll see immediately that the surface isn't static. In fact, everything's moving all over the place all the time in the cell, and at quite high speeds in some cases. But we - or at least I, and this appears to be the case for most of the texts as well - seem to think of these things as mostly immobile, and so don't consider the dynamical aspects of what we are working with. I don't mean to be saying that one needs a massive revolution in terms of a dynamical systems model of molecular biology; such a thing would probably make things so complex as to render them inaccessible to reason, but it's worth lying in bed on occasion and going to sleep by envisioning the blooming, buzzing confusion that is the life of a cell. Bertrand Russell, I think, said in the introduction to his popular rendition of relativity that if we were the size of an atom, and able to watch what was going on at that level, relativity would be the natural way of thinking of things. I try to be the size of a large organic molecule - a protein, for example -- sometimes, and watch the cell operate all around me.


Ah, the beauty of a well-run gel! I finally got up the guts, and enough plasmid DNA -- from a series of failed attempts to use the stupid plasmid extraction kit -- to actually run the next step in my greatly extended experiences with gene inactivation. The gel will be done in a half hour or so, and I'm keeping my fingers crossed that it works, in which case I'll purify the DNA from the gel, run a "blunting" reaction to trim the ends of the gene that was clipped by my restriction reaction, and then, um, transform the ligated gene, with the spec cassette inserted into it, back into cyanobacteria. With all that done -- many days, many gels from now -- I'll be able to finally test whether the cyanobacteria with the interrupted gene shows the same phenotype as the original mutant -- the crucial experiment for this whole operation. And then? Then on to the next gene, and maybe a publication? This is extremely long and intricate work!

I had another run in with the autoclave tape of death. As I was loading my gel, my fingers got some of the business side of the autoclave tape on them, and started to stick slightly to everything I touched. What a pain in the ass; I had to use two hands, or some extra fingers, anyhow, to put anything down!


Molecular biology is both work intensive and record intensive. I've come to rely heavily, as do all molecular biologists, so far as I can tell, upon three sources: My protocol book, my lab notebook, and a 3-volume set of protocols called "Molecular Cloning" by Sambrook, Fritsch, and Maniatis, which everyone calls just "Maniatis" for some reason -- maybe that was the original sole author. If you lose your protocols, you might be okay, because they are mostly collected from other people. If you lose Maniatis, you just buy another, or borrow it from the lab next door - everyone has M.! The protocols in M. are so complete that people usually just read them straight out of the book. It's sort of The Joy of Cooking for molecular biology: Everything you always wanted to know about....

But if you lose your lab notebook, you're hosed, mainly because you'll never figure out what the hell is in the hundreds of obscurely-labeled tubes in the various freezers in the various boxes with the various obscure markings on them. I haven't come upon a perfect scheme for organizing all this yet. The protocols I just put in date order and then I have an index in the front that tells me what date to look at for what protocols. Also, I've given each of the important (or long) protocols a code, for example, Akiko's Gene Inactivation Protocol is called AGN, and if you look on the date/page where it lives, there are various "check points" with labels like "AGN1", "AGN2", and at the end it says "AGNX". These are labels that I'll put on tubes that have reached that stage, so that I know where they are in their chemical careers. Also, every tube has a date on it, and these refer back to my lab notebook (NOT the protocol book!) In my lab notebook, I'll have a page with that date (if I've done my bookkeeping right!) and it'll have some notes telling me what I was up to, so that I can tell what I'm working on. Also on the tube there's usually some sort of sequence number. So a tube might say:


RV1 HP45


All this has to be scribbled in very small, and hopefully permanent penmanship on the top of the tube, which is a circle about 1cm in diameter. Then these get put into a set of boxes, each of which has inside it a 10x10 gridwork of paper place holders. My goal is to make the box contents go sequentially by date, but I have yet to get that organized. Also, there are 4-dC (4 degrees centigrade) boxes, -20dC boxes, and -80dC boxes, as well as room-temperature boxes. It's an organizational nightmare held together by a very thin thread of paperwork.

[Note added 20000823; I've got my boxes organized now and the system works great!]

A page from my protocol notebook describing a particular PCR reaction. Notice the protocol code (TA1) and date stamp, as well as the scribbled warnings about the heated lid!

A page from my lab notebook. Notice mention of tube labels (e.g., 0726 1176 xform) and protocols (e.g., p20000721)


Oh my; It's 4:30, and with the exception of a couple of hours working with a student over lunch, I've been running protocols all day. Things worked better today than they have been over the past week. I redid the "miniprep" that hadn't been working, asking -LZ- for advice at every step. There was one thing that I might have been doing a little too quickly, which might have reduced my yield. The question boiled down to when something was considered to have reached "clear." Anyhow, doing it -LZ-'s way worked a tiny bit better than my way had, and I also concentrated my poor yields from the past few days into a combined yield of about the same as doing it -LZ-'s way gave, so I felt that I had enough to cut the vectors (via restriction reactions) and then run a new gel. Gels hadn't been working for me either in the past few days, so I pre-tested everything several times before running the real thing, and it mostly worked out in the end. Then I purified (cf. "razor blade" purity above) and now, in theory, I'm ready for re-ligation of the gene and spec cassette -- um, actually, I have to run a blunting operation on the gene and then re-purify it and then I'm ready for ligation. Ugh. And this is only ONE GENE OF MANY -- well, of 2, anyhow. If it works, I'll go on to the next one. And if it doesn't work. Well, then I start all over again, this time probably using two genes.

DNA concentration is one of those scary and magical procedures. What you do, in theory, is precipitate the DNA from solution (DNA is soluble in water), centrifuge it into a tiny pellet at the bottom of the tube, pour off the liquid, and then re-suspend the DNA in water (actually you use TE, which is mostly water). Sounds easy, and it is! But here's what really happens: You take, say, 1ml of DNA in water, which is really 1ml water with about a microgram of DNA in it. To this you add about 3ml of alcohol and a little Sodium Acetate. Nothing visible happens at this point. Next you spin it hard - full speed for 15 minutes at least. As a result, pretty much nothing happens. (If you have a LOT of DNA in the tube to begin with, and you look really really carefully, you'll see a tiny yellow speck at the bottom of the tube, but in most cases, it's nearly invisible.) Next comes the scary part, pouring off the supernatant. If you have taken it on faith so far, that you have precipitated and spun out your DNA, then at this point you can take it on faith that you're not about to pour all of your hard-won DNA out into the sink! So you cross your fingers and pour out the tube, and then look into it again and pray that, even though you still can't see anything, you haven't just poured out your DNA. Next, another act of faith: Wash the DNA with another little bit of alcohol. This one's even worse because the alcohol you use this time is only 75%, the other 25% being water. Recall that DNA dissolves in water! So you fill the tube with alcohol and water, and then pour it out again, and cross your fingers really hard this time! Okay, so then you dry the tube and fill it with water (or TE, or whatever) and then let it sit overnight. (This time you put less solution in than you had originally - thus, the result is a higher concentration of DNA in solution.) And in the morning, voila!, a higher concentration of DNA! (Or, if you've done something wrong, none at all!)
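The arithmetic behind "a higher concentration" is simple enough to sketch in a few lines of Python. The starting numbers (about a microgram in 1ml) come from the description above; the resuspension volume and the recovery fraction are made-up values for illustration only, since how much DNA actually survives the pouring-off is exactly the thing you can't see.

```python
def concentration_ng_per_ul(mass_ng: float, volume_ul: float) -> float:
    """Concentration of DNA in ng/ul for a given mass and volume."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return mass_ng / volume_ul


def after_precipitation(mass_ng: float, resuspend_ul: float,
                        recovery: float = 1.0) -> float:
    """Concentration after pelleting and resuspending in a smaller volume.

    `recovery` is the fraction of DNA that survives the procedure
    (a hypothetical knob; 1.0 means nothing went down the sink).
    """
    if not 0.0 <= recovery <= 1.0:
        raise ValueError("recovery must be between 0 and 1")
    return concentration_ng_per_ul(mass_ng * recovery, resuspend_ul)


# From the text: about a microgram (1000 ng) of DNA in 1 ml (1000 ul).
START_MASS_NG = 1000.0
START_VOLUME_UL = 1000.0

# Assumed values, chosen only for illustration.
RESUSPEND_UL = 50.0
RECOVERY = 0.7

before = concentration_ng_per_ul(START_MASS_NG, START_VOLUME_UL)
after = after_precipitation(START_MASS_NG, RESUSPEND_UL, RECOVERY)
print(f"before: {before:.1f} ng/ul, after: {after:.1f} ng/ul")
```

Setting `recovery` to 0.0 models the bad morning: whatever volume you resuspend in, the concentration is zero.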

Someplace I should talk a little bit about the computer tools that I use to figure out which restriction enzymes to use. There are a million -- well, nearly a hundred common ones, anyhow -- restriction enzymes. These are the monkey wrenches of molecular biology -- they loosen the nuts that hold DNA together, and so effectively cut it. The reason I call them the monkey wrenches instead of, for example, the scissors, is that, like wrenches but unlike scissors, REs come in many varieties, and you have to use the right one for the right job. Similarly, ligation reactions are the duct tape of molbio, or, more precisely (if precision in such a metaphor has any meaning!), the ligation enzymes are the duct tape. Actually, they are more like the welding torches, since the enzyme itself doesn't end up on or around the DNA the way that duct tape does. Actually, the connection, when it works, is complete -- a molecular reconstruction as though the ends had never been apart. When you have what are called "sticky ends", ligation is an easy operation. Sticky ends have matching overhanging DNA, like GACT on one strand, and CTGA on the other, so that they "stick" together. Some of the restriction enzymes cut blunt ends, and some cut sticky ends. You usually aren't lucky enough to be able to get sticky ends right where you want them, so you have to do a blunting operation, which fills in the overhang and then you can do a blunt end ligation. But a blunt-end ligation is quite a bit less efficient than a sticky end one, in large part because once you've got blunt ends, they'll stick to anything, including themselves! So in addition to the ligation you want, you get a bunch of self-ligations where one end of the DNA sticks to the other end! And these are more likely than the desired ligation because, of course, when you're holding one end of the snake in your hand, the closest thing to you is the other end of the same snake. So, anyhow, you try to get compatible sticky ends if you can.
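Checking whether two overhangs will "stick" is the kind of thing the computer tools do for you. What follows is only a sketch of the idea in Python, not any particular tool's method, and the function names are invented for illustration. It uses the convention of the example above, where the two overhangs are written position by position (GACT over CTGA). Sequence databases usually write both strands 5'-to-3', in which case the second strand would be compared against the reverse complement instead; that variant is included as well.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Base-by-base complement, keeping the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Complement read in the opposite direction (the 5'-to-3' convention)."""
    return complement(seq)[::-1]


def overhangs_pair(top: str, bottom_aligned: str) -> bool:
    """True if two overhangs, written position by position, base-pair fully."""
    return len(top) == len(bottom_aligned) and complement(top) == bottom_aligned.upper()


def is_self_complementary(overhang: str) -> bool:
    """True if an overhang can pair with another copy of itself."""
    return reverse_complement(overhang) == overhang.upper()


print(overhangs_pair("GACT", "CTGA"))   # the example from the text
print(overhangs_pair("GACT", "CTGG"))   # one mismatch, so no pairing
print(is_self_complementary("AATT"))    # pairs with a copy of itself
```

The last check is the self-ligation worry in miniature: an overhang that is its own reverse complement will happily pair with the other end of the same molecule.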


This is an incredibly planning-intensive field. Inactivating a gene is a multi-day multi-step process, and there are a hundred constraints on either the times at which you have to do things, or on the size of the pipe through which you can shove things, and often both at once!


Short times: In E. Coli transformation you have to put a mixture into water at EXACTLY 42dC for EXACTLY 50 seconds. I kid you not! Why so exact? I have no idea. Probably they tried a lot of different values and the performance just falls off really fast when you miss those marks. How do you get water to EXACTLY 42dC? Heat it up higher and then throw tiny pieces of ice into it, stirring with a thermometer. When it's just right, the cells take the plunge. To make matters worse, there are time-sensitive exacting procedures on both sides of this one! Moreover, there's a pipe constraint: You can only put eight tubes into the holder at once to do this. So, given all the constraints, you pretty much can only do eight tubes at a time. If you have more, you have to find what I'll call a "time-out" point. That's a point where you can keep cells (or whatever it is you're working with) for indefinite periods of time while you get other things ready, or while you push things through the pipe a portion at a time. This usually consists of freezing something, or at least putting it in 4dC (i.e., in the fridge). As I get more familiar with the procedures, I can start to intersect them with one another. Mainly I do this by hooking them into one another's time-out points. This is another sense in which I "understand" these procedures -- I can see where they are going. If it's too early, if I'm still in the fog and only know the name of my destination but can't see the stations in between through the fog, then I can't make procedures hook together like this and can only do one at a time, and just hope that there's a time out point -- a station, as it were -- coming up. There are usually enough time-out points (once you know them) that you can hook procedures together pretty easily, but sometimes there's a rush to get a bunch of stuff to the next time-out point at the right time, because they are going to be used in the next step. 
I'm sure that PERT and Gantt chart people have names for these time-out points, but I don't know what they are, or don't remember.
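The pipe constraint, at least, is easy to write down. Here is a small Python sketch of pushing tubes through the eight-tube holder a portion at a time; the holder size comes from the description above, while the tube names and the count of twenty are hypothetical. Everything not in the current batch is assumed to be sitting safely at a time-out point.

```python
from typing import List

HOLDER_SIZE = 8  # the holder takes eight tubes at once


def plan_batches(tubes: List[str], holder_size: int = HOLDER_SIZE) -> List[List[str]]:
    """Split tubes into consecutive batches that each fit in the holder."""
    if holder_size < 1:
        raise ValueError("holder_size must be at least 1")
    return [tubes[i:i + holder_size] for i in range(0, len(tubes), holder_size)]


# Twenty hypothetical tubes, named only for illustration.
tubes = [f"tube{n:02d}" for n in range(1, 21)]
batches = plan_batches(tubes)

for number, batch in enumerate(batches, start=1):
    print(f"batch {number}: {len(batch)} tubes, {batch[0]} to {batch[-1]}")
```

This says nothing about the timing on either side of the heat shock, which is where the real planning lives; it only shows how many trips through the pipe a given day will take.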

[Note added 20000823: Since I now "get" what's happening here, there's a nice image that goes with the 42dC for 50s. What's happening is that you've put the E. Coli into a tube that's crawling with plasmids - little circles of DNA. What you want to have happen, is for them to take these up. To do this, you have to get them to open their pores. So you heat shock them, but not so much as to reduce their breeding efficiency, because what you want them to do is to take up and reproduce your plasmids. That's what gene cloning is all about! The image is of someone in a forest full of flies, breathing really shallow through tightened lips so as not to suck up a fly. Then you punch him in the stomach - not too hard, but hard enough that he has to take a deep breath, and in go a bunch of flies!]

Expert lab technicians know how to set things up so that they get to the time out points at about lunch time and dinner time. I don't, which is why I'm here writing this at 9:30pm!

There's also a much larger sense of planning and procedure integration that goes on, relating to what the overall scientific plan is. But that's usually so far above the level of day-to-day work that there are about ten levels in between. These levels are usually defined by, or at least highly correlated with, how long things take. So, lots of things take about 1 minute or less (e.g., spinning down a mixture), 5 (setting up a single PCR) or 15 minutes (spinning down the steps in a DNA extraction), some take a half hour or an hour (e.g., an enzyme reaction) or a couple of hours, and a few take overnight (e.g., PCR amplification) a day or even a couple of days (growing up a usable liquid culture from a plate pick). One or two take a week or more (e.g., growing colonies from singles).


Today I plated 100ul of infected LB broth onto four selective plates, containing an LB base medium with 100ug/ml Amp and 25 of Spec. In principle, tomorrow morning I'll have some transformed colonies appearing on those plates. These transformations arose by adding about 2ul of T-Easy vector DNA, into which I had ligated a gene with a spec resistance cassette inserted into it.

[Note added 20000823: T-Easy is an off-the-shelf kit. The concept of a living thing coming in kit form is one of those weird new ways that you have to think in molecular biology. I get the impression that the experts don't think of these as living things, but just as pieces of editable mobile DNA. I also think of it that way, when I can understand at all what's going on. In fact, I don't know if anyone ever actually gets to the point of thinking of the vectors as living things. I'm not even sure that they ARE living things; Certainly the plasmid isn't -- it can't reproduce outside of the E. Coli -- but I don't think that we think of the E. Coli as living either. If they were cute bunnies or cats, maybe we'd have a better sense of the nature of what we're up to! Not that I have any ethical qualms about using E. Coli in this way. I'm perfectly happy to blenderize bacteria! I'm just musing about how people's sense of what life means might change as a result of learning to manipulate it.]

Each step along the way -- each purification -- loses some DNA. When I was through with all the ligation, etc, I had only 1.5ug of DNA left total. And I split that four ways for the ligation reaction, which can happen in any number of ways, so we're talking NANO-quantities of DNA, but giga-quantities of individual reactions, and hopefully giga-quantities of cells plated onto each of those plates...and I only need ONE of these to actually work right! That is, only one among millions of cells, which will form one colony, needs to have been transformed with only one among millions of ligation reactions, which was done between only one of millions of copies of the gene, and only one of those possibly interrupted by a spec cassette. Each step along the way is very inefficient, say, 5%, so the likelihood of getting it all to go through is about .05^5, which is 3 in ten million chances of working, so in order to get three colonies, I have to run about ten million cells through this gauntlet. Time will tell.
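The back-of-the-envelope estimate above can be checked in a few lines of Python. The 5% per step and the five steps are the guesses from the text, not measurements, and treating the steps as independent is an assumption of the sketch.

```python
def chance_all_steps_work(step_efficiency: float, n_steps: int) -> float:
    """Probability that every one of n independent steps succeeds."""
    return step_efficiency ** n_steps


def cells_needed(target_colonies: float, success_probability: float) -> float:
    """Expected number of cells to push through to get the target colony count."""
    return target_colonies / success_probability


STEP_EFFICIENCY = 0.05  # "say, 5%" per step, as guessed in the text
N_STEPS = 5             # the exponent used in the text

p = chance_all_steps_work(STEP_EFFICIENCY, N_STEPS)
print(f"chance per cell: {p:.3e}")
print(f"cells needed for three colonies: {cells_needed(3, p):,.0f}")
```

The chance per cell comes out a little over 3 in ten million, so three colonies want a bit under ten million cells, which matches the estimate.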


For the longest time I was literally afraid to carry out one part of the gene inactivation procedure. I think that it might be interesting and important to figure out why I felt this way. The procedure I couldn't get myself to carry out for, literally days!, actually turns out to be relatively simple. It's the part where you cut the gene and the vector containing the spec cassette, purify each, and then ligate them together so that you have the spec cassette inserted into the gene, inactivating it, and at the same time conferring spec resistance. Pretty neat. Part of what was keeping me from doing this is that I was having trouble getting reasonable yield of the vector containing the spec cassette for reasons I couldn't figure out. But it was an extremely simple procedure to get this, and -LZ- claimed that she had no trouble with it. So I had to work and work on getting better yield, and the fact that I couldn't get that very simple procedure to work right had me doubting that I was going to be able to get anything at all to work, so how could I go on to the next step, which was in fact much more complex than the vector preparation, when I couldn't even get the stupid miniprep and gel extraction kits to work!

So each day for nearly a week I put off taking the next step by focusing on the spec vector purification that I just couldn't get right. -LZ- even offered to give me some of her already-purified product, but taking hers would have been double trouble: Not only would I have declared defeat, but I'd have to actually continue with the protocol! Finally, I got okay, though not great yield from the damned kits, and also concentrated the combined yields from previous failed tries. And I somehow managed to get myself over the hump and into the next step, which, although I didn't have what I thought of as enough DNA, was actually pretty straightforward, and so far seems to have worked okay.

[Note added 20010509: Some large part of the fear here is (was?) a result of some combination of the invisibility of the process and the small amounts that we're working with. Since everything is pretty much just pipetting clear liquids into clear liquids to produce clear liquids, and since you can only measure, not see, the DNA, knowing that there is some DNA there isn't enough; you have to know that it's the right DNA, or things will fail down the line. However, in order to find out if you've got the right DNA, you have to use some of it up, for example in a restriction analysis, and I was getting such poor yields that it was hard for me to convince myself to give up any of it for this analysis. If I could tell that I had the right thing in hand without having to destroy some of it in the process, for example by turning it over in my hands and looking at it, I think that I'd have felt better about moving forward.]


My inactivated transformants grew! Three of them, anyhow. One on each of three of the selective plates on which I plated out the transforms from yesterday! Three out of uncountable millions -- at least I don't know how to count them! But, as I said yesterday, one is all I need!

I want to talk a little about analogy. I'm reading one of Kevin Dunbar's papers on near and far analogies. I can't think of having made an out-of-domain creative analogy yet, and I can't recall one that anyone has made in a lab meeting yet, even though we talk science all the time. I'm not sure that scientists, or people of any ilk, engage in a cognitive process that can be fairly understood as analogy of any sort, even when what they do on the surface *looks* like an analogy. Sometimes, of course, we use explicit analogy, using the word "like", etc. But in the cases that I can think of, it's trying to explain something to someone; an explanatory analogy, not a creative one.

Now, it can be argued that most of what I've been doing is technical work, and so, since it's not discovery per se, one wouldn't expect to see creative analogies in use. But I'm also doing -- and our lab is certainly doing a lot of -- creative model construction. I think that if you were to stop me and say: "You just used an analogy there!" I'd be somewhat taken aback; "Well, so I did; ya got me!" But did some sort of analogical process take place in my head? Possibly not. How can we tell? Which came first, the creation (by some other process, such as View Application) and then the explanation (whether explicit or not)? I don't know if we can tell without (a) a very clear distinction made between these, and (b) a much better understanding of how cognition really works (i.e., in the brain) neither of which is forthcoming. These issues are part of the reason that I don't put much weight in analogy as a process, nor in the analogical debates that have been taking place in cognitive science for 20 years, and to some extent, why I don't put much weight in cognition anymore.

Explanation is a different thing altogether, because it's defined in terms of a surface phenomenon. That is, the process of explanation is the thing itself, whether I express it or not. The cognitive processes underlying explanation can be quite complex, and are poorly understood. And I don't think that the study of explanations sheds much light on them, but if you don't care about cognition -- if you care about explanation for itself, you're certainly going to be showered by them in our lab, and in my thinking. I try to explain things a lot; sometimes when something goes wrong, and sometimes when something goes right, and sometimes because that's the activity we're engaged in -- that is, trying to build models (composed of explanations) for complex processes.

Anyway, my inactivated transformants grew, so I plated them out to a new plate for safe-keeping, and have been growing them up in liquid medium at the same time, in order to get the plasmid vectors out of them (through a miniprep). The plated copies grew up fine, but my first try at the liquid culture failed. Why? Dunno. My "theory" -- my *explanation* for this -- is that I didn't warm up the liquid to room temperature, or maybe I didn't put enough on the toothpick to get them to take in the cold liquid medium. These accounts suggest that I should try again with the same medium (now that it's been warmed up; since the problem is not the medium itself, according to this explanation, even though I don't really in my heart think that this will matter!) So I did, and they worked the second time. Well, 2 out of 3 worked. What's wrong with the last one? Dunno, but the cost benefit analysis of trying to think up reasons, v. just starting over with that one, or just dropping it on the floor, since I only need one, has led me to stop thinking about it anymore. These aren't very deep explanations, but we also do the more interesting sort of "discovery-oriented" explanations, which I'll talk about in my next note.


Kevin Dunbar claims that when something unexpected happens, the first explanation heuristic is to blame the data. I haven't found this to be the case in our lab, at least not consistently. Perhaps if something OBVIOUSLY WRONG happens, then you blame the data, but that is a much deeper concept, requiring some very complex reasoning before blaming anyone or anything. For example, when we discovered that glycogen pathway enzymes were apparently involved in high light adaptive response, we said to ourselves: Hmmm, well, that's weird, we'll see if it's confirmed, but in the meantime, let's think about what could be happening. In so doing, we produced an explanation that turns out to be the one that we have stuck to, about glycogen production being an electron valve. It turned out that our discovery was confirmed -- that is, we found the same genes in the same phenotype over and over, and different genes in the same pathway, which is possibly the best possible confirmation!

In this case we didn't blame the data, nor just believe the data. Instead, we did a very reasonable thing: We thought about it, and I'm nearly certain that this is the not-very-deep response to nearly all claims of heuristic action: Scientists don't blame the data in a sort of "First blame the data!" blind rule-following way; No! Instead, what happens is that scientists think about what's going on, and the data being wrong (i.e., some kind of experimental error) is one possible explanatory operator among many.

My general reading of all this psych-of-science is that it's shallow. It looks at the surface behaviors and learned skills, and not very deeply at that. The same goes for analogical processes. These aren't psychologically fundamental processes, although they may well be educationally or practically important. Things that I put into this category include explanation and causal reasoning, theory change, analogy and its friends. Each of these, of course, relies upon fundamental psychological processes, such as memory, Commonsense Perception, etc., but the explanation processes, etc. are not in-and-of-themselves psychologically fundamental, and trying to get to the psychologically fundamental processes through these is like trying to do astronomy through a stained glass window, and taking the color of the stars to be fundamental astronomical data!


Kevin Dunbar has a theory about unexpected results, that scientists first blame the data (or procedure) and then, only when they are forced to believe the results, come up with a meaning for the results. I've argued that this is not right; that what one does with results is to form a complex cognitive construction including cost-benefit analysis and memory (e.g., of similar problems), etc. Not that one actually does this entirely *consciously*, but regardless, it's not as simple as "first blame the data," or anything of the sort.

Here's an example of some of this complexity. I'm faced today with a MAJOR error. I ran gels of the plasmid that should have contained my interrupted gene, at ~4.3kb (2kb of spec cassette + 2.3kb of gene), and the remains of the vector from which the gene was cut (at something like 2kb), but what I got was just one line at about 2.5-3kb. But these are from E. Coli that grew on restrictive media, so they must have both amp and spec resistance, so how could I not see the spec cassette!?
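The check that failed here can be written out as a tiny comparison in Python. The expected sizes and the observed range are the ones given above; the band names and the function are invented for illustration, and the sketch assumes that bands run according to size alone.

```python
from typing import Dict, List, Tuple


def missing_bands(expected_kb: Dict[str, float],
                  observed_ranges_kb: List[Tuple[float, float]]) -> List[str]:
    """Names of expected bands that fall inside none of the observed ranges."""
    missing = []
    for name, size in expected_kb.items():
        seen = any(low <= size <= high for low, high in observed_ranges_kb)
        if not seen:
            missing.append(name)
    return missing


# Expected: the interrupted gene (2kb cassette + 2.3kb gene) and the vector remains.
expected = {"gene + spec cassette": 2.0 + 2.3, "vector remains": 2.0}

# Observed: a single band somewhere between about 2.5kb and 3kb.
observed = [(2.5, 3.0)]

print(missing_bands(expected, observed))
```

Both expected bands come back as missing, which is the whole problem: the one band that did show up isn't where either of them should be.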

First, I want to make a distinction between errors that are RESULTS, and errors that are CHECKS. Results take a certain form, like a sequence annotation match, if you're trying to find a gene from sequencing, as we were when we "discovered" sll0158 and slr1176. You know when you're going to get this: At the end of a whole series of steps, each of which (or many of which, anyhow) have internal checks that must be met -- manipulation checks, in psychological terms. The thing that failed above is a check. Some checks are explicit (like this one) and some are implicit; you expect a certain thing to work a certain way, and you don't even notice unless it doesn't work that way. So I consulted a bunch of people as to what could have gone wrong. -LZ- couldn't figure out how this could be happening. -DB- had a theory that sort of makes sense, if you don't think about it too hard, which is that when I got the spec cassette I got some uncut vector (with a spec cassette in it) too, and that since that vector transforms BETTER than the modified one containing my gene, these got into the E. Coli instead of ones containing my gene, and they are what grew, and these are what I'm seeing. This only makes partial sense, but I have to start someplace, so I'm trying several work-arounds, some suggested by -LZ- and some suggested by -DB-, backing up through about a week's worth of work. Ugh!

[Note added 20000826: This is all additionally complexified by the fact that uncut, or unclean DNA moves at a different speed than cut/clean DNA of the same molecular weight, so who knows where the uncut bands really end up. People in the lab seem to have different ideas of what happens to the uncut DNA. If it's very clean, it runs faster, if not, it runs slower than cut (linear) DNA of the same MW.]


Labs are full of stuff. Stuff to the rafters, shelves all around, and to the roof on every lab bench. Our lab, with about 15 people in it, has no less than 2000 linear feet of shelf space, and that doesn't count people's desktops, and the countless yards of shelving in other rooms where warehoused stuff is stored, oh, and drawers, that I didn't even count; maybe 4000 feet of storage altogether, probably more! Some of this stuff is broken or otherwise unused or unusable, etc. But most of it is actually used pretty often.

Stuff floats from person to person, esp. common stuff, like test tube racks. Sometimes people collect a few of these, what I'll call "common" things. Not in a hoarding way; they just pick one up and put something in it, then they pick another up, then another, and before you know it, they have collected seven pieces of the same common thing. Other than paper, pens, and books there's surprisingly little "non-lab" stuff around, and few books -- just the ones that are used all the time, like Maniatis. Even for all the random stuff around, the lab is pretty efficient. In fact, the fact of there being a hundred copies of some common stuff makes the lab work smoothly. If I had to go hunting, or begging for a tube rack every time I needed one, it'd be quite difficult to get anything done; I'd spend all my time hunting for racks, or beakers, or whatever. Sometimes I do, but rarely.

Uncommon -- "special" stuff -- like chemicals and pieces of expensive equipment -- are kept in known places, and there is a code of honor about putting them back where they were when you're through. (No such code exists for common stuff. I can leave a tube rack on my lab bench indefinitely.) There are so many reagents and enzymes that our lab has a database for the very important (or dangerous) ones. The database lives on one of the computers. Unfortunately, like most databases, it's sometimes out of date, so you sometimes have to search for things anyhow. One problem that I have with this database is that I don't know what the thing I'm looking for LOOKS LIKE, so it'll say that such-and-such chemical is in a particular place, but I have to look at everything in that place anyhow, because I don't know if I'm looking for a liquid, solid, or what.

A different code exists for each type of thing; you could categorize things by the way they are treated (as well as in many other ways, of course). If I have three tube racks, someone can grab one w/o asking, but not if I have just two, and not if any of them look like they are in use (which may or may not involve actually having anything in them -- it's more like the way they are sitting on my bench -- carefully or haphazardly.) But my reagents, which I sweated blood to make, are sacred.

Each person is assigned a shelf in the various freezers, and people seem to differ on whether they care about other people putting things on their shelves. There's good reason for the "reagent privacy act," which is that, in addition to my having sweated blood over them, I often have just enough to do my own work, but also I know my reagents, and some of them are atypical, in other words, for example, reagents often have to be pH'ed -- but some people pH them before use, and others pH them at the time they are made. You have to know this in order to use a reagent, so you probably don't want to borrow someone else's reagents anyhow. Even given all these rules, anything's possible if you ask, and there's a strong sense of "we're all in this together", at least in our lab, so that, if you ask first, mostly anyone will let you do anything you like.

Some people have special responsibilities for some of the common stuff -- usually techs, or people playing "tech-for-the-day." For example, -AK-, who works on motility in cyanobacteria, is, at this very moment, setting up pipette tips for autoclaving, presumably because we've run out of autoclaved ones. I don't think that she is paid as a tech; it's just one of those responsibilities that one takes up when we run out of tips. Reagents are a different story. -LZ-, who is paid as a tech, is responsible for making sure that the stock solutions for the common reagents, like TAE for running gels, are always available. -LZ- also knows what she's doing. If someone asked me to make up a stock reagent, I'd have to ask a bunch of questions.

[Note added 20010311: I have come to be responsible for the CO2 supply. There are two reasons for this. First, it requires some heavy lifting of the gas cylinders once a month or so, and I'm the biggest person in the lab. Also, the system is computer controlled, and I'm the de facto computer engineer. In fact, I can hardly understand how the system works because it is very hard to get in balance! I also have taken to cleaning up after people, esp. in the gel room and around the sinks, and in the algae house, and harassing people when they don't clean up for themselves. I don't really understand why anyone wouldn't clean up after themselves, but it seems to be an individual difference.]

Often you need to move a lot of this stuff from one place to another; your bench to, say, the spectrometer station, or some-such. One could try to balance it all, or make multiple trips. I've developed a technique of working in and out of a plastic wash tub; about 10x10 wide and 3 inches deep. At my bench, I'll put into the tub all the things that I need. Often I have to make a trip to the walk-in fridge, or the algae house to get other stuff, and I'll add them to the tub. Then I take the whole thing to the destination, and work into and out of the tub, finally returning everything to its proper place. When the tub's empty, everything is in its place, at least in theory!


Okay, I've decided to get aggressive about this! Since my last procedure seemed to fail in an uninterpretable manner, I've got a couple of tools under my belt (restriction and gels), and I'm starting to understand how these actually work, I'm going to throw the book, so to speak, at the latest results and see if I can figure out what went wrong! Specifically, I'm going to throw three different restriction enzymes at it, which I think should cut it in various ways, and see what I get. One of these is the same one I used last week, which gave me the uninterpretable results; this is a control just to make sure that I get the same (wrong) result; also, I'm going to leave these "in the oven" (37dC -- actually, a heated water bath) for a little longer; I only left it an hour last time, but I'll leave it for two hours this time, just to make sure that the things get cut, since one of the possible reasons I wasn't seeing something meaningful was that it didn't cut the gene+vector.

Basically, what happened was this: I ligated my gene, interrupted by a spec cassette which was itself cut out of a plasmid vector, into a "T-Easy" vector -- another, commercial, plasmid. To check that this worked, I tried to cut the gene back out of the vector with EcoRI. What I should have seen was two bands: one for the interrupted gene at about (+ (length gene) (length spec-cassette)) and one for the vector at about (length T-Easy-vector), but I only saw one band at about (length T-Easy-vector). Now, this is nearly impossible, since the vector was grown on Amp+Spec medium, so it must have had both Amp and Spec resistance, so at the least it should have been (+ (length T-Easy-vector) (length spec-cassette))..... So who knows what's up, but I have enough copies of the gene, and enough tools now at hand to debug the damned thing!
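The band arithmetic above can be written out as a small sketch. The gene and cassette sizes are the rough figures from the earlier entry (2.3kb and 2kb); the vector length of 3000 bp, the candidate names, and the helper functions are assumptions made up for illustration, not measured values or anyone's actual tool.

```python
# Rough sizes in base pairs. GENE_BP and SPEC_CASSETTE_BP follow the diary's
# approximate figures; T_EASY_BP is an assumed placeholder, not a stated value.
GENE_BP = 2300
SPEC_CASSETTE_BP = 2000
T_EASY_BP = 3000


def expected_bands(insert_parts, vector_bp):
    """Band sizes expected when the whole insert is cut cleanly out of the vector."""
    insert_bp = sum(insert_parts)
    return sorted([insert_bp, vector_bp], reverse=True)


def closest_band(observed_bp, candidates):
    """Name of the candidate whose size is nearest to an observed band."""
    return min(candidates, key=lambda name: abs(candidates[name] - observed_bp))


if __name__ == "__main__":
    print("expected:", expected_bands([GENE_BP, SPEC_CASSETTE_BP], T_EASY_BP))
    candidates = {
        "interrupted gene": GENE_BP + SPEC_CASSETTE_BP,
        "gene alone": GENE_BP,
        "vector alone": T_EASY_BP,
        "vector + spec cassette": T_EASY_BP + SPEC_CASSETTE_BP,
    }
    # 2750 is the midpoint of the "2.5-3kb" band reported earlier.
    print("observed 2750 looks most like:", closest_band(2750, candidates))
```

This only ranks hypotheses by size; it says nothing about uncut or supercoiled DNA, which can run at a misleading position on the gel.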

There's one tool that I don't have, and which I'd really like, and which *should* be the easiest one for me, which is a program that tells me where to do the cutting I'm trying to do. I have the beginnings of that tool, but it turns out to be quite a clunky thing to calculate these, as the relevant databases are only available in the form of the output of web pages, not in any reasonable representation, so I have to do a bunch of clunky text hacking in order to get the information I need. Why can't I just send a query to the database and get a result in some reasonable internal format? Is that what XML is for? Maybe I'll bother the NLM about that now!
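A minimal sketch of the core of such a tool, assuming the sequence is already in hand as a plain string (which sidesteps the web-page scraping problem entirely). EcoRI recognizes GAATTC and cuts between the G and the first A on the top strand; the function names and the toy sequence are hypothetical, and a site that straddles the origin of a circular plasmid is not handled.

```python
ECORI_SITE = "GAATTC"   # EcoRI recognition sequence
ECORI_CUT_OFFSET = 1    # cut falls after the G: G^AATTC


def cut_positions(seq, site=ECORI_SITE, offset=ECORI_CUT_OFFSET):
    """Return top-strand cut positions for every occurrence of site in seq."""
    seq = seq.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + offset)
        start = seq.find(site, start + 1)
    return positions


def fragment_lengths(seq, positions, circular=False):
    """Fragment sizes produced by cutting seq at the given positions."""
    if not positions:
        return [len(seq)]
    cuts = sorted(positions)
    if circular:
        # n cuts in a circle give n fragments; the last one wraps past the origin.
        lengths = [b - a for a, b in zip(cuts, cuts[1:])]
        lengths.append(len(seq) - cuts[-1] + cuts[0])
        return lengths
    edges = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(edges, edges[1:])]


if __name__ == "__main__":
    toy = "AAAGAATTCTTTTGAATTCCC"  # made-up sequence with two EcoRI sites
    cuts = cut_positions(toy)
    print(cuts, fragment_lengths(toy, cuts), fragment_lengths(toy, cuts, circular=True))
```

Running the same sequence through several enzymes and comparing the predicted fragments with the gel is the "throw the book at it" strategy in computational form.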

[Note added 20000830: Everyone says EcoRI: "Eco-R-one," (not "Eco-R-Eye") but there seems to be disagreement about some others. For example, -DB- calls BglI "Bagle-one", which sounds dumb to me. And -LZ- calls SmaI "Smaaa-one", which also sounds dumb. Generally, unless the thing is very common and very pronounceable (like EcoRI), I prefer to spell out the letters, as "B-G-L-one" or "S-M-A-one", and I've heard people do this. Generally, saying things wrong is among the most embarrassing beginner mistakes one can make. It marks you as not knowing what you are talking about, unless you are a foreigner, even though these are all made-up words, so who cares! I suppose that this is because people feel like you aren't a member of the family if you don't know how to pronounce the family name. It's more an emotional than a logical thing, and I think that you can get people to change the way things are pronounced if you have enough power. Anyhow, this seems not to matter in the case of restriction enzymes; people do it both ways, and no one cringes when I do it my way.]


I've written before about the weirdness of pipetting clear liquids into clear liquids and ending up with something wonderful, for example, DNA. Well, today I pipetted water into *air*, with the firm belief that I'd end up with something similarly wonderful: The tools needed to clone a gene. Specifically, the other day I ordered two "primers", which are custom synthetic oligonucleotides that match the ends of the gene I'm after. You order them on the web, and they show up a couple of days later in a FedEx package that contains two apparently empty vials, which, so I was told in the enclosed materials, contain my custom primers. To make them useful, you just add water and stir. Literally! So, I did, and tonight I'll try to use them to amplify my gene. I'm keeping my fingers crossed.


I'm screwed! This is a complete information nightmare. I'm only working with a few genes -- in fact, only ONE gene, a couple of very common and well-documented vectors, and well known and well understood restriction enzymes, yet I'm already overloaded with information to the point that I can't compute what's going on, so none of my debugging strategies work! There are LOTS of reasons for this confusion. First off, I don't know the EXACT sequence of the gene, since I'm actually using a slightly larger form of it, extracted by -LZ-'s primers, which are at some arbitrary distance outside of the gene itself. Second, the information you get from the gel is approximate at best; the ladder (the size standard) is all bunched up where I want to read it, and the legend doesn't seem to match the ladder quite right, so I can't really tell what the sizes of my fragments are, except approximately. And, to make matters worse, if two fragments are the same size they end up in the same place. And to make matters worse yet, there is a ghost on the gel from uncut DNA, which either runs faster, or slower, or the same -- no one seems to know (although the consensus seems to be that it runs slower!), and to make matters even more worse yet, I'm getting a LARGER fragment when I cut it than I get when I think that I'm NOT cutting it! Ugh. So all my debugging procedures have gone to hell.
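One way to make the ladder reading a little less of a guess is to interpolate between the two ladder bands on either side of a fragment, working in log(size), since linear DNA migrates roughly in proportion to the log of its length. This is a sketch under that assumption only; the ladder distances below are invented, and it does nothing for uncut DNA, which does not follow the same rule.

```python
import math


def estimate_size_bp(distance_mm, ladder):
    """Estimate fragment size by log-linear interpolation between ladder bands.

    ladder is a list of (migration_distance_mm, size_bp) pairs read off the gel.
    """
    points = sorted(ladder)
    for (d1, s1), (d2, s2) in zip(points, points[1:]):
        if d1 <= distance_mm <= d2:
            frac = (distance_mm - d1) / (d2 - d1)
            log_size = math.log10(s1) + frac * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("band lies outside the range of the ladder")


if __name__ == "__main__":
    # Hypothetical ladder readings: (distance from the well in mm, size in bp).
    ladder = [(12.0, 10000), (18.0, 5000), (24.0, 3000), (29.0, 2000), (38.0, 1000)]
    print(round(estimate_size_bp(26.0, ladder)))
```

Where the ladder is bunched up, the flanking bands are close together in distance, so small measurement errors turn into large size errors; the function makes that visible but cannot fix it.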


My PCR last night failed. I got nothing -- just a dull smear on the gel. This was the one with my brand new primers, trying to excise and clone SLR1176 -- my own gene! What could be wrong? The primers could be wrong. I double checked that they actually match the gene. They do. What else? Did I hydrate the primer correctly? I followed the instructions. Did I get confused and add one primer twice instead of both once? Is my H2O contaminated? I haven't changed it in days. Did I over-dilute the primers? (From 250pmol/ul to 2pmol/ul = 125 times!)
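The dilution check can be done with the usual C1*V1 = C2*V2 relation. A small sketch, using the 250 and 2 pmol/ul concentrations from this entry; the 250 ul final volume is a made-up number for illustration, not the volume actually used.

```python
def dilution_factor(stock_conc, target_conc):
    """How many-fold the stock must be diluted to reach the target concentration."""
    if target_conc <= 0 or stock_conc < target_conc:
        raise ValueError("target must be positive and no greater than the stock")
    return stock_conc / target_conc


def dilution_volumes(stock_conc, target_conc, final_volume_ul):
    """Return (stock_ul, water_ul) from C1*V1 = C2*V2."""
    stock_ul = target_conc * final_volume_ul / stock_conc
    return stock_ul, final_volume_ul - stock_ul


if __name__ == "__main__":
    print(dilution_factor(250, 2))        # fold dilution
    print(dilution_volumes(250, 2, 250))  # hypothetical 250 ul working stock
    print(dilution_volumes(250, 2, 500))  # same ratio with every volume doubled
```

Doubling the final volume doubles the amount of stock that has to be pipetted, which is the whole point when the original volume is only a couple of microliters and easy to lose.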

I can't re-hydrate the primers, so I redid the dilution using double the amounts so that it's easier to ensure that some actually got into the solution, replaced my water, made sure that I did each primer once. What else could be wrong? Was the program wrong? Did it run right? I.e., Did I program the cycler wrong?

It's impossible to debug this stuff because you can't do high speed trial and error, and can't fencepost. None of my debugging skills work when I don't have a meaningful result to hang on to.

I'm rerunning it, and making more gels. Molecular biology takes a LOT of gels!

[Note added 20000830: Molecular biology is possibly the most distributed science. It requires tens of people doing the same thing before one of them more-or-less accidentally gets it right, and then the various pieces of rightness, culled from a pile of debris, are combined into a result. Most people think that this will all eventually be done by robots, but I don't think so. Sure, some painstaking yet simple mechanical tasks could be taken over, like gel loading, but even that would require a hell of an intelligent robot. There are a lot of reasons that I don't think robots could do this. Pretty much everything that I do is "fuzzy" in some way. Take the gels, for example. They are never in quite the same thickness, and the holes are never quite the same depth, and usually they are sort of twisted so that the holes don't line up. Okay, so if you had a gel-producing machine, it'd take care of these, and mixing the materials for loading is easy enough. But then when you load the gel, sometimes the material floats to the surface, and other weird effects that you'd need a very sophisticated vision system to see and understand.

Okay, that's all pretty non-intellectual, but there are more interesting issues, even just with gels. Take gel interpretation. Is it a band or isn't it? Is it one band or two? What's all this smearing? Is it important smearing? Should I ignore it, or change my levels of belief of the band amplitudes based upon the smearing? And if you don't get the bands you're expecting, then what? At the moment, robots can only do standardized procedures, and are bad at doing anything that requires flexibility or in-stream interpretation. Some of this could be standardized, and made robotizable, but flexibility and in-stream interpretation are what this field is all about -- at least the lab part of it!]


I changed my water, and remade my primers, and it worked this time! I think that last time I'd actually got myself confused and put in only one of the two primers twice or something. Anyhow, at the last minute this afternoon, after the boring and poorly taught radiation course (Part I), I got my new PCR products, and decided to go for the gusto, and loaded all four vessels-full into the gel, and voila! Pay-dirt; hello SLR1176!


I'm having some problems -- a little DNA lost to gel-loading weirdness -- but in general I'm getting things done and getting results. I've even started to interleave a couple of tasks together, and I tried a new procedure yesterday (double digestion), which seems to have worked!

Today I'm purifying the products of that double digestion via gel -- which is where I lost a small amount to gel-loading difficulties -- and I've also started up the first step of a ligation for my cloning products from SLR1176 -- "my" gene! (For which the PCR amplification that I did last night -- second try -- worked!)


As I start to interlace protocols, it reminds me a little of playing a fugue. A lot of the protocols have similar steps, and you interlace these, sometimes putting similar sub-steps together. For example, it happened that several 1 minute centrifugation steps co-occurred, or were so close that I was able to make them co-occur by holding some of them up slightly, so I was able to run these all at once. (Important to label your vessels!!!) So it's as though there is a theme: Gene Hacking -- and variations: Cloning, Digestion, Ligation, etc, and a rhythm that everything goes at, and each of these varies slightly, but has its own "center", and then they get interleaved.

Now, when I'm playing a fugue I usually don't play each part separately and interlace them, rather I just know the movements that I go through, and the themes come out right by virtue of Bach being a genius. I can *hear* each theme separately if I want to -- in fact, I'm not sure that I can suppress them, and sometimes, though rarely, I can *reason* about them together ("I know where this one's going, next note is....") But rarely am I *playing* them as separate yet interlaced themes. What I'm usually playing is the whole piece as one unit. But there are those rare moments after hundreds of passes at a particular piece, when I get a deep understanding of Bach. When I *am* Bach, so that, just for a few seconds, I am controlling the music, playing through it consciously as a set of specific themes that I know, can feel, and can control; can make parts faster or slower; can see the next note and choose it consciously (and sometimes wrongly!). At these moments I, not Bach, own the fugue. Maybe this is what master musicians feel all the time.

Am I getting to that point with some of these protocols? I have so far run them rote from the book, being careful not to do anything even slightly wrong, and not make any changes in timing or amounts, but recently I've advanced to "owning" some of them, like gel running and digestion, so that I know where they are flexible -- or at least I have a sense of it -- and can put them together, just a little, into the counterpoint that is laboratory molecular biology.

(I need to get and read David Sudnow's "Ways of the Hand"!)


So, yesterday my double digestion and gel purification worked surprisingly well, but then I did it again (with the rest of the digestion product, some of which had been lost in the previous gel mess-up), and although the gel loaded perfectly this time, the results of running it were not as nice. But I extracted both of these from the gel anyway, and will see what the yield is like.

Meanwhile, back at the ranch, I'm in the last step of cloning the new gene (slr1176). I realized that last time, when I was in a fog, I neglected to make an important inference, which is that once I had cloned the gene -- that is, grown it in E Coli in a plasmid vector, I should PLATE AND STORE some of that E Coli with the plasmid+gene in it for future reference. Duh. I don't know what made me realize this all of a sudden yesterday when I was going over the steps to clone the new one. Maybe it was when I realized that I was using up all of the cloned sll0158 plasmid that I'd made last time, and that I'd have to make more, and I was at that very moment making the new one, which is a hard procedure -- the one involving EXACTLY 42dC for EXACTLY 45sec, and all that jazz -- that I realized that I should save my partial results at all times, esp. ones that are hard to come by.

This brings me to a thought about lab notebooks: a lot of my notebook is indicating partial results of protocols, what step I ended at, etc. I don't remember if Afferent's protocol thingy -- the thing that -DC- called the "lab notebook" -- has this capability, but it should. I do it by having these codes (previously described), and labeling the partial results (and even the final results) with the appropriate code, and the date, so that I can go back to my lab notebook and figure out what I was doing, and where I left off, and with what inputs I was doing it.

[Note added 20000906: -DC-'s notion of a lab notebook is trivial compared to what us working molecular biologists actually do with lab notebooks. I tried to tell him this when I was still at Afferent, and was only working part time in the lab, but he's an egomaniacal moron, and wouldn't listen. What his lab notebook does is just record protocols in an overly simplified stupid graphical representation, and then associate the protocols with samples of material. This lets you do some simple tracking of what you are doing, but I do very complex manipulations with my samples which aren't easily captured in a simple protocol-per-sample representation. Moreover, I told him a number of times that the chronological nature of notebooks is very important. You want a record of what you did when and in what order to what, as well as why. The Afferent lab notebook doesn't give you any of this capability.]
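To make the contrast concrete, here is a minimal sketch of the kind of record the paper notebook keeps: chronological entries, each tagged with the code written on the tube, the step where the protocol stopped, the samples that went in, and a reason. Every name here (`Entry`, `Notebook`, the field names, the sample codes) is invented for illustration and does not describe Afferent's software or the actual coding scheme.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Entry:
    day: date
    code: str            # label written on the tube or plate
    protocol: str        # what was being run
    last_step: str       # where the protocol was left off
    inputs: list = field(default_factory=list)  # codes of the samples used
    why: str = ""        # reason for doing it


class Notebook:
    def __init__(self):
        self.entries = []  # kept strictly in the order written

    def record(self, entry):
        self.entries.append(entry)
        return entry

    def history(self, code):
        """Entries for a sample and everything it was made from, oldest first.

        Assumes inputs are always recorded before the things made from them.
        """
        wanted = {code}
        found = []
        for entry in reversed(self.entries):
            if entry.code in wanted:
                found.append(entry)
                wanted.update(entry.inputs)
        return list(reversed(found))


if __name__ == "__main__":
    nb = Notebook()
    nb.record(Entry(date(2000, 8, 1), "P1", "PCR", "done", why="amplify gene"))
    nb.record(Entry(date(2000, 8, 2), "L1", "ligation", "overnight", ["P1"]))
    nb.record(Entry(date(2000, 8, 3), "T1", "transformation", "plated", ["L1"]))
    for e in nb.history("T1"):
        print(e.day, e.code, e.protocol, e.last_step)
```

The ordering and the `inputs` links are the parts a protocol-per-sample representation loses: they answer "what did I do, when, in what order, to what".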


I've been after procedural UNDERSTANDING here, not so much procedural DOING or procedural expertise or skill, but my understanding has been so blockaded by the difficulties of the DOING part, that until very recently -- like this week! -- I have not really felt like I had any understanding at all of what is going on. Was it that I didn't have the handles or hooks onto which to hang pieces of content, or maybe I was just in such a fog that I didn't have the mental space, time, or energy to make the connections, or maybe I just hadn't done each procedure more than once, and so really wasn't thinking about them, or maybe there were little pieces of understanding that had to coalesce, just as little pieces of the protocols themselves coalesce into the whole in my head. Probably a little of all of these, and other factors.

But today, although things are going no better than average, I think that I can actually begin to THINK about what I'm doing at a meaningful level, instead of merely thinking about how to do the next step -- or even just how to figure out what the next step should be. Remember the train that is driving through a fog bank, where I know, but can't see the eventual goal, and I can't see more than a few inches ahead, either? Well, that fog bank is starting to clear, and I can see the tracks a little. But more importantly, I can see some of the things that are around me, and around the tracks ahead of me, as well. I'm still nearly exhausted all the time, but I don't feel both exhausted and lost -- just exhausted.


I'm starting to be able to be more aggressive about debugging, or at least re-running my protocols when they don't work very well. It's very important to be able to do fast-turn-around debugging, since you can't fencepost for errors -- well, you can, by running gels at every point along the way, but this uses up some of your materials at each check, and takes time, and sometimes the results are confusing. So usually, for a procedure that takes only about a day, it's enough to simply rerun it.

There's a lot of material building that's required in order to do this, esp. gel-making, which uses Ethidium Bromide, which is very bad stuff, even in small quantities because it intercalates itself into DNA, causing your DNA to unravel! This is, of course, why it's used, but it's also why it's very bad stuff, and even though gels have only 50ul/l of it, you don't want to get any of the gel material on you, or breathe it, or anything, so there's a lot of care (and time!) taken around gels, making them less useful for debugging.


So, last time I did a transform and plated them on XGal+IPTG, with sll0158, I got three clones. This time, I've got hundreds! What did I do differently? No idea, but I'm not complaining! I guess there's an issue of how I choose which one or two of the hundred or so to pick. Maybe I'll pick four, one from each plate, just to be sure. I think I'll pick them from plates that had some blue clones as well, in order to ensure that the plate is actually selective!


So, things are starting to work better. I got better yield off my miniprep using -CWC-'s idea of using 65dC water, and 5 of 6 of my latest double digestions seem to have worked out. I also think that I know part of what was wrong with the gel loading, which is that I wasn't cooling the reaction before trying to load it, and the gels were pretty cold, having been in the 4dC, so the warm mix simply floated to the top. That and there wasn't enough TE in the running gizmo, so there were weird eddies. Anyhow, so with all this fixed, I might finally have got pretty good spec cassette, oh, and my SLR1176 clones came out like gangbusters (!), so I'll grow some of them overnight (and this time I'll remember to plate them out to save some!) and then miniprep those, and maybe I'll be able to try an inactivation this weekend..... ?

[Note added 20000830: The theory that the temperature was making the gel loading materials float may or may not have been bogus. Some hunting around on web pages produced another suggestion, that there is left-over alcohol in the materials. You use alcohol to precipitate DNA, so I now have this ritual of (a) heating the materials open to burn off alcohol, and (b) icing them, before loading.]


I now know a bunch of micro-procedures pretty well, like minipreps, making gels, gel running, gel extraction, digestion reactions, etc. I still do most of them with the manual or my protocol book in front of me, but I sort of know the routine, and I'm making them work a little better, now, the nth time through.

I'm wondering what I'd do if someone asked me to show them something. Would I just point them to the relevant manual? Would I give them my (messy -- needing to be cleaned up!) protocol pages and make them copy them? Would I point them to the manual and add a few words of advice? (There should be a collection of "what the manual doesn't tell you -- tricks of the molecular biology trade" book!) So, what? I'd probably be more patient than -LZ-. No, that's not right. -LZ- is perfectly patient with me. "Short" might be a better word for it, or "Fast." She goes through things once and expects me to get them, not realizing that I'm in a fog -- or that I was at the time, anyhow. Oops. Probably should stop my gel!


I'm doing the latter steps of the inactivation protocol again, this time with my own gene: slr1176, and I'm not in too much of a fog about it either! I'm actually planning the next step based upon the LOGIC of the operation, rather than just reading -LZ-'s (or -AK-'s) protocols and doing what they say, and I can reason about things like the blunting reaction and the ligation reactions. I'm having good success with restriction reactions and with gels, and I can compute how long the product is supposed to be, and (usually) tell from the gel whether I've got it or not. I got a lot of clones when I tried to clone 1176 (as opposed to the 3 I got for 0158), and although I shouldn't count my clones before they hatch, I'm feeling quite confident about the steps I've taken so far, and I'm expecting some success this time through.

All of this apparent understanding and positive feeling about what I'm up to has come upon me rather rapidly; in the past week or so, as you can tell from my writing. I've been full time in the lab since mid June, so only about a month and a half, and I've only really been concentrating on techniques for that period of time. Although I had done once or twice pretty much all of these things during my part time period, I was definitely not focusing at that time, and, as I've written, I was fogged in until just recently.

What has happened is complicated. I've proceduralized the procedures well enough that I can think about them, and I can tune them slightly, and debug complete errors successfully (although not subtle errors, yet). I've mentally cross indexed the procedures in many ways -- that is, they are available to me in many ways: what reagents they use, what they do, how long they take, what larger-scale procedures they appear in. Also, I've gotten myself organized to the point that I'm not spending a lot of time searching for things any longer. Also, I've become less timid regarding taking action -- trying things out.


Well, I was on top of things yesterday, then everything went south. Well, not everything, but the gel extraction I did to recover the linearized DNA from my perfect digestion got bupkas (I wonder how that's actually spelt?! :) So today I'm re-doing the digestion, and this time I'll show the gel to -DB- before doing the extraction.


Again, debugging pays off, at least a little. I ended up theorizing that the water that I was using to elute the DNA from the gel extraction column wasn't the right pH. In fact, they give you a TE elutate to use, but I hadn't been using it because the manual says that you can use water, and -LZ- told me I could just use water. To test this theory, I dragged through the trash to find the gel columns from yesterday, and redid the elution with their buffer instead of my water, and in 3 of 4 cases I got a lot more DNA. So, now I'm rerunning the digestion, and rerunning the gel (which isn't working so well, for unknown reasons -- the product seems to be seeping out of the gel in all directions!), and I'll redo the gel extraction with their TE elutate, and then, finally, maybe have enough DNA to do the blunting reaction and actually get some product!


Statistics: You don't run statistics on car repair, do you? When you pull something out, it's out, when you put something in, it's in. You have to make sure that you've got it out or in, but when it's out or in, it's out or in. Maybe if you had a Mole of cars, you'd need to run stats on your work.

Things not to do with your gloves on: I. Avoid embarrassment by not (a) Trying to slow down the centrifuge by hand. (b) Trying to take the tip off of a pipette between the sides of your fingers while holding things in both hands. (c) Putting on or taking off autoclave tape (unless absolutely necessary). II. For your safety: (a) Scratching your nose or rubbing your eyes. III. For other people's safety: (a) Using the phone. (b) Typing on the computer.


Well, this isn't going all that well. -DB- says that my gels look messy to her, which means starting over -- again! -- because I'm running out of materials. I also have to add a cleanup step to my miniprep, which means another gel, which means more DNA loss in the pipe, although things might be a little cleaner. Sigh.


As I've previously written, one is often required, in molecular biology, to pipette extremely small, nearly invisible, quantities of clear liquids from one place to another. Getting them into the pipette tip is the easy part and, as a result of capillary action, takes nearly no suction at all. Given that, how does one get the liquids back out of the pipette into the target? There are several answers, none of them completely obvious. The easiest is that you touch the tip to the surface of the liquid into which you are pipetting, so that when you extrude the enzyme, or whatever, it diffuses into the medium. There are two problems with this. First, what about the first material? Second, this contaminates the pipette tip, and often it's more of a pain, as well as a waste, to have to change tips between each of a hundred pipetting operations of the same chemical.

The non-obvious solution is to touch the tip of the pipette to the wall of the tube, and then extrude a tiny bubble of material onto the wall. For reasons that I don't know, and that only physicists care about, the bubble prefers to be on the wall of the tube rather than on the tip of the pipette. Indeed, you can array all four or five of the reactants on the walls around the tube, and then spin them down into the tip of the tube to run the actual reaction.

Centrifuge tubes have other amazing and useful properties. My favorite is that you can put them down on their sides OPENED and they won't spill out, having something to do with the surface tension of water being such that it essentially makes a bubble against the air so no mixing occurs. I've even dropped an open tube and had the materials stay in! Also, you can centrifuge them pretty darn hard, and not break or bend them, and the seal will always open. All in all, centrifuge tubes are pretty useful and hardy little buggers.

[Note added 20000831: I have broken things in the centrifuge, and it's both spectacular and messy! Our tabletop Fuges (pronounced like "kludges", but if I wrote it "fudges" it'd be chocolate!) run about 14,000 RPM, or roughly 16,000 G's, and the Ultras run 50-60,000 RPM - God knows how many G's! I'm told that the Ultra has thick metal walls that will spin freely if you hit them, in case something lets loose in there!]
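As a rough check on the RPM and G figures in the note above, here is a minimal sketch using the standard conversion RCF = 1.118e-5 x radius (cm) x RPM^2. The 7.3 cm rotor radius is an assumption picked for illustration, not a measurement of the lab's machines, and the Ultra's radius is likewise assumed to be the same.

```python
def rcf(rpm: float, radius_cm: float) -> float:
    """Relative centrifugal force, in multiples of g, at a given rotor radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


ASSUMED_RADIUS_CM = 7.3  # hypothetical rotor radius, chosen for illustration only

tabletop_g = rcf(14_000, ASSUMED_RADIUS_CM)  # tabletop speed quoted in the note
ultra_g = rcf(60_000, ASSUMED_RADIUS_CM)     # top of the quoted Ultra range

print(f"tabletop: about {tabletop_g:,.0f} g")
print(f"ultra:    about {ultra_g:,.0f} g")
```

Because force grows with the square of the speed, going from 14,000 to 60,000 RPM at the same radius multiplies the G's by about eighteen, which is one way to see why the Ultras need those thick walls.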


Things went fairly well today, although I didn't do very much. I learned some new tricks from -DB-. Mainly that you shouldn't pay much attention to the kit manuals' precise directions, but do things a little more aggressively to get better results. For example, the kit says to elute the DNA in a column with 50ul of elution buffer, or water, and that you can use 30ul for higher concentrations, but let it stand for a minute before spinning it out. However, the real instructions should read more like: "Only use our elution buffer, and heat it to 65 degrees C, and when you put it in the column, let it stand for 5 minutes, and then do it twice to collect the remains, and then concentrate the product to get the concentration you want." I've got this now from a series of comments and suggestions from various folks, -DB-, -CWC-, and others, and I made up the "elute through the column twice" thing myself. Sometimes I get more the second time through than I got the first! Don't ask me why -- maybe because I'm letting it stand longer?

Anyhow, so I'm concentrating my DNA, which is unfortunately an overnight operation. The amount of DNA we're talking about here is quite small: Order of 500 nanograms, so when you spin it down, you can't see anything, and when you pour off the supernatant, it's a complete act of faith that you haven't poured out your DNA -- the proverbial baby with the bath water! But I've done this before, and it worked fairly well, even though then too it looked like I was pouring out my DNA, so we'll see what happens. I'm pretty optimistic that I'll be able to do the blunting operation tomorrow, and maybe actually set up the ligation, although I won't be able to actually DO the ligation until I get back from San Diego! Sad; I'm excited about doing it, but it looks like it'll have to wait until next week, now. :(


I know how to do things now. I'm not afraid of new procedures, although I'm still afraid of some of the chemicals we work with -- Ethidium Bromide and Phenol, especially. There are a few procedures that I can do without having to concentrate on them, and I do them fairly quickly. (It seems to me, having been through this experience, that a lot of performance improvement, at least in terms of speed, is savings in the "getting over the hump" of actually being willing to ACT, esp. for actions that are dangerous. I have to do it a few times before I can convince myself rapidly to "just do it!" This doesn't explain understanding or improved precision, but I'll bet that it has a lot to do with speed.)

I've wanted to write about a few things, but can't really remember what they are. I haven't written in nearly a week, maybe because things have been going moderately well so there's not much that is problematic to write about. We'll see tomorrow morning whether my clones come up, and if they do, then I might have finally inactivated a gene for real! We'll see about that tomorrow.

Oh, one thing that I wanted to write about was ritual. I talked to -CA- some about this, but we couldn't come up with a very good definition for a ritual. What I have in mind, regardless of what you call it, is a procedure that may or may not have a real function, but whose meaning is lost to antiquity -- at least, *I* don't remember why I'm doing it, or if I do remember why, I'm not sure that it actually makes sense. A good example is the ritual that a lot of people have of leaving a machine (like a TV) that they have just turned off, in the off state for a few seconds before turning it back on. I think that this is a hold over from tubes, but I'm not 100% sure, so I do it, just like everyone else that I know, although the times that I have failed to do it, nothing bad has ever happened.

I've produced many lab rituals of my own. The ones that I'm most aware of are the most recent, and have to do with this damned gel extraction process that I can't ever get good yields from. So I've taken to heating the elution buffer, running the materials through the column multiple times, and letting the various steps sit for ten minutes each, esp. the pre-elution step. Are these functional? Who knows. I can make up ways in which they are, but I'm not sure. Taken together, I seem to be getting somewhat better yields, but still not great. (-DB- claims to be getting 90% yields!)


So, this morning I had no visible clones from last night's hectic transformation. I could have thrown out the plates, but, on a hunch, didn't, and this afternoon, voila!, some colonies arose from the ashes! At the same time, I asked -LZ- what to do, and she suggested redoing the ligation, which I've set up, but with any luck, I won't need the new ligation, but will just use the late blooming clones, and one of them will be right.

This afternoon -AG- sat -LZ-, -QF-, -T-, and me down and laid out the program of our research, as he sees it, at the moment. It's going to be a lot of work, running a large number of mutants through time courses, and collecting cells for RNA from these at each of a number of time points. Ugh.

-LM- just showed up and we're going off to a CIW BBQ.

(Later: BBQ was boring, so I'm glad that -LM- was there. The French women were the most interesting -- wearing elaborate outfits. They always wear these sorts of things around the lab; I don't know why. Everyone else wears jeans. Our lab group, except for -LM- and me, sat in a boring group. One of -LM-'s old TAs is now at CIW so we talked about boring things.)


The transforms took longer than expected to grow up, and now only two of the six cultures that I made from the tiny transformed colonies seem to be growing, and they are growing very slowly. (At least I had the foresight to take six of them!) By this morning any that had grown should have been thick with bacteria. I'd understand what was happening if they were ALL clear, but there are two good ones in 6; 33%, not all that horrible, esp. since the colonies that I picked were tiny tiny tiny; but why aren't they thick?! I had another procedural brainstorm this morning, faced with this problem, and plated from the two that were growing onto LB/AMP/SPEC plates, so that at least when I try to miniprep these poor growers, I'll have backup in case I get nothing from the miniprep.

Was computer debugging ever this frustrating for me? I don't think so. At least with computers you can hold conditions the same trivially, and start again under different conditions. This wet lab stuff is incredibly painful to debug. I wonder if, when I'm an expert at it, I'll have a bank of "Things that can Go Wrong," so that I'll be able to avoid the problems, and recognize the problems when I run into them. In ten years will this seem as easy to me as computers? I doubt it. -DW- spent a week here this past week, and he was working near my bench. He was having problems -- something didn't cut. -B-, a professor from someplace else, was here when I first got here. She was having problems -- contamination, things not working, etc. etc. Frustration.

This is another thing that makes this one of the most distributed sciences: Things that don't work in one person's or lab's collective hands will work in another's. Frustration for some, success for others, Nobels for the few. I'm not holding my breath.


One thing that bugs me no end, as you well know by now, is when my procedures don't work. It would be very helpful to know what the baseline for success in these various procedures is, and whether there are "developmental" trends in them. There is a persistent claim in various informal writings on science that an experiment has to be done three times before it works. Although I'm sure that in reality this is 3+-7 times, there remains the point that it often takes inexplicable practice to get things to work. One of my goals in this work is to figure out why that is. Toward that end, it would be very helpful to know whether this is actually the case, and I'm fairly sure that no researcher has asked the question in a careful way. I also do not intend to address the issue very carefully, but I'd be interested in hearing from others how well things work, and how many tries it takes to get to "success." At the very least this would make me feel a bit better -- or perhaps a bit worse! -- depending upon the results.

[Note added 20000831: -DB- talked to me yesterday about how to debug the end of my inactivations, which involve a very hard blunt-end ligation. Her proposal was long and complex and highly heuristic. It would have taken a week or more, and there's only a small chance that it would have given me the information I need to figure out what's wrong.]


Kary Mullis has apparently used my lovely car metaphor, which I decided was better than the train. I'll have to read his book now, I guess.

Nothing worked. I'm starting from scratch. Nothing grew at all. I'm going to blame the Spec cassette, so I'm setting up from scratch on that, and I'm going to re-clone the 0158 gene, because I neglected to keep copies of the clones. Duh. Sigh. I may be able to start from the PCR products, if I can find them. This will be a good test of my record keeping!

What was I thinking on the way in? Oh yes, the triangle of knowledge: At the three corners we have objects (e.g., clones on plates), procedures, and models. What holds them together until I have them in my head is my notebooks, qua *mediating objects*!

Mediating procedures as a focus for this talk? Yes, that might make a good focus. Can I tie mediating procedures together with View Application for scientific discovery? I need to find out how Vygotsky really uses the concept of mediation.

Hutchins has this hard-to-understand neo-Vygotskian thing about Distributed Cognition involving "culturally constituted functional groups" rather than an individual mind. His theory "reconceptualizes 'information' as the propagation of representational states of mediating structures that make up the dynamic and substance of any complex system." These include internal and external representations, and lab notebooks would be among his favorite artifacts, I'm sure.


When a reaction fails that has worked before... blame your reagents! So, I put up something that I have grown, I don't know, as many as 10 times before -- spec vector (pHP45) -- and it didn't grow! My current theory is that my medium is bad -- or perhaps my antibiotics are too strong, or something. So I'm using "fresh" LB, made by someone else!! and we'll see what happens. I *think* that I've used this medium before, but I'm not at all sure....


I was wondering in a previous note about how I would put together my observations on learning procedures, with learning the knowledge that puts them together. (Huh?) I had hoped that I could put View Application in there someplace, although I'm not sure that it goes. But here are the beginnings of some understanding on this, esp. in the area of thinking about the specific DNA cutting protocols, E. coli transformation, etc.

One idea is that when I have a protocol well in hand -- literally! -- I can start to think about its function instead of its form. In fact, I can dwell upon its function, which is often what I find myself doing when things don't work -- a very common occurrence for me! In that case, I dwell upon the functions of procedures, and in so doing take small steps toward a full understanding of what's going on. For example, what could be wrong with my ligations? If the LB is wrong, it could be that the pH is wrong; what does that do? I don't know, but I could ask. Maybe I have to begin to grow the cells w/o antibiotics, and then add the antibiotics (lab members differ on this). But then the plasmid could jump out of the cells (according to the dissenting parties in this discussion). Etc. Each of these small analyses only leads a couple of inferential steps, and these are mostly mundane inferences and guesses, but taken together, they form a rich understanding of the function of what I'm doing, of the science, and of the whole ball of wax.

Other examples: Why blunt end ligations are hard (everything sticks to everything else). Why gel purification has low yields (e.g., need to warm the water). Why it's hard to get the spec cassette out of the vector (sizes are too close to resolve). Which cutters to use (blunt v. sticky ends, details of restriction enzymes, etc.)


So, I'm sitting here staring at my latest restriction. Things are actually going fairly smoothly, and I have a new theory about why the ligations haven't been working: That I've been getting the wrong band from the gels to extract the spec cassette. So I redid -DB-'s double digestion, and took out *both* of the likely bands, and was going to sequence them, but -T- suggested just finding an enzyme that will cut the cassette but not the vector, and digest some of the product, and the one that gets cut -- that is, the one that shows up in pieces on a new gel -- is the cassette! Okay, I'm game, so that's the digestion I'm watching now.
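The trick -T- suggested, finding an enzyme that cuts the cassette but not the vector, is the kind of lookup that is easy to sketch in code. The recognition sites below are the well-known ones for these four enzymes, but the two sequences are made-up toy strings, not the real spec cassette or pHP45, and the function names are hypothetical. The sketch searches only one strand, which is sufficient because these four sites are palindromic, and it ignores sites that span the join of a circular plasmid.

```python
# Recognition sites (5'->3') of four common restriction enzymes.
SITES = {
    "EcoRI": "GAATTC",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
    "SmaI": "CCCGGG",
}


def cut_positions(seq: str, site: str) -> list[int]:
    """Return every 0-based index at which `site` occurs in `seq`."""
    seq = seq.upper()
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i)
        i = seq.find(site, i + 1)
    return positions


def diagnostic_enzymes(target: str, background: str, sites=SITES) -> list[str]:
    """Enzymes with at least one site in `target` and none in `background`."""
    return sorted(
        name
        for name, site in sites.items()
        if cut_positions(target, site) and not cut_positions(background, site)
    )


# Toy sequences, invented for illustration only.
toy_cassette = "ATGCGGATCCTTAGCAAGCTTGGA"
toy_vector = "TTGACAAGCTTCCGATGAATTCAA"

print(diagnostic_enzymes(toy_cassette, toy_vector))
```

In the toy data HindIII cuts both sequences and EcoRI cuts only the vector, so neither distinguishes the bands; only an enzyme that cuts the cassette alone will make the cassette band show up in pieces on the new gel.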


As it turns out, molecular biologists aren't very smart. Or, more precisely, they don't think very deep thoughts very often. Molecular biology is pretty much a solved problem, like chess. And, like chess, figuring out anything you don't know is merely a matter of technique, or brute force, or some combination of these, but never anything very deep in the content area -- no new principles -- even the old new principles aren't very deep; it's a "How does this thing work, in terms we already know?" problem, not a "Are we using the right terms?" problem. This is not to imply that molecular biology is simple. Quite to the contrary, it is extremely complex, but the complexities are technical ones, not conceptual ones.

And, of course, it's extremely important; much more important than most actually "deep" scientific problems, like cosmology or psychology. But public sanitation engineering is also more important than those two in practical terms! In fact, I'm not sure that molecular biology should be classified as a science at all! 80% of it is just ripping things apart to see what's in them, and another 18% is fitting what you find into what's already known. Then there's about 2% that's actual "discovery" in any interesting sense of the term -- making up new mechanisms and such.

I've been with the lab, mostly part time, for nearly a year! -DE- pointed out that I had come around her previous birthday, and it's now her Birthday again!

[Note added 20030207: Another sense in which molecular biologists aren't "doing science," is that most of them don't follow anything vaguely like principles of statistically valid inference. Usually they take one observation as a result, even if the signal/noise ratio is very poor, and try to figure out what's going on. This is, in part, because biological protocols are extremely expensive and time consuming, and also in part because they don't seem to quite get the idea that one-off observations can be misleading. This is esp. the case for microarray experiments where the S/N is extremely low, and you're doing thousands of tests without paying the statistical price for that by doing any sort of multiple-testing correction.

Given this abuse of statistics, I'm amazed that biology has gotten anyplace at all. I have this image of biologists lined up against the problem of biological discovery like Civil War soldiers: line upon line of guys with bad guns, just walking toward one another; one line gets mowed down, and the next one is there, walking relentlessly towards the problem, and getting mowed down in turn. Eventually someone comes up with a solution that's right, just by chance, and that person gets a Nobel. Everyone else was just so much cannon fodder.]
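To make the "statistical price" in the note above concrete, here is a minimal sketch of two standard multiple-testing corrections, Bonferroni and Benjamini-Hochberg, written in plain Python. The ten p-values are invented for illustration and do not come from any experiment described here; the function names are hypothetical.

```python
def expected_false_positives(n_tests: int, alpha: float = 0.05) -> float:
    """Expected number of false hits if every null hypothesis is true."""
    return n_tests * alpha


def bonferroni(pvals: list[float], alpha: float = 0.05) -> list[bool]:
    """Reject only p-values at or below alpha divided by the number of tests."""
    threshold = alpha / len(pvals)
    return [p <= threshold for p in pvals]


def benjamini_hochberg(pvals: list[float], alpha: float = 0.05) -> list[bool]:
    """Reject the k smallest p-values, where k is the largest rank with p <= alpha*k/m."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            max_rank = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[i] = True
    return rejected


# Made-up p-values, for illustration only.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

print("uncorrected hits:", sum(p <= 0.05 for p in pvals))
print("Bonferroni hits: ", sum(bonferroni(pvals)))
print("BH hits:         ", sum(benjamini_hochberg(pvals)))
print("false hits expected from 5000 null tests:", expected_false_positives(5000))
```

The last line is the microarray problem in miniature: with thousands of genes tested at the usual 0.05 cutoff and no correction, a couple of hundred "results" are expected from noise alone.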

20000821 (u)

On the T in Boston -- the Red line -- traveling from South Station to Alewife (named after a fish that looks like a sardine), I "know" the places that we are traveling under as we go - diving underground after Charles, Kendall at MIT, Central - a church, I think - Lori used to live near there. And then Harvard Sq. -- Steve used to live near there -- and then out to places unknown. The feeling is very subtly different where I know what is above my head, even vaguely - at least I think I know - from where I do not know, have not been walking or driving aboveground. As a result, I know the lay of things, the direction with respect to the city that we have come, and where we are going, how long it will take to get there, whether it makes sense to take a cab or bus or walk instead of taking the T. However, from Harvard onward and out to Alewife, I do NOT know what's above my head. As a result, the feeling of taking the T in these circumstances is very slightly different. If I were driving, I'd be a little more worried not knowing where I'm going, even driving on rails so that I can't make a wrong turn. That is a little of how it is with the molecular biology protocols: I can do them, but I don't know what's going on "above ground", so to speak - inside the test tube or cells, and I can't make decisions about whether to walk or drive or heat or cool or add more buffer, or take the T.

It will be very interesting to see how my thinking about all this changes after returning from a week's vacation.

Tue, 29 Aug 2000 14:28:09 -0700 (u)

Yesterday was fully occupied by setting up a new High Light tank, which is largely a function of finding and piecing together old stuff, deciding what's usable and what's not, and patching things that leak or are broken. Very macro-scale mechanics. So today I made up for all the macro by nano-mechanics of blunting my 1176 genes, purifying them on a gel, and then setting up a ligation with the spec cassette, which will run overnight. Very nano-scale mechanics. But pretty much the same thing. Interestingly, I only had to consult a manual ONCE for the gel purification, and it worked pretty well, and setting up the ligation was easy. We'll see if it works at all! Tomorrow I'll transform into E. coli, and then plate them out, and by Thursday I might have clones...although I'm not very sanguine about this possibility, since it's failed a few times before. The difference this time is that since I couldn't figure out if it's band 4 or 5 that contains the spec cassette, I did two of each (4 reactions in all).

20000831 (u)

The transformation failed again! It is with a heavy heart that I commit these many good plates to the trash. What now? I think I'll spend a few days on things that I can make work. Computers, writing, sleep. I have occasionally worried that although I claim that what I'm becoming here is a molecular biologist, really all that I'm becoming is a lab tech. Maybe molecular biologists really do think deep thoughts, but I've been missing them because I'm focused on all this procedure. Well, I got some validation today that I'm not too far off in my assessment.

-AG- has been working for several months on a review of how plants respond to environmental stressors, such as high light and low nitrogen. It's 50 dense pages, and a bibliography of the same length! I'm not exaggerating! And, more than that, it's nearly completely a-theoretical! Or else what passes for theory in this field is the same thing that passes for a "principles of operation" manual in car mechanics. Pretty much every sentence describes a mechanism that I could program with simple qualitative rules (and may do so!).

It's scary how detailed and mechanistic it is; I should put an example in here. Just as amazing is how INCOMPLETE 50 pages of review can be! Organisms -- even "simple" ones, like bacteria -- are incredibly complex, and we haven't figured out all that much about them. The brain is complex too, and we've figured out even less about it. Maybe that's why there is more theoretical discussion in brain science, or maybe it's because the brain gives the illusion of having a high level description that no one can quite get their heads around. No one thinks that there's a high level description of cyanobacteria. All the action is in the details of the machine!

Another reason that the brain is more theoretically challenging than organisms is that it is not mechanically produced. That is, although the basic structure of the brain is obviously genetically coded, and can (should) be understood at the level of the biological mechanics of its production, the overall "computational structure" of the brain is NOT genetically coded, but, rather, is learned as an interaction between the unfolding of molecular development, and the organism's interaction with the world. We don't have good ways to talk about this, nor, more importantly, a handle on what good ways to talk about this would look like. In the case of Cyanobacteria, although we don't know how it does most of what it does, we know pretty much what form the answer will take.

Here are some quotes taken more-or-less at random from the paper:

"The rate of influx of NH4+ into Arabidopsis roots shows a diurnal rhythm with a maximum that occurs toward the end of the light period (citation)." p.10

"Arabidopsis has two genes that encode NR, Nia1 and Nia2 (citations). In Arabidopsis seedlings Nia2 appears to be responsible for approximately 90% of the NR activity, while Nia1 accounts for the remaining 10% (citation)." p. 20

"S atoms present in SO4-- esters and sulfonates differ both in the atoms to which they are bonded and their oxidation states." p. 30

"The predominant form of available P in the environment is Pi, which is incorporated into many cellular molecules and is a major component of nucleic acids, phospholipids, and intermediary metabolites." p.40

"Psr1 results only specific responses of Chlamydomonas to P starvation, and not the general responses." p. 50

And, just to emphasize how common it has to be in biology to have a whole paper full of incredibly detailed facts, I reproduce here the turgid concluding paragraph in full:

"Researchers have just scratched and dented the surface of the edifice that conceals processes critical for acquiring, assimilating, and distributing nutrients in plants. The legislative machinery that lies at the heart of this edifice and that governs a more holistic integration of nutrient availability with a diverse set of environmental cues, as well as with the growth potential and the developmental stage of the plant are still hidden within the heart of this edifice." p. 50

Block that metaphor!

9/1/2000 (u)

It's been about a year, now, since I joined -AG-'s lab at The Carnegie, and about 4 months since I've been doing molecular biology full time. I've finished re-reading and first-pass editing my notes through the year, so this seems a good time to go see whether I've made any progress, either for myself, for science, or toward the goals of taking these notes, which is to understand something about becoming a molecular biologist.

For myself, I've seen DNA, learned what it means to clone a gene, and a lot about the foundations, both chemical and computational, of modern biotechnology. I've also learned a lot about what it means to actually "do" molecular biology. I may not yet have earned my merit badge for molecular biology lab technique, but I'm getting there. Another year of this, full time, will probably get me there, esp. as I don't feel like I really started to learn until I was doing it full time. Do I think that I'll be able to stand up in front of an audience, or in a job interview and say something that I'm proud of? Not yet, but again, I feel like I'm making progress in this, as opposed to stagnating. I had hoped to learn and do more computational work, but, somewhat to my surprise, there isn't that much computing that is used in day-to-day molecular biology, aside from BLAST, which is a solved problem. And although I've had glimmerings of creative advances to make in this area, I don't have the time to work on them. Nor, I must say, do I truly have the motivation to do so. Even though my gene inactivation is a pain in the ass, I'm having fun just doing the day-to-day mostly uncreative work of lab techniques. Sort of like driving the T; Even though it's probably boring going up and down the same line every day, you get someplace interesting, and the rails keep you on track (so to speak :), and I've slowly learned what's going by aboveground -- in the test tube -- and in the case of molecular biology, it's really really cool! Or at least really really important. The fact that I'm working on issues of environmental stress in plants, esp. cyanobacteria, is something that at least I can tell people and it sounds nearly as important as it is!

Have I gained the connection that I want to plants and the environment? Somewhat. Cyanobacteria are only barely plants (I don't think that they are even classified as plants, although I don't know -- easy to find out, doesn't matter), but that's okay. They are certainly fun and important. This goes to the second issue, whether I've accomplished anything for science. The answer to that, is probably yes, although only a tiny step of discovering glycogen creation as an electron valve (to quote -AG-'s theory on this). If I ever get my genes inactivated, maybe we'll actually publish this, but at this rate, that will be a year from now! The other thing that I think will eventually be an accomplishment for science is my slowly coming together ideas on how to join metabolic, genomic, and expression data in understanding the complex activity of biological systems. Sometimes I feel like I'm blocked on this by my day-to-day lab work, but I don't have a clear enough idea on it yet to actually make it happen. The work that I'm doing with NASA, mostly being hacked by Julian, my intern, is actually making some small progress on this front, but I haven't really made much of an intellectual commitment to this yet. It's just starting to get interesting, though.

What about progress on the matter that led me to taking these notes: a contribution to the psychology of learning complex activities, esp. scientifically important ones, and specifically as regards the relationship between technique (i.e., skill) and understanding? On this, I think that I have learned a great deal, and maybe there is a little bit of a contribution to be made here to cognitive science. It's not too easy to make clear or concise, but I should start to bring these thoughts together.

Practice does several things for one's skill and understanding. First, there is indeed a sort of chunking of unit tasks, but these tasks needn't be hierarchically related to one another. I learn bits and pieces at all levels, and eventually these make contact with one another to unify and smooth out the whole skill. Until I have made significant practical progress on a particular set of skills, my attention is totally taken up by the doing of them, and I work in a fog where I have no intellectual energy left, nor enough of the pieces *conceptually* unified, to be able to start to understand what's going on. Note several things here. First, there is both a physical practice effect and a conceptual one. I don't mean that very deeply, it's just that I both have to physically and mentally "know" the structure of the activity before I think that I know it.

Also, what "knowing" it means, practically speaking, is that I know what it does, how it responds when I flex it, how to take it apart and interleave it with other procedures, and the way that it fits into other tasks, both more and less abstract. Take gel purification, which contains a number of subtasks (making the gel, loading the gel, running the gel, selecting bands, purifying the DNA), and fits into many higher level tasks (in my case I've been using it in gene inactivation, but it's used pretty much everyplace!). The shape of gel purification is this thing with three big "lumps" (making the gel, running the gel, purifying the results), and each of these has other lumps. The lumps have to be done in order, but I know how to handle them -- where I can pause, where I can't, how exact one needs to be with each. Moreover, at the same time that I was learning all of that, I was learning the physical skill of doing the unit tasks: taping the gels, pouring them, loading them, learning to "see" the bands (with UV, which turns out to be a whole interesting cognitive story in itself!), cutting out the bands, doing the purification procedure. Grossly speaking, there are a What, How, Why, Where, and When of tasks, which I'll collectively call the 4.5Ws (4 Ws and an H).

I don't know if you can learn the 4.5Ws separately -- certainly we normally learn them together -- but I think that they are mostly separate things. I "knew" them at different points during my practice of the procedure. I usually knew How before I knew When, and all that before Why. Although bits and pieces of the 4.5Ws came in at various times at many levels. I don't think that there is a general theory that puts one before the other, although there are practical issues at play, for example that you can't learn Why you do something before you learn a very general version of What it is, although you can learn How before Why, or Why before How, or How before What, and most other combinations. Also, given that there are so many levels of analysis at play in a complex protocol, learning of the 4.5Ws of all the levels overlap one another. If I wrote down everything I've learned about gel purification, it'd probably be on the order of a thousand facts (including skill units) that would be all interconnected, but I don't think that there would be any grand organizing theory about which came first, the What or the How, etc. It just doesn't seem like there was any consistency to this, separate from the shape of the activity itself. So the order may not have any psychologically fundamental principles, but certainly the phenomenon of these being learned and coming together does. What is it? Dunno, but I know that I'm babbling now and it's time to go to the lab, so I'll continue this at another time.

20000902 (u)

Last night our lab had a pot luck party for two members who are returning to Japan, -HD- and -AK-.

-AK-'s bench was directly behind mine, so she was the easiest person for me to ask questions of, and she was also in many ways the most useful. We were doing similar work, and she wasn't as busy as -LZ-. Also, she was always completely happy to answer any questions, and to give me advice or entire protocols. Generally speaking, anyone is happy to do these things, but, just as in life, some people seem more approachable than others, and -AK- was by far the most approachable person in the lab, at least from my point of view. She was also very organized. More importantly, I think, she and I were both in a sense beginners. It seemed to me that she had a lot more lab experience than I did -- she had worked in -HD-'s lab for three years, I think -- but we were both doing these particular protocols for the first time, so although I had to also learn a lot of just physical stuff that she already knew, we were both in the same boat as far as the details of molecular protocols. So I will specifically miss -AK-!

Our lab likes to drink wine. Or, more precisely, -AG-, the PI, likes to bring wine, and presumably drink it, so most of us drank as well. I'm an alcohol lightweight, so after a couple of small glasses I was out of it. We were all telling lab horror stories, and -AG- was saying that most of what we work with isn't actually very dangerous. We talked a little science, but not much. It's funny how hard it is to talk molecular biology in pleasant company. You need drawings pretty quickly, or at least paper to keep track of everything, and it's not like you can make it very interesting to anyone but another molecular biologist in the same field. I mostly talked about the recent results that our team working on the NASA environmental data has had, and about the model that Julian and I are writing to do discovery in metabolic models.

Continuation of my thought on learning, from a couple of days ago:

Now, what is most interesting, I think, is not the skill itself, but how it relates to what the protocol is doing. (I'm not sure whether this is What or Why, but anyway, those were just gross categories.) I've used the T metaphor to talk about this: What's going on aboveground as I'm riding (or driving) the rails through The Tube. In molecular biology, it's largely gene manipulation, carried out by fairly simple (from my point of view) chemical reactions, mostly involving nano-machines -- enzymes. On the one hand this is very simple to understand, and on two others, very complex. It's very simple to say that a particular restriction enzyme cuts double stranded DNA at a particular point. It's complex to know the details of where it cuts and to compare that to your particular string of DNA, since there are hundreds of restriction enzymes, and the strings of DNA are thousands (sometimes hundreds of thousands) of base-pairs long. This is clearly a job for computers, and there are tens of freeware and webware programs that will, for example, tell you what enzymes to use to cut your DNA. The other thing that is hard, and this is in my experience the most difficult thing of all, is to keep track of what-all is going on when you do this. Here's where it's not like car mechanics! In car mechanics, the car has a few parts, and the parts are different enough from one another that you don't get them confused. Moreover, the car stays put (usually) when you stop working on it, and you can look it over, choose your next tool, unbolt something, set it aside, take a break for a week, and the state of the world as you left it tells you where you were. In molecular biology, there are a million (more!) almost-the-same pieces of material floating around invisibly. If you lose track of what you were doing, you can't just look at the state of things and tell where you were and what you have to do next. 
Even if you could SEE DNA, there'd be too much of it in too many possible configurations to be able to figure out by looking at it where you were. So there's a huge cognitive load imposed upon the experimentalist to keep track of what's going on; what state things are left in, and where things are going. Moreover, unlike the car, DNA (and esp. living organisms) will take off and do whatever they do if you just leave them to themselves -- hybridize or degenerate or overgrow or die. So not only do you have to know what is going on, you have to be able to make decisions on a schedule. Lab notebooks help a lot in keeping things straight, and being able to freeze away things helps a lot with this, but you can't freeze, for example, organisms, and when unfrozen, reactions will continue in perhaps undesired ways, so lots of things are done on ice, and still more things just have to be done in a timely way.
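To make the "job for computers" part concrete, here is a minimal sketch in Python of the kind of lookup those restriction-site programs do. It is an illustration only, not any particular tool: the toy sequence is made up, only three enzymes are listed, and it treats the DNA as a linear top strand (a real program would also handle circular plasmids and far larger enzyme tables).

```python
# Recognition sequences and top-strand cut offsets for three common enzymes.
ENZYMES = {
    "EcoRI": ("GAATTC", 1),    # G^AATTC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "SmaI": ("CCCGGG", 3),     # CCC^GGG (blunt cutter)
}

def find_cut_sites(seq, enzymes=ENZYMES):
    """Return {enzyme: [0-based cut positions]} for a linear sequence."""
    seq = seq.upper()
    sites = {}
    for name, (site, offset) in enzymes.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start + offset)
            start = seq.find(site, start + 1)
        sites[name] = positions
    return sites

def fragment_lengths(seq_len, cuts):
    """Fragment sizes after cutting a linear molecule at the given positions."""
    edges = [0] + sorted(cuts) + [seq_len]
    return [b - a for a, b in zip(edges, edges[1:])]

# Made-up toy sequence, for illustration only.
toy = "TTGAATTCAAACCCGGGTTTAAGCTTGG"
sites = find_cut_sites(toy)
all_cuts = [pos for cuts in sites.values() for pos in cuts]
print(sites)
print(fragment_lengths(len(toy), all_cuts))
```

Finding the sites is the easy part. Keeping track of which fragment is in which tube is the part the computer can't do for you.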

I've gone back and forth between car and train metaphors. I'm not sure what they are really worth, but I'll use another, just because it seems apropos. Instead of driving the T, or a train, imagine trying to drive through a thick fog in an unknown city, at or near rush hour, and get someplace on time that you've never been before. Until you've done it a hundred times -- until you've lived there! -- you just don't know the physical layout of the city, and the temporal structure of rush hour. And since you're in a fog, you can't look around you and see what's coming; where the curves are, when the traffic is stopped, and for how long. Imagine the frustration of doing this the first few times, until you know some work-arounds, and know what's up ahead, and moreover, know where to expect traffic at what times. Now imagine that you not only need to know all that, but your task is actually to use your knowledge of that to manage the street lights for the city! That's more like the job of the molecular biologist, to thread long strings of cars (DNA) through the city efficiently, using gross manipulations like changing the traffic lights. Imagine someone who does not know the city well trying to do that job for the first time -- in a thick fog!, or without the aid of cameras or other mechanisms of telling the state of things other than, every once in a while being able to call some of your friends who are in cars in various parts of the city and ask them if they are in traffic, and what kind of cars are around them, and occasionally, with great effort, being able to ask for a very high level traffic report (running a gel). Okay, it's not the best analogy, but it does capture some of the intensity and difficulty of doing this sort of work.

It is these sorts of Knowing: Knowing what the protocols do, Knowing how to put them together to do larger tasks, Knowing how to make the whole thing work together smoothly, Knowing where you can start and stop and pause things, Knowing how to debug problems, Knowing what's going on all the time in all of your hundred variously labeled tubes in six different freezers at six different temperatures, Knowing what the sequences are of all your genes and how to manipulate them and reason about them, and Knowing in the end what the results mean in the larger conceptual space of biological science. It's these sorts of Knowing that make this an interesting and difficult thing to do.

Tuesday, Sept. 5th (u)

The mind is weak, but the body is willing... Or is it: The body is weak, but the mind is willing? Anyway, at the moment both are weak and unwilling! I have to set up cells for a huge high light run that -AG- wants us to do to collect RNA for chip work. This is a HUGE protocol, involving about 50 culture tubes, and tons of time, temperature, and cleanliness sensitive sub-procedures, most of which I haven't done before. So I'm avoiding starting into it by writing notes, and whatever else I can figure out to occupy myself with. Oh, actually, I just remembered that we can't start until I have the surgical tubing from Cole-Parmer, which is on order, but which hasn't arrived. Oh, goody. I guess, though, that I should sterilize the various gear, get new light bulbs, and whatever else I can think to do in the meantime.

One of the main reasons that I don't want to do this is that I'm sure it's going to be a disastrous waste of time and resources. I think this because no one can agree on the protocol. There are four of us working on it: -T-, -QF-, -LZ-, myself, and -AG- is kibitzing. -LZ- is out for the next couple of months having babies. -T-, -QF- and -AG- can't agree on anything in detail. And I'm trying to put together the "master" protocol, but no one's going to follow it, except me since no one can agree on enough details to even write them down. I go round robin between -T-, -QF-, and -AG- on every little detail and end up having to interpolate anyhow. Oh, and I've got -AK-'s protocol as well, which I'm personally well disposed toward because it's at least neatly written down. Actually, her protocol looks like a cross between -T-'s and -QF-'s, so that speaks well for it ... or maybe not; maybe it's just a mishmash.

What's at issue here is how to get RNA out of the cells without (a) losing your RNA to RNases, which are apparently both ubiquitous and very fast acting, and (b) contaminating your RNA with DNA, which is more easily extracted and isn't destroyed by RNases, which decompose RNA. First of all, in order to get enough RNA to run enough chips to have statistically significant results, you need about 150ml of cells, which means running three culture tubes of 50ml each, and we're doing it at five time points while they are treated with high light, and then you have to duplicate the whole thing for wild type control. That's 3x5x2 = 30 tubes! Next, you have to bake all of the glassware and all the implements and autoclave all the reagents because RNases are everywhere, esp. on your hands, so you can't touch anything. Also, you have to quick-freeze the collected products so that the cell's own RNases don't mess up your yield. Then, when you do the nucleic acid extraction, you need to use special procedures to clean out the DNA. Here's where the major disagreement was. -T- uses Sodium Acetate to precipitate both the RNA and DNA, and then applies DNases after-the-fact to delete the DNA. -QF-, on the other hand, relies more on using Lithium Chloride instead of Sodium Acetate, which is supposed to differentially precipitate only single-stranded nucleic acids, thus leaving DNA in liquid. -AG- doesn't think that this works well, and -AK- uses both.


Wed. Sept. 6th (u)

I resolved a couple of important problems with my inactivation protocol today. Well, not actually the protocol itself, which remains unchanged, but some possible reasons that it didn't work the last time (and the time before that and the time before that). You'll recall that I've been having a significant amount of trouble getting my Spec cassette, because I couldn't read from the gel where it was supposed to be, so I was doing things like trying all the possible bands! Well, I finally got a copy of the spec cassette from -DB-, and was going to use it, but instead it occurred to me that I could use her cassette as a ruler to tell which of my bands was the right one. (Give a man a spec cassette that he uses, and he'll inactivate for the day; give a man a spec cassette that he uses to learn how to make his own, and he'll inactivate forever! -- or something like that...) So, I did, and I managed, as a result, to get a bunch of what I think is clean spec cassette out of one of my old double digestions. And, as it turns out, I had guessed wrong in the last transformation run, so it couldn't have worked!

There was another problem with the latest run, which I realized as I was poring over my lab notebook, which is quite tangled in recent days. I think that I forgot to linearize the 1176 gene clone with HindIII before doing the blunting operation, which comes just before the blunt end ligation (to the spec cassette), which comes just before the transformation. Blunting the un-linearized gene is useless, as is ligating and transforming with it. So I linearized some of my old cloned gene today, and then blunted it (all in one day!), and purified that. All of that worked well. Then I blunted the spec cassette. The gel for this didn't look so good, but I extracted it anyhow, and purified it, and I'm concentrating it overnight. Tomorrow morning I'll put up the ligation. (Maybe I'll do the blunting again, since today's didn't look so good; I don't think I used enough DNA.) And then, on Friday I'll try the transformation again.

Friday, Sept. 8th (u)

I saw Lisa Moore around Shauna's lab, which is next door to ours. We traded that "I know you, but have nothing really to say to you", half-smile that people who know one another but have nothing really to say to one another trade. I'd like to tell her that she started me down this long strange trip. I will eventually run into her at a party or something and will let her know. I don't know whether to thank her yet. We'll see.

Fri, 8 Sep 2000 (u)

Cyanobacteria are soooooo weird! One of the lab members, -QF-, gave a lab meeting this morning. He's working on the same cyanobacterium that I'm working on, as is about half of our lab: Synechocystis PCC 6803. At the start of the talk, -QF- gave a number of reasons that PCC 6803 was a good organism to work with. These amount to: 1. It grows fast (doubles in about 8 hours), but not so fast that it gets out of control; 2. It grows on either light or glucose; 3. It has a fully developed two-part photosynthetic system, just like higher plants; 4. It has been completely sequenced and annotated, and is easily web accessible; 5. It is readily transformable through double homologous recombination, enabling us to easily edit its genome; 6. Pretty nearly every required technique has been developed for it. He left out: 7. Cyanobacteria are probably the most abundant group of organisms on earth, and certainly the most abundant oxygen producing organisms on earth, and so among the most important to the health of the earth, aside from people (who are important in the negative sense!).

Later on I was talking to him about what a weird and wonderful organism Cyanobacteria is ... it should be "are" since there are over 500 species, and, taken together, they represent a huge range of experiments in life. Consider just the fact that 6803 can photosynthesize and also has a nearly complete glycolytic pathway, enabling it to live in either the light or the dark, if there's a glucose source.

-QF- has only really worked with 6803, but 7942 is a common lab strain. If there are 500 species, it would only take 500 labs to cover them all. There are Cyano meetings, and according to -QF- they used to be well attended, maybe a thousand or more people, but lately interest has shifted toward eukaryotes, esp. Chlamydomonas -- essentially photosynthetic yeast -- because they are better models for what people perceive to be the more important eukaryotes.

But cyanobacteria remains the "hobby" organism of choice for plant hackers. It's fascinating and fun, and not dangerous, and like clay, you can mold it easily into many different forms, each of which has interesting different properties. It's sort of the E. Coli of the plant kingdom -- the gene hacker's photosynthetic system. Seems like the right place to be, to me.

Sat, 9 Sep 2000 (u)

God damn it! I just cannot get this inactivation to work! This is the fourth time through, and nothing!

This time through the ligation and transformation I did something vaguely clever (which the protocol has said to do all along, but which I've been ignoring all along), which is to use a plate that is only selective for Amp resistance to figure out the transformation efficiency. That is, since I'm ligating a linearized Amp resistant plasmid with a Spec resistant gene, the resulting plasmid should have both Amp and Spec resistance. So, to find those I'm growing them on a plate that has both Amp and Spec. But I can also grow them on just an Amp plate to see how often I'm getting the plasmid into the host E. Coli, regardless of whether it happens to have a spec resistance in it. So, I did that this time through, and it's very low; only about four colonies per plate! That's way too low to expect the much lower probability event of getting the Spec resistance to ligate in to happen.
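The "way too low" intuition can be put in numbers with a back-of-the-envelope sketch in Python. This is only an illustration under loud assumptions: that each transformed colony independently carries the Spec insert with some probability, that the Amp+Spec plate receives about as many transformants as the Amp-only plate, and that the insert fractions tried below are hypothetical (nothing in my notebook measures them).

```python
def chance_of_at_least_one(n_colonies, p_insert):
    """Probability that at least one of n independent colonies carries the insert."""
    return 1.0 - (1.0 - p_insert) ** n_colonies

# About four colonies on the Amp-only plate (from the entry above).
# The insert fractions are hypothetical, chosen only to show the trend.
N_AMP_COLONIES = 4
for p in (0.01, 0.05, 0.20):
    chance = chance_of_at_least_one(N_AMP_COLONIES, p)
    print(f"p_insert={p:.2f}: P(at least one Amp+Spec colony) = {chance:.3f}")
```

With only a handful of transformants per plate, the insert would have to be in a large fraction of the plasmids before you'd expect to see even one colony on the double-selective plate.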

What's wrong? Well, for one, I know of one slight mistake I made with the ligation, which is that I blunted the spec cassette, but the spec cassette was already blunt, so that was a waste of time and DNA and has the small possibility of going wrong, so I can redo the ligation with raw spec gene. I might set that up tonight. But why is the efficiency so low regardless of the spec gene? I let the ligation go for a LONG time -- several days -- so unless there was something really wrong with it, I should have had a pretty high concentration of plasmids of one sort or another. I'm doing the transformation protocol just as it's written, and when I've done it before for gene cloning, it hasn't been a problem. What What What is Wrong Wrong Wrong?

I'll set up the new ligation with the raw spec gene, and use a higher concentration of spec to ensure that there's a high likelihood that I get the cassette into the resulting ligation product, then I'll try it again, tomorrow. But I know already that it's not going to work. Am I pessimistic or what?

-LM-, -KO-, -AO- and I went to Costco. I was just being social. The place makes me insane. I felt I needed to buy something, so I made myself and the guy selling Monterey pasta happy by buying some nice looking (locally made!) fresh pasta with pesto. Completely unlike anything else in that place! We ran into -QF-. I'm not sure that I've ever seen anyone from the lab on the outside. -DE- invited me to the Oasis one night, but I couldn't go. (I would have, if I could have!) I see -CX- every once in a while in downtown PA. She must live there, since I'm not there too often.

-CX- left the lab a little while after I got there. So did -GX-. Lots of biologists go to industry. It's a *lot* more money, and possibly less work, although -CX- went into some kind of marketing thing with a company that makes biotech kits. I think that -GX- went into actual research.


Fun with liquid nitrogen. I'm running the time series from hell. There are thirty 50ml cultures of two types under high light treatment. At specified intervals of 30, 60, 120, and 360 minutes (oh, and of course at the beginning: 0 min., too!) I collect the cells from six of them, quick cool them, spin them down, and then freeze them. All of this has to be done as fast as possible because something evil happens to the cells if you don't do it fast ... the RNases get to the RNA, or some such, and it's the RNA that we're after, so we want to stop the reactions that destroy it. It's not clear to me that doing this so fast matters all that much. After all, they've been in the high light chamber for a long time; will a few more minutes make much difference?

The first couple in the series are hard because the spinning and cooling process takes about 15 minutes, and then a few minutes of cleanup, and then you're on to the next one. But now I'm on the hour-long wait between 60 and 120, so I thought I'd write some notes. (It's about 10:30pm. A quick calculation will tell you that the last one will happen at about 3am!)

I'm using liquid nitrogen (aka LN2, or N2(l) -- liquid N2) to quick cool the cells that I collect. This is the most fun you can have (legally) in a lab, because while you're waiting you can play liquid nitrogen games, like freezing various things and then breaking them. The most fun thing that I froze so far was a sponge, which obligingly smashed into a million pieces that I'll never find. Unfortunately, I didn't have the presence of mind to do this outside, so I have to make an effort to clean it up. Another cool game to play with liquid nitrogen is to stick your fingers in it and pull them out fast, so that they get cold, but not too cold, and they're completely dry when you take them out. (Another thing to be careful about with gloves on is playing in liquid nitrogen. On the one hand (so to speak!) the gloves insulate you a little from the cold, but if your gloves aren't wrinkle-free, the cold liquid can get into the wrinkles and give you a little cold burn.) Frozen by Science!

Another liquid nitrogen game is to pour the remaining liquid nitrogen into the bucket of ice I've been using. This makes an *amazing* amount of fog. Unfortunately, I once again didn't have the sense to do this outside, and the lab started filling with fog. I was afraid that it was going to set off a smoke alarm, so I grabbed it and ran outside to dump it.

When we're done with this high light run we'll extract the RNA from the cells and then run DNA chips for each time point. Then we'll look at how the expression levels of various genes change with high light exposure in the various mutants that we have.

Sep. 18, 2000 (u)

Dancing with The Master.

Yesterday -AG- and I did production RNA extraction. Well, mostly -AG- did production RNA extraction, and I watched the first 3/4 of all of it, helped half of the last 1/4, and did the last 1/8 myself.

We started with frozen cyanobacteria, and ended up with frozen RNA (and DNA, as it turns out, but it'll be pure RNA soon enough). The only difference between RNA extraction and DNA extraction is that you have to add DNase to the results when you're done, and you have to be very clean and very fast. It turns out that RNases are ubiquitous -- in your sweat, in your saliva, in the cells themselves. If you touch anything, or fail to sterilize anything, the left over RNases will eat your RNA. When I collected the cells I did it really fast, and as cleanly as possible, getting them into the -80 freezer as quickly and cleanly as possible. In the reverse direction, we need to get them into the first phenol extraction phase of the production line as quickly as possible, so that their own RNases don't get them.

First we set everything up; 5 time points x 2 species (WT and 0426) x 6 tubes (beater, phenol:chcl3, phenol:chcl3, chcl3, and final repository with NaOAc) x 3 copies of 150ml each incoming culture = 180 tubes. We have to label each and every tube! Molecular biologists know very explicitly about the importance of keeping track of things, since they do large productions like this a lot. We always use two means of track-keeping: labeling and spatial position reference: All of the tubes go into 6 racks in spatial order of their use.
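The double bookkeeping (a label on every tube plus a fixed rack position) is easy to sketch in Python. What follows is only an illustration, not anything we used in the lab: the label format, the generic step numbers (the entry names the tubes only loosely), and the one-rack-per-step layout are all assumptions.

```python
from itertools import product

TIME_POINTS_MIN = [0, 30, 60, 120, 360]  # the five collection times of the time series
STRAINS = ["WT", "0426"]
REPLICATES = [1, 2, 3]                   # three copies of each culture
STEPS = [1, 2, 3, 4, 5, 6]               # six tubes per sample; names left generic

def make_labels():
    """Return (rack, slot, label) triples; one rack per step is an assumption."""
    labels = []
    for step in STEPS:
        samples = product(STRAINS, TIME_POINTS_MIN, REPLICATES)
        for slot, (strain, minutes, rep) in enumerate(samples, start=1):
            labels.append((step, slot, f"{strain}-t{minutes}-r{rep}-s{step}"))
    return labels

labels = make_labels()
print(len(labels))   # 5 time points x 2 strains x 3 copies x 6 steps
print(labels[0])
```

Every sample then sits in the same slot of every rack, so a tube whose label has smeared can still be identified by where it is.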

Next we load them all up with their respective dangerous chemicals (esp. the ones with phenol in them), and then we go at it: Get the frozen sample, add phenol, mix, put into beater tube, beat three times for 30sec each with ice cooling between beatings, then do the next one. After this first phenol treatment, the RNases in the cell are inactivated, so although we don't have to worry so much about speed any longer, we do still have to always touch everything only with gloves on!

So, even though I could probably have done this all myself, -AG- walked me through it, and, as I said above, did most of the operation himself. But in all we only ended up going through 1/3 of the samples, so I get to do it all again twice over, probably by myself, this time! I'm looking forward to it!

There are a lot of interesting things that I could say about this whole event, but, ya know ... I'm just too damned tired right now!

Wed, 20 Sep 2000 (u)

A simple example of new technique discovery, akin to the Agre and Shrager copier study. How to get a chunk out of an agar plate:

I'm trying to culture spec cassette vector. I've an agar plate with E. Coli that has a spec resistance gene in it, and a bunch of tubes of LB (bad smelling E. Coli growth broth). I need to get a small piece of E. Coli off the plate and into each tube. You can just scrape some off the surface with a toothpick, but usually you dig a chunk of the agar out.

Old technique: First, heat up a wire loop -- a 5 inch wire that has a loop at the end -- in the burner. Second, dip the loop into the agar to cool it. (Something I keep forgetting to do!) Third, use the loop to "saw" a square chunk of agar out of the plate. This requires putting the loop into, and pulling it out of the plate four times, once on each side. Fourth, worry the chunk out with the wire loop. This last step is problematic because, unless you have sawed it completely out, which is hard to do with a loop, it sticks to the rest of the plate, and then when you try to pull out the chunk, it won't come out cleanly, so you use more force, until ... SPROING! it comes flying out and shoots across the room, or worse, contaminates something else on the plate or in the hood.

New technique: First (heat) and second (cool) are the same. Here's the new part: Third, plunge the loop into the plate next to the part you want to extract, then, instead of sawing out a block, just sort of move the handle in a circle in a certain way that slightly bends the wire. With a little practice this cuts an inverted cone (tip down) *cleanly* out of the plate. Then all you need to do is get the loop under the tip of the cone and lift it out.

Why didn't I think of this before? More interestingly, why *did* I think of it just now? Who knows. I wish I'd been watching as I "discovered" this technique, although I'm not sure that I would have really noticed anything. Anyhow, this has made my life significantly more pleasant, and my technique significantly more efficient.

Fri, 22 Sep 2000 (u)

I've gone native!

I realized, over the last few weeks, that I've completely gone native. I think (dream, sometimes!) of DNA, cellular pathways, and lab procedures. This is coincident with my starting to stay later at the lab, and wanting to come in earlier, like a graduate student. Moreover, I'm becoming less interested in the computational aspects. Instead, I'm having more fun growing things and working with their molecules.

It's definitely not that I'm getting things to work better. If anything, they're worse than ever. This might in part be because I'm taking more liberties with the protocols as I come to understand them. Eventually I'll learn where I can and can't do this, but at the moment it's a process of exploration. However, I can run pretty nearly every common thing in DNA manipulation without having to consult the manuals, and I'm not afraid of them any longer. And generally, though not always, they work pretty well.

For example, yesterday I ran an RNA gel, which is nearly the same as a DNA gel, except that you have to add formaldehyde -- which is really poisonous -- to the broth, and make up a bunch of new media. Also, you have to add Ethidium Bromide directly to each gel's sample loading buffer. So you're working with a lot more dangerous stuff than normal. No sweat. I was able to use -T-'s scribbled protocol, and, although it actually didn't work (!) I think that it was that the RNA prep didn't work, not the gel! Or (-AG- and -T- both theorize) I didn't let it run long enough. The point is, though, that I wasn't afraid of it. Well, at least not after a couple of minutes.

The other interesting thing that happened recently that made me realize that I'd nearly gone native as a molecular biologist is that I'm less interested in computer opportunities that arise nearly daily; at least, ones outside of my lab needs. Specifically, the BioLisp.org project, and the NASA metabolism projects. These are pretty interesting computationally; one at the AI level and one at the systems development level, but I'm really just not very into them. I'd rather hack DNA. It's a lot harder, but also a lot more fun.

I have this "working with my hands" feeling in the lab, that I don't get with a computer. Maybe I never have had that feeling at the computer, but maybe that's not a feeling that you get from computers; they're more working with my mind. If that! Actually, mostly computer programming is working with pretty much nothing to produce pretty much nothing interesting. If you're lucky, you get to make the program do something really interesting, but usually it's just grunt work; rearranging deck chairs on the Titanic.

Also, the computational aspects of biology aren't very interesting from either a computational or an AI standpoint. There are things that could be interesting in that way -- my idea about a book on the Views (as in View Application) that are used in molecular biology. Maybe I should get back onto that.

Also, this Stanford CHI course is not holding my interest very well. In fact, I'm surprised that it's holding my interest at all, considering that I think that CHI is mostly bullshit. I guess that the $10k is worthwhile, and if I can make it a little bit deep and a little bit interesting, it'll be worthwhile. I'll try to get a review paper out of the effort.

Sun, 24 Sep 2000 (u)

Damn damn damn damn damn. Well, my spec digestion worked better this time, but the gel is in some new uninterpretable state. This time I have reasonable bands where the spec cassette should be, and a common one that I get lower than that, but no others -- actually, I also got one very light (color) light (weight) one that I expected. But I'm supposed to get several bands of higher MW, and I got them before nicely (cf. 0817)! So now I don't know what could have happened! Did the digestion work so well this time, and actually fail last time? (The high MW bands, which are missing this time, are due to various versions of incomplete digestion.) I'm going to go with that interpretation.

This time I ran the digestion for 1hr at RT and one at 37C, since SmaI seems to want to run at RT. Maybe before, since I ran it at 37C, the SmaI hadn't worked well. This is all plausible; let's run the numbers and find out. Meanwhile, I'm rerunning the RNA gel, since it completely failed on Thursday.

Footnote: So, I did the calculations, and it looks like I actually got complete digestion, which is *great*! (See n0922)

Mon, 25 Sep 2000 (u)

I have a post-it above my bench that tells me what I'm running right now. At the moment it says: "ligations", "linearizations", "blunting", "xformations", "text -> finkle", a couple of other things that are intelligible, and then a bunch of nearly unintelligible scribbles.

Learning to "see" gels isn't very hard. In fact, seeing them, or reading them off, is quite easy, depending upon how clean your DNA is, and how cleverly you've set up your gels. Among the most important issues is how you're going to figure out which band or bands are the ones you want. There are two general ways to do this. First, you use a "ladder", which tells you the sizes of the bands -- approximately. Or, you can use a standard marker of some sort. For example, when I get spec cassette out of a double digestion of the pHP45 vector, I used to use a ladder, but it was confusing because the sizes of the vector and the cassette were close to one another, and the ladder doesn't distinguish very well between them. -DB- suggested that I use a double digestion to pull the sizes apart. This pulls apart the two I'm interested in, but I end up with two that straddle a ladder marker band, so I have to be precise about figuring out which of the two I want, and you really can't be very precise on a gel because things usually don't really flow at perfectly even rates in every track. So I used a standard that -DB- gave me -- one of her spec cassettes that she believed really worked -- to sort out the band that I wanted from the rest of the noise on the gel. There's no issue of whether this is really DNA or not; of course it's DNA. The issue is just which of several possible chunks of DNA that I could have, do I actually have. The apparent simplicity of this is complicated by many factors, such as how complete the reaction was that separated the pieces, how long and at what voltage you ran your gel, and then various practical problems, such as leaking wells, alcohol (or whatever it was) in your sample that made it float, and so forth.
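Reading a size off a ladder amounts to interpolation, and a short Python sketch shows why it is only approximate. This is an illustration, not a lab tool: it assumes that migration distance is roughly linear in the logarithm of fragment size between neighboring marker bands, and the ladder readings below are hypothetical numbers, not measurements from my gels.

```python
import math

def estimate_size_bp(distance_mm, ladder):
    """Estimate fragment size by interpolating log10(size) against migration distance.

    ladder: list of (distance_mm, size_bp) pairs read off the marker lane.
    """
    pts = sorted(ladder)  # by distance; size shrinks as distance grows
    for (d1, s1), (d2, s2) in zip(pts, pts[1:]):
        if d1 <= distance_mm <= d2:
            frac = (distance_mm - d1) / (d2 - d1)
            log_size = math.log10(s1) + frac * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("band lies outside the ladder range")

# Hypothetical ladder readings (distance from the well in mm, size in bp).
ladder = [(10.0, 4000), (20.0, 2000), (30.0, 1000), (40.0, 500)]
print(round(estimate_size_bp(25.0, ladder)))
```

Because of the log scale, a millimeter of uneven running between lanes shifts the estimate by a few percent, which is why two bands of nearly the same size are easier to tell apart with a known standard run next to them than with the ladder alone.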

In the end, you end up with the right DNA in your tube, ... or not. Only sequencing, or a successful ligation in the end will tell you if you got it!

Wed, 27 Sep 2000 (u)

Today I extracted DNA and RNA from light-treated Cyanobacteria, transformed E. Coli with the ligation product of a spec cassette and the cloned, linearized, and blunted SLR1176 gene, worked with -ZD- and -CWC- on the shotgun assembly software for Chlamydomonas, and with -T- on the cDNA microarray analysis for the environmentally stressed Cyanobacteria, and talked with -DB- and some weird gene ontology person about where to place photosynthesis in the gene ontology database. Oh, and spent some time w/-DE- on searching the Chlamy database for her PSBS genes.

So far I haven't dropped or spilled anything, messed up any protocols, accidentally deleted the wrong file or closed the wrong window. But the day is young; it's only 5pm!

Thu. 28 Sep 2000 (u)

I'm doing the success dance!!!! The damned blunt ended ligation and subsequent transformation worked! I got clones on the selective plate! What did I do differently? Reduced the reaction volume and increased the enzyme concentration four-fold, on recommendation of one of the part time techs. Increased the concentration of spec cassette v. vector? Froze the thing for a couple of days after running it overnight. Who knows. I was sure that this wasn't going to work. I only ran one plate; not because I was sure it would work, but because I was sure it wouldn't, and I didn't want to waste plates. But it worked! I even managed to culture 8 of the resulting colonies in selective liquid medium -- Every one that I tried! Now, on to extracting the plasmids, sequencing them to make sure that they contain the insert, and then transforming them into fresh cyanobacteria!

Let's see, when did I start into this? How many months have I been after this damned inactivation?..... Looks like it was Jul 14th that I first tried; that's TWO AND A HALF MONTHS!

Thu. 5 Oct 2000 (u)

Yesterday was the first day in I don't know how long that I actually programmed for an entire day! Actually, although I programmed all day, I extracted DNA after hours, so it wasn't a complete loss! :) I was working on our experiments with phred and phrap, the shotgun gene assembly tools, along with -CWC- and -ZD-. I'm sorry to say that it's actually more or less a complete loss working with these folks; not that they aren't smart, good, and earnest people, but this turns out largely, at least at this point, to be a computational problem, and I'm the only one really competent to handle it. Also, it's a somewhat HARD computational problem, involving large amounts of data and lots of interaction across several computers running several different operating systems.

It was actually quite a relief to do some good-ole-fashioned DNA (actually RNA) extraction -- phenol and all!

Fri. 6 Oct 2000 (u)

I learned something important today. I amplified my interrupted genes with the uninterrupted genes' primers, which should have worked fine, but the amplification didn't work very well. I had used a protocol that I had previously used to clone the genes out of the Cyanobacteria genome, which asked for 500ng of DNA. -T- said that that's way too much, and that I should use only 10-30ng DNA. Why? Because the E. coli plasmid, which is where the interrupted gene lives these days, is a lot smaller than Cyano genomic DNA, so the number of complete units (circles) per ng will be a lot MORE. Why would more be bad? Because I'm using a fixed amount of primer, and in order for the PCR to work, the primers have to stick to the SAME unit (circle) of DNA. But with so many units around, the likelihood of getting both primers on the same unit (circle) is reduced. Hmmmm. Okay, that actually makes some sense. In fact, I've had the idea for some time that the size of the unit (circle) makes a lot of difference in various places, but hadn't, until now, run into an actual case where it made a practical difference.
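The "more units per ng" part of -T-'s argument can be checked with a back-of-the-envelope calculation. The sketch below is only an illustration, not part of any lab protocol: it assumes the common approximation of about 650 g/mol per base pair of double-stranded DNA, and the plasmid size (about 5kb) and genome size (about 3.6 million base pairs) are stand-in values chosen for the example.

```python
AVOGADRO = 6.022e23           # molecules per mole
GRAMS_PER_MOL_PER_BP = 650.0  # assumed average mass of one double-stranded base pair

def copies_per_ng(length_bp: float, mass_ng: float = 1.0) -> float:
    """Approximate number of complete DNA molecules in mass_ng nanograms."""
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * GRAMS_PER_MOL_PER_BP)
    return moles * AVOGADRO

# Illustrative sizes only (assumptions, not measured values).
PLASMID_BP = 5_000       # small plasmid carrying the interrupted gene
GENOME_BP = 3_600_000    # stand-in size for a cyanobacterial genome

plasmid_copies = copies_per_ng(PLASMID_BP, mass_ng=500)
genome_copies = copies_per_ng(GENOME_BP, mass_ng=500)
ratio = plasmid_copies / genome_copies   # equals GENOME_BP / PLASMID_BP

print(f"500 ng plasmid: {plasmid_copies:.2e} circles")
print(f"500 ng genome:  {genome_copies:.2e} circles")
print(f"plasmid/genome copy ratio at equal mass: {ratio:.0f}")
```

Under these assumed sizes, the same mass of plasmid contains several hundred times as many circles as genomic DNA, which is at least consistent with the advice to cut the template to 10-30ng.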

Tue. Oct 10, 2000 (u)

I introduce myself now as "senior bioinformatics engineer" to some people, as "gene chip project lead" to others, as a molecular biologist to a room full of hackers, and as a computer scientist to a room full of biologists.

I've been trying for days to confirm that my inserts are actually inactivated 1176 genes, by PCR amplifying the gene out of the vector, and then gel verifying that it's the right length = 2kb for the spec cassette and 1kb for the gene, total = 3kb, but my PCR hasn't been working for shit. Do I know why? No. -T- told me that it was way too much DNA, and Dennis last night told me that I have to vary the PCR parameters because the vector ring will snap back together and knock the polymerase off the gene. Reducing the amount of DNA didn't help, at least not visibly, and I don't know how to change the parameters, so I'm going to try another thing that Dennis suggested, which is to linearize the vector and then do the amplification. Actually, all I need to do is to linearize the vector and then measure it. Since the vector is ~2kb long, the whole size should be 2+2+1, or 5kb. But I'll probably try to amplify the gene again as well, just to be sure that it's there!
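The size bookkeeping in this entry can be written down as a tiny sketch. The three lengths are the approximate figures quoted above; the "no insert" case (a plasmid carrying the uninterrupted gene) is an added assumption for comparison, and the function name is made up for illustration.

```python
# Approximate sizes in kb, as quoted in the entry above.
VECTOR_KB = 2.0
SPEC_CASSETTE_KB = 2.0
GENE_KB = 1.0

def expected_bands(has_insert: bool) -> dict:
    """Expected gel band sizes (kb) for the PCR product and the linearized plasmid."""
    gene_region = GENE_KB + (SPEC_CASSETTE_KB if has_insert else 0.0)
    return {
        "pcr_product_kb": gene_region,                    # gene amplified out of the vector
        "linearized_plasmid_kb": VECTOR_KB + gene_region  # whole plasmid cut once
    }

print("interrupted gene:  ", expected_bands(True))
print("uninterrupted gene:", expected_bands(False))
```

The two checks are redundant by design: an interrupted gene should show up both as a roughly 3kb PCR product and as a roughly 5kb linearized plasmid.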

Thu. Oct 19, 2000 (u)

I must be a real biologist, I've got my pens in a beaker on my bench. I just did this today. I have been doing a lot of heavy programming over the past week, mostly on the chlamy chip project, but also on -T-'s microarray project, and stuff had been collecting on my desktop, so I decided all of a sudden to get organized, and cleaned off the desk top and also the top drawer, which was mostly pens, so now they live in a beaker on my desk/bench top, where any card-carrying scientist's pens should live.

One of the fundamental questions I've been working toward answering is whether you can know a field w/o being a member of that field. Can you understand molecular biology -- the practice of the science -- w/o being a molecular biologist? I'm sure that anthropologists have struggled with this. Now, I'm not an anthropologist, although I have nothing against them, so I want to address this question cognitively. Can a cognitive scientist understand the cognition involved in molecular biology w/o being one? Can you get to the foundations of biological discovery by sitting in lab meetings and interviewing biologists, or doing experiments on them outside their labs, or even sitting with them at their bench taking protocols on them while they are doing their work? (This last I think might actually work, but I'll get back to that later.)

One of the things that my work in the lab has led me to understand in a way that I don't think I could have otherwise, is just how little theory there is in molecular biology, and how hard it is to get anything resembling facts. If you go by the studies that have been produced by sitting in on lab meetings, or by the scientific literature, or by the popular press, it sounds like biologists sit around all day and think, and then do some experiments to confirm their thoughts, or get more data, or to generate new hypotheses or test existing ones. But in reality, we spend 99% of our time at the bench tinkering with protocols and genes and such, and only a tiny bit of time -- probably less than an hour a week -- actually thinking about the implications, or theory, or whatever of it all.

Fri. Oct 27, 2000 (u)

So, for two weeks now I've been doing computing, and only have respite from it today because both of the biologists that I'm working with on this particular project are out for the day.

It's interesting to see them struggling from the other side, and I wonder if they are feeling the same way that I felt when I started off, a few months ago, into the fog of molecular biology. They are probably even more distressed and frightened because, unlike in my case, their jobs are actually on the line. Also, it really is the case that they can break something, and small errors in molecular biology procedures are more forgiving (in most cases) than in computers. If you leave the -R out of the cp command, or even use the wrong cAsE, you get a completely different command and possibly lose all your data! I should talk to them about how they are feeling about the computing.

So I'm pretty much doing all the computing on this project, which at the moment is most of the work, and they are running the protocols that I set up. Now, I will admit to not trying hard to make any of this simple for them. The heart of the project is hard enough -- cleaning up the mess made by gene shotgun assembly programs! -- so I haven't done any interface at all. -CWC- has made a valiant effort of trying to make my scribbled (even in email) protocol notes understandable, but since she doesn't know what's important, she can't actually do a good job of it. So, for example, she gets case wrong and forgets close quotes -- things that I know by experience are common problems. But, by analogy, I get the tube that you're supposed to be using wrong, and neglect to switch pipette tips, etc. Simple errors that can hose a whole procedure.

Computers are in many ways more forgiving and in many ways less forgiving than biological protocols. But it's usually easier to tell what you've done wrong in the case of a program or command sequence, at least since you can see the program or commands, or their remains, w/o having to run a gel!

Tue. Nov. 14, 2000 (u)

I met -BX- at NASA today and we went for coffee and talked for a couple of hours, and managed not to piss one another off. It happens that her office is on the same floor as a NASA educational museum that is in an old wind tunnel. I hung out in the museum for a little while, watching docents explain airplanes and the like to 4th graders (or so), and talked with the ex-UCSC physicist who ran the place, under the cover story that I'm a museum science education researcher. (I have so many cover stories that I'd make a great science spy!) Anyhow, so -BX- and I went to a cafe, and she told me about how she was learning to do trapeze tricks, which sounded a lot like my learning to do molecular biology tricks. Specifically, she had "epiphany days", as she called them, when she would learn something that, although small, helped a lot. She also bemoaned the fact that people who came into the school with gymnastics, etc. skills were better at it than she was. She was learning it "all at once", and I could have put the "I'm in a fog" words into her mouth.

In another part of our conversation I was telling her how molecular biology was like car mechanics, and that there wasn't much important theoretical stuff going on. She said that it sounded like I was in the wrong field, that I should be in ecosystem biology, or something of that kind. Although there's something to this insight, I think that I can find a vision of a scalable molecular biology, esp. if I get to work on cyano and other important organisms, and if I can shape a view that combines higher and lower level together. But I have to get my head out of the nuts and bolts long enough to do this!

I'm trying to spend some time each day web surfing taxonomy. There are a number of very good biological education sites that I like to browse, and I'm trying not to be so dumb about both the organisms that I work with, and the plants that I am surrounded by every day!

Th. Nov. 16, 2000 (u)

I'm not sure that I can offer a complete and general perspective on cognition in science from the inside, but I may be able to contribute a few useful insights (no pun intended) on relevant matters.

Education: Why do mature fields seem to require more postdocs -- that is, you have to do a couple of post docs in biology before getting an academic job, and in psych it used to be none back when I was in school, and now it's one? There could be many reasons: as a field matures, you get a glut of senior faculty, and faculty at all levels want to have and to keep post docs for their research. But also it might be that there is just more technique involved as a field becomes "concrete", and less "theoretical" flaming that one can get away with and appear brilliant. The stakes of "quality" come to be being able to make brain images, which requires knowing a LOT about a LOT of technical stuff. When I got out of school you could just do a couple of good experiments, or even not so good, but interesting experiments; now you have to be able to produce brain images. In biology you have to be able to do, if not master, hundreds of techniques before you can do any significant research. Also, the field is completely full of research, so it takes years to home in on a topic and figure out how to do it, and then actually do it. Ph.Ds regularly take 6 or 7 years in biology!

I had another experience of seeing the Big Picture today in -WB-'s great talk where he reviewed several papers on plant circadian rhythms...and the big picture scared me. There are hundreds (maybe as many as a thousand) genes known to have PAS domains, which are somehow involved in signaling, and a given organism may have as many as thirty or even more of such proteins. I asked Winslow what the current best theory was (briefly) of how circadian rhythms work, and he said that it's complicated, and mostly poorly understood, but that it's probably some sort of feedback-caused oscillation, like in Drosophila. How can you possibly reason about the dynamical interactions of 30-100 proteins all working in concert to do something so complex?!

Another example: I hypothesized aloud, to my embarrassment, that a photoreceptor shouldn't itself be photoregulated. I was led to this bold claim by an interpretation of a graph where they found that this particular photoreceptor (which has a PAS domain, no surprise) does NOT appear to be regulated, at least not varying in accord with the circadian rhythms of the plant. Well, I was promptly shot down by everyone who knows what they are talking about: In fact, photosignaling proteins are upregulated, downregulated, unregulated, and every other kind of regulated, and you can make up reasonable accounts to go with each possibility.

All of this leads me to wonder if molecular biology is possible. That is, will it even be possible for a human being to reason about a system that is this complex? Answer: No. We have to have machines do it. My vision of this is that I want a machine that will read the abstracts of all the papers in Science (etc.) and "understand them", sort of like Schank's various programs "understood" news stories, and then reason about them.

I had lunch with -CWC-. We talked about whether she should do a Ph.D, which she had asked me about a while ago. I didn't know what to tell her.

Th. Dec. 1, 2000 (u)

It is amazing how much computation modern molecular biology depends upon, and how little understanding of the complexity of these problems most biologists have. I don't know what -AG- would have done had I not dropped in out of the blue, and now, daily, -CWC- and -ZD- are asking me to do what to them are simple computations. Indeed, these computations are *conceptually* simple, but computationally often very very difficult. For example, -ZD- asked me to trim the singletons that are part of the assembly results for his library. If there were ten or a hundred singletons, this would not be a big issue, although the computational edifice needed to get to them is non-trivial. But there are thousands of these, and it takes many seconds of computation to process each one. Say, 3000 sequences at 5 seconds each is 15000 seconds, or about four hours of computing. Even if I got the computational time down to 1 second, it'd be nearly an hour of computing to run through these. I have been starting to teach -CWC- a little programming because I think that she can actually do it, and because I haven't had even a minute to breathe between her and -ZD- asking for "simple" computations all the time.
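The arithmetic behind that estimate is easy to redo for other batch sizes. This is only a sketch of the calculation quoted above; the function name is made up, and it assumes the sequences are processed one after another with no overlap.

```python
def total_hours(n_sequences: int, seconds_each: float) -> float:
    """Wall-clock hours to process n_sequences serially at seconds_each apiece."""
    return n_sequences * seconds_each / 3600.0

# The two cases from the entry: 3000 singletons at 5 s and at 1 s each.
for seconds_each in (5.0, 1.0):
    hours = total_hours(3000, seconds_each)
    print(f"3000 sequences at {seconds_each:.0f} s each: {hours:.2f} hours")
```

At 5 seconds each the batch takes a little over four hours; at 1 second each it still takes about fifty minutes, which is why a "simple" request is not a quick one.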

Moreover, they don't get that certain problems need to be addressed in statistical ways, and that arbitrary rules that work for the few cases that they have seen will not apply to the whole. -CWC- came up with a rule for deciding when a match was a good one, that it should have a score over 600, but this was based upon looking at a few examples, and the score turns out to be based upon the length of the matching part, and not of the whole sequence, so I showed her a couple of examples where it would not work, but her response was to add more special case rules, and finally to ask to redo the whole analysis from the beginning in order to reduce (but this would not eliminate!) the number of problem cases.
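A toy example shows why a fixed score cutoff misbehaves when the score tracks the length of the matching part rather than the whole sequence. Everything here is hypothetical: the record fields, the numbers, and the 0.8 coverage threshold are invented for illustration, and comparing against coverage is just one simple alternative, not the analysis that was actually run.

```python
from dataclasses import dataclass

@dataclass
class Match:
    name: str
    score: float     # raw alignment score (grows with the matched length)
    match_len: int   # bases in the matching part
    seq_len: int     # bases in the whole sequence

def passes_fixed_cutoff(m: Match, cutoff: float = 600.0) -> bool:
    """The rule as described: accept any match scoring over the cutoff."""
    return m.score > cutoff

def coverage(m: Match) -> float:
    """Fraction of the whole sequence covered by the matching part."""
    return m.match_len / m.seq_len

def passes_coverage(m: Match, min_coverage: float = 0.8) -> bool:
    """Hypothetical alternative: judge the match relative to sequence length."""
    return coverage(m) >= min_coverage

examples = [
    Match("short_but_complete", score=450.0, match_len=480, seq_len=500),
    Match("long_but_partial", score=900.0, match_len=950, seq_len=4000),
]

for m in examples:
    print(m.name, passes_fixed_cutoff(m), f"{coverage(m):.2f}", passes_coverage(m))
```

In this made-up pair, the fixed cutoff rejects a short sequence that matches almost end to end and accepts a long one that matches over less than a quarter of its length, which is the kind of counterexample that piling on special-case rules will not fix.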

Th. Dec. 7, 2000 (u)

I am surrounded by insane car mechanics. These people are entirely obsessed by details. So much so, that they don't understand the concept of a database when it is explained to them numerous times over. Or, perhaps they understand it, but they don't believe it. We spent over an hour today in a meeting whose stated purpose was to get a high level update on the state of the Chlamy sequencing project. Instead, most of the meeting was spent with -CWC- and -AG- trying to figure out what to name contigs so that the name would tell them everything they could possibly need to know about the contigs. To me the contigs have completely arbitrary names, which are keys into a database, but they insisted on designing names that would indicate the nearly complete history of the contigs, what well the longest clone comes from, how many contigs are in that well and which clone in the list (ordered by length) is the selected one, if it's not the longest. Ugh Ugh Ugh. In part this clearly arises from what they are used to, which is having to write all sorts of info on the labels of eppendorf tubes in order to keep them straight, whereas my stuff is all organized by simple codes that index into my lab notebook. I finally, after nearly half an hour of this, had to assert my vague authority as the person who had to actually program this thing, to call a halt to the meeting that had devolved into a trivia session, mostly based upon complete misunderstanding of what was possible for the computer to keep track of. This sort of obsession with detail seems to be endemic to molecular biologists.
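The "arbitrary name as a database key" idea can be shown in a few lines. This is a hypothetical sketch: the identifiers, field names, and values are invented, and a plain dictionary stands in for whatever database the project actually used.

```python
from dataclasses import dataclass

@dataclass
class ContigRecord:
    source_well: str          # well the longest clone comes from
    contigs_in_well: int      # how many contigs are in that well
    selected_clone_rank: int  # rank by length of the selected clone (1 = longest)

# The key carries no meaning; everything worth knowing lives in the record.
contigs = {
    "C000001": ContigRecord(source_well="P03-A07", contigs_in_well=4, selected_clone_rank=1),
    "C000002": ContigRecord(source_well="P03-B11", contigs_in_well=2, selected_clone_rank=2),
}

def describe(contig_id: str) -> str:
    """Look up a contig by its arbitrary key and report its history."""
    rec = contigs[contig_id]
    return (f"{contig_id}: well {rec.source_well}, "
            f"{rec.contigs_in_well} contigs in well, "
            f"selected clone rank {rec.selected_clone_rank}")

print(describe("C000002"))
```

Adding another fact about a contig then means adding a field to the record, not renaming every contig.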

Th. Dec. 15, 2000 (u)

I've been thinking about why I haven't been writing many notes recently. Partly, I've been doing mostly bioinformatics for the past few months, although there is a great deal of biology in that as well. The effect that I think is most important is that I've stopped writing because I've stopped having interesting things to say. This, in turn, is because I've actually gone native and, as a result, have stopped having most of the understanding problems that I had when I was learning, and the concomitant noticing events, which combined led me to have interesting things to say. One could possibly actually plot my "nativity", so to speak, by plotting the number of diary entries per week, or some such thing.

More evidence that I've actually gone native comes from the talk that I gave this morning to a room full of molecular biologists, about my computer models of biological knowledge. In addition to a room full of grad students, technicians, and post docs, there were -AG-, and Winslow, the former director of the division, and a major plant biologist. I showed off my own models of biological pathways, and talked in detail about clones and mutants and genomics and transcription and translation, and did not appear to get anything at all wrong. In fact, there was tons of constructive discussion, and no one at least overtly said anything like: You got that part wrong! (One of the great things about plant biologists, at least here, is precisely that this sort of constructive interaction appears to be the rule! But I have seen talks where everyone just sits on their hands when they think that the person is bullshitting, and this wasn't what happened here.)

Yet there are still many fundamental things that I don't yet have a complete handle on. The best example comes from the past week's work on primers for the chlamy chip. We're trying to amplify double stranded genes from ESTs in vectors. To do this you just run a program that makes primers for you, and then you run the PCR. It's mostly automatic, aside from a bunch of data massaging. But you have to get the direction of the primers right, and their "polarity" -- there's no word for this... Actually, this is totally confusing, and there should be some way to remember it. You have "forward" and "reverse," and "normal" and "complemented," and then, you've got two strands, right, so you have to know which one you're reading, and put it into the normalized direction. Oh, and then DNA v. RNA. Ugh Ugh Ugh! I guess evolution didn't build the thing with understandability in mind. It's amazing that it's as simple as it is. The code could have been a nightmare, instead of a simple three-letter code, and restriction enzymes could be hellishly complicated, but it's all pretty simple, so I guess that I shouldn't complain that some parts are a little hard to talk about and remember.
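The forward/reverse, normal/complemented, DNA/RNA tangle can at least be pinned down in code. The sketch below is a generic illustration, not the primer-design program mentioned above: the function names are made up, sequences are assumed to be written 5' to 3', and the toy primers are far shorter than real ones.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def complement(seq: str) -> str:
    """Base-by-base complement, same left-to-right order."""
    return seq.translate(_COMPLEMENT)

def reverse_complement(seq: str) -> str:
    """The opposite strand, rewritten so that it also reads 5' to 3'."""
    return complement(seq)[::-1]

def transcribe(coding_strand: str) -> str:
    """RNA with the same sequence as the DNA coding strand (T becomes U)."""
    return coding_strand.replace("T", "U").replace("t", "u")

def primer_pair(template: str, length: int) -> tuple:
    """Toy primers: forward copies the start of the strand as written;
    reverse is the reverse complement of its end."""
    forward = template[:length]
    reverse = reverse_complement(template[-length:])
    return forward, reverse

template = "ATGGCCTTAGGC"  # made-up sequence, top strand, 5' to 3'
print(complement("ATGGCC"))          # TACCGG
print(reverse_complement("ATGGCC"))  # GGCCAT
print(transcribe("ATGGCC"))          # AUGGCC
print(primer_pair(template, 4))      # ('ATGG', 'GCCT')
```

The one habit that seems to help is the one baked into `reverse_complement`: always rewrite whichever strand you are reading into the 5' to 3' direction before comparing it with anything else.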

Speaking about talking about and remembering, I've been noticing, in my travels in molbio land, the points at which I don't have to think about things any longer to talk about them. This is the same sort of thing that happens when you learn someone's name. At first you have to think about what it is each time you want to refer to them, then there's a stage where, if you spend enough time with them, you don't have to think hard about it any longer, but it still doesn't really flow from you, and then there eventually comes a point where it does just flow. Your connection with that person is different in each of these stages, although I doubt that this has anything to do with the naming, but is analogous and correlated just in the sense that it happens in stages over time of familiarity. We don't have words for these stages: acquaintance, friends, colleague don't register the right dimensions. First you have just met, then you work around them for a while, knowing who they are but unfamiliar with their habits in detail, and eventually they are part of the natural flow of your environment. The same, of course, is true with the tools of the trade, both physical and conceptual. I can make medium and grow stuff pretty much without trouble now, but for most things I'm still in that middle stage of knowing them but not being in a natural flow with them. It's sort of like I've got them in my head, but not in my hands (speaking metaphorically, since most of the things I'm referring to are conceptual things, not physical ones.)

Please note:

There are about 75 entries up to 20001215 that I am slowly converting to HTML. If you would like an update when there are major new additions, send me email.

Except for the occasional "borrowed" image, all of this text and my personal images are Copyright (c) 2000-2001 by Jeff Shrager. This work was presented publicly for the first time at the Workshop on Cognitive Studies of Science and Technology, organized by Mike Gorman and held March 24-27, 2001 at the University of Virginia. If you would like to cite any of this material, please refer to this page, and/or to:

Shrager, J. (2003) On Being and Becoming a Molecular Biologist: Notes from the Diary of an Insane Cell Mechanic. To appear in Gorman, et al. (Eds.) [working title]. [download pdf version].

And, if you use the contents of this web page, please cite its URL.