How to bamboozle

Chapter 13 of Grading, Testing and Instructional Format by William Kirby. Revised, 1995.

It is fun to discuss grades with some students. You can hear the most hilarious things. The best way to go about it is to bring the talk around to how teachers and professors decide what grades to give. Someone will state that the teacher tosses papers down a flight of stairs and gives high grades to those that land on a high step. Somebody else will suggest that high grades are given to the neatest work or to the longest papers. Eventually, though, someone will give a different answer.

Well, first, the papers are marked. Then they are put in order by number of correct answers or points earned. Then, what happens next depends on whether the teacher curves or not.

Curves?

Sure, you know, either the teacher marks standard--90% correct for an A--or marks on the curve.

What curve?

You know, the curve. It’s shaped like this. (Here the student will indicate the silhouette of a gently sloping symmetrical hill with her hands.)

What happens if the teacher does mark, as you say, on this curve?

Well, he gives the top 10% of the papers an A and the bottom 10% an F.

Why the same number of A’s and F’s?

Because the curve has the same shape on either side of its middle.

Why 10% instead of 4% or 7%?

I’m not sure, but 10% A’s and F’s and 20% B’s and D’s and 40% C’s adds to 100%.

So does 4%, 24%, 44%, 24% and 4%.

Well, I don’t know but whatever he does, he gets his figures from the curve.

What curve? Where did he hear about this curve? Why does he pay any attention to it? Who made this curve and when?

If you have any of the answers to these questions in your head, you can be sure that you are a very unusual student indeed. While this kind of searching question is easy and normal to ask about things that don’t concern oneself intimately, it is very difficult for students to vigorously question the logic behind the procedures used to grade them. The better adjusted and more self-confident they are, the more it may seem like childish and unworthy quibbling to probe the reasons for a low grade. It would seem weird and bad-natured to probe the reasons for a high grade. That leaves only the unusual student who ignores the grades’ personal reflection of her work and such students are few. Therefore, it behooves teachers and parents to do the questioning for the student.

This chapter could be titled Univariate Statistical Models because the curve that has been mentioned is just such a model. It is a statistical or mathematical model and it is only one of many that exist. It is by far the most famous. In fact, it is a little too famous. It is so influential that you would be well-off to know something about it. The curve has a shape about like this:

[Figure: the curve, drawn above a baseline of Z scores]

That shape can be achieved by a physical process. If you nail a funnel to a bulletin board and arrange identical pegs or headless nails just below it in a pattern like this:

you will have an apparatus through which you could pour BB’s or beads. Most will fall back and forth through the middle pegs but a few will fall to the right all the time, or to the left at every peg they meet. If compartments are made below the pegs to catch the beads, the middle compartments will be the most filled and the extremes will catch the least. Infinitely many compartments catching infinitely many beads that fell through infinitely many pegs would catch beads in exactly the shape of the curve.
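
The peg board can be imitated in a few lines of Python. This is only a sketch under stated assumptions: each bead is assumed to bounce left or right with equal chance at every peg, and the names pour_beads, n_beads and n_rows are invented for the illustration.

```python
import random
from collections import Counter

def pour_beads(n_beads, n_rows, seed=0):
    """Drop n_beads through n_rows of pegs; return the count in each compartment."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_beads):
        # Assumption: at each peg the bead goes right (1) or left (0)
        # with equal chance. The compartment it lands in is the number
        # of rightward bounces it made on the way down.
        rights = sum(rng.randint(0, 1) for _ in range(n_rows))
        counts[rights] += 1
    return [counts[k] for k in range(n_rows + 1)]

compartments = pour_beads(n_beads=10_000, n_rows=12)
for index, count in enumerate(compartments):
    # One '#' per 50 beads gives a rough sideways picture of the pile.
    print(f"{index:2d} {'#' * (count // 50)}")
```

With a finite number of pegs and beads the piles only approximate the curve. The middle compartments fill up and the extremes stay nearly empty, as described above.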

What does all of this have to do with what grade a student should get? I am sorry to have to say that it has just about nothing to do with it. Mathematicians and statisticians developed the most recent formula for the famous curve over a hundred years ago. In its simplest form, it refers to Z scores:

$$h(Z) = \frac{1}{\sqrt{2\pi}} \, e^{-Z^{2}/2}$$

This says that the height of the curve above any Z score equals e (a certain number) raised to a power that depends on that Z score (minus half its square) and multiplied by a constant (the number 0.3989...). Thus, if we lay out a baseline for Z scores and mark off points for scores of 0, 1, 2, 3, -1, -2, and -3, we can draw the curve accurately. All that is needed is a dot at the proper height above various Z scores. Enough dots will provide guidance for us to string them all together and we will have the famous curve.
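
The dot-plotting just described can be tried out directly. The following Python sketch computes the height of the curve above the seven Z scores mentioned; the function name curve_height is made up for the illustration and is not from the original text.

```python
import math

def curve_height(z):
    """Height of the curve above Z score z: (1 / sqrt(2*pi)) * e**(-z*z / 2)."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

# One dot for each Z score marked off on the baseline.
for z in (-3, -2, -1, 0, 1, 2, 3):
    print(f"Z = {z:+d}  height = {curve_height(z):.4f}")
```

The height above a Z score of 0 is the constant 0.3989... itself, since e raised to the power 0 is 1. The heights above Z and -Z are equal, which is why the curve is symmetrical.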

Regardless of what the formula means, it is clear that something which involves only Z scores cannot really be the final authority on who gets what grade or how many grades of a certain kind there should be. It should be equally clear that if a teacher knows that every student in a class has done terrible work, the existence of the above mathematical formula in no way lessens that teacher’s responsibility to give them all terrible grades.

Some people in education and psychology refer to this curve as bell-shaped (some bell!) and some call it the normal curve. This last is a particularly unfortunate name. While it is true that this statistical model has successfully predicted many real distributions in nature, it is by no means true that only the distributions it predicts are normal, healthy or natural. I believe that it helps to put this troublesome model in perspective if you know a little about some of the other statistical models. The one I have been discussing was developed, in part, by Gauss, a very important mathematician. One name for this curve is the Gaussian distribution. It is in fact the most famous and widely applied statistical model.

The second most widely used model is probably the Poisson distribution. It has predicted such diverse things as the probability of being kicked to death by a horse while serving in the Prussian Army, mid-air plane collisions, oral reading mistakes, vacancies on the Supreme Court and telephone busy signals. If you are studying or predicting the frequency of a rather rare phenomenon, one that frequently does not occur, the Poisson distribution might very well be just what your local statistician would prescribe. If you were to consider all possible Lady Tasting Tea tests, you would get involved with the hypergeometric distribution. These are two of the many statistical models that exist. You might look at Measuring Uncertainty by S.A. Schmitt for a list of others.
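
For readers curious about what the Poisson model looks like in use, here is a minimal Python sketch of its standard probability formula. The average rate of 0.5 occurrences per period is a hypothetical figure chosen only for illustration, not data from any of the examples above, and the function name poisson_probability is likewise invented.

```python
import math

def poisson_probability(k, rate):
    """Probability of exactly k occurrences when the average count is `rate`."""
    return rate ** k * math.exp(-rate) / math.factorial(k)

rate = 0.5  # hypothetical average number of occurrences per period
for k in range(5):
    print(f"{k} occurrences: {poisson_probability(k, rate):.4f}")
```

With a small average rate, zero occurrences is the most likely outcome, which matches the description of a phenomenon that frequently does not occur.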

The statistician or mathematically-oriented scientist may try to develop or use a mathematical model to simplify his understanding of a phenomenon. There are certain mathematical assumptions that lead to the Gaussian distribution. (They are set forth rather clearly in an old book by Peters and Van Voorhis called “Statistical Procedures and Their Mathematical Bases”. Also Feller’s “An Introduction to Probability Theory and Its Applications” or Drake’s “Fundamentals of Applied Probability Theory” may be helpful. But I warn you, they are irrelevant to instruction and testing.) Anyhow, if the Gaussian distribution predicts fairly closely how some variable will distribute itself, then supposing that the variable really behaves according to the assumptions behind the model may lead to new insights.

The assumptions behind all statistical models involve probability and chance events. But educators are paid to make education a process that improves the student. Therefore, while the Gaussian distribution may predict very well how children will be able to perform before instruction, it is certainly no sign of good education to show that they are Gaussian-distributed after it. In fact, the ideal distribution after excellent and successful instruction wouldn’t be spread out at all. It would be a single very high pile of scores marked “perfect.”

Nevertheless, this curve and a lot of supposedly magic mathematical symbols used with a properly mysterious attitude can effectively bamboozle people. A fully bamboozled student or parent will ask fewer questions. He will accept negative judgments more humbly. He will maintain a properly respectful manner in your presence better and in general make your life easier and more majestic. If you want such a life as a teacher, one that substitutes show and symbol for truth and liveliness, you owe it to yourself to perfect all phases of half-assed confusing. Pompous bamboozling with the curve is a good place to start. It is easy and doesn’t take much intelligence.

To get started, decide how many symbols you will use. Say, for instance, you decide to award five grades, A through E. Then, divide 6 by the number of symbols. (Other figures than 6 can be used but it is a pretty standard figure. Concentrate on learning to hoodwink people, please.) If there are to be five grades, the result is 1.2. Whatever your result, have a statistician friend help you find the proportion of the Gaussian distribution that lies half this distance (1.2 / 2 = 0.6), up and down from the curve’s middle. It doesn’t really matter if you don’t understand what I am writing about here. Eventually, your friend will hand you a percentage for each kind of grade you wish to give. He will have read the figure from a statistical table that slices the Gaussian distribution into many fine vertically-cut pieces. He gets your percentages by packing these pieces into as many groups as you wish to have grades. Once you have these figures, you are all set for confidence tricks and chicanery, except maybe for facial control and tone of voice.

For five grades, the percentages are within about 1% of the easy-to-remember set of 4%, 24%, 44%, 24%, and 4%. To grade easily and regally, you merely put test papers in order by number of correct answers and pick up the top 4%. With most class sizes, 4% will not work out to be an integer. In that case, just come close to the figure.
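
If no statistician friend is handy, the table lookup can be imitated in Python. This is a sketch of the recipe as I read it, not a definitive procedure: it assumes the grade bands are equal-width slices of Z scores centered on the curve's middle, with the lowest and highest bands also taking in the tails. The names normal_cdf, curve_percentages, n_grades and span are invented for the illustration.

```python
import math

def normal_cdf(z):
    """Proportion of the Gaussian distribution lying below Z score z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def curve_percentages(n_grades, span=6.0):
    """Percent of the curve in each of n_grades equal-width bands.

    Assumption: the bands are centered on the curve's middle, each is
    span / n_grades wide, and the two end bands also include the tails.
    """
    width = span / n_grades
    # Interior cut points, e.g. -1.8, -0.6, 0.6, 1.8 for five grades.
    cuts = [(-span / 2) + width * i for i in range(1, n_grades)]
    edges = [0.0] + [normal_cdf(c) for c in cuts] + [1.0]
    return [100 * (hi - lo) for lo, hi in zip(edges, edges[1:])]

percentages = curve_percentages(5)  # ordered from lowest grade to highest
print([round(p, 1) for p in percentages])
```

Under these assumptions the five bands come out close to the easy-to-remember set, with the middle band a little above 45% and the two end bands a little under 4%. None of this makes the figures any more relevant to what a student has learned.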

I do not recommend the procedure. In fact, I detest covering ignorance with pomp. Why ought I to derive percentages for grades from the Gaussian distribution? The best answer to the question is that I actually ought not to do so. There is no persuasive reason to derive grade percentages from any statistical model. The best thing I know involves deciding what skills and experiences (1) I am probably capable of helping my students acquire and (2) they will probably benefit from acquiring. You can see the word “what” in the preceding sentence. That word denotes particularity, individuality. I must decide and re-decide periodically what skills to concentrate on, what errors to avoid. “What”--not “how many.” At bottom, instruction and its evaluation concern things in a non-quantified way. Of course, numbers and probabilities can be helpful in summarizing educational situations at times. But basically, fundamentally, instructional testing and grading pertain to individual items.

Instruction is an activity that is pictured here as involving a human learner and, usually, a human teacher. Activities go on best between humans if they are conducted with a little spice, a little sensitivity. I don’t want to end this discussion with a picture of a mechanized human being reading down a checklist of conditions toward an automatic decision of a student’s grade. There are many educational decisions that are reached quite well as a matter of routine. But the main thing is always the student. There may well be times when your best judgment is that a student should get a higher grade without the agreed-on items all being checked off. There may be times when you judge it best to require additional surprise items from one student and not another. Not too many times or you have a signal that your routine procedures need revising. But there will be some times when you must exercise your best judgment. The point of this chapter, though, is that no matter if you knew twice as much math as you do, you still would not find the so-called normal curve giving much guidance. Not unless you have been bamboozled.

References

Beyer, W., Handbook of Tables for Probability and Statistics, 2nd edition, 1968, Chemical Rubber Company Press.

Feller, W., An Introduction to Probability Theory and Its Applications, 3rd edition, 1968, Wiley.

Haight, F., Handbook of the Poisson Distribution, 1967, Wiley.

Peters, C., and Van Voorhis, W., Statistical Procedures and Their Mathematical Bases, 1940, McGraw-Hill.

Schmitt, S., Measuring Uncertainty: An Elementary Introduction to Bayesian Statistics, 1969, Addison-Wesley.