The Abilities

Scientific abilities and their assessment

E. Etkina, A. Van Heuvelen, D. Brookes, S. Brahmia, M. Gentile, S. Murthy, D. Rosengrant, A. Warren

This document describes scientific abilities briefly; a more complete description of the abilities and their assessment is posted below under a separate link.

Defining scientific abilities

We use the term “scientific abilities” to describe some of the most important procedures, processes and methods that scientists use when constructing knowledge and when solving experimental problems. We use the term “scientific abilities” instead of “science process skills” to underscore that these are not automatic skills, but are instead processes that students need to use reflectively and critically[1]. The list of scientific abilities developed by our physics education research group is as follows:

the ability to represent information in multiple ways;
the ability to use scientific equipment to conduct experimental investigations and to gather pertinent data to investigate phenomena, to test hypotheses, or to solve practical problems;
the ability to collect and represent data in order to find patterns, and to ask questions;
the ability to devise multiple explanations for the patterns and to modify them in light of new data;
the ability to evaluate the design and the results of an experiment or a solution to a problem;
the ability to communicate.

This list is based on an analysis of the history of the practice of physics[2], the taxonomy of cognitive skills[3], and recommendations of science educators[4].

To help students develop these abilities, one needs to engage them in appropriate activities, to find ways to assess their performance on these tasks, and to provide timely feedback. Activities that incorporate feedback to the students are called formative assessment activities. As defined by Black and Wiliam, formative assessment activities are "all those activities undertaken by teachers, and by their students in assessing themselves, which provide information to be used as feedback to modify the teaching and learning activities in which they are engaged."[5] Black and Wiliam also found that self-assessment during formative assessment is more powerful than instructor-provided feedback; that is, a system of individual, small-group, and large-group feedback enhances learning more than instructor-guided feedback does. Sadler[6] suggested three guiding principles, stated in the form of questions, that students and instructors need to address in order to make formative assessment successful.

1. Where are you trying to go? (Identify and communicate the learning and performance goals.)

2. Where are you now? (Assess, or help the student to self-assess, current levels of understanding.)

3. How can you get there? (Help the student with strategies and skills to reach the goal.)

As noted above, students need to understand the target concept or ability that they are expected to acquire and the criteria for good work relative to that concept or ability. They need to assess their own efforts in light of the criteria. Finally, they need to share responsibility for taking action in light of the feedback. The quality of the feedback rather than its existence or absence is a central point. The feedback should be descriptive and criterion-based as opposed to numerical scoring or letter grades without clear criteria.

With all the constraints of modern teaching, including large-enrollment classes and untrained teaching assistants (TAs), how can one make formative assessment and self-assessment possible? One way to achieve this goal is to use scoring rubrics. A scoring rubric is one of the ways to help students see the learning and performance goals, self-assess their work, and modify it to achieve the goals (the three guiding principles defined by Sadler above). The rubrics contain descriptions of different levels of performance, including the target level. A student or a group of students can use the rubric to self-assess their own work. An instructor can use the rubric to evaluate students’ responses and to provide feedback.

Fine-tuning scientific abilities and devising rubrics to assess them

After making the list of scientific abilities that we wanted our students to develop, we started devising assessment rubrics to guide their work. Rubrics are descriptive scoring schemes that are developed by teachers or other evaluators to guide students' efforts[7]. This activity led to a fine-tuning of the abilities, that is, to breaking each ability into smaller sub-abilities that could be assessed. For example, for the ability to collect and analyze data we identified the following sub-abilities:

the ability to identify sources of experimental uncertainty,
the ability to evaluate how experimental uncertainties might affect the data,
the ability to minimize experimental uncertainty,
the ability to record and represent data in a meaningful way, and
the ability to analyze data appropriately.

Each item in the rubrics that we developed corresponded to one of the sub-abilities. We agreed on a scale of 0-3 in the scoring rubrics to describe student work (0 – missing, 1 – inadequate, 2 – needs some improvement, 3 – adequate) and devised descriptions of student work that could merit a particular score. For example, for the sub-ability “to record and represent data in a meaningful way” a score of 0 means that the data are either missing or incomprehensible, a score of 1 means that some important data are missing, a score of 2 means that all important data are present but recorded in a way that requires some effort to comprehend, and a score of 3 means that all important data are present, organized, and recorded clearly.
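A rubric item of this kind can be sketched as a small data structure. The following is only an illustration, not the form in which the rubrics were actually written or distributed: the class name `RubricItem`, the `feedback` method, and the `SCALE` dictionary are hypothetical names introduced here, while the score labels and descriptors are copied from the paragraph above.

```python
from dataclasses import dataclass

# Score labels from the 0-3 scale described in the text.
SCALE = {0: "missing", 1: "inadequate", 2: "needs some improvement", 3: "adequate"}


@dataclass
class RubricItem:
    """One rubric row (hypothetical structure): a sub-ability and a descriptor per score."""
    sub_ability: str
    descriptors: dict  # maps each score (0-3) to a description of student work

    def feedback(self, score: int) -> str:
        """Return descriptive, criterion-based feedback for a given score."""
        if score not in SCALE:
            raise ValueError(f"score must be one of {sorted(SCALE)}")
        return f"{score} ({SCALE[score]}): {self.descriptors[score]}"


# Descriptors taken from the example in the text.
record_data = RubricItem(
    sub_ability="record and represent data in a meaningful way",
    descriptors={
        0: "Data are either missing or incomprehensible.",
        1: "Some important data are missing.",
        2: "All important data are present but recorded in a way "
           "that requires some effort to comprehend.",
        3: "All important data are present, organized, and recorded clearly.",
    },
)

print(record_data.feedback(2))
```

Returning the descriptor together with the number reflects the point made earlier that feedback should be descriptive and criterion-based rather than a bare score.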

While refining the list of abilities, we started devising activities that students could perform in recitations and labs. Defining sub-abilities and developing scoring rubrics to assess them informed the writing of these activities. After we developed the rubrics, we started using them to score samples of student work. Each person in our nine-person group assigned a score to a given sample using a particular rubric; we then assembled all the scores in a table and discussed the items in the rubrics where the discrepancy was large.

Based on these discussions we revised the wording of the rubrics and tested them by scoring another sample of student work. This process was iterated until we achieved nearly 100% agreement among our scores.
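The text does not say how discrepancy or agreement was computed, so the sketch below rests on two assumptions: discrepancy for a rubric item is taken as the spread between the highest and lowest score, and agreement as the fraction of rater pairs that gave exactly the same score. The scores in `table`, the function names, and the flagging threshold of 2 are all hypothetical.

```python
from itertools import combinations


def discrepancy(scores):
    """Spread between the highest and lowest score given to one rubric item."""
    return max(scores) - min(scores)


def pairwise_agreement(scores):
    """Fraction of rater pairs that assigned exactly the same score."""
    pairs = list(combinations(scores, 2))
    return sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical table: one list of scores (one per rater) for each rubric item.
table = {
    "identify sources of uncertainty": [3, 3, 3, 3],
    "record and represent data": [2, 3, 1, 2],
}

# Items whose scores differ by 2 or more points are flagged for discussion
# (the threshold is an assumption made for this sketch).
flagged = [item for item, scores in table.items() if discrepancy(scores) >= 2]

for item, scores in table.items():
    print(item, discrepancy(scores), round(pairwise_agreement(scores), 2))
print("discuss:", flagged)
```

Under these assumptions, revising the wording of a flagged item and rescoring a new sample would be repeated until the agreement for every item approaches 1.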

In the sections below we list the scientific abilities and corresponding sub-abilities that we identified, provide examples of the scoring rubrics that we devised, and discuss where in the instructional process we use the rubrics. For each scientific ability, we provide examples of the tasks written for the students. In subsequent sections we will report how we used the rubrics to study students’ acquisition of some of the suggested abilities. Notice that the list of abilities below matches neither the list above nor the list of rubrics; the discussion below describes our work as it progressed and as we slowly zeroed in on the abilities for which we wrote the rubrics. Read this section slowly with the printed rubrics at hand so you can see how our thought process led to the development of a particular set of rubrics. The list below does, however, match the different types of tasks that we have on this webpage.

1. Multiple representation ability

While constructing and using knowledge, scientists often represent the knowledge in different ways, check for consistency of the representations, and use one representation to help construct another[8]. For example, in the 1950s Feynman diagrams helped quantum electrodynamics move forward somewhat more rapidly by providing a more visual and understandable representation of a scattering process. Rules were also developed for converting these diagrams into complicated scattering cross section equations. Such qualitative representations, particularly diagrammatic or in some cases graphical representations, help physicists reason qualitatively about physical processes and see patterns in data without engaging in difficult mathematical calculations.

In introductory physics courses students are often given a verbal description of a physical process and a problem to solve relative to that process. They can start their analysis by constructing a sketch to represent the process and include in the sketch the known information provided in the problem statement. They then construct more physical representations that are still relatively easy to understand, for example motion diagrams, free-body diagrams, qualitative work-energy and impulse-momentum bar charts, ray diagrams, and so forth. Finally, they use these physical representations to help construct a mathematical representation of the process.

What sub-abilities help to make this multiple representation strategy productive for reasoning and problem solving?

The ability to correctly extract information from a representation;
The ability to construct a new representation from another type of representation;
The ability to evaluate the consistency of different representations and modify them when necessary.

In addition to such sub-abilities that students need to master while using multiple representations, there are specific sub-abilities needed for each type of representation. For example, to use force diagrams (or free-body diagrams, FBDs) productively for problem solving, students must learn to:

Choose a system of interest before drawing the diagram.
Use force arrows to represent the interactions of the external world with the system object or objects.
Label the force arrows with two subscripts (for example the force that Earth exerts on the object is labeled as F_{E on O}).
Try to make the relative lengths of force arrows consistent with the problem situation (the net force should be in the same direction as the system object’s acceleration).
Include labeled axes on the diagram.

Such diagrams, if drawn correctly, can be used to help write Newton’s second law in component form, that is, to represent the situation mathematically. Based on these considerations we constructed a rubric to help students self-assess while drawing force diagrams.
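As an illustration of the component form, suppose the diagram shows the forces exerted on a system object O by external objects i = 1, 2, ..., with labeled x and y axes. The symbols m (mass of the system object) and a_x, a_y (components of its acceleration) are notation assumed here; only the "on O" subscript style comes from the labeling convention above.

```latex
\begin{aligned}
\sum_{i} F_{i \text{ on } O,\,x} &= m\,a_x \\
\sum_{i} F_{i \text{ on } O,\,y} &= m\,a_y
\end{aligned}
```

Each labeled arrow on the diagram contributes one term to each sum, so a missing or mislabeled arrow shows up directly as a missing or wrong term in the equations.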

We also made a list of several types of multiple representation activities (a task may consist of some combination of these activities). For example:

· Provide students with one representation and have them create another.

· Provide students with two or more representations and have them check for consistency between them.

· Provide students with one representation and have them choose from a multiple-choice list a consistent different type of representation (for example, provide a mathematical description of a process and have students select from a list a consistent word description of the process).

· Have students use a representation while solving a problem.

2. Ability to devise and test a qualitative explanation or quantitative relationship (paying attention to the type of the experiment students conduct)

One of the purposes of science is to explain observed phenomena[9]. Hypotheses that scientists generate to explain phenomena need to be testable – this means that they can be used to make predictions about the outcomes of new experiments[10]. If the outcome matches the prediction, it does not mean that the hypothesis under test is always correct; it only means that the hypothesis was not ruled out by the testing experiment. Thus, it is more productive to try to design an experiment whose outcome will not match the prediction based on the hypothesis under test. However, the outcome of the testing experiment depends not only on the correctness of the hypothesis but also on other auxiliary hypotheses used to make a prediction. These are usually simplifying assumptions about objects, interactions, systems, or processes involved in the phenomenon[11]. Based on these considerations we identified the following sub-abilities that we want our students to develop:

· Ability to make a reasonable prediction based on the proposed hypothesis.

· Ability to identify assumptions used in making the prediction.

· Ability to determine specifically the way the assumptions might affect the prediction.

· Ability to revise the hypothesis based on new evidence.

For each of the sub-abilities, we developed an item in the rubrics to guide students.

To engage students in testing hypotheses, we provide them with alternative hypotheses that they need to test. We emphasize that they need to try to design an experiment to rule out the hypothesis, not to support it. For example:

· Design an experiment to test the following proposed hypothesis: an object always moves in the direction of the unbalanced force exerted on it by other objects.

· Design an experiment to test the following proposed hypothesis: in an electric circuit the current is used up by different elements.

3. Ability to account for anomalous data

Another important ability that scientists use in their work is the ability to account for anomalous or unexpected data. Often when a scientist performs an experiment, she obtains some information that seems to contradict her expectations. After performing the experiment she needs to modify the explanation or revisit the simplifying assumptions. We devised “surprising data tasks” that engage students in similar activities. They are used at the stage of learning when students have constructed some scientific understanding (explanation) of relevant phenomena. Students are asked to predict what will happen as a result of a particular new experiment. Students need to write the prediction and an explanation of the prediction. After making their prediction and writing an explanation, students observe the experiment directly. Most likely the outcome of the experiment will not match their prediction – they will have anomalous data. Then the students have to revise their prediction by revising the explanation or the simplifying assumptions that they used to make it. In doing so, the students will need to either revise the model or revise their assumptions.

4. Ability to design an experimental investigation

Here it is very important to discuss the role of experiments in the development of physics knowledge by physicists and by students. Although in our traditional view of physics all experiments seem to be epistemologically equal, an analysis of the history of physics and of the work of physicists today shows that experiments fall into three big categories: experiments that help generate new ideas (hypotheses, relations, etc.), experiments that test proposed ideas (hypotheses, relations, etc.), and experiments that combine several tested relations to solve a practical problem. Based on this reasoning one can classify experiments into three big groups: OBSERVATIONAL EXPERIMENTS, TESTING EXPERIMENTS, AND APPLICATION EXPERIMENTS. In the learning of physics, experiments can play a similar role[12].

When conducting an observational experiment, a student focuses on investigating a physical phenomenon without having expectations of its outcomes. When conducting a testing experiment, a student has an expectation of its outcome based on concepts constructed from prior experiences. In an application experiment, a student uses established concepts or relationships to address practical problems. However, in the process of scientific research the same experiment can fall into more than one of these categories.

What abilities do students need when designing these investigations? We have identified the following steps that students should take to design, execute and make sense out of a particular experimental investigation. We assigned a sub-ability for each step and wrote corresponding descriptors in the rubrics. The results of these discussions are presented in Table 3.

For each of the identified sub-abilities, we devised a rubric item that describes different levels of proficiency.

Students use these rubrics in their labs. Ideally we want them to continuously refer to the rubrics while designing and performing the experiment. The rubrics guide them as to what experimental aspects they should specifically pay attention to. After they perform the experiment, they write a lab report (in the lab). During the process of writing, they use the descriptors in the rubrics to improve their report. For example, in one experiment, students used a thermometer to measure the temperature of a hot rock. They recorded the rock’s temperature in their report. However, what students actually measured was the temperature of the water in which the rock was submerged. Using the rubrics they self-assessed their writing and revised their description of how they used available equipment to measure the required physical quantity. In the revised report, students wrote that to determine the temperature of the rock they measured the temperature of water in which the rock was submerged and waited for a certain time before recording the temperature so that the thermometer, rock and the water were in equilibrium.

5. Ability to record, represent and analyze data

Data collection and analysis are important in the practice of experimental science. These abilities are independent of the type of experiment that is being performed, and hence have been placed in a different category. We identified sub-abilities that students need for successful data collection and analysis and devised rubrics for each sub-ability. (The simplified list below is appropriate for students – scientists do this at a much more sophisticated level.)

- Ability to identify sources of experimental uncertainty.
- Ability to evaluate how experimental uncertainties might affect data.
- Ability to minimize experimental uncertainty.
- Ability to record and represent data in a meaningful way.
- Ability to analyze data appropriately.

The rubric for each sub-ability has descriptors indicating what needs to be done for satisfactory achievement. Students develop these sub-abilities in labs. As we discussed above, the lab write-ups we provide guide them through the process by focusing their attention on the sub-abilities outlined in the rubrics.

6. Ability to evaluate experimental predictions and outcomes, conceptual claims, problem solutions, and models

We define evaluation as making judgments about information based on specific standards and criteria[13]. More specifically, a given particular is judged by determining whether it satisfies a criterion well enough to pass a certain standard. Scientists constantly use evaluation to assess their own work and the work of others when conducting their own research, serving as referees for peer-reviewed journals, or serving on grant-review committees.

Evaluation is a crucial ability for our students. During a physics course, students are expected to identify, correct, and learn from their mistakes with the help of an instructor. This aid may come in many forms, such as when an instructor provides problem solutions to a class, or tutoring to an individual student. However, in each case the student relies upon an instructor (or sometimes a textbook) in order to determine whether, and how, their work is mistaken. Since the students are not given any other means with which to evaluate their work, the students come to see evaluation by external authorities as the only way for them to identify and learn from their mistakes. This dependence on external evaluation has several negative effects on students, inhibiting their learning and desire to learn[14].

There are several sets of criteria and strategies that are commonly used by practicing physicists, and if we want physics students to engage in evaluation they too must value and use these strategies. Each of these strategies relies upon hypothetico-deductive reasoning[15], whereby the information is used to create a hypothesis which is then tested. The logical sequence for this testing can be characterized as: If (general hypothesis) and (auxiliary assumptions) then (expected result) and/but (compare actual result to expected result), therefore (conclusion). For example, when a student derives an equation and needs to evaluate it with dimensional analysis, the logical sequence is:

If the equation is physically self-consistent,

And I correctly remember the units for each quantity in the equation,

Then I expect the units for each term in the equation to be identical,

And/But the units for each term are/are not identical,

Therefore the equation is/is not physically self-consistent.
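The dimensional-analysis chain above can be sketched in code. This is a minimal illustration under stated assumptions: units are represented as tuples of exponents of (length, time), the helper names `mul`, `power`, and `is_self_consistent` are hypothetical, and the kinematics equation is chosen here only as an example. The auxiliary assumption "I correctly remember the units for each quantity" corresponds to the unit constants being entered correctly.

```python
# Units as exponents of (length, time); e.g. velocity m/s is (1, -1).
METER = (1, 0)
SECOND = (0, 1)
VELOCITY = (1, -1)
ACCELERATION = (1, -2)


def mul(*units):
    """Multiply quantities: exponents of each base dimension add."""
    return tuple(sum(parts) for parts in zip(*units))


def power(unit, n):
    """Raise a quantity to the n-th power: exponents are multiplied by n."""
    return tuple(e * n for e in unit)


def is_self_consistent(term_units):
    """Expected result: every term in the equation has identical units."""
    return len(set(term_units)) == 1


# Terms of x = v0*t + (1/2)*a*t^2 (numerical factors carry no units).
good = [METER, mul(VELOCITY, SECOND), mul(ACCELERATION, power(SECOND, 2))]

# Terms of the deliberately wrong equation x = v0*t + (1/2)*a*t.
bad = [METER, mul(VELOCITY, SECOND), mul(ACCELERATION, SECOND)]

for name, terms in [("x = v0*t + a*t^2/2", good), ("x = v0*t + a*t/2", bad)]:
    verdict = "is" if is_self_consistent(terms) else "is not"
    print(f"Therefore {name} {verdict} physically self-consistent.")
```

As with any testing experiment, passing the check does not show that the equation is correct (a wrong numerical factor would go unnoticed); it only shows that the equation was not ruled out.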

The types of sub-abilities that students need to develop to be successful in evaluation are numerous. Some of them are:

ability to conduct a unit analysis to test the self-consistency of an equation;
ability to analyze a relevant limiting/special case for a given model, equation, claim;
ability to identify the assumptions a model, equation, or claim relies upon;
ability to make a judgment about the validity of assumptions;
ability to use a unit analysis to correct an equation which is not self-consistent;
ability to use a special-case analysis to correct a model, equation, or claim;
ability to judge whether an experimental result fails to match a prediction;
ability to evaluate the results of an experiment by means of an independent method.

Evaluation sub-abilities are integral components of the multiple representation and design abilities, and they are represented in the evaluation rubrics. Not only do we want students to learn each of the evaluation strategies, we also want students to value them and incorporate evaluation into their personal learning behavior. We have developed two categories of tasks to help achieve this. One category consists of supervisory evaluation tasks, wherein students act like a supervisor by evaluating (and, if necessary, correcting) someone else’s work (usually the work of an imaginary friend). The other category consists of integrated evaluation tasks, which ask the students to evaluate, and if necessary to correct, their own work. For both categories of task, the evaluated work may be a problem solution, experiment design, experiment report, conceptual claim, or a proposed model. Supervisory evaluation tasks are meant to help the students learn the goals, criteria, and method of use for each evaluation strategy, while integrated evaluation tasks encourage students to incorporate evaluation into their learning behavior. During a semester we tend to use mostly supervisory tasks for the first few weeks so that the students can get acquainted with each strategy, and then transition to integrated tasks so that they gain experience at using the strategies to evaluate and correct their own work.

7. Ability to communicate

An important ability in the work of scientists is their oral and written communication, an ability that can be fostered in a physics course. For example, the quality of a lab report can be judged for its completeness and clarity. A communication ability rubric can help students know what is expected in communications in the scientific world.

[1] Salomon, G., & Perkins, D. N. (1989). Rocky roads to transfer: Rethinking mechanisms of a neglected phenomenon. Educational Psychologist, 24 (2), 113-142.

[2] Holton, G. & Brush, S., “Physics, The Human Adventure,” New Brunswick, New Jersey: Rutgers University Press, (2001). Lawson, A., “The Generality of Hypothetico-Deductive Reasoning: Making Scientific Thinking Explicit,” The American Biology Teacher 62 (7), 482-495, 2000. A. Lawson, “The nature and development of hypothetico-predictive argumentation with implications for science teaching,” International Journal of Science Education, 25 (11), 1387-1408 (2003).

[3] B.S. Bloom, ed., “Taxonomy of Educational Objectives: The Classification of Educational Goals, Handbook I Cognitive Domain,” New York: David McKay Co, Inc., 1956. P. Dressel and L. Mayer, General education: Exploration in evaluation. Final Report of the Cooperative Study of Evaluation in General Education. Washington, DC: Council on Education, 1954.

[4] C. Schunn and J. Anderson, “Acquiring expertise in science: Exploration of what, when and how,” K. Crowley, C. D. Schunn, & T. Okada (Eds.), Designing for science: Implications from everyday, classroom and professional settings Mahwah, NJ: Erlbaum. 2001, 351-392.

[5] P. Black and D. Wiliam, “Inside the black box: Raising standards through classroom assessment,” London: King’s College, 1998, p. 2.

[6] R. Sadler, “Formative assessment and the design of instructional systems,” Instructional Science 18, 119-144, 1989.

[7] Brookhart, S. M. (1999). The Art and Science of Classroom Assessment: The Missing Part of Pedagogy. ASHE-ERIC Higher Education Report (Vol. 27, No.1). Washington, DC: The George Washington University, Graduate School of Education and Human Development.

[8] 1) R. D. Tweney, “Scientific thinking: A cognitive historical approach,” K. Crowley, C. D. Schunn, & T. Okada (Eds.), Designing for science: Implications from everyday, classroom and professional settings 141-173, 2001, Mahwah, NJ: Erlbaum. 2) D. Hestenes, "Toward a modeling theory of physics instruction", Am. J. Phys. 55, 440-454, 1987.

[9] Kourany, J.A. (1987). Scientific knowledge: basic issues in the philosophy of science. Belmont, Ca.: Wadsworth; Nagel, E. (1961). The structure of science. New York: Harcourt, Brace, & World.

[10] Lawson, A. (2000). How do humans acquire knowledge? And what does it imply about the nature of knowledge? Science & Education, 9, 577–598.

[11] E. Etkina, A. Warren, and M. Gentile, “The role of models in physics instruction,” The Physics Teacher, xx, 2005.

[12] Etkina, E., Van Heuvelen, A., Brookes, D. & Mills, D. (2002). Role of experiments in physics instruction – A process approach. The Physics Teacher, 40(6), 351-355.

[13] Anderson, L.W. & Krathwohl, D.R. (2001). A Taxonomy for Learning, Teaching, & Assessing: A Revision of Bloom’s Taxonomy of Educational Objectives. New York: Longman.

[14] Warren, A., “Evaluation as a Means for Student Learning,” Unpublished doctoral dissertation.

[15] Lawson, A. (2000). How do humans acquire knowledge? And what does it imply about the nature of knowledge? Science & Education, 9, 577–598.