Experiments in Improvement of Long-Term Learning

Glenn Ledder, University of Nebraska-Lincoln

In my undergraduate experience, multi-variable calculus seemed to be a fairly simple subject. All I had to do was extend my thorough knowledge of single variable calculus to a higher-dimensional framework. Neither partial differentiation nor multiple integration posed any significant difficulties for me, since these topics are logical generalizations of single variable calculus. Only the final chapter, on vector calculus, seemed difficult.

When I began my teaching career, I expected to see low rates of success in entry-level mathematics courses such as Calculus I. Indeed, roughly 40% of the students beginning Calculus I at the University of Nebraska in 1989 withdrew, failed, or got a D in the course. I expected Calculus II and Calculus III to have much higher success rates. These courses are restricted to students who have earned a C or better in Calculus I; since those students had shown themselves capable of success in calculus, it was only logical to expect them to have a high probability of success in later calculus courses. I expected that grades of W, F, and D would be given only to students who had lost interest in college, budgeted inadequate time, or were hampered by personal problems. To my great surprise, I learned that the success rates in Calculus II and Calculus III, both at Nebraska and at every other institution for which I have data, are essentially the same as the rate for Calculus I at the same institution.

The observations I made as a calculus student and as a calculus teacher were hard to reconcile at first. As one moves through the calculus sequence, the courses do not get harder, but the average ability of the students increases, because the weaker students are lost and the stronger ones retained. Grade distributions should go up! It was not long before I understood the problem. Calculus I students were severely handicapped by a lack of mastery of precalculus, Calculus II students were equally handicapped by a lack of mastery of Calculus I, and the unfortunate Calculus III students suffered from a lack of mastery of all prior courses, which combine to form the required background for multi-variable calculus. The problem was now obvious: all high school teachers and most of my faculty colleagues were passing students who had not learned the subject. I resolved to maintain high standards for my students. I also instituted a policy of giving my Calculus III students a take-home single variable calculus assignment in the first week of class, in order to find out at the beginning who was under-prepared.

One semester, I taught Calculus III after having taught one of the two large lecture sections of Calculus II the previous semester. This meant that half of my Calculus III students had passed my Calculus II class. Surely I would find students better prepared for success because of my high standards from the previous semester. Right?

Wrong, as anyone with a reasonable amount of teaching experience could have predicted. My former students were indistinguishable from the rest, being equally ill-prepared. Was I only deluding myself that I had maintained high standards? Curious, I pulled out the old final exams of those former students who did poorly on the review assignment. What I discovered was that these students had been perfectly competent, although not outstanding, on my Calculus II final only a month before showing themselves to be incompetent at the very same material. The real problem? College courses reward only short-term learning, while future success depends on long-term learning. I was not prepared for this, because my own learning in calculus classes had been long-term. The same is true for the students who are extremely successful in my Calculus III classes now. But there are not very many of these students.

Since then, I have viewed every calculus class as an opportunity to do classroom research on the question of how to achieve long-term learning in a short-term context. The focus had to be on new teaching strategies, new assessment strategies, or some combination of the two.

My first experiment was an attempt to improve performance through new teaching strategies. It was based on the observation that students don't seem to get much out of class meetings. They come to class not having understood the previous material, possibly not even having done the homework exercises. The instructor then begins covering the new topic, very slowly and starting from the beginning. Class ends before sufficient time has been spent on subtleties or difficult examples. This, in turn, makes it harder for the students to do the exercises and master the material, so they pass the exams by cramming rather than by mastery. But what if students were required to answer very basic questions about the new section before coming to class? Then class meetings could take the form of enrichment rather than basic coverage. My plan was to create a 1-page worksheet for each day's class. Each day, I handed out a worksheet on the material planned for the next day's class. On it, students had to answer a basic conceptual question, do a very simple calculation, and suggest a question that I could answer in class. The worksheets were worth one point each: I awarded the full point for worksheets that were mostly correct, discounting minor errors, and a half-point for worksheets with significant errors but some correct work. The worksheets had to be handed in at the beginning of class, before I covered the material. When I arrived at the classroom, I quickly scanned the questions and chose a representative set; a significant share of class time was then spent answering the questions the students had asked.

This experiment was largely unsuccessful. Students hated the worksheets and argued over the difference between a whole point and a half-point. Often the worksheets were done in the last five minutes before class by copying from other students or by guessing. My spontaneous class presentations were not as good as my prepared presentations. Sometimes the worksheet questions were too hard. The emphasis on reading ahead was good, but it distracted from the more important goal of doing the previous section's exercises.

My next experiment was an attempt to improve performance through new assessment strategies. It was based on the idea that long-term learning comes from mastery rather than from multiple partial successes. I taught differential equations that semester, and I used a modification of the mastery-grading scheme championed by psychology professor Dan Bernstein. In Dan's version, students repeat a test as needed until they get a passing score, and the course grade is based on how many tests the student passes during the semester. This self-paced scheme could not work in a math class, where class meetings are essential: no matter what I might cover on a given day, some students would be ahead of that point and others behind. It also concerned me that passing a test once might not really indicate mastery, so I wanted a plan whereby a student had to validate earlier successes on the final exam. My solution was to divide each chapter's topics into basic and advanced. Each student had to demonstrate mastery of the basic material in each chapter. Doing so, and repeating that success on the final exam, would guarantee a C for the course. Grades higher than C had to be earned through bonus points, given for satisfactory work on advanced topics and projects.

I gave 7 exams in that class (one for each chapter), each having 2 basic pages and 1 advanced page. I graded in the usual way, defining "mastery" as performance at a B+ level on the basic part. Students who did well enough on the basic portion got a checkmark rather than a numerical score; the advanced portion was graded for a numerical score. Students who did not pass the basic portion had to repeat it outside of class until they eventually passed. I set no limit on the number of attempts at the basic chapter tests, but I had to limit attempts at the basic final to two. I did this by giving an optional 1-hour "pre-final basic exam" during the week before the final. (All students but one chose to take it.) Those who did not pass the pre-final exam (with "pass" defined as B or better) had to take another version of the basic final during the first hour of the 2-hour final exam period; the second hour was reserved for the 1-hour advanced final. Any student who did not pass either version of the basic final could get no higher than a D, regardless of what they had done up to that point. This, of course, was a big risk for the students, since I had allowed unlimited attempts on the hour exams. I decided that, if necessary, I would consider giving a student a third try when the students returned for the next semester.
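To make the grading rules concrete, here is a minimal sketch of the grade computation in Python. The function name and the bonus-point thresholds for the A and B grades are my own illustrative assumptions; the scheme described above does not specify how many bonus points each grade required.

    # Sketch of the mastery-grading rules described above.
    # The bonus-point thresholds are hypothetical placeholders.
    def course_grade(passed_all_basic_chapters, passed_basic_final,
                     bonus_points):
        # Basic mastery, validated on the basic final, guarantees a C;
        # failing either requirement caps the grade at a D.
        if not (passed_all_basic_chapters and passed_basic_final):
            return "D or lower"
        # Grades above C are earned only through bonus points from
        # advanced exam pages and projects.
        if bonus_points >= 40:   # hypothetical threshold for an A
            return "A"
        if bonus_points >= 20:   # hypothetical threshold for a B
            return "B"
        return "C"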

This experiment was quite successful, as measured by student outcomes. I began with 29 students. Of these, 4 withdrew or gave up. None of the remaining 25 earned a grade lower than a C. This was quite remarkable. One student in particular illustrated the power of the assessment method. He needed 19 attempts to pass the basic exams for the 7 chapters; he then passed the basic final, albeit with no points to spare, on the first try. I am convinced that he learned much more in earning that C than he would have learned in earning a C under standard assessments. The comparison is somewhat moot, as it is highly unlikely that he would have passed a standard version of the course, in which each exam could be attempted only once. On my teaching evaluations at the end of the semester, a large majority of the students indicated that they thought the experimental system was better than the standard one. Some complained that they had to work much harder, but admitted that they learned more in the process.

The experimental assessment system also had some serious drawbacks. A number of students worked only the advanced questions on the exams, because they knew they would have later opportunities to pass the basic portion. This problem could be fixed by distributing the basic and advanced exams over two class periods; during the second period, students would get either a repeat basic exam or an advanced exam, as appropriate. A surprising number of students, 9 of the 25 who passed, made no serious attempt to do better than a C. Two of these students had to work very hard just to pass, and one was taking the course pass-fail, but the other 6 simply decided early in the term that a C was good enough. These students would certainly have learned more under the standard system, but perhaps they would not have retained any more of it in the long term. The real Achilles' heel of the system was the time commitment required to write, administer, and grade all those tests. I was constantly scheduling make-up basic exams for groups of students who were behind. Pass rates on repeat exams were low until I began to require students to correctly work the missed problems as homework before they could sit for a make-up exam. Grading became much quicker when I stopped grading for points and merely marked the errors and the outcome, P or N. Even then, though, the creation and administration of the make-up exams was far too time-consuming.

I learned a lot from these first two experiments. I still felt that insisting on daily preparation was a good idea, but the students disliked the worksheet method too much for it to be useful, and it forced me to teach class without being fully prepared. Accepting only work that met a minimum standard was clearly a good idea, but I had to find a way to do it without the intensive time commitment of my second experiment.

More recently, I have been experimenting with computerized tools to accomplish the same ends without requiring so much extra time or engendering so much resentment. My department had already adopted a "Gateway" (or basic skills) exam requirement in Calculus I and Calculus II: exams on differentiation and integration techniques that students in the appropriate course had to pass. They were administered online using software created by my colleague John Orr, which is currently marketed under the name "Wiley Web Tests" and is being revised under the name "EDU." Our department uses this software only for summative assessment after a subject has been studied. However, the system can also be used for pre-class worksheets. My first experiment with this use was partly successful. I learned that creating a database of online worksheets requires a great deal of time and that questions must be carefully field-tested before use. I often modified questions after seeing the difficulties that prevented the first two or three students from getting the correct answers.

I am currently working on an NSF-funded project with John Orr and colleagues in physics and geology to create and analyze online assessment instruments (using the EDU software). EDU allows questions to include randomized data, so that a single question template can generate thousands of individual questions and each student sees a slightly different version. I intend to create daily worksheets for all three calculus courses, as well as mastery assignments, in which a student is guided through a multi-step solution process and must succeed at each step before proceeding to the next.
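As a rough illustration of the idea, the following Python sketch mimics what a randomized question template does. The template format and function names here are invented for illustration; they are not the actual EDU syntax.

    import random

    def derivative_question(seed):
        # One template, many questions: randomize the coefficient and
        # exponent so that each student sees a slightly different
        # version of "differentiate f(x) = a*x^n".
        rng = random.Random(seed)
        a = rng.randint(2, 9)
        n = rng.randint(2, 6)
        prompt = "Find f'(x) for f(x) = %dx^%d." % (a, n)
        answer = "%dx^%d" % (a * n, n - 1)
        return prompt, answer

    # Seeding by student ID gives each student a reproducible variant.
    print(derivative_question(seed=1234))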