The Context of Schooling: 1845-1905
The educational reforms that took place in Canada in the 1840s echoed those that took place in the United States. In 1837, Massachusetts became the first state in America to establish a Board of Education, headed by reformer Horace Mann. Mann advocated for reforms such as age-graded classrooms, uniform textbooks, teacher training, comprehension-based learning (rather than emulation), school inspections, and the application of statistics to schools. In Canada, Egerton Ryerson advocated similar reforms, which culminated in the Common School Act of 1846.
A number of social and political factors influenced the ideas of Mann, Ryerson, and like-minded reformers at the time. As the industrial era took hold, governments became increasingly interested in applying the principles of bureaucratic management to other disciplines – including education. Governments began to use quantitative, standardized measures to gain a better understanding of their economies (e.g. trade, population growth, mortality). Statistical societies emerged in the 1830s, making it possible for “citizens in all walks of life [to] better measure, count, compare, and determine trends over time” (Reese, 2013, p. 56). In education, Mann saw the potential for written tests to replace oral examinations, allowing stakeholders to make quantitative comparisons within and between schools and facilitating the efficient and objective sorting of students (US Congress, 1992). In both Canada and the US, the concern for quantitatively measuring students and schools increased as schooling became universal, free, and compulsory, since taxpayers wanted assurances that their money was being spent wisely (US Congress, 1992).
Dawn of the Test
In 1845, Mann and his examination committee administered the first written standardized test to nineteen grammar schools in the city of Boston. These tests came unannounced, to the chagrin of schoolmasters and pupils, replacing the end-of-year oral exhibitions in May-June of that year. The schoolmasters had been “puffed” in local newspapers for years, and part of Mann’s motive for this surprise ‘attack’ was to drive them “out of the paradise of their own self-esteem” (Reese, 2013, p. 60). Following a series of bitter exchanges and personal attacks between himself and Boston’s schoolmasters, Mann wanted to embarrass the schoolmasters by proving that their antiquated approaches to education (e.g. emulation, rote learning, and corporal punishment) were ineffective. According to the US Congress:
“The idea underlying the implementation of written examinations…was born in the minds of individuals already convinced that education was substandard in quality. This sequence – perception of failure followed by the collection of data designed to document failure (or success) – offers early evidence of what has become a tradition of school reform and a truism of student testing: tests are often administered not just to discover how well schools or kids are doing, but rather to obtain external confirmation – validation – of the hypothesis that they are not doing well at all” (US Congress, 1992, p. 108).
Consistent with the above statement, the results of Boston’s first written examination were dismal. The average score across all nineteen grammar schools was 30%, with nearly half of the responses being left blank (Reese, 2013).
In designing his test, Mann created questions that drew from practical, real-world scenarios, requiring students to demonstrate their understanding of lesson topics rather than simply stating memorized facts. Because the field of statistics was in its infancy, little was known about concepts such as reliability or validity in developing test instruments. External examiners administered the test over the course of eight days, ensuring that the test-taking conditions were uniform from school to school (Reese, 2013). The tests for all nineteen schools were then marked by six members of the examining committee, and schools were subsequently ranked on the basis of these scores. Although the creators and scorers of this test were openly biased against the schoolmasters, citizens’ blind faith in the objectivity of statistics was enough to convince them that “numbers did not lie,” and that these quantitative “facts” proved that Boston’s educational system was in decline (Reese, 2013).
The 1845 tests in Boston had an enduring impact on educational assessment and policy. Foremost was the idea that the blame for poor performances rested with teachers or ‘the system’ rather than the students. (Meanwhile, strong performances were attributed to student merit.) This mentality – which persists in present-day discussions around standardized testing – was deliberately propagated by Mann as a means of discrediting the schoolmasters in the midst of their longstanding feud (Reese, 2013). Convinced that parents would otherwise villainize the examiners, Mann conspired with his longtime colleague, Samuel Gridley Howe, to publish a series of newspaper articles convincing the public that the low exam scores revealed the incompetency of the schoolmasters, and that the schoolmasters therefore deserved to be fired. The summer of 1845 marked the first time in history that the negotiation of teacher contracts was based upon students’ test scores (Reese, 2013). This also marked the beginning of the feminization of the teaching force, since male schoolmasters were then replaced by their much less expensive female counterparts, who were believed to embody the nurturing qualities necessary for child-centered reform.
Testing in Canada
In 1871, Ontario became the first province to mandate compulsory schooling, and as school enrolments increased in the second half of the nineteenth century, concerns surrounding the efficient classification and sorting of students grew (Axelrod, 1997). As the “gospel of testing” spread from the United States, the use of written tests in classrooms became more commonplace, allowing teachers to assess student learning regularly throughout the school year. Situated within age-graded classrooms, students were no longer seated at benches but rather organized into rows of single desks such that each student became an individual “pedagogical unit.” As Finkelstein (1991) notes: “Grades supplemented whips, report cards supplemented spelling exhibitions, and rewards of merit took the form of dollar bills…As the technologies of the new marketplace made their way into schools, so too did the impersonal, bureaucratic method” (p. 477).
In 1875, George Paxton Young introduced the practice of “Payment by Results” into Ontario’s secondary schools (Wilson, 1970). The Payment by Results scheme combined written examination scores with school attendance records to determine the level of funding that would be provided to various high schools and collegiate institutes (Wilson, 1970). While Young’s intention for this scheme was to raise academic standards, this approach instead served to promote practices such as cramming and drilling; “The exam’s the thing” became the catchphrase of Ontario’s secondary schools (Wilson, 1970). Additionally, many students came to believe that getting high marks on an exam was equivalent to being educated. While the Payment by Results scheme ended in 1882, its impacts on learning practices have persisted into the present day.
The amount of time teachers spent designing, administering, and grading tests contributed significantly to their workload – particularly as school attendance and test frequencies increased (Danylewycz & Prentice, 1986). In 1891, Ontario and Quebec announced that teachers would also be required to produce written assessments of each individual student’s progress (similar to modern-day report cards), which included a “mind chart” of their achievements as well as a recommendation for promotion to the following grade (Danylewycz & Prentice, 1986). Thus, in addition to the time devoted to written examinations, teachers began to assign and evaluate other learning artifacts (e.g. workbook pages, essays, stories, drawings) in order to inform their written reports.