Friday, November 12, 2010

The "evidence" in evidence-based decision making with High Stakes Testing

High Stakes Testing provides the opportunity for data-based decision making. The question is: will that be an opportunity for a "garbage in, garbage out" process, or for true "evidence-based practice"?

Advantages of High Stakes Testing:
At the individual student level, one little-discussed advantage of high-stakes testing is that appropriately calibrated and normed tests can provide data that validate the grades students earn. This would end the dilemma faced by students with high grades from poorly achieving schools. Students presently enrolled in schools like New York City’s John Dewey High School face challenges in applying to prestigious colleges. Admissions counselors can say, “Sure, you have a high GPA, but it’s from a school under registration review. What does it mean?” If that GPA is accompanied by a set of correspondingly high scores on achievement tests, it’s clear that the student’s grades reflect earned accomplishment.

At the school building level, data from high stakes testing (and other assessments) can clear the path for educational leaders to move students to alternative modes of education (for example, an in-school tutoring model, different materials, or Direct Instruction curricula) so that they no longer have to repeat an experience that was already measurably ineffective for them. This use of test data can help solve a “culture problem” in education today: the rift between philosophy and outcomes. Doing what is right for children isn’t always easy in education because we are in a field with many stakeholders, and the groups tend to hold disparate philosophies of education. It seems that many sub-groups, by dint of previous training and orthodoxy, are equally assured that their way of providing education is the “one true way.”

Disadvantages of High Stakes Testing:
At all levels, just because a decision is “data-driven” doesn’t mean that it is correct. Knowledge may be power, but power corrupts, and the absolute power that corrupts absolutely probably had its case argued with mis-analyzed or frankly manipulated data. In some localities, administrators and school boards use aggregated performance data to make wrongheaded decisions. Take, for just one example, the article’s report that in Texas, low-performing schools are bolstered with additional financial support and rarely closed. In some fields, such an arrangement would be called “anti-merit pay.”

Finally, a personal observation: data from a poorly constructed or badly normed test may be worse than no data at all. I was a member of the student cohort that took the Ohio Ninth-Grade Proficiency Tests (now being phased out) in the fourth year they were given and the Ohio Twelfth-Grade Proficiency Tests in their first year. My Ninth-Grade proficiency tests were not a problem at all. As the daughter of one of the high school faculty members, I was expected to perform well in high school. In fact, I represented the school in statewide comparisons by taking “Scholarship Tests” each year. When my results came in for the Twelfth-Grade tests, imagine my surprise to learn that I had “pegged” all the other subtests, yet failed Reading Comprehension. To this day, I have no explanation for this spurious result. Was the test faulty? Was the wrong key used to score my test? Did anybody in the school pass? Was I “one bubble off” on my response sheet? Don’t know, don’t care, glad it didn’t count, except for the fact that my high school diploma did not have an extra sticker indicating that I passed those Twelfth-Grade tests. Who knows what became of the “star students” in the next cohort, for whom that subtest was a requirement and not a pilot project? It could have been a disaster, and in that place at that time, I’m personally very glad the State of Ohio was NOT “playing for keeps.”
