May 4, 2007

AFT NCLBlog: Two Cheers for NAEP, and What's A Flack For?

Just one of a bunch of responses to the Bracey piece...

May 4, 2007 10:59 AM
After reading Eduflack here and seeing John's post as well, I thought I'd add some thoughts. The Flack puts the hammer to Bracey for criticizing NAEP. Now let me be clear. I adore NAEP, and double adore state NAEP.
For example, I was just at an event (more on that later) where Joydeep Roy of EPI mentioned that Georgia has a lower graduation rate than Arkansas, yet Georgia has higher NAEP scores across the board -- which suggests the issue may be Arkansas' low graduation standards rather than Georgia's guidance counselors. And when President Bush first proposed NCLB, I compared NAEP scores in New York and Texas and was stunned -- even before trying to untangle New York's data-disaggregation issues -- by how many NY schools would fail and how many Texas schools wouldn't, despite NAEP scores that looked pretty similar. The difference for NY was not in ourselves, but in our choice of standards. David Grissmer's work on state NAEP scores is one of my favorite bits of research. And I think it's great that people are examining issues in American math instruction raised by international comparisons, including the one Bracey is writing about.
But one thing you will almost never see me do is judge anything based on NAEP's performance standards alone. I stick to looking at scale scores. Where NAEP draws the line between below basic and proficient strikes me as the wrongheaded result of a process that needs some work. Beth's post pointed out the results of that process, as did Bracey's op-ed. In fact, if you read the Department of Ed's website here, you'll see that they cite a National Academy of Sciences study on this issue thusly:
The Panel concluded that "NAEP's current achievement level setting procedures remain fundamentally flawed. The judgment tasks are difficult and confusing; raters' judgments of different item types are internally inconsistent; appropriate validity evidence for the cut scores is lacking; and the process has produced unreasonable results."
And yet they're still using it. This bothers me, because measuring the problem correctly is pretty important for devising solutions and distributing scarce resources. Now, it's possible that something has been done to remedy this while I wasn't looking and the Department of Ed simply hasn't updated the webpage. Absent that, I suspect this is a case where Bracey is making a good point, but it is lost in the noise of a DC conventional-wisdom cocktail party that appears to have swallowed Eduflack whole. What to do when people -- including those in power -- misuse data to exaggerate the flaws in public schools seems to be a real PR issue. Rather than knocking Bracey, if the Flack isn't actually flacking for a different agenda here, I'd ask what advice there is to give Jerry about this problem.