A New Problem of Grade Deflation

In the past, lacking serious problems to think about, politicians worried about grade inflation. For ten years, more and more students were getting top grades, which caused problems for university admissions officers, who said they could no longer fairly pick the smartest pupils.

Teachers often think it was the Coalition government who whipped up anger about grade inflation. Actually, Labour were the prime movers. Ed Balls wanted to tackle the problem with a new system that ensured grades only went up if there were genuine improvements in young people’s knowledge and skills.

Unfortunately, the solution we ended up with means that next year we may face a new problem: grade deflation.

A Tale of Comparable Outcomes

The solution to grade inflation was a much misunderstood mathematical process known as ‘comparable outcomes’.

The maths behind it is quite complex, but the essential assumption it hangs on is that children don’t change much in skill levels from year to year. If grades suddenly jump up, it’s probably because something about the exam process has made it easier to get a high grade.

That said, we can’t rule out the possibility that one year group is freakishly smart or that changes in teaching practices have improved things. For example, if teachers focus more on reading (as they have), it is possible that when those students get to GCSE, they will achieve better results. But how can we know that a grade increase is genuinely because students are smarter, rather than because of an easier exam?

One way to check is to look at other evidence about a GCSE year group. Looking at the results from when the students took SATs tests at primary school is one check. If a year group had unusually high SATs scores, then it would seem normal for them to have unusually high GCSE scores too. Another way to check is to have an ‘anchor test’: a low-stakes test that students don’t revise for, which shows whether the cohort has picked up more learning along the way. In England, these tests are called national reference tests. They happened in February and early March this year, and were taken by a large enough sample of pupils to estimate what this year’s GCSE results should look like.

Using this information, exam boards and Ofqual are able to triangulate the grades given by any particular exam board for any particular subjects. Should AQA French be handing out more As (or Es) than the prior data and national reference test calibrations would suggest, then Ofqual will highlight the discrepancy. The exam board can present evidence as to why it thinks a variation should be allowed. (Past examples include an influx of private school pupils without prior SATs results). If the evidence stands up to scrutiny, the grades stay. If not, they are adjusted.

The grades given each year are therefore ‘comparable’, in the sense that it shouldn’t matter when a student sat their exam (be it 2018 or 2019): they should still expect to get the same grade. Differences in grade outcomes that are caused by changes to the exams are thereby removed. But if the students really are smarter than those in previous year groups, as evidenced by higher prior attainment and better national reference tests, then more high grades will be handed out.
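The check described above can be sketched in a few lines of purely illustrative Python. Everything here – the grade shares, the tolerance, the idea of comparing simple proportions – is invented for illustration; Ofqual’s actual process uses a far more sophisticated statistical model.

```python
# Toy sketch of the comparable-outcomes check described above.
# All figures and the tolerance are invented for illustration.

def flag_discrepancies(awarded, predicted, tolerance=0.02):
    """Return the grades whose awarded share differs from the share
    predicted by prior attainment and the national reference test
    by more than the tolerance."""
    return {
        grade: (awarded[grade], predicted[grade])
        for grade in predicted
        if abs(awarded[grade] - predicted[grade]) > tolerance
    }

# Hypothetical example: prior data suggest a board should award
# 12% A grades in French, but this year it awarded 17%.
predicted = {"A": 0.12, "B": 0.25, "C": 0.30}
awarded = {"A": 0.17, "B": 0.24, "C": 0.29}
print(flag_discrepancies(awarded, predicted))  # {'A': (0.17, 0.12)}
```

If a flagged board can justify the variation (an unusual intake, say), the grades stand; otherwise they are adjusted back towards the prediction.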

Here comes the nightmare

The comparable outcomes system is predicated on a world that inexorably tends towards grade inflation, and tries to move us away from it. No one ever thought about what would happen if student knowledge suddenly collapsed. In that case, if the system keeps working properly, we will get grade deflation.

Why? Let’s think the system through again.

Imagine a Year 10 group who for some reason – say, a massive pandemic – are suddenly interrupted in their learning. Although they muddle through as best they can towards their exams, it’s fair to say that the cohort as a whole knows less by the time it hits its GCSE tests than the Year 11 in the year above. The national reference test confirms this to be true.

We can go back and look at the cohort’s prior attainment and see that the students were, indeed, roughly similar to the prior Year 11 when they left primary school. Had the pandemic never happened, the exams regulator might have handed out very similar grades to those given to the previous year group.

BUT we have really good reason to believe that the cohort have, in fact, taken a hit. As awful as it sounds, the group are less knowledgeable than the year above. And, at some point in the future, both will be in the labour market together, fighting for jobs, using their academic grades as weapons.

Which is more fair? To keep the grade profile consistent with the previous year group – and give grades on a ‘here’s what you would have won’ basis? Or do you do what comparable outcomes is supposed to do, and give grades commensurate with how knowledgeable the cohort currently is? If you do the latter, it will mean – bluntly – that the playing field is massively tilted towards kids from better-educated families.

I can’t see a world in which the first situation is allowed to prevail. No one is going to think it’s fair that 15-year-olds, randomly hit by a pandemic, should get lower grades. Fair enough.

But, that now means that comparable outcomes is dead.

Instead, we are likely to turn to so-called ‘norm referencing’. Norm-referencing is where a fixed percentage of each grade is given out every year. For example, the top 10% of students get an A, the next 10% a B, and so on. The percentages are likely to be anchored to what the current Year 11s received. Hence, even if there are real increases or decreases in students’ capability, these will not be reflected in the outcomes.

On the one hand, norm-referencing seems instinctively fair. But if we stick with this system long-term, and students in five years’ time are substantially smarter than they are now – such that a student with the knowledge of an average 2020 A-grade student would only rank in the C band in 2025 – then it seems unfair that the 2020 A and the much stronger 2025 A carry the same label. Norm referencing can lead to this situation.

Likewise, if students start getting less smart every year, to the point that a C grade student in 2025 can only do what a student getting Fs can do right now, then that’s also problematic. People may think that kids are getting dafter – but the data won’t help us know for sure. All of which then becomes a problem for school league tables.

And I Haven’t Even Mentioned The Class of 2020

So far I’ve only talked about the class of 2021, i.e., the current Year 10s who will sit their GCSEs next year. The class of 2020, the current Year 11s, are being subjected to comparable outcomes via ‘best guess’. Instead of sitting exams, their teachers are guessing what grade they would have got and then those guesses are going through the comparable outcomes mill. A big reason for doing this is to ensure that the students of 2020 don’t suddenly have massive grade inflation or deflation, and that the currency of their grades continues to be comparable to other year groups.

If the Year 11s were the only ones affected by the crisis, this would be fine. But if we accept that Year 10 are going to be affected (and I think we are getting to the point where this seems likely), and that this screws up comparable outcomes for the reasons above, then… why are we bothering to make Year 11 grades comparable?

I cannot tell you how much this conclusion hurts my heart. As one of the people who has spent time really trying to understand the maths behind comparable outcomes – having visited Ofqual’s offices, sat with their researchers, seen the hard work that goes into it – I am often amazed at the system they’ve created to ensure fairness across exam years. People who dismiss comparable outcomes as a way of ensuring some kids fail are missing the point. What comparable outcomes does is ensure that when kids pass, they can be absolutely certain they hit the same standard that anyone in any other comparable year also achieved. They didn’t get lucky, or have an easy exam board. They really passed it.

But I fear the loss of this year’s exam series, compounded with a year group who will know less, is likely to break the system.

Is There A Solution?

One way to solve the problem would be to norm-reference for a few years, just while things settle down. That way the Year 10s would automatically get the same grade profile as the Year 11s – or, at least, not get lower grades – and while we might all know they didn’t show the same knowledge levels, it just doesn’t really matter that much in the great scheme of things.

The difficult question is when do you pick it back up? What if the current Year 6s are still showing lower grades on the national reference test even in five years’ time? Should they still get a pass? Maybe the answer is yes.

What I think this shows us, overall, is how politicians’ fixation on stopping grade inflation never once considered that the opposite might happen: that one day we might face grade deflation. It is yet another example of how we lived our pre-pandemic lives thinking the world would always be the same. And now we know it won’t.