When ChatGPT launched, the question most students asked was whether it would make essay writing obsolete. Three years later, the more interesting question is why so many students who use it for essays are still getting 2:1s — and not Firsts.
The answer reveals something important about what university examiners are actually marking — and why the shortcut doesn't work.
What ChatGPT is good at
To understand why AI writing fails at the First class level, you have to first understand what it does well. ChatGPT is exceptionally good at:
- Writing grammatically correct, fluent English prose
- Summarising established positions in a field
- Structuring arguments in a logical order
- Citing sources (though it frequently hallucinates them)
- Producing content quickly at 2:2 to 2:1 level
For a student who struggles with grammar, clarity, or basic essay structure, AI tools can genuinely help, lifting a 2:2 to a low 2:1. But this is exactly where they stop working.
Why AI essays top out at the 2:1 ceiling
First class essays require something that AI cannot generate: original intellectual engagement with a specific argument. Here's why each of the five First class criteria is beyond what ChatGPT reliably produces:
1. Original argument
ChatGPT synthesises existing positions. It cannot develop a genuinely novel argument because it has no stake in the question, no access to your seminars, your reading, or the specific texts your course has covered. The arguments it produces are — at best — well-expressed versions of things that have already been written. Examiners reading the same question for the 40th time notice immediately when an argument is generic.
2. Genuine critical engagement
Critical analysis requires knowing the field well enough to identify where arguments are contested, where the evidence is thin, and where established positions are vulnerable. ChatGPT produces the appearance of critical engagement — acknowledging that "some scholars argue..." — without the depth. A real examiner can tell the difference between a student who understands why a counter-argument matters and one who has listed it as a formality.
3. Your specific course context
University essays are assessed partly on engagement with the specific texts, debates, and theorists covered in lectures and seminars. ChatGPT doesn't know what your module has covered. It can't reference the specific reading your tutor mentioned, or engage with the particular framing your course takes on a canonical debate. Examiners notice the absence of course-specific engagement immediately.
4. Intellectual voice
First class essays have a distinctive voice — the voice of a person who has thought hard about a problem and has something specific to say. AI-generated prose is fluent but characterless. Experienced examiners — especially at tutorial-based institutions like Oxford and Cambridge — recognise the absence of a real intellectual voice. They've read enough student work to notice when something doesn't sound like thinking.
5. Appropriate epistemic humility
Paradoxically, First class essays are often more uncertain than 2:1 essays. They identify where arguments run out, where the evidence is genuinely ambiguous, and where the question resists a clean answer. ChatGPT tends toward false confidence — producing authoritative-sounding prose even on contested questions. This misrepresents the actual state of knowledge in a field, which examiners notice and mark down.
Beyond the quality ceiling, AI detection is improving rapidly. UK universities are increasingly using tools like Turnitin's AI detection alongside stylometric analysis. The risk-reward calculation for AI-written essays is getting worse every year.
The comparison table examiners don't tell you about
| Criterion | ChatGPT essay | Your own essay (improved) |
|---|---|---|
| Argument originality | Generic synthesis of existing positions | Your specific position, developed through your reading |
| Critical engagement | Surface acknowledgment of debate | Genuine engagement with the strongest objections |
| Course engagement | No knowledge of your specific course | Integrates your seminars, lectures, and reading |
| Intellectual voice | Fluent but characterless | Distinctive, develops over your degree |
| Academic integrity | Serious risk of detection | Zero risk |
| Long-term benefit | None — no skills developed | Skills compound across every subsequent essay |
What actually works
The students who consistently get Firsts are not the ones who write the most or read the most. They are the ones who get the most precise feedback on their own writing — and act on it.
Historically, this meant tutorials at Oxford or Cambridge, or a very engaged personal tutor. For most students at most universities, neither of these is available. Seminars are too large, office hours too brief, and peer feedback too inconsistent.
This is the gap FirstClass was built to fill. Not to write your essay — but to mark it the way a real examiner would, identify exactly where it falls short of a First, and show you what stronger writing looks like at the sentence level.
The difference between a tool that writes for you and a tool that makes you better at writing is the difference between a shortcut that caps out at a 2:1 and a process that actually builds toward a First.
Use AI to improve your thinking, not replace it. Ask it to steelman the counter-argument to your thesis. Ask it what's missing from your evidence. Use it to find sources — then verify them. These uses make you a better writer. Writing for you does the opposite.
The bottom line
ChatGPT will not get you a First. Not because it can't write — it can write well. But because First class essays reward the thing AI cannot fake: a real person's genuine intellectual engagement with a hard question, over time, in a specific academic context.
The students who get Firsts are the ones who do the thinking — and get better at it with each essay. That's the only path that works.