Why AI papers are hard — and how researchers actually read them

The volume is overwhelming. The notation is dense. The claims are often weaker than they sound. Here is how working researchers read sustainably without losing their minds.

25 minutes · Reading and reflection · No tools required

By the end of this lesson, you will:

Understand why most AI papers feel impenetrable on a first read — and why that is not a sign of weakness on your part.
Know the three-pass reading strategy that working researchers use to handle the volume of literature.
Be able to decide, in under a minute, whether a given paper is worth your time today.

The volume problem

In 2017, when the transformer paper was published, arXiv received roughly 100 AI-related papers per day. In 2026 that number is closer to 600. Nobody reads them all; even people close to the field do not read a tenth. The well-known researchers at the frontier of AI, whose names appear on the papers that matter, are not better at reading every paper. They are better at reading the right papers, and at reading them in the right way.

The first thing to internalise is that the question is never "have I read enough?" It is always "did I learn what I needed to from what I chose to read?" Researchers triage ruthlessly. So should you.

Why AI papers feel impenetrable

Three reasons, all separate, all addressable.

Notation density. An AI paper packs a lot of meaning into a small number of symbols. The same symbol — a Greek letter, a subscript — can mean different things in different papers. When you do not yet know the conventions, every equation looks like the next equation. After fifty papers, the conventions become invisible; you read the equation as you would read a sentence. Lesson 3 walks through the conventions.

Compression. A typical conference paper is eight pages of body text. The work behind it might be a year of a PhD student's life. The compression ratio is brutal. Everything is shortened: motivation is compressed into half a paragraph, results into a single table, ablations into an appendix you have to hunt for. Lesson 2 maps which sections do what, so you know where to look for what you need.

Tribal knowledge. A paper presupposes the reader knows the previous five papers in the area. If you are coming in cold, you will spend half your reading time on the related-work section and the citation graph. This is normal and unavoidable. Lesson 5 shows you how to navigate the field, not just the paper.

The three-pass reading strategy

The standard technique in computer-science research — popularised by Srinivasan Keshav's 2007 paper "How to Read a Paper" — is to read each paper in up to three passes, each with a different goal. You decide after each pass whether to continue or to stop. Most papers you only first-pass.

Pass 1 — The five-minute triage (always)

Read the title, the abstract, the introduction, the section headings, and the conclusion. Glance at the figures. Skim the references for names you recognise. That is the whole first pass. Five minutes, maximum.

Goal: answer five questions. What is this paper about? What is its central claim? Is the claim plausible? Have I read related work that this builds on? Is this worth a second pass?

For most papers, the answer to the fifth question is "no". You have learned what the paper is for and noted its claim, so you move on. You can read three hundred papers a year this way without burning out.

Pass 2 — The thirty-minute reading (sometimes)

If the first pass said "yes", you do a more careful read. Read the body of the paper in full, but ignore the heavy proofs and the dense mathematical derivations. Look at every figure carefully. Read every figure caption. Note any references you want to follow up on. Sketch — on paper, by hand — the high-level idea of the method in your own words.

Goal: be able to summarise the paper to a colleague in three minutes, including its main contribution, its key technique, and the experiments that support it. If you cannot do this after a second pass, you need a third pass — or the paper is poorly written.

Pass 3 — The deep read (rarely)

Reserved for the small number of papers you are going to build on, criticise, or teach from. Re-implement the method in pseudocode, mentally if not in real code. Challenge every claim. Check the experiments against the assumptions. Compare to the references. Take notes that you could turn into your own paper or your own teaching material.

Goal: know the paper well enough to defend its claims to a sceptic — or to refute them with evidence.
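
Read as a decision procedure, the three passes amount to a short series of stop-or-continue checks. The Python sketch below is only an illustration and is not needed for the lesson: the `Paper` class, its two fields, and `passes_to_run` are hypothetical names, and the sketch leaves out the case where a confusing second pass pushes you into a third.

```python
from __future__ import annotations

from dataclasses import dataclass

# Time budgets in minutes. Passes 1 and 2 come from the lesson text; the
# lesson gives no figure for pass 3, so it is left open here (None).
PASS_BUDGET_MINUTES = {1: 5, 2: 30, 3: None}


@dataclass
class Paper:
    """Hypothetical record of your own judgements about one paper."""
    title: str
    worth_second_pass: bool = False  # your answer to the fifth triage question
    will_build_on: bool = False      # you plan to build on, criticise, or teach from it


def passes_to_run(paper: Paper) -> list[int]:
    """Return the passes a paper gets, stopping as early as possible."""
    passes = [1]                     # pass 1 always happens
    if not paper.worth_second_pass:
        return passes                # most papers stop here
    passes.append(2)
    if paper.will_build_on:
        passes.append(3)             # reserved for the few papers you will use
    return passes


if __name__ == "__main__":
    skimmed = Paper("A paper that only sounded interesting")
    studied = Paper("A paper you will extend",
                    worth_second_pass=True, will_build_on=True)
    print(passes_to_run(skimmed))    # [1]
    print(passes_to_run(studied))    # [1, 2, 3]
```

The point of writing it this way is that every exit is early: a paper has to earn each further pass, and the default is to stop.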

Aside · The novice's mistake

The most common mistake new readers make is to do Pass 3 on every paper. They open arXiv, find a paper that sounds interesting, sit down, and try to understand every line. After two hours they have understood one paragraph and are exhausted. Working researchers triage: most papers get five minutes, a few get thirty, fewer still get a real read. The discipline is not to read deeply — it is to read deeply only on the right papers.

Choosing which papers to read

You will not solve this problem on your own. Researchers find papers through three filters, in roughly this order:

People they trust. If a researcher you respect tweets about, blogs about, or cites a paper, that is a strong signal. Build a small set of trusted voices: five to ten people whose taste you have come to rely on. Their reading list becomes your shortlist.

Venues and acceptance. Papers that have been peer-reviewed and accepted to NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR, or AAAI have passed at least one quality filter. arXiv-only preprints are useful for speed but unfiltered.

Citation context. When you are reading something, you naturally find references that are relevant. These are well-targeted recommendations from people who have already done the deep work. Follow them.

Notice that "arXiv front page" is not on the list. Reading whatever was uploaded today is the opposite of strategic reading. It will burn you out and teach you almost nothing.

Exercise — Run a five-minute triage (10 minutes)

Open arXiv. Go to arxiv.org/list/cs.LG/recent. Pick any one paper that catches your eye.
Set a timer for five minutes. In those five minutes: read the title, abstract, introduction, section headings, and conclusion. Glance at the figures.
Write down, on paper, three sentences: what this paper is about, what it claims, and whether you would do a second pass.
Now read your three sentences out loud. Are they specific? Could a colleague who had not read the paper get the gist from your sentences? If not, the issue is in the writing — yours or the paper's. Refine.

The reading practice

The most consistent thing about researchers who keep up with the field is not their reading speed. It is their reading consistency. They read most days, even briefly. Twenty minutes of triage per day across a year is over a hundred hours of paper reading — enough to know a sub-field. Two-hour binges twice a year are not.
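
The arithmetic behind that comparison is easy to check. The short Python sketch below uses only figures stated in this lesson (twenty minutes a day, two-hour binges twice a year, three hundred five-minute first passes); the function and variable names are illustrative.

```python
MINUTES_PER_HOUR = 60
DAYS_PER_YEAR = 365


def hours_per_year(minutes_per_day: float, days: int = DAYS_PER_YEAR) -> float:
    """Convert a daily reading habit in minutes into hours per year."""
    return minutes_per_day * days / MINUTES_PER_HOUR


# Twenty minutes of triage every day for a year.
daily_habit_hours = hours_per_year(20)

# Two-hour binges, twice a year.
binge_hours = 2 * 2

# Three hundred first passes at five minutes each.
first_pass_hours = 300 * 5 / MINUTES_PER_HOUR

print(f"Daily habit:  {daily_habit_hours:.0f} hours")  # Daily habit:  122 hours
print(f"Binges:       {binge_hours} hours")            # Binges:       4 hours
print(f"300 triages:  {first_pass_hours:.0f} hours")   # 300 triages:  25 hours
```

Note that three hundred first passes take only about a fifth of the yearly budget, which leaves most of the time for the second and third passes that a few papers deserve.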

Three suggestions if you want to start a sustainable practice. First, pick a time of day (mornings work for most people) and protect it. Second, pick a sub-field (alignment, vision-language models, reinforcement learning, mechanistic interpretability, whatever) and concentrate on it. Reading across all of AI is a recipe for shallow understanding. Third, write about what you read, even if it is only one paragraph after each pass. Your notes are how the knowledge sticks.

Self-check

What are the three reasons AI papers feel impenetrable?
What is the goal of the first pass, and how long should it take?
When would you do a third pass on a paper?
Why is "browse the arXiv front page" a bad reading strategy?

Looking ahead

Lesson 2 maps the standard sections of a modern AI paper — abstract, introduction, related work, method, experiments, ablations, appendix — and tells you what each is actually for, so you know where to look for what you need.
