Reliability & validity

How accurate is the online Stanford-Binet?

Test-retest reliability, the standard error of measurement, and the honest psychometric answer.

Binet at work with a child and psychophysical instruments, c. 1900.

This is one of the three questions every careful buyer asks before paying for an online IQ test. Most online tests answer it with hand-waving. Here is the honest answer, with the actual numbers.

The short version

For an adult test-taker who follows the instructions and is not impaired by sleep, stress, or external distraction, the Full-Scale IQ-equivalent the Stanford-Binet Online returns will typically fall within 5 points of what the same person would score on the same test on a different day. That is the natural test-retest range of every well-designed IQ instrument — there is no shortcut around it, regardless of who administers the test.

For some test-takers the gap will be larger. A noisy environment, an unfamiliar interface, or anxiety about being timed all shift scores downwards. The points below are the levers that move the result most.

What pulls a score up or down on the day

  • Quiet room, focused, well-rested. Working memory in particular peaks here. Most readers see their highest stable score in good conditions.
  • Tired, distracted, or rushed. Working memory drops first, then Innate Intelligence. Knowledge and logical-mathematical intelligence are more stable across states.
  • Familiarity with the item types. The free five-question sample lets you preview the format so the real test isn’t the first time you see it.
  • An off day. Sleep debt, a cold, an emotional event the day before: all real, all common. Re-take in the 14-day window if the day wasn’t fair to your usual self.

What “accuracy” actually means in psychometrics

There is no “true” IQ in the way there is a true height or weight. There is only a measurement, with a known range of variation. Two concepts do most of the work:

  • Reliability: would the same person score similarly if they took the test again? The Stanford-Binet Online aims for test-retest reliability in the ~0.90 range over six months, the industry benchmark for a calibrated instrument. At that reliability, the standard error of measurement on a mean-100, SD-15 scale is roughly ±5 points.
  • Validity: does the score predict what it’s supposed to predict? The Stanford-Binet has been studied for over a century, and its modern form uses a five-factor structure. The factors correspond to distinguishable cognitive abilities, replicable across populations, with predictive validity for academic and occupational outcomes that has held up under repeated meta-analysis.

What you get in the report

  • The five-factor profile. Where you’re strong, where you’re average, where you’re weak. Every item is written to load on one of the five factors of the modern Stanford-Binet, so the breakdown is meaningful at the factor level.
  • A Full-Scale IQ-equivalent. Calibrated to the standard population norms (mean 100, SD 15, range 40–160).
  • Age-banded percentile. Your percentile is computed against your own age cohort (5–80), not the general population, so the read is fair at any life stage.
  • A score that compares cleanly. A 120 here means the same thing — 91st percentile, “Superior” range — as a 120 reported anywhere else calibrated to the same conventions.
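The score convention above (mean 100, SD 15) is enough to turn any score into a percentile. Here is a minimal sketch, assuming scores follow a normal distribution. The function names are illustrative only, and the age-banded norms the report uses are not modelled here.

```python
from statistics import NormalDist

# Score convention stated in the text: mean 100, SD 15.
# Assumption: scores are normally distributed in the reference population.
IQ_SCALE = NormalDist(mu=100, sigma=15)


def iq_to_percentile(iq: float) -> float:
    """Percentage of the reference population scoring at or below `iq`."""
    return 100 * IQ_SCALE.cdf(iq)


def percentile_to_iq(percentile: float) -> float:
    """Score that sits at the given percentile (0 < percentile < 100)."""
    return IQ_SCALE.inv_cdf(percentile / 100)


if __name__ == "__main__":
    print(round(iq_to_percentile(120)))  # 91
    print(round(iq_to_percentile(100)))  # 50
```

A 120 lands at the 91st percentile under this convention, which is why a score calibrated the same way compares cleanly across instruments.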

If you take the test in good conditions — quiet room, decent sleep, focused — the result will be a credible read on your cognitive profile. The factor breakdown is where most readers learn the most.

Frequently asked questions about accuracy

What is the standard error of measurement on the Stanford-Binet Online?

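About ±5 points. With a reliability of ~0.90 and an SD of 15, the standard error of measurement is 15 × √(1 − 0.90) ≈ 4.7 points. A band of ±1 SEM around a person’s true score covers roughly 68% of the scores they would obtain across sittings in similar conditions; a 90% band is wider, at about ±8 points. This is the same standard-error band any well-designed IQ instrument with comparable reliability shows.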
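For readers who want to check the arithmetic, here is a rough sketch using the classical test theory formula SEM = SD × √(1 − reliability). It assumes the targeted reliability of 0.90, normally distributed and independent measurement errors, and no practice effect between sittings. The function names are illustrative and not part of any published scoring procedure.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)


def band_half_width(sem: float, confidence: float) -> float:
    """Half-width of a two-sided band around a true score, e.g. 0.90 -> about 1.645 * SEM."""
    z = STANDARD_NORMAL.inv_cdf(0.5 + confidence / 2)
    return z * sem


def share_of_retakes_within(points: float, sem: float) -> float:
    """Share of retakes landing within +/- `points` of the previous score.

    Assumption: both sittings carry independent errors of the same size,
    so the standard error of the difference is SEM * sqrt(2).
    """
    se_difference = sem * sqrt(2)
    return 2 * STANDARD_NORMAL.cdf(points / se_difference) - 1


if __name__ == "__main__":
    sem = standard_error_of_measurement(sd=15, reliability=0.90)
    print(round(sem, 2))                         # 4.74
    print(round(band_half_width(sem, 0.68), 1))  # 4.7
    print(round(band_half_width(sem, 0.90), 1))  # 7.8
    print(share_of_retakes_within(5, sem))       # a little over half
```

Note that comparing two observed scores is noisier than comparing one score with a true score, because both sittings contribute error. That is why the retake calculation uses SEM × √2.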

How is reliability measured?

Two ways. Internal-consistency reliability looks at whether items measuring the same factor agree with each other (typically reported as Cronbach’s alpha). Test-retest reliability looks at whether the same person scores similarly across sittings. Both should sit in the 0.85–0.95 range for a calibrated instrument. The Stanford-Binet Online targets ~0.90.

Can I improve my accuracy by preparing?

Modestly, and not in the way most people think. Preparation that genuinely helps: sleep, hydration, and previewing the item types so the format isn’t a surprise. Preparation that doesn’t: rote memorisation of practice questions (the test rotates items adaptively, so repetition has limited transfer).

What if my score on the day was clearly off?

Use the 14-day retake window. The most common reasons retakes shift more than ±5 points: poor sleep the night before, an unfamiliar interface that wasted attention, or strong emotion bleeding into the session. None of these are character flaws of the test or of you — they’re state effects every IQ instrument shows.

How does the Stanford-Binet Online compare to other online tests?

It uses a published cognitive model (the five-factor Stanford-Binet structure) and standard score conventions (mean 100, SD 15) — both signs of a calibrated instrument. Most free quiz sites do neither. See the side-by-side comparison.

Is online IQ testing as accurate as testing in a lab?

For self-understanding, career thinking, and tracking how your profile changes over time — yes. The factors that matter most for accuracy on any IQ test (item quality, normative sample, age-banding, score conventions) are properties of the test, not of the room you take it in.

Curious where you score, and what your factor profile looks like?

Take the Stanford-Binet Online
35 to 45 minutes · Full-Scale IQ + five factor indices · From $49