What the Stanford-Binet actually measures — and why it still matters

The Stanford-Binet is one of the most cited tests in psychology, yet most people who search for it have no idea what it actually measures. Here is what the five-factor model does, and why it still matters.

Vintage Stanford-Binet IQ Test Scale

A test by any other name

Search for “sb test” and you will mostly find landing pages that offer to give you a score. What you will not find, at least not easily, is a clear explanation of what the Stanford-Binet Intelligence Scales actually measure, why the test is structured the way it is, and what the results can and cannot tell you. That gap is worth filling.

The Stanford-Binet has been through five major editions since Lewis Terman published his Stanford revision of Alfred Binet and Théodore Simon’s 1905 scale in 1916. The current version, the SB5 (published by Riverside Publishing in 2003), is a substantial departure from earlier editions rather than a refresh of the same thing. The factor model was redesigned, the norming sample was updated to 4,800 participants stratified to match U.S. Census data, and the age range was extended to cover individuals from two years old through 85 and older. Understanding what the SB5 actually measures requires engaging with that factor model directly.

The five-factor model

The SB5 organises cognitive ability into five broad factors, each assessed across two domains: verbal and nonverbal. That gives ten subtests in total.

Fluid Reasoning is the capacity to solve novel problems without relying on previously learned procedures. A classic fluid task asks you to identify the next item in a pattern you have never seen before. Fluid reasoning is the factor most closely associated with what psychologists call g, the general factor of intelligence that sits beneath all the specific abilities. Factor-analytic research by John Carroll (1993) and others consistently finds that fluid reasoning loads more heavily on g than any other single cognitive domain.

Knowledge captures the breadth of information someone has accumulated over a lifetime, sometimes called crystallised intelligence in the Cattell-Horn-Carroll (CHC) framework. Vocabulary, general information, and verbal comprehension tasks fall here. Knowledge scores are sensitive to educational opportunity in a way that fluid reasoning scores are not, which is one reason the SB5 reports them separately rather than collapsing everything into a single number.

Quantitative Reasoning measures the ability to solve numerical and mathematical problems. This is not arithmetic drill; it includes the kind of relational reasoning that shows up in algebra and logic puzzles. Quantitative reasoning is a strong predictor of academic achievement in STEM fields, and the SB5’s separation of this factor from general fluid reasoning reflects decades of research showing it has incremental predictive validity over and above g alone.

Visual-Spatial Processing assesses the ability to perceive, analyse, and mentally manipulate visual forms. Folding and rotation tasks, pattern analysis, and form-board problems all tap this factor. Visual-spatial ability is meaningfully distinct from verbal ability even at the level of brain activation, and its inclusion in the SB5 reflects the CHC framework’s insistence that a comprehensive ability battery should not be dominated by verbal tasks.

Working Memory is the capacity to hold information in mind while simultaneously processing or transforming it. It is the factor most sensitive to attentional control and executive function. Working memory deficits are associated with ADHD, learning disabilities, and acquired brain injuries, which is part of why clinicians find the SB5 useful for diagnostic purposes beyond simple IQ estimation.

Each of the ten subtests produces a scaled score (mean of 10, standard deviation of 3). Those scores combine into five factor index scores and a Full Scale IQ (FSIQ), each reported on the familiar scale with a mean of 100 and a standard deviation of 15. But the factor scores are often more informative than the composite, because a flat profile and a jagged profile can produce the same FSIQ while telling very different stories about a person’s cognitive architecture.
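To make the two score scales and the flat-versus-jagged point concrete, here is a minimal Python sketch. It assumes scores are normally distributed, and both profiles are invented for illustration. The function names are hypothetical, and the real conversion from subtest scores to an FSIQ goes through the publisher's norm tables, which are not reproduced here; equal sums simply stand in for equal composites.

```python
from statistics import NormalDist


def to_percentile(score, scale_mean, scale_sd):
    """Percentile rank of a score, assuming a normal distribution."""
    z = (score - scale_mean) / scale_sd
    return 100 * NormalDist().cdf(z)


def scatter(profile):
    """Range between the highest and lowest score in a profile."""
    return max(profile) - min(profile)


# Two hypothetical sets of ten subtest scores on the mean-10, SD-3 scale.
flat_profile = [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
jagged_profile = [15, 14, 6, 5, 13, 12, 7, 8, 10, 10]

if __name__ == "__main__":
    # A 13 on the 10/3 scale and a 115 on the 100/15 scale are both one
    # standard deviation above the mean, so they share a percentile rank.
    print(round(to_percentile(13, 10, 3), 1))
    print(round(to_percentile(115, 100, 15), 1))

    # Same total, very different spread.
    print(sum(flat_profile), scatter(flat_profile))
    print(sum(jagged_profile), scatter(jagged_profile))
```

Both profiles sum to the same total, yet one has no spread at all and the other ranges across ten scaled-score points, which is the kind of difference a single composite hides.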

Why the nonverbal/verbal split matters

One structural feature of the SB5 that often goes unmentioned in popular coverage is the deliberate pairing of verbal and nonverbal versions of each factor. Every factor is measured twice: once through tasks that require language, and once through tasks that do not.

This matters for several practical reasons. It reduces the test’s cultural and linguistic loading, making it more useful for individuals who are deaf, have limited English proficiency, or come from educational backgrounds that differ from the norming sample. It also allows clinicians to detect discrepancies between verbal and nonverbal performance within a single factor, which can be diagnostically informative in ways that a single composite score would obscure.

The verbal/nonverbal distinction is not unique to the SB5. The Wechsler scales have used a similar architecture for decades. But the SB5’s implementation is notable because it applies the split consistently across all five factors rather than treating it as an afterthought.

What the SB5 predicts, and how well

The predictive validity of the SB5 is well-documented. Full Scale IQ scores correlate with academic achievement at around r = 0.50 to 0.70 across multiple studies, depending on the outcome measure and the age of the sample (Roid, 2003). That is a large effect by the standards of social science, where correlations above 0.30 are often considered practically meaningful.
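One way to read those correlations is to square them, which gives the share of variance two measures have in common. The short Python sketch below does only that arithmetic; the function name is hypothetical, and shared variance is one common interpretation of r rather than the only one.

```python
def shared_variance(r):
    """Proportion of variance two measures share, given their correlation r."""
    return r ** 2


if __name__ == "__main__":
    for r in (0.30, 0.50, 0.70):
        print(f"r = {r:.2f}: about {shared_variance(r):.0%} of variance shared")
```

On this reading, the 0.50 to 0.70 range corresponds to roughly a quarter to a half of the variance in achievement, compared with about 9 percent at the 0.30 threshold.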

Fluid reasoning and working memory show the strongest correlations with novel learning tasks. Knowledge and quantitative reasoning are stronger predictors of grade-point average and standardised academic tests, probably because both outcomes reward accumulated information and practiced numerical skill.

The SB5 is also used in clinical contexts where the stakes are high: giftedness identification, intellectual disability diagnosis, learning disability assessment, and neuropsychological evaluation following brain injury. In those settings, the full factor profile matters more than the composite, and the test is always administered by a trained psychologist who can contextualise the scores against background history, observation, and other assessment data.

None of this means the SB5 is a perfect instrument. No test is. The norming sample, while large by psychometric standards, was collected in 2001 and 2002. The Flynn effect (the well-documented secular rise in IQ scores of roughly 3 points per decade) means that older norms systematically overestimate a person’s standing relative to the current population. The SB5’s norms are now more than two decades old, which is a legitimate limitation that clinicians account for in practice.
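The size of that drift can be estimated with simple arithmetic. The Python sketch below assumes a constant rise of 3 points per decade (the rough figure quoted above) and takes 2001.5 as the midpoint of the norming window. Both are simplifying assumptions, the function names are hypothetical, and whether and how to adjust an individual score is a clinical judgment rather than a formula.

```python
FLYNN_POINTS_PER_DECADE = 3.0  # rough figure quoted in the text, assumed constant
NORMING_YEAR = 2001.5          # assumed midpoint of the 2001-2002 norming window


def norm_inflation(test_year, norm_year=NORMING_YEAR, rate=FLYNN_POINTS_PER_DECADE):
    """Estimated points by which ageing norms inflate a score (linear assumption)."""
    years_elapsed = max(0.0, test_year - norm_year)
    return rate * years_elapsed / 10.0


def flynn_adjusted(observed_score, test_year):
    """Observed score minus the estimated inflation; illustrative only."""
    return observed_score - norm_inflation(test_year)


if __name__ == "__main__":
    for year in (2005, 2015, 2025):
        inflation = norm_inflation(year)
        print(year, round(inflation, 2), round(flynn_adjusted(100, year), 2))
```

Under these assumptions, a test taken in 2025 sits about 23.5 years past the norming midpoint, which works out to roughly 7 points of estimated inflation.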

What online tests are and are not

It is worth being direct about what an online self-administered assessment is, relative to a clinical SB5 administration.

A clinical SB5 is administered one-on-one by a licensed psychologist, typically takes about 45 to 75 minutes for the full battery (the broader evaluation around it can run to several hours), and produces a detailed report with factor scores, confidence intervals, and clinical interpretation. The examiner can observe behaviour, note off-task responses, and adjust administration based on the examinee’s presentation. The result is a legally and clinically defensible document.

An online assessment, including the one available on this site, is a different kind of tool. It can give you a reasonable estimate of where your cognitive abilities fall relative to a reference population, and it can surface the kinds of reasoning tasks the SB5 uses. But it cannot replicate the standardised administration conditions, the one-on-one examiner relationship, or the clinical interpretation that makes the SB5 useful for high-stakes decisions. If you need a formal evaluation for a school placement, a disability accommodation, or a legal proceeding, you need a licensed psychologist and a full clinical battery.

For most people, though, the goal is simpler: curiosity about where they stand, a rough benchmark for cognitive strengths and weaknesses, or just the experience of engaging with the kinds of problems the SB5 uses. For that purpose, a well-constructed online assessment is a reasonable starting point.

Why the SB5 still matters in 2025

The Stanford-Binet has survived five editions and more than a century of competition from other intelligence batteries because it has continued to evolve with the science. The CHC framework that underlies the SB5 is the most empirically supported hierarchical model of cognitive ability available, and the SB5’s factor structure reflects that framework more faithfully than most competing instruments.

That does not make it the only valid test. The Wechsler Adult Intelligence Scale (WAIS-IV) and the Woodcock-Johnson IV are both well-validated alternatives with their own strengths. Researchers studying specific populations sometimes prefer narrower batteries. But the SB5 remains the instrument most directly connected to the original Binet-Simon tradition, and its combination of age range, factor coverage, and clinical utility keeps it in active use in schools, hospitals, and private practices.

Understanding what it measures, and why those five factors were chosen, is not just academic trivia. It is the foundation for interpreting any IQ score intelligently, whether that score comes from a clinical evaluation or from an online assessment you took on a Tuesday afternoon out of curiosity.

If you want to engage with the kinds of reasoning tasks the Stanford-Binet tradition uses, try our free online cognitive assessment and see how you approach fluid reasoning, pattern recognition, and working memory problems in a self-paced format.

Frequently asked questions

What does SB test stand for?

SB test is shorthand for the Stanford-Binet Intelligence Scales. The name combines Stanford University, where Lewis Terman revised Binet's original French scale in 1916, and Alfred Binet, the French psychologist who created the first version in 1905.

How is the SB5 different from earlier editions?

The fifth edition (2003) introduced a fully redesigned factor model based on the Cattell-Horn-Carroll framework, added a consistent verbal/nonverbal split across all five factors, and extended the age range to two years through adulthood. Earlier editions used fewer factors and did not apply the nonverbal split as systematically.

Can an online IQ test give me a valid Stanford-Binet score?

No online test produces an official SB5 score. A valid SB5 score requires one-on-one administration by a licensed psychologist under standardised conditions. Online assessments can give a reasonable estimate of cognitive standing and expose you to similar reasoning tasks, but the results are not clinically equivalent.

Why does the SB5 report factor scores separately instead of just one number?

Because two people can have the same Full Scale IQ with very different profiles across the five factors. A person with high fluid reasoning and low working memory needs different support than someone with the reverse pattern, even if their composites match. The factor scores carry diagnostic information that the composite obscures.

Sources

  1. Gale H. Roid (2003). Stanford-Binet Intelligence Scales, Fifth Edition: Technical Manual.
  2. John B. Carroll (1993). Human Cognitive Abilities: A Survey of Factor-Analytic Studies.
  3. Kevin S. McGrew (2009). The Cattell-Horn-Carroll Model of Intelligence.