The Story of Traditional IQ Tests (GMAs), Their Subtests and a New Alternative

Jiaying Law

People Scientist

We all want “smart” people on our teams because we believe they will be competent and learn quickly. But how can we actually tell whether the person we are about to hire is truly “smart”?

One of the most popular ways to measure “smartness” is the traditional IQ test, also known as the General Mental Ability (GMA) test. Even though it has strong predictive power for job performance, it struggles to keep up with the demands of today’s workplace. That’s why some experts suggest that tests focusing on how people think, rather than on what they know, are a better option for hiring.

In this article, we will discuss:

  • A deep dive into general mental ability tests (IQ tests)
  • Does interpreting people’s intelligence from indexes and subtests make sense?
  • The alternative: assessments based on cognitive processes

A history of GMA tests

Nearly 100 years ago, the first version of the intelligence test was created. Individuals who could read newspapers or speak English were given verbal and quantitative tests, in which they answered questions about general information, common sense, and verbal knowledge (Figure 1; Naglieri, 2015). Those who could not read or speak English were assigned nonverbal tests, in which they completed mazes, created designs with blocks, identified what was missing in a picture, and more (Figure 2). At that time, intelligence tests were not based on a clear definition or theory, but rather on the all-around ability to navigate everyday life (Pintner, 1923; Naglieri, 2020).

In 1993, the ideas of three psychologists about intelligence were systematically combined into what’s known as the Cattell-Horn-Carroll theory (CHC theory; Carroll, 1993). According to this theory, general intelligence (called the g factor) is made up of different broad cognitive abilities, which in turn consist of more specific (narrow) cognitive skills (see Figure 3).

Figure 3: Carroll’s model of cognitive abilities (Carroll, 1993)

To understand this idea, consider the language we use every day.

How would you know if someone is good at English? Most of us would judge it by looking at their reading, speaking, writing and listening skills (broad categories). Looking more closely, we would seek clues in more specific elements (narrow skills): fluency, tone and pace when they speak; grammar, sentence structure and vocabulary in their writing.

Most intelligence tests follow this approach. By combining scores from subtests that measure narrow cognitive skills, several scores for broad cognitive abilities (usually known as indexes) are determined, which in turn form a total score for general intelligence.
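The hierarchical scoring just described can be sketched in a few lines of Python. The subtest names, the index groupings and the plain averaging used here are purely illustrative; real GMA tests convert raw scores into normed scaled scores and use standardisation tables rather than simple means.

```python
# Illustrative only: hypothetical subtests mapped to hypothetical indexes.
SUBTEST_TO_INDEX = {
    "vocabulary": "verbal_comprehension",
    "similarities": "verbal_comprehension",
    "block_design": "perceptual_reasoning",
    "matrix_reasoning": "perceptual_reasoning",
}

def score(subtest_scores):
    """Combine narrow subtest scores into broad index scores,
    then combine the index scores into one full-scale score."""
    grouped = {}
    for subtest, index in SUBTEST_TO_INDEX.items():
        grouped.setdefault(index, []).append(subtest_scores[subtest])
    index_scores = {name: sum(v) / len(v) for name, v in grouped.items()}
    full_scale = sum(index_scores.values()) / len(index_scores)
    return index_scores, full_scale

index_scores, full_scale = score({
    "vocabulary": 12, "similarities": 10,
    "block_design": 8, "matrix_reasoning": 14,
})
```

Note how information flows only upward: a narrow-skill score is visible at the index level, but by the time it reaches the full-scale score it has been averaged away, which foreshadows the interpretation problems discussed next.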

Does this structure actually work?

Let’s look at this more deeply on two levels: broad and narrow cognitive abilities.

Does interpreting intelligence at a broad-cognitive-abilities level make sense?

Let’s first address the problem of an unequal number of broad cognitive abilities across different intelligence tests.

Since intelligence tests were not initially built on a clear definition, the exact number of broad cognitive abilities a GMA test should include remains a grey area. This means test publishers are free to decide how many broad cognitive abilities to include in their tests.

For instance, the Wechsler Adult Intelligence Scale-Fourth Edition (WAIS-IV) includes index scores for 4 broad cognitive categories (Figure 4), while the Stanford-Binet Fifth Edition (SB-5) includes 5 index scores to represent different broad cognitive categories (Figure 5).

In other words, if a test publisher uses the CHC intelligence model mentioned earlier as the basis for a GMA test, they could potentially include up to 8 broad cognitive categories.

Figure 4: Wechsler Adult Intelligence Scale–Fourth Edition (Wechsler, 2008)

Figure 5: Stanford-Binet Intelligence Scales (5th ed.) (Roid, 2003)

Researchers have dedicated significant effort to figuring out how many broad cognitive abilities should be included in a GMA test (e.g., Canivez & Watkins, 2010; Benson et al., 2018; Canivez et al., 2017; Golay & Lecerf, 2011; Canivez et al., 2016). However, there is little evidence supporting GMA tests with more than 4 broad cognitive abilities. This suggests that using such tests to assess intelligence, especially when making critical decisions, could be problematic.

Imagine we are selecting a candidate for a high-stakes role, such as a senior executive position. We have determined 4 essential abilities a senior executive should possess to succeed in this role: strategic thinking, leadership, financial acumen and communication. Now, if we try to assess the candidate on additional factors like punctuality, handshake style or culture fit, we might dilute the focus on what really matters for the role. Worse, these extra factors may not provide any significant additional insight into their suitability for the job, just as adding more than 4 broad cognitive abilities to a GMA test might not enhance your understanding of a person’s intelligence.

The problem with overestimating one’s intelligence

Although some studies support a 4-index structure in GMA tests, there is concern about what these indexes truly represent. Most index scores are primarily influenced by general intelligence (the g factor), while the broad cognitive abilities they are supposed to measure make up only a small part of the scores. This means that, even if a test has only 4 indexes, interpreting index scores alone is not the best approach for making important decisions (Canivez et al., 2016; McGill et al., 2018).

Consider a time when you went to a restaurant and had four different dishes: a soup, a salad, a main course, and a dessert. How would you judge the chefโ€™s cooking skills? If the chef is excellent overall, they might do well on all four dishes. However, if the chef has a specific talent for making a perfect soup or salad, would this significantly change your overall impression of their cooking ability?

In the same manner, even if a GMA test uses 4 indexes, focusing solely on those indexes without considering the broader context might not be the best way to make important decisions about someoneโ€™s intelligence.

How about at a narrow cognitive skills level?

Let’s think about it: what would you do to improve your English proficiency? Typically, we might start by identifying our weaknesses, such as vocabulary, and then work on improving those areas in the expectation that this will boost our overall English skills.

Sadly, this isn’t always the case.

Consider taking an English proficiency test with various subtests like grammar, vocabulary, and comprehension. You might think the vocabulary test is specifically designed to show how strong someone’s vocabulary is, right? But in reality, if a person scores well on the vocabulary test, it’s likely because they have strong overall English skills, not just because they memorised a ton of words from a dictionary. What’s more, there’s a high chance they’ll score well on the other subtests, too!

In other words, the specific skills each subtest is designed to measure (e.g., grammar, comprehension) contribute less to the overall English score than you might expect, just as narrow cognitive abilities contribute less to total intelligence test scores.

Studies on the structure of intelligence show that most subtest scores are largely determined by general intelligence (g factor; Benson et al., 2018; Canivez & Watkins, 2010). Specifically, a personโ€™s general intelligence explains a significant portion of the subtest scores (with a common variance of 63%; Golay & Lecerf, 2011), while the narrow cognitive abilities that each subtest is supposed to measure contribute much less (with variances of 4.7% to 15.9%).
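These percentages can be read as a simple variance decomposition: for any subtest, the share of score variance explained by g, the share explained by its narrow ability, and the leftover (subtest specificity plus measurement error) must sum to 100%. A minimal sketch, using the magnitudes reported above purely as illustrative inputs:

```python
def decompose_variance(g_share, narrow_share):
    """Split a subtest's score variance into the portion explained by
    general intelligence (g), the portion explained by its narrow
    ability, and the residual (specificity plus measurement error)."""
    residual = 1.0 - g_share - narrow_share
    return {"g": g_share, "narrow": narrow_share, "residual": residual}

# Illustrative: 63% explained by g, 10% by the narrow ability
# (within the 4.7%-15.9% range reported above).
parts = decompose_variance(g_share=0.63, narrow_share=0.10)
```

Seen this way, a strong vocabulary subtest score mostly reflects g rather than vocabulary specifically, which is why interpreting subtest scores in isolation is discouraged.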


In short,

Interpreting intelligence based on broad and narrow cognitive categories can be misleading, as it may cause us to overestimate or underestimate someoneโ€™s capabilities. It is recommended to treat the results from GMA tests as a whole, rather than interpreting index or subtest scores individually.

Instead, try cognitive-processes-based assessments!

Thanks to advancements in neuroscience, we are now able to directly observe brain activity during cognitive-related tasks. This has provided a new perspective and a strong theoretical basis for understanding intelligence more concretely, through cognitive-processes-based assessments.

Cognitive-processes-based assessments focus on evaluating executive functions: what goes on in our minds when we handle daily tasks or solve problems (van Aken et al., 2019). For instance, if we want to assess someone’s problem-solving ability, we could choose a test designed to activate the same brain areas that are active during problem solving.

This approach offers an opportunity to shift from content-focused tests to cognitive-process-focused tests for measuring intelligence. Instead of measuring how much knowledge a person can bring to bear on the test content, we focus on cognitive processes that are typically beyond our awareness and more closely related to performance criteria.

Naglieri and colleagues (2016), for example, introduced the Cognitive Assessment System (CAS) as an alternative to traditional GMAs. It measures the ability to perform complex decision-making, to focus and resist distractions, to understand inter-relationships, and to grasp the underlying rules of sequencing.

An additional benefit of cognitive assessments is fairness (Holden & Tanenbaum, 2023). Given the minimal involvement of hard knowledge and verbal content, cognitive-processes-based assessments generally show smaller score differences between racial groups than traditional GMA tests (Naglieri & Otero, 2018). This suggests that cognitive-processes-based assessments can offer a more equitable way to assess intelligence across diverse populations.


All in all,

Given the long history of intelligence testing, considerable efforts have been made by researchers and test publishers to systematically define intelligence. It is recommended to view IQ test scores as a whole rather than relying solely on individual index or subtest scores, as the latter may lead to overestimating or underestimating someone’s aptitudes. Be cautious when using intelligence tests with more than 4 indexes for important decisions, as the structure of these tests is not (yet) supported by research. Instead, consider using modern IQ tests that focus on cognitive processes, with the added benefit of reduced score differences between racial groups.

References

Benson, N. F., Beaujean, A. A., McGill, R. J., & Dombrowski, S. C. (2018). Revisiting Carroll’s survey of factor-analytic studies: Implications for the clinical assessment of intelligence. Psychological Assessment, 30(8), 1038. https://doi.org/10.1037/pas0000652

Canivez, G. L., Watkins, M. W., & Dombrowski, S. C. (2017). Structural validity of the Wechsler Intelligence Scale for Children–Fifth Edition: Confirmatory factor analyses with the 16 primary and secondary subtests. Psychological Assessment, 29(4), 458. https://doi.org/10.1037/pas0000358

Canivez, G. L., Watkins, M. W., & Dombrowski, S. C. (2016). Factor structure of the Wechsler Intelligence Scale for Children–Fifth Edition: Exploratory factor analyses with the 16 primary and secondary subtests. Psychological Assessment, 28(8), 975. https://doi.org/10.1037/pas0000238

Canivez, G. L., & Watkins, M. W. (2010). Investigation of the factor structure of the Wechsler Adult Intelligence Scale–Fourth Edition (WAIS–IV): Exploratory and higher order factor analyses. Psychological Assessment, 22(4), 827. https://doi.org/10.1037/a0020429

Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. Cambridge University Press.

Golay, P., & Lecerf, T. (2011). Orthogonal higher order structure and confirmatory factor analysis of the French Wechsler Adult Intelligence Scale (WAIS-III). Psychological Assessment, 23(1), 143. https://doi.org/10.1037/a0021230

Holden, L. R., & Tanenbaum, G. J. (2023). Modern assessments of intelligence must be fair and equitable. Journal of Intelligence, 11(6), 126. https://doi.org/10.3390/jintelligence11060126

McGill, R. J., Dombrowski, S. C., & Canivez, G. L. (2018). Cognitive profile analysis in school psychology: History, issues, and continued concerns. Journal of School Psychology, 71, 108–121. https://doi.org/10.1016/j.jsp.2018.10.007

Naglieri, J. A. (2020, April). 100 years of intelligence tests: We can do better. Retrieved August 1, 2024, from https://www.apadivisions.org/division-5/publications/score/2020/04/intelligence-tests

Naglieri, J. A., & Otero, T. M. (2018). Redefining intelligence with the planning, attention, simultaneous, and successive theory of neurocognitive processes. In D. P. Flanagan & E. M. McDonough (Eds.), Contemporary intellectual assessment: Theories, tests, and issues (4th ed., pp. 195–218). The Guilford Press.

Naglieri, J. A. (2015). Hundred years of intelligence testing: Moving from traditional IQ to second-generation intelligence tests. In Handbook of intelligence: Evolutionary theory, historical perspective, and current concepts (pp. 295–316). https://doi.org/10.1007/978-1-4939-1562-0_20

Pintner, R. (1923). Intelligence testing. Henry Holt.

Roid, G. H. (2003). Stanford-Binet Intelligence Scales (5th ed.). Riverside.

Van Aken, L., van der Heijden, P. T., Oomens, W., Kessels, R. P., & Egger, J. I. (2019). Predictive value of traditional measures of executive function on broad abilities of the Cattell–Horn–Carroll theory of cognitive abilities. Assessment, 26, 1375–1385. https://doi.org/10.1177/1073191117731814

Wechsler, D. (2008). Wechsler Adult Intelligence Scale–Fourth Edition (WAIS–IV) [Database record]. APA PsycTests. https://doi.org/10.1037/t15169-000

Williams, T. H., McIntosh, D. E., Dixon, F., Newton, J. H., & Youman, E. (2010). A confirmatory factor analysis of the Stanford–Binet Intelligence Scales, with a high-achieving sample. Psychology in the Schools, 47(10), 1071–1083. https://doi.org/10.1002/pits.20525
