Yesterday, Andrew Sullivan linked to my piece on the Canadian polygamy ruling. Unfortunately, around that, he’s spent the last three days talking about how difficult the world is for those researchers who study race-based IQ differences and how we need to look into this more. He knows this because the people who agree that there is something there say the political climate is stifling.

Now, as I’ve noted before, there are plenty of people who study this link, plenty of people who talk about it in various forms of media, and plenty of people who cry politics when factors other than genetics are pointed to as major factors in IQ test differences. All it really takes to know this is to pay attention to both sides of this debate.

The problem is that this can be difficult to do. The racially based IQ story is simple and plugs into our just-world biases. The actual picture involves dealing with complex data and statistical analyses. In order to give an overview of the general data on IQ and heritability, I put together this bibliography a couple of years ago. It was originally posted here.

Apropos of the continuing tendency for white supremacists to show up crowing about IQ, here is some reading that may help people understand the history of IQ testing and its relationship to the complex phenomena that are lumped under the term “intelligence.”

IQ Tests: Do They Measure Intelligence?
A quick overview of the topic in lay terms.

Stalking the Wild Taboo–Intelligence: Knowns and Unknowns
The comprehensive report of a task force established by the Board of Scientific Affairs of the American Psychological Association in response to The Bell Curve. Includes references.

Never a Dull Moment
A follow-up to the above report, addressing critiques of the report. Presents additional references, including a critique of Rushton’s work.

IQ tests: Throwing out the bathwater, saving the baby
An argument for a very limited use of IQ tests in educational assessment, with a clear discussion of their limitations.

Alfred Binet
A biography of the psychologist, including a discussion of his development of a scale of activities to measure “mental age.”

Lewis Madison Terman
A biography of the psychologist, including a brief history of the development of the Stanford-Binet test.

Wechsler Adult Intelligence Scale
An overview of the subtests and scales involved in the most commonly given IQ test for adults.

The Construct Validity of IQ Tests–A Comprehensive Psychometric Meta-Analysis
A meta-analysis designed to determine how many types of intelligence the WAIS is measuring.

Individually Administered Intelligence Tests–The Testing Process
A sample of the variety of intelligence tests offered.

“Reliability” and “validity” of IQ tests
A discussion of the different types of validity required of scientific tests and how well those requirements have been met in IQ testing.

SoYouWanna score higher on an IQ test?
Not any sort of definitive site. However, it lists strategies for practicing to the test, which does have an effect on even tests that are supposed to measure innate, unchanging qualities.

Excerpt: ‘IQ: A Smart History of a Failed Idea’
Coaching a child to perform well on an IQ test in order to get into a prestigious private school.

Is IQ actually AQ? (Mistaking Achievement for “Intelligence”)
A discussion of what is measured by IQ tests.

Stereotype threat
How the knowledge of low expectations can lead to lowered IQ scores.

The Chitling Intelligence Test
A facetious look at how cultural background can influence the development of intelligence tests.

Shattering Intelligence: Implications for Education and Interventions
James Flynn (of the Flynn Effect) breaks apart the concept of general intelligence. Discusses the interaction of cognitive skills and exercise.

Heritability Estimates Versus Large Environmental Effects: The IQ Paradox Resolved
Uses basketball as a model to discuss how small genetic differences can interrelate with environment to exaggerate the measured heritability of a trait. Aimed at the results of Jensen’s twin study data.

Lewontin vs. Jensen debate
Lewontin answers Jensen’s objections to targeted educational enrichment. A classic debate on the topic.

Human Diversity
A review of the book by Lewontin (a population geneticist) on the intersection of genetics and culture. Or read Human Diversity (Scientific American Library Series)


[ETA] Anthropology and Race – December 2011
A collection of anthropological writings on the validity of race as a concept.

The Mismeasure of Man
A summary of the book by Stephen J. Gould on the study of biological determinism. Or read The Mismeasure of Man


The Impact of National IQ on Income and Growth–A Critique (pdf)
Criticism of Lynn and Vanhanen’s work on the basis of imprecise modeling and insufficient controls.

A Review of the Bell Curve: Bad Science Makes for Bad Conclusions
A brief but broad overview of the unsupported assumptions and confounding variables used by the authors of this “simple treatise of conservative ideology” that attempts to link race to IQ to social outcomes directly.

Measured Lies: The Bell Curve Examined (book)
A “thoughtful, readable anthology” of essays critiquing The Bell Curve.

…In Different Voices
Part one of a technical but accessible Q&A on the topic of the heritability of intelligence. Much snark.

Those Voices Again
Part two.

    race matters. suburban whites will generally have higher iqs than inner city blacks by virtue of being better nourished and having access to activities, like piano lessons, that improve iq

    Wow. People with privilege have more resources at a younger age, and thus get better IQ scores? Whodathunkit?

    This is the biggest reason why studies purporting to link race with IQ will fail: you can’t control for this. And in fact, the people studying it presently don’t seem to WANT to control for it, they actually want to reinforce it.

    I’d love to see someone who wants to link race to IQ define the human races. If you’re going to compare groups, being able to do that would seem to be a basic starting point. If you run an experiment on the properties of different chemicals, then you need a metric that allows you to distinguish between the chemicals. The Bell Curve uses ill-defined groups that have significant overlap; even if “g” really exists and can be measured (a dubious premise) the fundamental premise of The Bell Curve that the difference between these amorphous groups can be quantified and that the difference means something is so flawed as to be utter nonsense. If you don’t know who you’re measuring, you can’t make any comparisons. It is frustrating to see conservatives claim this as a case of “pc”. It’s not. It’s a case of bad science. Given your links I didn’t spend time on the nonsensical ideas that there is such a thing as “g” and that it can be measured. You seem to have covered that.

    How would you label someone who thinks that:

    IQ tests measure something important (perhaps the tallest midget, but IQ is the most powerful variable in social science),

    race groups differ on average on IQ scores,

    the differences correlate strongly with race gaps in human well-being,

    Science now has the tools to figure out wtf?

    For balance, you should consider including this article in your list (APA journal, NOT “Intelligence”):

    Your SA guest blog suffers from severe selective citation. At least include one article that covers the other side?

    Clustering studies. An interesting read on how this is possible– even if you completely disagree with it– is Nick Wade’s Before the Dawn.

    Yes, there is overlap– I don’t know anyone in the literature who claims race is dichotomous. I see it like handedness– some people are extremely left or right handed, some in between. Finding an ambidextrous person doesn’t mean left handers don’t exist. Same for mixed race people.

    How do forensic scientists identify race from bones if race is purely a social construct?

    How do forensic scientists identify race from bones if race is purely a social construct?

    I never argued that race was purely a social construct. I do think it’s mostly a social construct, though. More specifically, because I hate postmodernism and avoid the use of terms such as “social construct”: I think race is a non-analytical unit of measurement, where all observations are necessarily made in terms of units of measurement. Another example of a non-analytical unit of measurement is a “foot,” and an example of an analytical unit of measurement is an electron. I’ve come to this conclusion based on what I know about population genetics as well as the creation of analytical units of measurement in science.

    P.S. It interests me that various (white and Asian) researchers continue to argue that black people are dumb for biological reasons despite the academic success of African immigrants in the United States. African immigrants are blacker than African Americans, who tend to have varying degrees of white ancestry thanks to American slavery. You’d think they’d be dumber, yeah? At least, according to the “blacks are naturally stupid” argument.

    Thanks J for intelligent comments without any attacks. Thanks too for hating post modernism.

    What properties do analytic units possess that non-analytic ones lack?

    I think the biggest contribution social science has made to humanity is in figuring out how to measure things (psychological constructs) that we can’t see (e.g., motivation, IQ, religiosity).

    Reasonably sophisticated statistical tests exist to show both whether a test measures what it claims to measure, and how much of the variance in test scores is due to error, versus true individual differences in the construct being measured.

    I think IQ tests measure g with very little error, and I think g represents some basic, global index of brain efficiency.

    Self reports of race are far less valid (but not so invalid as to be worthless). I predict labels based on clustering studies will correlate very strongly with self-report measures of race (I think literature might already exist that shows this). So, the self-report measure of race is a more practical (but indirect) measure of the construct. Why genetically test subjects if a self report correlates .89 (making up this number) with the results of the expensive cluster analysis?

    And, I’m ok with you discounting non-analytic measures, but you’d have to throw out most all of social science, versus just IQ tests (e.g., measures of motivation; our blog host’s preferred third variable explaining IQ’s predictive validity).

    I don’t think blacks are dumb. I don’t think you’ll find any person who’s published on this topic who claims that. It’s trivially easy to find smart black people and dumb white people.

    Instead, I think a 1 sd mean difference exists on IQ test scores across these two groups. The difference maps on strongly to differences in other things (education, health, income) that create human well-being. These are both facts (the data are the data). The controversy comes when people try to explain these facts.

    Suppose there was a 1 sd difference on, say, pre-natal nutrition levels, and that these differences mapped on to other variables that index well-being. We’d be throwing billions of dollars into studying why / fixing the problem. Because the 1 sd difference here relates to IQ test scores, we ignore it, or we marginalize anyone who wants to find out why as a racist blind to his privilege.

    Bryan, two of the five references listed in my SciAm guest post are studies with Paul Thompson as first or senior author. Are those not “the other side”?

    By the way, you’d do better to ask me what I think of a scientist who tells me that (the variables represented by) socioeconomic status or stereotype threat are irrelevant in explaining these differences because they don’t–each–cover all of the variability.

    Thompson’s cool — elite and a well-established / top notch scientist.

    Despite citing him, the SA post implies that blank slate is the consensus in the field. That’s blatant misrepresentation.

    Afaik, no researchers claim that IQ is 100% determined by genes.

    I can’t think of anyone in the field who assumes it’s 100% environmental, either (but that’s pretty much the take home message I get from your SA post– after “dismantling” heritability calculations by citing a blog post).

    The APA task force article (old now, but the consensus in the field back then for sure) says that 50-75% of variance in IQ scores is due to heredity.

    Strong indirect evidence for a genetic link, yet you discount it completely until the exact set of genes responsible for IQ is discovered. That’s an unusually high standard of proof. I think you demand it only because it lets you preserve a naive empiricism.

    Also, odd that there’s no mention of adoption studies, which consistently (but not always) show zero correlations between foster parents and adopted kids, yet large correlations between biological parents and kids they gave up for adoption.

  12. 15

    I do believe stereotype threat is completely irrelevant / cannot explain group differences in IQ scores.

    SES is relevant. I see, however, no way to untangle the direction of causality (IQ causes SES, vice versa, or both?).

    So, I’m not sure it’s appropriate to control for SES when looking at the effects of IQ on some other variable. Would you demand that studies control for IQ when looking at the effects of SES (or income inequality) on some other variable? I’d bet not.

    Actually, Bryan, the SciAm guest post very explicitly says that there is disagreement on this topic. It also says that all the direct evidence we have points to environmental causes of variability. For example, adoption studies can’t tell us what differences are due to prenatal environment or the caustic effects of racism.

    So, did you have some criticism to make of Shalizi’s work on heritability estimates, or are you just going to smear it as a simple blog post so you can cite the old numbers?

    Comment 15, by the way, is so entirely perfect a measure of your worth as a scientist commenting on this topic that I’m going to let it stand without a response.

    So I should invest serious time into debunking stereotype threat and defending the utility of heritability estimates here only to get your vague / arrogant replies?

    How did I smear Shalizi, other than by mentioning it was a blog post, versus a peer-reviewed article?

    Re, the scientific worth of my comments on SES: many in the field (most are smarter than I, but obviously not you) are grappling with this issue. Controlling for variables may not always be wise– at the very least, regression results from single studies where one variable wins the variance-explained battle are not compelling.

    This has been a fairly regular theme (without yet any good answer) in the 31 articles on IQ I’ve peer-reviewed over the last 2 years, but I will defer to your expertise.

    Oh, so controlling for socioeconomic status when studying IQ is actually controversial, despite the fact that you’re born into your socioeconomic environment and it’s therefore relatively easy to control for? Or are you saying your IQ is set when you’re but a mere baby, so you could control for IQ at birth when studying people’s socioeconomic statuses?

    Patently ridiculous.

    Patently ridiculous, or is it possible you just lack knowledge about statistical inference?

    You’re missing my point (not surprising). It would indeed be nice to directly test how IQ affects SES, and vice versa. And, it would be nice to directly test whether IQ or SES is more important (or causal) in explaining other things (like health, income, education).

    But we can’t randomly assign people to SES or IQ, so we’re stuck with correlational data (welcome to social science).

    Example: Any one study has a measure of SES (with its own reliability and construct validity) and a measure of IQ (with different R and V). IQ and SES are not independent, though. For whatever reason they correlate around .3 to .5.

    We use SES and IQ to predict, say, education. Suppose SES significantly predicts education, even when controlling for IQ.

    Have you “proved” that IQ correlates with education spuriously, with SES being the true causal third variable?

    Wait, we should also control for race, sex, and income inequality. Now, a different conclusion emerges in terms of the variables that “win” the variance explained contest. What inference do we make here?

    Add to that, most important social and psychological variables are strongly inter-correlated. Pick your variable of choice, and the right regression model and you can prove whatever you want.
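The point about correlated predictors can be illustrated with a minimal sketch on synthetic data. Everything in it is an assumption made for illustration: the variable names `a`, `b`, and `y`, the correlation of 0.4 (picked from the .3 to .5 range mentioned above), and a made-up outcome to which both predictors contribute equally. It shows only that a predictor's coefficient shifts depending on what else is in the model; it says nothing about real data.

```python
import numpy as np

def ols_coefs(X, y):
    """Least-squares slopes for y ~ X (an intercept is fitted, then dropped)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta[1:]

rng = np.random.default_rng(0)
n = 20_000
r = 0.4  # assumed correlation between the two predictors

# Two standardized synthetic predictors with correlation r.
a = rng.standard_normal(n)
b = r * a + np.sqrt(1 - r**2) * rng.standard_normal(n)

# Hypothetical outcome: both predictors matter equally, plus noise.
y = 0.5 * a + 0.5 * b + rng.standard_normal(n)

alone_a = ols_coefs(a[:, None], y)[0]          # a by itself
alone_b = ols_coefs(b[:, None], y)[0]          # b by itself
both = ols_coefs(np.column_stack([a, b]), y)   # a and b together

print(f"a alone: {alone_a:.2f}, b alone: {alone_b:.2f}")
print(f"together: a={both[0]:.2f}, b={both[1]:.2f}")
```

In this toy setup each single-predictor model absorbs part of the other predictor's contribution, so a coefficient that stays significant after controlling for the other variable does not by itself settle which one is causal.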

    You should enlighten us as to the research strategy that directly resolves “why” for all this. I know you can do it, as bloggers are quite good at confidently dismissing decades of evolving scientific literature with off the cuff opinions that must be true (and which have never occurred to the incompetent scientists who actually study the topic).

    Gollee, Mr. Dr. Pesta, it’s good to know I’m not smart enough to write a post that you:

    • Suggested I didn’t have the credentials to write instead of engaging with directly;
    • Didn’t want to discuss the details of, working instead to set up an argument over a side issue;
    • Suggested I was using to shut down debate, when instead it describes the shape of the debate;
    • Told me didn’t include references it did;
    • Told me inferred something that was specifically ruled out by the thesis;
    • Mentioned I cited a blog post, without (still) having anything critical to say about the information presented therein;
    • Suggested I should have mentioned adoption studies, when the researchers behind the best-known adoption study agreed with the point I made here that those studies can’t reliably determine genetic influence; and
    • Used as evidence something you “believe” without backing it up.

    I’m so lucky I have you here to tell me how stupid I am.

    Now, are you going to say anything substantial? Some cut-and-paste troll who doesn’t check to make sure his links are good has commented more substantially than (though just as wrongly as) you.

    Why is Django a troll for posting links that he thinks present evidence relevant to the discussion?

    I never questioned your intellect, nor do I think you are stupid.

    This study demolishes Shalizi’s hypothetical:

    Not at all. It’s a simple study that says that these tests correlate. We already knew that. It doesn’t tell us why.

    Bernard Davis demolishes Mismeasure of Man:

    Nah. It’s well-known Mismeasure has some flaws. Many of those that Davis considers damning, however, rely on him doing things like ignoring that a genetic basis for variability in intelligence is the modern equivalent of “cerebral energy” and acting as though political biases are removed from science just because someone says that what they’re doing is science. Plus the old, “Oh, no. You’re introducing politics!” play, which is silly. The politics are and were already there. Shooting the messenger is no way to correct for them in the scientific process.

    Fraud in Mismeasure:

    Fraud is quite the charge. Gould was wrong, yes, which is included in the Wikipedia link I provided. Do you give that link to everyone who mentions Mismeasure without reading what they have to say about it?

    Stereotype threat fraud

    Stapel’s studies, which may or may not have been fraudulent (individual fraudulent studies have not been named) aren’t foundational studies in stereotype threat. They’re studies trying to determine the basis of stereotype threat. They are also a tiny fraction of the studies on stereotype threat.

    and publication bias:

    Do you have a copy of the presentation? I’ve been dying to see it, since it’s another one of these standard links that shows up the minute someone says, “Stereotype threat.” The research was never published anywhere else. Also, that pdf is broken and won’t open. You might want to check these things before doing a comment slam.

    Lewontin’s fallacy:

    Nifty and all. Want to say anything about why you included it here?

    Also, write your own damn blog posts if you have an argument to make.

    Bryan, a bunch of unrelated, misrepresented, and broken links is not exactly an attempt to advance a discussion. That’s why it’s trolling, of the troll bridge variety.

    “Not at all. It’s a simple study that says that these tests correlate. We already knew that”

    Did you read the actual study? If Shalizi were correct, results from different test batteries wouldn’t produce g correlations of at least .95.

    “It’s well-known Mismeasure has some flaws. Fraud is quite the charge. Gould was wrong, yes”

    Outright fraud is not a minor “flaw”. It’s a strange book to cite as authoritative. Contrast Gould’s track record with Jensen’s.

    “Want to say anything about why you included it here?”

    The Lewontin’s fallacy link is of interest to anyone who has read his book Human Diversity.

    “a bunch of unrelated, misrepresented, and broken links is not exactly an attempt to advance a discussion”

    All the links I provided are relevant, none are “misrepresented”, whatever that’s supposed to mean.

    You understand that development of new tests uses old tests for benchmarking, yes? If they were not highly correlated, someone would have been doing their job very wrong. That still doesn’t mean that they measure what they purport to measure.

    You still haven’t demonstrated fraud. Just repeated your assertion.

    The Lewontin link is of interest to people who have read something I haven’t listed or referred to here? That’s basically the definition of “unrelated.” As for what “misrepresented” means, try reading my comment for comprehension instead of trying to advance your argument.

    “That still doesn’t mean that they measure what they purport to measure”

    Actually, it does. Read the study.

    “You still haven’t demonstrated fraud”

    It was either fraud, or gross negligence. Given the nature of the errors Gould made, fraud is the most likely explanation.

    “The Lewontin link is of interest to people who have read something I haven’t listed or referred to here?”

    The Lewontin’s fallacy link addresses arguments made in Lewontin’s Human Diversity.

    My apologies, Django. I’d forgotten the Lewontin was on the list. It’s one I didn’t reread before reposting. I’ll address the fallacy later, as I have guests.

    No, given the nature of bias, fraud is not necessarily more likely than negligence. Also, many of the differences between Gould and Morton come from Gould limiting the sampling to groups of a certain size.

    But do you seriously not understand that demonstrating a correlation between tests is not the same thing as demonstrating that those tests measure a general intelligence? It is a leap that is made repeatedly in studies like these, but it isn’t actually supported. Not only does the study not “demolish” Shalizi’s hypothesis, but you seem to have completely missed that this was his point.

    Ralph Holloway thought it was deliberate fraud on Gould’s part. There is a small possibility Gould was merely sloppy and incompetent.

    Regarding the Johnson, te Nijenhuis, and Bouchard paper, factor analysis artifacts wouldn’t produce g correlations of at least .95.

    Wow, Bryan. Just wow. You are so uniquely incapable of taking criticism from people who aren’t on your side of the IQ/race wars. If you had perhaps thought about what I was saying, instead of leaping to the thought that perhaps I did not understand statistics, you’d realize that I’m saying you can’t control for IQ at birth, but can control for SES, in studies. And because racial differences in IQ manifest at age 2 (which post I hadn’t read before I made the point, so it’s a nice bit of dovetail).

    Which means, if you’ll follow the logic, that unless you rigorously control the environment, how the hell do you know that you’re not seeing a racial biasing of the numbers? Not that you care — because your idee fixe is that it’s the other way around, that race determines IQ, so you’ll avoid that particular mental avenue at all costs.

    And beyond that, how do you know IQ is a real thing? As Stephanie points out to Django, these newer tests correlate with previous tests proving that they all match up to the originally built construct. That doesn’t prove the original construct is anything at all. They correlate by design.

    No problem JT.

    I don’t think race determines IQ.

    I do think it’s important to consider environmental factors– like SES– when trying to explain race gaps. My point was that the environmental factors themselves correlate with IQ. This creates a scenario where no strong conclusions seem possible (given the present state of the art in social science– feel free to offer insights into the unique experimental design that’s conclusive with regard to this issue. It’s eluded many people smarter than I. I suspect you can do it though!).

    So two IVs suffering from multicollinearity are entered in a regression equation predicting y. In any single study, one variable wins or loses. How this resolves the issue is beyond me (even in those studies where IQ emerges as victor over SES).

    Stephanie cites a sophomoric critique of IQ– she sees convergent validity as sorta “duh,” and maybe it is. But, lack of convergent V is rather informative, so testing for it is necessary, and not so much incompetent science — it’s just one key source of data researchers need when estimating the validity of their measures (validity is based on the totality of evidence, with one important aspect being how the test correlates with other tests purported to measure the same construct. That seems absurd in your world view?).

    But, given your complaint, I find it ironically odd that how well one judges line lengths correlates with one’s vocabulary, which correlates with how quickly one can arrange colored blocks to form a predefined pattern.

    And, I believe that g is real given its moderate to strong correlations with cognitive processes (working memory) and basic information processing ability (reaction and inspection times). These cognitive and information processing measures co-vary with race. In fact, control for the cognitive processes and the race difference on paper and pencil IQ tests goes away (as far as I know, no other third variable completely “explains” 100% of the gap). This to me rules out many– but not all– environmental factor X’s that people use to explain the gap.

    So, I don’t really know why races differ in mean IQ. I think the answer, though, plus an intervention that fixes it, would do more to increase human well-being than anything else mainstream science is currently studying.

    Stephanie. Cosma’s critique is a tautology. Surely, when variables in a study are all (mostly) positively correlated, a general factor will emerge. He does a nice job simulating data that shows this.

    So, when the correlation matrix is mostly positive and non-zero, a g emerges (by definition!).
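A minimal sketch of the kind of simulation described above, under stated assumptions: it is not Shalizi's code, and it uses a simple sampling-style model (often attributed to Godfrey Thomson) in which each hypothetical test draws on a random half of many independent "elements." It also uses the first principal component of the correlation matrix as a stand-in for a fitted general factor. The names `elements`, `scores`, and all the sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_people, n_elements, n_tests = 5_000, 200, 8

# Many independent "elements" per person; no common factor is built in.
elements = rng.standard_normal((n_people, n_elements))

# Each hypothetical test score sums a random half of the elements,
# so any two tests share some elements and correlate positively.
scores = np.empty((n_people, n_tests))
for t in range(n_tests):
    used = rng.choice(n_elements, size=n_elements // 2, replace=False)
    scores[:, t] = elements[:, used].sum(axis=1)

corr = np.corrcoef(scores, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
first = eigvecs[:, -1]                   # loadings on the largest component
share = eigvals[-1] / eigvals.sum()      # its share of total variance

off_diag = corr[~np.eye(n_tests, dtype=bool)]
print("all correlations positive:", bool((off_diag > 0).all()))
print("loadings share one sign:", bool((first > 0).all() or (first < 0).all()))
print(f"variance share of first component: {share:.2f}")
```

The sketch only shows that a dominant first component with same-sign loadings follows from an all-positive correlation matrix, even when the data were generated without any single underlying factor. Whether the positive manifold seen in real test batteries has a single cause is the question the commenters are arguing about, and the code does not address it.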

    What he ignores: The century old literature on IQ could have produced any number of empirical facts about IQ’s reality. For example, it’s intuitive (common sense!) that multiple intelligences exist (witness the popularity of Gardner’s, or Sternberg’s stuff). Ask an educated lay person whether a person can be smart on some sub-domain of IQ (math) but poor on some other sub-domain (verbal), and most will say yes.

    But, data trump opinion and intuition. So, look at the data. The positive manifold is so well-replicated, that it is indeed an empirical fact. A matrix of positive correlations among cognitive ability tests defies common sense; yet its existence is one of the most replicated findings in social science.

    If reality showed unique / independent / orthogonal sub-domains of IQ, I would be here touting Sternberg or Gardner. Unfortunately, the data squarely contradict these theories. But, keep fighting the good fight!

    Ah, Django. Forgot about you until I needed to add to this list.

    No, really, you need more than a correlation, even a high one, to demonstrate construct validity. For something like a general intelligence factor, you have to demonstrate that you’re not measuring something else entirely, or several something elses, all of which correlate well with those tests. Simply saying nothing but a single factor can produce correlations that high doesn’t even come close to cutting it.

    Now, as for Lewontin’s “fallacy,” I found the quote I was looking for:

    The most concerted attack on Lewontin’s position came much later, from the geneticist A. W. F. Edwards, who contended in 2003 that, as more genetic loci were considered, the higher became the probability that an individual could be “correctly” classified into his or her population. He thus christened the Harvard biologist’s position “Lewontin’s fallacy.” Actually, the plant geneticist Jeffry Mitton had made the same observation in 1970, without finding that Lewontin’s conclusion was fallacious. And Lewontin himself not long ago pointed out that the 85 percent within-group genetic variability figure has remained remarkably stable as studies and genetic markers have multiplied, whether you define population on linguistic or on physical grounds. What’s more, with a hugely larger and more refined database to begin with, D. J. Witherspoon and colleagues concluded in 2007 that although, armed with enough genetic information, you could assign most individuals to “their” population quite reliably, “individuals are frequently more similar to members of other populations than to members of their own.”

    Whether or not you accept Lewontin’s reasoning, his conclusions are playing out as predicted.
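The two observations in the quoted passage, that assignment gets more reliable as loci are added while most variation stays within groups, are not in tension, and a toy simulation can show both at once. This is a sketch under loud assumptions: two synthetic populations, independent loci, and hypothetical allele frequencies (`p1`, `p2`) that are the same at every locus. The numbers are arbitrary and are not meant to match the 85 percent figure or any real dataset, and the sketch does not attempt Witherspoon's pairwise-similarity comparison.

```python
import numpy as np

rng = np.random.default_rng(2)
n_per_pop = 2_000
p1, p2 = 0.55, 0.45  # hypothetical allele frequencies, identical at every locus

def accuracy(n_loci):
    """Fraction of individuals assigned to their own population by allele count."""
    g1 = rng.binomial(2, p1, size=(n_per_pop, n_loci)).sum(axis=1)
    g2 = rng.binomial(2, p2, size=(n_per_pop, n_loci)).sum(axis=1)
    # Expected totals are 1.1 * n_loci and 0.9 * n_loci, so split at n_loci.
    correct = (g1 > n_loci).sum() + (g2 < n_loci).sum()
    ties = (g1 == n_loci).sum() + (g2 == n_loci).sum()  # ties score half credit
    return (correct + 0.5 * ties) / (2 * n_per_pop)

acc = {n: accuracy(n) for n in (10, 100, 1000)}

# Share of single-locus genotype variance that lies within populations.
within = np.mean([2 * p * (1 - p) for p in (p1, p2)])
between = np.var([2 * p1, 2 * p2])
within_share = within / (within + between)

for n, a in acc.items():
    print(f"{n:>5} loci: assignment accuracy {a:.2f}")
print(f"within-population share of variance per locus: {within_share:.2f}")
```

In this setup the within-population share of variance at any one locus stays fixed no matter how many loci are pooled, while assignment accuracy climbs with the number of loci. That is the sense in which Edwards's observation and Lewontin's figure can both hold.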

