secrets of the sat
VIEWS OF AUTHORITIES ON INTELLIGENCE AND TESTING concerning the SAT's relation to the IQ test, its ability to predict success in school and the debate over whether the SAT measures aptitude or achieve'
JOHN KATZMAN: He is President and founder of the Princeton Review.

Is the SAT an IQ test?


Why isn't it an IQ test?

Because it doesn't measure IQ. It is used that way. And it was developed from the army IQ test. But even the College Board will refuse to say that this is an intelligence test. And I'd love to see them say it. I'd love to see them say anything because then you can attack it. But there's this kind of mushy response that when you work your way through it, there's sort of nothing left--'Well, it has a slight predictive validity to freshman year grades in college.' We spend 100 million dollars a year for that? You know--your grades in high school predict college grades better than this and we didn't have to spend anything.

They say the SAT provides a common yardstick for comparing grades at different schools.

Right, that's where all of the anti-test people, I think, are wrong. And where the testing folks are right. You do need a common yardstick. You do need some way to judge an A at this school or this teacher versus an A at this school or this teacher. But there are lots of common yardsticks. Again, you could use blood type. You could use height. Anything is a common yardstick. What you have to say is, fine, it's common. But is it useful? And there are lots of tests that are more useful than the SAT that are also common.

Such as?

Well, for instance, the Advanced Placement Tests. They are rigorous. They're difficult. There are lots of them. You can say, my interest is history. And so I'm going to take a history Advanced Placement test. And I'm going to get--I want kids to be rigorous. I want curricula to be rigorous. But I want them to be not one-size-fits-all and not mindless. Let's have kids studying hard but let's have them studying something useful, hard.

You say the SAT doesn't measure intelligence. What does it measure?

The SAT is said to predict freshman year grades in college, a little. And it does. It measures it a little. Almost anything you do, including family income, will measure freshman year grades a little. But the point is that it doesn't measure intelligence. It doesn't measure anything that's worth 100 million dollars a year prepping for it.

BOB SCHAEFFER: He is Public Education Director of FairTest, a standardized test watchdog group.

Are you saying the SAT is a guessing game?

Part of the SAT is a guessing game. FairTest's position is reasonably new on it. We've never said that the SAT is not measuring something meaningful. There's a little part of meaning, there's a little part of background, there's a little part of schooling. But there's a lot of test-wiseness. There's a lot of 'how shrewdly you can play the game?' There's a lot that can be taught in coaching courses that has nothing to do with any of the skills you need to succeed in college or in life.

Wayne Camara, head of the College Board's Office of Research, says the SAT is a measure of verbal and mathematical reasoning ability.

Well, Wayne Camara, who I know is a decent person--maybe, he believes that. But if you talk to representatives of tobacco companies, they tell you their products are good and healthy and don't lead to diseases. And auto manufacturers have told you that their cars don't burst into flames. And back at the beginning of the century, you know, stockyard owners claimed that they were slaughtering the animals in a clean way. I think we've learned not to trust those who profit from manufacturing products, as the sole source of information.

Yes, one of the things the SAT is measuring, in small part, is those skills. But it's measuring a whole lot of other things. And when you use the SAT as the major factor--or worse, the sole factor--to make high stakes decisions, to define what is merit, you're relying not on what most people think of as merit, but on these very trainable skills that coaching courses and others help people learn.

What does the SAT predict?

The sole scientific claim of the SAT is its capacity to predict first year grades. According to the technical studies done by the Educational Testing Service and College Board, the SAT predicts about one factor in six--one sixth of the difference between two kids' first-year grades. The predictive value declines after that--looking at four year grades or graduation rates. So even the test makers agree that five out of six parts of whatever it takes to predict how well you're going to do in your freshman year, is not their test.
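[Editor's illustration] The "one sixth" figure above can be related to a correlation coefficient, assuming it refers to the share of variance in first-year grades explained by SAT scores (the interview does not spell this out). Under that assumption, the share explained is the square of the correlation, so the correlation is its square root. The function name below is hypothetical and exists only for this sketch.

```python
import math

def correlation_from_variance_explained(share):
    """Return the correlation r implied by a share of variance explained.

    Assumes a simple one-predictor linear model, where share = r ** 2.
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    return math.sqrt(share)

# Schaeffer's figure: about one sixth of the difference in first-year grades.
one_sixth = 1 / 6
r = correlation_from_variance_explained(one_sixth)
unexplained = 1 - one_sixth  # the "five out of six parts" that are not the test

print(f"implied correlation: {r:.2f}")
print(f"share left unexplained: {unexplained:.2f}")
```

Read this way, a correlation of roughly 0.4 and "one part in six" are two descriptions of the same relationship; the second simply makes the unexplained remainder easier to see.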

It does correlate extremely highly with an IQ test. It was developed from the army IQ test...

That's part of the seedy underside of the SAT. The SAT was originally developed by straight out racists--eugenicists, people who thought my forebears--not just people of color--were imbeciles and shouldn't be allowed in their country because they didn't know the language and couldn't score high on their test. I wouldn't suggest the current people who run those companies share those kinds of ugly views. But it's a self-reinforcing notion of defining intelligence as that which whatever the dominant group in society has. Ends up giving that group higher scores and lower scores. The fact that test scores correlate with test scores is rather meaningless. The tests are measuring the same set of factors. What's more important is whether the test accurately predicts how well you're going to do.

Would you say that we are the only country in the world that administers a national IQ test?

Well, despite the efforts of the Educational Testing Service--which is a global corporation with nearly about half a billion dollars in total revenues--the US still is the major country that administers a test like this across the board to college bound seniors. If you take the SAT and its competitor, the ACT--which about 80 percent as many kids take--the vast majority of college bound kids take those tests. And yes, the SAT in particular has its roots in IQ testing. Which are at best controversial and, at worst, quite, quite poor predictors of anything of value.

So is it an IQ test?

It's a variant of an IQ-like test. It is set up somewhat differently. It begs the question of, what is an IQ test measuring? What is intelligence? And you talk to test makers. And intelligence is what their test makes. And that's a circular definition. So to the extent that it's measuring the same that an intelligence test is measuring--then, yes it is. But there's three fallacies there: That there is such a thing as intelligence--that it can be measured. And that you can put the measurements on a linear scale. And other, even people who believe that there is such a construct as intelligence believe that intelligence is not one thing but seven or eight or possibly nine different things. Robert Sternberg at Yale says it's three different things.

At best, the SAT is badly measuring one of those parts of what goes into intelligence.

WAYNE CAMARA: He heads up the College Board's Office of Research.

What exactly does the SAT measure? You say reasoning and ability skills.

The SAT measures two areas. It measures developed verbal reasoning, which are the type of skills that would be measured by reading long reading passages. For example, in our new test students have an essay where they would read two contrasting views on a topic. It could be political. It could be in humanities. It could be in science. And they need to piece together similarities and differences of the arguments--contrasting views. So that's a type of analytic thinking and critical thinking skills that are acquired when you read essays in college. Or the type of scientific or literature work that you'll encounter in college and in English.

In mathematics, the SAT measures developed mathematical reasoning. So it shies away from simple computation. As a matter of fact, the SAT of today--unlike the SAT that you and I probably took--allows and even encourages students to bring calculators. So it cannot measure simple addition or division or fractions because those would be incredibly easy with the use of a calculator. It has to measure reasoning problems, the type that you would have, in real world applications.

And it also has a number of items that are not multiple choice. Students have to read the context, understand the mathematical applications involved, and then generate their own answer.

Isn't it an IQ test?

No, it's not an IQ test. It's far from it. Developed reasoning skills measured on a test like the SAT, will link directly to the, the breadth and the depth of the curriculum students have been exposed to in school, but also out of school learning. Students who have read an incredible amount, whether it's in school assignments or out of school assignments, are more likely to do better on tests like the SAT but also in college.

So it's not an achievement measure, which would be redundant with what grades are. But it's certainly not an IQ test which would be an innate measure of ability. It's much more developed reasoning--the type of skills students develop over an extended period of time.

But the SAT test was basically a stepchild of the army IQ test. Right?


Even when it was first being proposed, Henry Chauncey would have to say--no, it's not measuring achievement. It's measuring ability more like an IQ test. Did it change then from that?

The SAT in, in the past 40-50 years has changed remarkably. Just as cognitive ability tests in general--that field has changed remarkably. The types of items and the way we consider intelligence and aptitude and achievement today, has evolved in the last 40 years. And tests have gotten much better and more accurate in what they're doing.

CLAUDE STEELE: He is a professor of social psychology at Stanford University.

What does the SAT measure?

The classic phrase is that these tests measure what they test. And the SAT is no exception to that. The way items get on that test is the way items get on most tests of mental ability which is that they are items that correlate with performance in school. So an item that you would give to a norming sample that doesn't correlate very well with school success gets dropped off the test. Items that do correlate get put on the test. That's how tests get made up. They're just empirical creations, creations of American and European pragmatism. If you want to find out what actual mental capacity they measure you have to work backwards. You have to use statistical techniques to classify the kinds of performances that they're measuring and work backwards to "Well, if it measures this cluster of performances, maybe it measures this kind of capacity." And then there have developed big arguments about which performance this cluster measures and what performance that cluster measures and which are central to performance. So it's a very complicated game trying to work backwards and figure out what these tests actually measure.

But is this SAT an IQ test?

It is in a sense an IQ test. The SAT and IQ test correlate very highly. Between the SAT and the IQ, they correlate almost as much as the SAT correlates with a second administration of the SAT, as much as it correlates with itself. So they're very similar tests in content.

Give me the little history lesson.

The methodology for standardized tests of the kind that we use today was developed in the 19th century by Francis Galton who was as many say, the jealous cousin of Charles Darwin. And he was trying to get a test that would test his kind of evolutionary, social Darwinist hypothesis that intelligence ran in families. Of course all kinds of other things ran in families like wealth, advantage, and so on, but that didn't bother him. He wanted a test that would discriminate between basically upper class and lower class Brits.

He developed this technology of finding items and seeing how much they would correlate with other performances as a criteria for whether the item would be put on the test or not. So he had this situation in the British museum I guess where he would have people come in and perform tasks: reaction time tasks, visual acuity tasks, a whole variety of kind of physiologically-rooted tasks that he thought would tap into intelligence, sort of innate, physical intelligence. His presumption was that upper class Brits would do better on these things than the lower class Brits and he would therefore have a set of items that he could give to people that were a measure of intelligence that would discriminate. People who would score high on this would be more likely to be the upper class Brits. People who scored low on this would be the lower class Brits. So, he died a failed scientist, never finding a set of items that worked like that.

Alfred Binet in Paris at the turn of the century, beginning of the 20th century, was given a practical task of coming up with a test that would help identify kids who were retarded and wouldn't do well in school. So he simply used Galton's technology. He said, "Well, I'll make up a bunch of items. And the items that kids who do well in school get right, I'll put on the test. And items that kids who don't do so well at school get right, I'll put those off the test because they couldn't be measuring something relevant to school success." So he gets a subset of items that kids who do well in school can perform well on, and now he's got a test that when given to people will tend to identify those who are not going to do well in school. And he can do what the Paris school board asked him to do: screen out kids who are going to have real trouble with school.

Well, as everybody knows, that became the basis of the IQ test. It was transported into the United States, the Stanford-Binet test. That same technology of using success in school as a criteria for whether an item gets put on a test or taken off of a test. And that is how essentially, roughly speaking, all standardized tests are constructed. The SAT, the GRE, the mini-IQ test all have that inherent methodology to them.
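[Editor's illustration] The item-selection procedure described above (keep the items that track school success, drop the rest) can be sketched in a few lines. This is a minimal sketch with made-up data, not any test maker's actual method: the item names, the grade values, and the 0.3 cutoff are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    mx, my = mean(xs), mean(ys)
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no information
    cov = mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])
    return cov / (sx * sy)

def select_items(responses, criterion, threshold=0.3):
    """Keep the items whose right/wrong pattern correlates with the criterion."""
    return [name for name, answers in responses.items()
            if pearson(answers, criterion) >= threshold]

# Hypothetical norming sample: six students, ordered by school grades.
school_grades = [1.0, 2.0, 2.5, 3.0, 3.5, 4.0]

# 1 = answered correctly, 0 = answered incorrectly (invented data).
responses = {
    "item_a": [0, 0, 0, 1, 1, 1],  # right answers cluster among strong students
    "item_b": [1, 0, 1, 0, 1, 0],  # no clear relation to grades
    "item_c": [1, 1, 1, 0, 0, 0],  # right answers cluster among weak students
}

kept = select_items(responses, school_grades)
print(kept)
```

Nothing in the procedure says what capacity a surviving item measures; an item stays on the test only because it tracks the criterion. That is why, as Steele puts it, one has to "work backwards" to interpret the result.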

The man who developed the SAT, Carl Brigham, was an outright racist. Do you even mull that fact?

As I say, that fact has not been wasted on me. And the area of standardized testing and intelligence testing has always been one of the most controversial areas of psychology for precisely that reason. It has often been used as a way of implementing racist intent, most recently with regard to blacks. But in the post-World War I wave of immigration, it was used to screen out Southern Europeans, Jews, and other groups who did not score well on tests at that particular time. So it has, as a tool, a very, very racist past.

What do you think of the SAT, personally?

I think it is an exam that can tell you something. I've used a metaphor, if you can indulge that, that I think captures the basic argument I would use. If you had to select a basketball team by the number of 10 free throws that a player could hit, the first thing you'd worry about is selecting a basketball player based on how they shoot free throws and you know you'd never pick Shaquille O'Neal because he's terrible at free throws even though he's a magnificent basketball player. That's what a standardized test is, compared to the domain of real school performance. Real school performance out there--it's like having to select a basketball player based on how well they shoot free throws. That's the first problem with standardized tests.

And the SAT reflects that. The predictive statistics reflect that. The SAT measures only about 18%, [an] estimate range from 7 to 25%, of the things that it takes to do well in school. This is something that people should realize about the test. People think of it as capturing a very large proportion of things that are important to school success. The people that make these tests tell us, "No, that is not true. They don't capture a large portion of the things--about 18%." In many of the samples I've done research on, much smaller than that, sometimes 4% of the things that are predicting success in college for example. So it's not great, just like a free throw is to selecting a basketball team. And SAT is not going to get you very far with predicting who's going to do well in college. And certainly not far with regards to who is going to do well in society or contribute to society. It's just not that good a tool and that's the first thing to realize about it.

The second set of problems have to do with interpreting the scores on SAT tests. And again, the free throw example is useful. If a kid comes in and he shoots 10 out of 10 or zero out of 10, you might take note of that kind of performance with regard to selecting him on the basketball team. If he hits 10 out of 10, you say, "Well, okay, he's probably pretty good and that probably reflects something about his basketball playing. I'll put him on the team. Zero out of 10, that probably reflects something about his playing, he's off the team." Same with SAT tests I think. When you get really strong scores one way or the other, even though they're not as reliable, they often can bring to light talent that would not otherwise be seen.

And so I am not one who thinks they should be done away with entirely. They can be useful in that regard as long as we understand how to interpret them and how little to use them. And I think many college admissions committees are very sophisticated about this. They are closer to this issue of how predictive tests are, and they can get a feel for it. So, that's the second thing.

Middling scores on the test are very difficult to interpret because you don't know. If the kid practiced a little bit more, maybe he would have hit 9 free throws. Maybe he hit only 4 and he's been practicing for 10 years. It's just hard to interpret the meaning of middling scores and the same is true with the SAT. A kid who gets anywhere from 1000 to 1200, maybe he got those scores because of coaching or maybe he got those scores because he didn't have enough coaching or maybe he got those scores because he went to Europe every summer and got a great vocabulary about cathedrals and that happened to be on the test that day. All kinds of things can contribute to performance and it muddies up the diagnosticity of the test.

LANI GUINIER: She is a Harvard law professor who has written on the limitations of the Law School Admission Test.

Do we or don't we have a neutral and impersonal meritocracy measuring merit?

Well, it is certainly impersonal. I don't know I'd go so far as to say it's either neutral or meritocratic. FairTest, for example, would say that these tests--and I want to be clear, I'm [not] talking about all tests. I'm a professor, I believe in methods of evaluation. I think some methods are not only more fair but also more valuable. And what I'm talking about here in the guise of tests is aptitude testing, tests which are used to predict future performance, not tests which are used to give feedback, either to the teacher or to the student, as to what they have actually mastered or what they are learning. I'm not talking about diagnostic tests. I am talking only about aptitude tests. Because it is the aptitude test that we are using as the proxy for merit. And it is as if this test functions as a thermometer. And you give each person the test as if you were taking their smartness temperature. And that unfortunately, is not how the test functions. Even the test makers do not claim it is a thermometer of smartness. All they claim is that it correlates with first year college grades. And if it's the LSAT, with first-year law school grades.

Now, correlate--that's a big word. What does correlate mean? There's some consistency. There's some relationship between the score on this aptitude test and your first year college grades. That's true. There is some relationship. The problem is it's a very modest relationship. It is a positive relationship, meaning it is more than zero. But it is not what most people would assume when they hear the term correlation. For example, your height correlates better with your weight than your test score correlates with your first year grades.

Jane Balin, Michelle Fine and I did a study at the University of Pennsylvania Law School where we actually looked at the first year law school grades of 981 students and then looked at their LSAT scores. And it turned out that there was a relationship between their LSAT and their first-year law school grades. The LSAT predicted 14 percent of the variance between the first year grades. And it did a little better second year: 15 percent. And I was at a meeting with a person who at the time worked for the law school admissions council who constructs the LSAT. And she said, well, nationwide the test is nine percent better than random. Nine percent better than random. That's what we're talking about.

So it may be an efficient tool in that you get the students to pay for it. The schools don't pay for it. It allows the schools to then rank order people based on a number that is assigned to them. But it is a fairly arbitrary tool. And it is certainly not a thermometer of merit, if by merit--and I'm assuming we don't mean merit is the equal of first-year college grades or first-year law school grades. Merit is a big word. And it has to carry a lot of weight. It does a lot of heavy lifting. It means more than just how you're going to do first year in college. Because if all we cared about is how well you do first year in college, we would have college as one year. Right? Why would you have to be there and pay tuition for three more years if this is only about first year of college? If it's such a good predictor, why do you even go to college? Just take the test and then get a diploma.

So there must be something going on within the institution of higher education or within the legal academy that we think also carries, quote, merit. In which people are learning how to work and play well together with others, in which people are learning intellectual self-confidence, in which people are being exposed to research skills, in which people are being trained to be leaders. None of this has any relationship to the testocracy. No one claims that aptitude tests predict leadership, predict emotional intelligence, predict the capacity to make a contribution to the society. The only relationship is between the test and first-year college grades.

And what I was about to say earlier was that, with FairTest and others, they will say that what the test actually judges is quick strategic guessing with less than perfect information. Boys, for example, do better on the math portion of the SAT than girls. They routinely score 40 to 50 points higher. Many people say, well that's because girls are ignored in high school math. That may be true. And yet the girls do just as well in college when they take math courses as the boys, despite their lower SAT scores on the math portion. And when you interview the boys as to how they approach the test, the answer is they basically viewed it as a pinball machine. And the goal was speed and winning. And the girls on the other hand, wanted to work through the problems before they put down the answer. That, apparently, is not merit.

Somebody who wants to work through a problem before concluding with an answer, is not guessing and they're not fast. And so on some level, what we are confusing as a result of this over-emphasis on the testocracy--what we're confusing merit with is speed and the confidence to guess.

CHRISTOPHER JENCKS: He is professor of social policy at Harvard and co-editor of The Black-White Test Score Gap.

Isn't one of the criticisms of the SAT that nobody's quite sure of what it does measure? Is it because ETS can't say, or don't want to say, what it measures?

I think it's hard to say exactly what it measures. And I'm very sympathetic. It's especially hard to say what the SAT measures if you want to keep the acronym SAT, which is the most successful marketing tool in testing history. Well IQ is right up there too, but SAT is what everybody knows you have to take to go to college and it's the test that's marketed by ETS. So if they change the name in any way that doesn't reproduce SAT, they're in real trouble. So I think they've got a constraint there.

But in fact, if from the beginning it had been called, say the Scholastic Achievement Test, I think it would've taken the political curse off the thing. It isn't exactly an achievement test, but it's certainly not exactly an aptitude test. But if you recognize in the label that this involves achievement, then people will say, okay, well it may not be the kind of achievement we should test, but it's reasonable that you should give a test if it measures achievement.

Whenever Henry Chauncey, who was working with James Conant, ever got close to the word achievement, Conant would say that's not what I want, because achievement was then the privilege of the guys who went at that time to Exeter and Andover.

Well I think, when these tests were originally developed, people really believed if they did the job right, they would be able to measure this sort of underlying, biological potential. And they often called it aptitude, sometimes they called it genes, sometimes they called it intelligence. But whatever they called it, they thought that there was something there and if they just tweaked and fiddled and worked at it a little harder, they would get pretty close to being able to measure it.

I don't think people believe that anymore. They believe that how you do on almost any test is substantially affected by both your heredity and your environment and both things make a difference. Any psychologist would tell you that. But, the problem of finding a label for something which is both A and B, is a tough one and you could say it's the Scholastic Aptitude and Achievement Test for instance, but that doesn't sound good when you're selling it. Especially to someone like Conant who wanted a test that measured aptitude.

I mean it wasn't an accident that they came up with these terms. They came up with the terms because that's what people wanted to do. The fact that they didn't actually quite do it was, you know, well we've seen this in a lot of other fields of merchandising too. You know, people want a product that does so and so, then say it does so and so.

What's the only way to eliminate labeling bias in tests?

I think you've got to re-label these tests. You need to call these tests things that really reflect what it is that they measure. And then it's also a question of whether what kinds of things the tests should really measure that you give, but that's a different question.

Do the SAT's do a good job of predicting academic performance?

I think they really don't do a good job of predicting. They do a pretty poor job of predicting, but they're the best we've got. Well that means the people who are good at those things, high school grades and test-taking, are going to do well in getting into good colleges. And other people who would do equally well in college are just not going to make it because we have no way of picking out the kid who will do well even though his high school grades weren't so great and even though his SAT scores weren't so great.

There are lots of kids out there like that and on the average they won't do as well as the other ones and the ones who will, are going to lose out because we can't identify them. They don't come with little tags. If we knew what else it was, you know, if we could say, well it's stick-to-itiveness or, it's getting excited by a teacher or something, then we could measure it. That would help a lot. And it would probably help minority kids in particular because they don't do well on these tests and they are put at a disadvantage by that, and that's even more of an issue on the job where we know that tests are not terribly strong predictors of job performance and we know that lots of other stuff counts, but we don't know how to measure most of the stuff except by hiring somebody and seeing how they do.

Do you think we're misguided in using them?

I think we'd be way better off if we gave achievement tests and didn't emphasize the so-called aptitude test or now, just the mysteriously unlabeled SAT. I don't think it would change the results in favor of minorities to any great extent in the short run, but I do think it would have a good effect in the long run.

And the reason it would have a good effect is that if you start testing achievement you send a message to people that this is what you've got to learn to go to a good college, or to any college, whatever. And we know from all kinds of evidence that if you actually set a task like that, the minority students can do better than they're now doing. So I think that if we kind of change the way we set up the task and said this is a question of achievement, it's just like lots of other forms of achievement. You've got to work hard at it, you've got to practice, you've got to get good at it.

You would have a very different state of mind than when it seems to people that this is something that is aptitude, unchangeable, inborn, you know, if I can't do it, I just can't do it. That's a signal for defeat and giving up. And of course, it's not just a signal to minorities for giving up, it's a signal to any kid who tests badly and says, gee, I just don't get good scores on these kinds of tests. Whereas if you tell him, you know this is a math test, you have to understand the test, lots of people can learn that math if they work hard at it.


FRONTLINE | pbs online | wgbh

web site copyright 1995-2014 WGBH educational foundation


