Because test results are often the basis for decisions that affect students' educational futures, tests should provide equal opportunities for all students to demonstrate their abilities and knowledge. The issues of gender bias and fairness in testing are concerned with differences in opportunities for men and women.
This Digest provides a brief introduction to this complex topic. Commonly accepted definitions of gender bias and gender fairness are discussed. Approaches used to detect gender bias and fairness are introduced. Being aware of gender bias and fairness in testing will prepare you to intelligently question the uses of the test results that form the basis of decisions about individual test takers' futures.
Another type of error is caused by factors which do not change. Known as systematic error, it is the result of characteristics of the examinees that are stable (such as gender or race) and that are characteristics other than those the test is intended to measure. Gender bias in testing is often the result of such systematic error.
Test questions may be checked for:
* material or references that may be offensive to members of one gender,
* references to objects and ideas that are likely to be more familiar to men or to women,
* unequal representation of men and women as actors in test items or representation of members of each gender only in stereotyped roles.
If the questions involve objects and ideas that are more familiar or less offensive to members of one gender, then the test may be easier for individuals of that gender. Standards for achievement on such a test may be unfair to individuals of the gender that is less familiar with or more offended by the objects and ideas discussed, because it may be more difficult for such individuals to demonstrate their abilities or their knowledge of the material.
Although examination of the test items may reveal that the test contains questions which have the potential to yield biased results, such an examination may not be sufficient to determine bias. Statistical techniques are often used to test for systematic gender differences.
Determining whether such a test is biased involves using statistical techniques to calculate the predictive relationship separately for each gender. If the relationships are the same for men and women, we can say with confidence that the test predicts equally well for both genders.
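The idea of calculating the predictive relationship separately for each gender can be sketched in a few lines of Python. This is only a minimal illustration, assuming the relationship is a straight line fitted by ordinary least squares; the data values and the names `fit_line`, `fit_by_group`, and `records` are hypothetical and are not taken from any study cited in this Digest.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def fit_by_group(records):
    """Fit a separate prediction line for each group."""
    fits = {}
    for group, pairs in records.items():
        scores = [score for score, _ in pairs]
        gpas = [gpa for _, gpa in pairs]
        fits[group] = fit_line(scores, gpas)
    return fits


# Hypothetical (test score, first-year GPA) pairs, invented for illustration.
records = {
    "women": [(450, 2.9), (500, 3.1), (550, 3.3), (600, 3.5)],
    "men": [(450, 2.6), (500, 2.8), (550, 3.0), (600, 3.2)],
}

fits = fit_by_group(records)
for group, (intercept, slope) in fits.items():
    print(f"{group}: predicted GPA = {intercept:.2f} + {slope:.4f} * score")
```

In this invented data set the two lines have the same slope but different intercepts, so the same test score corresponds to a different expected GPA for each group. With real data, an analyst would also need to judge whether any difference between the fitted lines is larger than chance variation would produce.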
These techniques have been used in recent studies concerning gender bias in college entrance tests. Several studies (such as those reported by Rosser in The SAT Gender Gap) have found that, while women tend to earn lower scores than men on some college entrance tests, they tend to have higher grade point averages during their first year of college. While inconclusive, these studies suggest that either 1) the predictive relationship between test scores and freshman GPAs is not the same for both genders or 2) there is a systematic bias in the assignment of college grades.
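One way to see what a differing predictive relationship looks like is to fit a single prediction line to everyone and then compare the average prediction error for each gender. The sketch below does this under stated assumptions: a straight-line model, and hypothetical data invented so that the women have lower scores but higher GPAs, as in the pattern described above. The names `pooled_fit` and `mean_error_by_group` are illustrative only.

```python
from statistics import mean


def pooled_fit(pairs):
    """Least squares line through all (score, GPA) pairs, ignoring group."""
    x = [score for score, _ in pairs]
    y = [gpa for _, gpa in pairs]
    x_bar, y_bar = mean(x), mean(y)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope


def mean_error_by_group(records):
    """Average of (actual GPA - predicted GPA) within each group."""
    everyone = [pair for pairs in records.values() for pair in pairs]
    intercept, slope = pooled_fit(everyone)
    return {
        group: mean(gpa - (intercept + slope * score) for score, gpa in pairs)
        for group, pairs in records.items()
    }


# Hypothetical data, invented for illustration only.
records = {
    "women": [(480, 3.2), (520, 3.3), (560, 3.4)],
    "men": [(520, 3.0), (560, 3.1), (600, 3.2)],
}

errors = mean_error_by_group(records)
for group, err in errors.items():
    direction = "under-predicted" if err > 0 else "over-predicted"
    print(f"{group}: mean error {err:+.3f} GPA points ({direction})")
```

A positive mean error means the common line predicts lower grades for that group than its members actually earned. A result like this does not by itself show which of the two explanations given above is correct.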
In a recent case, Sharif v. New York State Education Department, the plaintiffs charged that by using SAT scores as the sole basis for the award of state merit scholarships, the New York State Education Department was discriminating against girls who were competing for the awards. Although the girls tended to have higher high school grades than the boys competing for the scholarships, they also tended to have lower scores on the SAT, and so received fewer of the scholarships. The State Education Department argued that the SAT was the best objective measure available.
The SAT is intended to predict students' grades during the first year of college and does not claim to measure the achievement of students during high school. The stated intention of the New York scholarship program, however, was to base its awards on high school achievement. Since the program based its awards solely on the results of a test, the plaintiffs argued that the process denied girls a fair opportunity to demonstrate their eligibility for the awards. The court agreed and ruled that New York could no longer use SAT scores alone as a basis for these scholarship awards.
Not all educational policies have as easily measured and clearly different impacts on the genders as the policy described above. Considering whether such policies are supported by sound educational theory or research may be helpful in detecting possible difficulties.
Klein, S.S. (Ed.) (1985). Handbook for Achieving Sex Equity through Education. Baltimore, MD: Johns Hopkins University Press. ERIC Document Reproduction Service No. ED 290 810.
Rosser, P. (1989). The SAT Gender Gap: Identifying the Causes. Washington, DC: Center for Women Policy Studies. ERIC Document Reproduction Service No. ED 311 087.
Tittle, C.K. (1979). What to Do About Sex Bias in Testing. Princeton, NJ: ERIC Clearinghouse on Tests, Measurement, and Evaluation. ERIC Document Reproduction Service No. ED 183 628.
COURT CASE
Sharif v. New York State Education Department, 709 F.Supp. 345 (S.D.N.Y. 1989).
-----
This publication was prepared with funding from the Office of Educational Research and Improvement, U.S. Department of Education under contract number R88062003. The opinions expressed in this report do not necessarily reflect the position or policies of OERI or the Department of Education. Permission is granted to copy and distribute this ERIC/TM Digest.