MUST BE at least 250 words, with at least 3 scholarly citations in APA format. Any sources cited must have been published within the last five years. Acceptable sources include the textbook, the Bible, and scholarly peer-reviewed research articles. CHAPTER 3 AND CHAPTER 4 ARE ATTACHED
After reading Chapter 4 of the Mosher textbook, what are some of the differences between using official crime data, as discussed in Chapter 3, and self-reported data, as discussed in Chapter 4? What are the particular strengths and weaknesses of each of these types of data sources?
The Mismeasure of Crime
Mosher, Clayton; Miethe, Terance D.; Hart, Timothy C.
CHAPTER 4
SELF-REPORT STUDIES
Respondents are a tricky bunch, and they do not always behave the way a researcher would wish or expect. In fact, surveys would be far more reliable without them.
—Coleman & Moynihan (1996, p. 77)
Self-report studies of crime were developed in the 1940s and 1950s, largely in response to concerns among criminologists that official measures of crime were systematically biased and provided a distorted picture of the nature and extent of crime and its correlates.
One of the primary advantages of self-report studies is that the information individuals provide regarding their behavior is not filtered through any official or judicial process. The criminal justice funnel, which illustrates how, at each stage of the system, fewer and fewer illegal behaviors are siphoned off for official crime counts, does not operate with respect to self-report data.
However, what individuals tell us about their behavior may or may not be a reliable and valid source for determining how involved they are in criminal activity. Memories of events—even of ones as dramatic as criminal episodes—may be fuzzy rather than clear, especially when it comes to recollecting the time period in which they occurred or the sequence of their occurrence. The questions that researchers ask may be phrased in ways that are different from the way people think of their behavior. For example, asking “In the past six months, have you abused or aggressed against a family member?” may elicit a different response than “In the past six months, have you slapped, hit, or punched anyone in your house?” Even with good nonjudgmental questions, however, respondents may be reluctant to answer fully and truthfully—at least in part because they are being asked to admit to behaviors that might result in their arrest if the actions became known to authorities.
One purpose of this chapter is to make you a savvy consumer, as well as evaluator, of self-report measures of crime. Because of its importance to both self-report and victimization data, we begin with a brief discussion of survey methodology. We then review the methodology and findings of some of the more prominent self-report studies, including the National Youth Survey (NYS), the National Longitudinal Study of Adolescent Health (Add Health), the Monitoring the Future (MTF) Survey, and the National Household Survey on Drug Abuse (NHSDA, now known as the National Survey on Drug Use and Health [NSDUH]). This is followed by a review of a prominent and enduring debate in the discipline of criminology regarding the connection between social class and crime, a debate that led to further refinements and improvements in self-report methodology. We then discuss self-report data from known offenders, which have provided particular insights into the crime patterns of individuals who have been apprehended by the criminal justice system. The chapter concludes with an examination of studies focusing on the reliability of self-reported data on drug use.
THE METHOD BEHIND THE MEASURE
Self-report measures of crime are subject to the same constraints found more generally in survey research. Criticisms regarding the adequacy or accuracy of self-report as well as victimization data have as much, if not more, to do with how the data are collected than with what those data might tell us about crime and criminals. In order to anticipate and understand these criticisms, we briefly address the sources of error in survey research before discussing specific self-report studies of crime.
Sources of Survey Error
At the core, evaluating any survey and the data derived from it revolves around two central issues (Phillips, Mosher, & Kabel, 2000):
Were the right people asked the right questions?
Did they answer truthfully?
In survey research terminology, these two issues expand into the four sources of total survey error: coverage error, sampling error, nonresponse error, and measurement error (see, e.g., Dillman, 2000; Groves, 1989, 1996; Junger-Tas & Marshall, 1999; Salant & Dillman, 1994).
Coverage error means that researchers selected individuals from a list— a sampling frame—that did not include all the people they intended to study: the target population. To illustrate this principle, consider the following example. A researcher is interested in determining how welfare recipients feel about their encounters with social service and criminal justice agencies. They have access to a current list of welfare recipients in their state, from which they will select a sample. In this example, there are already two limits on coverage: only people who (1) receive welfare as of a certain date and (2) live in the state can be included in the survey and described by its results.
There is an additional limitation imposed by the survey mode used to obtain information from the sample members. A telephone survey may be expedient, but a high percentage of welfare recipients may not have telephone service. A mail survey would likely include most, if not all, welfare recipients, but literacy may be a problem in enough cases to contribute to two other sources of error: nonresponse and measurement (which will be discussed later). A face-to-face survey could address the deficiencies of either of the other two modes but at great cost in terms of time and money. As this example illustrates, coverage error can be relatively easy to identify but not easy to correct.
Sampling error is an automatic, unavoidable result of surveying a subset, rather than taking a census, of all the people in the target population. This is the source of survey error that is referred to when journalists report that a political exit poll or public opinion survey has a “margin of error of plus or minus five points.” This figure, derived from a well-established statistical formula, estimates how closely the survey sample mirrors the target population. Although it is rarely reported by journalists, sampling error is estimated within a specified confidence level that indicates how sure we are about the estimate. For example, if a survey’s sampling error is estimated as +/- 5 percentage points at the 95% confidence level, we can be confident that 95 times out of 100, the percentage of sample members who gave a certain response will be within 5 percentage points either way of the true percentage in the target population who would give that response if asked. Unfortunately, confidence levels apply only to predictions in the long run (referred to as an infinite number of trials); any particular sampling outcome may fall within or outside of the specific range of the 95% confidence level. Although sampling error cannot be completely eliminated from surveys, it can be reduced by increasing the sample size and obtaining responses from a larger proportion of people in the target population.
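The arithmetic behind such margins of error can be made concrete. The following Python sketch applies the standard formula for the margin of error of a sample proportion at the 95% confidence level; the sample sizes shown and the worst-case proportion of 0.50 are illustrative assumptions, not figures from any particular survey.

```python
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """Approximate margin of error for a sample proportion p with
    sample size n, at the confidence level implied by z (1.96 ~ 95%)."""
    return z * math.sqrt(p * (1 - p) / n)

# A proportion of 0.50 maximizes sampling error, so it gives a
# conservative (worst-case) margin for a given sample size.
for n in (100, 400, 1600):
    print(f"n={n:5d}  +/- {margin_of_error(0.50, n) * 100:.1f} points")
# n=  100  +/- 9.8 points
# n=  400  +/- 4.9 points
# n= 1600  +/- 2.5 points
```

Note that quadrupling the sample size only halves the margin of error, which is one reason large reductions in sampling error are expensive to achieve.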
Nonresponse error affects survey data when both of the following are apparent: (a) too many people in the sample did not respond to the survey, either because they could not be contacted via the survey mode or they refused to participate; and (b) the nonrespondents differ from respondents in ways that are important to the objectives of the survey. Why both conditions must hold is easily illustrated. Consider a survey in which researchers complete interviews with 70% of their sample, a quite respectable response rate for social surveys. Most of the respondents have brown eyes, whereas most of the nonrespondents have hazel, green, or blue eyes. Is this survey plagued by nonresponse error? The answer to this question depends on the research questions. If the researcher is interested in the relative sun-sensitivity reported by people with different eye colors or their preferences for using contact lenses of different hues, then nonresponse likely is a problem. Even though there is a high response rate to the survey, respondents differ from nonrespondents on a variable of potential interest—eye color. If, on the other hand, the researcher is interested in attitudes toward capital punishment, nonresponse on the basis of eye color would not be a source of error because eye color is not germane to this issue.
To return to the earlier example of surveying welfare recipients regarding their encounters with social service and criminal justice agencies, assume that the researcher obtains an 80% response rate. However, more than 60% of the respondents are female, and more than 70% of the nonrespondents are male. Nonresponse error constitutes a serious problem in this case because gender is a factor not only in the number but also in the character of contacts with social service and criminal justice agencies. It is possible for a survey with a 99% response rate to be subject to nonresponse error if the 1% who did not respond differ in significant, substantive ways from the respondents. Likewise, a survey with a response rate of only 40% may be immune to nonresponse error if the nonrespondents are similar to respondents in ways that might make a difference in analyzing the data from the survey.
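The two-condition logic above can be expressed in a single line of arithmetic: to a first approximation, the bias that nonresponse introduces into an estimate is the product of the nonresponse share and the respondent-nonrespondent difference on the variable of interest. The Python sketch below applies this standard approximation to hypothetical numbers; if either factor is near zero, so is the bias.

```python
def nonresponse_bias(response_rate: float, mean_respondents: float,
                     mean_nonrespondents: float) -> float:
    """Approximate bias in a survey estimate due to nonresponse:
    (share not responding) x (respondent mean - nonrespondent mean)."""
    return (1 - response_rate) * (mean_respondents - mean_nonrespondents)

# 99% response rate, but a rare behavior is concentrated among the missing
# 1% (1% prevalence among respondents vs. 60% among nonrespondents):
print(nonresponse_bias(0.99, 0.01, 0.60))   # ~ -0.006, about a third of the true ~1.6%
# 40% response rate, but nonrespondents closely resemble respondents:
print(nonresponse_bias(0.40, 0.40, 0.41))   # ~ -0.006, trivial relative to 40%
```

The same absolute bias can be serious or trivial depending on what is being estimated, which is why both the size of the nonresponse and its relationship to the variables of interest must be assessed.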
As mentioned in Chapter 1, the process of operationalization involves attaching meaning to abstract concepts and developing specific indicators and measures of those concepts. How researchers decide to measure these concepts, the nature and number of indicators that are used to identify them, and the specific wording used to define them are all sources of measurement error. In evaluating measures of any phenomenon, social scientists are concerned with issues of validity and reliability.
Validity and Reliability
Validity is the degree to which a measure captures what it is intended to measure: If a measure is valid, it is true and accurate. Some measures have prima facie validity (i.e., clear or self-evident; often called face validity). Other measures possess validity only for specific cases and within strictly defined boundaries. Consider this example. Which of the following is most valid as a measure of the physical stature of human beings: (a) height and weight as recorded by physicians at routine physical exams or by coroners at an autopsy, (b) sizes of clothing most frequently purchased from the inventories of top- or bottom-tier manufacturers and department stores, (c) dimensions of seating and lavatory areas in commercial airplanes, or (d) observations of, and conversations with, people at public events or on the streets at rush hour? The first option—height and weight as recorded by a medical practitioner—does seem to have face validity for measuring physical stature, but the other three options have fairly obvious limitations when it comes to measuring what is intended. Even the first option has a limitation, however: people who have physical exams or whose deaths require an autopsy may not be representative of all human beings. Thus, even the validity of what appears to be the most accurate measurement can be compromised by an inadequate or biased sampling frame. We will return to these threats to validity after defining the second criterion for evaluating any measurement.
Reliability is the extent to which the same results are obtained each time a measure is used. If something is a reliable measurement, then it is a precise, consistent, and dependable one. A bathroom scale that showed an individual having three different weights on three different occasions over a 10-minute period would not be reliable. In the case of self-report studies of criminal and deviant behavior, reliability refers to the ability of the procedure used and questions asked to generate consistent responses from the same respondents on repeated administrations. For example, if individuals are asked whether they have ever stolen something and they answer yes, then they should answer yes the next time they are asked the same question. But just as all squares are rectangles but not all rectangles are squares, all valid measures are reliable ones, but not all reliable measures are valid.
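In practice, the kind of consistency just described is often checked with a test-retest design: the same item is administered twice to the same respondents and the answers are compared. Below is a minimal Python sketch of such a check; the yes/no response vectors are hypothetical.

```python
def test_retest_agreement(first: list, second: list) -> float:
    """Proportion of respondents giving the same answer on two
    administrations of the same item (1.0 = perfectly consistent)."""
    assert len(first) == len(second)
    return sum(a == b for a, b in zip(first, second)) / len(first)

# "Have you ever stolen something?" (1 = yes), asked twice, weeks apart
wave1 = [1, 1, 0, 0, 1, 0, 1, 0]
wave2 = [1, 1, 0, 0, 0, 0, 1, 0]   # one respondent recants
print(test_retest_agreement(wave1, wave2))   # 0.875
```

Note that a respondent who has in fact stolen but answers no both times would look perfectly consistent here, which illustrates the squares-and-rectangles point: reliability does not guarantee validity.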
In survey research, threats to reliability and validity (i.e., measurement error) derive from any of four aspects of the study (see, e.g., Aquilino, 1994; Aquilino & Wright, 1996; Dillman & Tarnai, 1991; Dykema & Schaeffer, 2000). The survey mode, whether it is telephone, mail, face-to-face, or Internet or web based, may result in different answers to the same question, even when posed to the same types of respondents. For example, studies have found that respondents are more likely to report drug use on self-administered answer sheets than in face-to-face interviews (Harrison, 1997; also see discussions later in this chapter). The survey instrument may include questions with categories that are not mutually exclusive or with terms that are not interpreted the same way by different respondents. The survey interviewer may unintentionally prompt a particular response by either attempting to clarify the meaning of a question (resulting in leading the respondent) or by giving the impression that a particular response is correct or expected (resulting in a socially desirable answer from the respondent). Finally, the survey respondent may misunderstand the question, may feel that the question is too nosy and prying, or may just plain lie. All of these conditions will result in mistakes in measurement.
To restate the sources of survey error in the context of the two key research questions (Were the right people asked the right questions? Did they answer truthfully?): if coverage of the target population is inadequate, if the sampling strategy is inappropriate, or if nonresponse jeopardizes either, then the right people have not been asked the right questions. And if the measurement strategy elicits responses that are imprecise, potentially inaccurate, or not comparable to others, then the data do not allow us to determine whether respondents are telling the truth.
SELF-REPORTS ON CRIME AND DELINQUENCY
Chapter 2 documented that most of the early self-report measures of crime and its correlates were intended to discover, document, and describe the true dimensions—or dark figure—of crime. Some researchers believed that there was a great deal of illegal behavior that was not captured by official statistics. Rather than taking the official statistics at face value, they attempted to learn about criminal activities directly from the individuals who were engaging in them, whether or not those activities were detected by law enforcement.
The work of James Short and Ivan Nye, briefly discussed in Chapter 2, serves as an instructive example of both the strengths and weaknesses of self-report data on illegal activities (see Nye & Short, 1957; Short, 1955, 1957; Short & Nye, 1957–1958, 1958). The target population for these researchers was non-institutionalized adolescents, and “because they seem likely to be more representative of the general population than are college or training school populations,” Short and Nye (1958, p. 297) drew their samples from public high schools, administering an anonymous questionnaire to these students. Exhibit 4.1 lists the items included in Short and Nye’s questionnaire that were designed to measure the youths’ involvement in delinquent and criminal activities. From responses to the questionnaire, Short and Nye (1958) drew the following conclusions, among others: (a) delinquent conduct in the non-institutionalized population is extensive and variable; (b) self-reported delinquent conduct is similar to official delinquency and crime in that boys admit committing nearly all delinquencies more often than do girls, and the offenses for which boys and girls are most often arrested are the ones they admit to committing most often; and (c) self-reported delinquent conduct differs from official statistics in that delinquency is distributed more evenly throughout the socioeconomic classes of non-institutionalized populations, whereas official cases are concentrated in the lower economic strata.
There are, however, a number of questions that can be raised regarding Short and Nye’s work. First, are students enrolled in high school likely to be representative of the general youth population? What about dropouts and other young people who might have been absent for one reason or another on the day(s) the questionnaire was administered? It is likely that such individuals are more prone to be involved in criminal and delinquent behavior. Second, many of the behaviors listed in the questionnaire are not described in legalistic, criminal terms. One of the many challenges associated with obtaining valid and reliable self-reports and comparing these to official data is translating the reported behaviors into categories consistent with those in sources such as the UCR. Third, and in a similar vein, many of the items included on the Short and Nye (1958) questionnaire are oriented toward the less serious end of the crime scale. The fact that many self-report instruments focus on relatively trivial behaviors, such as skipping school and defying parents’ authority, has become an enduring criticism of self-report studies.
Despite these shortcomings, Short and Nye’s work was important in the sense that it revealed that a considerable amount of crime and delinquency was not officially recorded. And much of this hidden delinquency was apparently committed by young people from relatively privileged backgrounds; Short and Nye found few social class distinctions in either the range or frequency of involvement in self-reported illegal activities. As a result, “Short and Nye’s work stimulated much interest in both the use of self-report methodology and the substantive issue concerning the relationship between some measure of social status (socioeconomic status, ethnicity, race) and delinquent behavior” (Thornberry & Krohn, 2000, p. 37).
Literally hundreds of self-report surveys have been conducted in the past 60 years under the auspices of a variety of government agencies, academic institutions, and individuals, and they largely confirm the findings from the earliest self-report studies of crime. However, as we will discuss in more detail later, several more recent studies—using more sophisticated methods, instruments, and analyses—have challenged the conclusions regarding little or no association between social class variables and involvement in delinquent and criminal behavior. In the following section, we describe four surveys, each of them national in scope and each of which has been used in numerous published studies, that arguably are standard bearers for collecting and analyzing self-report data. Not only do these provide self-report data on involvement in illicit activities, they also form the basis for research on and debate about techniques for improving the quality of self-report data. Two of the surveys (NYS and Add Health) measure both criminal and delinquent behavior in addition to the use of controlled substances. The other two (MTF and NSDUH) focus on issues related to the use and abuse of legal and illegal substances.
National Youth Survey
First conducted in 1977, the NYS was designed specifically to provide both prevalence and incidence estimates of the commission of delinquent activities by youth. It is a longitudinal survey that uses a national probability-based sample of young people who were 11 to 17 years old at the time of the first interview (Elliott, Huizinga, & Morse, 1986). Participants in this study were interviewed in their homes at one-year intervals through 1981 and at two- to three-year intervals at least through 1995. More than 90% of the original 1,725 participants have remained in the survey over time. Exhibit 4.2 provides a list of some of the questions used in the NYS.
Confidential, face-to-face interviews solicit information on the number of times the respondent has engaged in a specific delinquent or criminal activity within the past calendar year, with two different response sets used. If an individual’s response to an open-ended question indicates they have engaged in the particular activity more than 10 times, the interviewer asks the youth to select one of the following responses: (a) once a month, (b) once every 2 to 3 weeks, (c) once a week, (d) 2 to 3 times a week, (e) once a day, or (f) 2 to 3 times a day. Although described in nonlegalistic terms, the 47 activities asked about directly parallel offenses listed in the FBI’s Uniform Crime Report. Of the Part I offenses, only homicide is excluded; about 75% of Part II offenses are included, along with a wide range of misdemeanors and status offenses.
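To combine these two response sets into a single annual incidence estimate, the categorical follow-up answers must be converted into approximate counts. The Python sketch below follows the branching logic just described; the six categories are taken from the text, but the annual-equivalent values are illustrative midpoint assumptions, not published NYS scoring rules.

```python
# Illustrative annual equivalents for the NYS follow-up categories
CATEGORY_TIMES_PER_YEAR = {
    "once a month": 12,
    "once every 2 to 3 weeks": 21,   # roughly 52 / 2.5
    "once a week": 52,
    "2 to 3 times a week": 130,      # roughly 2.5 * 52
    "once a day": 365,
    "2 to 3 times a day": 912,       # roughly 2.5 * 365
}

def annual_incidence(open_ended_count: int, category: str = None) -> int:
    """Keep exact counts of 10 or fewer; otherwise use the
    categorical follow-up response, as in the NYS interview."""
    if open_ended_count <= 10:
        return open_ended_count
    return CATEGORY_TIMES_PER_YEAR[category]

print(annual_incidence(4))                    # 4
print(annual_incidence(25, "once a week"))    # 52
```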
Exhibit 4.3 presents data on prevalence and incidence rates of self-reported offending for the first five waves of the NYS. Because the NYS is a longitudinal survey, the panel of respondents reporting on their behavior for 1976 is the same group of people reporting for 1980, and this is why the age range is different for each of the five years. With respect to prevalence rates (i.e., the percentage of respondents who report having engaged in certain types of crime) for felony assault and theft, both whites and blacks report lower involvement for 1980 than for 1976, and their self-reported rates of involvement in these offenses are nearly identical. For general delinquency, both whites and blacks report slightly higher involvement for 1980 than for 1976.
The NYS provided the database for a number of important substantive and methodological studies in criminology—a March 2010 search of Criminal Justice Abstracts, using the search term “national youth survey,” resulted in 177 entries, 137 of which were journal articles. We will discuss some of these studies in more detail in subsequent sections of this chapter, but here we mention a few to provide a sense of the range of topics that can be addressed by NYS data. Several studies have focused on gender, race, and social class similarities and differences in self-reported offending (e.g., Ageton, 1983; Huizinga & Elliott, 1987; Smith, Visher, & Jarjoura, 1991; Zhang & Messner, 2000). Some researchers have examined the relationship between drug use and involvement in predatory crime or juvenile involvement in violent crime (e.g., Chaiken & Chaiken, 1990; Elliott et al., 1986) or used NYS data to test the gateway drug theory (Rebellon & Van Gundy, 2006). Still others have used NYS data to test explanatory theories of delinquent and criminal behavior, including deterrence, strain, power-control, and control balance theories, among others (e.g., Blackwell & Reed, 2003; DeLisi & Hochstetler, 2002; Heimer & Matsueda, 1994; Jang, 1999a, 1999b; Jang & Johnson, 2001; Lauritsen, 1999; Ostrowsky & Messner, 2005; Pogarsky, Kim, & Paternoster, 2005). Researchers have also used NYS data to examine the relationship between religiosity, moral beliefs, and delinquency (Desmond, Soper, & Purpura, 2009) and between marriage and involvement in crime (King, Massoglia, & MacMillan, 2007).
National Longitudinal Study of Adolescent Health (Add Health)
The Add Health study, initiated in 1994 under a grant from the National Institute of Child Health and Human Development and administered by the University of North Carolina’s Carolina Population Center, is a nationally representative longitudinal study that was originally designed to collect data on how social contexts (families, friends, peers, schools, neighborhoods, and communities) influence teenagers’ health and risk behaviors (National Institutes of Health, n.d.). Among the data collected in the Add Health study are suicidal intentions or thoughts, biomarkers, substance use and abuse, violence, delinquency, criminal offending, and involvement with the juvenile and criminal justice systems (Carolina Population Center, 2010). In addition, school administrators provided information regarding characteristics of the schools that respondents attended and, if participants agreed, data from high school transcripts.
Add Health has gone through four waves. The first, administered in 1994/1995, involved a stratified random sample of high schools in the United States and resulted in 90,118 school questionnaires, 164 school administrator questionnaires, 20,745 in-home interviews of adolescents, and 17,700 parent questionnaires. For the second stage of Wave I, an in-home sample of 27,000 adolescents was drawn, consisting of a core sample from each community in addition to selected oversamples. In this stage, parents were asked to complete a questionnaire about family and relationships. The second wave of Add Health was conducted in 1996 and consisted of close to 15,000 in-home interviews with adolescents and 128 school administrator questionnaires. Wave III of the study consisted of Wave I respondents who could be located and re-interviewed between July 2001 and April 2002, resulting in 15,197 young adult in-home interviews (and the collection of biomarker data) as well as 1,507 interviews with partners of the original respondents. Finally, Wave IV of Add Health, conducted between April 2007 and February 2009, consisted of 15,701 adult in-home interviews (of the original respondents, who were then between the ages of 24 and 32) and biomarker collection (Carolina Population Center, 2010).
A March 2010 search of Criminal Justice Abstracts using “Add Health” as the search term resulted in 18 entries, with 16 of these being journal articles. The Carolina Population Center website lists several hundred more publications that have used Add Health data, and the National Institutes of Health’s website indicates that more than 3,000 scientists have used data from Waves I through III, resulting in the publication of more than 600 articles (National Institutes of Health, n.d.).
Criminological researchers have used Add Health data to study the relationship between family structure, family process, and economic factors and delinquency (Leiber, Mack, & Featherstone, 2009); the role of friendship sex composition in girls’ and boys’ involvement in serious violence (Haynie, Steffensmeier, & Bell, 2007); the impact of early puberty on experiencing violent victimization (Schreck, Burek, & Stewart, 2007); the role of social psychological processes in mediating the impact of neighborhood contexts on violence (Kaufman, 2005); and the impact of school and family attachments on drug use, delinquency, and violent behavior (Dornbusch, Erickson, Laird, & Wong, 2001). Others have taken advantage of some of the unique features of the Add Health data to address issues of criminological interest. As noted previously, the Add Health studies have collected extensive information on the parents of those surveyed—Foster and Hagan (2007) used these data to examine the effects of fathers’ incarceration on the detainment and exclusion of children during their transition to adulthood. Beaver, DeLisi, and Wright (2009) used the biomarker data in Add Health and concluded that genetic factors interact with delinquent peers and low self-control in predicting variation in delinquency.
Monitoring the Future: A Continuing Study of U.S. Youth
Since 1975, the MTF study has served as a primary source of information about illicit drug, alcohol, and tobacco use by young people in the United States (Johnston, O’Malley, & Bachman, 1999). Each year, published reports based on MTF data reveal the extent of use of several legal and illegal substances. The study also examines a variety of attitudes among 8th-, 10th-, and 12th-grade students, but it does not address involvement in other criminal and delinquent activities (MTF data are available online at http://www.monitoringthefuture.org).
MTF is an extraordinarily ambitious and costly project—between 15,000 and 20,000 students in each of three grades, in addition to between 9,000 and 16,000 college students and young adults, complete an MTF questionnaire each year. The data from any given MTF survey are directly comparable to those from previous years, largely because sampling techniques and question formats are consistent from one year to the next.
MTF began with a cross-sectional survey of a representative sample of all seniors in public and private high schools in the coterminous United States (Johnston et al., 1999), but it quickly became a longitudinal survey. With the exception of the first graduating class, follow-up questionnaires are mailed to a representative sample, consisting of approximately 2,400 individuals, of the members of each senior class who participated in the MTF. These follow-ups occur on seven occasions between the year of high school graduation and the year that the cohort reaches the age of 32, and they constitute the college student and young adult samples for each MTF survey year.
The MTF survey instrument has been modified over the years to accommodate the use of different types of drugs as well as corollary attitudes and behaviors. For example, a question on crack cocaine was first added to the instrument in 1986, and more detailed questions about all forms of cocaine were included in the 1987 version. Questions on crystal methamphetamine (ice) have been included since 1990; 8th, 10th, and 12th graders have been asked questions about MDMA (ecstasy) since 1996. Since 2007, MTF has placed emphasis on the use of prescription drugs (outside of medical supervision) and on the use of over-the-counter cough and cold medicines to get high. In addition to typical questions about licit and illicit drugs, such as age or grade at first use, frequency and quantity of use, and perceived availability of drugs, the MTF also queries respondents regarding their attitudes and beliefs about involvement in risky behaviors as well as their perceptions of the attitudes, beliefs, and behaviors of others with whom they associate.
The 2008 MTF survey included more than 46,000 students in 8th, 10th, and 12th grade in 386 secondary schools in the United States (Johnston, O’Malley, Bachman, & Schulenberg, 2009). Exhibit 4.4 shows the percentages of each MTF school sample group that reported having used various illicit drugs, alcohol, and tobacco at any time in the 30 days prior to completing the questionnaire in 2008. This table reveals that alcohol is the drug most frequently used by young people, with 43% of 12th graders, 29% of 10th graders, and 16% of 8th graders reporting they had consumed alcohol in the 30 days prior to being surveyed. Twenty-two percent of 12th graders, 16% of 10th graders, and 8% of 8th graders reported using any illicit drug in the previous 30 days, with marijuana being the most commonly used substance. Related to our discussion of socially constructed drug epidemics in Chapter 1, it is notable that less than 2% of 8th, 10th, and 12th graders reported using ecstasy in 2008, and less than 1% reported using methamphetamine in the previous 30 days. Rates of ecstasy and methamphetamine use were similarly low among college students and young adults.
MTF surveys also collect measures of regular or daily use of particular substances. Measuring regular use is important because a relatively large proportion of people who report having used a substance in the past month may be first-time and possibly only-time users. The more regularly a substance is used in a 30-day period, the greater the risk for negative consequences associated with long-term use of the substance. In addition, some substances are viewed as gateway drugs, whose regular use by young people may lead to more hard-core substance abuse and addiction (Johnston et al., 1999, p. I:25).
Exhibit 4.5 shows the percentages of the MTF sample groups that reported daily use (i.e., on at least 20 occasions in the past 30 days) of the so-called gateway drugs: marijuana, alcohol, and tobacco. Not surprisingly, the percentages are considerably lower than those in Exhibit 4.4: For marijuana and alcohol, they are as low as one tenth and never as high as one fourth of the percentages reporting use at least once in the past month. For cigarettes, the percentages in the two tables are closer—about one half to two thirds as many past month users reported being daily smokers.
Although the data do not appear in the tables included in this chapter, the 1998 MTF found that 2.3% of both 8th- and 10th-grade students reported heroin use, compared to 2.0% of seniors. These higher rates of heroin use by younger students may be an artifact of the MTF sampling strategy, as heroin users may be more likely than other students to drop out of school before their senior year (Snyder & Sickmund, 1999). The MTF researchers recognize the potential bias induced by the exclusion of high school dropouts but note that “since the bias from missing dropouts should remain just about constant from year to year, their omission should introduce little or no bias in change estimates” (emphasis in original, Johnston et al., 2009, p. 58). Nonetheless, the lack of responses from high school dropouts and from those who are absent from school when the MTF surveys are administered needs to be considered in interpreting the results from these surveys.
MTF surveys have provided valuable panel data on substance-use patterns among young people over time. Perhaps more important in terms of this chapter, they have provided essential data for evaluating the validity of self-report measures of illicit behaviors (see Caulkins, 2000; Harrison, 1997; Johnston & O’Malley, 1997).
National Survey on Drug Use and Health (NSDUH)
First conducted in 1971, the NSDUH (before 2002 this survey was known as the National Household Survey on Drug Abuse) is an enduring source of information about illicit drug, alcohol, and tobacco use in the United States (Substance Abuse and Mental Health Services Administration [SAMHSA], 2000). Like the MTF, each year the NSDUH reveals the prevalence as well as incidence of drug use. Unlike the MTF survey, however, the NSDUH is administered each year to a sample of non-institutionalized civilians who are 12 years of age or older (as opposed to just students).
Similar to the MTF study, the NSDUH is ambitious and costly, involving face-to-face interviews with almost 70,000 individuals across the United States. The survey is cross-sectional as opposed to longitudinal, and some groups in the target population are oversampled to ensure that there are a sufficient number of interviews to calculate reasonable estimates of drug use by those groups that either may not show up in sufficient numbers in a random sample of the population or may be of particular interest. For example, the NSDUH has traditionally oversampled people over the age of 35, and blacks and Hispanics have been over-sampled since 1985. In certain years, residents from rural areas have been over-sampled, while in other years, residents from urban areas or low socioeconomic status residents within those areas have been oversampled.
Similar to the MTF surveys, the NSDUH survey instrument has been modified over the years to accommodate trends in the use of different types of drugs and correlates of drug-using behaviors. Some of these modifications have been implemented for only one or two survey years. For example, the 1979 and 1982 surveys asked respondents not only about their own but also about their friends’ use of heroin in order to obtain a better sense of the prevalence of heroin use in the United States. The 1982 survey also included a special section on medical as well as nonmedical use of stimulants, sedatives, tranquilizers, and analgesics. In 1995, respondents were asked about their need for drug or alcohol treatment and their criminal record. Other changes in the questionnaire have become standard features of the NSDUH since they were first introduced. For example, since 1985, there have been questions on (a) the use of cigarettes and related products such as smokeless tobacco, (b) perceived consequences of using various drugs, and (c) the various ways in which cocaine is administered. In 1988, questions about crack cocaine use and sharing needles for drug injection were added to the survey. Questions about access to health insurance and total annual family income were introduced in 1990, about employment and drug testing in the workplace in 1991, and about mental health and access to health care in 1994.
Although the questions on the NSDUH survey have varied over the years, its mode of administration remained unchanged until 1999. For nearly three decades, the NSDUH was a face-to-face, paper and pencil interview (PAPI) that took approximately one hour to complete. Trained interviewers read the survey items aloud to respondents, who recorded their answers to questions deemed to be sensitive (such as those on substance use) on separate sheets so that interviewers could not see the responses. Interviewers recorded respondents’ answers to nonsensitive questions (such as those on occupational status and household composition) directly on the survey booklet.
The 1999 NSDUH heralded a major shift in the mode of administration. Rather than a PAPI, it was a combination of computer-assisted personal interview (CAPI) and a computer-assisted self-interview (CASI). The CAPI portion corresponded to the questions for which interviewers recorded respondents’ answers on the booklet, whereas the CASI portion allowed respondents to enter their own answers to sensitive questions. The use of computer-assisted interviewing (CAI) was expected not only to improve the efficiency of data collection and processing, but also to increase respondents’ honesty in reporting illicit drug use and related behaviors. However, ultimately, these changes did not result in improvements to the efficiency of data collection. Response rates early on in the 1999 survey were so low compared to previous years that additional, complicated subsampling and weighting techniques had to be applied. Furthermore, because the mode of administration has been shown to affect both response rates and the content of responses, “the NHSDA also included a supplemental sample using the paper and pencil interviewing mode for the purposes of measuring trends with estimates comparable to 1998 and prior years” (SAMHSA, 2000, p. A:1).
Another complicating, though ultimately highly beneficial, change in the 1999 NSDUH was the introduction of state-based probability sampling. Through 1998, with the exception of those survey periods when particular regions were oversampled, the NSDUH sampling design was based on national figures. Estimates regarding drug use could only be applied to the United States as a whole, not to individual states. (Drawing inferences about drug use across different regions of the country was somewhat less problematic, although still questionable). To make it possible to calculate substance use estimates separately for states, as well as to allow for more detailed analysis of national patterns, the 1999 NSDUH drew “an independent, multi-stage area probability sample for each of the 50 states and District of Columbia” (SAMHSA, 2000, p. intro:1).
California, Florida, Illinois, Michigan, New York, Ohio, Pennsylvania, and Texas—eight states that together account for 48% of the U.S. population age 12 and older—were oversampled. Also oversampled were youths and young adults, so that each state’s sample was approximately equally distributed among three major age groups: 12 to 17 years, 18 to 25 years, and 26 years and older.
Exhibit 4.6 shows the percentages of NSDUH respondents in the 12 to 17, 18 to 25, and 26+ age groups who reported past month use of any illicit drugs, marijuana, cocaine, heroin, hallucinogens, inhalants, psychotherapeutic drugs, tobacco, and alcohol in 2008. Past month use of illicit drugs was highest in the 18 to 25 age category, and marijuana was the most frequently used illicit drug in all three age groups. The table also reveals fairly high levels of past month alcohol and binge alcohol use, especially in the 18 to 25 age group.
Although it is important to be cautious in making direct comparisons between the MTF data shown in Exhibit 4.4 and the NSDUH data presented in Exhibit 4.6, it is instructive to examine differences in what the two surveys reveal regarding substance use. Limiting the comparison to the 12- to 17-year-olds and 18- to 25-year-olds in Exhibit 4.6 and the 8th, 10th, and 12th graders and college students in Exhibit 4.4, MTF data show somewhat higher percentages of illicit drug, alcohol, and tobacco use overall than do the NSDUH data. These differences are likely due to differences in the mode of administration of the two surveys. Despite all best efforts to maintain privacy and ensure confidentiality, when questions about substance use are posed out loud, in person, and in one’s home, as is the case with the NSDUH, there may be a tendency to underreport one’s consumption of those substances. Alternatively, filling out a questionnaire anonymously, as is the case with the MTF survey, may allow a certain amount of bragging about, or overreporting of, one’s use of illicit drugs, alcohol, and tobacco.
Notes to Exhibit 4.6: Tobacco products include cigarettes, smokeless tobacco (i.e., chewing tobacco or snuff), cigars, and pipe tobacco. Binge alcohol use is defined as drinking five or more drinks on the same occasion at least 1 day in the past 30 days. Heavy alcohol use is defined as drinking five or more drinks on the same occasion on each of 5 or more days in the past 30 days.
Similar to the NYS and MTF, the NSDUH has provided important data for substantive as well as methodological studies. Of particular value are studies that address measurement issues such as response bias and nonresponse error as well as the general validity of self-reported drug use (see Biemer & Witt, 1997; Caulkins, 2000; Gfoerer, Lesser, & Parsley, 1997; Harrell, 1997; Harrison, 1997; Miller, 1997; Turner, Lessler, & Gfoerer, 1992; Wright, Gfoerer, & Epstein, 1997).
SOCIAL CORRELATES OF SELF-REPORTED OFFENDING
Since their inception in the 1940s, self-report measures have consistently revealed dimensions of crime and its correlates that either were at odds with or could not be addressed by official statistics. For example, in self-report surveys, girls and women reported involvement in much the same kinds of delinquent and criminal acts as boys and men, although less frequently and intensely. Whites likewise admitted to involvement in a range and number of delinquent and criminal acts closely paralleling that of blacks. Middle- and upper-class youth self-reported similar levels of involvement in delinquent activities as lower-class youth. Their consistency notwithstanding, such findings were not uniformly accepted as valid among researchers and practitioners. Indeed, the debate over the “myth of social class and criminality” and the relationship between race or ethnicity and crime was, and continues to be, so essential to understanding the role of self-report measures that it warrants special attention here.
Tittle, Villemez, and Smith (1978) standardized data from 35 studies, with publication dates spanning four decades, on the relationship between social class and criminality. Their conclusions were highly controversial and launched one of the more enduring and, at times, heated debates in the discipline of criminology. In essence, their analyses indicated that the negative association between social class and criminality revealed in official data was not only much more marked than the slight one observed from self-report data, but had also been declining substantially and steadily over the decades while remaining fairly stable in self-report studies. They found no support for the notion that people of lower social status were more involved in delinquency and crime. “In short, class and criminality are not now, and probably never were related, at least not during the recent past” (p. 652).
Hindelang, Hirschi, and Weis (1979) took issue with the findings of Tittle et al. (1978), arguing that misrepresentations of findings from self-report studies “create the illusion of discrepancy between the correlates of official and self-reported delinquency, when, in general, no such discrepancy has been demonstrated” (p. 996). Their main contention was that besides covering exclusively or primarily trivial offenses, self-report measures do not “tap the same domain of chargeable offenses as do official statistics” (p. 997). This domain should include a full range of types of offenses (i.e., “behavioral content,” p. 997) as well as “seriousness, both within (e.g., amount of theft) and across (e.g., school versus violent) offense types” (p. 997). Hindelang et al.’s analyses showed that if type of offense and seriousness are taken into account, then self-report data look much like official statistics in terms of a disproportionate involvement by males and by blacks in more serious offenses. They contended that neither self-report data nor official statistics were adequate to make any comparisons between social classes with regard to specific illegal behavior. Hindelang et al. (1979) concluded:
This evidence suggests to us that: (1) official measures of criminality provide valid indications of the demographic distribution of criminal behavior; (2) self-report instruments typically tap a domain outside the domain of official data; (3) within the domain they tap, self-report measures provide reliable and valid indicators of offending behavior; (4) the self-report method is capable of dealing with behavior within the domain of official data; and (5) in practice, self-report samples have been inadequate for confident conclusions concerning the correlates of offending behavior comparable in seriousness to that represented in official data. (p. 1009)
Elliott and Ageton (1980) similarly contended that self-report and official statistics do not measure the same things. They noted that “self-report measures of delinquency provide a different picture of the incidence and distribution of delinquent behavior than do official arrest records” (p. 95). Using data from the first year of the NYS, they constructed a measure of criminal behavior that was directly comparable to UCR data both in Hindelang et al.’s (1979) behavioral content (i.e., type of offense) and in time frame (i.e., the period during which the offenses occurred). They found “significant race differences for total [self-reported] delinquency and for predatory crimes against persons” (Elliott & Ageton, 1980, p. 102). In addition, Elliott and Ageton (1980) found that blacks and lower-class youth were disproportionately likely to be high-frequency offenders. In other words, these overall race and social class differences were largely the result of blacks and lower-class youth reporting the commission of so many, and so many more serious, offenses. Elliott and Ageton surmised that because “the more frequent and serious offenders are more likely to be arrested,” their NYS “data are more consistent with official arrest data than are data from most prior self-report studies” (p. 107). Calling particular attention to the tendency in self-report studies to truncate measures of the frequency of commission of offenses while simultaneously paying little attention to the seriousness of offenses, Elliott and Ageton concluded:
The most significant difference may not be between the nonoffender and the one-time offender, or even between the one-time and multiple-time offender. Equal or greater significance may be found between those reporting over (or under) 25 nonserious offenses, or between those reporting over (or under) 5 serious offenses. (p. 108)
Clelland and Carter (1980) began their critique of Tittle et al. by asserting that “the proposition of no relationship is the new myth of class and crime,” and noting that “for Tittle et al., criminologists play the role of 900-pound intellectual gorillas—they define ‘crime’ any way they please.” They argued that self-report studies are “nearly worthless” for examining the class-crime relationship, primarily because of the fact that they focused on minor forms of delinquency, such as “skipping school and throwing eggs” (pp. 320–324).
Braithwaite (1981) also took issue with those who denied an association between social class and criminality and noted that “if [a total of 35 works for Tittle et al.’s secondary analysis] is all that could be found, then they did not look very hard. … Perhaps Tittle et al. take their own findings seriously and adopt no extra precautions when moving about the slums of the world’s great cities than they do when walking in the middle class areas of such cities” (p. 37).
Braithwaite (1981) examined the findings of 143 studies that could address the relationship between social class and crime, 97 of which were based on official statistics and 46 of which were based on self-report measures. Nearly 93% of the official-record studies showed higher crime rates among lower-class as opposed to middle-class people; on the other hand, about 53% of self-report studies showed significantly, or at least notably, higher levels of delinquency by lower-class adolescents. Citing Elliott and Ageton’s (1980) finding that differences in self-reported delinquency result entirely from the contrast “between the lower class group and the rest of the sample” (p. 42), Braithwaite went on to assert that “the nature of the class distribution of crime depends entirely on what form of crime one is talking about” (p. 47).
The debate continued when Kleck (1982) argued that the finding of no relationship between social class and crime was largely due to the fact that lower-class youth had a greater tendency to underreport their involvement in delinquency. He used the following example. Suppose that in a given sample, lower-class respondents had committed an average of six delinquent acts, whereas middle-class youth had committed an average of four. If the middle-class group reported 90% of their delinquent acts but lower-class juveniles reported only 60%, both groups would show an identical mean number of reported acts, at 3.6. Kleck (1982) also noted that several self-report studies had drawn samples from a single school or cluster of schools in relatively class-homogeneous areas, resulting in a truncated range on the social class variable. This sampling strategy thus omitted the theoretically relevant underclass, who, he argued, were more likely to be involved in delinquency. In response, Tittle, Villemez, and Smith (1982) retorted that “Kleck, (and others) for example, believes that poor people are not only more criminal than those of other classes but bigger liars as well” (p. 437).
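Kleck's hypothetical is easy to verify with a few lines of Python: differential underreporting of the size he posits exactly erases a real 50% difference in offending between the two groups.

```python
# Kleck's (1982) hypothetical: true offending differs by class, but
# differential underreporting equalizes the self-reported means.
lower_true, lower_report_rate = 6, 0.60     # lower-class youth
middle_true, middle_report_rate = 4, 0.90   # middle-class youth

print(f"{lower_true * lower_report_rate:.1f}")    # 3.6
print(f"{middle_true * middle_report_rate:.1f}")  # 3.6, identical reported means
```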
Researchers have continued to attempt to specify the conditions under which socioeconomic status is associated with self-reported delinquency and crime. They have also devoted considerable effort toward documenting, if not necessarily improving, the reliability and validity of self-report measures of key variables. But as Tittle and Meier (1990) observed, “Regardless of the conceptual and methodological reasons, criminologists seem no closer to identifying the nature of the relationship [between social class and criminality] than 50 years ago” (p. 271). “Sometimes SES does appear to predict delinquency; most of the time it does not” (p. 294). A decade later, Dunaway, Cullen, Burton, and Evans (2000) drew much the same conclusion about adult criminality. Results from their mail questionnaire survey of a random sample of adults in a Midwestern city “largely reject the notion that social class has a strong main effect on adult criminality in the general population, and thus, they tend to support Tittle and Meier’s (1990) more recent evaluation of the class-crime debate” (p. 617).
RELIABILITY AND VALIDITY OF SELF-REPORT DATA
Debate over whether, and under what conditions, indicators of social class and race or ethnicity are associated with self-report measures of delinquency prompted a flurry of studies aimed at establishing the reliability and validity of self-report measures (see Elliott & Ageton, 1980; Fendrich & Vaughn, 1994; Hindelang et al., 1979, 1981; Huizinga & Elliott, 1986; Mensch & Kandel, 1988; Thornberry & Krohn, 2000). Perhaps most important—and least likely to be met—on the list of requirements for obtaining representative self-report data is a sample that is large enough to include sufficient numbers of relatively rare individuals, that is, the “high-rate, serious offenders most likely to come to the attention of authorities” (Thornberry & Krohn, 2000, p. 40).
Among the elements necessary for reliable and valid self-report survey instruments, four are particularly germane to measures of delinquency and crime (Thornberry & Krohn, 2000, p. 41): (1) a wide range and variety of behaviors must be included; (2) serious offenses must be covered if comparisons are to be made to other kinds of data; (3) respondents must be asked to report on the actual, not relative, number of times they engaged in a particular behavior so that people who committed robbery four times are not lumped together with those who committed it 60 times in the past year; and (4) follow-up questions often are required to distinguish chargeable offenses from others—for example, some respondents may initially indicate that they have committed theft when what they actually have done is hidden someone’s books between classes.
Panel studies have generally shown that self-reported delinquency measures yield stable and consistent results from one period to another; that is, they are fairly reliable. Similarly, most tests find that self-reports measure what they set out to measure; that is, they are reasonably valid. However, there is evidence that some groups, some crimes, and some survey modes yield noticeably higher rates of underreporting. For example, “lower class youths tend to score higher on ‘lie’ scales within self-report measures” (Braithwaite, 1981, p. 47). Similarly, African American males substantially underreport their involvement in delinquency (Thornberry & Krohn, 2000). There is also some indication that girls may be more honest in reporting their involvement in delinquent behavior than boys (Kim, Fendrich, & Wislar, 2000). At least for one type of criminal behavior—the use of illegal drugs—rates of underreporting are higher for the more serious offenses and for telephone interviews compared to self-administered questionnaires (see Aquilino, 1994; Turner et al., 1992).
As Thornberry and Krohn (2000) observed, a conclusion drawn in the early 1980s may still be the most logical:
The self-report method appears to behave reasonably well when judged by standard criteria available to social scientists. By these criteria, the difficulties in self-report instruments currently in use would appear to be surmountable; the method of self-reports does not appear from these studies to be fundamentally flawed. Reliability measures are impressive and the majority of studies produce validity coefficients in the moderate to strong range. (Hindelang et al., 1981, p. 114, as cited in Thornberry & Krohn, 2000, p. 59)
SELF-REPORTS FROM KNOWN CRIMINALS AND DELINQUENTS
Some researchers believed that official statistics might be just the tip of the iceberg—not only in terms of who engaged in what kinds of crime but also in terms of how much crime those official criminals might account for. These researchers determined to get information directly from the source by surveying known—that is, arrested or incarcerated—criminals. Two prominent studies in this genre are the RAND inmate surveys and the ADAM program.
The RAND Inmate Survey(s)
What is commonly referred to as the RAND inmate survey is actually two surveys conducted at different times and with different samples. The objectives of both of these surveys were the same, however, and their findings parallel each other. Primary among those objectives was to learn from the source about the illegal behavior of convicted criminals, that is, “to gather information on individual patterns of criminal behavior—types of crime committed, degree of specialization in crime types, and changes in criminal patterns over time” (Visher, 1986, p. 166).
RAND researchers first completed exploratory interviews with 49 California prison inmates who were convicted of robbery (see Petersilia, Greenwood, & Lavin, 1977, in Visher, 1986). Using those interview data to construct a self-administered survey instrument, the first inmate survey was conducted in 1976 (Peterson & Braiker, 1981, in Visher, 1986; Tremblay & Morselli, 2000). A total of 624 inmates (representing only a 47% response rate) from five California prisons completed the anonymous questionnaire. The results from the exploratory study and those from the inmate survey were similar: “Most inmates committed few crimes per year. … A small group reported much higher frequencies of offending” (Visher, 1986, p. 164). RAND researchers were not satisfied that the measurement and sampling for this study were sufficient for broader generalization; thus, a more rigorous, more representative inmate survey was designed and conducted.
The second inmate survey, conducted in 1978, drew samples from both jail and prison populations in California, Michigan, and Texas (Tremblay & Morselli, 2000; Visher, 1986). Attempts were made to ensure that the samples were representative of a typical cohort of inmates for those states and that the offenses for which they were convicted covered a broad range in terms of seriousness. A total of 2,190 inmates from the three states completed the confidential questionnaire. By making the survey confidential rather than anonymous, researchers were able to compare inmates’ official records with their self-reported information.
The survey instrument included detailed questions about inmates’ illegal activities as juveniles, their adult criminal behavior in the two years prior to the arrest that resulted in their incarceration, and past as well as recent use of drugs and alcohol. Inmates’ attitudes on specific issues, their employment history, and their demographic data were also solicited. Inmates were asked to estimate the number of times in the previous two years they had committed each of 10 crimes, including burglary, business robbery, personal robbery, assault during robbery, other assaults, theft, auto theft, forgery/credit card swindles/bad checks, fraud, and drug dealing.
Exhibit 4.7 summarizes some of the findings from RAND’s second inmate survey. It presents both the median number of crimes per year (i.e., the maximum number of crimes 50% of the inmates reported having committed) and the number of crimes per year at the 90th percentile (i.e., the minimum number of crimes 10% of the inmates reported having committed). Half of the active robbers reported committing the crime no more than 5 times per year, whereas 10% of them reported robbing a person no less often than 87 times a year. The difference between the number of crimes per year reported by low-rate and by high-rate offenders is even more dramatic for the other property offenses. These data led to the conclusion that most people who engage in illegal behavior, even convicted criminals, do so infrequently, but some individuals are involved in crime with such regularity that they can be labeled career criminals.
The RAND inmate surveys provided the database for constructing offender typologies—that is, a classification of offenders according to the types of crime they commit (Chaiken & Chaiken, 1984). One of the more important findings from the surveys was that criminals do not necessarily specialize in a single illegal enterprise but instead combine activities to accomplish a particular end. Researchers have also used RAND surveys to explore whether or not inmates were motivated by the expectation that crime pays and how much they claimed to have earned through engaging in criminal behavior (see Tremblay & Morselli, 2000; Wilson & Abrahamse, 1992). Others have used these data to examine the relationship between substance use and other types of illicit activity (see Chaiken & Chaiken, 1990). In many ways, and on the basis of attempts to replicate them, the RAND surveys persist as the standard for obtaining and analyzing self-report data from inmates (see Auerhahn, 1999).
Arrestee Drug Abuse Monitoring
Another example of obtaining self-reported information from known offenders is the Arrestee Drug Abuse Monitoring (ADAM) program, a program established by the National Institute of Justice to monitor drug use among arrestees in a number of jurisdictions in the United States (Taylor & Bennett, 1999). The forerunner of ADAM, the Drug Use Forecasting (DUF) program, was initiated in 1987 and demonstrated the feasibility of urinalysis as a means of measuring drug use by arrestees. By focusing on arrestees, a group that is more likely than other populations to be involved in drug use, ADAM presented a different picture of drug use than general household surveys such as the NSDUH. DUF and ADAM have been used extensively to provide information for the purposes of criminal justice policies, and the studies represent a major resource for criminologists analyzing the association between drug use and involvement in criminal activity (ADAM, 2000).
At each ADAM collection site, trained interviewers conduct voluntary and confidential interviews with arrestees who have been in a jail or booking facility for less than 48 hours. The interview covers basic demographics, drug use history, current drug use, recent participation in buying and selling drugs, lifetime drug and mental health treatment, and, for those who report any illegal drug use in the previous 12 months, detailed information on arrests and housing arrangements. In addition, voluntary urine samples are taken from the arrestees. These urine samples are tested for the presence of (at least) five illegal drugs: marijuana, cocaine (including crack), opiates (including heroin), methamphetamine, and PCP. In 1999, the ADAM program collected data from more than 30,000 adult male arrestees at 34 sites and from more than 10,000 adult female arrestees at 32 sites. In addition, data were collected from more than 2,500 juvenile male detainees at 9 sites and more than 400 juvenile female detainees at 6 sites (ADAM, 2000).
Due to a lack of funding, the ADAM program was terminated in 2003; however, in 2007, the Office of National Drug Control Policy resumed data collection in 10 former ADAM sites as ADAM II (Office of National Drug Control Policy, 2009).
In 2008, across all 10 sites, a total of 4,952 booked arrestees completed the interview portion of ADAM II and 3,924 provided a urine specimen. We focus here on a comparison of 2002 and 2008 data on the percentage of male arrestees in the 10 sites testing positive for marijuana, cocaine, opiates, and methamphetamine.
Exhibit 4.8 shows that, across the 10 ADAM II sites and in both 2002 and 2008, marijuana was the drug for which most arrestees tested positive. However, there were substantial differences in the percentage testing positive for cocaine, opiates, and methamphetamine over time and across sites. In 2008, the percentage testing positive for cocaine ranged from a low of 17.2 in Sacramento to a high of 43.8 in Chicago. Arrestees in Chicago also had comparatively high rates of positive tests for opiate drugs, at 25.1% in 2002 and 28.6% in 2008. In contrast, in Charlotte, North Carolina, only 2.3% of arrestees in 2002 and 1.1% in 2008 tested positive for opiates. Exhibit 4.8 also confirms that methamphetamine is a drug more commonly used in western jurisdictions of the United States (see Mosher & Akins, 2007), with arrestees in Portland, Oregon, and Sacramento, California, being much more likely to test positive for this drug. However, it is notable that with the exception of Minneapolis, where the percentage testing positive for methamphetamine was identical in 2002 and 2008, each ADAM II site had a lower percentage of arrestees testing positive for methamphetamine in 2008, with a particularly large decrease in Portland.
In addition to providing useful information regarding patterns of drug use by arrestees, the ADAM project offers a rare opportunity to assess the validity of self-report data—that is, to determine to what extent people tell the truth when responding to a survey. Through comparisons of the self-reported drug use information to urinalysis results, researchers can analyze under- and over-reporting of drug use.
Verifying the Validity of Self-Reported Drug Use
Research has demonstrated that there is often a discrepancy between self-reporting of drug use and the results of urinalysis tests. For example, a study comparing self-reports and urinalysis results that relied on ADAM data from five U.S. cities (New York-Manhattan, Fort Lauderdale, Miami, Washington, D.C., and Birmingham, Alabama) found that 7.8% of arrestees underreported drug use, compared with 1.9% who overreported (i.e., they reported using drugs but their urinalysis results were negative; Taylor & Bennett, 1999).
Studies have also indicated that underreporting varies according to the type of drug. There is generally a higher concordance rate for marijuana use, but for harder drugs such as cocaine and heroin, underreporting is much more common. For example, on the basis of 1988 DUF data, 47% of arrestees in New York City reported cocaine use, whereas 75% had positive urinalyses for the substance. In the same year, 41% of Philadelphia arrestees reported cocaine use, but 72% tested positive. However, 28% of the arrestees in New York City self-reported marijuana use, and 30% tested positive. Similarly, in Philadelphia, 28% reported using marijuana, and 32% tested positive (Thornberry & Krohn, 2000). Lu, Taylor, and Riley (2001) similarly found significant underreporting of cocaine use from ADAM data, with less than 50% of those who tested positive for cocaine admitting that they used the substance.
Golub, Liberty, and Johnson (2005) used 2000 to 2001 ADAM adult arrestee interview and urinalysis data to examine disclosure of drug use and the correlates of disclosure. They found that arrestees were most likely to disclose recent marijuana use (82%), followed by methadone (69%). However, rates of disclosure for cocaine/crack, heroin, and methamphetamine were all only about 50%. These researchers also found that white arrestees were much more likely than black arrestees to disclose recent use of methamphetamine and that arrestees charged with drug offenses were generally more likely than those charged with less serious offenses to disclose recent use of each drug, with the exception of methadone. Perhaps most interestingly, Golub et al. (2005) found incredible variation in disclosure rates for particular drugs across ADAM sites. For example, the marijuana disclosure rate varied from a low of 68% in Fort Lauderdale to a high of 93% in Spokane, while the cocaine/crack disclosure rate varied from 28% in Chicago to 70% in Kansas City. They noted that this variation might be attributable to differences across sites with respect to the nature of the jail where the interviews were conducted and the privacy it provides, the nature of the arrest experience and the hostility it engenders, and differences in the disapproval of various drugs across communities. Golub et al. concluded, “This analysis raises serious doubts about the validity of self-reported drug use, at least among arrestees” (2005, p. 932).
Magura et al. (1987, as cited in Magura & Kang, 1997) compared self-reports of drug use with urinalysis results for patients who were receiving methadone treatment in four clinics in New York City. Among subjects who tested positive for each drug, 65% did not report opiate use, 39% did not report benzodiazepine use, and 15% did not report cocaine use. Magura and Kang (1997) also found that African Americans were more likely than other groups to underreport drug use. In general, research suggests that the quality of survey data on racial and ethnic disparities in substance use is compromised by differential measurement error across racial and ethnic groups (Johnson & Bowman, 2003).
The findings of differential honesty in reporting by type of substance are consistent with social desirability theory (Edwards, 1957), which suggests that the distortion of self-reports, by underreporting or overreporting, occurs as a function of the perceived acceptability of the behavior in question. Because the use of marijuana is less stigmatized than the use of hard drugs in U.S. society, subjects may be more likely to truthfully report using the substance (Harrison, 1997).
SUMMARY AND CONCLUSIONS
Thornberry and Krohn (2000) suggested that “the self-report method of collecting data on delinquent and criminal behaviors is one of the most important innovations in criminological research in the 20th century” (p. 34).
Considerable improvements in survey methodology, as well as research efforts focused on enhancing the validity of self-reports over the years, have yielded greater confidence in the data that are collected in this manner.
Self-report data, however, are by no means without their weaknesses. Self-report measures of crime and its correlates continue to be constrained by the same elements that affect self-reports of all types of behavior. Concerns over sampling, representativeness, and generalizability (i.e., did we ask the right people?), along with instrument design and question wording and order (i.e., did we ask the right questions?), plague survey researchers generally and can be especially problematic with respect to surveys on crime. Concerns over the validity of responses (i.e., did respondents answer truthfully?) likewise are not unique to self-report measures of crime.
At the same time, self-report data have unique strengths: For all their problems, self-report measures of crime provide valuable information that is not available through other measures. This is especially true if researchers are interested in etiological issues such as explanatory variables and models (theory testing) for delinquency and crime, circumstances surrounding illegal behavior, age at beginning as well as ceasing involvement in criminal activity, patterns of offending over the life course, and related issues. To maximize the value of self-report data, proper care should be taken to approximate as much as possible the ideal in methods, sampling, and instruments.
Given that victimization data, discussed in the next chapter, are also a form of self-report measure, it is difficult to overstate the importance of self-report methods to our understanding of crime and its correlates.
The Mismeasure of Crime
Mosher, Clayton; Miethe, Terance D.; Hart, Timothy C.
CHAPTER 3
OFFICIAL CRIME DATA
The statistics of crime and criminals are known as the most unreliable and difficult of all statistics. First, the laws which define crimes change. Second, the number of crimes actually committed cannot possibly be enumerated. This is true of many of the major crimes and even more true of the minor crimes. Third, any record of crimes, such as arrests, convictions, or commitments to prison, can be used as an index of crimes committed only on the assumption that this index maintains a constant ratio to the crimes committed. This assumption is a large one, for the recorded crimes are affected by police policies, court policies, and public opinion.
—Sutherland (1947, p. 29)
Official crime data are those that derive from the normal functions of the criminal justice system. These official counts of crime include police reports of offenses and arrests, charges filed by prosecutors, criminal complaints and indictments, imprisonment data, and prison releases.
Although official data come from a number of different sources, both the volume and nature of recorded crime incidents change dramatically through successive stages of criminal justice processing. A funnel analogy is often used to describe how both the number of offenders and the number of criminal offenses decreases significantly as one moves from police statistics to imprisonment data. Of all offenders and offenses known to the police, only a portion is subject to arrest. Only some of those subject to arrest will be prosecuted in courts, and of those, only some will be convicted. An even smaller proportion will be incarcerated. The most inclusive official measure of crime thus involves police reports of criminal incidents.
This chapter examines the nature and scope of police statistics on crime. We begin with a description of the crime reporting procedures in the United States. We then summarize historical trends in crime rates and the characteristics of offenders that derive from police reports and proceed to consider the various problems associated with using police data as a measure of crime. The chapter concludes with a discussion of cross-national data on crime.
UNIFORM CRIME REPORTS IN THE UNITED STATES
As discussed in Chapter 2, prior to 1930, police reports of crime in the United States were not collected or compiled in any systematic way across jurisdictions. Some large cities kept yearly counts of reported crime incidents and persons arrested, whereas other cities did not formally record such information. The classification of crime also varied widely across jurisdictions, with different community standards and legal definitions affecting how crimes were defined and whether particular activities were recorded as crimes in official data. Public tolerance and law enforcement activities toward lynching, abortion, spouse abuse, drug and alcohol use, dueling, and other forms of mutual combat varied widely both within and between southern and northern states. Both comparisons across jurisdictions and estimates of historical trends in crime are extremely hazardous prior to 1930 because of the lack of uniformity in definitions of crime and in the collection of police data on crime incidents.
In developing the Uniform Crime Reporting (UCR) program in the late 1920s, the International Association of Chiefs of Police (IACP) recognized that not all crimes are equally important. They therefore focused on seven types of crime that were prevalent, generally serious in nature, widely identified by victims and witnesses as criminal incidents, and most likely to be reported to the police. The original seven major index crimes, or what are also referred to as Part I offenses, include murder and manslaughter, forcible rape, robbery, aggravated assault, burglary (both commercial and residential), larceny, and motor vehicle theft. The reporting of other offenses (referred to as Part II or nonindex offenses) is not mandatory for police departments that participate in the UCR program. A list of Part I and Part II offenses is presented in Exhibit 3.1.
Although the number of police departments participating in the UCR program increased over time, the program remained essentially unchanged in its content and structure from its inception in 1930 until 1958. During that period, the FBI published crime data according to the size of the jurisdiction and did not provide reports of a national rate of crime because there was insufficient coverage of the entire country. Changes in 1958 included (1) the use of a composite crime index of all Part I offenses in the UCR, (2) the elimination of negligent manslaughter and larceny under $50 as Part I crimes, (3) the removal of statutory rape from UCR counts, and (4) the estimation and publication of crime rates for the entire United States.
Further changes to the UCR program, involving the development of state-level programs to serve as intermediaries between local police departments and the FBI, were implemented in the 1970s. There are currently 47 states with special UCR programs that provide technical assistance within their state and submit data to the federal UCR program. The number of law enforcement agencies reporting to the UCR has almost doubled since the introduction of these state programs.
In 1979, arson was added to the UCR crime index as a Part I offense. This was in response to an apparently growing problem with this crime. In the United States in 1977, arson was reported to account for approximately one quarter of all fires and “perhaps about 750 deaths and possibly many more” (Simpson, 1978). Senator John Glenn (1978) was instrumental in having arson classified in the UCR, noting, “A criminal could steal a car in New York and drive it to New Jersey and his crime would be noted in the FBI charts. But let that same criminal torch a house or business—causing untold property damage and ruined lives—and his crime of arson will never make the charts. That’s a ridiculous situation” (p. 15). Despite protestations of FBI officials who believed it would be difficult to properly classify arson incidents in the UCR (Renshaw, 1990), Glenn’s argument that including arson as a Part I crime would focus national attention on a solution to the problem ultimately held sway.
The most fundamental change in the UCR program in the last three decades involves the movement toward what is known as a national incident-based reporting system (NIBRS), the special features of which will be addressed later in this chapter.
Although participation in the UCR program is voluntary, the proportion of law enforcement agencies that submit data to the program is remarkably high. A total of nearly 18,000 state, county, and city law enforcement agencies, covering more than 288 million inhabitants, submitted crime reports under the UCR system in 2008. A total of 95% of the U.S. population is covered by this data source, with participation rates slightly lower in cities outside metropolitan areas (88%) and in rural areas (90%).
DATA COLLECTION PROCEDURES UNDER THE UNIFORM CRIME REPORTS PROGRAM
Crime data under the UCR program are collected on a monthly basis from participating local law enforcement agencies, and they are typically submitted to a centralized crime records facility within their state UCR program. These completed crime report forms are then returned to the FBI for purposes of compilation, publication, and distribution (FBI, 2004).
A national reporting system such as the UCR that relies on the cooperation of local and state agencies requires the development and establishment of standard operating procedures and uniform practices. Accordingly, the FBI has gone to considerable lengths to standardize these reporting procedures through the provision of training services and data collection manuals to local agencies.
According to the Uniform Crime Reporting Handbook (FBI, 2004), basic minimum standards in several areas are required for agencies providing data for the UCR program. First, a permanent written record is made of each crime immediately upon receipt of a complaint or a call for service. A follow-up system is used to examine whether reports are promptly submitted in all cases. Second, crime reports are checked to see that all offenses submitted in the UCR program conform to the UCR classification of offenses. Third, all records and statistical reports are closely supervised by the agency administrator. Periodic inspections are made to ensure strict compliance with the standard rules and procedures.
CLASSIFYING AND SCORING CRIMINAL OFFENSES IN THE UCR PROGRAM
Two essential components of the UCR data system involve the classifying and scoring of criminal offenses. Classifying crime offenses in the context of the UCR refers to the process of translating offense titles used in particular local and state laws into the standard UCR definitions for Part I and Part II offenses. Depending on the particular classifications used in individual jurisdictions, this conversion process may be more or less ambiguous for certain offenses. Scoring of criminal offenses, in contrast, refers to counting the number of offenses after they have been classified under the UCR typology and entering the total count on the appropriate form. Uniformity in both classifying and scoring criminal offenses across jurisdictions is essential for maintaining the integrity of the UCR.
The Uniform Crime Reporting Handbook (FBI, 2004) provides reporting agencies with detailed definitions and general rules for the classification and scoring of criminal offenses. The classification of offenses into particular UCR categories is based on the facts that underlie an agency’s investigation of the crime. The UCR program distinguishes between crimes against persons (i.e., criminal homicide, forcible rape, and aggravated assault) and crimes against property (i.e., robbery, burglary, larceny-theft, motor vehicle theft, and arson). Under the UCR scoring rules, one offense is counted for each victim in crimes against persons and one offense is counted for each distinct operation in crimes against property. Motor vehicle thefts are an exception to the property-counting rule in that one offense is counted for each stolen vehicle.
Given that UCR definitions of criminal offenses are a crucial element in the standardization of reporting practices, it is important to look more closely at how major criminal offenses are defined and counted under the UCR scheme. As described in the 2004 UCR Reporting Handbook and the methodological appendices for all publications of the Uniform Crime Reports, the Part I offenses are defined as follows:
Criminal homicide involves two subtypes of offenses. Murder and nonnegligent manslaughter are defined as “willful (nonnegligent) killing of one human being by another” (p. 15). The second type of criminal homicide involves manslaughter by negligence, which is defined as “the killing of another person through gross negligence” (p. 18).
Forcible rape is defined as “the carnal knowledge of a female forcibly and against her will” (p. 19). It involves two categories: (a) rape by force and (b) attempts to commit forcible rape. These offenses are restricted to female victims, and they are classified as forcible regardless of the age of the victim. Nonforcible offenses against victims under the age of consent, fondling, and incest are excluded.
Robbery is defined as “the taking or attempt to take anything of value from the care, custody, or control of a person or persons by force or threat of force or violence and/or by putting the victim in fear” (p. 21). Robbery involves a theft or larceny but is aggravated by the element of force or threat of force.
Aggravated assault is defined as an “unlawful attack by one person upon another for the purpose of inflicting severe or aggravated bodily injury” (p. 23). This type of assault is usually accompanied by the use of a weapon or by means likely to produce death or great bodily harm. Simple assaults are excluded.
Burglary is “the unlawful entry of a structure to commit a felony or theft” (p. 27). Attempted forcible entry is included in this category.
Larceny-Theft involves the “unlawful taking, carrying, leading, or riding away of property from the possession or constructive possession of another” (p. 31). Larceny-theft is subclassified into the following categories: (a) pocket picking (i.e., theft from a person by stealth), (b) purse snatching that involves no more force than necessary to snatch the purse from the person’s custody, (c) shoplifting, (d) thefts of articles from motor vehicles, (e) thefts of motor vehicle parts and accessories, (f) thefts of bicycles, (g) thefts from buildings, (h) thefts from coin-operated devices or machines, and (i) all other larceny-theft not specifically classified. Attempted larcenies are included in this category.
Motor vehicle theft is the theft or attempted theft of a self-propelled vehicle that runs on land surface and not on rails. Motorboats, construction equipment, airplanes, and farming equipment are specifically excluded from this category.
Arson involves “any willful or malicious burning or attempt to burn, with or without intent to defraud, a dwelling house, public building, motor vehicle or aircraft, personal property of another, etc.” (p. 37). Fires of suspicious or unknown origin are excluded from the UCR.
Sources of Ambiguity
Coding crimes into these categories can be a complex process. As previously noted, the FBI provides training to local reporting agencies and presents numerous examples in the UCR Reporting Handbook (FBI, 2004) to illustrate the rules for classifying and scoring criminal offenses. However, there are several sources of ambiguity in the definition and coding of even the UCR Part I offenses that call into question the uniformity of reporting practices across jurisdictions. In fact, it is not unreasonable to assume that all Part I offenses are subject to considerable variability in counting and scoring across individual reporting units. The primary sources of variability include differences across local jurisdictions in their interpretation of crime incidents, the hierarchy rule, the diligence of record keeping, and the adequacy of follow-up procedures.
In the specific case of homicide, the main obstacles to uniform reporting and counting involve the follow-up procedures, the timing of police investigations and UCR filing, and definitional ambiguity in the classification of accidental killings and justifiable homicides. For example, recording an aggravated assault as a murder when the victim later dies of the injuries assumes equal diligence and detailed record keeping across reporting agencies in conducting follow-up investigations and correctly adjusting multiple monthly returns. Some less reliable agencies may simply count the aggravated assault and fail to record the subsequent death of the victim as a murder.
An interesting example of the confusion that can be created with respect to the coding of homicides comes from New York City, which, in 2006, saw an apparent increase in homicides from the previous year. However, part of this increase was fueled by an unusual number of deaths that were classified as homicides because the city’s medical examiner determined they were related to crimes that had been committed in earlier years. Of the 25 such reclassified deaths in 2006, 12 were related to injuries that had occurred at least 14 years earlier, including one case of a 72-year-old man who was shot in 1974 and died of pneumonia in April of 2006 (Vasquez, 2006).
Depending on when in the investigative process the UCR incident is filed, a deadly shooting involving two juveniles playing with a gun may be classified as accidental (i.e., manslaughter by negligence) or willful killing (i.e., murder and nonnegligent manslaughter). Similarly, the killing of an individual by a law enforcement officer or private citizen in the course of the commission of a felony by that individual is a justifiable homicide under the UCR, but some local agencies violate UCR procedures and count such incidents as criminal homicides. Such differences in classification are not likely to be identified in the record-checking procedures used by the FBI.
The major source of ambiguity in the definition and classification of forcible rape involves what constitutes “carnal knowledge of a female forcibly and against her will.” Specifically, some jurisdictions may apply the strict definition of carnal knowledge as sexual intercourse (i.e., penile-vaginal intercourse), whereas others may consider a fuller range of sexual acts and offensive touches. Also, when there is no apparent resistance on the part of the victim, some jurisdictions may count the act as consensual and, thereby, not against the woman’s will. Contrary to the instructions provided in the UCR forms, local reporting agencies may also vary in their inclusion of male victims and female offenders in their counts of forcible rape. There is also likely to be considerable variation across local jurisdictions in the inclusion and counting of forcible rapes that occur within the context of marital partners and intimates.
Sources of diversity in the classification of robberies are related to the distinction between strong-arm robberies and types of larceny from the person (e.g., purse snatchings). Under the UCR classifications, a purse snatching is classified as a strong-arm robbery when force or threat of force is used to overcome the active resistance of the victim. This force is also considered more than is necessary to snatch a purse from the grasp of the person. However, is it reasonable to assume that all local law enforcement agencies and, for that matter, individual police officers share the same interpretation of “more than necessary” force? Likewise, if the victim falls to the ground when a bag or purse is yanked from her shoulder, would this offense be classified uniformly as robbery or as larceny-theft? Does the classification change if the victim was pushed rather than falling or stumbling to the ground? In addition, jurisdictional differences are likely in the counting of robberies with multiple victims in the same behavioral incident. The UCR rule is to ignore the number of victims and count “one offense for each distinct operation” (FBI, 2004, p. 10), but can we be certain that this rule is uniformly applied? How is this rule actually applied across jurisdictions in cases of spree robberies that may be interpreted as a continuation of the original incident?
Definitional and classification problems with aggravated assault concern the interpretation of the provision that it is not necessary that physical injury results from an aggravated assault. Threats and assaults in the context of domestic violence are also subject to various interpretations. When assault situations occur in private places with no witnesses besides the victim, the absence of physical injuries makes it especially difficult to ascertain on a consistent basis whether an aggravated threat or attempt with a dangerous weapon actually occurred. The mere brandishing of a dangerous weapon may also be interpreted by some, but not by other local agencies, as an aggravated assault. Domestic assault situations are especially problematic in their classification across jurisdictions. Physical injuries to victims of domestic violence are often treated under state codes as gross or simple misdemeanors rather than felonies such as aggravated assault. Whether a threat with a dangerous weapon was involved (or a weapon was merely brandished) is also difficult to uncover in this particular context. Even under the best conditions of training and definitional clarity, local agencies will vacillate widely in their UCR classification of offenses with threats or no physical injury as aggravated or simple assault.
The major obstacles to uniformity in classifying and scoring the crime of burglary are the demonstration of intent beyond unlawful entry, the inclusion of attempts, the types of persons who qualify as being involved in an unlawful entry, and more general definitional misunderstandings. For example, burglary is a trespass with intent to commit a felony or theft, but how is this intent consistently determined when the alleged burglary is only attempted and not completed? Could the incomplete act be just a trespass, the destruction of property, or a type of vandalism? Does the apprehension of a suspect after breaking a window count as an attempted burglary or simply vandalism? Concerning the difference between lawful and unlawful entry, are acts of theft without forcible entry by previous intimates (e.g., ex-spouses, separated but not divorced parties, ex-roommates) counted as burglaries or larcenies? This determination will vary depending on the interpretation of particular parties as having the necessary legal status to define their behavior as lawful entry.
McCleary, Nienstedt, and Erven (1982) examined some additional problems in the classification of burglary. In an interview with a UCR coding clerk in a particular police department, they were informed that “a burglary has the element of breaking and entering a building. In a lot of cases, the thief breaks through a fence and steals something. That’s not a burglary, but a lot of officers don’t know that” (p. 362) and would still classify such an incident as a burglary.
Another issue with respect to the classification of burglary stems from what is known as the hotel rule. Under this rule, “if a number of units under a single manager are burglarized and the offenses are most likely to be reported to the police by the manager rather than the individual tenants/renters, the burglary should be reported as a single incident” (FBI, 2004, p. 62). Examples include burglaries of a number of hotel rooms or storage units in commercial self-storage buildings. Note that under this rule, even though a number of separate burglaries may have occurred, only one would be recorded in official data.
The major problem with the classification of larceny-theft stems more from the differential likelihood across jurisdictions of reporting particular types of thefts than from definitional ambiguity. Specifically, police underreporting and undercounting of particular thefts, such as shoplifting and stolen motor vehicle parts or accessories, is especially likely when these offenses involve minor financial losses and occur in large metropolitan areas. These frequently occurring offenses, however, may be more accurately reported to the FBI in smaller local areas.
Another, perhaps more obvious, problem with the larceny-theft category is related to the estimate of the dollar value of the item(s) stolen. The dividing line for UCR reporting was $50, and larceny of more than $50 was the index offense that increased the most over the early history of the UCR—an increase of more than 550% between 1933 and 1967. However, because the purchasing power of the dollar in 1967 was only 40% of what it was in 1933, many thefts that would have been under $50 in 1933 were more than $50 in 1967 (President’s Commission on Law Enforcement, 1968).
Differences across local areas in the UCR counting and scoring of motor vehicle theft may derive from the lack of internal consistency in the coding of motor vehicle thefts and thefts of accessories and parts. Although the UCR manuals clearly specify the different categories, it is possible that some agencies may assume that the theft of motor vehicle accessories and parts falls into the category of motor vehicle theft rather than larceny-theft. The theft of boats and bicycles may also be improperly classified as motor vehicle theft by some local jurisdictions.
The differential interpretation of “willful” or “malicious” burnings and how suspicious fires are classified are the major problems associated with the UCR category of arson. The Uniform Crime Reporting Handbook clearly notes that suspicious fires of unknown causes should not be counted as arson. However, local areas are likely to vary widely in their investigative expertise in these crimes and their subsequent reporting of fires as arson.
The FBI has gone to considerable lengths in an attempt to monitor the accuracy of classifying and scoring crimes in the UCR. Starting in 1997, the FBI developed a voluntary Quality Assurance Review (QAR) for the UCR program that assesses the validity of crime statistics through an on-site review of local case reports. The review program also extends to the collection and compilation of crime statistics by the state repositories. Upon completion of the review, the QAR assessment team sends the agency a written evaluation of its performance in reporting methods, submission requirements, and overreporting or underreporting of incidents. Each state’s UCR program is subject to a QAR evaluation at least once every three years to determine the level of compliance with national UCR guidelines (FBI, 2008).
The Hierarchy Rule and Counting Multiple-Offense Incidents
The UCR’s hierarchy rule applies to the classification and scoring of crimes when multiple offenses are committed at the same time by a person or group of persons. When the hierarchy rule is applied in a multiple-offense situation, only the most serious offense in the series is reported, and all others are ignored. For example, if an individual breaks into a house, steals items from the house, kills the owner of the house, and makes a getaway in a stolen car, only the murder will be recorded in official statistics. Similarly, if, during the commission of a robbery, the offender strikes the teller with the butt of a handgun, runs from the bank, and steals an automobile at curbside, it would appear that three Part I offenses (robbery, aggravated assault, and motor vehicle theft) have occurred. However, because robbery is the most serious of the three offenses, only it would be counted; the two other offenses would be ignored (FBI, 2004).
The hierarchy rule, in theory, involves the application of a rather simple two-step process. First, the reporting agency classifies each of the separate offenses and determines which of them are Part I crimes. Second, the ranking of Part I crimes under the UCR system is used to identify the most serious offense, and that offense is recorded in the data. The decision to apply the hierarchy rule becomes more complicated when it is unclear whether there was a separation of time and place between the commissions of several crimes.
The major methodological concern regarding the hierarchy rule is how to determine compliance with it and what adjustments, if any, should be used to correct for potential classification errors. Greater oversight by the state or federal UCR program is an obvious way of determining compliance, but such coding decisions are usually not visible or detectable because the summary counts provided in monthly UCR data do not include the information necessary to make independent judgments of coder reliability. Perhaps ironically, the most direct solution to the problem of selective application of the hierarchy rule is its elimination through the greater utilization of the National Incident-Based Reporting System.
NATIONAL INCIDENT-BASED REPORTING SYSTEM
A recent enhancement to the UCR program is the development of an incident-based reporting system for reporting offenses and arrests, known as the National Incident-Based Reporting System (NIBRS). It is described as “a new approach to measuring crime, one that is simultaneously ambitious, revolutionary, cumbersome, little known, and disappointingly slow to be adopted” (Maxfield, 1999, p. 120). Implementation of the NIBRS program requires (a) a revision of the definitions of certain offenses, (b) the identification of additional significant offenses to be reported, and (c) the development of incident details for all UCR offenses (see FBI, 1997). When fully implemented, it is believed that NIBRS data will be better able to measure the true volume of crime than standard UCR data because the former does not rely on the hierarchy rule and other practices that restrict the counting of crime incidents.
In contrast to the traditional UCR, which uses a summary or aggregate reporting approach, NIBRS categorizes each incident and arrest in one of 22 basic crime categories (see Exhibit 3.2) that span 46 separate offenses. A total of 53 data elements about the victim, property, and offender are collected under NIBRS (see Exhibit 3.3). Both Barnett-Ryan (2007) and Addington (2007) provided fuller descriptions of the NIBRS approach and its conceptual and methodological comparability with the UCR’s traditional summary reporting system.
NIBRS was intended to be implemented as a phase-in program, and it has largely developed at that pace. The FBI has accepted NIBRS data from local agencies since January 1989. During the program’s first 10 years, the FBI certified a total of 19 state-level programs for participation in NIBRS. As of February 2008, 31 states have been certified for NIBRS. Three additional states have individual agencies submitting NIBRS data, and other states remain in the testing or development stage. Five states (Alaska, Florida, Georgia, Nevada, and Wyoming) have no formalized plan to participate in NIBRS (see www.jrsa.org/ibrrc/background-status/nibrs_states.shtml).
Although NIBRS data have been used in federally published reports on crime, it remains too early to determine the overall effectiveness of this alternative UCR program. As of February 2008, only about 25% of the U.S. population is covered by NIBRS reporting. Participation in the program has also been concentrated within small- and medium-sized law enforcement agencies. In fact, none of the current agencies that report NIBRS data cover a population of 1 million residents or greater. The low rate of participation in NIBRS by agencies with populations of greater than 250,000 residents has raised serious questions about the accuracy of national estimates of violent crime that derive from NIBRS data (see Addington, 2007).
Despite its promise in terms of improving the accuracy of crime measurement, several potential problems exist with NIBRS data. Most obvious is the incredible complexity of the coding schemes: The coding specifications are documented in four volumes published by the FBI. As Maxfield (1999) suggested, few police officials are researchers, and diligence in paperwork is not among the skills most valued by police officers. As a result, missing data may become an even greater problem under the NIBRS because of the larger number of categories for which data are collected and the complexity of definitions within each of these categories. Furthermore, as Roberts (1997) noted, the incentives for law enforcement agencies to participate in NIBRS data collection are few. These agencies may feel that NIBRS data are of far more value to researchers than to themselves, and there is concern that the detailed, incident-level reporting required for NIBRS will require police officers to spend additional time filling out reports instead of responding to the needs of the public. A widespread perception also exists that NIBRS participation will result in an increase in reported crime because the UCR’s hierarchy rule will be eliminated. This presents a potential public-relations disaster for law enforcement agencies that are, to at least some extent, evaluated on the basis of crime rates in their jurisdiction.
OFFICIAL CRIME TRENDS AND PATTERNS BASED ON UNIFORM CRIME REPORTS
One of the primary purposes for the establishment of uniform crime reporting practices across jurisdictions was to provide a national barometer of crime and its distribution. The methods of classifying and counting offenses have remained relatively stable over time, allowing for estimation of national crime trends. Aggregate characteristics of particular types of offenses and some demographic characteristics of arrested persons are also presented in these national statistics.
Based on UCR data, the crime rate in the United States has vacillated over time and exhibits some variation by type of crime. Participation in the UCR program was sufficient to estimate national crime trends beginning in the 1960s. Starting then, the total crime rate per 100,000 inhabitants increased steadily until the mid-1970s, then decreased somewhat, and then peaked again in the early 1980s. It generally rose steadily from the mid-1980s to the early 1990s and has generally dropped since that time (see Exhibit 3.4).
Although the number of reported crimes exceeded 11.1 million in 2008, the crime rate of 3,667 per 100,000 is at the lowest point since 1968; the crime rate has declined by about 14% over the last 10 years (1999 to 2008). Declining crime rates are found in each region of the country. Southern and western states have continued to experience the highest rates of reported crime, and lower rates are found in the northeast and Midwest.
NOTE: Violent crimes include murder and nonnegligent manslaughter, forcible rape, robbery, and aggravated assault. Property offenses include burglary, larceny-theft, and motor vehicle theft. Arson is not included in the data presented.
The FBI’s Crime in the United States, 2008 (FBI, 2009) indicated that violent crime (i.e., murder, rape, robbery, and aggravated assault) accounted for about 12% of the total Part I offenses reported to law enforcement, whereas the remaining 88% were property crimes. This ratio of violent to property crimes in national data has been quite stable over time. Throughout the history of UCR reporting, larceny-theft represents, by far, the most common offense in these national data, whereas murders are the least common offense. Aggravated assaults account for about 60% of all violent crimes.
Homicide
Among all violent crimes, the most comprehensive police data are collected on murder and manslaughter. This is the case because (a) as the most serious UCR offense, this crime is never undercounted by the hierarchy rule; (b) murder has the highest clearance rate of all index crimes (i.e., 64% of murders known to the police in 2008 were cleared or “solved” by an arrest); and (c) additional police data are collected on each homicide through the Supplementary Homicide Reports (SHR).
Both the number of homicides and the rate per 100,000 population have followed a pattern similar to the trend for all Part I offenses combined. Homicide rates increased throughout the 1960s until the mid-1970s, dropped somewhat in the late 1970s before peaking in 1980, stayed relatively high until the early 1990s, and have decreased steadily since that time. In 2008, 16,272 homicides were known to the police, representing a 5% decline from 1999. The homicide rate of 5.4 per 100,000 population in 2008 was the lowest recorded in the United States since the mid-1960s.
Homicide rates based on UCR data vary across geographical areas. The homicide rate in southern states (6.6 per 100,000 population) is higher than in any other region of the country. However, each region has experienced a declining homicide rate over the last five years. Cities within large metropolitan areas had a 2008 murder rate of 6 per 100,000, compared to rates of about 3 per 100,000 for non-metropolitan areas. Homicide rates in particular U.S. cities over time, however, exhibit fairly unique patterns. Some cities have homicide rates that have fluctuated considerably between 1960 and 2000 (e.g., Houston), some cities have stable rates over this period (e.g., Baltimore, Phoenix, Seattle), and others have experienced general increases with dramatic upward swings in a particular decade (e.g., Detroit, New Orleans, Washington, D.C.).
Exhibit 3.5 reveals considerable variation in homicide rates across major U.S. cities in 2008, from a low of 4.8 per 100,000 in Seattle to a high of 63.6 in New Orleans.
Analysis of the FBI’s Supplementary Homicide Reports (SHR) for 2008 indicates several dominant patterns in the characteristics of homicide victims and offenders (see Exhibit 3.6). More than three fourths of homicide victims are males, and nearly 9 out of every 10 victims are aged 18 years or older. Almost half of all homicide victims are black. Black males have the greatest risk of being homicide victims of all sex-race combinations. Concerning offender characteristics, approximately 90% of homicides for which complete information was available involved male offenders, and the vast majority of homicide offenders (91%) are persons aged 18 years or older. In 2008, about half of all homicide offenders were black, and the clear majority of homicides (87%) were intraracial killings. Males are most often murdered by male offenders (91%), and about 90% of female homicide victims are killed by males.
Based on police reports of known offenses, homicides also exhibit wide variation in their offense characteristics and situational contexts (see Exhibit 3.7). The majority of victims know their assailants, and most of these incidents involve killings by acquaintances or friends (47%) or a family member or intimate partner (31%). The killer is a stranger in approximately one in five murders (22%) in which information about the victim-offender relationship is known. Arguments and disputes are the most prevalent circumstances under which homicides take place (56%), and a sizable minority (31%) of killings occur in the course of the commission of another felony offense (especially robberies). A firearm was the most common lethal weapon used in homicide incidents; 67% of homicides in the SHR data involved the use of a firearm, whereas approximately 13% involved knives or other cutting instruments. The proportion of homicides involving the use of firearms has changed very little over the last 30 years.
Forcible Rape
Based on UCR data, rape rates in the United States increased steadily and more than tripled between the early 1960s and the early 1980s, remained high and fairly stable across the 1980s, and have generally decreased since that time. An estimated 89,000 rapes were known to the police in 2008, representing a rate of about 29.3 per 100,000 inhabitants.
Rape rates also vary by location. States in the northeast have considerably lower rape rates than other regions of the country. Forcible rape rates in metropolitan areas are far higher than those found in non-metropolitan areas.
Several other factors are associated with rape in the UCR data. Most rapes known to the police involve completed offenses by force (92%), whereas the remaining cases involve attempts. According to UCR arrest data, about 43% of those arrested for forcible rape in 2008 were under the age of 25, and about one third of arrestees for this crime were black.
Robbery
Robbery rates in the United States have vacillated widely over the last 40 years. These rates more than quadrupled from 1960 to 1980, then dropped in the early 1980s, rose dramatically in the late 1980s until 1991, decreased appreciably until the year 2000, and have remained relatively stable since that time. More than 440,000 robberies were known to the police in 2008, representing a rate of 145 per 100,000 inhabitants. Comparative UCR data indicate that the estimated number of robberies in the United States has decreased by approximately 36% from 1991 to 2008.
Similar to other violent crimes, robbery incidents vary by geographical location (see Exhibit 3.8). Southern states have the highest robbery rates of all regions, and the Midwest has the lowest rate. Although nearly half of all robberies in the United States are street muggings, a far higher proportion than the national average for muggings is found in the northeast. Convenience store robberies account for the highest proportion of robberies in the South. Although the general public most likely envisions bank robberies when thinking about this crime, bank robberies account for only about 2% of all robberies, and they are rare in all geographical areas. Robbery rates are highest in the largest metropolitan areas, and street muggings account for a large proportion of robberies in such jurisdictions compared to other areas.
Several other characteristics of robbery are revealed in UCR data. For example, the average monetary loss from a robbery is approximately $1,300, which ranges from $712 taken in robberies of convenience stores to $4,854 for the average bank robbery. With respect to the type of weapon used in robberies, firearms (44%) are the most common, followed closely by strong-arm tactics (40%). Males accounted for about 9 out of every 10 robbery arrestees, and nearly two thirds of persons arrested for this crime were under 25 years of age. More than half of arrested robbers are black.
Compared to the UCR data from the early 1970s, there has been both change and stability in the factors associated with robbery over time. Robbery rates across this time frame have remained higher in major metropolitan areas than other locations, and similar proportions of robberies are found to involve strong-arm tactics over time. Based on arrest data, a similar proportion of robberies across time periods involve males, but the prevalence of robbery arrests for persons under 25 years old has decreased somewhat over time (76% in 1972 vs. 66% in 2008).
Aggravated Assault
Aggravated assault rates increased in almost every year from 1960 to the early 1990s, before decreasing rather steadily over the next 15 years. More than 830,000 aggravated assaults were estimated from UCR data in 2008. The estimated rate of 275 aggravated assaults per 100,000 population is the lowest recorded since 1978. Southern and western states have the highest rates for this offense, and rates of aggravated assault are more than twice as high in large metropolitan areas than in non-metropolitan counties.
Concerning offense and offender characteristics, the most common weapons used in aggravated assaults are blunt objects (34%) and personal weapons such as hands, fists, and feet (26%). Knives or cutting instruments (19%) and firearms (21%) are the other types of weapons used in these assaults. Males accounted for about 79% of those arrested for aggravated assaults in 2008, and approximately 40% of these arrestees were under the age of 25. Although the majority of aggravated assault arrestees were white (63%), black offenders represented a higher proportion of persons arrested for aggravated assault than their distribution in the U.S. population (34% of arrestees vs. 12% of the population).
Property Crime
Property crimes account for about 88% of the Part I offenses known to the police. As a group, property crime rates increased dramatically between the 1960s and early 1980s, vacillated up and down for the next 10 years, and then exhibited a steady decline after 1991. More than 9.7 million of these offenses were known to the police in 2008. The property crime rate of 3,212 per 100,000 in 2008 represented the lowest rate since 1972. Both property crime rates and incidents are highest in southern states and lowest in the northeast (see Exhibit 3.9). These rates and incidents are also far higher in large metropolitan areas than other locations.
Burglary rates in the United States more than doubled between 1960 and 1980 and have generally declined since the early 1980s. More than 2.2 million burglaries were known to the police in 2008. The burglary rate in southern states is more than double the rate in northeastern states (941 vs. 430 per 100,000). Metropolitan areas also have far higher rates and counts of burglary than non-metropolitan areas. The majority of burglaries involve forcible entry (61%), and residential break-ins account for over two thirds of all burglaries (70%). The majority of residential burglaries occur during the daytime hours, whereas the majority of nonresidential burglaries happen at night. Males, persons under 25 years of age, and blacks are overrepresented among burglary arrestees.
Larceny-thefts are the most common crime in the UCR data, involving an estimated 6,957,412 offenses in 2008. Rates of larceny increased throughout the 1960s and 1970s, peaked and remained high from the 1980s through the mid-1990s, and have dropped by about 30% since that time. Southern and northeastern states have the highest and lowest larceny rates, respectively. Larceny-theft rates are higher in cities within and outside metropolitan areas than in nonmetropolitan counties.
The average value of property loss due to larceny in 2008 was $925. The average take from pocket picking was $563, and losses from purse snatchings were approximately $427, compared to $196 for shoplifting and $1,540 for thefts from buildings. Thefts from motor vehicles are the most common type of larceny-theft, accounting for about 26% of these crimes. Only about 1% of larceny-thefts involve either pocket picking or purse snatching. More than half of the arrestees for this offense are under 25 years old, and about 29% are black. Females were arrested for this offense more often than for any other Part I offense, comprising 41% of larceny-theft arrestees.
Motor vehicle theft rates based on UCR data increased throughout the 1960s, hovered between 400 and 500 per 100,000 from the early 1970s to the mid-1980s, and, similar to the trend for other crimes, have decreased since the 1990s. Nearly 1 million auto thefts were known to the police in 2008, and the theft rate was substantially higher in western states than in other regions. Large urban areas have offense rates that are far higher than those of smaller cities and non-metropolitan counties. The average value of a stolen vehicle was approximately $6,751. Seventy-two percent of stolen vehicles were automobiles, 18% were trucks or buses, and the remainder were other types of vehicles. Arrestees for motor vehicle theft are disproportionately male (83%), under 18 years old (25%), and black (38%).
Although only about 80% of the U.S. population is covered in UCR estimates for arson for 2008, several patterns are revealed in these data. First, the overall arson rate is estimated to be 24 per 100,000 population in 2008. These estimated rates are far higher in cities with a population greater than 250,000 than in smaller locations. Second, the largest share of arsons known to the police involve physical structures (43%), followed by mobile property (29%) and other types of property (28%). Third, the average dollar loss per arson offense was about $16,000; industrial and manufacturing structures had the highest average losses, at $212,000 per offense. Fourth, arrested arsonists are disproportionately male (84%), juveniles under 18 years old (47%), and black (22%).
Hate Crimes
Hate crimes were added to the UCR in 1990 with the passage of the Hate Crime Statistics Act. Hate crimes, also known as bias crimes, are defined as offenses committed against a person, property, or society that are motivated, in whole or in part, by the offender’s bias against a race, religion, sexual orientation, ethnicity/national origin, or disability (FBI, 2008). The agencies that participated in the FBI’s Hate Crime Statistics Program in 2008 covered about 89% of the U.S. population within 49 states and the District of Columbia.
Based on this UCR program for Hate Crime Statistics, a total of 7,783 hate crime incidents involving 9,168 offenses were reported by agencies in 2008. Nearly all of these incidents involved a single type of bias. The most common motivation reported in these incidents was bias based on race (51%), followed by religion (20%), sexual orientation (17%), ethnicity or national origin (12%), and disability (1%). Most of the offenses in racial-bias hate crime incidents were motivated by anti-black bias (73%), and the next largest group within these incidents involved anti-white bias (17%). The majority of hate crimes based on religion were motivated by anti-Jewish bias (66%), whereas anti-male homosexual bias (59%) was the primary motivation for hate crimes directed at sexual orientation. Across all types of hate-crime bias, the majority of offenses involved crimes against persons (60%) rather than their property (39%). The criminal acts of intimidation (49%), simple assault (32%), and aggravated assault (19%) were the most common types of crimes against persons in these hate-motivated incidents.
Although hate crime statistics have been used and improved over the last two decades, there remains considerable doubt about their reliability for a variety of reasons. For example, the FBI emphasizes that the presence of bias alone is insufficient for determining that a crime is actually a hate crime, and it cautions that a criminal incident should be reported as a hate crime only upon sufficient evidence from law enforcement agencies. Unfortunately, this principle does not guarantee that uniform coding is used across participating jurisdictions. For example, in the FBI's 2006 report of hate crimes, one city (Washington, D.C., with 64) reported more hate crimes than at least 10 entire states, and the southern states of Alabama and Mississippi, both with long histories of racial tension, reported one and zero crimes, respectively (Fears, 2007; see also http://bjs.ojp.usdoj.gov/content/pub/pdf/hcrvp). In addition, about 84% of the agencies that participate in the UCR hate crime reporting program reported no such crimes in 2008. This concentration of hate crimes within a relatively small number of agencies is attributable to two possible explanations: (1) hate crimes are highly concentrated within particular jurisdictions across the country, or (2) classifications of hate crimes are unreliable and selectively applied across jurisdictions. Given that both of these explanations are plausible, estimates of the prevalence and nature of hate crime in the United States based on these data must be viewed with considerable caution.
Clearance Rates
One measure of the effectiveness of local law enforcement agencies in apprehending criminals is the clearance rate. Crimes are cleared by either an arrest of a suspect or by exceptional means when some element beyond the control of law enforcement (e.g., the death of a suspect, international flight) precludes them from making formal charges against the offender (FBI, 2008). As is often illustrated by the arrest of serial killers, the arrest of one person may clear several crimes. Alternatively, several people may be arrested in the clearance of one crime. Clearance rates represent the proportion of crimes known to the police that lead to arrest.
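To make the calculation concrete, the following brief Python sketch (with hypothetical figures, not UCR data) computes a clearance rate from offenses cleared by arrest or by exceptional means:

    # Clearance rate: offenses cleared (by arrest or exceptional means)
    # as a proportion of offenses known to the police.
    # All figures below are hypothetical and for illustration only.
    def clearance_rate(cleared_by_arrest, cleared_exceptionally, offenses_known):
        return (cleared_by_arrest + cleared_exceptionally) / offenses_known

    # A jurisdiction with 200 offenses known to the police, 80 cleared by
    # arrest, and 10 cleared exceptionally has a 45% clearance rate:
    print(clearance_rate(80, 10, 200))  # 0.45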
Clearance rates as reported in UCR data have varied over time, by region of the country, and across the various types of Part I offenses (see Exhibit 3.10). According to the most recent UCR data, only 21% of the Part I offenses were cleared in 2008. Clearance rates were higher for violent crimes (45%) than for property crimes (17%). Among the types of violent crimes, murders had the highest clearance rate (64%), and the lowest rate was for robbery (27%). Larceny-theft had the highest clearance rate among property crimes (20%), and the lowest rate was for motor vehicle theft (12%). The northeast had the highest clearance rates of all regions of the country for both violent and property offenses.
When examined over time, clearance rates for some crimes have decreased more rapidly than others. There has been more than a 25-percentage-point decline in clearance rates between 1960 and 2008 for murder and forcible rape in the United States. Wellford and Cronin (2000) noted that a number of factors affect clearance rates for homicide. For example, the probability of clearance increases significantly when the first police officer on the scene quickly notifies the homicide unit, medical examiners, and the crime laboratory and attempts to immediately locate witnesses and secure the area. The greater difficulty in identifying offenders because of the rise in the number of violent crimes involving strangers over the last two decades is another explanation for the declining clearance rates for these crimes. Smaller decreases over time are found in clearance rates for aggravated assault, burglary, motor vehicle theft, and robbery. The clearance rate for larceny-theft has remained at 20% across this time period (see Exhibit 3.10).
From the perspective of solving crime and delivering justice to offenders, the decline in clearance rates over time is troubling because it suggests that crime control efforts that focus on the punishment of offenders are increasingly ineffective, since a growing number of offenders are never subject to arrest. This situation is even more dire when one considers that there is considerable pressure on police departments to inflate their clearance rates for various political reasons. This inflation and distortion of clearance rates has long been recognized by critics of official estimates of crime (Kitsuse & Cicourel, 1963), but the actual nature and magnitude of these distortions in current police practices is largely unknown. Thus, a reasonable conclusion is that clearance rates provide, at best, only a gross representation of the potential solvability of crimes known to the police. Given differences in police recording practices and the nature of crime across different geographical areas, attempts to assess the effectiveness of various police departments by comparing their clearance rates for particular offenses are questionable and have little or no scientific utility.
PROBLEMS WITH POLICE DATA ON CRIME
Police reports are often considered to be the best official measure of the nature and extent of crime. Compared to prosecutorial, judicial, and correctional data, police reports are more comprehensive in their coverage of types of criminal offenses and include information on criminal incidents even when the offender has not been identified. However, as a measure of the true extent of crime in a jurisdiction, police statistics are inadequate for several fundamental reasons. The major problems with police data involve variation in citizen reporting and police recording practices, possible race and social class biases in the structure of policing, limited coverage of crime types under UCR data, conceptual and methodological factors that affect the classification of crime incidents and estimates of national crime rates, and political manipulation and fabrication of these data by police departments and other reporting agencies.
Variation in Citizen Reporting and Police Recording Practices
As discussed earlier, the term dark figures has been widely used by criminologists to represent the gap between the true extent of crime and the amount of crime known to the police. The primary sources of this gap are the inability of police to observe all criminal activity, the reluctance of crime victims and witnesses to report crime to the police, and variation in the recording of known crime incidents due to police discretion.
Contrary to the image portrayed in crime dramas and media depictions of police work, the vast majority of crime becomes known to the police through citizen complaints or calls for service. In other words, police mobilization toward crime and its detection largely results from citizen complaints. If a member of the public fails to contact the police about a criminal incident they have experienced or witnessed, it will remain undetected in most cases. The magnitude of unreported crime vastly exceeds that of crime reported to the police.
The reasons victims and other citizens do not report most crimes to the police are wide and varied (see Hart & Rennison, 2003). Some victims lack trust in the police or have severe reservations about the ability of law enforcement officials to solve crimes. Some fear retaliation and reprisals from offenders for reporting crimes; others think it is not worth their while to report offenses because, for example, the property is uninsured and probably will not be recovered. The victims in some crime situations may also be involved in criminal activities themselves (e.g., drug sellers or prostitutes who are victims of robbery), which decreases their likelihood of reporting. Others may believe the incident was a private matter, that nothing could be done, or that it was not important enough. Public apathy and the desire to not get involved may underlie some witnesses’ reluctance to report offenses they observe. Regardless of the particular reasons for underreporting of crime by citizens, this reporting gap raises serious questions about the accuracy of police data as a valid measure of the prevalence of crime.
Even if a crime incident is reported by citizens or directly observed by the police, there is no guarantee that such an offense will be recorded in police data. In fact, police discretion both across and within jurisdictions in recording an incident as a crime is a major source of inconsistency in official counts of crime. In this context, the role of the police dispatcher can be crucial. Pepinsky (1976) found that the decision of patrol officers about whether to report offenses was determined by the nature of the calls they received from the dispatcher. Apparently, if the dispatcher named no offense in the call or dispatched the officer to check a victimless or attempted offense, the chances were practically zero that the officer would report an offense.
In his classic study of police-citizen encounters, Donald Black (1970) identified the following factors that determine whether an incident reported by citizens is formally recorded as a crime by the police:
Legal Seriousness of the Crime
Police are more likely to write up a crime report when the crime is more serious. Approximately 72% of the felonies but only 53% of the misdemeanors in Black’s study were written up as reports. This means that the police officially disregarded about one fourth of the felonies they handled.
The Complainant’s Preferences
When called to a crime scene, police often follow the wishes of the complainant. They almost always agree with the complainant’s preference for informal action (as opposed to arrest) in minor cases. When the complainant requested official police action, the police complied in the majority of both felony (84%) and misdemeanor (64%) situations.
The Relational Distance
Police are more likely to file an official report in cases involving strangers rather than friends or family members. Black (1970) asserted that the victim-offender relationship is more important than the legal seriousness of the crime in terms of whether an incident is officially recorded.
The Complainant’s Deference
The more deference or respect shown to the police by the complainant, the more likely it is that the police will file an official crime report. This pattern was found for both felony and misdemeanor situations.
The Complainant’s Status
Police are more likely to file an official report when the complainant is of higher social status. The effect of the race of the complainant on recording practices in Black’s study was unclear.
Differences in citizen reporting and police recording practices are also likely to vary by region of the country and by rural and urban jurisdictions. It is for these reasons that statistics on crime incidents are highly suspect for comparisons across jurisdictions.
Race and Social Class Biases in Policing
There is considerable evidence of racial and social class biases in street-level policing, which dates back to the earliest studies of police in the United States (see, e.g., Chicago Commission on Race Relations, 1922; Myrdal, 1944; Sellin, 1928). Irwin (1985) argued that a tendency on the part of police to characterize lower-class persons and blacks as disreputable and dangerous may lead them to watch and arrest such individuals more frequently than is warranted on the basis of their actual criminal involvement. Although focused more explicitly on socioeconomic status as opposed to race, Sampson (1986) provided further evidence of police bias in arrest decisions. In a study examining the police processing of juveniles in the Seattle, Washington area, Sampson found that for the bulk of offenses committed by juveniles, official police records and referrals to court were structured not simply by the act itself but by the socioeconomic and situational contexts of such acts. In addition, law enforcement officials apparently perceived lower-class neighborhoods as being characterized by a disproportionate amount of criminal behavior and accordingly concentrated their patrol resources in those “offensible space” areas (Hagan, 1994). As Smith (1986) suggested,
Based on a set of internalized expectations derived from past experience, police divide the population and physical territory they must patrol into readily understandable categories. The result is a process of ecological contamination in which all persons encountered in bad neighborhoods are viewed as possessing the moral liability of the area itself. (p. 316)
Under these conditions, it is possible that at least some of the difference between minority and white crime rates is the product of a differential police focus on minority groups (Mosher, 2001).
Limited Coverage of Different Crime Types
Police statistics on crime such as those developed under the UCR system are restricted to only a small class of criminal offenses. Most of these crimes involve street-level offenses that occur among individuals. UCR data do not measure federal crimes or political crimes, and they severely undercount organizational and occupational crime. Corporate crimes such as price-fixing and environmental pollution are simply not covered by these data, and occupational crimes such as thefts and frauds by employees are underrepresented in UCR data. Beirne and Messerschmidt (2000) contended that there are at least three reasons why the FBI focuses on crimes committed by the powerless: (1) the FBI recognizes the fact that crimes typically or exclusively committed by the powerful are difficult to detect, often covered up, and seldom reported to the police; (2) the FBI is insensitive to the plight of the powerless; and (3) the FBI is politically biased in favor of the powerful.
Conceptual and Methodological Problems
Police data on crime in the United States are also problematic as valid and reliable measures of crime prevalence because of several conceptual and methodological problems. As described in detail earlier in this chapter, the major conceptual problems involve the definition of certain crimes under the UCR and the classification of a particular offense under one of the included crime categories. Even with extensive coding and classification rules, counting and scoring decisions in practice are subject to multiple interpretations and potentially large inconsistencies both within and across jurisdictions. Basic methodological problems involve estimating population figures in order to calculate crime rates in noncensus years, sampling error, imputation and estimation procedures, and the application of the hierarchy rule and other conventions in cases of multiple crime incidents.
Estimating Population Figures to Calculate Crime Rates
The UCR calculates crime rates per 100,000 population; however, the most accurate counts of population are available only for census years (i.e., 1980, 1990, 2000, 2010, etc.). In noncensus years, estimates of the population are used to calculate crime rates, and if these estimates are inaccurate, the calculated crime rates will be similarly inaccurate. For example, Bell (1967) noted that 1949 crime rates for California, which were based on 1940 population figures, were grossly inflated because the state's population increased by more than 3 million people over the decade. When the crime rate "automatically dropped … [in 1950] it was not due to sunspots or some other cyclical theory, but to a simple statistical pitfall" (p. 153).
Another example of estimation problems involves crime statistics for Illinois in 1999. In particular, the Illinois State Police report for 1999 underestimated the population of Chicago by tens of thousands of residents, which produced an inflated crime rate for the city. An Illinois sheriff whose county’s crime rate was overstated claimed, “Hell, they never could add. You get those fellows off of chipped roads and they get confused” (as quoted in Berens & Lighty, 2001).
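The statistical pitfall Bell described is simple to reproduce. The sketch below uses round, hypothetical figures loosely patterned on the California example (the offense counts are invented for illustration) to show how a stale population denominator inflates the rate per 100,000:

    # Crime rate per 100,000 = offenses / population * 100,000.
    # If the population estimate is years out of date in a fast-growing
    # state, the computed rate is inflated. All figures are hypothetical.
    def rate_per_100k(offenses, population):
        return offenses / population * 100_000

    offenses_1949 = 70_000
    pop_1940 = 7_000_000    # stale census figure used as the denominator
    pop_1949 = 10_000_000   # actual population after a decade of growth

    print(rate_per_100k(offenses_1949, pop_1940))  # 1000.0 (inflated)
    print(rate_per_100k(offenses_1949, pop_1949))  # 700.0 (accurate)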
Sampling Error and Participation Rates
Participation in the UCR is voluntary, and police departments are not under any legal obligation to report their crime data to the FBI. Nonetheless, the share of the U.S. population covered by the UCR program has remained high since the late 1950s. For example, the national coverage rate was 93% in 1972, 95% in 1999, and 95% in 2008. Active participation in the UCR program is highest among law enforcement agencies in large metropolitan areas and lowest in cities outside metropolitan areas.
Sampling error is a problem in any research when sample data are used to estimate and represent population values. Two general sources of sampling error and possible sampling bias are found in the UCR system. First, not all police agencies in the United States report crime data to the UCR program. If there are differences in the crime experiences of reporting and nonreporting agencies (as is suggested by the differences in crime rates and participation rates by urban and rural areas), this sampling error is actually sampling bias that may distort population estimates. Second, agencies that are defined as participating may not be providing complete crime data to the FBI. In fact, data from six states were excluded in the 1997 UCR because of erratic or nonreporting behavior. A study of reporting behavior covering the years 1992 to 1994 revealed that only 64% of law enforcement agencies reported crime for the entire 36 months; 17% were classified as partial reporting (i.e., 1 to 35 months of data) and 19% provided no reports (Maltz, 1999). Given these conditions of incomplete reporting on the part of law enforcement agencies, claims that UCR data represent more than 90% of the U.S. population are misleading.
Incomplete reporting under the UCR program is due to a wide variety of reasons. Some of these include (a) natural disasters that prevent state agencies from submitting their data on time, or at all; (b) budgetary restrictions on police and the cutback on services; (c) changes in the personnel who prepare local UCR data and their replacement with persons with less training, experience, or commitment to the program; (d) new reporting systems or computerization of old systems that may cause delays or gaps in the crime reporting process; (e) small agencies with little crime that may feel it is unnecessary to file monthly reports; and (f) incompatibility in state and UCR definitions, resulting in data being submitted by states but not accepted by the FBI (Maltz, 1999). Whatever the reason, incomplete reporting and nonreporting have obvious implications for the estimation of national trends in crime.
Problems with Imputation and Estimation
Problems related to sampling error and potential sampling bias are compounded when estimating arrest trends and the profile of persons arrested for crimes. As noted earlier, clearance rates vary widely according to the type of crime, at about 45% for violent crimes but only about 17% for property crimes in 2008. Given that the majority of offenders are not counted in arrest data, inferences about the typical profile of particular types of offenders from UCR arrest data also represent a type of sampling bias, because some offenders (e.g., nonstrangers) are more easily identified by victims, and subsequently arrested, than others. Another problem with developing offender profiles from UCR arrest data is that arrests reflect differential police priorities and enforcement practices, further contributing to the likelihood of qualitative differences between those arrested and those not arrested for even the same type of offense.
Since 1958, the FBI has used two different methods of imputing crime data for police agencies that have incomplete data or that do not provide reports at all (see Barnett-Ryan, 2007). If a particular agency reports for three or more months in a given year, the total annual crime for the jurisdiction is estimated by multiplying the reported number of crimes by 12 and dividing by the number of months reported (Maltz, 1999). This procedure implicitly assumes that the crime rate for nonreporting months is the same as for reporting months, which is a rather dubious assumption—especially given that research has demonstrated that property crimes generally peak in the fall and winter months and violent crimes peak in the summer months (Baumer & Wright, 1996; Hird & Ruparel, 2007). If, on the other hand, an agency reports for fewer than three months, the number of crimes in that jurisdiction is essentially estimated from scratch. Such agencies are considered to be nonreporting agencies, and the FBI estimates data for these jurisdictions based on crime rates for the same year for similar agencies. These similar agencies are defined as those in the same population size category in the same state that provide 12 months of data. If there are no comparable agencies in the state, the estimate is based on rates of crime in the jurisdiction's region.
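A minimal sketch of this imputation rule, as described by Maltz (1999), is shown below; the agency and its monthly counts are hypothetical:

    # UCR imputation for agencies reporting 3 or more months of data:
    # annual estimate = reported crimes * 12 / months reported.
    # The rule assumes unreported months resemble reported ones, a dubious
    # assumption for offenses with strong seasonal patterns.
    def impute_annual_total(reported_crimes, months_reported):
        if months_reported < 3:
            raise ValueError("With under 3 months, the FBI instead borrows "
                             "rates from 'similar' reporting agencies.")
        return reported_crimes * 12 / months_reported

    # A hypothetical agency reporting 450 crimes over 9 months is credited
    # with an annual total of 600:
    print(impute_annual_total(450, 9))  # 600.0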
Unfortunately, if the nonreporting agency is different from the “comparable” reporting agencies on crime-related correlates other than geographical location and size (e.g., income distribution, unemployment rates, population density, racial composition), the assignment of equal proportions of crime in each jurisdiction will distort the accuracy of these estimates. The fact that no two cities are alike in their economic opportunity, physical structure, history, and culture raises questions about this estimation approach. And although it is possible that inaccuracies in crime data that result from such estimation procedures may not be significant, the real problem is that there is currently no way of determining whether the estimation procedures produce major or minor discrepancies in crime data. As Maltz (1999) pointed out, such imputation can be especially problematic for crimes that vary according to season.
Alternative imputation methods have also been used with UCR data. For example, the process of conversion to the NIBRS program required the estimation of totals for some entire states. Unique estimation procedures are also required when yearly data for a particular jurisdiction are incomplete and in other situations (e.g., the inability of some state UCR programs to provide forcible rape figures in accordance with UCR guidelines). For these problems, the UCR program has used known data from other geographical areas in the same time period, regional data from the United States for that year (e.g., mountain states, west north-central division), or state totals from previous years to derive population estimates. Such extrapolations, however, are accurate only if trends in other jurisdictions or the same jurisdiction in previous years are representative of crime experiences in the nonreporting areas.
Although it is often overlooked by UCR-data users, the UCR program has relied extensively on extrapolations from other jurisdictions or other time frames for estimating national crime trends. The most recent UCR report (FBI, 2008) provides the following examples of major nonreporting and estimation practices over the last 10 years.
Several states over various years did not report valid UCR Part I offense counts. Crime trends from previous years or from other states in their geographic division were used to calculate estimates of current trends. These estimation procedures have been used in Kansas (1998–2000), Kentucky (1998–2003), Illinois (1998–2008), Maine (1999), Montana (1998–2000), New Hampshire (1998–1999), and Wisconsin (1998).
Some State UCR programs did not provide forcible rape figures in accordance with UCR guidelines. These states included Illinois (1998–2008) and Minnesota (2005–2008). Forcible rape totals were estimated for these states using national rates within eight population groups (e.g., cities with more than 250,000 population, cities with 100,000 to 249,999 population, suburban counties, rural counties) and then assigning counts of forcible rape proportional to each state’s distribution in these population groups.
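A rough sketch of this proportional-allocation procedure appears below. Only two of the eight population groups are shown, and the group labels, rates, and population figures are all hypothetical:

    # Apply national forcible rape rates within population groups to the
    # nonreporting state's population in each group, then sum the results.
    # Group labels and all figures are hypothetical.
    national_rate_per_100k = {"cities_over_250k": 40.0, "rural_counties": 15.0}
    state_pop_in_group = {"cities_over_250k": 1_500_000, "rural_counties": 800_000}

    estimated_offenses = sum(
        national_rate_per_100k[g] / 100_000 * state_pop_in_group[g]
        for g in national_rate_per_100k
    )
    print(round(estimated_offenses))  # 720 offenses imputed to the state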
From these examples of imputation of UCR data, it is clear that cross-jurisdictional and over-time comparisons must be made with considerable caution.
Political Manipulation and Fabrication
An additional limitation of official crime statistics involves their manipulation and fabrication for political purposes. For better or worse, police departments are evaluated to some extent on the basis of the volume of crime in their jurisdiction. The mass media, city and county commissions, local chambers of commerce that promote tourism in their “safe” city, elections for incumbent police chiefs and sheriffs, and the general public are sources of considerable pressure on police departments to provide a positive spin on the effectiveness of their crime fighting activities. Although Chambliss (1984) suggested that “other things being equal, it is in the interests of the police to prove an increase in crime [because] higher crime rates … mean increased budgets” (p. 176), the image of a rising crime rate is not generally good news for local businesses and police departments that are held accountable for these crime trends. Favorable crime statistics apparently make everyone happy. In the early 1970s, for example, several large police departments in the United States downgraded their crime rates “to create the illusion that the country is a safer place to walk at night because President Nixon’s anti-crime measures are working” (Justice Magazine, 1972, p. 1).
In another example of this manipulation, Seidman and Couzens (1974) identified a significant decrease in the number of larceny-thefts of $50 or more in one jurisdiction as a result of the installation of a new police chief who threatened to replace police commanders who were unable to reduce the amount of crime in their precincts. The importance of the $50 criterion was that larcenies of less than $50 were not reported to the FBI. Thus, simply by estimating the value of stolen goods to be slightly less than $50, it was possible for the police to reduce the official crime rate. Similarly, McCleary et al. (1982) noted that a significant decline in the number of burglaries in one jurisdiction was related to a change in police procedure whereby detectives, as opposed to uniformed police officers, investigated burglary complaints. When this experiment of using detectives was terminated 21 months later, the burglary rate in the jurisdiction increased again. In another example, a 72% increase in the number of major crimes in New York City from 1965 to 1966 was primarily due to a change in crime reporting; the actual increase was only 6.5% (Weinraub, 1967). In a perhaps even more disturbing example, in 1973, Orange City, California, based the pay of its police officers on decreases in crime. At least partially as a result of this change, the reported crime rates for rape, robbery, auto theft, and burglary dropped by 19% in this jurisdiction over a one-year period (Holsendorph, 1974).
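The arithmetic of such threshold manipulation is worth making explicit. In the hypothetical Python sketch below, re-estimating values near the $50 criterion to just under it removes offenses from the reportable count:

    # Hypothetical illustration of the $50 larceny criterion described by
    # Seidman and Couzens (1974): only larcenies valued at $50 or more
    # were reported to the FBI.
    thefts = [49.0, 52.0, 55.0, 120.0, 48.0]  # estimated dollar losses

    print(len([t for t in thefts if t >= 50]))  # 3 reportable larcenies

    # The same thefts, with values near the threshold "re-estimated" to
    # just under $50:
    shaved = [49.99 if 50 <= t < 60 else t for t in thefts]
    print(len([t for t in shaved if t >= 50]))  # 1 reportable larceny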
Given that the police have exclusive control over the dissemination of crime data and that there is little monitoring of the accuracy of their crime counts, one obvious way to demonstrate effective law enforcement is to distort, manipulate, and fabricate the number and nature of crime reports. The claim of “cooking the data” has long been alleged against law enforcement agencies, and numerous incidents of police misconduct over the last few decades have increasingly challenged the integrity of law enforcement and have led to growing suspicion about the pervasiveness of cooking data across the country.
When submitting crime data to the UCR program, there are various ways for local agencies to distort and undercount crime incidents. The most basic methods for creative accounting through falsifying crime reports include the following:
Not reporting all crime incidents on monthly UCR submissions
Combining separate events as if they occurred in multiple-offense incidents and falsely using the hierarchy rule to undercount the total number of crime reports (see the sketch following this list)
Declaring large numbers of reported crimes as unfounded so they are not counted in UCR annual summaries
Downgrading major Part I offenses to minor offenses so they are not tallied nationally in the UCR summaries
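Because the hierarchy rule figures in the second of these methods, a brief sketch of how it operates may be useful. The ordering of Part I offenses below follows standard UCR practice; the incidents themselves are hypothetical:

    # Under the UCR hierarchy rule, only the most serious Part I offense
    # in a multiple-offense incident is counted (arson is an exception
    # and is always counted separately).
    HIERARCHY = ["murder", "forcible rape", "robbery", "aggravated assault",
                 "burglary", "larceny-theft", "motor vehicle theft"]

    def counted_offense(incident_offenses):
        """Return the single offense the UCR tallies for one incident."""
        return min(incident_offenses, key=HIERARCHY.index)

    # A burglary during which the resident is assaulted is tallied once,
    # as an aggravated assault:
    print(counted_offense(["burglary", "aggravated assault"]))

    # Falsely merging two separate events into one "incident" therefore
    # erases a crime from the count:
    print(counted_offense(["robbery", "larceny-theft"]))  # robbery only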
The particular reasons or motives for the police manipulation of crime statistics are wide and varied. Economic interests and political posturing are sometimes the underlying cause of the artificial inflation of crime statistics by law enforcement agencies, whereas these and other reasons may be the basis for the undercounting of crime. The following examples illustrate both the diversity of motives and the magnitude of distortion and manipulation of crime statistics by law enforcement officials.
Crime Reporting in Philadelphia, Pennsylvania
Some of the most serious allegations of fabrication of crime statistics involve practices in the Philadelphia Police Department. The distortion and manipulation of crime statistics in this jurisdiction has grown out of a history of statistical manipulation that goes back for decades. In 1953, for example, Philadelphia reported 28,560 index crimes plus negligent manslaughter and larceny under $50, which represented an increase of more than 70% compared to 1951 figures. This tremendous increase in crime, however, was not due to an “invasion by criminals” (Bell, 1967, p. 152) but to the discovery by the new administration that earlier crime records had minimized the amount of crime in Philadelphia for a number of years. In fact, one district in the city had actually handled 5,000 more complaints than it had recorded (President’s Commission on Law Enforcement and the Administration of Justice, 1968). This distortion of crime statistics apparently continued; in 1970, Philadelphia, which was the fourth largest city in the United States, reported fewer index crimes than any other city among the 10 largest. In fact, Baltimore, which had less than one half the population of Philadelphia, reported more than 60% more index crimes in 1970 (Seidman & Couzens, 1974).
The two major forms of distortion that have been employed by the Philadelphia police in more recent years are the excessive use of “unfounded” and downgrading. It is estimated that literally thousands of sexual assault cases that occurred in the 1980s and 1990s in Philadelphia were buried by the sex-crime unit either by rejecting many of them as unfounded or by placing nearly one third of its caseload into noncriminal categories, such as “investigation of person” and other “throwaway categories” (Faziollah, Matza, & McCoy, 1998). When the high rates of unfounded sexual assaults were scrutinized, the sex-crime unit reported low rates for the next year simply by shifting these cases to “investigation of persons,” which are excluded in police summary data reported to the FBI.
A number of different types of downgrading have been used in Philadelphia to circumvent the counting of Part I offenses. The city has consistently had one of the lowest rates of aggravated assault of any large city because many of these attacks are classified as "hospital cases" or are downgraded to simple assaults, thereby being excluded from UCR data on serious crimes. Similarly, burglary is often downgraded to "lost property," car thefts and break-ins are redefined as "vandalism," and street muggings without injury (categorized as robbery in the UCR) are downgraded to the minor offense of "threats."
The manipulation of crime statistics in Philadelphia has been so notorious that dramatic actions have been taken to explore its source and curtail the practice. These procedures have included the auditing of police crime figures by the city controller’s office, the assignment of 45 detectives to reinvestigate more than 2,000 sex offenses that were downgraded over a five-year period, the appointment of an academic panel to develop yearly auditing procedures, and a formal inquiry by former U.S. Attorney General Janet Reno. The Philadelphia police commissioner in the late 1990s took several measures to increase the accuracy of police data, including the dismissal of district captains who were in charge of crime data and the use of undercover investigators posing as crime victims to determine whether police are recording the incidents accurately.
Although these corrective actions should improve the accuracy of police statistics in this jurisdiction, the impact of these presumed improvements in reporting practices on actual crime rates in Philadelphia is debatable. In 1998, the Philadelphia police department failed to report an estimated 37,000 UCR Part I crimes, but when these crimes were included, Philadelphia moved from the fifth to the second most dangerous city in the United States (“Numbers,” 2000). Not surprisingly, police officials in Philadelphia attributed this major increase in crime rates to more accurate reporting rather than a surge in violent behavior. However, by blaming rising crime trends on better reporting procedures, officials in this jurisdiction may be engaging in other forms of manipulation and creative writing to deflect attention away from ineffective law enforcement practices. It is within this context that both accurate and inaccurate reporting of crime may be functional for local police departments.
Crime Reporting in Atlanta
In addition to the situation with the Philadelphia police force, there is also evidence that police officials in the city of Atlanta manipulated crime statistics through the use of the unfounded option. In 1996, slightly more than one rape report per week was written off by the Atlanta police as never having happened (Martz, 1998). These rape reports and nearly 500 robberies were quickly classified as unfounded and not counted in official crime reports.
By eliminating these serious crimes from official records, the Atlanta police department was able to make their city appear less violent than it actually was that year. City officials claimed that rapes had declined by 11% from 1995 (when including the unfounded rapes would have revealed a 2% increase) and that robbery rates had declined by 9% (instead of increasing by 1% when the unfounded robberies were included). The timing of this downgrading of violent crime was crucial because the Olympic Games were held in Atlanta in late August of 1996 and a mayoral election was also held that year (Martz, 1999).
As a result of a major public dispute between the Atlanta police chief and her deputy chief who managed the crime statistics section of the department, a joint state-federal FBI audit was conducted to assess the accuracy of crime reports in the city. The audit revealed that approximately 16% of the cases examined in the mid-1990s were improperly classified as unfounded, providing support for the deputy chief’s allegation that the department was manipulating crime reporting (Martz, 1999). Based on UCR policies, the reporting error rate for the most serious violent crimes was 26% in 1996.
As might be expected, the Atlanta police chief blamed the high error rate on confusion in the UCR classification rules for unfounded cases, rather than on the department’s deliberate manipulation of the crime data. However, a former robbery detective was quoted as saying that detectives were encouraged in subtle ways by supervisors to record particular types of cases as unfounded (e.g., homeless people, both suspects and victims who were drug users). This detective noted, “The system was set up to cheat a little bit, not to cheat in big numbers” (as quoted in Martz, 1999).
The Crime Drop in New York City
Substantial reductions in official crime rates in New York City from 1995 to 2000 have been attributed to aggressive and effective law enforcement practices by police officials. However, critics have alleged that part of the reduction is due to distortion and manipulation, including downgrading offenses and classifying cases as unfounded.
The claims of manipulation of crime data in New York City have been supported by surveys of retired New York Police Department captains and higher-ranking officials. More than 100 of these retired officials said that intense pressure to produce annual crime reductions led some supervisors and precinct commanders to make "ethically inappropriate" changes to complaints involving the UCR's Part I offenses, changes that helped portray their precincts and the entire city as a safer place (Rashbaum, 2010). The NYPD's CompStat program, which was implemented in 1995 and is used to provide close monitoring of crime trends within precincts, is often identified as a major impetus for political manipulation of crime data because it established the idea that precinct commanders would be held accountable for the level of crime in their areas. By holding them accountable, the CompStat system provided both the means and the incentive for dubious and questionable crime reporting and recording practices.
The specific ways in which official crime data were allegedly manipulated in this city include downgrading offense severity and altering the nature of victims' criminal complaints. For example, by reducing the value of items stolen (e.g., by selectively using the lowest price of an item on eBay or in a catalog), felony grand larceny (over $1,000), which should be recorded as a Part I offense, could be downgraded to misdemeanor theft, which is not recorded as a Part I UCR crime. Retired senior officers also cited examples of precinct commanders, or aides they dispatched, going to crime scenes to persuade victims not to file criminal complaints or to modify their accounts in ways that could circumvent an incident's classification as a Part I offense (Rashbaum, 2010). The following comment by the recording secretary of the New York Patrolmen's Benevolent Association provides further insight into the ways in which crime numbers were manipulated in this context:
“You eventually hit a wall where you can’t push it down anymore. So commanders have to get creative to keep the numbers going down. So how do you fake a crime decrease? It’s pretty simple. Don’t file reports, misclassify crimes from felonies to misdemeanors, undervalue property lost to crime so it’s not a felony, and report a series of crime as a single event. A particularly insidious way to fudge the numbers is to make it difficult or impossible for people to report crimes—in other words, make the victims feel like criminals so they walk away just to spare themselves further pain and suffering.” (as quoted in Moses, 2005)
Although periodic reports of cooking crime statistics suggest that the validity of these police reports remains questionable, the actual magnitude of data manipulation or fabrication under the NYPD's reporting practices is largely unknown. However, over the last several years, NYPD Commissioner Raymond Kelly has implemented an auditing system to maintain the integrity of the crime reporting system in his jurisdiction. This greater scrutiny of crime reporting involves auditing every precinct's books twice a year, correcting and revising crime statistics affected by any errors discovered in this process, and holding personnel accountable with disciplinary actions when these errors are due to intentional manipulation. While these actions are laudable, the low visibility of many street-level police practices places them outside the purview of these auditing reviews.
Crime Reporting in Other Jurisdictions
There are numerous other jurisdictions that have been identified in media outlets for their questionable practices in the reporting of crimes. These include the following:
• Reductions in the violent crime rates reported by the Los Angeles Police Department over the last decade have been called into question because of the undercounting of aggravated assaults. This has occurred primarily through the classification of physical batteries involving injury or the threat of serious injury as domestic violence offenses when they occur in that context (Orlov, 2006). Given the prevalence of these types of aggravated assaults, this jurisdiction appears to have dramatically decreased its aggravated assault rate, and its violent crime rate in general, by recording these crimes outside the UCR category of Part I offenses.
• Reviewing police and medical examiner records, the Detroit News (LeDuff & Esparza, 2009) contended that the Detroit Police Department was systematically undercounting homicides, leading to a falsely low murder rate for the city. Their review indicated that the police department incorrectly reclassified 22 of its 368 slayings in 2008 as “justifiable” so they did not report them as homicides under the UCR standards for murder and manslaughter. The investigative reporters also found at least 59 of these omissions over the previous five years.
• Downgrading in Boca Raton, Florida, resulted in an 11% decline in the felony crime rate in 1997. In that jurisdiction, a police captain downgraded crimes reported by investigating officers as burglaries and car thefts to vandalism and suspicious incidents. In one particular case, the captain changed a burglary charge to vandalism when $5,000 worth of jewels was taken and $25,000 in damage was done (Rozsa, 1998).
• The use of categories such as vandalism, trespassing, or missing property to downgrade residential burglaries has also occurred in smaller cities such as South Bend, Indiana (Sulok, 1998). Alternative strategies include delaying the submission of crime data until after elections, as has occurred in some cities. Political opponents allege that these delays are used to conceal potentially damaging crime trends, whereas the incumbents claim the delays are due to such factors as computer problems that prevent the timely release of the data.
• In 2008, a 22% increase in the most serious violent crimes was reported for England and Wales. However, Home Office officials attributed this increase to inaccurate record keeping—at least 13 of the 43 police forces in England and Wales had previously been classifying assaults involving grievous bodily harm with intent as less serious violent assaults (Travis, 2008).
• In 2006, Japan’s reported rate of intentional homicides was 0.44 per 100,000 (United Nations Office on Drugs and Crime, 2009), one of the lowest per capita homicide rates in the world. However, in that country, autopsies are performed in only 11% of all cases of unnatural death, and it has been alleged that law enforcement authorities discourage autopsies that might uncover a higher rate of homicide in their jurisdictions and exert pressure on doctors to attribute unnatural deaths to health reasons. As Wallace (2007) commented, “Odds are that people are getting away with murder in Japan.”
Official data can also be distorted through the peculiar practices of individual police departments with respect to some crimes. For example, between 1996 and 2000, Detroit arrested far more people in homicide cases than any other big-city police department, reporting an average of nearly three arrests per homicide; most cities average roughly one arrest per homicide. As a result of these practices, Detroit, with less than 2% of the population of the United States, accounted for 1 in 13 homicide arrests in the United States in 1998 and 1999. When questioned about these statistics, officials in Detroit claimed that they were the result of computer glitches or the arrests of people at homicide scenes on unrelated charges (Belluck, 2001).
The pattern of manipulation, distortion, and fabrication of official crime data is a serious problem that may be self-perpetuating. For example, the considerable media attention devoted to the declining crime rate in New York City has placed great pressure on other cities to report similar reductions in crime. If these data are generated through selective reporting practices, however, this may persuade other jurisdictions to use creative counting methods. Given these factors, decreases in the number of UCR Part I offenses over time may be more reflective of changes in police reporting and recording activities than changes in criminal activity in the larger society.
The Serial Killer Epidemic
During the early and mid-1980s, considerable media attention was focused on the apparent serial killer epidemic in the United States. Riveting television interviews with serial killers such as Ted Bundy and Henry Lee Lucas helped arouse public hysteria about these types of offenders. U.S. Justice Department officials, extrapolating from data in the UCR’s SHR, claimed that as many as 4,000 persons were murdered by serial killers in the United States each year.
Taking issue with these claims, Jenkins (1994) argued that Justice Department officials grossly inflated their estimates of the annual number of serial murders. This major counting error was the result of the rather questionable assumption that all or most of the SHR murders classified as “motiveless/offender unknown” were the work of serial killers. Jenkins concluded from his extensive analysis of serial killers that such offenders are responsible for no more than 350 to 400 murders in the United States each year.
The manipulation of official crime data to create the image of a serial killer epidemic served several organizational goals for the Justice Department. Specifically, this apparent epidemic provided an immediate justification for a new Violent Criminal Apprehension Program at a new center for the study of violent crime at the FBI Academy in Quantico, Virginia. The dramatic rise in the popularity of crime profiling was also initially based on this alleged serial killer epidemic.
Official Data on Juvenile Gang Crime
Official estimates of the number of youthful gang members and gang crimes have skyrocketed in the United States over the last three decades.
Agencies such as the National Youth Gang Center estimate from surveys of police departments that in the late 1990s, there were nearly 31,000 gangs and about 850,000 gang members in the United States (see Bilchik, 1999). However, because there is no uniform procedure for removing files of inactive gang members, law enforcement agencies’ estimates of the number and age range of gang members in their jurisdictions are very likely to be artificially inflated. In addition, political pressures to deny or minimize local gang problems, as well as the countervailing tendency to exaggerate them in order to secure monetary incentives to fight gangs, play a role in distorting the statistics on gang membership (Snyder & Sickmund, 1999).
In a study of whether the law enforcement response to gangs in Nevada was commensurate with the magnitude of the gang problem, McCorkle and Miethe (2001) found that police statistics on gangs are seriously distorted. For example, contrary to the image of dangerous youthful offenders portrayed in the media and other sources, these researchers discovered that a large percentage of gang members included on police gang rosters were adults, and a large percentage were individuals who had not been charged with any criminal offense. Instead, they were persons who had been “field identified” as gang members because of their associates, style of dress, race, and geographical location.
The official image of gangs as violent hordes with guns and selling drugs is also inconsistent with the substantiated prosecutorial data collected in Nevada. Specifically, McCorkle and Miethe (2001) found that less than 10% of violent crimes and drug offenses filed in Las Vegas courts involved gang members as suspects. Although official police statistics in Las Vegas indicated a dramatic rise in gangs and gang members in the 1980s, this presumed rise in gang activity over time was not validated by a rise in gang prosecutions.
The results of this study of youth gangs are interpreted as representing a “moral panic.” From this perspective, police used selective crime statistics and counting procedures to convey the notion of gang crime as a clear and present danger to the community. This presumed threat from youth gangs was more imaginary than real, but the police used their official statistics on gangs as part of a justification for additional financial resources to increase the size of the gang unit and to pass a special bond issue that provided for more police officers.
OFFICIAL CRIME DATA IN INTERNATIONAL CONTEXT
Given increasing globalization and concerns with examining crime trends cross-nationally, it is instructive to examine the collection of official crime data in other countries. As such, in this section we discuss issues surrounding the collection of crime data by the International Criminal Police Organization (Interpol) and the United Nations through its world crime surveys.
Interpol Data
Interpol was established in 1923 and has collected international crime statistics since 1950. Created in 1999, the European Law Enforcement Organization (Europol) also serves to facilitate the sharing of crime information among the countries of the European Union. Interpol's first report was issued in 1954 and included data for only 36 countries; subsequent reports were issued every two years until 1993 and annually thereafter, with crime data included for a greater number of countries.
As an international data source, Interpol reports are published in four languages (Arabic, English, French, and Spanish) and include data on murder, sex offenses, serious assault, theft, aggravated theft (robbery and violent theft and breaking and entering), theft of automobiles, fraud, counterfeit currency offenses, drug offenses, and the total number of offenses recorded in crime statistics of the member nations.
Although data are provided for multiple countries, Interpol reports include cautions about the appropriateness of using these data to compare crime rates across countries. The primary reason these comparisons are of dubious value is that the statistics do not account for definitional differences in crime categories across various countries, the diversity of methods used by different countries in compiling the statistics, and changes in laws or data collection techniques over time. In addition, as revealed by the International Victimization Survey (also see Chapter 5), the proportion of crime not reported to law enforcement varies substantially across nations and across different types of crime. These comparisons across countries are also adversely affected by the fact that countries with greater access to telephones in their households tend to report higher rates of crime, and countries in which household insurance is more available similarly report higher rates of crime. As a result, cross-national comparisons using Interpol data are most appropriate for crimes of extreme violence such as homicide, which are most likely to come to the attention of the police (Mosher, 2005a).
United Nations Crime Surveys
An additional source of international data on crime is the United Nations Crime Surveys (UNCSs). Data from these surveys are accessible through the United Nations Office on Drugs and Crime website (http://www.unodc.org). Early in its history, the UN expressed interest in the collection of criminal justice statistics at the international level, with resolutions concerning this issue at the UN Economic and Social Council meetings in 1948 and 1951. However, with the exception of one limited cross-national crime survey conducted over the 1937–1946 period, international crime data were not collected systematically until the early 1970s.
The more recent UNCSs were initiated in 1977, covering five-year intervals beginning in 1970 (as of 2010, 11 UNCSs had been conducted). The original rationale for collecting international statistics was to provide social scientists with additional data to examine the correlates and causes of crime, and the first UNCS received responses from 64 nations, providing crime and other data for the period between 1970 and 1975. The second UNCS covered the years 1975–1980 and represented an explicit change in purpose, with the emphasis moving from a focus on the causes of crime to issues surrounding the operation of criminal justice systems cross-nationally. The 11th UNCS covers the years 2007–2008.
The UNCS includes information from each responding country on the combined police and prosecution expenditure by year; the number of police personnel by gender; the total number of homicides (intentional and unintentional); the number of assaults, rapes, robberies, thefts, burglaries, frauds, embezzlements, and drug-related crimes; the number of people formally charged with crimes; the number of individuals prosecuted and the types of crimes prosecuted; the gender and age of individuals prosecuted; the number of convictions and acquittals; the number sentenced to capital punishment and other sanctions; and the number of prisoners, the length of their sentences, and prison demographics. It is important to note, however, that these data are not complete for all responding countries in all years.
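To make the scope of these data concrete, the sketch below shows one way a single country-year record from the UNCS might be represented. The field names are hypothetical, chosen for illustration; they do not reproduce the UN's actual variable names. Every field is optional, reflecting the incomplete reporting just noted.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class UNCSCountryRecord:
    """One country-year record, mirroring the UNCS fields described above.

    All substantive fields default to None because, as the text notes,
    the data are not complete for all responding countries in all years.
    """
    country: str
    year: int
    police_prosecution_expenditure: Optional[float] = None  # combined, by year
    police_personnel_male: Optional[int] = None
    police_personnel_female: Optional[int] = None
    homicides_intentional: Optional[int] = None
    homicides_unintentional: Optional[int] = None
    assaults: Optional[int] = None
    rapes: Optional[int] = None
    robberies: Optional[int] = None
    thefts: Optional[int] = None
    burglaries: Optional[int] = None
    frauds: Optional[int] = None
    embezzlements: Optional[int] = None
    drug_crimes: Optional[int] = None
    persons_charged: Optional[int] = None
    persons_prosecuted: Optional[int] = None
    convictions: Optional[int] = None
    acquittals: Optional[int] = None
    death_sentences: Optional[int] = None
    prisoners: Optional[int] = None
```

Representing the data this way makes the missing-data problem explicit: any analysis must first decide how to handle the many None values rather than treating absent counts as zeros.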
Problems in Using International Crime Data
When making comparisons of official crime data across countries using Interpol and UNCS data, it is important to be aware of a number of problems with these comparisons. The first involves differences in how criminal offenses are categorized across nations. For example, the Netherlands has no category for robbery; an uninformed examination of data from that country might conclude that no serious property crime occurs there. Similarly, in contrast to most other countries, Japan classifies assaults that eventually result in the death of the victim as assaults rather than homicides.
Criminologists generally agree that homicide is the most similarly defined crime across nations, whereas rape and sexual offenses are likely the least similarly defined. However, even when comparing the nature and prevalence of homicides across countries, caution must be exercised. For example, although the UNCS collects information on the total, intentional, and unintentional homicides for each participating country, there are extreme differences in the proportion of homicides classified as intentional across countries, with a range from 10% to 100%. This clearly indicates the use of diverse criteria across countries in defining intentional and unintentional homicides.

In the Netherlands, many relatively nonserious offenses are first recorded by the police as attempted murders (e.g., a situation in which the driver of a car almost hits a pedestrian). At least partially as a result of such coding, the Netherlands reported to Interpol in 1990 a homicide rate of 14.8 per 100,000 population, which, on the surface, makes it appear to have had a higher homicide rate than the United States in that year. Upon closer examination, however, over 90% of the offenses coded as homicide in that year were attempts.
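The effect of bundling attempts into the homicide count is easy to quantify. The short calculation below, using the figures just cited, is our own illustration rather than part of the Interpol report.

```python
# Netherlands, 1990: reported homicide rate of 14.8 per 100,000, with
# over 90% of the offenses coded as homicide being attempts rather
# than completed killings.
reported_rate = 14.8
attempt_share = 0.90  # "over 90%", so this yields an upper bound

completed_rate = reported_rate * (1 - attempt_share)
print(round(completed_rate, 2))  # 1.48 per 100,000 at most,
                                 # well below the U.S. rate that year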
Another problem with international comparisons involves the counting of crimes during periods of internal conflict: the classification of casualties as homicides can be particularly problematic in nations experiencing war, rebellion, or serious civil and political strife. For example, Rwanda's entry in the 1994 Interpol report showed a total of exactly 1 million homicides, which translates into a rate of 12,500 homicides per 100,000 population. This figure, which is suspiciously round and extraordinarily high, clearly includes deaths resulting from the war and civil conflict in that country (Mosher, 2005a).
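One quick sanity check on such figures is to recover the implied population base from the reported count and rate. The calculation below is ours, using the numbers just quoted.

```python
# Rwanda, 1994 Interpol report: 1,000,000 homicides at a rate of
# 12,500 per 100,000 implies a population base of 8 million,
# i.e., one recorded homicide for every eight residents.
count = 1_000_000
rate = 12_500  # per 100,000 population

implied_population = count / rate * 100_000
print(int(implied_population))  # 8000000
```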
Errors in international crime data and their interpretation can also result from problems with coding and the calculation of rates. For example, Belgium's recorded homicide rate in the 1994 Interpol report was 31.5 per 100,000 population, a figure roughly 25 times higher than in previous years. It turns out that a zero had been dropped from Belgium's reported population when the homicide rate was calculated: the rate, based on a total of 315 homicides (195 of which were attempts), was computed against a population of 1 million rather than the actual population of 10 million.
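Written out, the rate formula makes clear how a single dropped digit produces this error. The function below is a minimal sketch of the standard per-100,000 calculation, applied to the Belgian figures quoted above; the function name is our own.

```python
def rate_per_100_000(count: int, population: int) -> float:
    """Offenses per 100,000 residents, the standard cross-national metric."""
    return count / population * 100_000

# Belgium, 1994 Interpol report: 315 recorded homicides (195 of them
# attempts), against an actual population of roughly 10 million.
print(rate_per_100_000(315, 1_000_000))   # 31.5: the published, erroneous rate
print(rate_per_100_000(315, 10_000_000))  # 3.15: the rate with the correct population
```

Because the population appears in the denominator, understating it by a factor of ten inflates the rate by exactly the same factor.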
Although these weaknesses in international crime data should be kept in mind, the danger in emphasizing the problems of cross-national comparison is that readers may infer that nothing can be gained from gathering and analyzing such information. On the contrary, although these data cannot reliably be used to rank countries by their levels of crime, they are appropriate for assessing the direction of change in crime over time and across nations.
As noted above, cross-national comparisons of crime are probably most appropriate for offenses such as homicide. Exhibit 3.11 provides 2008 homicide rates for a selected group of countries, grouped by region, and indicates tremendous variation in these rates. In general, Latin American and Caribbean countries exhibit the highest homicide rates, with four countries in that region (El Salvador, Honduras, Jamaica, and Venezuela) having 2008 rates in excess of 50 per 100,000 population. European, North American, and Oceanic countries have comparatively lower homicide rates, with the Russian Federation, at 14.2 homicides per 100,000 population, having the highest rate among these nations. Notably, Iceland, with a population of approximately 320,000 in 2008, recorded no homicides at all that year. It is also notable that the United States' 2008 homicide rate of 5.2 per 100,000 is considerably higher than that of Canada and most other Western industrialized nations.
SUMMARY AND CONCLUSIONS
This chapter examined police statistics on the nature and prevalence of crime. We discussed the definitions of criminal conduct underlying the UCR classification system, the problems associated with classifying and scoring crimes under this system, the nature and prevalence of crime based on official measures, and the major limitations of police data as an accurate measure of crime.
Throughout the process of reporting and recording official instances of crime, criminal definitions are socially constructed. In other words, each official count of crime requires some amount of interpretation and negotiation. Under the widely used UCR system in the United States, a crime report becomes part of the official data only after surviving the following five decision points: (1) someone must perceive an event or behavior as a crime, (2) the crime must come to the attention of the police, (3) the police must agree that a crime has occurred, (4) the police must code the crime on the proper UCR form and submit it to the FBI, either directly or through their state data collection agency (which must in turn correctly forward it to the FBI), and (5) the FBI must include the crime in the UCR (Beirne & Messerschmidt, 2000, p. 39).
Of the many problems associated with police statistics on crime, charges of political manipulation and fabrication are particularly insidious because they challenge the basic integrity of the data. Although more extensive auditing and monitoring may improve data quality, the processing of police crime data remains largely closed to public scrutiny and thus continues to be susceptible to creative accounting that may serve political ends. Given growing distrust of statistical data and numerous allegations about the downgrading of offenses, UCR claims regarding declining national crime rates and the characteristics of offenders derived from these data may best be viewed as tentative estimates resting on rather shaky ground.