"Reflect, Select, Deflect:" Proxies, ‘Numeric’ Screens, and the Dangers of Partial Vision
Bruce Bowles Jr.
Texas A&M University – Central Texas
One of the pervasive risks that we face in the information age…is that even if the amount of knowledge in the world is increasing, the gap between what we know and what we think we know may be widening.
–Nate Silver, The Signal and the Noise
A lower-income single mother applies for a loan and is rejected in spite of having done everything possible to make all of her payments on time. Confused, she struggles to understand the convoluted explanation she receives from the company as to why her credit history is not sufficient. Without this loan, she is going to continue to struggle financially and might not be able to make her next car payment, further exacerbating her credit issues. A life-long Democrat who just moved to Texas is excited to attend campaign rallies for the next presidential candidate. However, he is disappointed when it becomes painfully obvious that the candidate has no intention of visiting his district nor of addressing the concerns he—and people similar to him—have about health care, the economy, etc. A teacher nearing retirement just discovered she might not be able to retire. Her retirement fund, which she believed was invested in low-risk securities, has suddenly plummeted, sustaining losses she was never even told to anticipate. An African-American man who resorted to selling marijuana to make ends meet is denied early parole. Having been a model prisoner, he struggles to comprehend why he is not an optimal candidate to receive parole and restart his life.
Although not visible on the surface, Big Data (extensive data sets that are quite complex, frequently requiring computers to mine them for information), algorithms, and/or actuarial risk assessments are connecting all of these major life events. These occasions that will have profound effects on you, your friends, and your loved ones are influenced to varying degrees by computers and advanced quantitative methods. And yet, the majority of the population is either unaware of the influence of these metrics or—even more alarmingly—accepts them uncritically. This is, I would contend, because the old adage “Numbers can lie!” applies less precisely to these metrics. I hope to demonstrate that these numbers don’t, in fact, lie at all. Instead, they operate in a more subversive fashion—they only tell part of the truth, providing incomplete images of reality that, unfortunately, frequently come to represent our entire understanding of reality.
The utopian visions Big Data creates reflect an objectivist and rationalist worldview that obscures a seemingly obvious, yet often forgotten, component of computer programming and advanced quantification methods. Even if a machine analyzes the data, the methods of analysis, formulas, assessment metrics, etc. are still created by humans.1 In essence, the execution of the processes is purely quantitative, objective, and free from bias; the creation of the processes, however, is still a social and subjective endeavor, encapsulating—and rationalizing—human bias in profound ways. Yet the former is what gets presented while the latter is perilously ignored.
The Basic Recipe for “Numeric” Screens
Before we can address the adverse consequences that arise from these partial truths, it is important to know how the processes of data analysis are rendered invisible and come to reflect incomplete pictures of reality. First, Big Data, algorithms, and actuarial risk assessments become what Bruno Latour refers to as black boxes. A black box forms “…whenever a piece of machinery or a set of commands is too complex,” obscuring the processes used to create it and rendering them no longer pertinent (2). As a result, only the data fed in and the outputs generated matter (Latour 2-3). Essentially, everyone is aware of what the black box does, but few people are aware of how it accomplishes its tasks. The obscuring of the processes involved in these calculations might seem unlikely and highly unscientific. Yet, as Latour notes, “There is a simple reason for this: in the very process of their construction they disappear from sight because each part hides the other as they become darker and darker black boxes” (253). Once these black boxes are created, the metrics and formulas that drive them are rendered purely rational and viewed as complete reflections of reality; the consequences that then result—whether positive or negative—are viewed merely as a total reflection of objective reality rather than as a singular vantage point in a much larger narrative. However, although these black boxes do indeed reflect reality, they also select reality and deflect reality by the same methods by which our language creates what Kenneth Burke refers to as terministic screens. The reflection of reality is far from complete or perfect. This results from “…the fact that any nomenclature necessarily directs the attention into some channels rather than others” (Burke 45).
While Big Data analysis and complex formulas do not rely on written language in the conventional sense, they nonetheless operate with terminologies in the form of what mathematician Cathy O’Neil refers to as proxies. Proxies are used when scientists and statisticians lack direct data related to the construct(s) they wish to measure. Instead, they substitute data that reflects only a small sliver of the construct under examination and/or demonstrates a strong correlation with the construct (17). The choice of these proxies, I contend, creates numeric screens. Numeric screens serve as filters for interpretations of data, allowing for certain interpretations of the data while omitting others. Thus, these numeric screens determine what aspects of reality are given focus while obscuring others, resulting in an “objective” view of reality that—even more so than traditional terminology—obscures the social processes by which it is created and other alternative narratives that can be, and often are, equally valid. These numeric screens have catastrophic consequences since they are more difficult to critique and frequently are rendered invisible to those not directly involved, whether through a lack of awareness or through companies receiving protection from critique of these processes via the guise of proprietary information.
This article is not the first attempt to “sound the alarm,” so to speak, on what Cathy O’Neil has termed WMDs (Weapons of Math Destruction) (3). Data scientists, mathematicians, lawyers, civil-rights activists, etc. have been issuing these warnings for years now. Instead, this article seeks to show the dangers of these numeric screens (especially resulting from what gets deflected), to call for transparency regarding these practices, and to promote a collaborative—rather than adversarial—solution to the problem. Rather than trying to eradicate these practices, I suggest, we should place them in conversation with other methodological and epistemological approaches to refine such practices from both validity-based and ethical standpoints.
Embracing multiple vantage points, or screens, allows for a more complete—even if not all-encompassing—vision, as the screens can overlap in order to unmask assumptions while demonstrating how certain methodologies can be advantageous in certain situations yet problematic in others. Rather than viewing quantitative and qualitative methodologies as natural enemies, they can be seen—when combined—as providing a more holistic vision. Various partial perspectives can lead to a more comprehensive view of the entirety of a phenomenon.
The rise of the empirical is inevitable; the future we create, however, is not. Critical collaborations have the opportunity to allow multiple people of various expertise to enter into these discussions, adding valuable critiques and insights in order to advocate for more nuanced uses of the empirical. We must remember, as baseball statistics guru Bill James has noted,
Statistical images simplify the real world… but the problems come when people insist that the real world is as simple as the statistical picture. There’s an argument that sabermetricians make the world too complicated, but that’s not necessarily true. They may actually be doing the opposite: oversimplifying the complex. (qtd. in Gray 49)
Although Big Data, algorithms, and actuarial risk assessments are indeed complex, the results they often produce reflect a simplistic view of reality, selecting for an uncomplicated version of reality while deflecting nuances and intricacies. Critical collaboration can aid in analyzing these complexities, in understanding the nuances behind these interpretations and the decisions made from them.
Numeric Screens Under the Microscope
In Language as Symbolic Action, Kenneth Burke draws upon photography to provide an initial introduction to his concept of terministic screens. He reflects upon how the same photograph, placed through different color filters, produces remarkably different perspectives on, and experiences of, the same object. Furthermore, he equates this to the analysis of dreams, noting how the psychological lens applied to a dream—whether it be Freudian, Jungian, or Adlerian—will produce drastically different interpretations of the same dream (45-46). For Burke, this is a natural consequence of the nature of language and symbolism, since “Even if any given terminology is a reflection of reality, by its very nature as a terminology it must be a selection of reality; and to this extent it must function also as a deflection of reality” (45, emphasis original). The language we use naturally reflects reality in a particular fashion, privileging and emphasizing certain aspects of reality while obscuring—and at times rendering invisible—others. Like the filters for photographs, or the choice of psychological theories through which to analyze dreams, the terminology we use creates its own filter for how we see and come to know.
This is not a defect of human reasoning nor a mere product of flawed assumptions and biases. According to Burke, “We must use terministic screens, since we can’t say anything without the use of terms; whatever terms we use, they necessarily constitute a corresponding kind of screen; and any such screen necessarily directs the attention to one field rather than another” (50, emphasis original). The same can be said of various forms of quantitative analysis. While a formal logical principle such as “if a=b and b=c, then a=c” does not depend on the particular symbols chosen to express it (in this instance, any three symbols would accomplish the same objective), much of quantitative analysis is not subject to purely objective reasoning of this kind. As previously mentioned, in order to measure complex constructs, mathematical proxies frequently stand in for the construct and are analyzed in relation to one another. Thus, while a perfect prediction of an individual’s ability to repay a loan of a set amount is not possible, proxies such as credit history, debt-to-income ratio, length of employment, etc. are used to assess the probability an individual will or will not default on a loan. When used ethically and responsibly, proxies can allow for accurate, although not entirely perfect, predictions about an individual based on data collected from the larger population.
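To make the mechanics concrete, the following is a minimal, hypothetical sketch of how such proxies might be combined into a default-probability estimate. The proxies, weights, and logistic form are invented for illustration and are not any lender’s actual scoring formula.

```python
# A minimal, hypothetical sketch of proxy-based credit scoring (invented weights,
# not any lender's actual model): a hand-set logistic formula maps three common
# proxies to an estimated probability of default.
import math

def estimated_default_probability(late_payments_past_2yrs: int,
                                  debt_to_income: float,
                                  years_employed: float) -> float:
    # Each proxy nudges the log-odds of default up or down (weights are invented).
    log_odds = (-2.0
                + 0.6 * late_payments_past_2yrs
                + 3.0 * debt_to_income
                - 0.15 * years_employed)
    return 1 / (1 + math.exp(-log_odds))  # logistic link: log-odds -> probability

# Two hypothetical applicants: the model "sees" only these three inputs.
print(round(estimated_default_probability(0, 0.20, 8), 3))  # long employment, low debt load
print(round(estimated_default_probability(3, 0.45, 1), 3))  # recent late payments, high debt load
```

Whatever falls outside these three proxies (a medical emergency, informal income, a predatory prior loan) simply does not exist for the calculation; the screen reflects part of the applicant’s reality and deflects the rest.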
The process is not perfect, however, and can become highly skewed and flawed. Through the selection and omission of proxies, as well as the privileging of certain proxies over others, particular screens are created. Much like terminology, the proxies chosen reflect certain aspects of reality, selecting particular facets for focus while deflecting others. Thus, these proxies create their own numeric screens which privilege certain interpretations. What is considered objective, scientific, and an all-encompassing view of reality is actually not immune to ideology. Kenneth Burke aptly summarizes this tendency, contending that,
Not only does the nature of our terms affect the nature of our observations, in the sense that the terms direct the attention to one field rather than to another. Also, many of the “observations” are but implications of the particular terminology in terms of which the observations are made. In brief, much that we take as observations about “reality” may be but the spinning out of possibilities implicit in our particular choice of terms. (46, emphasis original)
Mathematical proxies are no different. The selection of certain proxies over others, attentiveness to particular contextual influences while ignoring other important facets of context, and the statistical methods that are entailed in these formulas (e.g. sample size, weighting, etc.) influence the observations that will be made by these formulas. In many ways, the results generated by quantitative analysis are merely “implications” of the particular proxies chosen. Rather than being purely objective reflections of an external reality, “…models, despite their reputation for impartiality, reflect goals and ideology” (O’Neil 21). The selection and application of proxies to represent a particular phenomenon are not inherently neutral; they are predicated upon human judgment and decisions. Reflecting on her work in academia and her time in the financial sector, Cathy O’Neil was astonished by how “…many of these models encoded human prejudice, misunderstanding, and bias into the software systems that increasingly managed our lives” (3). Once encoded, unfortunately, they frequently are rendered invisible, operating with a cloak of neutrality (Latour’s black boxes) given credence by society’s consistent celebration of the empirical.
By equating knowledge with information, Big Data, algorithms, actuarial risk assessments, etc. conflate the information inputted into and generated by these mathematical models with the social processes used to create these models. While the information inputted and generated by these mathematical models is—in the narrowest sense of the term—indeed objective, the social processes used to create the results are anything but. Yet the social becomes “objective” when it is quantified; the numeric screens created are even more damaging since their quantitative nature masks the subjectivity inherent within them. Disguised as “pure science,” they are capable of wreaking havoc in a devastating, yet less noticeable, fashion. They appear to merely reflect reality, while obscuring what they select for and—even worse—deflect. Rather than increasing social equality, financial stability, and justice, as we will see, they frequently have quite the opposite effect.
Reflect, Select, Deflect: Numeric Screens in Action
The adverse consequences of such numeric screens are not limited to just a few fields nor do they only affect the occasional person. They are actually quite ubiquitous throughout society, yet their influence is not always transparent and understandable. Instead, they lurk beneath such practices as lending, political campaigning, risk assessment, and even the criminal justice system, exerting profound influence behind the scenes. In each of the following examples, I will demonstrate how these numeric screens often accurately reflect reality, but also subsequently select for certain facets of reality (in convenient and subversive ways) while deflecting facets that should be of vital concern to the constructs under examination and/or the consequences of using such numeric screens.
Credit by Prox(imit)y
As previously noted, credit scores rely on proxies to determine whether a borrower is likely to repay a loan. Although credit reports and the scores predicated upon them can provide a degree of objectivity in lending practices, they are vast black boxes in many instances. Even when consumers gain access to these reports, they are provided with a wealth of data to sift through and often are given no insight into how the data was used to calculate the actual score (O’Neil 152).2 Furthermore, these black boxes—in many instances—are further used as proxies for hiring, insurance rates, etc. A person’s ability to make payments on time and repay debts is viewed as a proxy itself for her/his trustworthiness and responsibility. Thus, the scores become proxies in determining whether someone will be hired and the rates they will pay for a variety of services. As a result, a person’s credit score is a significant factor in determining her/his ability to obtain capital in order to pursue upward social mobility.
The use of credit scores in hiring practices does reflect reality to some degree. There can be strong correlations between an individual’s credit rating and that individual’s trustworthiness and responsibility, allowing employers to—in theory—select for more promising and reliable candidates. Yet, this selection results in a subsequent deflection. In particular, the use of credit scores in hiring deflects the individual contextual nuances for each candidate as well as the social consequences of such a practice. In regard to the latter, if individuals struggle to obtain employment as a result of their credit scores, they can then struggle to repay loans. This can result in an endless, self-justifying feedback loop—the poor struggle and see their credit scores plummet, making it more difficult to obtain credit as well as the jobs necessary to repay their debts.
This inequality gets even worse since these practices can affect rates for services such as car insurance. Often, credit scores serve as proxies to determine the rates a consumer will pay. While this seems a fair and objective method, when it comes to auto insurance, credit scores as proxies can, in certain instances, carry more weight in determining an individual’s payment than a drunk driving conviction (O’Neil 165). Oddly, the numeric screens become blind to their own processes and determine credit history to be a more valid proxy than drunk driving convictions. With only the faintest amount of common sense, one can determine which will be a more accurate predictor of the liability an insurance company will face.
The worst might still be on the horizon, however. Kreditech, a company based in Germany, currently solicits information from loan applicants about their social media networks. Additionally, FICO has been active in Russia and India using cellphone data to determine if consumers are accurately reporting where they live and work as well as the social networks they belong to and the credit histories of friends and acquaintances within borrowers’ social networks (Waddell). Proximity has become a proxy. As data scientists and communication researchers danah boyd, Karen Levy, and Alice Marwick contend, “In the most visible examples of networked discrimination, it is easy to see inequities along the lines of race and class because these are often proxies for networked position. As a result, we see outcomes that disproportionately affect already marginalized people” (56). These algorithms may reflect certain inconvenient truths about society—for instance, that social position within networks can be highly predictive of a person’s financial stability and credit history. However, they can also select for candidates who are untrustworthy and/or financially irresponsible but are affiliated with reliable people while deflecting candidates who are reliable and financially responsible yet are associated—whether through friendship or mere geographic proximity—with others who are not. Even more alarmingly, the use of these scores as proxies for people is not random. The algorithms are more frequently applied to lower-income and minority applicants, meaning that “The privileged…are processed more by people, the masses by machines” (O’Neil 8).
One Person, One Vote?
While Barack Obama’s 2008 presidential campaign employed social media and grassroots fundraising to propel the then long-shot candidate to victory, his 2012 campaign emphasized a different approach—Big Data. During the 2012 presidential election, Obama’s campaign manager, Jim Messina, made advanced statistical analysis a significant part of the campaign strategy. The Obama campaign employed data mining to determine where to distribute advertising resources, to identify specific districts that would be crucial to victory, and even to figure out which types of appeals would work best with certain types of voters. On top of that, they essentially simulated the election every night to determine their chances of winning certain states and, in the end, how to allocate resources to maximize their chance of winning (Scherer). It was referred to by many as the “Moneyball”3 campaign and, as journalist Michael Scherer claimed, signaled that “In politics, the era of big data has arrived.”
This trend continues. In an effort to increase the effectiveness of campaign communications and strategies, data-savvy campaigns construct models that produce predictive scores in three major categories: behavior scores, support scores, and responsiveness scores (Nickerson and Rogers 54). Behavior scores use past political behavior to determine voters’ likelihood of voting, donating to a campaign, volunteering, attending rallies, etc. Support scores predict voters’ preferences for certain candidates and issues; these scores are frequently calculated from smaller samples in order to project the support of the larger population. Lastly, responsiveness scores attempt to predict how voters will react to and engage with various methods of campaign outreach (Nickerson and Rogers 54).
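To illustrate the first category, here is a toy behavior score. It is a hypothetical sketch, not Nickerson and Rogers’ actual models or any campaign’s formula; the field names and weights are invented. Past participation serves as the proxy for future participation.

```python
# A toy, hypothetical illustration of a campaign "behavior score": past turnout
# is used as a proxy for future turnout, so voters with thin participation
# histories score low and may be deprioritized for outreach.
from dataclasses import dataclass

@dataclass
class VoterRecord:
    voted_last_general: bool
    voted_last_midterm: bool
    voted_prior_general: bool
    has_donated: bool

def behavior_score(v: VoterRecord) -> float:
    # Invented weights: recent turnout counts most; a past donation adds a little.
    return (0.45 * v.voted_last_general
            + 0.30 * v.voted_last_midterm
            + 0.15 * v.voted_prior_general
            + 0.10 * v.has_donated)

# A habitual voter scores 1.0; a newly registered or disengaged voter scores 0.0,
# even though nothing is known about how either would respond to actual outreach.
print(behavior_score(VoterRecord(True, True, True, True)))
print(behavior_score(VoterRecord(False, False, False, False)))
```

By construction, a newly registered or habitually overlooked voter can never score well, which is exactly the deflection discussed in the next paragraph.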
Although using such data can increase campaigns’ effectiveness and target more people for political involvement, the numeric screens created by such practices have two major drawbacks. First, they tend to rely on one of the most accurate proxies for voting in an election—voting in past elections. Since a plethora of issues that affect voter turnout can be attributed to race and class—political disillusionment, voter ID laws, the inability to get transportation to and from the polls, being unable to get off work to vote, etc.—the use of such data can lead to more attention being paid to those already politically active than to those who are struggling to have their political voice heard (Nickerson and Rogers 58). While the data reflects the nature of reality, that those who have voted in the past are more likely to vote in the future, it also selects for these individuals while deflecting the concerns and needs of those who are not currently politically engaged. This is, in some respects, an inherently rational approach, yet it can lead to further feelings of disenfranchisement for people already facing the aforementioned struggles in regard to political engagement.
The second major issue is that the emphasis placed on voters most likely to be persuaded, at the expense of voters who are already persuaded—as well as the focus on key districts and states—leads to major discrepancies in the amount of attention voters receive and how much their votes are deemed to matter.4 When campaigns explicitly target swing voters in swing states and districts, the opinions and issues of these people can receive greater emphasis and importance than those of voters who will most likely not affect the outcome of the election. The center-left Democrat in Ohio thus becomes more important than the center-left Democrat in Texas; her/his vote is more likely to influence the election.
When Big Data is able to select these voters with precision, the reflection of reality is not always accurate. Since these algorithms select for impactful voters, thus giving the issues that matter to them and their overall concerns additional political credence, they also deflect less “important” voters, devaluing the issues and concerns of those who are unlikely to have an impact. The reality created can begin to skew political discourse towards the opinions and values of undecided voters.5 A small segment of the population is given an overwhelming majority of the political attention;6 democracy winds up reflecting the reality of those fortunate enough to have their votes deemed important.
Terministic and Numeric Screens Can Burst Bubbles
The collapse of the housing market in 2008 resulted in one of the direst financial crises in American history. While scholarly and political debate has analyzed, and continues to analyze, a variety of causes of the crisis, the role of the credit rating agencies is viewed as a significant contributing factor (Griffin and Tang; O’Neil; Silver). Research by John Griffin and Dragon Yongjun Tang has demonstrated that, of the AAA-rated Collateralized Debt Obligations (CDOs) closed from January 1997 to March 2007, only 1.3% actually adhered to Standard & Poor’s default probability standard for that rating; furthermore, they report that in 92.4% of cases these CDOs were actually meeting the AA rating default standard (1296). Since the formulas used to determine risk for these bundled mortgages are usually black boxes, with the formula closely guarded and unavailable for scrutiny, the true risk of these CDOs was never known. The construction of these formulas, and the manner in which they deflected the potential for economic disaster carried within them, contributed to absolute economic chaos.
However, in The Signal and the Noise, FiveThirtyEight founder and editor Nate Silver makes a compelling argument that the housing crisis was not just a result of flawed math—it was also the result of a flawed understanding of terminology. Silver articulates the manner in which this flawed understanding was a matter of a failure to distinguish between risk and uncertainty. Risk, according to Silver, is more mathematically precise than uncertainty. With risk, while the final result of a particular endeavor cannot be perfectly predicted, the probability of a positive or negative result can be. Silver uses poker as an apt metaphor, noting how the probability of completing an inside straight draw after the flop in Texas Hold’em is precisely 1 in 11. If a player has a hand that can defeat an inside straight, and her/his opponent with the inside straight draw has no other “outs” (cards that could fall that would produce a winning hand for the opponent), it is in the player’s best interest to pursue the hand, even if there is a chance of losing (Silver 29). The reward clearly outweighs the risk. Uncertainty, however, is less easily calculated. In addition, the severity of a loss is not always clear. Thus, determining the potential gains and losses becomes a much more inexact science. It’s equivalent to playing Texas Hold’em with not just the money you placed on the table at risk, but potentially the money in your bank account and your home as well.
This, in Silver’s estimation, is what led to the catastrophic consequences of the housing market bubble burst. Silver contends that “The alchemy that the ratings agencies performed was to spin uncertainty into what looked and felt like risk. They took highly novel securities, subject to an enormous amount of systemic uncertainty, and claimed the ability to quantify just how risky they were” (29-30). By focusing on risk, the rating agencies reflected a worldview predicated on a higher level of certainty than they actually had. This deflected the uncertainty that was inherent in the housing market, selecting for risk when uncertainty was the more apt terminology.
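What this alchemy can look like in practice is easiest to see in a toy model. The sketch below is a hypothetical illustration, not any rating agency’s actual methodology. It assumes a pool of mortgages with a fixed average default rate and shows how the estimated chance of catastrophic losses hinges on an assumed default correlation, precisely the kind of systemic assumption that was itself uncertain rather than quantifiable.

```python
# A minimal, hypothetical sketch (not any rating agency's actual model) of why
# "quantified risk" can hide uncertainty: the estimated probability that a pool
# of mortgages suffers heavy losses depends almost entirely on the assumed
# correlation between defaults, even when the average default rate is identical.
import random

def heavy_loss_probability(correlation: float,
                           n_mortgages: int = 100,
                           base_default_rate: float = 0.05,
                           loss_threshold: float = 0.20,
                           trials: int = 10_000) -> float:
    """Estimate P(more than loss_threshold of the pool defaults) by simulation."""
    heavy_losses = 0
    for _ in range(trials):
        # One shared "bad economy" shock; correlation controls how much it matters.
        shared_shock = random.random() < base_default_rate
        defaults = 0
        for _ in range(n_mortgages):
            if random.random() < correlation:
                defaults += shared_shock  # default driven by the shared shock
            else:
                defaults += random.random() < base_default_rate  # independent default
        if defaults / n_mortgages > loss_threshold:
            heavy_losses += 1
    return heavy_losses / trials

# Same 5% average default rate, wildly different tail risk:
print(heavy_loss_probability(correlation=0.0))  # ~0: heavy losses look negligible
print(heavy_loss_probability(correlation=0.5))  # ~5%: catastrophe is plausible
```

Reported as a single number, the first estimate looks like near-certain safety; the entire difference between the two lies in an assumption the number itself never discloses.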
Calculations of risk allow for a precise understanding of the potential negative consequences; calculations of uncertainty are always murky, with the negative consequences being subject to a high degree of variance. Such variance is dangerous from an economic standpoint as the losses can be much greater than anyone can predict. After all, “Risk greases the wheels of a free-market economy; uncertainty grinds them to a halt” (Silver 29). A simple difference in the terminology that fueled the logic behind these risk assessments made all of the difference in the world. Without the screen of uncertainty, a major part of the entire story was left untold. Language, embodied in numeric screens, toppled economies worldwide.
And Quantitative Justice for All?
In an effort to combat rising incarceration rates, and the subsequent high costs that emerge as a result, many states are now using actuarial risk assessments to determine the likelihood of recidivism when deciding if an inmate should be eligible for parole. Currently, over 60 different actuarial risk assessments are in use throughout criminal justice systems across the country (Barry-Jester et al.). Similar to actuarial work in the insurance industry, these risk assessments rely on a variety of proxies derived from data on particular group characteristics in order to make predictions about an individual’s chance of recidivism. As a result of the plethora of instruments that exist, there is no definitive formula; however, common factors taken into consideration include age, gender, place of residence (i.e. urban vs. suburban/rural setting), educational attainment, work history/current employment, the criminal history of one’s family, and prior arrests and convictions, the last of which already plays a primary role in sentencing (Starr 804-5). The use of such risk assessments, especially in relation to sentencing, has sparked controversy, as many scholars and those affiliated with the criminal justice system are uncomfortable with allowing the probability of future crimes, and statistical evidence predicated on group—not individual—behavior, to influence parole and/or sentencing of individuals.
In The New Jim Crow: Mass Incarceration in the Age of Colorblindness, Michelle Alexander contends that “What has changed since the collapse of Jim Crow has less to do with the basic structure of our society than with the language we use to justify it” (2). The language of choice, in these instances, is numeric; by relying on “objective” data to justify granting or denying parole, as well as potentially sentencing offenders to shorter or longer sentences, such risk assessment tools appear ideologically neutral. Thus, any racial disparities are viewed as merely the result of racial differences in levels of criminality. However, the aforementioned proxies used in these actuarial risk assessments are already ideologically biased against minorities. The construction of the numeric screen thus serves to reflect an apparently racially neutral reality by selecting for proxies that support this thesis while deflecting the racial bias embedded within these proxies.
Prior arrests and convictions, along with family history of incarceration, are possibly the most significant sources of racial discrimination within these actuarial risk assessments.7 Furthermore, these factors work in a reciprocal fashion, further fueling one another in ways that discriminate against African Americans and other minorities. The problem here has to do with the prevalence of the offense in relation to the prevalence of arrests and convictions, especially when it comes to drug charges. According to Human Rights Watch, from 1980 to 2007, African Americans were arrested on drug charges every year at rates relative to population that ranged from 2.8 times to 5.5 times higher than Whites (1). However, drug usage rates remain relatively even across races regardless of the drug in question, with only occasional discrepancies for particular types of narcotics (National Survey on Drug Use and Health, 2011).
In Los Angeles, for instance, Ian Ayres’ analysis of police records indicated that for every 10,000 residents, 3,400 more African Americans are stopped than Whites; furthermore, African Americans are 76% more likely to be searched, 127% more likely to be frisked, and 29% more likely to be arrested. Yet, intriguingly, African Americans are 25% less likely to be found with drugs. Numerous studies have yielded similar results in other cities. Thus, actuarial risk assessments are overwhelmingly going to score African Americans as a higher risk when it comes to prior arrests, convictions, and a family history of incarceration since the criminal justice system is more actively seeking to find African Americans committing crimes. These risk assessments are blind to racial prejudice; they merely make predictions based upon the data presented to them. And therein lies the biggest problem of all: the less likely that African Americans are granted parole and the more likely they are to receive longer sentences, the more the data used in these actuarial risk assessments will further skew against African Americans and other minorities. This is what Bernard Harcourt refers to as the ratchet effect.
The ratchet effect cannot be fully understood in relation to actuarial risk assessments and recidivism without first understanding the semantic trickery of the term recidivism. Recidivism generally refers to the likelihood that a prisoner will reoffend after release, yet data on recidivism does not, in fact, track the likelihood that a prisoner will reoffend. Instead, it tracks the likelihood that the prisoner will be caught reoffending. And if law enforcement looks for offenses among African Americans disproportionately in relation to Whites, African Americans are more likely to be caught. Thus, as Bernard Harcourt so aptly summarizes, “Criminal profiling, when it works, is a self-confirming prophecy. It aggravates over time the perception of a correlation between the group trait and crime. What I call a ratchet effect could be called a ‘compound’ or ‘multiplier’ effect of criminal profiling…” (154-156). When more African Americans are arrested in spite of no differences in levels of criminality amongst races, their measured rate of recidivism increases; the more recidivism amongst African Americans, the more the proxies skew against them. In the end, the proxies exacerbate their own calculations, further increasing racial disparities in parole and sentencing. Judges and lawyers are capable of noticing these trends and accounting for context—actuarial risk assessments cannot.
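Harcourt’s compounding can be made concrete with a toy simulation. The sketch below is a hypothetical illustration, not any actual risk instrument or police allocation model. Both groups offend at an identical rate, but an initial disparity in stops produces more arrests in one group, the arrest data then steers still more policing toward that group, and the arrest-based proxy that a risk assessment would consume diverges further every cycle even though underlying behavior never differs.

```python
# A toy, hypothetical simulation of Harcourt's "ratchet effect": identical
# offending, unequal policing, and a feedback loop in which arrest data steers
# ever more policing toward the group that was already policed more heavily.
def run_ratchet(cycles: int = 6, offense_rate: float = 0.10) -> None:
    population = 10_000                        # size of each group
    stop_rate = {"A": 0.15, "B": 0.20}         # initial disparity: group B is stopped more often
    cumulative_arrests = {"A": 0.0, "B": 0.0}

    for cycle in range(1, cycles + 1):
        for group in ("A", "B"):
            offenders = population * offense_rate                       # identical offending in both groups
            cumulative_arrests[group] += offenders * stop_rate[group]   # arrests track policing, not offending
        # Feedback: policing shifts toward the group the arrest data "shows" to be riskier.
        riskier = max(cumulative_arrests, key=cumulative_arrests.get)
        safer = min(cumulative_arrests, key=cumulative_arrests.get)
        stop_rate[riskier] = min(stop_rate[riskier] + 0.02, 1.0)
        stop_rate[safer] = max(stop_rate[safer] - 0.02, 0.0)
        proxy = {g: cumulative_arrests[g] / population for g in ("A", "B")}
        print(f"Cycle {cycle}: arrest-based risk proxy  A={proxy['A']:.3f}  B={proxy['B']:.3f}")

run_ratchet()
```

A risk assessment fed these final numbers would score Group B as far riskier than Group A, and nothing inside the calculation would reveal that the disparity was produced by the policing pattern rather than by behavior.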
Similar to the manner in which Kenneth Burke’s terministic screens can produce observations that are merely implications of the terms in which the observations are made, numeric screens can produce observations that are merely implications of the proxies selected. The data inputted into these actuarial risk assessments is racially biased, yet the quantitative nature of these assessments masks this. These actuarial risk assessments reflect the racial disparities in the criminal justice system while actively selecting minorities under the guise of the empirical. The racial biases and prejudices that produce this data, though, are deflected, not seen as consequential to the calculation. In essence, they reflect the truth that African Americans and other minorities are arrested at higher rates yet obscure the truths as to why this is the case. Under the veil of empirical science, the racist ideologies are kept hidden. In the words of Michelle Alexander, “It is the genius of the new system of control that it can always be defended on nonracial grounds, given the rarity of a noose or a racial slur in connection with any particular criminal case” (103). Once raw data and quantitative analysis enter into the discussion, questioning racial bias in the criminal justice system becomes even more difficult as it gets lost in the black boxes of these actuarial risk assessments.
The Limits of Big Data and Embracing Multiple Screens
Although the perils of numeric screens are vast, a complete rejection of Big Data is equally unproductive. Empiricism offers a way to see patterns that may go unnoticed through observation, a way to understand complex phenomena that may be more difficult to analyze through qualitative and/or theoretical means, and—when used ethically and responsibly—a check on potential human biases that can permeate such endeavors. The problem with Big Data is not Big Data itself; rather, the damaging effects of Big Data emerge when its methods and results are viewed as a direct reflection of reality, when the numeric screens become the only screens through which we see.
Professional baseball provides an illustrative example of the potential productive uses of Big Data. When sabermetrics—baseball’s version of Big Data and advanced statistical analysis—was first introduced into the sport, a tension emerged between this new quantitative approach and the more qualitative endeavors of scouts and experts. The tension, however, has slowly dissipated over time as scouts have become aware of what they can see through statistics that they might not be able to through observation, and sabermetricians likewise have had their statistical models critiqued and improved via consultations with scouts and experts. Most winning teams now rely on blending the two approaches, with each providing a check for the other in order to create more comprehensive and accurate models (Silver 99-101). The answer lies in an intermingling of the two epistemological approaches—data analysis and observation.
Furthermore, as Cathy O’Neil has observed, “baseball represents a healthy case study… Baseball models are fair, in part, because they’re transparent. Everyone has access to the stats and can understand more or less how they’re interpreted” (17). This transparency has allowed for critique of the various models employed—whether by other sabermetricians or by scouts and experts. This consistent feedback loop enables the metrics used to be constantly improved through the addition of new data and/or examination of the assumptions that comprise the models. Working collaboratively, sabermetricians, scouts, coaches, managers, and even players can constantly question and improve the models, with the best metrics allowing for competitive advantages.
Thus, two pillars of using Big Data, algorithms, and actuarial risk assessment productively, ethically, and responsibly are transparency and critical collaboration from other epistemological vantage points. The two are directly interrelated, yet transparency has to emerge before we can take advantage of the benefits of critical collaboration. Overall, the main vehicle for understanding the dangers posed by Big Data, algorithms, and actuarial risk assessment has been reverse engineering. Essentially, researchers have been able to roughly discern how these formulas are being constructed by analyzing their results and consequences. Yet, all too often, the actual formulas are not open and accessible. Under the guise of proprietary information, or through overwhelming complexity, a veil of secrecy pervades these formulas, creating elaborate black boxes that few actually comprehend.
Rhetoricians and other experts across a variety of disciplines need to enter into this debate and insist on transparency. These metrics can deny people credit, either emphasize or deemphasize their roles in the political process, put individual investments and the economy at risk, and determine what people, and what areas, are defined as criminal and further influence whether people in those areas are paroled and the length of sentences they receive. With such dire consequences, we need to advocate for the rights of those acted upon rather than those doing the acting. Once access is allowed to these statistical methods, critical collaboration can emerge.
The failures of the empirical have everything to do with the vantage point it claims. Such approaches rely upon what Donna Haraway refers to as the “god trick,” claiming a form of vision that is all-encompassing and purely objective, capable of “seeing everything from nowhere” (581). For Haraway, such an approach is flawed, since it fails to account for location and perspective. Big Data in credit reporting can fail to see the individual as well as omit pertinent contextual information critical to informed decisions; political analysis predicated upon the empirical tends to obscure certain voters in favor of others while also failing to promote political engagement by all; risk assessments for investments can see everything through the lens of risk without accounting for uncertainty; data scientists can fail to see how the inputs for their actuarial risk assessments encode racial prejudices.
Within the realm of Big Data, there is often a rather heavy focus on information. It is assumed that we use information to create knowledge, that it is knowledge’s raw material; however, the two frequently become conflated. John Seely Brown and Paul Duguid suggest that, “Knowledge’s personal attributes suggest that the shift toward knowledge may (or should) represent a shift toward people... Focusing on knowledge... turns attention towards knowers” (120-121). For Brown and Duguid, an overemphasis on information is problematic since it sees information as the solution to problems; in many cases, when searching for a solution to a problem, people will simply throw more information at it (3). Essentially, Brown and Duguid propose, “Increasingly, as the abundance of information overwhelms us all, we need not simply more information, but people to assimilate, understand, and make sense of it” (121). Lost in the sea of information is the vantage point of those analyzing the information. The knowledge is viewed as implicit in the information, merely needing to be extracted. This avoids analysis of the processes and methods used to extract this knowledge. These processes, however, determine the reality that is reflected by selecting for certain proxies, data points, and assumptions while deflecting others. The knowledge is not implicit within the information; the manner in which it is extracted can have significant implications for the knowledge produced.
More people and more viewpoints, not more information, are the more productive solution. As Haraway theorizes, “…objectivity turns out to be about particular and specific embodiment and definitely not about the false vision promising transcendence of all limits and responsibility. The moral is simple: only partial perspective promises objective vision” (583). The key is not to disregard empirical vantage points in favor of the qualitative and/or theoretical. Rather, by merging various vantage points, a more complete picture can emerge. Each epistemology can offer unique insights while providing valuable critiques of the others.
Each of these instances requires a shift from information towards knowledge and knowers, an acknowledgement of the limitations of partial perspectives as well as the benefits of multiple perspectives, and—most importantly—a shift away from determining the validity of such empirical practices based solely on their predictive abilities (which are often not as great as purported) towards an examination of the consequences they produce. Pamela Moss, Brian Girard, and Laura Haniford promote a view of validity in assessment that accounts for the soundness of the decisions made and the consequences those decisions produce:
Validity refers to the soundness of those interpretations, decisions, or actions. A validity theory provides guidance about what it means to say that an interpretation, decision, or action is more or less sound; about the sorts of evidence, reasoning, and criteria by which soundness might be judged; and about how to develop more sound interpretations, decisions, or actions. (109)
Approaching Big Data, algorithms, and actuarial risk assessments from the vantage point of interpretations, decisions, and actions is crucial. Through transparency and critical collaboration, we can come to understand how certain interpretations are produced, critique the manner in which decisions are made, and ensure that our actions are not only productive, but also ethical and just. The amount of data we collect, and the sophistication of the empiricism used to analyze it, are only a fraction of the entire picture. The choices we make in regard to how to use this wealth of information—the quality of our interpretations, the soundness of our decisions, and the results of our actions—will make all the difference. Paradoxically, quantity might just be key here; it’s better to see from multiple perspectives rather than just one.
About the Author
Bruce Bowles Jr. is an Assistant Professor of English and the Director of the University Writing Center at Texas A&M University – Central Texas. He holds a Ph.D. in English (with a concentration in Rhetoric and Composition) from The Florida State University. His research interests focus on writing assessment (in particular responding to student writing, classroom assessment, the consequences of assessment, and the epistemological underpinnings of assessment practices), writing center pedagogy and administration, and the intersection between rhetoric and public discourse. He currently lives in Copperas Cove, TX, with his lovely wife Erin, his amazing son Steven (2), and the family’s newest addition, his wonderful daughter Adelyn (1 month).
Notes
1. At times, with machine learning, a computer might develop the formula, metric, etc. Still, a human originally programmed the computer to learn, to value data, to look for correlations, etc. in a particular fashion.
2. This wealth of information is also frequently not guarded well, as is evidenced by the data breach Equifax sustained in July of 2017.
3. This moniker references the 2003 book Moneyball: The Art of Winning an Unfair Game by Michael Lewis, which chronicled the Oakland Athletics’ use of sabermetrics (baseball research predicated upon Big Data) in order to construct a contending team with a small budget. The book was later turned into the movie Moneyball, starring Brad Pitt as Oakland Athletics’ general manager Billy Beane.
4. The degree to which these discrepancies are also a natural result of the Electoral College in presidential elections, and of political gerrymandering in congressional and state legislative elections, is an important factor. However, such conjecture is well beyond the scope of this article.
5. Admittedly, Hillary Clinton’s failure to account for voters in areas considered to be “locks” in the 2016 presidential election—most notably in the states of Michigan and Wisconsin—may have cost her the election.
6. The attention being paid to the Obama-Trump voters by the Democratic Party after the 2016 election provides an apt example. While debates have been raging in Democratic circles as to how much these voters should be attended to, it is becoming rather apparent that their concerns will most likely have a significant influence on Democratic strategies in 2018 and 2020. Although pragmatic, this does obscure tried-and-true Democratic voters who might become disillusioned with the lack of attention the Democratic Party is granting them.
7. For more information on the manner in which employment history/ability to obtain employment can be problematic, see Marianne Bertrand and Sendhil Mullainathan’s article based on the famous University of Chicago Study “Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination,” as well as the work of Devah Pager in “The Mark of a Criminal Record” and Pager’s collaborative endeavor with Bruce Western and Naomi Sugie in “Sequencing Disadvantage: Barriers to Employment Facing Young Black and White Men with Criminal Records.”
Works Cited
Alexander, Michelle. The New Jim Crow: Mass Incarceration in the Age of Colorblindness. The New Press, 2012.
Ayres, Ian. “Racial Profiling in L.A.: The Numbers Don’t Lie.” Los Angeles Times, 23 Oct. 2008.
Barry-Jester, Anna Maria, et al. “Should Prison Sentences Be Based On Crimes That Haven’t Been Committed Yet?” FiveThirtyEight, 4 Aug. 2015.
boyd, danah, et al. “The Networked Nature of Algorithmic Discrimination.” Data and Discrimination: Collected Essays, edited by Seeta Peña Gangadharan, Virginia Eubanks, and Solon Barocas, Open Technology Institute, 2014, pp. 53-57.
Brown, John Seely, and Paul Duguid. The Social Life of Information. Harvard Business School Press, 2000.
Burke, Kenneth. Language as Symbolic Action: Essays on Life, Literature, and Method. University of California Press, 1966.
Decades of Disparity: Drug Arrests and Race in the United States. Human Rights Watch, 2009.
Gray, Scott. The Mind of Bill James: How a Complete Outsider Changed Baseball. Doubleday, 2006.
Griffin, John, and Dragon Yongjun Tang. “Did Subjectivity Play a Role in CDO Credit Ratings?” The Journal of Finance, vol. 67, no. 4, 2012, pp. 1293-1328.
Haraway, Donna. “Situated Knowledges: The Science Question in Feminism and the Privilege of Partial Perspective.” Feminist Studies, vol. 14, no. 3, 1988, pp. 575-99.
Harcourt, Bernard. Against Prediction: Profiling, Policing, and Punishing in an Actuarial Age. The University of Chicago Press, 2007.
Latour, Bruno. Science in Action: How to Follow Scientists and Engineers Through Society. Harvard University Press, 1987.
Moss, Pamela, Brian Girard, and Laura Haniford. “Validity in Educational Assessment.” Rethinking Learning: What Counts as Learning and What Learning Counts, special issue of Review of Research in Education, vol. 30, 2006, pp. 109-62.
Nickerson, David, and Todd Rogers. “Political Campaigns and Big Data.” The Journal of Economic Perspectives, vol. 28, no. 2, 2014, pp. 51-73.
O’Neil, Cathy. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Crown Publishing, 2016.
Quick Tables: National Survey on Drug Use and Health, 2011. United States Department of Health and Human Services, Substance Abuse and Mental Health Services Administration, Center for Behavioral Health Statistics and Quality.
Scherer, Michael. “Inside the Secret World of the Data Crunchers Who Helped Obama Win.” Time, 7 Nov. 2012.
Silver, Nate. The Signal and the Noise: Why So Many Predictions Fail—But Some Don’t. Penguin, 2012.
Starr, Sonja. “Evidence-Based Sentencing and the Scientific Rationalization of Discrimination.” Stanford Law Review, vol. 66, no. 4, 2014, pp. 803-72.
Waddell, Kaveh. “How Algorithms Can Bring Down Minorities’ Credit Scores.” The Atlantic, 2 Dec. 2016.