(Heatmap of users tweeting the N-word in the US, from the Geography of Hate project at Humboldt State University)
In “Do Artifacts Have Politics?”, Langdon Winner argues that certain technologies are democratic or autocratic regardless of the intent of their creators. The best-known example Winner uses to illustrate his point is a set of overpasses built on Long Island in the 1930s. These overpasses were purposely built so low that public buses could not pass under them. This had the desired effect: public transportation could not reach certain beaches, which were used mostly by rich white people. The road infrastructure itself excluded people of color and working-class white people. Even though there are now many laws against racial discrimination, it would take millions of dollars to change the overpasses, so the road infrastructure continues to entrench an aspect of racism and classism.
With the recent racist Google Maps hack, the question of racial bias in Big Data and racism in algorithms has come to the fore again. However, the Google Maps example is closer to a case of consciously rigging the data so that it produces certain results, which is more akin to Google bombing. Consider instead the case of Latanya Sweeney, which triggered the debate regarding racial bias in algorithms. Dr. Sweeney is African American, a Professor at Harvard University, and the director of the Data Privacy Lab at Harvard. She observed that searches for traditionally African American names on Google returned suggestions to look up the person’s criminal record, but that this was not the case for traditionally Caucasian names. This created a ruckus in the media, and Google even stepped in and apologized after stating that there was nothing intentional on its part. While Google has fixed this problem, the same cannot be said for other search engines like Yahoo! Here is a screen grab of the search results for Latanya Sweeney, with the ad for criminal records highlighted.
Dr. Sweeney is not the only person in her department at Harvard whose name triggers this ad, but it is triggered only by her name and those of the other African American faculty members in the department. Here is another example.
If one searches for the other faculty members in her department, one does not get such links in the ads. Much has been written about this issue in the past, and it was supposedly fixed. It seems that Google is not the only one that needs to fix this issue in its own backyard. People of color face this issue on a regular basis.
Another simple but telling example is searching for the traits of groups of people. The auto-suggest function in Google reveals what similar search terms people have searched for in the past. The following two examples, for a religious and a racial group, speak for themselves.
Even the Daily Show highlighted the issue last year with its segment on racism. What is going on here? These searches reveal more about the population of Google users than about the groups being searched for. More often than not, it is through accidental auditing that one discovers these flaws in technological systems. People who argue against any sort of tweaking of the algorithms argue that the algorithms are a mirror of reality. What this aphorism fails to acknowledge is that it is not physical reality but rather social reality that we are mirroring. Social reality is, by its very definition, inherently flawed and biased towards one group or another.
Adding “objectivity” through an algorithm or systematic analysis can still add bias, not because the analysis or the algorithm used to analyze the data is itself biased, but because the systems that generate the data (e.g., law enforcement departments or the judicial system) are biased. Consider a scenario where African Americans are more likely to be charged and incarcerated for a particular offense, while other people who commit the same offense are less likely to be charged. Over time the data will show a higher crime rate and incarceration rate for African Americans, even though it is the bias in the system itself that leads to this state of affairs. Any algorithm or other type of analysis will reproduce this observation. The bias will remain unless we add explicit conditions that check against the rates of incarceration of other groups for the same crime, or segment the data.
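To make the scenario concrete, here is a minimal sketch in Python. All names and numbers are hypothetical and chosen only for illustration: two groups of equal size commit the offense at the same rate, but one group is charged twice as often. In real data the number of actual offenders is rarely observed, so the segmented check is an idealization of the "explicit condition" described above.

```python
from dataclasses import dataclass


@dataclass
class GroupRecord:
    """Hypothetical counts for one group and one offense."""
    population: int
    offenders: int  # people who actually committed the offense (assumed known here)
    charged: int    # offenders who were charged and ended up in the data


def recorded_rate(g: GroupRecord) -> float:
    """Charged people per capita: the 'crime rate' a naive analysis sees."""
    return g.charged / g.population


def charge_rate(g: GroupRecord) -> float:
    """Share of offenders who were charged: the segmented check."""
    return g.charged / g.offenders


# Same population and same number of offenders, different treatment by the system.
group_a = GroupRecord(population=10_000, offenders=500, charged=400)
group_b = GroupRecord(population=10_000, offenders=500, charged=200)

# The naive view suggests group A offends twice as often as group B.
naive_ratio = recorded_rate(group_a) / recorded_rate(group_b)

# The segmented view shows where the gap comes from: offenders in group A
# are charged 80% of the time, offenders in group B only 40% of the time.
gap = charge_rate(group_a) - charge_rate(group_b)

print(f"recorded-rate ratio: {naive_ratio:.1f}")
print(f"charge-rate gap for the same offense: {gap:.0%}")
```

The naive ratio and the charge-rate gap come from the same records. Only the second comparison holds the offense fixed and so exposes the bias in how the data was generated.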
Such biases can also carry over to other domains, like recommendation systems. Consider the infamous case of admissions decisions at St. George’s Hospital Medical School. On the surface, the idea of an unbiased system that uses past admissions data to make decisions about the future makes sense, as it would not have the same biases as human decision makers. However, using past decisions that could have been made by biased people does not reduce the bias. Rather, it perpetuates the bias: if minorities were left out in the past because of some systematic bias, then even the “unbiased, objective” algorithm will make the same biased decisions.
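A toy sketch shows how a rule learned from past decisions simply replays them. This is not a reconstruction of the St. George’s system; the function name `fit_majority_rule`, the group labels and the counts are all hypothetical. The "model" just memorizes the most common past decision for each combination of qualification and group.

```python
from collections import defaultdict


def fit_majority_rule(history):
    """Learn the most common past decision for each (qualified, group) pair.

    history: iterable of (qualified, group, admitted) tuples.
    Returns a dict mapping (qualified, group) -> predicted admission (bool).
    """
    votes = defaultdict(lambda: [0, 0])  # key -> [rejections, admissions]
    for qualified, group, admitted in history:
        votes[(qualified, group)][int(admitted)] += 1
    return {key: counts[1] > counts[0] for key, counts in votes.items()}


# Hypothetical history: every applicant is equally qualified, but past
# reviewers admitted group "b" far less often than group "a".
history = (
    [(True, "a", True)] * 90 + [(True, "a", False)] * 10
    + [(True, "b", True)] * 30 + [(True, "b", False)] * 70
)

rule = fit_majority_rule(history)

# Two new applicants with identical qualifications get different outcomes.
print("qualified applicant, group a ->", rule[(True, "a")])
print("qualified applicant, group b ->", rule[(True, "b")])
```

Nothing in the code mentions prejudice, yet the learned rule rejects the qualified applicant from group "b" because that is what the historical decisions did.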
One way to reduce the bias and tackle this issue is algorithmic auditing. Consider the following illustrative example. Based on historical transaction data, a targeted advertisement campaign targets 1,000 users. All of them happen to be white, even though the algorithm uses only the history of click, usage and purchase patterns to determine which users should be targeted. The question arises: is the algorithm being racist? At a fundamental level, of course not, because nothing explicit in the algorithm states that it should target or not target a particular set of people. It is the bias in the system (judicial, educational, governmental, etc.) that leads to the production of the data which is then fed into the algorithm.
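One simple form such an audit could take is comparing how often each group is selected. The sketch below is an assumption about what an audit might look like, not a standard procedure: the function names, the candidate pool sizes and the 0.8 flagging threshold are all hypothetical. Only the 1,000 targeted users, all white, come from the example above.

```python
from collections import Counter


def selection_rates(users):
    """users: iterable of (group, targeted) pairs -> {group: share targeted}."""
    totals, selected = Counter(), Counter()
    for group, targeted in users:
        totals[group] += 1
        if targeted:
            selected[group] += 1
    return {group: selected[group] / totals[group] for group in totals}


def impact_ratio(rates):
    """Lowest selection rate divided by the highest (1.0 means parity)."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 1.0


# Hypothetical candidate pool: 5,000 white users and 2,000 non-white users.
# The campaign targets 1,000 users, all of them white.
users = (
    [("white", True)] * 1000
    + [("white", False)] * 4000
    + [("non_white", False)] * 2000
)

rates = selection_rates(users)
ratio = impact_ratio(rates)
flagged = ratio < 0.8  # illustrative threshold, an assumption of this sketch

print(rates)
print("impact ratio:", ratio, "flag for review:", flagged)
```

The audit does not need to know why the algorithm chose these users. It only looks at outcomes, which is what makes it usable even when the targeting logic never refers to race.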
Why stop here? In the future we may see people who want to introduce automation even into jury decisions and judges’ decisions. Just imagine the result if historical data is used by such a system or algorithm to make its decisions without any tweaking or conditions. We may end up with an efficient computing judiciary that is as biased as its human counterparts, if not more so. Crime prediction and sentencing thus have the potential to be a socially divisive issue. The flip side of racial bias in data is that one can also use big data to point towards systematic bias in the system. To sum up the argument, one can say that while algorithms and data may not be racist, the systems that generate the data can be.