Cities with a higher incidence of a certain kind of racist tweets report more actual hate crimes related to race, ethnicity, and national origin, a new study of 532 million tweets indicates.
Researchers analyzed the location and linguistic features of tweets posted between 2011 and 2016 and trained a machine learning model (one form of artificial intelligence) to identify and analyze two types of tweets: those that are targeted, directly espousing discriminatory views, and those that are self-narrative, describing or commenting upon discriminatory remarks or acts.
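The paper's actual model and training data are not described here; as a rough illustration of the idea, the sketch below classifies tweets into the two categories with a bag-of-words Naive Bayes classifier. The labeled examples are hypothetical placeholders, not data from the study.

```python
# Minimal sketch of a two-class tweet classifier ("targeted" vs.
# "self-narrative") using bag-of-words Naive Bayes with add-one smoothing.
# This is NOT the study's model; the training examples are made up.
from collections import Counter, defaultdict
import math

def tokenize(text):
    return text.lower().split()

def train(labeled_tweets):
    """labeled_tweets: list of (text, label) pairs."""
    word_counts = defaultdict(Counter)   # label -> word frequencies
    label_counts = Counter()             # label -> number of examples
    for text, label in labeled_tweets:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts

def classify(text, word_counts, label_counts):
    """Return the label with the highest log-posterior score."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # log prior + sum of smoothed log likelihoods
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labeled examples for demonstration only.
data = [
    ("go back where you came from", "targeted"),
    ("you people do not belong here", "targeted"),
    ("someone told me to go back today", "self-narrative"),
    ("a stranger said i do not belong", "self-narrative"),
]
wc, lc = train(data)
print(classify("you do not belong here", wc, lc))  # prints "targeted"
```

A production system would use a far larger labeled corpus and a stronger model, but the core task (mapping tweet text to one of the two categories) is the same.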
The team compared the prevalence of each type of discriminatory tweet to the number of actual hate crimes reported during that same time period in those same cities.
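The comparison described above amounts to a city-level correlation between tweet prevalence and reported hate-crime counts. The study's statistical approach is not detailed here; a minimal sketch using a Pearson correlation over made-up per-city figures looks like this:

```python
# Sketch of the city-level comparison: correlate the prevalence of a
# tweet type with reported hate-crime counts across cities. The numbers
# below are hypothetical illustrations, not the study's data.
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-city data: proportion of targeted tweets vs. crime count.
targeted_rate = [0.02, 0.05, 0.11, 0.08, 0.03]
crime_count = [4, 9, 21, 15, 6]
print(round(pearson_r(targeted_rate, crime_count), 3))  # prints 0.999
```

A positive coefficient corresponds to the study's finding for targeted tweets; a negative one would correspond to the inverse relationship the team reports for self-narrative tweets.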
“We found that more targeted, discriminatory tweets posted in a city related to a higher number of hate crimes,” says Rumi Chunara, an assistant professor of computer science and engineering at the New York University Tandon School of Engineering and biostatistics at the College of Global Public Health.
“This trend across different types of cities (for example, urban, rural, large, and small) confirms the need to more specifically study how different types of discriminatory speech online may contribute to consequences in the physical world.”
The analysis included cities with a wide range of urbanization, varying degrees of population diversity, and different levels of social media use. The team limited the dataset to tweets and bias crimes describing, or motivated by, discrimination based on race, ethnicity, or national origin.
The Federal Bureau of Investigation categorizes and tracks hate crimes. Crimes motivated by race, ethnicity, or national origin represent the largest proportion of hate crimes in the nation. Statistics for sexual orientation crimes were not available in all cities, although the researchers previously studied this form of bias.
The group also identified a set of discriminatory terms and phrases commonly used on social media across the country, as well as terms specific to a particular city or region. These insights could prove useful in identifying groups that may be likelier targets of racially motivated crimes, as well as the types of discrimination prevalent in different places.
While actual Twitter users generated most tweets included in this analysis, the team found that bots generated an average of 8 percent of tweets containing targeted discriminatory language.
The researchers also found a negative relationship between the proportion of race/ethnicity/national-origin-based discrimination tweets that were self-narrations of experiences and the number of crimes based on the same biases in those cities.
While experiences of discrimination in the real world are known psychological stressors with health and social consequences, the implications of online exposure to different types of online discrimination—self-narrations versus targeted, for example—need further study, Chunara notes.
The findings represent one of the largest, most comprehensive analyses of discriminatory social media posts and real-life bias crimes in this country, the researchers say, although they emphasize that the specific causal mechanisms between social media hate speech and real-life acts of violence need further exploration.
Chunara presented the paper at the Association for the Advancement of Artificial Intelligence Conference on Web and Social Media in Munich, Germany.