
Can we rely on machines making decisions for us on illegal content?

By EDRi · February 26, 2020

While automation is necessary for handling a vast amount of content shared by users, it makes mistakes that can be far-reaching for your rights and the well-being of society.

Most of us like to discuss our ideas and opinions on silly and serious issues, share happy and sad moments, and play together on the internet. And it’s a great thing. We all want to be free to learn about new things, get in touch with our friends, and reach out to new people. Every minute, photos, videos, and ideas are being shared. Every single minute, 527,760 photos are shared on Snapchat, 4,146,600 videos are watched on YouTube, 456,000 tweets are sent, and around 46,740 photos are posted on Instagram. Do you know how many minutes we have in one day? 1,440.

These pieces of information are different in nature. Some of them, such as home videos, have nothing to do with the law. Other content clearly breaches the law, such as child abuse material or incitement to violence. And between legal and illegal content, there is a third group that some people find harmful while others have no problem with it. Pornography, for example, is not illegal, but many parents would like to keep their 12-year-old children from accessing it. It is not easy to define, let alone categorise, what is harmful and for whom. It depends on culture, age, circumstances, and many other factors.

Because online platforms host a vast quantity of internet content, they have to rely on automated tools to find and tackle different categories of illegal or potentially harmful content. In particular, dominant players such as Facebook and Google have been using monitoring and filtering technologies to identify and remove content. Do we agree with removing child abuse material and enabling the police to investigate these crimes? Certainly. Are we against the permanent deletion of content that would otherwise serve as vital evidence and documentation of gross human rights abuses and war crimes? Absolutely.

The EU, together with some Member States, has been continuously pushing online platforms to swiftly remove illegal or potentially harmful content, such as online hate speech or terrorist content, often under the threat of fines if they don’t act fast enough. To meet these demands, tech companies have to rely on automated tools to filter out information that should not go online.

Contextual blindness of automated measures silences legitimate speech

Automated decision-making tools lack an understanding of linguistic and cultural differences. Content recognition technologies are unable to assess the context of expressions accurately, and even in straightforward cases they make false matches. In 2017, the pop star Ariana Grande streamed her benefit concert “One Love Manchester” via her YouTube channel. The stream was promptly shut down by YouTube’s upload filter, which wrongly flagged Grande’s show as a violation of her own copyright. On a more serious note, the same automated tools removed thousands of YouTube videos that could serve as evidence of atrocities committed against civilians in Syria, potentially jeopardising any future war crimes investigation that could bring the perpetrators to justice. Because of this contextual blindness, that is, their inability to understand users’ real meaning and intentions, these tools flag and remove content that is completely legitimate. Thus journalists, activists, comedians, artists, and any of us sharing opinions, videos, or pictures online risk being censored because internet companies rely on these poorly performing tools.
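The copyright incident shows how fingerprint-style matching goes wrong: the matcher only knows that an upload resembles a protected work, not who is uploading it or why. The sketch below is a deliberately simplified illustration of that idea, using a plain hash in place of a perceptual fingerprint; it is not YouTube’s actual Content ID system, and all names and data in it are invented.

```python
# Toy sketch of fingerprint-based copyright matching (hypothetical names,
# not YouTube's Content ID). A plain SHA-256 hash stands in for the
# perceptual audio/video fingerprints real systems compute.
import hashlib

def fingerprint(clip: bytes) -> str:
    return hashlib.sha256(clip).hexdigest()

# Reference database of protected works, keyed by fingerprint.
PROTECTED = {
    fingerprint(b"one-love-manchester-recording"): "One Love Manchester broadcast",
}

def check_upload(clip: bytes, uploader: str) -> str:
    match = PROTECTED.get(fingerprint(clip))
    if match:
        # The matcher only knows "this upload matches a protected work".
        # It has no concept that the uploader may be the rights holder.
        return f"blocked: matches '{match}' (uploader '{uploader}' is ignored)"
    return "allowed"

print(check_upload(b"one-love-manchester-recording", uploader="Ariana Grande"))
```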

They’re not a silver bullet

These technologies are sometimes described as “Artificial Intelligence” (AI), a term that conjures up notions of superhuman computational intelligence. However, nothing of the sort exists, nor is it on the horizon. Instead, what this term refers to is advanced statistical models that have been trained to recognise patterns, but with no actual “understanding” or “intelligence”. Content recognition technologies cannot understand the meaning or intention of those who share a post on social media, or the effect it has on others. They merely scan content for patterns in images, text, or audio that correspond to what they have been trained to identify as “hate speech” or “terrorist content”. Because there is no perfect, unambiguous training data, their ability to recognise these patterns is inherently limited to what they have already been shown. Although they can achieve very high levels of accuracy in identifying unambiguous, consistent patterns, their ability to automate the very sensitive task of judging whether something constitutes hate speech will always be fundamentally limited.
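To make that limitation concrete, here is a minimal, purely illustrative sketch of pattern-based text flagging, assuming a tiny list of learned phrases in place of a trained statistical model. No real platform works from a keyword list like this, but the failure mode is the same: the filter matches surface patterns, not meaning or intent.

```python
# Toy illustration of pattern-based content flagging (hypothetical phrases).
# Real systems use trained statistical models, but they share the core
# limitation shown here: they match learned patterns, not meaning.

FLAGGED_PATTERNS = {"attack the infidels", "burn their houses"}  # stands in for training data

def flag_content(text: str) -> bool:
    """Return True if the text matches any pattern the filter has learned."""
    lowered = text.lower()
    return any(pattern in lowered for pattern in FLAGGED_PATTERNS)

posts = [
    # A journalist quoting extremist propaganda to document it:
    # wrongly flagged, because the filter cannot see the reporting context.
    "In its video, the group urged followers to 'attack the infidels'.",
    # Harmful intent in novel phrasing: missed entirely,
    # because it matches nothing the filter has learned.
    "Those people deserve whatever happens to their homes tonight.",
]

for post in posts:
    print(flag_content(post), "->", post)
```

The first post, a journalist documenting extremist propaganda, is flagged; the second, genuinely threatening but phrased in a way the filter has never seen, passes. More sophisticated statistical models shift where this boundary falls, but they cannot remove it.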

Understandably, governments want to show their citizens that they are doing something to keep us safe from terrorism, hate speech, child abuse, or copyright infringement. And companies are very happy to sell their automation technologies as a silver-bullet solution to politicians desperately digging for a simple answer. But we have to keep in mind that no automation will solve problems deeply rooted in our society. We can use these tools to lessen the burden on platforms, but we need safeguards to ensure that we don’t sacrifice our human rights and freedoms to poorly trained automated tools.

So who should decide what we see online? Read the next instalment of this series in the next issue of the EDRi-gram newsletter.

Access Now
https://www.accessnow.org/

How much data do we create every day? The mind-blowing stats everyone should read (21.05.2018)
https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/

A human-centric internet for Europe (19.02.2020)
https://edri.org/a-human-centric-internet-for-europe/

Automation and illegal content: Can we rely on machines making decisions for us? (17.02.2020)
https://www.liberties.eu/en/news/automation-and-illegal-content-article-1/18746

Automation and illegal content: can we rely on machines making decisions for us? (17.02.2020)
https://www.accessnow.org/automation-and-illegal-content-can-we-rely-on-machines-making-decisions-for-us/

(Contribution by Eliška Pírková, EDRi member Access Now, and Eva Simon, Civil Liberties Union for Europe)