16.09.24 Discussion

Algorithms – Reputation – Karma: Why fake news is taking over the world and defeating quality information

Lev Gershenzon
Founder of The True Story

The global Internet infrastructure, which includes various media outlets (both those duplicating offline sources and those operating solely online), social networks and search engines, has become the primary source of information about what is happening in the world for people worldwide. However, the accessibility and global nature of the information flow have had an unexpected social effect. Producers of fake news, disinformation, myths and propaganda of various kinds are taking advantage of the new model to promote their content as effectively as – or sometimes even more effectively than – creators of quality content. The most ridiculous fabrications and myths can reach a vast audience in a very short time. As a result, the new global information infrastructure of the Internet is making a significant contribution to the rise of populism and political polarisation worldwide.

Humanity has yet to fully comprehend the new mechanisms of information delivery and their side effects, as well as the role played by various elements of this new infrastructure, particularly the algorithms of search platforms and social networks. These algorithms are largely ‘responsible’ for the visibility of certain information, its audience reach, and the speed of its dissemination.

Lev Gershenzon, creator of the first Yandex news aggregator, former head of Yandex.Novosti (2005–2012) and founder of The True Story service, believes that the rigid dichotomy between freedom of information and censorship is false. The issue is not with 'fake news' itself but with the mechanisms that promote it. Gershenzon proposes reconsidering the approach to combating false information. He suggests creating a mechanism of 'information karma' that would automatically provide users with data about the reliability of the source or the entity sharing the information. The algorithms of search platforms and social networks have, to a large extent, devalued the reputation of traditional media outlets, which once served as markers of information quality and structured the flow of information. An 'information karma' score, generated automatically and integrated into the algorithms of networks and platforms, could help restore that reputation system. It would act as a counterbalance to the short-term benefits of clickbait by introducing long-term reputational consequences.

‘Heavy’ and ‘light’ news and the yellowing picture of the world

The main feature of the Internet as an information delivery mechanism is the diversification of channels and the crucial role of intermediaries. The significance of a news story is determined by how many people will see it. Between the publisher and the consumer lies an infrastructure of intermediaries, and it is this infrastructure that ultimately determines how wide an audience a piece of information will reach. As a result, algorithms and the characteristics of social networks, and even more so the algorithms of search platforms, directly and significantly influence the quality of information that reaches consumers. Social networks and search platforms, however, play slightly different roles in this process.

The audience of quality media outlets is often highly dependent on traffic coming from Google search results and its related recommendation service, Google Discover. In contrast, entertainment content, as well as provocative content that often includes misinformation (fake news), relies less on these algorithms, as it spreads through social networks and celebrity influencers. These are two distinct pathways through which information reaches consumers. The downgrading of authoritative sources in search results and recommendation feeds leads to misinformation gaining more prominence, becoming less balanced by serious content.

The problem became apparent in March of this year, when Google updated its core algorithm. Media outlets that rely heavily on visibility in search results were affected globally. The consequences were especially severe for Ukrainian media: quality publications lost between 30% and 70% of their daily audience and up to 80% of their advertising revenue. The purpose of the update, of course, was not to target Ukrainian media, but to 'lighten' the feed, making it less anxiety-inducing and more entertaining and reducing the presence of 'heavy' news. However, in Ukraine, it was quality journalism that suffered: the country is at war, and all political and social processes, as well as news topics, are inevitably tied to it.

At the same time, a similar downgrading of serious content was happening on social networks. Meta, which owns Facebook, decided to spare users from 'heavy' political content, which essentially meant news content, by making changes to its algorithm. In early 2024, Chartbeat analysed 1,930 news websites and found that in December 2023, Facebook accounted for 33% of referral traffic from social media to these sites, down from 50% a year earlier. In some countries, Meta’s decision was tied to government policy. For example, in 2023, after Canada passed a law requiring platforms to pay Canadian media for their content, Facebook imposed a ban on sharing news for Canadian users. Earlier, in 2021, for similar reasons, Meta introduced the same ban in Australia. However, the Australian government soon realised this was not the best outcome of regulation, leading to an agreement with the social network, and Facebook lifted the ban.

These examples show that shifts in search engine or social media algorithms lead to significant changes in the accessibility of certain types of content for consumers, which, in turn, alters their 'worldview'. Algorithms aim to 'lighten' this worldview, noticing that 'heavy' content is in less demand. However, fake news and sensationalism often slip through this filter, and as a result, the content recommended by search engines or approved by social networks becomes increasingly 'yellow', or sensationalised.

Algorithms at the service of autocracies and fake news

If an algorithm can downgrade one type of content – 'serious' content – then fake news, conversely, can rise to the top thanks to the same algorithm. The hypothesis, supported by observational experience, is that a piece of content that demonstrates explosive growth in a short time – meaning it generates many comments and interactions such as 'reactions' and 'shares' – receives a boost from the algorithm. A rapid increase in engagement (virality) signals to the algorithm that the text or video has strong growth potential, so it should be shown to more people.
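
To illustrate the hypothesis, here is a minimal sketch of how an engagement-velocity signal could feed into ranking. It is not the code of any real platform: the names (EngagementSnapshot, virality_boost) and the weights are assumptions chosen only to show the logic – a rapid rise in interactions over a short window is read as a reason to show the item to more people, which is precisely what a coordinated burst of activity can exploit.

```python
# Hypothetical sketch only: a ranking multiplier driven by engagement velocity.
# Neither the names nor the weights come from any real platform.

from dataclasses import dataclass


@dataclass
class EngagementSnapshot:
    """Interaction counts for one piece of content at a point in time."""
    comments: int
    reactions: int
    shares: int

    def total(self) -> int:
        return self.comments + self.reactions + self.shares


def virality_boost(earlier: EngagementSnapshot,
                   later: EngagementSnapshot,
                   hours_between: float) -> float:
    """Return a ranking multiplier based on engagement growth per hour.

    The faster interactions accumulate, the larger the boost - which is
    exactly what a coordinated burst of comments, reactions and shares
    right after publication can exploit.
    """
    growth_per_hour = (later.total() - earlier.total()) / max(hours_between, 0.1)
    # Cap the multiplier so a single burst cannot amplify an item without limit.
    return min(1.0 + growth_per_hour / 100.0, 10.0)


if __name__ == "__main__":
    organic = virality_boost(EngagementSnapshot(3, 10, 1),
                             EngagementSnapshot(8, 25, 4), hours_between=2.0)
    coordinated = virality_boost(EngagementSnapshot(3, 10, 1),
                                 EngagementSnapshot(450, 1200, 600), hours_between=2.0)
    print(f"organic boost: {organic:.2f}, coordinated boost: {coordinated:.2f}")
```

In this toy model, organic growth of a few dozen interactions barely moves the multiplier, while a coordinated burst of a few thousand interactions within two hours immediately hits the cap – which is exactly the asymmetry the following paragraphs describe.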

Thus, if a content creator has resources, networks and, better still, real users who can act in a coordinated manner on instruction, it is not difficult to influence the content distribution process. Platforms have mechanisms to track such coordinated activity, but content creators are also evolving. It is clear that creators and distributors of content continue to spend enormous budgets to manipulate platform algorithms.

For example, in Germany, the AfD (Alternative for Germany) party has a larger audience on TikTok than all other parties combined. However, AfD is not the most popular party in the country, nor is it the most popular among young people. But AfD knows how to work with this platform, and as a result, its content is far more visible than one might expect based on the party’s natural level of popularity.

Some nearly inexplicable phenomena occur in this area. If we examine which websites receive the most referrals, we find not only propaganda sites at the top but also seemingly random ones – sites that, based on their niche themes or content quality, should not logically rank so high. For example, for some time, the number one site in terms of referrals from Google in the Russian Internet segment was the regional Murmansk site ‘hibiny.ru’, with over a million daily visits – ten times more than the newspaper Kommersant and four times more than RIA Novosti. Hibiny.ru features very little political content; it mainly covers local events, sports, and entertainment. In Google Discover, the site appears not with regional news but with 'general interest' materials, often unrelated to Russian topics. It is unlikely that the site's managers are the best at manipulating Google’s algorithm. More likely, the algorithm is simple enough that even a random site can achieve significant 'success'. But for now, this remains just a hypothesis.

Top 5 Russian sites by Google search referrals, 12 September 2024

However, in general terms, the problem is not a hypothesis but a brutal reality: with enough resources and intent, an algorithm can be 'tricked' into artificially boosting content, after which the algorithm, following its own logic, will rapidly amplify the 'injected' content. Platforms defend the 'market-driven' logic of their algorithms: they promote what users demand. But when these algorithms meet the non-market efforts of producers of propaganda content, they themselves become effective promotional tools in those producers' hands.

The ‘information karma’ mechanism

Earlier this year, AI-generated photos of singer Taylor Swift (so-called deep fakes) surfaced on the social media platform X. The images, which depicted Swift nude and were created without her consent, circulated online for 19 hours, gathering millions of views before moderators finally took down the account that posted them. If the platform had a functioning algorithm to detect nudity and AI-generated content, the images would have been removed much sooner.

What is the problem here? A search platform, as an information intermediary, is not responsible for the content it indexes, but it is responsible for how widely that content is disseminated, which, as demonstrated, directly depends on the algorithm's settings. Conspiracy theories, lies, and half-truths have always spread. But today, the cost of 'contact' – that is, the cost of delivering a piece of information to a user – is much lower than before. The content creator no longer needs to spend on printing and distributing copies or securing a broadcast frequency. Instead, they embed their material into a ready-made distribution infrastructure, which is indifferent to content quality and cares only about maximising overall audience reach.

In other words, we live in a new reality that society has yet to adapt to. Previously, the reputation mechanism – of a publication, a TV channel and so on – stood in the way of spreading fake content. Now, the only intermediary between the distributor and the audience is a soulless algorithm, which is itself vulnerable to manipulation. As a result, even though the cost of verifying information has also seemingly decreased, we find ourselves in a world where fakes and myths spread at an astounding speed, reaching enormous audiences in just a few hours. It is frightening to imagine what could have happened if Goebbels had had access to television, social media and recommendation services.

At the end of July, in Southport, UK, during a children’s dance club session, three girls were murdered. Shortly after the tragedy, a fake news account on X reported that the alleged killer was a recently arrived refugee, supposedly Muslim based on his name. This sparked a wave of anti-immigrant protests and anti-Muslim riots across the country. When it was soon revealed that the suspect was born in the UK and was not Muslim, the account creators issued an apology for spreading ‘false information’. That account remains active on X to this day.

Of course, it is incorrect to claim that social media is inherently evil. These platforms have made the spread of all kinds of information easier, but they also offer opportunities to make working with information more productive and safer. The solution is not endless moderation; the issue can be approached differently. For example, we can look at how information is collected and analysed in the banking or insurance sectors, where every client builds a credit history. This history reveals a lot about a person and helps predict their future behaviour. Similarly, in the information environment, we can accumulate data on the reliability of content creators and distributors – what could be called their 'information karma'.

Of course, any user can report content to platforms, and within a few hours or days that content might be removed. However, first, during those hours a video or message could reach a massive audience, some of whom may become voluntary re-sharers of the content. Second, there is no unified approach to penalising false and harmful information, and determining the degree of 'falsity' and 'harm' is a separate, complex task. In my view, responsibility should work as follows: a publisher of illegal or false content accumulates 'karma' that is visible to consumers, and its new content is downgraded if its previous publications form a track record of unreliable information.

There are fact-checking projects that allow for the automatic detection of AI-generated, falsified, or illegal content. If it is found that a site or account has repeatedly published or shared such content, there should be consequences for that site or account. The mechanism for these consequences should be integrated into search engine algorithms, which would then immediately inform the user about the reputation of the source or distributor of the information. Essentially, such automatically tracked 'karma' would restore the reputation system, whose importance has been significantly diminished or replaced by current ranking algorithms.
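
To make the proposal concrete, below is a minimal sketch of how such 'karma' could be accumulated and surfaced, assuming a platform already receives fact-checking verdicts (human or automatic) on past publications. All names (SourceKarma, karma_label, ranking_penalty) and all thresholds are hypothetical and purely illustrative, not a proposal for specific values.

```python
# Hypothetical sketch only: accumulating 'information karma' per source from
# fact-checking verdicts and turning it into a user-facing label and a
# ranking penalty. All names and thresholds are illustrative assumptions.

from dataclasses import dataclass


@dataclass
class SourceKarma:
    """Running record of how often a source published debunked content."""
    source_id: str
    total_checked: int = 0
    found_false: int = 0

    def record_verdict(self, was_false: bool) -> None:
        """Register one fact-checking verdict for a publication by this source."""
        self.total_checked += 1
        if was_false:
            self.found_false += 1

    @property
    def false_share(self) -> float:
        return self.found_false / self.total_checked if self.total_checked else 0.0


def karma_label(karma: SourceKarma) -> str:
    """Turn accumulated karma into the kind of signal shown next to content."""
    if karma.total_checked < 5:
        return "insufficient history"
    if karma.false_share > 0.5:
        return "frequent source of debunked content"
    if karma.false_share > 0.1:
        return "has published debunked content"
    return "no significant record of debunked content"


def ranking_penalty(karma: SourceKarma) -> float:
    """Multiplier for a ranking score: the worse the track record, the lower it is."""
    return 1.0 - min(karma.false_share, 0.9)


if __name__ == "__main__":
    source = SourceKarma("example-account")
    for was_false in [False, False, True, False, False, True]:  # two items debunked
        source.record_verdict(was_false)
    print(karma_label(source), f"(ranking multiplier x{ranking_penalty(source):.2f})")
```

The key design choice in this sketch is that karma attaches to the source and persists across publications: a single retraction does not erase the track record, and the resulting label travels with every new item the source publishes.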

Built-in fact-checking and the cost of clickbait

Discussions on this topic often encounter arguments about freedom of information and the potential censorship resulting from various regulatory measures. However, the strict dichotomy of 'censorship versus freedom' seems false. In reality, we face a broad spectrum of informational challenges: in some cases, it is about inciting hatred or calling for violence; in others, it is the spread of false, fake information. This type of information may not directly incite violence but could ultimately lead to or justify it.

This is not a contrived problem. Social media users and consumers of search engine or recommendation service results often lack the ability to discern the quality of information or verify it. According to the 2024 ‘Digital News Report’ from the Reuters Institute, at least a quarter of users on TikTok and X struggle to determine the reliability of information, and among Google users, the figure is 13% (though those who claim to easily recognise fakes might actually be avid consumers of them).

The 'information karma' mechanism and built-in fact-checking should serve as tools that signal to the user that a particular piece of information has a 'significant paper trail', meaning it is supported by multiple credible sources. Conversely, another claim might appear only in publications of a particular type. The combination of fact-checking tools and 'information karma', which tracks how much false information is associated with a source, should act as a recommendation to users.
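
A hedged sketch of what the 'paper trail' signal could look like follows. The hard part – deciding that two publications assert the same claim – is abstracted away here (a real system would need retrieval or clustering); a claim is simply a key, every name is hypothetical, and source karma is represented as the share of a source's checked publications found to be false, as in the previous sketch.

```python
# Hypothetical sketch only: a 'paper trail' score for a claim, weighted by the
# karma of the sources reporting it. Claim matching itself (deciding that two
# publications assert the same thing) is outside the scope of this sketch.

from typing import Dict, List


def paper_trail_score(claim_id: str,
                      coverage: Dict[str, List[str]],
                      false_share_by_source: Dict[str, float]) -> float:
    """Sum the credibility (1 - share of debunked publications) of every
    distinct source that reported the claim. A claim carried only by sources
    with poor karma scores low even if many accounts repeat it."""
    sources = set(coverage.get(claim_id, []))
    return sum(1.0 - false_share_by_source.get(s, 0.5) for s in sources)


if __name__ == "__main__":
    coverage = {
        "claim-a": ["outlet-1", "outlet-2", "outlet-3"],  # broadly corroborated
        "claim-b": ["anon-account-1", "anon-account-2"],  # thin paper trail
    }
    false_share = {"outlet-1": 0.02, "outlet-2": 0.05, "outlet-3": 0.10,
                   "anon-account-1": 0.70, "anon-account-2": 0.85}
    for claim in coverage:
        print(claim, f"paper-trail score: {paper_trail_score(claim, coverage, false_share):.2f}")
```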

Technically, this is entirely feasible. Platforms already have probabilistic classifiers for detecting false information. The obstacle to implementing such an approach is not technological but rather the platforms' will and business models. In areas where regulators require strict moderation of content – such as banning the promotion of violence or hate speech – many platforms manage these tasks effectively. However, limiting the spread of false content doesn’t fit into the platforms' business models. User engagement is more important to them. Accurate information is often less popular than flashy conspiracy theories. And the less user engagement, i.e. the fewer comments and reactions, the less advertising is shown.

To increase the importance of accurate information for platforms, the costs of spreading false or dangerous content must be higher than the revenue from increased user engagement. It would be wrong, however, to impose responsibility on platforms for the quality of all content published on them. But the larger the audience to which they show that content, the higher the responsibility of the intermediary, who can and should provide users with 'metadata' about the information – indicating its origin and presumed quality. Ultimately, for publishers and distributors of false information, the short-term benefits of clickbait will be balanced by long-term reputational costs.