Handling misinformation in a time of AI: How can we decipher fact from lies?

Ebosetale Jenna Oriarewo

Misinformation isn’t solely or even uniquely an AI issue. It was a problem long before the technology went mainstream. I mean, many of us have older relatives who forward us dozens of WhatsApp broadcast messages full of unverified information.

The rise of generative AI tools, however, now easily accessible on almost every device and app and easy to download, exacerbates the issue. Cases of misinformation range from political parties creating propaganda to boost their own image or smear the opposition, to people fabricating scenarios using someone else’s likeness, to rewriting culture and history. A version of that last case is what inspired this article.

Recently, a person posted on Twitter/X about the origin of the name “hush puppies” and cited Google’s AI overview as their source. According to the overview, “hush puppies may have originated from enslaved people who threw fried cornmeal batter to distract dogs that were watching them.” Hence hush puppy.

Screenshot of the Google AI Overview cited in the Twitter/X post.

There were several replies to this post. Some mocked AI as a source. Some explained why the claim was false, sharing articles that debunk it, including one from a culinary historian. A few people even questioned the effectiveness and logic of throwing food at a dog chasing you, especially since the dog could simply follow the smell of the food back to the fleeing people. Meanwhile, others questioned the validity of the articles posted to debunk the Google AI claim, because ‘how can you debunk an article with another?’

Reading through the entire conversation had me wondering how we can prepare for and handle misinformation, which may spread at even faster rates as AI technology becomes more prevalent. So in this article, we will explore some types of AI misinformation, why they arise, and possible ways to tackle them before they become worse than they currently are.

Types of AI Misinformation

Misinformation, and its deliberate counterpart disinformation, appears in several forms, from completely fabricated claims to incomplete ones taken out of context, and much more. When dealing with misinformation in AI, it’s important to be aware of the different types that may arise and their possible drivers and sources. Some cases of misinformation are a fault of the technology and its design, others are purposely orchestrated by the humans using it, and some are both. The five types highlighted below fall into two categories: Machine-driven Disinformation for the first two and Human-driven Disinformation for the last three.

1. Hallucinations: this refers to situations where a machine learning model produces output that is false, completely made up, or not grounded in its training data. Hallucinations are a common limitation of large language models (LLMs), including ChatGPT and Google’s Gemini, and they are sometimes difficult to spot because, to someone without adequate knowledge of the subject, they read as plausible facts.

Some notable examples include Meta’s now-withdrawn Galactica, which answered a question by citing a non-existent paper attributed to a real author in the relevant field. There was also Google’s Bard, now known as Gemini, which in its first demo ‘claimed that the James Webb Space Telescope had captured the first images of a planet outside our solar system, which wasn’t factually true.’ AI hallucinations are a critical type of misinformation: researchers have found that chatbots hallucinate up to 27% of the time, and in an evaluation of 9 of the most commonly used LLMs, 46.4% of generated texts contained factual errors. Despite this, many people still rely heavily on generative AI tools as their quick source of information for research, work, and more, without making any effort to verify the outputs.

2. Training data flaw: just like in the hush puppy example mentioned earlier, the misinformation spread by Google’s Gemini/AI Overview was only reproducing the data it had been trained on, even though that data was wrong. This is a problem with training models on huge, uncontrolled sets of data scraped from all over the internet. These datasets are rarely, if ever, audited for accuracy, bias, or other discrepancies, so when models are trained on this mix of right and wrong information, they go on to repeat it. (A rough sketch of what a pre-training audit might look like follows this list.)

3. Plausible deniability of reality: this is a phenomenon where a person can deny the truth simply because there is no concrete proof of it. With AI, this could look like people denying involvement in real events by claiming the evidence was fabricated with AI, or accusing someone of using AI when that isn’t the case.

Some examples of this: last year, during the American election season, now-President Donald Trump posted on Truth Social, his social network, claiming that the crowds at Kamala Harris’ campaign rallies were AI-generated. This false claim trended for a while, spreading beyond Truth Social to other social media platforms and the news. There was also the case of an artist who designed a book cover that the internet claimed was AI-generated; even after attempts to prove it wasn’t, many remained so hellbent on the accusation that the artist ended up refunding the author who had hired them. And there was the famous Kate Middleton cancer video, which caused a stir as people kept analyzing whether it was real or AI-generated. Because there is still no guaranteed way to verify whether content is AI-generated, and because people are quick to accept whatever information they see on the internet without doing the ‘extra work’ of verifying it first, plausible deniability of reality will likely be one of the most common forms of AI misinformation in the coming years.

4. Deepfakes and their malicious uses: closely related to the previous point, deepfake technology enables people to replicate another person’s likeness, be it voice or appearance. While this branch of AI has some benefits in accessibility, the downsides matter too. One of them is how easily it enables the creation and spread of false information. When anyone can produce a video bearing another person’s exact likeness, rather than an obvious simulation, there is a world of scenarios they can fabricate to fit whatever agenda. An example came during Nigeria’s 2023 election season, when an alleged leaked audio recording was paraded around as a conversation between one of the presidential candidates, his running mate, and another party member about plans to rig the election in their favor. It was later discovered that the recording was a deepfake.

5. Data manipulation to fit a narrative: unlike in number 2, where misinformation is a fault of the quality of the training data used, data manipulation refers to “the deliberate seeding of training sets with inaccurate information for the purpose of skewing the output of AI models toward misinformation.” This can serve many purposes: revising history to favor or harm a certain group (or erase one entirely), or influencing the reputation of companies, public figures, or political parties. The trouble is that training data sources are, for the most part, difficult to trace, and for people growing increasingly reliant on and trusting of chatbot outputs, this form of misinformation can go unnoticed. The power lies with developers to use it for good or not. (A rough sketch of how such seeding might be flagged also follows below.)
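To make point 2 a little more concrete, here is a minimal, illustrative sketch of what auditing scraped text before it reaches a training set could look like. Everything in it is an assumption made for illustration: the allowlist of vetted domains, the claim-marker heuristics, and the document format are invented, and a real data pipeline would be far more involved.

```python
# Illustrative sketch only: a naive pre-training audit that flags scraped
# documents coming from unvetted sources or making unreferenced claims.
from urllib.parse import urlparse

# Hypothetical allowlist of sources a team has already vetted (assumption).
VETTED_DOMAINS = {"en.wikipedia.org", "loc.gov"}

# Very rough heuristic: phrases that assert an origin story or historical fact.
CLAIM_MARKERS = ("originated from", "was invented by", "first used by")


def audit_document(doc: dict) -> dict:
    """Attach audit flags to a scraped document instead of silently ingesting it."""
    domain = urlparse(doc["url"]).netloc
    flags = []
    if domain not in VETTED_DOMAINS:
        flags.append(f"unvetted source: {domain}")
    text = doc["text"].lower()
    # Flag claim-like text that cites nothing at all.
    if any(marker in text for marker in CLAIM_MARKERS) and "http" not in text:
        flags.append("historical claim with no reference")
    return {**doc, "flags": flags, "needs_review": bool(flags)}


if __name__ == "__main__":
    sample = {
        "url": "https://example-food-blog.com/hush-puppies",
        "text": "Hush puppies originated from cooks throwing fried batter to quiet dogs.",
    }
    print(audit_document(sample))  # flags the document for human review
```

Even a check this crude would route the hush puppy claim to a human reviewer instead of letting it flow straight into a model.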
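And for point 5, deliberately seeded narratives often share a telltale shape: the same claim repeated many times by a narrow cluster of sources. The sketch below is again only an assumption-laden illustration (the thresholds and the (source, claim) corpus format are invented) of how such repetition might be surfaced for review.

```python
# Illustrative sketch only: surface claims that repeat suspiciously often
# but come from very few distinct sources, a rough signature of seeding.
from collections import defaultdict


def find_suspicious_claims(corpus, min_repeats=50, max_distinct_sources=3):
    """corpus: iterable of (source_domain, normalized_claim) pairs."""
    counts = defaultdict(int)
    sources = defaultdict(set)
    for domain, claim in corpus:
        counts[claim] += 1
        sources[claim].add(domain)
    return [
        claim
        for claim, n in counts.items()
        if n >= min_repeats and len(sources[claim]) <= max_distinct_sources
    ]


if __name__ == "__main__":
    corpus = [("seeded-site.example", "brand x cures headaches instantly")] * 60
    corpus += [("unrelated.example", "water boils at 100 c at sea level")]
    print(find_suspicious_claims(corpus))  # -> ['brand x cures headaches instantly']
```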

Why is misinformation still an issue?

There are several reasons why misinformation remains an issue in society, and why it may continue to be one. Some notable ones to consider:

  • People’s refusal to confirm the source and validity of news before spreading it, most likely because the false information conforms to a belief or position they already hold, especially in politics.
  • The ease of distributing news and information across social media.
  • The added difficulty of reliably identifying and verifying AI-generated images and other content.
  • Automation bias, which refers to people’s tendency to trust machine outputs over their own instincts or knowledge. We readily accept technology’s ability to process data faster and make things easier, but many of us have yet to accept the limitations of the tools we use.
  • Lastly, the lack of consequences for those who create and spread misinformation, especially around sensitive topics.

For these reasons, misinformation remains an issue that plagues society. If it is not properly handled now, at this still-early stage of AI’s growth, it could greatly distort our perceptions of reality. So how can we tackle AI-led misinformation?

How to handle AI-led misinformation

Artificial Intelligence adds a layer to the already complex issue of misinformation. Strategies to tackle this serious problem therefore need to consider how AI contributes to disinformation and how to balance solutions against the benefits some of these technologies offer. Here are some ways misinformation in AI can be tackled:

1. Source evaluation: emphasis and awareness training to encourage people to always evaluate the source of a piece of information before accepting it, let alone sharing it. There need to be more concerted efforts to educate the general public about cases of misinformation and their risks. People also need to understand how AI contributes to the problem, so they know to double-check information relayed by a chatbot, an AI model, or any third-party source.

2. Media literacy: more than ever, it has become important for schools, communities, and governments to invest in media literacy education for the masses, especially for younger generations who get their news from social media. Media or information literacy skills would improve people’s general ability to interact with and identify fake, suspicious, and fabricated AI content, instilling not just the ability to spot misinformation but also the instinct to verify first. Although AI-generated content can be difficult to discern because it appears legitimate, media literacy can sharpen skills and train the ordinary eye to pick out inconsistencies. These efforts should also be inclusive, so that the most vulnerable in society, such as disabled people, aren’t left out.

3. Focus on developing specialized or localized datasets and models: larger, general-purpose datasets and models are more likely to contain biased, unchecked, and false information, which these tools then repeat. They are also more likely to push underrepresented contexts and cultures further out of view. For example, African histories are underrepresented in the digital archives used as data sources, which makes it easy for a misrepresented African story to pass as fact because there may not be enough accurate data to counter it. It is for similar reasons that localized models such as ChatBlackGPT, Latimer AI, and Spark Plug have been developed, to accurately represent Black and Brown cultures and histories that models like ChatGPT get wrong or miss. Studies have also shown that when models are not substantially trained on languages other than English, they are more likely to spread misinformation in those languages, often influenced by mistranslation.

4. Regulation with enforcement: regulation remains the beacon of hope for AI safety. As former United States Supreme Court Justice Oliver Wendell Holmes Jr. put it, laws are not made for the good man who acts in a morally upright way, but for the bad man who would push the boundaries of morality to cheat, steal, and harm. The same thinking should be applied to AI regulation. Governments, organizations, and other relevant bodies need to enact policies that place guardrails on the development and use of AI, especially in sensitive industries and use cases, and against the production of misinformation. There must also be sanctions and consequences for those found complicit.

There are plenty of other things that could be done to curb AI misinformation, such as investing in more fact-checking organizations and requiring transparency and traceability of the data used to develop models. It is encouraging that more people are questioning the legitimacy of things shared on the internet, but there is still a lot that people ignore, especially when it’s an output generated in response to a prompt. Generative AI is only going to improve with continued training, which means it will also get harder for the untrained eye to detect what is or isn’t AI-produced. That leaves us in a precarious state where anyone is exposed to the dire consequences of AI-generated disinformation.

It is important that we act now, asking why a given technology needs a given capability (for example, why do we need synthetic voice technology, and why does it need to sound exactly like a real person?) and whether it should be easily accessible to everyone. It is important that we weigh the pros and cons of such technology against society’s well-being. Lastly, as we develop these technologies, we must limit their capacity for harm by stress-testing their tendency to produce misinformation. It shouldn’t be so easy for me to get Meta’s Llama to write a theory built on lies about how vaccines are the leading cause of increased mental disabilities. There may be a disclaimer at the beginning and end of its response, but for a person determined to spread propaganda, how does a disclaimer stop them?
