Longterm Wiki

The Perils and Promises of Fact-Checking with Large Language Models

Relevant to AI safety discussions around LLM reliability and deployment in high-stakes information contexts; highlights evaluation challenges and risks of over-trusting LLMs for truth verification tasks.

Metadata

Importance: 42/100 · journal article · primary source

Summary

This paper evaluates LLM agents (GPT-3 and GPT-4) for automated fact-checking, finding that retrieving contextual information significantly improves performance, though accuracy remains inconsistent across query languages and claim types. The study highlights both the promise and the limitations of using LLMs to combat misinformation at scale.

Key Points

  • GPT-4 outperforms GPT-3 in fact-checking tasks, but accuracy varies significantly by query language and claim veracity.
  • LLM agents equipped with contextual retrieval (RAG-style) show markedly improved fact-checking capabilities over base models.
  • Agents explain their reasoning and cite sources, improving transparency but not eliminating inconsistent accuracy.
  • Automated fact-checking is increasingly critical as misinformation spreads faster than human fact-checkers can respond.
  • The study calls for deeper research into failure modes of LLM fact-checking agents before deployment in information ecosystems.
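The agent workflow the paper describes — phrase a search query from the claim, retrieve contextual evidence, then decide and cite sources — can be illustrated with a minimal sketch. Everything here is a stand-in: the corpus, the keyword-overlap retriever, and the `decide` heuristic are hypothetical placeholders for the LLM calls and web search the paper actually uses.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    source: str
    text: str

# Toy "retrieval corpus" standing in for a real web search index (assumption).
CORPUS = [
    Evidence("example.org/physics", "Water boils at 100 degrees Celsius at sea level."),
    Evidence("example.org/travel", "The Eiffel Tower is located in Paris, France."),
]

def phrase_query(claim: str) -> str:
    # In the paper, an LLM rewrites the claim into a search query;
    # lowercasing and stripping punctuation is a crude stand-in.
    return claim.lower().rstrip(".")

def retrieve(query: str, corpus: list[Evidence]) -> list[Evidence]:
    # Keyword overlap as a placeholder for real retrieval.
    terms = set(query.split())
    return [e for e in corpus if terms & set(e.text.lower().split())]

def decide(claim: str, evidence: list[Evidence]) -> dict:
    # An LLM agent would weigh the evidence, explain its reasoning, and
    # cite sources; this stub only labels the claim "supported" when any
    # evidence was retrieved, but it mirrors the paper's emphasis on
    # attaching citations for transparency.
    verdict = "supported" if evidence else "not enough information"
    return {
        "claim": claim,
        "verdict": verdict,
        "citations": [e.source for e in evidence],
    }

def fact_check(claim: str) -> dict:
    query = phrase_query(claim)
    context = retrieve(query, CORPUS)
    return decide(claim, context)
```

The point of the sketch is the pipeline shape, not the components: the paper's finding is that the retrieval step (the `retrieve` stage here) is what markedly improves fact-checking over querying the base model alone.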

Cited by 1 page

| Page | Type | Quality |
| --- | --- | --- |
| AI-Era Epistemic Infrastructure | Approach | 59.0 |

Cached Content Preview

HTTP 200 · Fetched Mar 20, 2026 · 88 KB
## ORIGINAL RESEARCH article

Front. Artif. Intell., 06 February 2024

Sec. Natural Language Processing

Volume 7 - 2024 | [https://doi.org/10.3389/frai.2024.1341697](https://doi.org/10.3389/frai.2024.1341697)

# The perils and promises of fact-checking with large language models

- [Dorian Quelle 1,2† \*](https://loop.frontiersin.org/people/2555390)
- [Alexandre Bovet 1,2†](https://loop.frontiersin.org/people/2090422)

- 1. Department of Mathematical Modeling and Machine Learning, University of Zurich, Zurich, Switzerland

- 2. Digital Society Initiative, University of Zurich, Zurich, Switzerland


## Abstract

Automated fact-checking, using machine learning to verify claims, has grown vital as misinformation spreads beyond human fact-checking capacity. Large language models (LLMs) like GPT-4 are increasingly trusted to write academic papers, lawsuits, and news articles and to verify information, emphasizing their role in discerning truth from falsehood and the importance of being able to verify their outputs. Understanding the capacities and limitations of LLMs in fact-checking tasks is therefore essential for ensuring the health of our information ecosystem. Here, we evaluate the use of LLM agents in fact-checking by having them phrase queries, retrieve contextual data, and make decisions. Importantly, in our framework, agents explain their reasoning and cite the relevant sources from the retrieved context. Our results show the enhanced prowess of LLMs when equipped with contextual information. GPT-4 outperforms GPT-3, but accuracy varies based on query language and claim veracity. While LLMs show promise in fact-checking, caution is essential due to inconsistent accuracy. Our investigation calls for further research, fostering a deeper comprehension of when agents succeed and when they fail.

## 1 Introduction

Fact-checking has become a vital tool to reduce the spread of misinformation online, shown to potentially reduce an individual's belief in false news and rumors (Morris et al., [2020](https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2024.1341697/full#B33); Porter and Wood, [2021](https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2024.1341697/full#B38)) and to improve political knowledge (Nyhan and Reifler, [2015](https://www.frontiersin.org/journals/artificial-intelligence/articles/10.3389/frai.2024.1341697/full#B37)). While verifying or refuting a claim is a core task of any journalist, a variety of dedicated fact-checking organizations have formed to correct misconceptions, rumors, and fake news online. A pivotal moment in the rise of fact-checking happened in 2009 when the prestigious Pulitzer Prize in the national reporting category was awarded to Politifact. Politifact's innovation was to propose the now sta

... (truncated, 88 KB total)