Google DeepMind researchers
Credibility Rating
High (4/5)
High quality. Established institution or organization with editorial oversight and accountability.
Rating inherited from publication venue: Springer
A peer-reviewed journal article from Springer involving Google DeepMind researchers, likely addressing technical or policy aspects of AI safety, alignment, or related concerns through rigorous academic analysis.
Paper Details
Citations
1
Year
2025
Categories
AAAS Articles DO Group
Metadata
journal article · primary source
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Epistemic Collapse | Risk | 49.0 |
Cached Content Preview
HTTP 200 · Fetched Mar 20, 2026 · 98 KB
# AI’s Epistemic Harm: Reinforcement Learning, Collective Bias, and the New AI Culture War
- Research Article
- [Open access](https://www.springernature.com/gp/open-science/about/the-fundamentals-of-open-access-and-open-research)
- Published: 11 July 2025
- Volume 38, article number 102 (2025)
- [Cite this article](https://link.springer.com/article/10.1007/s13347-025-00928-y#citeas)
[Download PDF](https://link.springer.com/content/pdf/10.1007/s13347-025-00928-y.pdf)
[Philosophy & Technology](https://link.springer.com/journal/13347)
## Abstract
Generative AI is increasingly used as an epistemic tool to aid inquiry, broaden our knowledge and generally help us find things out. It is now embedded in internet search functions to condense information and answer user questions without further need to access websites or news articles. However, models based on generative AI come with inherent flaws which may inhibit their effectiveness as tools to aid responsible inquiry. While the risk of misinformation due to ‘hallucinations’ has been documented, this paper instead focuses on the capacity for these models to present moral or political bias reflective of the preferences of the model designers and crowd worker labellers. This paper takes the novel approach of tracing the ethical issue of bias back to a fundamental alignment problem within the process of Reinforcement Learning from Human Feedback (RLHF), a key step in training generative AI models. The potential for language models to be influenced by their designers, censored on certain topics, or trained to lean politically one way or another can insidiously cause epistemic harm. By applying the concept of the _bias of crowds_ to the training process of language models, I show how they can impede effective epistemic inquiry and cause epistemic and moral harm. In doing so, I present an argument showing that
... (truncated, 98 KB total)
Resource ID: c87a82e621f72659 | Stable ID: Yjk3NDFhZT
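
As background for the abstract's claim that labeller preferences enter the model through RLHF, the reward-modelling step is commonly formalised as a Bradley–Terry preference loss over pairwise labeller comparisons. This is a standard formulation from the RLHF literature, not an equation quoted from this article, and the notation below is illustrative:

$$
\mathcal{L}(\theta) = -\,\mathbb{E}_{(x,\,y_w,\,y_l)\sim\mathcal{D}}\left[\log \sigma\big(r_\theta(x, y_w) - r_\theta(x, y_l)\big)\right]
$$

Here $x$ is a prompt, $y_w$ and $y_l$ are the responses the labeller preferred and rejected, $\sigma$ is the logistic function, and $r_\theta$ is the learned reward model that later guides policy fine-tuning. Because $r_\theta$ is fit directly to labeller choices, any systematic pattern in those choices is absorbed into the reward signal, which is one mechanism by which the _bias of crowds_ the abstract describes can propagate into the trained model.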