LLMs can unmask pseudonymous users at scale with surprising accuracy

Ars Technica
Large language models (LLMs) can identify individuals from anonymized text with surprising accuracy by leveraging web browsing and reasoning capabilities.

Summary

Researchers have demonstrated that LLMs can deanonymize users from text data, even when starting from anonymized transcripts. Unlike earlier methods, which required structured data, LLMs can browse the web and use reasoning to match text to individuals. In experiments, the models identified 7% of participants in a questionnaire and varying percentages of Reddit users based on their movie preferences, with identification rates rising as users shared more information. The study highlights the growing ability of AI to compromise pseudonymity and raises privacy concerns as AI systems improve.

(Source: Ars Technica)