Comparison of commercial decoder-only large language models for multilingual sentiment analysis of short text

John Corcoran Burns*, Tom Kelsey

*Corresponding author for this work

Research output: Contribution to journal › Review article › peer-review

Abstract

This article explores multilingual sentiment analysis of short texts using three commercial decoder-only Large Language Models ("LLMs"): OpenAI's ChatGPT, Anthropic's Claude, and Google's Gemini. The training data for these models is approximately 90% English, and it remains an open question whether it is better to evaluate text data in its original language or translate it into English first. We build on previous research on sentiment analysis of multilingual short texts, such as those found on social media, using 1000 short text samples in seven languages (English, Spanish, French, Portuguese, Arabic, Japanese, and Korean) translated into English using Google Translate. We processed these samples with decoder-only LLMs and compared their results with those from other methods (encoder-only LLMs, RNNs, lexicons). We found that decoder-only LLMs achieved the highest accuracy across all sentiment analysis methods when working with the original language data. The only exception was with the French data, where an RNN was the most accurate. Among the three decoder-only LLMs, ChatGPT had the highest accuracy in four of the seven languages, Claude in two, and Gemini ranked second in six of the seven languages.
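To illustrate the kind of pipeline the abstract describes, the sketch below shows zero-shot sentiment classification with a decoder-only chat model. The prompt wording, the three-class label set, and the parsing rules are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a sentiment-classification pipeline for a
# decoder-only chat LLM. Prompt text, labels, and parsing are
# illustrative assumptions, not the paper's exact protocol.

LABELS = ("positive", "negative", "neutral")

def build_prompt(text: str) -> str:
    """Build a zero-shot classification prompt for one short text."""
    return (
        "Classify the sentiment of the following short text as exactly "
        "one word: positive, negative, or neutral.\n\n"
        f"Text: {text}\nSentiment:"
    )

def parse_label(response: str) -> str:
    """Map a free-form model reply to one of the fixed labels.

    Chat models sometimes reply with extra words ("Sentiment:
    Positive."), so match case-insensitively and fall back to
    'neutral' when no known label appears in the reply.
    """
    lowered = response.lower()
    for label in LABELS:
        if label in lowered:
            return label
    return "neutral"

if __name__ == "__main__":
    # The prompt would be sent to a commercial LLM API; the reply
    # string is then normalized before computing accuracy.
    print(build_prompt("I loved this film!"))
    print(parse_label("Sentiment: Positive."))
```

In the study's setup, each of the 1000 samples per language would be classified twice, once in the original language and once on its Google Translate output, with the parsed labels scored against gold annotations.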
Original language: English
Pages (from-to): 319-331
Number of pages: 13
Journal: Journal of Social Media Research
Volume: 2
Issue number: 4
DOIs
Publication status: Published - 4 Dec 2025

Keywords

  • Multilingual analysis
  • Sentiment analysis
  • Social media
  • Artificial intelligence
  • Large language models
