An Empirical Analysis of ChatGPT Translation Error Types in Texts of Chinese Red Culture Based on the MQM Quality Assessment Framework

Keywords

AI translation
ChatGPT
Multidimensional quality metrics
Error types

DOI

10.26689/erd.v7i4.10326

Submitted: 2025-03-29
Accepted: 2025-04-13
Published: 2025-04-28

Abstract

In recent years, translation quality evaluation has emerged as a major, and at times contentious, topic. The industry view on quality is highly fragmented, in part because different kinds of translation projects require very different evaluation methods. In response, the EU-funded QTLaunchPad project developed the Multidimensional Quality Metrics (MQM) framework, an open and extensible system for declaring and describing translation quality metrics using a shared vocabulary of “issue types.” As an effective approach to evaluating AI translation quality, the classification of translation errors has drawn increasing attention. This study examines translation errors in Chinese red culture texts, using the MQM quality assessment framework to categorize errors in translations produced by ChatGPT 4.0, a leading engine among current large language models. The findings are intended to provide pedagogical support for pre-editing and post-editing training in professional translator education.
