Earlier studies have established that human translation exhibits distinctive linguistic features, usually referred to as translationese. Research on machine translationese, however, is still in its infancy despite a few scattered efforts. By comparing machine translation with human translation and original target-language texts, this study investigates whether machine translation also has distinctive linguistic features of its own, to what extent machine translations differ from human translations and target-language originals, and which characteristics are typical of machine translations. To this end, we compiled a corpus containing English translations of modern Chinese literary texts produced by neural machine translation systems and by professional human translators, together with comparable original texts in the target language. Based on this corpus, we conducted a quantitative study of discourse coherence using metrics from three dimensions borrowed from Coh-Metrix: connectives, latent semantic analysis, and the situation/mental model. The results support the existence of translationese in both human and machine translations when they are compared with original texts. However, machine translationese differs from human translationese on some metrics of discourse coherence. Additionally, machine translation systems such as Google and DeepL, when compared with each other, each show distinctive features on some coherence metrics, although on the whole they do not differ significantly from each other on those metrics.