We present a study of discourse connectives and discourse relations in English parallel texts, i.e. in written and spoken originals as well as in translations and interpretations from German. To this end, we apply automatic procedures to annotate discourse connectives and the relations they trigger in a parallel corpus. We examine the distributions of connectives and discourse relations, comparing the spoken and written modes as well as original and translated or interpreted language production. Furthermore, we analyse translation patterns in terms of translation entropy and link our observations to the phenomena of explicitation and implicitation. We find that in both interpreting and translation, explicitation and implicitation patterns are affected by the cognitive complexity of the discourse relation signalled by the connective. We also show that the difference in the specificity of the same connectives between interpreting and translation depends on the type of relation they trigger.
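As an illustration of the translation entropy mentioned above, the following is a minimal sketch and not the procedure used in the study. It assumes that translation entropy is the Shannon entropy of the distribution of target-language renderings aligned to one source connective. The function name `translation_entropy` and the sample renderings are hypothetical and are not data from the corpus.

```python
import math
from collections import Counter


def translation_entropy(renderings):
    """Shannon entropy (in bits) of the observed target renderings.

    Assumption: `renderings` holds one target-side string per aligned
    occurrence of a single source connective.
    """
    counts = Counter(renderings)
    total = sum(counts.values())
    # H = sum over renderings of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical renderings of one German connective (illustrative only).
sample = ["but", "but", "however", "but"]
print(round(translation_entropy(sample), 3))
```

Under this assumption, a connective that is always rendered the same way has an entropy of zero, while a connective with many equally likely renderings has a high entropy, meaning the choice among renderings is less predictable.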