The aim of this paper is to show that conceptualizing and defining the subsets of data that a researcher extracts from an online searchable corpus constitutes a step towards greater methodological transparency, especially in settings where researchers do not build their own corpora but rely on those that can be searched online. To that end, Section 2 outlines a succinct typology of online searchable corpora for the study of language-pair-related or translation-related phenomena. Section 3 puts forward the concept of secondary sample corpus, which in turn sheds new light on the status of web-based parallel (and parallelized) corpora. Section 4 is devoted to general methodological requirements and to the requirements of the quantitative approach, the main (but not the only) approach shaping research in Corpus-based Descriptive Translation Studies and Corpus-based Contrastive Linguistics; the focus lies on issues of replicability, comparability, availability and usability. Section 5 draws some brief conclusions.