Search Results
Results 1-2 of 2 for Author or Editor: Zijian Győző Yang
Abstract
One of the most important NLP tasks for industry today is producing summaries of longer text documents. Summarization is an active research topic, and several solutions exist for English. There are two types of text summarization: extractive and abstractive. The goal of extractive summarization is to select the relevant sentences from the text, while abstractive summarization generates a new summary based on the original text. In this research I built the first Hungarian text summarization systems for both the extractive and the abstractive subtasks. Several neural transformer-based methods were used and evaluated. In this publication I present the first Hungarian abstractive summarization tool, based on mBART and mT5 models, which achieved state-of-the-art results.
Abstract
In this research, we give an overview of existing machine translation solutions and assess their performance on the English-Hungarian language pair. Hungarian is considered a challenging language for machine translation because its grammatical structure and word order differ markedly from those of English. We evaluated various machine translation systems from both academia and industry. One key highlight of our work is that our models (Marian NMT, BART) performed significantly better than the solutions offered by most of the market-leading multinational companies. Finally, we fine-tuned different pre-trained models (mT5, mBART, M2M100) for English-Hungarian translation, which achieved state-of-the-art results on our test corpora.