Reduction of Neural Machine Translation Failures by Incorporating Statistical Machine Translation
Abstract
1. Introduction
2. Background
2.1. Related Work
2.2. Preliminaries
2.2.1. Neural Machine Translation
2.2.2. Statistical Machine Translation
2.2.3. Classification
- Logistic Regression (LR) [28] is a simple and widely used algorithm for binary classification. It works by modeling the probability of the positive class using a logistic function.
- Decision Tree (DT) [29] is a simple algorithm for binary classification. It works by recursively splitting the data on the features that are most informative for the classification task.
- Gradient-Boosted Decision Tree (GBDT) [30] sequentially builds decision trees, each correcting the errors of the previous ones. Combining the predictions of many trees lets it capture complex patterns in the data while mitigating overfitting through regularization.
- Random Forest (RF) [31] is an ensemble learning method that combines multiple decision trees to improve the accuracy and stability of the model. It works by selecting a subset of features randomly at each node in the decision tree.
- Naive Bayes (NB) [32] is a probabilistic algorithm that assumes independence between features and works by calculating the probability of the observation belonging to each class based on the likelihood and prior probabilities.
- K-Nearest Neighbors (kNN) [33] is a non-parametric algorithm that works by finding the k-nearest data points to a new observation and assigning the label based on the majority of the neighbors.
- Multilayer Perceptron (MLP) [28] is a type of neural network (NN) that can be used for binary classification problems. It works by building a network of interconnected nodes that process input data and produce an output.
- Convolutional Neural Network (CNN) [34] is a type of neural network that excels at analyzing and extracting features from structured data such as images. Layers of convolutional filters automatically learn hierarchical representations, allowing CNNs to capture intricate patterns and relationships in the data and make accurate binary classifications.
- Support Vector Machine (SVM) [28] is a powerful machine learning algorithm that is commonly used for binary classification problems. It works by finding a hyperplane that separates the two classes with the largest possible margin.
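As an illustrative sketch (not the paper's exact setup), the classifiers listed above can be compared on synthetic feature vectors using scikit-learn; the 11 synthetic features stand in for the similarity/distance features described in Section 3.2, and all hyperparameters shown are assumptions.

```python
# Hedged sketch: compare several of the binary classifiers described above
# on synthetic data. The feature count (11) mirrors the number of
# similarity/distance features in Section 3.2; everything else is illustrative.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

# Synthetic stand-in for per-sentence-pair feature vectors.
X, y = make_classification(n_samples=2000, n_features=11, n_informative=8,
                           random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)

classifiers = {
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(random_state=42),
    "GBDT": GradientBoostingClassifier(random_state=42),
    "RF": RandomForestClassifier(random_state=42),
    "NB": GaussianNB(),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "MLP": MLPClassifier(max_iter=1000, random_state=42),
    "SVM": SVC(),
}

# Fit each classifier and report held-out accuracy.
scores = {name: clf.fit(X_tr, y_tr).score(X_te, y_te)
          for name, clf in classifiers.items()}
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

In the paper's setting the binary label would indicate which system (SMT or NMT) produced the better translation of a given source sentence.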
2.3. Aim and Research Contribution
3. Methodology
3.1. Sentence Embeddings
3.2. Feature Extraction
3.2.1. Cosine Similarity
3.2.2. Jensen–Shannon Divergence
3.2.3. Euclidean Distance
3.2.4. Cityblock Distance
3.2.5. Squared Euclidean Distance
3.2.6. Chebyshev Distance
3.2.7. Canberra Distance
3.2.8. Dice Coefficient
3.2.9. Kulczynski Distance
3.2.10. Russell–Rao Similarity
3.2.11. Sokal–Sneath Distance
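The distance and similarity measures named in the subsections above can be sketched directly in NumPy. The function names here are ours, not the paper's, and only a subset of the eleven measures is shown; set-based measures such as the Dice coefficient are illustrated on binarized vectors.

```python
# Hedged sketch of vector-comparison features from Section 3.2 (subset only).
import numpy as np

def cosine_similarity(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def jensen_shannon_divergence(p, q):
    # p, q are probability distributions (non-negative, summing to 1).
    m = 0.5 * (p + q)
    kl = lambda a, b: float(np.sum(np.where(a > 0, a * np.log(a / b), 0.0)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def euclidean(u, v):
    return float(np.linalg.norm(u - v))

def cityblock(u, v):
    return float(np.sum(np.abs(u - v)))

def sqeuclidean(u, v):
    return float(np.sum((u - v) ** 2))

def chebyshev(u, v):
    return float(np.max(np.abs(u - v)))

def canberra(u, v):
    d = np.abs(u) + np.abs(v)
    return float(np.sum(np.where(d > 0, np.abs(u - v) / d, 0.0)))

def dice_coefficient(u, v):
    # u, v are boolean vectors (e.g. embeddings thresholded at their mean).
    u, v = u.astype(bool), v.astype(bool)
    return 2.0 * float(np.sum(u & v)) / (np.sum(u) + np.sum(v))

u = np.array([1.0, 0.0, 2.0])
v = np.array([1.0, 1.0, 1.0])
print(cosine_similarity(u, v))  # 3 / (sqrt(5) * sqrt(3))
print(cityblock(u, v))          # |0| + |-1| + |1| = 2.0
```

Applied to the sentence embeddings of two candidate translations, each measure yields one scalar feature for the classifier.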
3.3. Classification
4. Experiments
4.1. Corpora and Tools
4.2. Experimental Settings for Models’ Training and Classification
4.3. Results
5. Discussion
Translation Examples
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Meaning |
---|---|
MT | Machine Translation |
NMT | Neural Machine Translation |
SMT | Statistical Machine Translation |
HMT | Hybrid Machine Translation |
NLP | Natural Language Processing |
LR | Logistic Regression |
DT | Decision Tree |
GBDT | Gradient-Boosted Decision Tree |
RF | Random Forest |
NB | Naive Bayes |
kNN | K-Nearest Neighbors |
MLP | Multilayer Perceptron |
CNN | Convolutional Neural Network |
SVM | Support Vector Machine |
BLEU | BiLingual Evaluation Understudy |
chrF | Character F-score |
TER | Translation Edit Rate |
BERT | Bidirectional Encoder Representations from Transformers |
mBERT | Multilingual Bidirectional Encoder Representations from Transformers |
DE | Differential Evolution |
RNN | Recurrent Neural Networks |
LSTM | Long Short-Term Memory |
GRU | Gated Recurrent Unit |
OOV | Out-of-vocabulary |
WMT | Workshop on Machine Translation |
References
- Popović, M. Comparing Language Related Issues for NMT and PBMT between German and English. Prague Bull. Math. Linguist. 2017, 108, 209–220. [Google Scholar] [CrossRef]
- Popović, M. Language-related issues for NMT and PBMT for English–German and English–Serbian. Mach. Transl. 2018, 32, 237–253. [Google Scholar] [CrossRef]
- Pires, T.; Schlinger, E.; Garrette, D. How Multilingual is Multilingual BERT? In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; Association for Computational Linguistics: Cedarville, OH, USA; pp. 4996–5001. [Google Scholar] [CrossRef]
- Koehn, P.; Och, F.J.; Marcu, D. Statistical phrase-based translation. In Proceedings of the 2003 Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, Edmonton, AB, Canada, 27 May–1 June 2003. [Google Scholar]
- Koehn, P. Statistical Machine Translation; Cambridge University Press: Cambridge, UK, 2010. [Google Scholar]
- Lopez, A. Statistical machine translation. ACM Comput. Surv. (CSUR) 2008, 40, 1–49. [Google Scholar] [CrossRef]
- Cho, K.; Van Merriënboer, B.; Bahdanau, D.; Bengio, Y. On the properties of neural machine translation: Encoder-decoder approaches. arXiv 2014, arXiv:1409.1259. [Google Scholar]
- Sutskever, I.; Vinyals, O.; Le, Q.V. Sequence to sequence learning with neural networks. Adv. Neural Inf. Process. Syst. 2014, 2, 3104–3112. [Google Scholar]
- Vashishth, S.; Bhandari, M.; Yadav, P.; Rai, P.; Bhattacharyya, C.; Talukdar, P. Incorporating syntactic and semantic information in word embeddings using graph convolutional networks. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; Association for Computational Linguistics: Cedarville, OH, USA, 2019; pp. 3308–3318. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar] [CrossRef]
- Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv 2014, arXiv:1409.0473. [Google Scholar]
- Meng, F.; Lu, Z.; Wang, M.; Li, H.; Jiang, W.; Liu, Q. Encoding Source Language with Convolutional Neural Network for Machine Translation. arXiv 2015, arXiv:1503.01838. [Google Scholar] [CrossRef]
- Stahlberg, F.; Hasler, E.; Byrne, B. The edit distance transducer in action: The University of Cambridge English-German system at WMT16. arXiv 2016, arXiv:1606.04963. [Google Scholar]
- Stahlberg, F. Neural Machine Translation: A Review. J. Artif. Intell. Res. 2020, 69, 343–418. [Google Scholar] [CrossRef]
- Wang, X.; Pham, H.; Dai, Z.; Neubig, G. SwitchOut: An efficient data augmentation algorithm for neural machine translation. arXiv 2018, arXiv:1808.07512. [Google Scholar]
- Sennrich, R.; Haddow, B.; Birch, A. Edinburgh neural machine translation systems for WMT 16. arXiv 2016, arXiv:1606.02891. [Google Scholar]
- Cromieres, F.; Chu, C.; Nakazawa, T.; Kurohashi, S. Kyoto university participation to WAT 2016. In Proceedings of the 3rd Workshop on Asian Translation (WAT2016), Osaka, Japan, 11–16 December 2016; pp. 166–174. [Google Scholar]
- Huang, J.X.; Lee, K.S.; Kim, Y.K. Hybrid Translation with Classification: Revisiting Rule-Based and Neural Machine Translation. Electronics 2020, 9, 201. [Google Scholar] [CrossRef]
- Sen, S.; Hasanuzzaman, M.; Ekbal, A.; Bhattacharyya, P.; Way, A. Neural machine translation of low-resource languages using SMT phrase pair injection. Nat. Lang. Eng. 2021, 27, 271–292. [Google Scholar] [CrossRef]
- Yan, R.; Li, J.; Su, X.; Wang, X.; Gao, G. Boosting the Transformer with the BERT Supervision in Low-Resource Machine Translation. Appl. Sci. 2022, 12, 7195. [Google Scholar] [CrossRef]
- Bacanin, N.; Zivkovic, M.; Stoean, C.; Antonijevic, M.; Janicijevic, S.; Sarac, M.; Strumberger, I. Application of Natural Language Processing and Machine Learning Boosted with Swarm Intelligence for Spam Email Filtering. Mathematics 2022, 10, 4173. [Google Scholar] [CrossRef]
- Fuad, A.; Al-Yahya, M. Cross-Lingual Transfer Learning for Arabic Task-Oriented Dialogue Systems Using Multilingual Transformer Model mT5. Mathematics 2022, 10, 746. [Google Scholar] [CrossRef]
- Baniata, L.H.; Kang, S.; Ampomah, I.K.E. A Reverse Positional Encoding Multi-Head Attention-Based Neural Machine Translation Model for Arabic Dialects. Mathematics 2022, 10, 3666. [Google Scholar] [CrossRef]
- Alokla, A.; Gad, W.; Nazih, W.; Aref, M.; Salem, A.B. Retrieval-Based Transformer Pseudocode Generation. Mathematics 2022, 10, 604. [Google Scholar] [CrossRef]
- Minaee, S.; Kalchbrenner, N.; Cambria, E.; Nikzad, N.; Chenaghlu, M.; Gao, J. Deep Learning–Based Text Classification: A Comprehensive Review. ACM Comput. Surv. 2021, 54, 62. [Google Scholar] [CrossRef]
- Chen, L.C.; Chang, K.H.; Yang, S.C.; Chen, S.C. A Corpus-Based Word Classification Method for Detecting Difficulty Level of English Proficiency Tests. Appl. Sci. 2023, 13, 1699. [Google Scholar] [CrossRef]
- Canbek, G.; Taskaya Temizel, T.; Sagiroglu, S. PToPI: A Comprehensive Review, Analysis, and Knowledge Representation of Binary Classification Performance Measures/Metrics. SN Comput. Sci. 2023, 4, 13. [Google Scholar] [CrossRef] [PubMed]
- Hsu, B.M. Comparison of Supervised Classification Models on Textual Data. Mathematics 2020, 8, 851. [Google Scholar] [CrossRef]
- Panigrahi, R.; Borah, S.; Bhoi, A.K.; Ijaz, M.F.; Pramanik, M.; Kumar, Y.; Jhaveri, R.H. A Consolidated Decision Tree-Based Intrusion Detection System for Binary and Multiclass Imbalanced Datasets. Mathematics 2021, 9, 751. [Google Scholar] [CrossRef]
- Ding, W.; Chen, Q.; Dong, Y.; Shao, N. Fault Diagnosis Method of Intelligent Substation Protection System Based on Gradient Boosting Decision Tree. Appl. Sci. 2022, 12, 8989. [Google Scholar] [CrossRef]
- Lučin, I.; Lučin, B.; Čarija, Z.; Sikirica, A. Data-Driven Leak Localization in Urban Water Distribution Networks Using Big Data for Random Forest Classifier. Mathematics 2021, 9, 672. [Google Scholar] [CrossRef]
- Gan, S.; Shao, S.; Chen, L.; Yu, L.; Jiang, L. Adapting Hidden Naive Bayes for Text Classification. Mathematics 2021, 9, 2378. [Google Scholar] [CrossRef]
- Kang, S. k-Nearest Neighbor Learning with Graph Neural Networks. Mathematics 2021, 9, 830. [Google Scholar] [CrossRef]
- Nadeem, M.I.; Ahmed, K.; Li, D.; Zheng, Z.; Naheed, H.; Muaad, A.Y.; Alqarafi, A.; Abdel Hameed, H. SHO-CNN: A Metaheuristic Optimization of a Convolutional Neural Network for Multi-Label News Classification. Electronics 2023, 12, 113. [Google Scholar] [CrossRef]
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar] [CrossRef]
- Savini, E.; Caragea, C. Intermediate-Task Transfer Learning with BERT for Sarcasm Detection. Mathematics 2022, 10, 844. [Google Scholar] [CrossRef]
- Patil, R.; Boit, S.; Gudivada, V.; Nandigam, J. A Survey of Text Representation and Embedding Techniques in NLP. IEEE Access 2023, 11, 36120–36146. [Google Scholar] [CrossRef]
- Dash, G.; Sharma, C.; Sharma, S. Sustainable Marketing and the Role of Social Media: An Experimental Study Using Natural Language Processing (NLP). Sustainability 2023, 15, 5443. [Google Scholar] [CrossRef]
- de Lima, R.R.; Fernandes, A.M.R.; Bombasar, J.R.; da Silva, B.A.; Crocker, P.; Leithardt, V.R.Q. An Empirical Comparison of Portuguese and Multilingual BERT Models for Auto-Classification of NCM Codes in International Trade. Big Data Cogn. Comput. 2022, 6, 8. [Google Scholar] [CrossRef]
- Gomaa, W.H.; Fahmy, A.A. A Survey of Text Similarity Approaches. Int. J. Comput. Appl. 2013, 68, 13–18. [Google Scholar]
- Dzisevič, R.; Šešok, D. Text Classification using Different Feature Extraction Approaches. In Proceedings of the 2019 Open Conference of Electrical, Electronic and Information Sciences (eStream), Vilnius, Lithuania, 25 April 2019; pp. 1–4. [Google Scholar] [CrossRef]
- Magalhães, D.; Pozo, A.; Santana, R. An empirical comparison of distance/similarity measures for Natural Language Processing. In Proceedings of the Anais do XVI Encontro Nacional de Inteligência Artificial e Computacional, SBC, Porto Alegre, Brasil, 15–18 October 2019; pp. 717–728. [Google Scholar] [CrossRef]
- Wang, J.; Dong, Y. Measurement of Text Similarity: A Survey. Information 2020, 11, 421. [Google Scholar] [CrossRef]
- Ristanti, P.Y.; Wibawa, A.P.; Pujianto, U. Cosine Similarity for Title and Abstract of Economic Journal Classification. In Proceedings of the 2019 5th International Conference on Science in Information Technology (ICSITech), Jogjakarta, Indonesia, 23–24 October 2019; pp. 123–127. [Google Scholar] [CrossRef]
- Park, K.; Hong, J.S.; Kim, W. A Methodology Combining Cosine Similarity with Classifier for Text Classification. Appl. Artif. Intell. 2020, 34, 396–411. [Google Scholar] [CrossRef]
- Eligüzel, N.; Çetinkaya, C.; Dereli, T. A novel approach for text categorization by applying hybrid genetic bat algorithm through feature extraction and feature selection methods. Expert Syst. Appl. 2022, 202, 117433. [Google Scholar] [CrossRef]
- Kadhim, A.I. Survey on Supervised Machine Learning Techniques for Automatic Text Classification. Artif. Intell. Rev. 2019, 52, 273–292. [Google Scholar] [CrossRef]
- Berciu, A.G.; Dulf, E.H.; Micu, D.D. Improving the Efficiency of Electricity Consumption by Applying Real-Time Fuzzy and Fractional Control. Mathematics 2022, 10, 3807. [Google Scholar] [CrossRef]
- Inyang, U.; Akpan, E.; Akinyokun, O. A Hybrid Machine Learning Approach for Flood Risk Assessment and Classification. Int. J. Comput. Intell. Appl. 2020, 19, 2050012. [Google Scholar] [CrossRef]
- Krivulin, N.; Prinkov, A.; Gladkikh, I. Using Pairwise Comparisons to Determine Consumer Preferences in Hotel Selection. Mathematics 2022, 10, 730. [Google Scholar] [CrossRef]
- Machado, J.A.T.; Mendes Lopes, A. Fractional Jensen–Shannon analysis of the scientific output of researchers in fractional calculus. Entropy 2017, 19, 127. [Google Scholar] [CrossRef]
- Shamir, R.R.; Duchin, Y.; Kim, J.; Sapiro, G.; Harel, N. Continuous dice coefficient: A method for evaluating probabilistic segmentations. arXiv 2019, arXiv:1906.11031. [Google Scholar]
- Cha, S.H. Comprehensive Survey on Distance/Similarity Measures between Probability Density Functions. Int. J. Math. Model. Meth. Appl. Sci. 2007, 1, 300–307. [Google Scholar]
- Ibrahim, H.; El Kerdawy, A.M.; Abdo, A.; Eldin, A.S. Similarity-based machine learning framework for predicting safety signals of adverse drug–drug interactions. Inform. Med. Unlocked 2021, 26, 100699. [Google Scholar] [CrossRef]
- Gutiérrez-Reina, D.; Sharma, V.; You, I.; Toral, S. Dissimilarity metric based on local neighboring information and genetic programming for data dissemination in vehicular ad hoc networks (VANETs). Sensors 2018, 18, 2320. [Google Scholar] [CrossRef] [PubMed]
- Bañón, M.; Chen, P.; Haddow, B.; Heafield, K.; Hoang, H.; Esplà-Gomis, M.; Forcada, M.L.; Kamran, A.; Kirefu, F.; Koehn, P.; et al. ParaCrawl: Web-Scale Acquisition of Parallel Corpora. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 4555–4567. [Google Scholar] [CrossRef]
- Neubig, G.; Watanabe, T. Optimization for Statistical Machine Translation: A Survey. Comput. Linguist. 2016, 42, 1–54. [Google Scholar] [CrossRef]
- Lü, Y.; Huang, J.; Liu, Q. Improving Statistical Machine Translation Performance by Training Data Selection and Optimization. In Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague, Czech Republic, 28–30 June 2007; pp. 343–350. [Google Scholar]
- Dugonik, J.; Bošković, B.; Brest, J.; Sepesy Maučec, M. Improving Statistical Machine Translation Quality Using Differential Evolution. Informatica 2019, 30, 629–645. [Google Scholar] [CrossRef]
- Papineni, K.; Roukos, S.; Ward, T.; Zhu, W.J. Bleu: A Method for Automatic Evaluation of Machine Translation. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, Stroudsburg, PA, USA, 7–12 July 2002; pp. 311–318. [Google Scholar] [CrossRef]
- Popović, M. chrF: Character n-gram F-score for automatic MT evaluation. In Proceedings of the Tenth Workshop on Statistical Machine Translation, Lisbon, Portugal, 17–18 September 2015; Association for Computational Linguistics: Cedarville, OH, USA, 2015; pp. 392–395. [Google Scholar] [CrossRef]
- Snover, M.; Dorr, B.; Schwartz, R.; Micciulla, L.; Makhoul, J. A Study of Translation Edit Rate with Targeted Human Annotation. In Proceedings of the 7th Conference of the Association for Machine Translation in the Americas: Technical Papers, Cambridge, MA, USA, 8–12 August 2006; pp. 223–231. [Google Scholar]
- Post, M. A Call for Clarity in Reporting BLEU Scores. In Proceedings of the Third Conference on Machine Translation: Research Papers, Belgium, Brussels, 31 October–1 November 2018; Association for Computational Linguistics: Cedarville, OH, USA, 2018; pp. 186–191. [Google Scholar]
- Banerjee, S.; Lavie, A. METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments. In Proceedings of the ACL Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, Ann Arbor, MI, USA, 29 June 2005; Association for Computational Linguistics: Cedarville, OH, USA, 2005; pp. 65–72. [Google Scholar]
- Rei, R.; Stewart, C.; Farinha, A.C.; Lavie, A. COMET: A Neural Framework for MT Evaluation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; Association for Computational Linguistics: Cedarville, OH, USA, 2020; pp. 2685–2702. [Google Scholar] [CrossRef]
- Sennrich, R.; Haddow, B.; Birch, A. Neural Machine Translation of Rare Words with Subword Units. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Berlin, Germany, 7–12 August 2016; pp. 1715–1725. [Google Scholar] [CrossRef]
- Junczys-Dowmunt, M.; Grundkiewicz, R.; Dwojak, T.; Hoang, H.; Heafield, K.; Neckermann, T.; Seide, F.; Germann, U.; Fikri Aji, A.; Bogoychev, N.; et al. Marian: Fast Neural Machine Translation in C++. In Proceedings of the ACL 2018, System Demonstrations, Melbourne, Australia, 15–20 July 2018; pp. 116–121. [Google Scholar]
- Marian NMT Documentation. Online. 2018. Available online: https://marian-nmt.github.io/docs/cmd/marian/ (accessed on 14 April 2023).
- Koehn, P.; Hoang, H.; Birch, A.; Callison-Burch, C.; Federico, M.; Bertoldi, N.; Cowan, B.; Shen, W.; Moran, C.; Zens, R.; et al. Moses: Open Source Toolkit for Statistical Machine Translation. In Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics Companion Volume Proceedings of the Demo and Poster Sessions, Prague, Czech Republic, 23–30 June 2007; pp. 177–180. [Google Scholar]
- Moses SMT Documentation. Online. 2017. Available online: http://www2.statmt.org/moses/ (accessed on 14 April 2023).
Feature | Name |
---|---|
Cosine similarity | |
Jensen–Shannon divergence | |
Euclidean distance | |
Cityblock distance | |
Squared Euclidean distance | |
Chebyshev distance | |
Canberra distance | |
Dice coefficient | |
Kulczynski distance | |
Russell–Rao similarity | |
Sokal–Sneath similarity |
 | Training | Development | Test | | |
---|---|---|---|---|---|
 | | | SMT | NMT | HMT |
Sentences | 9,000,000 | 500 | 45,000 | 90,000 | 3000 |
Parameter | Value |
---|---|
type | transformer |
workspace GPU memory | 10 GB |
max-length | 100 |
mini-batch-fit | True |
maxi-batch | 1000 |
early-stopping | 10 |
after-epochs | 50 |
valid-metrics | cross-entropy and perplexity |
valid-mini-batch | 64 |
beam-size | 6 |
normalize | 0.6 |
enc-depth | 6 |
dec-depth | 6 |
transformer-heads | 8 |
transformer-postprocess-emb | d |
transformer-postprocess | dan |
transformer-dropout | 0.1 |
label-smoothing | 0.1 |
learn-rate | 0.0003 |
lr-warmup | 16,000 |
lr-decay-inv-sqrt | 16,000 |
optimizer-params | 0.9, 0.98, 1 × 10⁻⁹ |
clip-norm | 5 |
tied-embeddings-all | True |
sync-sgd | True |
exponential-smoothing | True |
Parameter | Value |
---|---|
alignment | grow-diag-final-and |
reordering | msd-bidirectional-fe |
smoothing | improved-kneser-ney |
evaluation metric | BLEU |
n-gram language model order | 5 |
number of generations | 50 |
population size | 25 |
dimension | 14 |
Baseline (NMT) | |||||
---|---|---|---|---|---|
BLEU ↑ | chrF ↑ | TER ↓ | METEOR ↑ | COMET ↑ | |
Slovenian ⇒ English | 46.4 | 65.6 | 40.1 | 70.5 | 83.3 |
English ⇒ Slovenian | 32.0 | 54.1 | 54.4 | 55.3 | 80.7 |
Classifiers in HMT | Slovenian ⇒ English | ||||
---|---|---|---|---|---|
BLEU ↑ | chrF ↑ | TER ↓ | METEOR ↑ | COMET ↑ | |
Logistic Regression (LR) | 46.8 | 65.9 | 41.9 | 70.3 | 83.0 |
Decision Tree (DT) | 47.1 | 65.9 | 42.4 | 69.9 | 82.2 |
Gradient-Boosted Decision Tree (GBDT) | 47.8 | 66.4 | 40.2 | 70.5 | 83.8 |
Random Forest (RF) | 47.7 | 66.4 | 40.4 | 70.5 | 83.5 |
Naive Bayes (NB) | 47.7 | 66.3 | 40.1 | 70.5 | 83.8 |
K-Nearest Neighbor (kNN) | 45.7 | 65.0 | 44.4 | 69.1 | 81.6 |
Multilayer Perceptron (MLP) | 46.3 | 65.5 | 43.2 | 69.8 | 82.2 |
Convolutional Neural Network (CNN) | 46.8 | 65.8 | 42.3 | 70.0 | 82.6 |
Support Vector Machine (SVM) | 47.9 | 66.6 | 39.9 | 70.9 | 83.9 |
Classifiers in HMT | English ⇒ Slovenian | ||||
---|---|---|---|---|---|
BLEU ↑ | chrF ↑ | TER ↓ | METEOR ↑ | COMET ↑ | |
Logistic Regression (LR) | 41.5 | 62.0 | 48.8 | 60.8 | 81.6 |
Decision Tree (DT) | 40.4 | 60.8 | 50.6 | 59.2 | 80.1 |
Gradient-Boosted Decision Tree (GBDT) | 42.4 | 62.5 | 47.5 | 61.4 | 82.4 |
Random Forest (RF) | 41.5 | 62.0 | 48.6 | 60.8 | 81.8 |
Naive Bayes (NB) | 42.4 | 62.6 | 47.4 | 61.5 | 82.5 |
K-Nearest Neighbor (kNN) | 40.7 | 61.3 | 50.3 | 60.0 | 80.6 |
Multilayer Perceptron (MLP) | 42.9 | 63.1 | 48.8 | 61.5 | 80.9 |
Convolutional Neural Network (CNN) | 42.5 | 62.7 | 48.6 | 61.3 | 81.3 |
Support Vector Machine (SVM) | 42.3 | 62.5 | 47.9 | 61.3 | 82.0 |
BLEU ↑ | Distribution [%] | |||||
---|---|---|---|---|---|---|
SMT | NMT | Upper | SMT | NMT | Equal | |
Slovenian ⇒ English | 41.9 | 46.4 | 53.5 | 29.2 | 55.5 | 15.3 |
English ⇒ Slovenian | 41.6 | 32.0 | 49.5 | 44.2 | 43.8 | 12.0 |
BLEU ↑ | chrF ↑ | TER ↓ | ||
---|---|---|---|---|
REF | single market month: sharing ideas online to change europe | |||
SRC | mesec enotnega trga: spremenimo evropo z izmenjavo zamisli | |||
SMT | single market month: sharing ideas online to change europe | 100.0 | 100.0 | 0.0 |
NMT | single market month: changing europe by sharing ideas | 33.0 | 67.7 | 50.0 |
REF | that ’s what they say when they need a card: ) for some 50th wedding anniversary, or a special birthday. | |||
SRC | tako mi rečejo, ko želijo voščilnico: ) za kakšno zlato poroko, pa okrogel rojstni dan. | |||
SMT | that ’s what they say when they need a card: ) for some 50th wedding anniversary, or a special birthday. | 100.0 | 100.0 | 0.0 |
NMT | they tell me when they want a card: ) for a golden wedding and a round birthday. | 18.1 | 38.0 | 52.2 |
REF | a newly designed terrace lies in the comfortable shades of trees and shrubs, only a few meters from the sea, and offers an impressive view on the old part of the marina. | |||
SRC | na novo urejena terasa, le nekaj metrov oddaljena od morja, nudi v prijetnem hladu zelenja impresivni pogled na stari del marine. | |||
SMT | a newly designed terrace lies, only a few meters from the sea, and offers the comfortable shades of trees and shrubs in an impressive view on the old part of the marina. | 83.1 | 94.2 | 5.7 |
NMT | the newly renovated terrace, only a few meters away from the sea, offers an impressive view of the old part of the marina in the pleasant shade of greenery. | 39.6 | 60.0 | 45.7 |
REF | the old town of dubrovnik is easily reachable by a direct bus line departing 50 yards from the hotel. | |||
SRC | do starega mestnega jedra dubrovnika se lahko enostavno odpeljete z direktnim avtobusom, ki ustavlja 50 m stran. | |||
SMT | the old town of dubrovnik is easily reachable by a direct bus line departing 150 feet from the. | 72.3 | 82.9 | 15.0 |
NMT | the old town of dubrovnik can be easily reached by direct bus, 50 m away. | 23.1 | 48.6 | 50.0 |
BLEU ↑ | chrF ↑ | TER ↓ | ||
---|---|---|---|---|
REF | apartmaji mirko staničič, v lasti družine staničič, se nahajajo v mestu brela (splitsko-dalmatinska županija). | |||
SRC | apartments mirko stanicic, owned by the stanicic family, are located in brela (split-dalmatia county). | |||
SMT | apartmaji mirko staničič, v lasti družine staničič, se nahajajo v mestu brela (splitsko-dalmatinska županija). | 100.0 | 100.0 | 0.0 |
NMT | apartma mirko staničič, v lasti družine staničič, se nahaja v mestu brela (splitsko-dalmatinska županija). | 79.1 | 91.2 | 10.5 |
REF | da bi s svojo družino preživeli kvaliteten in nepozaben dopust, smo v našo ponudbo uvrstili počitniške hiše v tkon—otok pasman, ki so kompletno opremljene z vsem, kar vam je potrebno za popolne poletne počitnice. | |||
SRC | in order to spend quality and unforgettable holiday with your family, we have included in our accommodation offer the holiday houses in tkon—island pasman, which are fully equipped with everything you need for a perfect summer vacation. | |||
SMT | da bi preživeli kvaliteten in nepozaben dopust z družino, je v našo ponudbo uvrstili počitniške hiše v tkon—otok pasman, ki so kompletno opremljene z vsem, kar vam je potrebno za popolne poletne počitnice. | 80.7 | 91.0 | 12.5 |
NMT | da bi s svojo družino našli kvaliteten in nepozaben dopust, smo v našo ponudbo uvrstili počitniške hiše v tkon—otok pasman. | 43.6 | 59.1 | 42.5 |
REF | ko ste na opravilu, pritisnite tipko enter, da ga izberete in odprete v pogledu s podrobnostmi. | |||
SRC | use the down arrow key to browse through the results, press enter to select one. | |||
SMT | s tipko s puščico dol lahko brskate po rezultatih iskanja. pritisnite tipko enter, da izberete enega od njih. | 19.1 | 44.5 | 89.5 |
NMT | če želite med rezultati brskati s puščičnimi tipkami dol, pritisnite enter. | 4.9 | 25.5 | 89.5 |
REF | zaradi jasnosti bi bilo treba navedeno uredbo prenoviti. | |||
SRC | in the interests of clarity, that regulation should be recast. | |||
SMT | zaradi jasnosti bi bilo treba navedeno uredbo prenoviti. | 100.0 | 100.0 | 0.0 |
NMT | zaradi jasnosti bi bilo treba to uredbo prenoviti. | 59.7 | 81.5 | 11.1 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Dugonik, J.; Sepesy Maučec, M.; Verber, D.; Brest, J. Reduction of Neural Machine Translation Failures by Incorporating Statistical Machine Translation. Mathematics 2023, 11, 2484. https://doi.org/10.3390/math11112484