Abstract:
Despite recent advances in neural machine interpreting, limited research has assessed
its performance compared with human simultaneous interpreting, particularly in the
English-Russian language pair. Addressing this gap, the present study investigates the
quality of machine interpreting by Yandex compared with that of professional human
simultaneous interpreting. The research aims to identify the strengths and weaknesses of
machine interpreting outputs and to reveal typical errors through a convergent mixed-methods
design. Utilizing a quality assessment methodology, the study integrates
quantitative scoring to measure performance and qualitative feedback to identify
specific errors. To this end, a quality assessment scale consisting of score-based and
feedback components was adapted to evaluate six audio fragments interpreted by both
professional human interpreters and Yandex's neural machine interpreting system. Five expert
assessors were selected to ensure an objective evaluation of the interpretations. Quantitative results indicated
that human interpreters outscored machine interpreting in logical cohesion,
terminology, and style. In contrast, Yandex scored higher than the human interpreters in
completeness and fluency of delivery, successfully handling strong regional accents and high
speech rates.
delivery. Qualitative analysis identified that while machine interpreting demonstrated no
cognitive limitations typical for human interpreters, it resulted in lexical and grammatical
redundance, producing overloaded sentences difficult for comprehension. Yandex also
misinterpreted numbers, leading to significant distortions of meaning. Unnatural delivery,
marked by a robotic, monotonous voice and a lack of prosody, further diminished output
quality. Additionally, machine interpreting struggled with context recognition, resulting in
inaccurate word choices and terminological inconsistencies. This study concludes that
while machine interpreting cannot yet fully replicate human expertise, it holds potential as
a supportive technology, particularly in assisting human interpreters or offering cost-effective
solutions for low-stakes communicative events. Large-scale empirical studies
should be conducted in the future to evaluate professional-grade machine interpreting tools
under real-time conditions and to account for user perceptions.