Please use this identifier to cite or link to this item: https://dspace.iiti.ac.in/handle/123456789/17053
Full metadata record
DC Field | Value | Language
dc.contributor.author | Maurya, Chandresh Kumar | en_US
dc.date.accessioned | 2025-10-31T17:40:59Z | -
dc.date.available | 2025-10-31T17:40:59Z | -
dc.date.issued | 2025 | -
dc.identifier.citation | Gupta, M., Dutta, M., & Maurya, C. K. (2025). Direct speech-to-speech neural machine translation: A survey. Speech Communication, 175. https://doi.org/10.1016/j.specom.2025.103317 | en_US
dc.identifier.issn | 0167-6393 | -
dc.identifier.other | EID(2-s2.0-105019083773) | -
dc.identifier.uri | https://dx.doi.org/10.1016/j.specom.2025.103317 | -
dc.identifier.uri | https://dspace.iiti.ac.in:8080/jspui/handle/123456789/17053 | -
dc.description.abstract | Speech-to-Speech Translation (S2ST) models transform speech from one language to another target language with the same linguistic information. S2ST is important for bridging the communication gap among communities and has diverse applications. In recent years, researchers have introduced direct S2ST models, which have the potential to translate speech without relying on intermediate text generation, have better-decoding latency, and the ability to preserve paralinguistic and non-linguistic features. However, direct S2ST has yet to achieve quality performance for seamless communication and still lags behind the cascade models in terms of performance, especially in real-world translation. To the best of our knowledge, no comprehensive survey is available on the direct S2ST system, which beginners and advanced researchers can look upon for a quick survey. The present work extensively reviews direct S2ST models, data and application issues, and performance metrics. We critically analyze the models’ performance over the benchmark datasets and provide research challenges and future directions. © 2025 Elsevier B.V., All rights reserved. | en_US
dc.language.iso | en | en_US
dc.publisher | Elsevier B.V. | en_US
dc.source | Speech Communication | en_US
dc.subject | Direct speech-to-speech translation | en_US
dc.subject | Discrete units | en_US
dc.subject | Pre-training | en_US
dc.subject | Representation learning | en_US
dc.subject | Self-supervised learning | en_US
dc.subject | Textless training | en_US
dc.title | Direct speech-to-speech neural machine translation: A survey | en_US
dc.type | Review | en_US
Appears in Collections:Department of Computer Science and Engineering

Files in This Item:
There are no files associated with this item.


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
