INTELLIGENT SYNTHESIS TECHNOLOGY OF CHINESE SPEECH FOR SPEECH NAVIGATION, 504-514.

doi:10.2316/J.2024.206-1050

Kade Tuerxun

References

[1] Z. Mu, X. Yang, and Y. Dong, Review of end-to-end speech synthesis technology based on deep learning, 2021, arXiv:2104.09995.
[2] P. Janyoi and P. Seresangtakul, F0 modeling for Isarn speech synthesis using deep neural networks and syllable-level feature representation, International Arab Journal of Information Technology, 17(6), 2020, 906–915.
[3] Y. Wang, A speech synthesis model with mood based on variational autoencoder, Computer Science and Application, 10(12), 2020, 2159–2167.
[4] T. Celin, T. Nagarajan, and P. Vijayalakshmi, Data augmentation using virtual microphone array synthesis and multi-resolution feature extraction for isolated word dysarthric speech recognition, IEEE Journal of Selected Topics in Signal Processing, 14(2), 2020, 346–354.
[5] Y. Wakabayashi, Speech enhancement using harmonic-structure-based phase reconstruction, Acoustical Science and Technology, 40(3), 2019, 162–169.
[6] A. Ahmad, M.R. Selim, M.Z. Iqbal, and S.R. Mohammad, An encoder-decoder based grapheme-to-phoneme converter for Bangla speech synthesis, Acoustical Science and Technology, 40(6), 2019, 374–381.
[7] S. Lee and H.S. Pang, Feature extraction based on the non-negative matrix factorization of convolutional neural networks for monitoring domestic activity with acoustic signals, IEEE Access, 8(7), 2020, 122384–122395.
[8] Y. Lin, F. Wang, X. Cui, L. Hong, and Y. Liu, A parameterized representation for self-motion manifold of crawler crane robots, International Journal of Robotics and Automation, 37(2), 2022, 219–226.
[9] R. Liu, B. Sisman, F. Bao, G. Gao, and H. Li, Modeling prosodic phrasing with multi-task learning in Tacotron-based TTS, IEEE Signal Processing Letters, 27(8), 2020, 1470–1474.
[10] X. Zhou, Z. Ling, and L.R. Dai, UnitNet: A sequence-to-sequence acoustic model for concatenative speech synthesis, IEEE/ACM Transactions on Audio, Speech, and Language Processing, 29(6), 2021, 2643–2655.
[11] B. Pérez-Canedo and J.L. Verdegay, On the application of a lexicographic method to fuzzy linear programming problems, Journal of Computational and Cognitive Engineering, 2(1), 2023, 47–56.
[12] S. Oslund, C. Washington, A. So, T. Chen, and H. Ji, Multiview robust adversarial stickers for arbitrary objects in the physical world, Journal of Computational and Cognitive Engineering, 1(4), 2022, 152–158.
[13] A. Lanza, S. Morigi, I.W. Selesnick, and S. Fiorella, Sparsity-inducing nonconvex nonseparable regularization for convex image processing, SIAM Journal on Imaging Sciences, 12(2), 2019, 1099–1134.
[14] A. Walker, S. Abedi, and S. Kwon, Design of threshold-based energy storage control policy based on rule-constrained two-stage stochastic program, International Journal of Electrical Power & Energy Systems, 137, 2022, 107798–107810.
[15] E. Brown, I.A. Jamsek, L. Liang, F. Rachael, and T.B. Holt, Predicting children’s word recognition accuracy with two distance metrics, The Journal of the Acoustical Society of America, 145(3), 2019, 1798.
[16] V.Y. Semenov, Methods for calculating and coding the parameters of autoregressive speech model when developing the vocoder based on fixed point signal process, Journal of Automation and Information Sciences, 51(2), 2019, 30–40.
[17] Y. Liu and J. Zheng, Es-Tacotron2: Multi-task Tacotron 2 with pre-trained estimated network for reducing the over-smoothness problem, Information (Switzerland), 10(4), 2019, 131–152.
[18] Z. Zheng, Y. Zhong, S. Tian, A. Ma, and L. Zhang, ChangeMask: Deep multi-task encoder-Transformer-decoder architecture for semantic change detection, ISPRS Journal of Photogrammetry and Remote Sensing, 183(5), 2022, 228–239.

From Journal (206) International Journal of Robotics and Automation - 2024
