Digital Object Identifier

	10.14456/nkrafa.2025.1 The Application of Generative Artificial Intelligence Technology in Voice Conversion
DOI	10.14456/nkrafa.2025.1
Creator	Anon Bangsan
Title	The Application of Generative Artificial Intelligence Technology in Voice Conversion
Contributor	Payap Sirinam
Publisher	Navaminda Kasatriyadhiraj Royal Air Force Academy
Publication Year	2025 (2568 BE)
Journal Title	NKRAFA Journal of Science and Technology
Journal Vol.	22
Journal No.	2
Page no.	135-157
Keyword	Generative Artificial Intelligence, Generative Adversarial Network, Voice Conversion, Cyber Warfare
URL Website	https://ph02.tci-thaijo.org/index.php/nkrafa-sct
Website title	NKRAFA Journal of Science and Technology
ISSN	3057-0913
Abstract	This research aims to 1) explore the appropriate application of artificial intelligence (AI) technology for voice spoofing, 2) develop a generative AI-based voice spoofing model and investigate optimization strategies to enhance its suitability for cyber domain applications, 3) evaluate the performance and deception potential of synthetic voices generated by the model, and 4) propose practical applications of generative AI technology in offensive cyber operations. The findings indicated that MaskCycleGAN-VC was a highly effective generative AI model for voice spoofing in the Thai language. The model could generate synthetic voices that closely resembled the original in naturalness, including rhythm, intonation, and emotional expression. A key feature of the model was that it could be developed and trained within one day using only moderate computational resources. The synthetic voices deceived listeners into believing they were genuine in up to 56% of cases, while genuine voices were misclassified as synthetic in up to 59% of cases. This highlighted the difficulty of distinguishing genuine from synthetic voices in noisy environments. Performance metrics included a Mean Opinion Score (MOS) of up to 3.9 for naturalness and up to 4.2 for similarity, a minimum Mel Cepstral Distortion (MCD) of 5 dB, and a Kernel Deep Speech Distance (KDSD) of 15.9 mKDSD. The model demonstrated significant potential for applications in security and offensive cyber operations, including support for intelligence activities, confusion in emergency scenarios, and simulated training exercises. However, its use should be approached with caution to prevent misuse in unethical contexts.

Bibliography

APA

Anon Bangsan and Payap Sirinam. (2025). The Application of Generative Artificial Intelligence Technology in Voice Conversion. NKRAFA Journal of Science and Technology, 22(2), 135-157. 10.14456/nkrafa.2025.1

Chicago

Anon Bangsan and Payap Sirinam. "The Application of Generative Artificial Intelligence Technology in Voice Conversion". NKRAFA Journal of Science and Technology 22 (2025): 135-157. 10.14456/nkrafa.2025.1

MLA

Anon Bangsan and Payap Sirinam. The Application of Generative Artificial Intelligence Technology in Voice Conversion. Navaminda Kasatriyadhiraj Royal Air Force Academy: n.p. 2025. 10.14456/nkrafa.2025.1
