VOICE FEATURE EXTRACTION FOR SPEAKER IDENTITY RECOGNITION USING MFCC AND HIDDEN MARKOV MODELS

Budi Darmawan

Budi Darmawan, Dr. Eng. Ir. Risanuri Hidayat, M.Sc.

2011 | Thesis | Master's Program (S2) in Electrical Engineering

Abstract

This study develops a speaker identification system using left-to-right Hidden Markov Models (HMM), with a Euclidean-distance-based method for computing the probability of the observation sequence. The recordings come from 10 speakers, each of whom uttered the word "SAYA" 30 times. Each recording was segmented by its constituent letters, and features were extracted using FFT and MFCC with varying numbers of coefficients. The experiments show that reaching an average accuracy of 90% or higher requires 5 MFCC coefficients per state for 2 speakers, 4 coefficients per state for 3, 4, and 5 speakers, 3 coefficients per state for 6 and 7 speakers, and 5 coefficients per state for 8, 9, and 10 speakers. Reaching 100% accuracy requires 5 MFCC coefficients per state for 2 to 4 speakers and 12 coefficients per state for 5 to 9 speakers. For 10 speakers, the highest accuracy is 99%, obtained with 12 MFCC coefficients per state. The experiments also show that MFCC feature extraction with 6 coefficients per state (5 according to the thesis's English-language abstract) already yields better average accuracy, for every number of speakers tested, than FFT features using 256 samples per state.
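The abstract does not spell out how the Euclidean distance enters the HMM scoring, so the following is only a minimal sketch of one plausible reading, not the thesis's actual method. It assumes each speaker model is a list of per-state mean feature vectors (for example, one state per letter segment), that the emission log score is the negative Euclidean distance between an observation and the state mean, and that the stay and advance transitions have fixed, equal probabilities. All function names, parameter values, and feature vectors below are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def left_to_right_score(observations, state_means,
                        log_stay=math.log(0.5), log_advance=math.log(0.5)):
    """Best-path (Viterbi) log score under a left-to-right model.

    Assumption: the emission log score of state j for an observation is
    -euclidean(observation, state_means[j]). The path must start in the
    first state and end in the last one, and may only stay or advance by one.
    """
    n_states = len(state_means)
    neg_inf = float("-inf")
    score = [neg_inf] * n_states
    score[0] = -euclidean(observations[0], state_means[0])
    for obs in observations[1:]:
        new_score = [neg_inf] * n_states
        for j in range(n_states):
            stay = score[j] + log_stay
            advance = score[j - 1] + log_advance if j > 0 else neg_inf
            new_score[j] = max(stay, advance) - euclidean(obs, state_means[j])
        score = new_score
    return score[-1]


def identify(observations, models):
    """Return the name of the speaker model with the highest score."""
    return max(models, key=lambda name: left_to_right_score(observations, models[name]))


# Hypothetical 2-coefficient feature vectors, four states per speaker.
models = {
    "speaker_a": [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [1.0, 1.0]],
    "speaker_b": [[5.0, 5.0], [6.0, 4.0], [7.0, 5.0], [6.0, 4.0]],
}
utterance = [[0.1, 0.0], [0.9, 1.1], [1.0, 1.0], [2.1, 0.1], [1.0, 0.9]]

print(identify(utterance, models))  # speaker_a
```

In this sketch, raising the number of coefficients per state simply lengthens each feature vector, which is the quantity the experiments above vary.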

Keywords: Hidden Markov Models, MFCC, FFT, Euclidean distance

