AUTOMATIC SCORING OF CONSTRUCTED RESPONSE TEST ANSWERS USING A CONVOLUTIONAL NEURAL NETWORK
DEWI CHIRZAH, Dr. Tri Kuntoro Priyambodo, M.Sc.; Wahyono, S.Kom., Ph.D.
2020 | Thesis | MASTER OF COMPUTER SCIENCE
Computer-Based Test (CBT) delivery with Automated Essay Scoring (AES) is a breakthrough in administering exams in the era of information technology. Existing AES systems measure the textual similarity between an answer and the answer key. This approach runs into difficulty when it is used to score a Constructed Response Test, because test takers phrase their answers in their own words and the answers therefore differ from one another. As a result, a student answer whose meaning and reasoning are the same or nearly the same as the key is marked wrong simply because its text does not match the key. This study uses semantic (meaning-based) similarity to overcome that limitation. The model is built with a Convolutional Neural Network (CNN) on top of pre-trained Word2Vec embeddings. The pipeline consists of training word2vec on Indonesian wiki data, preprocessing, building the embedding matrix, and the CNN. The embedding layer processes each sentence (through the embedding matrix) on the basis of semantic similarity. Classification is categorical, using softmax, and the results are divided into four similarity grades: A >= 0.85, B < 0.85, C < 0.70, and D < 0.55. Grades A, B, and C fall into the Correct class (1) and grade D into the Incorrect class (0). Using Word2Vec and a CNN is shown to improve scoring accuracy. On the small dataset (500), accuracy reaches 0.99 both with and without stopword removal; on the medium dataset (1000) it reaches 0.92. On the large dataset (2084), accuracy reaches 0.98, and 0.99 without stopword removal.
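The motivating problem in the abstract, that textual matching penalizes a correct paraphrase, can be illustrated with a small sketch. This is not the thesis's CNN model: it simply compares word overlap (Jaccard) with the cosine similarity of averaged word vectors. The 4-dimensional vectors in `TOY_VECTORS`, the example sentences, and all function names are invented for illustration; a real system would use vectors trained with Word2Vec on a large corpus.

```python
import math

# Hypothetical toy word vectors (assumption, not from the thesis).
# Words with similar meaning are given similar vectors by hand.
TOY_VECTORS = {
    "plants":  [0.9, 0.1, 0.0, 0.0],
    "trees":   [0.8, 0.2, 0.1, 0.0],
    "produce": [0.1, 0.9, 0.0, 0.0],
    "make":    [0.2, 0.8, 0.1, 0.0],
    "oxygen":  [0.0, 0.1, 0.9, 0.0],
    "air":     [0.1, 0.2, 0.8, 0.0],
    "animals": [0.1, 0.0, 0.0, 0.9],
    "eat":     [0.0, 0.2, 0.0, 0.8],
}


def jaccard(a_tokens, b_tokens):
    """Textual similarity: shared words divided by all distinct words."""
    a, b = set(a_tokens), set(b_tokens)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_vector(tokens, vectors):
    """Average the vectors of the known tokens; None if no token is known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def semantic_similarity(a_tokens, b_tokens, vectors):
    """Semantic similarity: cosine of the two averaged sentence vectors."""
    u, v = mean_vector(a_tokens, vectors), mean_vector(b_tokens, vectors)
    if u is None or v is None:
        return 0.0
    return cosine(u, v)


key = "plants produce oxygen".split()      # answer key
paraphrase = "trees make air".split()      # same idea, different words
wrong = "animals eat plants".split()       # shares a word, different idea

for name, answer in [("paraphrase", paraphrase), ("wrong", wrong)]:
    print(name,
          "textual:", round(jaccard(key, answer), 2),
          "semantic:", round(semantic_similarity(key, answer, TOY_VECTORS), 2))
```

With these toy vectors, word overlap ranks the wrong answer above the paraphrase because the wrong answer happens to share a word with the key, while the vector-based score ranks the paraphrase far higher. Averaging word vectors is only the simplest stand-in for a semantic comparison; the thesis instead feeds the embeddings to a CNN.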
Keywords: Automated Essay Scoring, CNN, Classification, Text Processing, Word2Vec
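The grading rule described in the abstract can be sketched as follows. Two points are assumptions rather than statements from the thesis: the thresholds are read as contiguous bands (B covers 0.70 up to but not including 0.85, and C covers 0.55 up to but not including 0.70), and the similarity score is taken to be the softmax probability of a hypothetical "correct" class, since the abstract does not say how the score is derived from the softmax output. The function names are illustrative.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def grade(score):
    """Map a similarity score in [0, 1] to a grade.

    Assumed bands: A >= 0.85, 0.70 <= B < 0.85, 0.55 <= C < 0.70, D < 0.55.
    """
    if score >= 0.85:
        return "A"
    if score >= 0.70:
        return "B"
    if score >= 0.55:
        return "C"
    return "D"


def label(score):
    """Grades A, B, and C count as Correct (1); grade D as Incorrect (0)."""
    return 0 if grade(score) == "D" else 1


# Hypothetical raw outputs for the classes [incorrect, correct].
logits = [0.2, 1.4]
score = softmax(logits)[1]  # assumed: probability of the "correct" class
print(grade(score), label(score))
```

Under this reading, the binary decision reduces to a single cut at 0.55; the letter grades only add finer detail above that cut.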