Performance Analysis of an ICD-10 Prediction System on Indonesian-Language Electronic Health Records
Azzamuddien Hanifa, Ir. Adhistya Erna Permanasari, S.T., M.T., Ph.D., IPM.; Dr. Indriana Hidayah, S.T., M.T.
2025 | Thesis | Master's (S2) in Information Technology
Electronic Health Records (EHR) written as free text are difficult to interpret, which makes manual ICD-10 diagnosis coding by coders difficult and error-prone. These coding errors can harm multiple parties. To address this issue, several previous studies have developed ICD diagnosis code prediction systems using deep learning models such as Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Bidirectional Long Short-Term Memory (BILSTM) to provide more accurate results. However, these studies were limited to EHR data in languages other than Indonesian.
This study aims to analyze the performance of several deep learning models in an ICD-10 code prediction system trained on Indonesian-language electronic health record data. It also analyzes how such a prediction system is constructed for Indonesian-language records.
This study demonstrates the preprocessing steps applied to Indonesian-language electronic health records: removing newline ("enter") characters, case folding, abbreviation replacement, and punctuation removal. The text is then converted into vectors using word embedding (word2vec), and the resulting embedding matrix serves as the embedding layer of the deep learning models and as input for training several prediction models. The LSTM model achieved the highest average accuracy, 0.7666. The study concludes that the results are still suboptimal because the training loss (loss) and validation loss (val_loss) remain high. In addition, the CNN Parallel Layer and LSTM models showed indications of overfitting, while the vanilla CNN and BILSTM models showed indications of underfitting.
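The preprocessing steps and the embedding matrix described above can be sketched roughly as follows. This is a minimal illustration, not the thesis's actual implementation: the abbreviation table, the function names `preprocess` and `build_embedding_matrix`, and the toy word vectors are all hypothetical, and a plain dictionary stands in for a trained word2vec model.

```python
import re
import string

# Hypothetical abbreviation table; the thesis does not list its actual entries.
ABBREVIATIONS = {"tdk": "tidak", "dgn": "dengan", "px": "pasien"}


def preprocess(text, abbreviations=ABBREVIATIONS):
    """Apply the four steps named in the abstract, in the order listed."""
    # 1. Replace newline ("enter") characters with spaces.
    text = re.sub(r"[\r\n]+", " ", text)
    # 2. Case folding.
    text = text.lower()
    # 3. Abbreviation replacement on whole words only.
    if abbreviations:
        pattern = re.compile(
            r"\b(" + "|".join(map(re.escape, abbreviations)) + r")\b"
        )
        text = pattern.sub(lambda m: abbreviations[m.group(1)], text)
    # 4. Punctuation removal (replaced by spaces so "a/b" becomes "a b").
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    text = text.translate(table)
    return text.split()


def build_embedding_matrix(tokens, vectors, dim):
    """Map each distinct token to a row; row 0 is reserved for padding.

    `vectors` is a word -> list[float] mapping standing in for word2vec.
    Words without a vector keep an all-zero row (an assumption).
    """
    word_index = {}
    matrix = [[0.0] * dim]
    for tok in tokens:
        if tok not in word_index:
            word_index[tok] = len(matrix)
            matrix.append(list(vectors.get(tok, [0.0] * dim)))
    return word_index, matrix


if __name__ == "__main__":
    tokens = preprocess("Px mengeluh demam,\ntdk batuk.")
    toy_vectors = {"demam": [0.1, 0.2], "batuk": [0.3, 0.4]}  # made-up values
    word_index, matrix = build_embedding_matrix(tokens, toy_vectors, dim=2)
    print(tokens)
    print(word_index)
```

In a real pipeline the matrix would typically be passed as the initial weights of the model's embedding layer, and each record would be encoded as a sequence of row indices from `word_index`.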
Keywords: Electronic Health Records, International Classification of Diseases, ICD-10 Code Prediction System, Deep Learning, CNN, LSTM, BILSTM