PERANGKINGAN TEKS MULTI-DOMAIN SEBAGAI PENDUKUNG TRANSLASI BAHASA ALAMI DENGAN MEMANFAATKAN PENDEKATAN STATISTIK BERDASARKAN TOPOLOGIS TAKSONOMI DAN POLA PENDISTRIBUSIAN RASIO EMAS; MULTI-DOMAIN TEXT RANKING FOR NATURAL LANGUAGE TRANSLATION SUPPORT WITH STATISTICAL BASED APPROACH UTILIZING THE TOPOLOGICAL TAXONOMY AND GOLDEN RATIO DISTRIBUTION PATTERN
VICTOR PHOA, Sri Hartati
2014 | Dissertation | Master's Program (S2) in Computer Science
Observations of machine translation output over the last decade show that translation quality remains a problem. Some machine translation systems already offer complementary disambiguation support (for morphological variation units) through domain selection. Unfortunately, these methods are usually static or single-domain, because the user must specify the domain of the corpus, while ranking based on flat multi-domain indexing does not give good results. Under these constraints, the authors developed a new indexing method and approach called Topological Taxonomy Term Statistical Ratio (T3SR), which is based on taxonomy topology and uses statistical features, distributional properties (based on the golden ratio), heuristics, and relativity. The T3SR method was tested on 10 (ten) corpora and compared with two flat methods: Nearest Statistical Term Ratio (NTSR) and Normalized Ratio Nearest Statistical Term (NNTSR). In these tests, T3SR outperformed the flat methods, which obtained a feasibility score of 60%. T3SR gave very good indexing results, rank patterns, and logical relevance (a feasibility score of 100%), so it is considered very feasible to apply in the disambiguation preprocessing stage of machine translation.
Keywords: indexing; ranking; text classification; machine translation; natural language; disambiguation; golden ratio