
A COMPARISON OF VARIOUS METHODS FOR DETECTING ITEM BIAS

Badrun Kartowagiran

2005 | Dissertation

The objective of this study is to identify: (1) the items of the Junior High School mathematics national exit examination that contain statistically significant bias (Differential Item Functioning, DIF) when detected by the Item Characteristic Curve (ICC) method; (2) the items of the same examination that contain statistically significant DIF when detected by Raju’s Area Measure; (3) the items that contain statistically significant DIF when detected by Lord’s Chi-square; (4) the items that contain statistically significant DIF when detected by the likelihood ratio test; and (5) the most sensitive DIF detection method for the Junior High School mathematics national exit examination. The data consist of State Junior High School students’ responses to the mathematics national exit examination of the 2003 academic year. Before the DIF analyses, the test items were screened according to classical test theory using the ITEMAN program package and according to the three-parameter item response model using the BILOG program. Items judged to be good were then subjected to the DIF analyses: the characteristic curve method using BILOG, Raju’s Area Measure and Lord’s Chi-square using the IRT-DIF program, and the likelihood ratio test using the MULTILOG program. For each biased item, the probability of a correct answer for the female and male groups was then plotted using the Maple program to determine whether the bias was uniform or non-uniform. Two approaches were used to find the most sensitive DIF detection method: counting the number of items that each method identified as containing DIF, and checking the validity and reliability of the DIF detection results with confirmatory factor analysis using the LISREL program package.
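The distinction between uniform and non-uniform bias can be illustrated with a minimal sketch. It assumes the standard three-parameter logistic form of the item characteristic curve and approximates the signed and unsigned area between the two groups' curves by the trapezoidal rule on a finite ability interval. This is not the IRT-DIF or Maple procedure used in the study; the scaling constant, the interval, the tolerance, the function names, and all item parameters below are hypothetical.

```python
import math

D = 1.7  # logistic scaling constant; a common convention, assumed here


def p_3pl(theta, a, b, c):
    """Probability of a correct answer under the three-parameter logistic model."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def areas_between_iccs(ref, focal, lo=-10.0, hi=10.0, steps=4000):
    """Trapezoidal signed and unsigned area between two ICCs on [lo, hi].

    ref and focal are (a, b, c) tuples for the reference and focal groups.
    """
    h = (hi - lo) / steps
    signed = 0.0
    unsigned = 0.0
    for i in range(steps + 1):
        theta = lo + i * h
        diff = p_3pl(theta, *ref) - p_3pl(theta, *focal)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end points
        signed += weight * diff * h
        unsigned += weight * abs(diff) * h
    return signed, unsigned


def dif_type(signed, unsigned, tol=1e-3):
    """Curves that never cross satisfy |signed| == unsigned (uniform DIF)."""
    return "uniform" if abs(abs(signed) - unsigned) < tol else "non-uniform"


# Hypothetical item: same discrimination, harder for the focal group.
s, u = areas_between_iccs(ref=(1.0, 0.0, 0.2), focal=(1.0, 0.5, 0.2))
print(round(s, 3), dif_type(s, u))  # signed area is (1 - c) * (0.5 - 0.0) = 0.4

# Hypothetical item: same difficulty, different discrimination (curves cross).
s2, u2 = areas_between_iccs(ref=(0.6, 0.0, 0.2), focal=(1.6, 0.0, 0.2))
print(dif_type(s2, u2))
```

When the two curves share the same guessing parameter, the area is finite and the finite interval loses almost nothing; with unequal guessing parameters the area over the whole ability scale is unbounded, so the interval limits would matter.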
The results of the study show that: (1) the ICC method detected 8 test items as containing statistically significant DIF, namely items 2, 4, 8, 12, 22, 25, 27, and 30; (2) the Raju extended model detected 4 items as containing DIF, namely items 4, 8, 12, and 20; (3) Lord’s Chi-square detected 5 items with DIF, namely items 4, 8, 12, 25, and 29; (4) the likelihood ratio test detected 9 items with DIF, namely items 2, 4, 8, 12, 22, 25, 27, 29, and 30; (5) of the 10 items with DIF, 9 items favour male students and 1 item favours female students; (6) in detecting DIF on the mathematics test of the 2003 Junior High School national exit examination, the likelihood ratio test, the ICC method, Lord’s Chi-square, and Raju’s Area Measure differ in sensitivity, and ranked from the most sensitive to the least they are: the likelihood ratio test, the ICC method, Lord’s Chi-square, and Raju’s Area Measure; (7) test items containing spatial aspects tend to give an advantage to male students, while those containing verbal aspects tend to give an advantage to female students; (8) gender bias in the test items seems to be related to the fact that, thus far, gender differences have been formed, socialized, endorsed, and even socially and culturally structured in the minds of the students through educational media such as religious and civics education, so it is possible that, should gender differences in society diminish, gender-biased items would no longer exist; and (9) the smaller the p value (the estimation error in drawing a conclusion) of a test item containing DIF, as detected by the various indexes, the larger the number of methods that detect it.
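The item-counting comparison reported above can be reproduced with a short sketch. The item numbers are taken from the results as stated; the variable names and the tallying approach are illustrative assumptions, not the procedure used in the study.

```python
# Items flagged as containing DIF by each method, as reported in the results.
flagged = {
    "Likelihood ratio test": {2, 4, 8, 12, 22, 25, 27, 29, 30},
    "ICC method": {2, 4, 8, 12, 22, 25, 27, 30},
    "Lord's Chi-square": {4, 8, 12, 25, 29},
    "Raju's Area Measure": {4, 8, 12, 20},
}

# Every item flagged by at least one method.
all_flagged = set().union(*flagged.values())

# Items flagged by all four methods.
common = set.intersection(*flagged.values())

# Methods ordered by the number of items they flag (most first).
by_count = sorted(flagged, key=lambda m: len(flagged[m]), reverse=True)

# For each flagged item, how many methods detect it.
agreement = {
    item: sum(item in items for items in flagged.values())
    for item in sorted(all_flagged)
}

print(len(all_flagged))   # 10
print(sorted(common))     # [4, 8, 12]
for method in by_count:
    print(method, len(flagged[method]))
print(agreement)
```

Counting flagged items gives the same ordering as the reported sensitivity ranking, and the union of the four sets contains the 10 items mentioned in result (5).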
Accordingly, it is suggested that: (1) interested readers and researchers conduct similar studies in other regions or with other DIF detection models, to find out whether gender bias is present or to verify the strength of the likelihood ratio test; (2) educators minimize or, if possible, eliminate practices both in and outside the classroom that have the potential to create gender differences between male and female students; (3) the Provincial or Regional Offices of Education in charge of curriculum and examinations organize training for teachers in writing and analyzing test items, so that the writing of biased items and procedural errors in detecting biased items can be avoided; and (4) institutions in charge of national examinations, such as the Centre for Educational Evaluation, (a) use the results of this study as considerations in their activities, so that the selection of biased items can be avoided and the most sensitive method for detecting biased items can be chosen, (b) give special consideration to the gender composition of the test writers, so that gender bias can be minimized, and (c) take practical and economic factors into account when selecting a method for detecting biased items.
