On the effect of using different scoring methods for two versions of a test.


International conference: Language Centres in Higher Education: Sharing Innovations, Research, Methodology and Best Practices, 2015

60200 6.2 Languages and Literature

Czech Republic


Faculty of Education

spravedlivost; konstrukt; metoda skórování; ekvivalence

Fairness; construct; scoring method; equivalence


This article presents a study of the effect of a different scoring method on the construct of the Czech Maturita English examination. In particular, it focuses on the consistency of decisions made on the basis of the test results and on the implications for test fairness and for the validity of interpretations of those results. Questions of construct validity, decision consistency and fairness are discussed by comparing the results of two versions of the same test that are scored differently. The findings show that rescoring changes the weights of the skills measured by the tests and thus changes the construct; decision consistency between the differently scored versions was low, and therefore the results of the two test versions cannot be interpreted in the same way. In this particular case, the students tested do not change their strategies, as they believe that the tests are equivalent and fair, and they are not conscious of the possible consequences of rescoring. On the basis of these results, the article tentatively concludes that introducing different scoring may reduce reliability and lead to unfair decisions and judgements about students' ability.


Changing the scoring of test versions is associated with a change in the weights of the skills assessed and a shift in the definition of the construct, which makes it difficult to interpret in the same way the results of test takers who sat two versions of the same test. The conclusions of the study suggest that, when the construct changes, it is problematic to regard the test versions as parallel or equivalent, and that the validity of conclusions and the fairness of decisions about test takers' skill levels may be threatened.