Determinants of Grader Agreement

An Analysis of Multiple Short Answer Corpora
dc.bibliographicCitation.firstPage: 387 [en_US]
dc.bibliographicCitation.issue: 2 [en_US]
dc.bibliographicCitation.lastPage: 416 [en_US]
dc.bibliographicCitation.volume: 56 [en_US]
dc.contributor.author: Padó, Ulrike
dc.contributor.author: Padó, Sebastian
dc.date.accessioned: 2023-03-27T09:24:03Z
dc.date.accessioned: 2023-03-28T05:57:18Z
dc.date.available: 2023-03-27T09:24:03Z
dc.date.issued: 2021 [en_US]
dc.date.updated: 2023-03-25T15:45:55Z
dc.description.abstract: The ‘short answer’ question format is a widely used tool in educational assessment, in which students write one to three sentences in response to an open question. The answers are subsequently rated by expert graders. The agreement between these graders is crucial for reliable analysis, both in terms of educational strategies and in terms of developing automatic models for short answer grading (SAG), an active research topic in NLP. This makes it important to understand the properties that influence grader agreement (such as question difficulty, answer length, and answer correctness). However, the twin challenges towards such an understanding are the wide range of SAG corpora in use (which differ along a number of dimensions) and the hierarchical structure of potentially relevant properties (which can be located at the corpus, answer, or question levels). This article uses generalized mixed effects models to analyze the effect of various such properties on grader agreement in six major SAG corpora for two main assessment tasks (language and content assessment). Overall, we find broad agreement among corpora, with a number of properties behaving similarly across corpora (e.g., shorter answers and correct answers are easier to grade). Some properties show more corpus-specific behavior (e.g., the question difficulty level), and some corpora are more in line with general tendencies than others. In sum, we obtain a nuanced picture of how the major short answer grading corpora are similar and dissimilar from which we derive suggestions for corpus development and analysis. [en_US]
dc.description.sponsorship: Hochschule für Technik Stuttgart (3377)
dc.identifier.doi: 10.1007/s10579-021-09547-3
dc.identifier.uri: http://resolver.sub.uni-goettingen.de/purl?fidaac-11858/2902
dc.language.iso: eng [en_US]
dc.relation.issn: 1574-020X [en_US]
dc.relation.journal: Language Resources and Evaluation [en_US]
dc.rights: L::CC BY 4.0 [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject.ddc: ddc:370 [en_US]
dc.subject.ddc: ddc:400 [en_US]
dc.subject.field: englishlanguageteaching [en_US]
dc.subject.field: linguistics [en_US]
dc.title: Determinants of Grader Agreement [en_US]
dc.title.alternative: An Analysis of Multiple Short Answer Corpora [en_US]
dc.type: article [en_US]
dc.type.version: publishedVersion [en_US]
dspace.entity.type: Publication
Files
Original bundle
Name: s10579-021-09547-3.pdf
Size: 360.13 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.75 KB
Format: Plain Text