Assessing Scientific Practices Using Machine-Learning Methods: How Closely Do They Match Clinical Interview Performance?
ARTICLE

E.P. Beggrow, M. Ha, R.H. Nehm, D. Pearl, W.J. Boone

Journal of Science Education and Technology Volume 23, Number 1, ISSN 1059-0145

Abstract

The landscape of science education is being transformed by the new "Framework for Science Education" (National Research Council, "A framework for K-12 science education: practices, crosscutting concepts, and core ideas." The National Academies Press, Washington, DC, 2012), which emphasizes the centrality of scientific practices--such as explanation, argumentation, and communication--in science teaching, learning, and assessment. A major challenge facing the field of science education is developing assessment tools that are capable of validly and efficiently evaluating these practices. Our study examined the efficacy of a free, open-source machine-learning tool for evaluating the quality of students' written explanations of the causes of evolutionary change relative to three other approaches: (1) human-scored written explanations, (2) a multiple-choice test, and (3) clinical oral interviews. A large sample of undergraduates (n = 104) exposed to varying amounts of evolution content completed all three assessments: a clinical oral interview, a written open-response assessment, and a multiple-choice test. Rasch analysis was used to compute linear person measures and linear item measures on a single logit scale. We found that the multiple-choice test displayed poor person and item fit (mean square outfit >1.3), while both oral interview measures and computer-generated written response measures exhibited acceptable fit (average mean square outfit for interview: person 0.97, item 0.97; computer: person 1.03, item 1.06). Multiple-choice test measures were more weakly associated with interview measures (r = 0.35) than the computer-scored explanation measures (r = 0.63). 
Overall, Rasch analysis indicated that computer-scored written explanation measures (1) have the strongest correspondence to oral interview measures, (2) are capable of capturing students' normative scientific and naive ideas as accurately as human-scored explanations, and (3) more validly detect understanding than the multiple-choice assessment. These findings demonstrate the great potential of machine-learning tools for assessing key scientific practices highlighted in the new "Framework for Science Education."

Citation

Beggrow, E.P., Ha, M., Nehm, R.H., Pearl, D. & Boone, W.J. (2014). Assessing Scientific Practices Using Machine-Learning Methods: How Closely Do They Match Clinical Interview Performance? Journal of Science Education and Technology, 23(1), 160-182. Retrieved April 2, 2020 from .

This record was imported from ERIC on November 3, 2015.

ERIC is sponsored by the Institute of Education Sciences (IES) of the U.S. Department of Education.

Copyright for this record is held by the content creator. For more details see ERIC's copyright policy.

Keywords