Investigating the Feasibility of Automatic Assessment of Programming Tasks
ARTICLE
Janet Liebenberg, Vreda Pieterse
JITE-IIP Volume 17, Number 1, ISSN 2165-3151, e-ISSN 2165-316X. Publisher: Informing Science Institute
Abstract
Aim/Purpose: The aims of this study were to investigate the feasibility of automatic assessment of programming tasks and to compare manual and automatic assessment in terms of their effect on students' marks.

Background: Manual assessment of programs written by students can be tedious. Automatic assessment methods might reduce this burden, but their drawbacks may diminish the benefits. The paper reports on the experience of a lecturer trying to introduce automated grading. Students' solutions to a practical Java programming test were assessed both manually and automatically, and the lecturer related the experience to the unified theory of acceptance and use of technology (UTAUT).

Methodology: The participants were 226 first-year students registered for a Java programming course. Of the tests the participants submitted, 214 were assessed both manually and automatically. Various statistical methods were used to compare the manual assessment of students' solutions with the automatic assessment of the same solutions, and the reasons for differences were investigated in detail. A further data collection method was the lecturer's reflection, based on the UTAUT, on the feasibility of automatic assessment of programming tasks.

Contribution: This study enhances knowledge of the benefits and drawbacks of automatic assessment of students' programming tasks. The research contributes to the UTAUT by applying it in a context where it has hardly been used. Furthermore, the study confirms previous work stating that automatic assessment may be less reliable for students with lower marks, but more trustworthy for high-achieving students.

Findings: An automatic assessment tool verifying functional correctness might be feasible for assessing programs written during practical lab sessions, but could be less useful for practical tests and exams, where functional, conceptual, and structural correctness should all be evaluated. In addition, the researchers found that automatic assessment seemed to be more suitable for assessing high-achieving students.

Recommendations for Practitioners: Lecturers should know what assessment goals they want to achieve and should choose the method of assessment accordingly. In addition, practitioners should be aware of the drawbacks of automatic assessment before choosing it.

Recommendation for Researchers: This work serves as an example of how researchers can apply the UTAUT when conducting qualitative research in different contexts.

Impact on Society: The study should be of interest to lecturers considering automated assessment. The two assessments used in the study are typical of the way grading takes place in practice and may help lecturers understand what could happen if they switch from manual to automatic assessment.

Future Research: Investigate the feasibility of automatic assessment of students' programming tasks in a practical lab environment while accounting for structural, functional, and conceptual assessment goals.
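The abstract does not detail which statistical methods were used, but the reference list includes Bland and Altman (1986) on agreement between two measurement methods, so a limits-of-agreement analysis is one plausible reading of the comparison. The Python sketch below illustrates how manual and automatic marks for the same solutions could be compared in that style; the mark arrays are invented for illustration and are not the study's data.

import numpy as np

# Hypothetical paired marks (out of 100) for the same student solutions,
# assessed once manually and once automatically. Illustrative values only.
manual = np.array([55, 70, 82, 40, 91, 63, 77, 58, 85, 49], dtype=float)
automatic = np.array([50, 72, 80, 30, 90, 60, 75, 52, 84, 41], dtype=float)

# Bland-Altman style analysis: study the difference between the two
# methods against the mean of each pair.
diff = manual - automatic
pair_mean = (manual + automatic) / 2

bias = diff.mean()            # systematic difference between the methods
sd = diff.std(ddof=1)         # spread of the differences
loa = (bias - 1.96 * sd, bias + 1.96 * sd)  # 95% limits of agreement

print(f"Mean difference (manual - automatic): {bias:.2f}")
print(f"95% limits of agreement: [{loa[0]:.2f}, {loa[1]:.2f}]")

# Crude check of whether agreement is better for higher-scoring students:
# compare the spread of differences above and below the median mark.
upper = diff[pair_mean >= np.median(pair_mean)]
lower = diff[pair_mean < np.median(pair_mean)]
print(f"SD of differences, higher-scoring half: {upper.std(ddof=1):.2f}")
print(f"SD of differences, lower-scoring half:  {lower.std(ddof=1):.2f}")

The split at the median mark is only a rough proxy for the paper's observation that automatic assessment agrees better with manual marking for high-achieving students; the study itself investigated the reasons for individual differences in detail.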
Citation
Liebenberg, J. & Pieterse, V. (2018). Investigating the Feasibility of Automatic Assessment of Programming Tasks. Journal of Information Technology Education: Innovations in Practice, 17(1), 201-223. Informing Science Institute. Retrieved January 27, 2023 from https://www.learntechlib.org/p/187391/.
References
- Ajzen, I. (1991). The theory of planned behavior. Organizational Behavior and Human Decision Processes, 50(2), 179-211. https://doi.org/10.1016/0749-5978(91)90020-T
- Ajzen, I., & Fishbein, M. (1980). Understanding attitudes and predicting social behavior. Englewood Cliffs, NJ: Prentice-Hall.
- Al-Adwan, A.S., Al-Madadha, A., & Zvirzdinaite, Z. (2018). Modeling students’ readiness to adopt mobile learning in higher education: An empirical study. The International Review of Research in Open and Distributed Learning, 19(1). https://doi.org/10.19173/irrodl.v19i1.3256
- Ala-Mutka, K., Uimonen, T., & Jarvinen, H.-M. (2004). Supporting students in C++ programming courses with automatic program style assessment. Journal of Information Technology Education: Research, 3, 245-262.
- Arifi, S.M., Abdellah, I.N., Zahi, A., & Benabbou, R. (2015). Automatic program assessment using static and dynamic analysis. Proceedings of the 2015 Third World Conference on Complex Systems (WCCS), 1-6.
- Bey, A., & Bensebaa, T. (2011). Algo+, an assessment tool for algorithmic competencies. Paper presented at the 2011 IEEE Global Engineering Education Conference (EDUCON), Amman, Jordan.
- Bey, A., Jermann, P., & Dillenbourg, P. (2018). A comparison between two automatic assessment approaches for programming: An empirical study on MOOCs. Journal of Educational Technology & Society, 21(2), 259-272.
- Biggs, J.B., & Collis, K.F. (1982). Evaluating the quality of learning: The SOLO taxonomy (Structure of the Observed Learning Outcome). New York: Academic Press.
- Birch, G., Fischer, B., & Poppleton, M. (2016). Using fast model-based fault localisation to aid students in self-guided program repair and to improve assessment. Proceedings of the 2016 ACM Conference on Innovation and Technology in Computer Science Education, 168-173.
- Bland, J.M., & Altman, D. (1986). Statistical methods for assessing agreement between two methods of clinical measurement. The Lancet, 327(8476), 307-310. https://doi.org/10.1016/S0140-6736(86)90837-8
- Buyrukoglu, S., Batmaz, F., & Lock, R. (2016). Increasing the similarity of programming code structures to accelerate the marking process in a new semi-automated assessment approach. Proceedings of the 11th International Conference on Computer Science & Education (ICCSE), 371-376.
- Combéfis, S., & Paques, A. (2015). Pythia reloaded: An intelligent unit testing-based code grader for education. Proceedings of the 1st International Workshop on Code Hunt Workshop on Educational Software Engineering, 5-8.
- Council on Higher Education. (2018). VitalStats public higher education 2016. Retrieved from http://www.che.ac.za/sites/default/files/publications/CHE_VitalStats_2016%20webversion.pdf
- Dann, W.P., Cooper, S., & Pausch, R. (2008). Learning to program with Alice. Prentice Hall Press.
- Davis, F.D., Bagozzi, R., & Warshaw, P. (1989). User acceptance of computer technology: A comparison of two theoretical models. Management Science, 35(8), 982-1003.
- Del Fatto, V., Dodero, G., Gennari, R., Gruber, B., Helmer, S., & Raimato, G. (2017). Automating assessment of exercises as means to decrease MOOC teachers’ efforts. Proceedings of the Conference on Smart Learning Ecosystems and Regional Development, 201-208.
- Department of Higher Education and Training. (2017). Statistics on post-school education and training in South Africa: 2015 (pp. 84). Retrieved from
- Edwards, S.H. (2003). Improving student performance by evaluating how well students test their own programs. Journal on Educational Resources in Computing (JERIC), 3(3), 1.
- Ihantola, P., Ahoniemi, T., Karavirta, V., & Seppälä, O. (2010). Review of recent systems for automatic assessment of programming assignments. Proceedings of the 10th Koli Calling International Conference on Computing Education Research, 86-93. https://doi.org/10.1145/1930464.1930480
- Korhonen, A., & Malmi, L. (2000). Algorithm simulation with automatic assessment. ACM SIGCSE Bulletin, 32(3), 160-163.
- Krusche, S., & Seitz, A. (2018). ArTEMiS: An automatic assessment management system for interactive learning. Proceedings of the 49th ACM Technical Symposium on Computer Science Education, Baltimore, Maryland, 284-289.
- Liebenberg, J., Benadé, T., & Ellis, S. (2018). Acceptance of ICT: Applicability of the Unified Theory of Acceptance and Use of Technology (UTAUT) to South African students. The African Journal of Information Systems, 10(3), 1.
- Lister, R. (2010). Computing education research: Geek genes and bimodal grades. ACM Inroads, 1(3), 16-17.
- Orrell, J. (2008). Assessment beyond belief: The cognitive process of grading. In A. Havnes & L. McDowell (Eds.), Balancing dilemmas in assessment and learning in contemporary education (pp. 251-263). London: Routledge.
- Parsons, D., & Haden, P. (2006). Parson’s programming puzzles: A fun and effective learning tool for first programming courses. Proceedings of the 8th Australasian Conference on Computing Education, 52, 157-163.
- Petersen, A., Craig, M., & Zingaro, D. (2011). Reviewing CS1 exam question content. Proceedings of the 42nd ACM Technical Symposium on Computer Science Education, 631-636.
- Pettit, R., Homer, J., Gee, R., Mengel, S., & Starbuck, A. (2015). An empirical study of iterative improvement in programming assignments. Proceedings of the 46th ACM Technical Symposium on Computer Science Education, 410-415.
- Pieterse, V. (2013). Automated assessment of programming assignments. Proceedings of the 3rd Computer Science Education Research Conference (CSERC 2013), 45-56.
- Pieterse, V., & Janse van Vuuren, H. (2015). Experience in the formulation of memoranda for an automarker of simple programming tasks. Proceedings of the 44th Annual Southern African Computer Lecturers’ Association (SACLA), 210-214.
- Pieterse, V., & Sonnekus, I.P. (2003). Why are we doing IT to ourselves? Proceedings of the 33rd Annual Conference of the Southern African Computer Lecturers’ Association (SACLA), Paper 9.
- Pin. (2012). A dynamic binary instrumentation tool [Computer Software]. Santa Clara, CA: Intel Corporation. Retrieved from http://software.intel.com/en-us/articles/pintool
- Poon, C.K., Wong, T.-L., Tang, C.M., Li, J.K.L., Yu, Y.T., & Lee, V.C.S. (2018). Automatic assessment via intelligent analysis of students’ program output patterns. Paper presented at the International Conference on Blended Learning.
- Posavac, E.J. (2015). Program evaluation: Methods and case studies (8th ed.). New York: Routledge.
- Romli, R., Sulaiman, S., & Zamli, K.Z. (2015). Improving automated programming assessments: User experience evaluation using FaSt-generator. Procedia Computer Science, 72, 186-193.
- Šťastná, J., Juhár, J., Biňas, M., & Tomášek, M. (2015). Security measures in automated assessment system for programming courses. Acta Informatica Pragensia, 4(3), 226-241.
- Staubitz, T., Klement, H., Renz, J., Teusner, R., & Meinel, C. (2015). Towards practical programming exercises and automated assessment in Massive Open Online Courses. Proceedings of the 2015 IEEE International Conference on Teaching, Assessment, and Learning for Engineering (TALE), 23-30.
- Staubitz, T., Klement, H., Teusner, R., Renz, J., & Meinel, C. (2016). CodeOcean: A versatile platform for practical programming exercises in online environments. Proceedings of the 2016 IEEE Global Engineering Education Conference (EDUCON), 314-323.
- Valgrind. (2017) (Version 3.13.0) [Computer Software]. Retrieved from http://www.valgrind.org/
- Venkatesh, V., & Davis, F.D. (2000). A theoretical extension of the technology acceptance model: Four longitudinal field studies. Management Science, 46(2), 186-204.
- Venkatesh, V., Morris, M.G., Davis, G.B., & Davis, F.D. (2003). User acceptance of information technology: Toward a unified view. MIS Quarterly, 27(3), 425-478.
- Waugh, K., Thomas, P., & Smith, N. (2007). Teaching and learning applications related to the automated interpretation of ERDs. Proceedings of the 24th British National Conference on Databases (BNCOD’07), 39-47.
- Yu, Y., Tang, C., & Poon, C. (2017). Enhancing an automated system for assessment of student programs using the token pattern approach. IEEE 6th International Conference on Teaching, Assessment, and Learning for Engineering (TALE), 406-413.