Has ChatGPT Bid Farewell to Traditional Assessment Methods?

Trained on vast swathes of the internet, ChatGPT can now produce complete, logically structured answers to many types of assessment. Will this technology change how students are tested in the future? By Saleh Bubshait.

Many students believe that ChatGPT will mark the end of tedious essays and various other university assessments. The chatbot’s remarkable growth, with millions of users on board since its release, speaks to its clear advance over earlier AI-based chatbots. It can arguably write about almost anything, from a Shakespearean sonnet about mathematics to essays on British food or even code for a website, all while maintaining a high standard of coherence. This unparalleled versatility has spurred a heated debate about the future of education. Will this cutting-edge technology upend conventional assessment methods on the academic front?

A recent study assessed the performance of ChatGPT (15 December 2022 version) on standardised tests [1]. It was tested on the Netherlands' national pre-university education (VWO) exams, which consist of multiple-choice and short-answer questions. As the chatbot’s knowledge cut-off at the time was 2021, it was tested on three exam papers from 2022, ensuring the questions could not have appeared in its training data. Remarkably, it scored an average of 7.18 on a scale of 10, surpassing the national average of 7.0.

In the realm of higher education, ChatGPT was tested on four final exams administered at the University of Minnesota Law School [2]. The examinations comprised a variety of question types, including multiple-choice questions and short and long essays. To ensure fairness, the bot's responses were graded blindly. Surprisingly, the chatbot passed all four exams with an average grade of C+. The researchers concluded that ChatGPT, though a below-average student, could graduate from law school if it maintained this level of performance. Both studies show that, while not flawless, ChatGPT can achieve satisfactory and sometimes even commendable grades, potentially threatening conventional methods of assessment.

But is the potential for cheating with ChatGPT any worse than with traditional methods? A recent study examined how recognisable ChatGPT’s output is by using it to produce abstracts for scientific articles [3]. The generated abstracts were mixed with abstracts from original, published articles and then assessed by AI detectors and human reviewers. The former identified most of the ChatGPT-generated abstracts (assigning them a median probability of 99.98% of being AI-generated), suggesting that ChatGPT’s writing is distinctive. Similarly, the latter identified nearly 70% of the generated abstracts, noting that they were “vaguer and had a formulaic feel” compared to the originals [3]. As experts in their fields, university instructors are likely to identify AI-generated responses in writing-based academic assessments even without the use of AI detectors. Given the likelihood of being caught, using ChatGPT to cheat in assessments could actually put students at a disadvantage.
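
For readers curious about what such detection looks like in practice, the sketch below shows one way to query an off-the-shelf AI-output detector of the kind used in that study. It is a minimal illustration, not the authors' exact pipeline: it assumes the Python transformers library and the publicly available "roberta-base-openai-detector" checkpoint, and the sample abstract is invented.

```python
# A minimal sketch, not the study's exact pipeline: it loads a public
# RoBERTa-based detector (an assumption; the study used a similar
# GPT-2 Output Detector tool) and scores a made-up abstract.
from transformers import pipeline

detector = pipeline("text-classification", model="roberta-base-openai-detector")

sample_abstract = (
    "We evaluated the association between biomarker X and outcome Y in a "
    "retrospective cohort of 120 patients. Our results suggest a "
    "statistically significant association between the two."
)

# top_k=None returns a score for every label (human- vs machine-written),
# which can be read as the probability the detector assigns to each class.
for prediction in detector(sample_abstract, truncation=True, top_k=None):
    print(prediction["label"], round(prediction["score"], 4))
```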

Furthermore, while ChatGPT's essays and responses appear logical, they can rely on bogus or irrelevant evidence. In the study of abstracts mentioned earlier, ChatGPT drew logical conclusions from wholly fabricated data. Such inaccuracies are usually not tolerated in academic assessments. Additionally, as an AI chatbot, ChatGPT is prone to bias and can struggle with unfamiliar or critical-thinking-oriented questions, particularly when a question conflicts with its training data, as shown in Figures 1 and 2. TU Delft associate professor and researcher Dr De Winter notes that "teachers may assume that it is safe to administer exams online with minimal supervision for topics such as comprehension, where answers are likely unavailable online" [1]. Given these limitations, ChatGPT is unlikely to pose a significant challenge to conventional assessment techniques.

Figure 1: ChatGPT tends to draw logical conclusions, but with the potential drawback of using incorrect information. Here, ChatGPT correctly reasoned that rock beats scissors and declared itself the winner; however, the user had not chosen scissors in the first place.
Figure 2: ChatGPT may struggle with questions requiring critical thinking. This figure shows an instance in which ChatGPT failed to solve a seemingly straightforward question.

While unlikely to threaten conventional assessment methods, ChatGPT can positively impact university students and staff. Much as calculators revolutionised the way maths is taught and tested, the incorporation of ChatGPT into essay composition has the potential to transform essay-based assessments. Rather than merely assessing writing proficiency, these tasks could now focus on promoting critical thinking and deductive reasoning, arguably the more valuable skills to possess in the 21st century. However, ChatGPT's capabilities extend beyond writing essays. For example, it may aid instructors in preparing lecture materials, such as notes and slides. In addition, it can offer students personalised assistance, such as exercises and explanations on a specific topic. Hence, instead of being regarded as a threat, ChatGPT may turn out to be an invaluable asset in education.

Additionally, major tests that assess the application of specific skills are already administered in invigilated environments, which makes using the chatbot difficult. To prevent exploitation of the technology, academic institutions could emphasise academic integrity and the repercussions of plagiarism in their curricula. Further, measures should be taken to regulate access to the technology, especially given its need for an internet connection and possible subscription fees, to ensure equitable access for students from all backgrounds and give everyone a fair chance to succeed in their academic pursuits.

References:

  1. De Winter J. Can ChatGPT pass high school exams on English Language Comprehension? [Preprint]. 2023 [cited 2023 Feb 14]. Available from: https://rgdoi.net/10.13140/RG.2.2.24094.20807
  2. Choi JH, Hickman KE, Monahan A, Schwarcz DB. ChatGPT Goes to Law School. University of Minnesota Law School Legal Studies Research Paper Series. 2023 Jan 25;23(03).
  3. Gao CA, Howard FM, Markov NS, Dyer EC, Ramesh S, Luo Y, et al. Comparing scientific abstracts generated by ChatGPT to original abstracts using an artificial intelligence output detector, plagiarism detector, and blinded human reviewers. bioRxiv [Preprint]. 2022 Dec 23 [cited 2023 Jan 30]. Available from: https://doi.org/10.1101/2022.12.23.521610