I’ve spent decades developing a scalable alternative to multiple-choice tests, largely because they focus narrowly on correctness rather than competence (the ability to put knowledge and ideas to work in the real world). Consequently, I wasn’t surprised to see the long list of multiple-choice tests that the latest version of ChatGPT has passed, including medical exams. Determining correct answers ought to be AI’s first great strength, especially when the choices are limited.
Then I read an article by Josh Tamayo-Sarver about the incompetent performance of ChatGPT in the emergency room, and my first thought was, “Holy *! ChatGPT passed the medical exam. That test was used to decide whether my doctor was qualified to practice medicine. Clearly, the exam isn’t doing a very good job.”
Thank you, ChatGPT, for shedding new light on one of the key weaknesses of our educational system: our reliance on narrowly focused tests of correctness when competence is what really matters. The question on my mind is, will AI become competent before our educational systems abandon their focus on correctness?