OpenAI recently released its latest AI upgrade, GPT-4. The chatbot has been showcasing impressive feats, such as scoring in the 90th percentile on the bar exam and achieving a near-perfect score on the verbal section of the GRE.
However, it has also demonstrated an area where it falls short: accounting.
ChatGPT vs. University Accounting Students
According to SciTechDaily, academics at Brigham Young University (BYU) and other specialists from 186 institutions were curious about how the bot would fare on accounting tests, so they put the original ChatGPT through an exam.
David Wood, the study's lead author and a professor of accounting at BYU, recruited as many professors as possible to see how the bot performed against university accounting students. His pitch drew a massive response: the study ended up with 327 co-authors from 186 institutions in 14 countries, who together contributed 25,181 classroom accounting exam questions.
On top of this, the researchers also recruited BYU undergraduate students to feed a bank of 2,268 textbook questions to the bot, as reported by Accounting Today. The questions covered various accounting areas, including managerial accounting, financial accounting, and tax, and varied in type (multiple choice, true/false, etc.) and difficulty.
ChatGPT Struggles With Math and Accounting
Though the university accounting students did not exactly ace the questions, they still scored better than ChatGPT, averaging 76.7%.
On the other hand, ChatGPT achieved a score of 47.4%, Mint reports.
It is important to note, however, that ChatGPT outperformed the students on 11.3% of the questions, primarily in auditing and AIS (accounting information systems).
However, ChatGPT performed worse on the managerial, financial, and tax assessments. Accounting Today reports that this could be because the bot favors language over math and may have struggled with the mathematical calculations involved.
The study also revealed that ChatGPT performed better on true/false questions (68.7% correct) and multiple-choice questions (59.5% correct), but it struggled with short-answer questions (28.7% to 39.1% correct).
ChatGPT also made errors in certain calculations; in some cases, the bot added numbers when it should have subtracted or divided them.
The researchers also noted that ChatGPT had more difficulty with higher-level questions. At times, it gave detailed explanations for answers that were incorrect, and its responses could be inconsistent.
Mint also adds that despite offering explanations for its answers, the bot still occasionally chose the wrong multiple-choice option even when its accompanying description was accurate.
The researchers also discovered that the AI offered false information, including references that looked valid but actually pointed to non-existent sources.
Despite the bot's current shortcomings in accounting, the researchers note its potential to revolutionize methods of teaching and learning.
Check out more news and information on Artificial Intelligence in Science Times.