ChatGPT’s capabilities are getting worse with age, new research claims

by Jeremy

OpenAI’s artificial intelligence-powered chatbot ChatGPT appears to be getting worse with time, and researchers can’t seem to pinpoint the reason why.

In a July 18 study, researchers from Stanford and UC Berkeley found that ChatGPT’s newest models had become far less capable of providing accurate answers to an identical series of questions over the span of a few months.

The study’s authors couldn’t provide a clear answer as to why the AI chatbot’s capabilities had deteriorated.

To test how reliable the different models of ChatGPT were, three researchers, Lingjiao Chen, Matei Zaharia and James Zou, asked the GPT-3.5 and GPT-4 models to solve a series of math problems, answer sensitive questions, write new lines of code and conduct spatial reasoning from prompts.

According to the research, in March GPT-4 was able to identify prime numbers with a 97.6% accuracy rate. In the same test conducted in June, GPT-4’s accuracy had plummeted to just 2.4%.

In contrast, the earlier GPT-3.5 model had improved on prime number identification over the same time frame.

Related: SEC’s Gary Gensler believes AI can strengthen its enforcement regime

When it came to generating new lines of code, the abilities of both models deteriorated significantly between March and June.

The study also found that ChatGPT’s responses to sensitive questions, some of which focused on ethnicity and gender, later became more terse in refusing to answer.

Earlier iterations of the chatbot provided extensive reasoning for why it couldn’t answer certain sensitive questions. In June, however, the models simply apologized to the user and refused to answer.

“The behavior of the ‘same’ [large language model] service can change substantially in a relatively short amount of time,” the researchers wrote, noting the need for continuous monitoring of AI model quality.

The researchers recommended that users and companies who rely on LLM services as part of their workflows implement some form of monitoring analysis to ensure the chatbot remains up to scratch.
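The kind of monitoring the researchers recommend could be as simple as periodically re-running a small benchmark against the live model and tracking the score. The sketch below is a hypothetical illustration, not the study’s actual test harness: `query_model` is a placeholder for whatever LLM API call a workflow uses, and the benchmark is a primality quiz loosely modeled on the one in the paper.

```python
def is_prime(n: int) -> bool:
    """Ground-truth primality check used to score the model's answers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def benchmark_accuracy(query_model, numbers) -> float:
    """Fraction of numbers for which the model's yes/no answer matches ground truth.

    `query_model(n)` stands in for a real LLM API call answering the
    prompt "Is n prime?" with "yes" or "no".
    """
    numbers = list(numbers)
    correct = sum(
        (query_model(n).strip().lower() == "yes") == is_prime(n)
        for n in numbers
    )
    return correct / len(numbers)


# Stub "model" that always answers "no" -- roughly the degenerate
# behavior the study reported for GPT-4 in June.
always_no = lambda n: "no"
score = benchmark_accuracy(always_no, range(2, 100))
```

Running such a check on a schedule and alerting when the score drops below a chosen threshold is one straightforward way to catch the kind of silent drift the study describes.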

On June 6, OpenAI unveiled plans to create a team that will help manage the risks that could emerge from a superintelligent AI system, something it expects to arrive within the decade.

AI Eye: AIs trained on AI content go MAD, is Threads a loss leader for AI data?