ChatGPT’s capabilities are getting worse with age, new research claims

by Jeremy

OpenAI’s artificial intelligence-powered chatbot ChatGPT appears to be getting worse with time, and researchers can’t seem to pinpoint the reason why.

In a July 18 study, researchers from Stanford and UC Berkeley found that ChatGPT’s newest models had become far less capable of providing accurate answers to an identical series of questions over the span of a few months.

The study’s authors couldn’t provide a clear answer as to why the AI chatbot’s capabilities had deteriorated.

To test how reliable the different models of ChatGPT were, three researchers, Lingjiao Chen, Matei Zaharia and James Zou, asked the GPT-3.5 and GPT-4 models to solve a series of math problems, answer sensitive questions, write new lines of code and conduct spatial reasoning from prompts.

According to the research, in March GPT-4 was able to identify prime numbers with a 97.6% accuracy rate. In the same test conducted in June, GPT-4’s accuracy had plummeted to just 2.4%.

In contrast, the earlier GPT-3.5 model had improved on prime number identification over the same time frame.

Related: SEC’s Gary Gensler believes AI can strengthen its enforcement regime

When it came to generating new lines of code, the abilities of both models deteriorated significantly between March and June.

The study also found that ChatGPT’s responses to sensitive questions, some of which focused on ethnicity and gender, later became more terse in refusing to answer.

Earlier iterations of the chatbot provided extensive reasoning for why it couldn’t answer certain sensitive questions. In June, however, the models simply apologized to the user and refused to answer.

“The behavior of the ‘same’ [large language model] service can change substantially in a relatively short amount of time,” the researchers wrote, noting the need for continuous monitoring of AI model quality.

The researchers recommended that users and companies who rely on LLM services as part of their workflows implement some form of monitoring analysis to ensure the chatbot remains up to scratch.
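The kind of monitoring the researchers recommend could be as simple as periodically re-running a small benchmark against the live model and tracking the score. The sketch below is a hypothetical illustration, not the study’s actual test harness: `query_model` is a placeholder for whatever LLM API call a workflow uses, and the benchmark is a primality quiz loosely modeled on the one in the paper.

```python
def is_prime(n: int) -> bool:
    """Ground-truth primality check used to score the model's answers."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def benchmark_accuracy(query_model, numbers) -> float:
    """Fraction of numbers for which the model's yes/no answer matches ground truth.

    `query_model(n)` stands in for a real LLM API call answering the
    prompt "Is n prime?" with "yes" or "no".
    """
    numbers = list(numbers)
    correct = sum(
        (query_model(n).strip().lower() == "yes") == is_prime(n)
        for n in numbers
    )
    return correct / len(numbers)


# Stub "model" that always answers "no" -- roughly the degenerate
# behavior the study reported for GPT-4 in June.
always_no = lambda n: "no"
score = benchmark_accuracy(always_no, range(2, 100))
```

Running such a check on a schedule and alerting when the score drops below a chosen threshold is one straightforward way to catch the kind of silent drift the study describes.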

On June 6, OpenAI unveiled plans to create a team that will help manage the risks that could emerge from a superintelligent AI system, something it expects to arrive within the decade.

AI Eye: AIs trained on AI content go MAD, is Threads a loss leader for AI data?