Facebook’s chief AI scientist says GPT-3 is ‘not very good’ as a dialog system
A new study showed some expectations for the model are unrealistic
How did GPT-3 perform?
The researchers found that GPT-3 seemed helpful for finding information in long documents and for basic admin tasks such as appointment booking. But it lacked the memory, logic, and sense of time needed to handle more specific questions.
Nabla also found that GPT-3 was an unreliable Q&A support tool for doctors, dangerously oversimplified medical documentation analysis, and struggled to associate causes with consequences.
The model also made some basic errors in diagnosis and provided some reckless mental health advice.
The researchers do see some potential for using language models in medical settings. But they conclude that GPT-3 is “nowhere near” ready to provide significant help in the sector.
Their findings won’t shock OpenAI, given the firm’s warnings against using GPT-3 in healthcare. But they do show that many expectations for the model are wildly unrealistic.
Story by Thomas Macaulay
Thomas is a senior reporter at TNW. He covers European tech, with a focus on AI, cybersecurity, and government policy.