Facebook’s chief AI scientist says GPT-3 is ‘not very good’ as a dialog system
A new study showed some expectations for the model are unrealistic
How did GPT-3 perform?
The researchers found that GPT-3 seemed helpful for finding information in long documents and for basic admin tasks such as appointment booking. But it lacked the memory, logic, and sense of time needed to handle more specific questions.
Nabla also found that GPT-3 was an unreliable Q&A support tool for doctors, dangerously oversimplified medical documentation analysis, and struggled to associate causes with consequences.
The model also made some basic errors in diagnosis and provided some reckless mental health advice.
The researchers do see some potential for using language models in medical settings. But they conclude that GPT-3 is “nowhere near” ready to provide significant help in the sector.
Their findings won’t shock OpenAI, given the firm’s warnings against using GPT-3 in healthcare. But they do show that many expectations for the model are wildly unrealistic.
Story by Thomas Macaulay
Thomas is a senior reporter at TNW. He covers European tech, with a focus on AI, cybersecurity, and government policy.