AI chatbots like ChatGPT have been turning heads worldwide thanks to their human-like ability to discuss any topic.
However, Benj Edwards’ report for Ars Technica, published on Thursday (April 6), highlights a major downside: these chatbots can inadvertently spread false yet persuasive information, making them unreliable sources of facts and potential vectors for defamation.
Edwards explains that AI chatbots, such as OpenAI’s ChatGPT, utilize “large language models” (LLMs) to generate responses. LLMs are computer programs trained on vast amounts of text data to read and produce natural language. However, they are prone to errors, commonly called “hallucinations” or “confabulations” in academic circles. Edwards prefers “confabulation,” as it suggests creative but unintentional fabrications.
The Ars Technica article underscores the problem of AI bots generating deceptive, misleading, or defamatory information. Edwards provides examples of ChatGPT falsely accusing a law professor of sexual harassment and wrongly claiming an Australian mayor was convicted of bribery. Despite these drawbacks, ChatGPT is considered an upgrade from GPT-3, as it can decline to answer certain questions or warn of potential inaccuracies.
OpenAI CEO Sam Altman has acknowledged ChatGPT’s shortcomings, tweeting that the chatbot is “incredibly limited” and warning of the risks of relying on it for important matters. Altman also remarked that the chatbot knows a great deal but can be “confident and wrong.”
To understand how GPT models like ChatGPT confabulate, Edwards delves into how they work. Researchers create LLMs like GPT-3 and GPT-4 using “unsupervised learning,” wherein the model learns to predict the next word in a sequence by analyzing vast text data and refining its predictions through trial and error.
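The next-word-prediction idea can be illustrated with a deliberately tiny sketch. Real LLMs use neural networks trained on billions of tokens; the toy bigram model below (a hypothetical example, not anything from OpenAI) just counts which word most often follows another in a small corpus, which is enough to show why a model trained only to continue text can produce fluent output with no notion of truth:

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Count next-word frequencies for each word (toy stand-in for 'unsupervised' training)."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for cur, nxt in zip(words, words[1:]):
            counts[cur][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent continuation seen in training, or None if unseen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

corpus = [
    "the cat sat on the mat",
    "the cat chased the mouse",
    "the dog sat on the rug",
]
model = train_bigram(corpus)
print(predict_next(model, "the"))  # "cat" — the most common continuation, true or not
```

The model will happily chain plausible-looking words together; whether the resulting sentence is factually accurate never enters the objective, which is the root of confabulation.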
ChatGPT differs from its predecessors, as it has been trained on human-written conversation transcripts, Edwards states. OpenAI employed “reinforcement learning from human feedback” (RLHF) to fine-tune ChatGPT, leading to more coherent responses and fewer confabulations. Nonetheless, inaccuracies remain.
Edwards cautions against blindly trusting AI chatbot outputs but acknowledges that technological improvements may change this. Since its launch, ChatGPT has undergone multiple upgrades, enhancing accuracy and its ability to decline to answer questions it cannot address.
Although OpenAI has not directly responded to queries about ChatGPT’s accuracy, Edwards refers to company documents and news reports for insights. OpenAI’s Chief Scientist, Ilya Sutskever, believes that further RLHF training can tackle the hallucination issue. In contrast, Meta’s Chief AI Scientist, Yann LeCun, argues that the current GPT-based LLMs will not resolve the problem.
Edwards also mentions alternative methods to improve LLM accuracy using existing architectures. Bing Chat and Google Bard already utilize web searches to refine their outputs, and a browser-enabled version of ChatGPT is expected to follow suit. Additionally, ChatGPT plugins plan to augment GPT-4’s training data with external sources, like the web and specialized databases. As Edwards points out, this mirrors the accuracy boost a human gains from consulting an encyclopedia.
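The web-grounding approach Edwards describes can be sketched in a few lines. Everything here is illustrative: `web_search` is a hypothetical placeholder for a real search call (like those behind Bing Chat or Google Bard), and the prompt wording is an assumption, not any vendor’s actual template. The idea is simply to prepend retrieved snippets so the model answers from sources rather than from memory alone:

```python
def web_search(query):
    """Hypothetical search call -- stands in for a real search-engine API."""
    # A real system would query a search engine and return text snippets.
    return [f"Snippet 1 about {query}", f"Snippet 2 about {query}"]

def build_grounded_prompt(question, snippets):
    """Prepend retrieved evidence to the question before sending it to the model."""
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer using only the sources below; say 'not found' if they don't help.\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )

prompt = build_grounded_prompt(
    "Who is the mayor of Hepburn Shire?",
    web_search("Hepburn Shire mayor"),
)
print(prompt)
```

This mirrors the encyclopedia analogy in the article: the model still generates text the same way, but its context now contains fresh, checkable material.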
Lastly, Edwards suggests that a GPT-4-like model could be trained to recognize when it is fabricating information and adjust accordingly. This might involve more advanced data curation and linking training data to “trust” scores, similar to PageRank. Another possibility is fine-tuning the model to be more cautious when less confident in its responses.
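The “more cautious when less confident” idea can be made concrete with a small sketch. This is an assumption-laden illustration of the general technique (confidence thresholding), not OpenAI’s actual mechanism: given a model’s probability for each candidate answer, the system declines whenever the best candidate falls below a cutoff:

```python
def answer_or_decline(candidates, threshold=0.6):
    """candidates: mapping of answer -> model-assigned probability.

    Return the top answer only if the model is confident enough;
    otherwise decline rather than risk confabulating.
    """
    best, prob = max(candidates.items(), key=lambda kv: kv[1])
    if prob < threshold:
        return "I'm not sure enough to answer that."
    return best

# A confident case versus a near-uniform, low-confidence case:
print(answer_or_decline({"Paris": 0.92, "Lyon": 0.08}))   # answers "Paris"
print(answer_or_decline({"1971": 0.35, "1972": 0.33, "1973": 0.32}))  # declines
```

The hard part in practice, which the article hints at, is getting calibrated probabilities out of the model in the first place; a model that is “confident and wrong” defeats any threshold.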
Featured Image via Pixabay