Large Language Models Lack True Reasoning, Claims Expert
According to Subbarao Kambhampati, a professor at Arizona State University, recent claims that large language models (LLMs) such as GPT-3, GPT-4, and ChatGPT possess reasoning and planning abilities are unfounded.
Kambhampati tested these LLMs on standard planning tasks and found that they performed poorly, especially when object and action names were obfuscated. Fine-tuning the models on planning data can boost performance, but he argues that this merely converts the task into approximate retrieval rather than true reasoning.
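To make the obfuscation test concrete, here is a minimal sketch of the idea: every object and action name in a planning problem is swapped for an opaque token, so the logical structure is unchanged but the familiar words are gone. The function name obfuscate, the sym0-style tokens, and the blocks-world snippet are illustrative assumptions, not the setup used in the experiments.

```python
import re

def obfuscate(text, names):
    """Replace each domain name with an opaque token, consistently.

    Hypothetical helper: the tokens ("sym0", "sym1", ...) are an
    assumption for illustration, not the experiments' actual scheme.
    """
    # Assign tokens in sorted order so the mapping is deterministic.
    mapping = {name: f"sym{i}" for i, name in enumerate(sorted(names))}
    # Try longer names first and match whole symbols only, so a short
    # name such as "a" is never replaced inside "stack".
    alternatives = "|".join(
        re.escape(n) for n in sorted(names, key=len, reverse=True)
    )
    pattern = re.compile(r"(?<![\w-])(" + alternatives + r")(?![\w-])")
    return pattern.sub(lambda m: mapping[m.group(1)], text), mapping

problem = "(stack a b) (on a b)"
obfuscated, mapping = obfuscate(problem, {"stack", "on", "a", "b"})
print(obfuscated)  # (sym3 sym0 sym1) (sym2 sym0 sym1)
```

A planner that reasons over the problem's structure is unaffected by the renaming, whereas a system that leans on recalling familiar names would be expected to degrade.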
Having humans steer LLMs with "chain of thought" prompting carries the risk that the human unintentionally guides the model, Kambhampati claims. He is also skeptical of papers claiming that LLMs can self-critique and iteratively improve their own plans and reasoning.
While LLMs excel at extracting general planning knowledge and generating ideas, Kambhampati found they struggle to assemble that knowledge into executable plans that properly handle subgoal interactions. Many papers making planning claims either ignore such interactions or rely on human prompting to resolve them, he says.
Instead, Kambhampati proposes using LLMs to extract approximate domain models, which human experts then verify and refine before passing them to traditional model-based solvers. This resembles classic knowledge-based AI systems, with LLMs replacing human knowledge engineers, while employing techniques for reasoning with incomplete models.
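The proposed division of labor can be sketched roughly as follows. This is an illustration under stated assumptions, not Kambhampati's implementation: draft_domain_from_llm and human_review are hypothetical stand-ins for the LLM call and the expert check, the tea-making domain is invented, and a plain breadth-first search stands in for a traditional model-based solver.

```python
from collections import deque
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset      # facts that must hold before the action
    add: frozenset      # facts the action makes true
    delete: frozenset   # facts the action makes false

def draft_domain_from_llm():
    """Hypothetical stand-in for an LLM call. The draft is approximate:
    'steep-tea' is deliberately missing its preconditions."""
    return [
        Action("boil-water", frozenset({"have-water"}),
               frozenset({"hot-water"}), frozenset()),
        Action("steep-tea", frozenset(),
               frozenset({"tea-ready"}), frozenset()),
    ]

def human_review(draft):
    """Hypothetical stand-in for the expert, who repairs the draft."""
    fixed = []
    for action in draft:
        if action.name == "steep-tea":
            action = replace(
                action, pre=frozenset({"hot-water", "have-teabag"}))
        fixed.append(action)
    return fixed

def plan(actions, init, goal):
    """Breadth-first search over states; returns a shortest list of
    action names, or None if the goal is unreachable."""
    queue = deque([(init, [])])
    seen = {init}
    while queue:
        state, path = queue.popleft()
        if goal <= state:
            return path
        for action in actions:
            if action.pre <= state:
                nxt = (state - action.delete) | action.add
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, path + [action.name]))
    return None

init = frozenset({"have-water", "have-teabag"})
goal = frozenset({"tea-ready"})
print(plan(draft_domain_from_llm(), init, goal))
print(plan(human_review(draft_domain_from_llm()), init, goal))
```

With the unreviewed draft the solver returns the one-step plan ['steep-tea'], which is valid only under the flawed model. After review it returns ['boil-water', 'steep-tea']. The solver is sound with respect to whatever model it is given, which is why the verification step sits between the LLM and the solver.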
Overall, Kambhampati argues that despite their impressive capabilities, LLMs lack autonomous reasoning and planning abilities as traditionally understood. He nevertheless believes they can productively support these tasks when their strengths in knowledge extraction and idea generation are combined with external solvers and human oversight.
Altaf Rehmani