Understanding the planning of LLM agents: A survey
The idea of using LLMs as autonomous agents is all the rage right now. I continue to take the view that these systems are not in fact 'agents' in the sense the concept usually requires (having a will or interest of their own). Rather, they follow a predefined sequence of steps or a DAG (directed acyclic graph), which is much closer to execution than to agency.

This survey provides a systematic overview of LLM-based agent planning, covering recent work that aims to improve planning ability. It offers a taxonomy that groups existing work on LLM-agent planning into:
- Task Decomposition
- Plan Selection
- External Module
- Reflection
- Memory

An interesting overview of the current state of the art in agents. https://arxiv.org/abs/2402.02716
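To make the "execution, not agency" point concrete, here is a minimal sketch of a plan expressed as a DAG and run in dependency order. It is not taken from the survey: the step names, the `run_plan` helper, and the lambda stand-ins for LLM or tool calls are all hypothetical, and it assumes Python 3.9+ for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter


def run_plan(steps, deps):
    """Run each step exactly once, after all of its dependencies have run.

    steps: maps a step name to a callable taking a dict of upstream results.
    deps:  maps a step name to the names of the steps it depends on.
    """
    results = {}
    order = []
    # static_order() yields every node after all of its predecessors.
    for name in TopologicalSorter(deps).static_order():
        inputs = {dep: results[dep] for dep in deps.get(name, ())}
        results[name] = steps[name](inputs)
        order.append(name)
    return order, results


# Hypothetical three-step plan; each lambda stands in for an LLM or tool call.
steps = {
    "decompose": lambda inputs: ["search", "summarize"],
    "search": lambda inputs: f"results for {len(inputs['decompose'])} subtasks",
    "summarize": lambda inputs: inputs["search"].upper(),
}
deps = {"decompose": [], "search": ["decompose"], "summarize": ["search"]}

order, results = run_plan(steps, deps)
print(order)
print(results["summarize"])
```

Nothing in this loop decides what to do next; the order is fixed by the graph, which is the sense in which such a system executes a plan rather than acting on its own interest.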
AI Agents can check their outputs on Google first - is this full circle?
Google DeepMind, Stanford and the University of Illinois at Urbana-Champaign propose a Google Search-based system that fact-checks LLM-generated outputs to reduce LLMs' tendency to confabulate. I do think this is a cool idea and it should make AI agents more factually reliable, but I hope the irony doesn't escape you:

a) We have spent many billions of dollars on developing LLMs, RAG systems, vector stores, data centers, hardware and so on, and now AI agents go and check their outputs on Google. All this effort to end up back at a Google search.

b) I suspect it's no coincidence that Google co-authored this research. It looks like an effort to integrate search deeply into the AI toolbox, a technology many have argued will upend Google's dominance and business model.

In practice, I'd say this quickly gets tricky: the answer your system produces, which is then fact-checked via Google search, may well include information from your proprietary RAG system, and you might not want to send that into a Google search. https://arxiv.org/abs/2403.18802
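A rough sketch of what such a check loop could look like, including a guard for the proprietary-data concern above. This is not the paper's implementation: the naive sentence splitter and the substring match are simple stand-ins for what would realistically be LLM calls, `search` is an injected callable rather than a real Google API, and `is_private` is a hypothetical filter that you would have to define for your own data.

```python
import re


def split_into_claims(response):
    """Naive stand-in for claim extraction: one claim per sentence."""
    parts = re.split(r"(?<=[.!?])\s+", response.strip())
    return [p for p in parts if p]


def check_response(response, search, is_private=lambda claim: False):
    """Label each claim as 'supported', 'not supported' or 'skipped'.

    search:     callable taking a query string, returning a list of snippets.
    is_private: callable that returns True for claims that must never be
                sent to an external search engine.
    """
    verdicts = {}
    for claim in split_into_claims(response):
        if is_private(claim):
            verdicts[claim] = "skipped"  # never leaves your system
            continue
        snippets = search(claim)
        needle = claim.lower().rstrip(".!?")
        # Crude support test: the claim text appears in a returned snippet.
        supported = any(needle in s.lower() for s in snippets)
        verdicts[claim] = "supported" if supported else "not supported"
    return verdicts


def fake_search(query):
    """Hypothetical offline search backend used only for this example."""
    return ["Paris is the capital of France, according to many sources."]


answer = (
    "Paris is the capital of France. "
    "The moon is made of cheese. "
    "Our Q3 margin was 41%."
)
verdicts = check_response(answer, fake_search, is_private=lambda c: "Q3" in c)
for claim, verdict in verdicts.items():
    print(f"{verdict:>13}: {claim}")
```

Filtering before the search call, rather than after, is the design choice that matters here: a claim flagged as private is never turned into an outbound query at all.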
Devin 2.0
AutoDev uses AI agents to automate large portions of the software development process. Developers outline high-level requirements by conversing with the agents in natural language, and the AI then autonomously handles tasks such as writing code, running tests, and deploying applications. It's like having an army of highly skilled programmers at your beck and call, ready to turn your vision into reality quickly and efficiently. As AI continues to advance, platforms like this could reshape our fundamental notions of how software gets created. While Cognition Labs' newly launched Devin is a pioneering AI software engineer, AutoDev aims to go further by providing an integrated development environment in which multiple AI agents collaborate on the end-to-end software creation process. To learn more about AutoDev: https://arxiv.org/abs/2403.08299
World's first AI engineer
This is the BIGGEST AI news of 2024: a startup called Cognition Labs just dropped the world's first AI software engineer:
• It's called Devin and it can write complete apps by itself
• In a demo, Devin was able to complete real jobs posted on Upwork
• It also correctly resolved almost 14% of GitHub issues found in real-world open-source projects (better than many developers)

Eric Glyman, cofounder of multi-billion-dollar startup Ramp, called it the "single most impressive demo I've seen in the past decade." Perplexity's CEO called it the first AI agent "that seems to cross the threshold of what is human level and works reliably." This is what the future looks like.
Generative AI
Public group
Learn and Master Generative AI, tools and programming with practical applications at work or business. Embrace the future – join us now!