Introduction to Agents Course

An agent is a application code that is capable of planning and reasoning, and has the ability to interact with the environment
- Planning and Reasoning - Given a task, using the available AI model (Which is an LLM in most of the cases), agent thinks through and plans out a set of Actions that needs to be taken
- Ability to interact with the environment - After planning out a set of actions, the agent now tries to complete these actions using the tools available

An Agent has two main components:

Mind
- It has the ability to think through an AI model and plans out a set of actions
Body
- Given a set of actions, the tools that can be used to achieve these actions form the body of an AI Agent

What are LLMs ( Large Language Model)?

LLM is a deep learning model used to understand and generate human understandable data. LLMs are predominantly built using transformers and Transformers can be classified into the following three:

Encoders - A kind of transformer that can take in text and convert it into context heavy embeddings
Decoders - A kind of transformer that can generate the next text based on the previous text sequence data
Seq2Seq - A kind of transformer that takes in text and converts it into embedding. And generates the next word in the sequence

Attention mechanism - Used to understand the important part of the text data

Special Tokens

To help generate precise and contextual data
For example, EOS - denotes End of Sentence. And is used as an indicator for the transformer to stop generating text

Chat templates

Used to convert conversations to LLM understandable format - used to structure conversations b/w language models and users
It helps maintain context by preserving conversation history and hence leads to more coherent multi-turn conversations

Tools A tool is a piece of function given to the LLM to achieve a specific action. A tool should contain the following:

A textual description of what the function does.
A Callable (something to perform an action).
Arguments with typings.
(Optional) Outputs with typings.

How do tools work? The LLM will generate text in form of code to invoke that tool → Agent parses the output given out by the LLM → Recognize that there’s a tool call → Invoke the tool Tools are invoked by LLMs by generating a specific kind of text. It might appear like it is the agent that has invocated the tool. But it was actually an LLM.

🪴 Quartz 4.0

Explorer

Introduction to Agents Course

Graph View

Backlinks