• An agent is a application code that is capable of planning and reasoning, and has the ability to interact with the environment
    • Planning and Reasoning - Given a task, using the available AI model (Which is an LLM in most of the cases), agent thinks through and plans out a set of Actions that needs to be taken
    • Ability to interact with the environment - After planning out a set of actions, the agent now tries to complete these actions using the tools available

An Agent has two main components:

  1. Mind
    • It has the ability to think through an AI model and plans out a set of actions
  2. Body
    • Given a set of actions, the tools that can be used to achieve these actions form the body of an AI Agent

What are LLMs ( Large Language Model)?

LLM is a deep learning model used to understand and generate human understandable data. LLMs are predominantly built using transformers and Transformers can be classified into the following three:

  1. Encoders - A kind of transformer that can take in text and convert it into context heavy embeddings
  2. Decoders - A kind of transformer that can generate the next text based on the previous text sequence data
  3. Seq2Seq - A kind of transformer that takes in text and converts it into embedding. And generates the next word in the sequence

Attention mechanism - Used to understand the important part of the text data

Special Tokens

  • To help generate precise and contextual data
  • For example, EOS - denotes End of Sentence. And is used as an indicator for the transformer to stop generating text

Chat templates

  • Used to convert conversations to LLM understandable format - used to structure conversations b/w language models and users
  • It helps maintain context by preserving conversation history and hence leads to more coherent multi-turn conversations

Tools A tool is a piece of function given to the LLM to achieve a specific action. A tool should contain the following:

  • textual description of what the function does.
  • Callable (something to perform an action).
  • Arguments with typings.
  • (Optional) Outputs with typings.

How do tools work? The LLM will generate text in form of code to invoke that tool Agent parses the output given out by the LLM Recognize that there’s a tool call Invoke the tool Tools are invoked by LLMs by generating a specific kind of text. It might appear like it is the agent that has invocated the tool. But it was actually an LLM.