Agent

Tools

  • An Action taken by Agent can involve use of multiple Tools to complete
  • Examples of Tools
    • Web Search
    • Image Generation
    • Retrieval
    • API interface
  • A Tool should contain
    • Textual description of function
    • Callable
    • Arguments with typings
    • Outputs with typings
  • @tool decorator automatically provides a to_string() method which provides all this info
@tool
def calculator(a: int, b: int) -> int:
    """Multiply two integers."""
    return a * b
 
print(calculator.to_string())
# Tool Name: calculator, Description: Multiply two integers., Arguments: a: int, b: int, Outputs: int
  • This description is injected in the system prompt
system_message="""You are an AI assistant designed to help users efficiently and accurately. Your primary goal is to provide helpful, precise, and clear responses.
 
You have access to the following tools:
Tool Name: calculator, Description: Multiply two integers., Arguments: a: int, b: int, Outputs: int
"""

Thought-Action-Observation cycle

  • Agent follows continuous cycle of:
    • Thought: The LLM part of the Agent decides what the next step should be.
    • Action: The agent takes an action by calling the tools with the associated arguments.
    • Observation: The model reflects on the response from the tool.
  • Agent frameworks inject this in system prompt
system_message="""You are an AI assistant designed to help users efficiently and accurately. Your primary goal is to provide helpful, precise, and clear responses.
 
You have access to the following tools:
Tool Name: calculator, Description: Multiply two integers., Arguments: a: int, b: int, Outputs: int
 
You should think step by step in order to fulfill the objective with a reasoning divided into Thought/Action/Observation steps that can be repeated multiple times if needed.
 
You should first reflect on the current situation using 'Thought: {your_thoughts}', then (if necessary), call a tool with the proper JSON formatting 'Action: {JSON_BLOB}', or print your final answer starting with the prefix 'Final Answer:'
"""

Memory Architectures

  • https://huggingface.co/learn/context-course/unit6/agent-loop
  • Production Agents uses various memory architectures
    • Short term scratch pad: Save working state
    • Episodic Memory: Recalling past sessions
    • Semantic Memory: For persistent knowledge
    • Retrieval Augmented approaches: Fetch relevant memories on demand
    • Compaction Strategies: Summarize Older Context

Context Management

  • Real Code Agents implement
    • Compaction
    • Structured Note Taking
    • File-System Mediated Context
    • Intelligent Tool Selection

Sandboxing

  • Agent has internal tools available to it
  • These tools are sandboxed to prevent running arbitrary dangerous code
  • Following are techniques
    • Path confinement: Verify the path is in workspace
    • Command Allowlist: Only allowed commands are run, Anything which mutates state (rm, mv) or reaches network (curl, wget) is rejected
    • Output Limits: Limit output from say DB to avoid reading too much data
    • Write-Off-by-default: Turn off writes by default
    • Exception Handling: Graceful handling of exceptions
    • Timeout: To avoid hanging network call, or long running commands
  • Avoid global imports when running tools via Agent loop

Nano-Harness

  • https://huggingface.co/learn/context-course/unit6/introduction
  • Agentic Loop
    • Call LLM with task + message history
    • Parse model output as Python code
    • Execute Python (with tools available)
    • Collect stdout, stderr, exceptions
    • Append observation to message history
    • Repeat (max 50 steps)
    • Done when: final_answer() called or max steps reached

Prompting Techniques

  • CoT (Chain-of-Thought)
  • ReAct (Reasoning + Acting)
FeatureCoTReAct
Step-by-step logic✅ Yes✅ Yes
External tools❌ No✅ Yes (Actions + Observations)
Best suited forLogic, math, internal tasksInfo-seeking, dynamic multi-step tasks