How tool calling works
When a model supports tool calling (also called function calling), you can provide a list of tool definitions alongside the user's message. The model reads the tool schemas and decides whether to respond directly or invoke one or more tools. If it decides to call a tool, it outputs a structured tool call object instead of (or before) its text response.
Your application receives the tool call, executes the actual function, and returns the result to the model. The model then uses that result to continue reasoning and produce its final response. This loop (user message → tool call → tool result → final response) is the foundation of every AI agent.
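The loop above can be sketched in a few lines. This is an illustrative stand-in, not any provider's real API: `fake_model`, `lookup_customer`, and the message format are all hypothetical, chosen only to show the control flow.

```python
def fake_model(messages, tools):
    """Stand-in for a model call: requests a tool on the first turn,
    then produces a final text answer once a tool result is present."""
    if any(m["role"] == "tool" for m in messages):
        return {"type": "text", "text": "TechCorp is on the Pro tier."}
    return {"type": "tool_call", "name": "lookup_customer",
            "arguments": {"company": "TechCorp"}}

def lookup_customer(company):
    """Hypothetical tool: pretend database lookup."""
    return {"company": company, "tier": "Pro"}

TOOLS = {"lookup_customer": lookup_customer}

def run_agent(user_message):
    messages = [{"role": "user", "content": user_message}]
    while True:
        response = fake_model(messages, TOOLS)
        if response["type"] == "text":   # final answer: exit the loop
            return response["text"]
        # Execute the requested tool and feed the result back to the model.
        result = TOOLS[response["name"]](**response["arguments"])
        messages.append({"role": "tool", "content": result})
```

Real SDKs differ in message shapes and stop conditions, but every agent framework is ultimately a variation of this while loop.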
Tool definition example
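Here is a sketch of what a tool definition typically looks like, written as a Python dict in the JSON-Schema style most providers use. Exact field names vary by provider (some use `parameters` instead of `input_schema`); the tool name and fields here are illustrative.

```python
# Illustrative tool definition; field names follow the common
# JSON-Schema convention but vary slightly between providers.
search_customers = {
    "name": "search_customers",
    "description": (
        "Search the customer database by company name or email. Use this "
        "when you need to look up an existing customer's account status, "
        "contact info, or subscription tier. Returns up to 5 matching "
        "records ordered by relevance."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "query": {
                "type": "string",
                "description": "Company name or email address to search for.",
            },
            "limit": {
                "type": "integer",
                "description": "Maximum records to return (1-5). Defaults to 5.",
            },
        },
        "required": ["query"],
    },
}
```

Note that only `query` is required; `limit` is optional with a documented default, so the model never has to invent a value.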
The most important field: description
Models don't read code; they read descriptions. A tool with a vague or misleading description will be called incorrectly, called too often, or ignored entirely. Write descriptions as if explaining to a smart intern: what does this tool do, when should you use it, and what does it return?
"Search the customer database by company name or email. Use this when you need to look up an existing customer's account status, contact info, or subscription tier. Returns up to 5 matching records ordered by relevance."
"Searches database." โ Too vague. The model doesn't know when to use it or what it returns.
Common tool design mistakes
- Too many tools: Providing 30+ tools forces the model to reason about which one to use, and it will make mistakes. Keep your tool set focused; 5–15 tools is usually optimal.
- Overlapping tool purposes: Two tools that can both accomplish the same thing will cause the model to pick unpredictably. Differentiate clearly in descriptions.
- Required parameters that could be optional: If the model has to call a tool but can't provide a required parameter, it will either fail or hallucinate a value. Make parameters optional when possible.
- Tools with side effects and no confirmation: Write operations (send email, delete record, post to API) should either require a confirmation parameter or be gated behind human-in-the-loop approval for high-stakes actions.
- Returning too much data: A tool that returns 100KB of JSON will consume your entire context window. Filter, summarize, or paginate tool outputs before returning them to the model.
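The last point deserves an example. A minimal sketch of trimming a tool result before it reaches the model (the field names and record shape are hypothetical):

```python
import json

def trim_tool_output(records, max_items=5, fields=("id", "name", "tier")):
    """Keep only the fields the model needs and cap the item count,
    so a large query result doesn't flood the context window."""
    trimmed = [{k: r[k] for k in fields if k in r}
               for r in records[:max_items]]
    # Tell the model how much was held back so it can ask to paginate.
    payload = {"results": trimmed, "total_available": len(records)}
    return json.dumps(payload)
```

Returning `total_available` alongside the truncated results lets the model decide whether it needs a follow-up call rather than silently losing data.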
Parallel vs sequential tool calls
Modern frontier models (Claude Opus 4, GPT-5) support parallel tool calling: they can invoke multiple tools simultaneously in a single response. This dramatically reduces latency for tasks that require multiple independent lookups. However, tools with sequential dependencies must be called in order; the model handles this correctly when tool descriptions make dependencies explicit.
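On the application side, parallel tool calls are just independent function invocations you can run concurrently. A sketch using Python's standard `concurrent.futures` (the tools and call format here are hypothetical):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tools standing in for independent lookups.
def get_weather(city):
    return f"{city}: 18C"

def get_news(city):
    return f"{city}: quiet day"

TOOLS = {"get_weather": get_weather, "get_news": get_news}

def execute_tool_calls(calls):
    """Run independent tool calls concurrently and return results
    in the same order the model requested them."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(TOOLS[c["name"]], **c["arguments"])
                   for c in calls]
        return [f.result() for f in futures]
```

Preserving request order matters because most APIs expect each tool result to be matched back to its originating tool call ID.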
Build tool-enabled agents on MoltBot
Native tool registry, parallel execution, human-in-the-loop controls. 14-day free trial.
Start Free Trial →