Tutorial // Agents · 2026-06-22 · 13 min read

Build an AI Agent with Tool Use Using the Claude API

Go past chat: build an agent that calls real functions in a loop to get things done, with the Claude tool-use API in TypeScript.

Varun Raj Manoharan, Founder & Principal Engineer

Agents · Claude · Tool Use · TypeScript · Tutorial

Key takeaways

An agent is a loop that sends a message, checks stop_reason, runs any tools Claude requests, feeds results back, and repeats until end_turn.
The Claude API is stateless, so the messages array you send on every turn is the agent's entire memory.
Append the assistant's full response content before sending tool results, since each tool_result must match a tool_use_id in the history.
Add a MAX_ITERATIONS cap to prevent runaway loops and return tool errors as strings so the model can read them and recover.

A chatbot answers. An agent does. The difference is a loop.

When you call the Claude API the plain way, you send a message and you get text back. That's it. The model can't look up today's weather, it can't run a calculation it isn't sure about, and it can't read a row out of your database. It only knows what was in its training data and what you put in the prompt. Useful, but limited.

Tool use changes the shape of the interaction. You hand Claude a set of functions it's allowed to call. When it decides it needs one, it doesn't run the function itself. It pauses and tells you "I want to call get_weather with these arguments." Your code runs the real function, sends the result back, and Claude picks up where it left off. Repeat that until the model has everything it needs to answer, and you have an agent.

In this tutorial we'll build a small one in TypeScript. It has two tools: a get_weather lookup and a calculator. We'll wire up the loop by hand: send a message, check why the model stopped, run any tools it asked for, feed the results back, and keep going until it's done. Doing the loop manually (instead of reaching for an SDK helper) is the whole point. Once you've written it yourself, every agent framework you touch afterward stops being magic.

What you'll need

Node 18 or newer, and a TypeScript setup you're comfortable with (tsx is the quickest way to run a single file).
The Anthropic SDK: npm install @anthropic-ai/sdk.
An API key in your environment as ANTHROPIC_API_KEY. The SDK reads it automatically, so you never hardcode it.

That's it. No vector database, no orchestration framework. The agent loop is about 40 lines of code.

Step 1: define your tools as functions

Before Claude can call anything, you need the actual functions. These are plain TypeScript, nothing Claude-specific about them yet. The get_weather one is a stub that returns canned data so the tutorial runs without a real weather API; swap in a fetch call when you want it live.

TypeScript

// The real implementations. Claude never runs these — your code does.

function getWeather(location: string, unit: "celsius" | "fahrenheit" = "celsius"): string {
  // In production this would hit a weather API. Stubbed for the tutorial.
  const tempC = 18;
  const temp = unit === "fahrenheit" ? Math.round(tempC * 9 / 5 + 32) : tempC;
  return JSON.stringify({
    location,
    temperature: temp,
    unit,
    conditions: "partly cloudy",
  });
}

function calculator(expression: string): string {
  // A real calculator, not a call back to the model. Keep it narrow.
  if (!/^[\d\s+\-*/().]+$/.test(expression)) {
    return "Error: expression contains characters that aren't allowed.";
  }
  try {
    // Function constructor over eval; still only reachable by the regex-validated input above.
    const result = Function(`"use strict"; return (${expression})`)();
    return String(result);
  } catch {
    return "Error: could not evaluate the expression.";
  }
}

Two things worth flagging. The calculator validates its input with a regex before evaluating. Claude is generating those expression strings, and you should treat tool arguments the same way you'd treat any other untrusted input. And every function returns a string. The API will accept structured content too, but a string is the simplest thing to send back and it's what we'll use.
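To make the "untrusted input" point concrete, here is a slightly stricter variant of the calculator. Treat it as a sketch: the name safeCalculate, the 200-character cap, and the finite-number check are my own assumptions for illustration, not part of the tutorial's code.

```typescript
// Hypothetical hardened variant of the tutorial's calculator.
const ALLOWED_CHARS = /^[\d\s+\-*/().]+$/;
const MAX_EXPRESSION_LENGTH = 200; // arbitrary cap, chosen for illustration

function safeCalculate(expression: string): string {
  if (expression.length > MAX_EXPRESSION_LENGTH) {
    return "Error: expression is too long.";
  }
  if (!ALLOWED_CHARS.test(expression)) {
    return "Error: expression contains characters that aren't allowed.";
  }
  try {
    // Only reached by input that passed the character whitelist above.
    const result: unknown = Function(`"use strict"; return (${expression})`)();
    // Reject Infinity, NaN, and anything that isn't a number.
    if (typeof result !== "number" || !Number.isFinite(result)) {
      return "Error: expression did not produce a finite number.";
    }
    return String(result);
  } catch {
    // Covers syntax errors such as unbalanced parentheses.
    return "Error: could not evaluate the expression.";
  }
}

console.log(safeCalculate("(12 + 4) * 3")); // "48"
console.log(safeCalculate("1 / 0")); // an error string, not "Infinity"
```

The whitelist alone doesn't guarantee the expression parses (something like "((" passes the regex), which is why the try/catch stays.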

Step 2: describe the tools to Claude with JSON Schema

Claude doesn't see your functions. It sees a list of descriptions: a name, what the tool does, and a JSON Schema for the arguments. This is all the model has to go on when deciding when and how to call each tool, so the descriptions earn their keep. Be specific about when to reach for a tool, not just what it does.

TypeScript

import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

const tools: Anthropic.Tool[] = [
  {
    name: "get_weather",
    description:
      "Get the current weather for a location. Call this whenever the user asks about weather, temperature, or conditions somewhere.",
    input_schema: {
      type: "object",
      properties: {
        location: {
          type: "string",
          description: "City and region, e.g. 'Chennai, India' or 'Berlin, Germany'.",
        },
        unit: {
          type: "string",
          enum: ["celsius", "fahrenheit"],
          description: "Temperature unit. Defaults to celsius.",
        },
      },
      required: ["location"],
    },
  },
  {
    name: "calculator",
    description:
      "Evaluate an arithmetic expression. Use this for any math the user asks for instead of computing it yourself.",
    input_schema: {
      type: "object",
      properties: {
        expression: {
          type: "string",
          description: "An arithmetic expression, e.g. '(12 + 4) * 3'.",
        },
      },
      required: ["expression"],
    },
  },
];

The input_schema is standard JSON Schema. properties declares each argument, required lists the ones that must be present, and enum pins a field to a fixed set of values. When Claude calls the tool, the SDK hands you input already parsed into an object that should match this shape. You should still validate it, because the schema guides the model rather than guaranteeing its output, and the model can occasionally surprise you.
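Here's one way that validation could look for the get_weather arguments. It's a minimal hand-written check, assuming you don't want to pull in a schema library; the names WeatherInput and parseWeatherInput are hypothetical.

```typescript
// Hypothetical shape mirroring the get_weather input_schema.
type WeatherInput = { location: string; unit?: "celsius" | "fahrenheit" };

// Returns the validated input, or an error string the model can read.
function parseWeatherInput(input: unknown): WeatherInput | string {
  if (typeof input !== "object" || input === null) {
    return "Error: input must be an object.";
  }
  const { location, unit } = input as Record<string, unknown>;
  if (typeof location !== "string" || location.trim() === "") {
    return "Error: 'location' must be a non-empty string.";
  }
  if (unit === undefined) {
    return { location };
  }
  if (unit === "celsius" || unit === "fahrenheit") {
    return { location, unit };
  }
  return "Error: 'unit' must be 'celsius' or 'fahrenheit'.";
}

const parsed = parseWeatherInput({ location: "Chennai, India", unit: "kelvin" });
console.log(parsed); // the 'unit' error string
```

Returning the error as a string, rather than throwing, keeps it in the same channel as every other tool result.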

One detail that bites people: keep this tools array stable across a conversation, and keep the descriptions distinct from one another. Claude copes well with most things, but two tools with overlapping descriptions, one that says "use this for math" and another that says "compute arithmetic", will sometimes make the model dither. Write them like you'd write a function's docstring for a teammate.

Step 3: a router from tool name to function

When Claude asks for a tool, you get back its name and input. You need to map that to the right function and run it. A small dispatcher keeps the loop readable:

TypeScript

function runTool(name: string, input: any): string {
  switch (name) {
    case "get_weather":
      return getWeather(input.location, input.unit);
    case "calculator":
      return calculator(input.expression);
    default:
      return `Error: unknown tool '${name}'.`;
  }
}

The default case matters more than it looks. If you ever ship a tool definition without a matching implementation, you want a clear error string going back to the model rather than an unhandled exception taking down the loop. The model will read that error and usually adapt.
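If you'd rather catch a missing implementation before the model ever sees the tool, a startup check is one option. This is only a sketch: the handlers registry and findMissingHandlers are hypothetical names, and the handler bodies are placeholders standing in for the real functions.

```typescript
// Hypothetical registry keyed by tool name; the bodies are placeholders.
const handlers: Record<string, (input: unknown) => string> = {
  get_weather: () => "stubbed weather",
  calculator: () => "stubbed result",
};

// Returns the names of tools that are described to the model but have no handler.
function findMissingHandlers(toolNames: string[]): string[] {
  return toolNames.filter(
    (name) => !Object.prototype.hasOwnProperty.call(handlers, name),
  );
}

const missing = findMissingHandlers(["get_weather", "calculator", "send_email"]);
console.log(missing); // ["send_email"]
```

With the tutorial's definitions you'd pass in something like tools.map((tool) => tool.name) and fail fast if the list comes back non-empty.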

Step 4: the agent loop

Here's the core of the whole thing. The pattern is always the same:

1. Send the conversation to Claude.
2. Look at response.stop_reason.
3. If it's end_turn, the model is done: return the answer and stop.
4. If it's tool_use, find the tool_use blocks in the response, run each one, append the results as a new user message, and go back to step 1.

TypeScript

async function runAgent(userQuestion: string): Promise<string> {
  // The conversation history. We send the whole thing on every turn —
  // the API is stateless, so this array IS the agent's memory.
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: userQuestion },
  ];

  const MAX_ITERATIONS = 10; // guard against runaway loops — see the gotchas
  let iterations = 0;

  while (iterations < MAX_ITERATIONS) {
    iterations++;

    const response = await client.messages.create({
      model: "claude-sonnet-4-6",
      max_tokens: 1024,
      tools,
      messages,
    });

    // Claude finished. Pull the text out and return it.
    if (response.stop_reason === "end_turn") {
      const text = response.content
        .filter((block): block is Anthropic.TextBlock => block.type === "text")
        .map((block) => block.text)
        .join("");
      return text;
    }

    // Claude wants to call one or more tools.
    if (response.stop_reason === "tool_use") {
      // IMPORTANT: append the assistant's full content first. It contains the
      // tool_use blocks, and each tool_result you send must match a tool_use_id.
      messages.push({ role: "assistant", content: response.content });

      const toolResults: Anthropic.ToolResultBlockParam[] = [];

      for (const block of response.content) {
        if (block.type === "tool_use") {
          const result = runTool(block.name, block.input);
          toolResults.push({
            type: "tool_result",
            tool_use_id: block.id, // ties the result back to the call
            content: result,
          });
        }
      }

      // All tool results go back in a SINGLE user message.
      messages.push({ role: "user", content: toolResults });
      continue; // loop again — Claude now has the results
    }

    // Any other stop reason (max_tokens, refusal, etc.) — bail out.
    return `Stopped unexpectedly: ${response.stop_reason}`;
  }

  return "Hit the iteration limit without finishing.";
}

Let me walk through the parts that aren't obvious.

The messages array is the whole memory. The Claude API is stateless. It doesn't remember the previous turn for you; every call sends the entire history. That's why we keep pushing onto messages and pass it back in full each iteration. When the loop ends, this array holds the complete trace of what happened: the question, every tool call, every result, the final answer.

You must append response.content verbatim before sending results. This is the single most common mistake. The assistant's response is an array of content blocks, and the tool_use blocks live inside it. If you skip appending it and just send your tool_result blocks, the API has nothing to attach those results to: each tool_use_id references a call that isn't in the history. Append the assistant turn, then append the user turn with the results.
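It can help to see what the history looks like after one round trip. The sketch below uses simplified local types rather than the SDK's, and the id and text values are made up. The findOrphanResults helper is hypothetical: it flags any tool_result whose tool_use_id doesn't appear in the assistant message right before it.

```typescript
// Simplified stand-ins for the SDK's message types (assumption: enough for this check).
type Block =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string };
type Message = { role: "user" | "assistant"; content: string | Block[] };

// Illustrative history after one tool round trip.
const history: Message[] = [
  { role: "user", content: "What's the weather in Chennai?" },
  {
    role: "assistant",
    content: [
      { type: "text", text: "Let me check." },
      { type: "tool_use", id: "toolu_example_01", name: "get_weather", input: { location: "Chennai, India" } },
    ],
  },
  {
    role: "user",
    content: [
      { type: "tool_result", tool_use_id: "toolu_example_01", content: '{"temperature":18}' },
    ],
  },
];

// Returns the tool_use_ids of results with no matching call in the previous assistant turn.
function findOrphanResults(messages: Message[]): string[] {
  const orphans: string[] = [];
  messages.forEach((message, index) => {
    if (message.role !== "user" || typeof message.content === "string") return;
    const previous = messages[index - 1];
    const requested = new Set<string>();
    if (previous && previous.role === "assistant" && typeof previous.content !== "string") {
      for (const block of previous.content) {
        if (block.type === "tool_use") requested.add(block.id);
      }
    }
    for (const block of message.content) {
      if (block.type === "tool_result" && !requested.has(block.tool_use_id)) {
        orphans.push(block.tool_use_id);
      }
    }
  });
  return orphans;
}

console.log(findOrphanResults(history)); // []
```

Drop the assistant message from that array and the same check reports toolu_example_01 as an orphan, which is exactly the mistake described above.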

All results go in one user message. Claude can ask for several tools at once (more on that in a second). When it does, you run them all and return every tool_result in a single user message. Splitting them across multiple messages quietly teaches the model to stop making parallel calls, which you don't want.

stop_reason is the steering wheel. Everything branches off it. end_turn means a normal finish. tool_use means "I need you to run something." There are others (max_tokens if the response got cut off, refusal if the model declined for safety reasons), and a production agent should handle those explicitly rather than treating every non-end_turn reason as a tool call.
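One way to make that branching explicit is to turn stop_reason into a small decision value before doing anything else. This is a sketch under assumptions: decideNextStep and NextStep are hypothetical names, it only covers the stop reasons mentioned in this tutorial, and anything else falls through to a generic abort.

```typescript
type NextStep =
  | { kind: "finish" }
  | { kind: "run_tools" }
  | { kind: "abort"; reason: string };

// stopReason is typed loosely so values not listed here still reach the default branch.
function decideNextStep(stopReason: string | null): NextStep {
  switch (stopReason) {
    case "end_turn":
      return { kind: "finish" };
    case "tool_use":
      return { kind: "run_tools" };
    case "max_tokens":
      return { kind: "abort", reason: "Response was cut off; raise max_tokens or shorten the task." };
    case "refusal":
      return { kind: "abort", reason: "The model declined the request." };
    default:
      return { kind: "abort", reason: `Unhandled stop reason: ${String(stopReason)}` };
  }
}

console.log(decideNextStep("max_tokens").kind); // "abort"
```

The point of the default branch is that an unfamiliar reason never gets mistaken for a tool call.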

Step 5: run it

TypeScript

const answer = await runAgent(
  "What's the weather in Chennai right now, and what's that temperature times 1.8 plus 32?",
);
console.log(answer);

That question forces both tools. Claude calls get_weather for Chennai, reads back 18°C, then calls calculator with 18 * 1.8 + 32 to do the Fahrenheit conversion, and finally writes a sentence combining the two. You can watch it happen by logging response.stop_reason and the tool names on each iteration. The loop typically runs three or four times before it lands on end_turn.
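For the logging, a tiny formatter is enough. The helper below is a hypothetical sketch (describeTurn and LoggedBlock are names I made up) that works on a simplified view of the content blocks: just a type and, for tool calls, a name.

```typescript
// Simplified view of a content block: only the fields the log line needs.
type LoggedBlock = { type: string; name?: string };

// Builds a one-line summary of a single loop iteration.
function describeTurn(
  iteration: number,
  stopReason: string | null,
  content: LoggedBlock[],
): string {
  const toolNames = content
    .filter((block) => block.type === "tool_use")
    .map((block) => block.name ?? "(unnamed)");
  const tools = toolNames.length > 0 ? toolNames.join(", ") : "none";
  return `iteration ${iteration}: stop_reason=${String(stopReason)}, tools=${tools}`;
}

console.log(
  describeTurn(1, "tool_use", [{ type: "text" }, { type: "tool_use", name: "get_weather" }]),
);
// iteration 1: stop_reason=tool_use, tools=get_weather
```

Inside the loop you'd call it right after the API call, with something like describeTurn(iterations, response.stop_reason, response.content).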

What's nice is that you didn't script any of that sequencing. You never told the agent "first check weather, then convert." You gave it two tools and a question, and the loop let it work out the order. That's the part that feels different from a normal API call.

Gotchas

A few things will trip you up the first time. They tripped me up too.

Infinite loops are real

Without the MAX_ITERATIONS guard, an agent can loop forever. The usual cause is a tool that keeps returning something the model finds unsatisfying, so it calls the tool again, gets the same unsatisfying result, and calls it again. Or a bug where you forget to append the tool result, so Claude re-requests the call it already made. The iteration cap is not optional. It's the difference between a bug and a runaway API bill. Pick a number that's comfortably above your longest legitimate task and stop hard when you hit it.

Bad tool arguments

The model generates the arguments, and it gets them wrong sometimes: a malformed expression, a location that's actually a question, a field you required that came back empty. Don't let that throw. Validate inside the tool, and when something's off, return an error string describing the problem rather than throwing. Claude reads tool results, including error results, and it's genuinely good at recovering. Tell it "expression contains characters that aren't allowed" and it'll usually fix the expression and try again. An unhandled exception, by contrast, kills the loop and the model never gets a chance to correct course.
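As a safety net for exceptions you didn't anticipate, you can wrap the tool call so that a throw still becomes a readable result. A minimal sketch, assuming a local ToolResult type that mirrors the tool_result block (including its optional is_error flag); safeToolResult is a hypothetical helper name.

```typescript
// Local stand-in for the tool_result block shape.
type ToolResult = {
  type: "tool_result";
  tool_use_id: string;
  content: string;
  is_error?: boolean;
};

// Runs a tool and converts any thrown exception into an error result.
function safeToolResult(toolUseId: string, run: () => string): ToolResult {
  try {
    return { type: "tool_result", tool_use_id: toolUseId, content: run() };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return {
      type: "tool_result",
      tool_use_id: toolUseId,
      content: `Error: ${message}`,
      is_error: true, // tells the model this result is a failure
    };
  }
}

const failed = safeToolResult("toolu_example_02", () => {
  throw new Error("weather service timed out");
});
console.log(failed.content); // "Error: weather service timed out"
```

In the loop this would replace the direct call, along the lines of safeToolResult(block.id, () => runTool(block.name, block.input)).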

Parallel tool calls

Claude will often request multiple tools in a single response when they're independent. "Weather in Chennai and weather in Berlin" can come back as two get_weather blocks at once. The loop above handles this correctly because it iterates over every tool_use block in the response and collects all the results before sending them back. The trap is sending those results one message at a time. Do that and you break the contract: results for a single assistant turn belong together in one user message. If you genuinely want one-at-a-time behavior, set disable_parallel_tool_use: true on tool_choice rather than fighting the loop.
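For reference, the one-at-a-time setting is just an extra field on the request. The constant name below is hypothetical; the commented call assumes the client, tools, and messages from earlier in this tutorial.

```typescript
// Request options asking for at most one tool call per response.
const sequentialToolOptions = {
  tool_choice: { type: "auto", disable_parallel_tool_use: true },
} as const;

// Intended use (not executed here):
// await client.messages.create({
//   model: "claude-sonnet-4-6",
//   max_tokens: 1024,
//   tools,
//   messages,
//   ...sequentialToolOptions,
// });

console.log(JSON.stringify(sequentialToolOptions));
```

The loop itself doesn't change; each response simply carries at most one tool_use block.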

The max-iterations guard isn't a substitute for a budget

The iteration cap stops you from looping forever, but it doesn't bound cost on its own. A single iteration with a huge context is still expensive. If you're running agents at any scale, watch the token usage that comes back on each response.usage and decide your real ceiling from there. The iteration count is the seatbelt; token accounting is the speedometer.
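A token budget can sit right next to the iteration cap. The sketch below is an assumption-laden starting point: createTokenBudget is a hypothetical helper, the Usage type only includes input_tokens and output_tokens, and it ignores any cache-related usage fields.

```typescript
// Minimal view of response.usage for this sketch.
type Usage = { input_tokens: number; output_tokens: number };

// Tracks cumulative tokens across iterations against a fixed ceiling.
function createTokenBudget(limit: number) {
  let used = 0;
  return {
    record(usage: Usage): void {
      used += usage.input_tokens + usage.output_tokens;
    },
    total: (): number => used,
    exceeded: (): boolean => used >= limit,
  };
}

const budget = createTokenBudget(1000);
budget.record({ input_tokens: 600, output_tokens: 100 });
console.log(budget.total(), budget.exceeded()); // 700 false
budget.record({ input_tokens: 400, output_tokens: 50 });
console.log(budget.total(), budget.exceeded()); // 1150 true
```

In the loop you'd call budget.record(response.usage) after each API call and bail out when budget.exceeded() is true. Because the whole history is resent every turn, input tokens grow with each iteration, so the total climbs faster than the iteration count suggests.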

Where this goes next

What you've built is the complete pattern. Every "agent framework" is some elaboration on this loop: better memory, retries, human approval before a tool runs, a dozen tools instead of two. None of it changes the core: send, check stop_reason, run tools, feed results back, repeat until end_turn.

If you want to grow it, the natural next steps are giving a tool real side effects (write to a database, send a message), gating the irreversible ones behind a confirmation step, and switching the hand-written loop for the SDK's tool runner once you trust the mechanics. But write it by hand at least once. The loop is short, and understanding it cold is worth more than any abstraction built on top of it.