Running AI Agents with Kimi 2.5 on Cloudflare Workers

Amit Hariyale

Full Stack Web Developer, Gigawave

8 min read · April 16, 2026

Running AI agents with Kimi 2.5 on Cloudflare Workers matters in real projects because weak implementation choices create hard-to-debug failures and an inconsistent user experience.

This guide uses focused, production-oriented steps and code examples grounded in official references.

Key Concepts Covered

  • Core setup for running Kimi 2.5 AI agents on Cloudflare Workers
  • Implementation flow and reusable patterns
  • Validation and optimization strategy
  • Common pitfalls and recovery paths
  • Production best practices
  • Verification checklist for release

Context Setup

We start with minimal setup, then move to implementation patterns and validation checkpoints for running Kimi 2.5 AI agents on Cloudflare Workers.

Problem Breakdown

  • Unclear setup path for running Kimi 2.5 AI agents on Cloudflare Workers
  • Inconsistent implementation patterns
  • Missing validation for edge cases

Solution Overview

Apply a step-by-step architecture for running Kimi 2.5 AI agents on Cloudflare Workers: setup, core implementation, validation, and performance checks.

Additional Implementation Notes

  • Step 1: Define prerequisites and expected behavior for running Kimi 2.5 AI agents on Cloudflare Workers.
  • Step 2: Implement a minimal working baseline.
  • Step 3: Add robust handling for non-happy paths.
  • Step 4: Improve structure for reuse and readability.
  • Step 5: Validate with realistic usage scenarios.
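As a concrete illustration of Step 2, a minimal working baseline can be as small as a worker that answers every request with a fixed JSON payload. The sketch below (handler and response body are illustrative, not part of the final agent) is enough to confirm routing and deployment before any Kimi integration is added.

```typescript
// Minimal baseline worker: responds to every request with { ok: true }.
// The exported object follows the standard Workers module syntax.
const worker = {
  async fetch(_request: Request): Promise<Response> {
    return new Response(JSON.stringify({ ok: true }), {
      headers: { "Content-Type": "application/json" },
    });
  },
};

export default worker;
```

Deploy this first, confirm it responds, then layer the agent logic on top.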

Best Practices

  • Keep implementation modular and testable
  • Use one clear source of truth for configuration
  • Validate behavior before optimization

Pro Tips

  • Prefer concise code snippets with clear intent
  • Document edge cases and trade-offs
  • Use official docs for API-level decisions

Final Thoughts

Treat running Kimi 2.5 AI agents on Cloudflare Workers as an iterative build: baseline first, then reliability and performance hardening.


Blog Identity

  • title: Running AI Agents with Kimi 2.5 on Cloudflare Workers
  • slug: ai-agents-kimik-25-cloudflare-workers
  • primary topic keyword: Kimi 2.5 AI agents
  • target stack: Cloudflare Workers, TypeScript, AI/ML

SEO Metadata

  • seoTitle: Run Kimi 2.5 AI Agents on Cloudflare Workers (Step-by-Step)
  • metaDescription: Deploy autonomous AI agents using Kimi 2.5 on Cloudflare's edge. Learn worker setup, streaming responses, and tool-calling patterns with production-ready code.
  • suggestedTags: ["Kimi 2.5", "Cloudflare Workers", "AI agents", "edge AI", "serverless LLM", "Moonshot AI"]
  • suggestedReadTime: 8 min

Hero Hook

You built a chatbot. Now you need an agent—something that can call APIs, reason through steps, and act without hand-holding. Kimi 2.5 (Moonshot AI's latest model) brings strong tool-calling and long-context reasoning, but running it on traditional servers means cold starts and regional latency.

Cloudflare Workers changes the game: deploy your agent to 300+ locations, pay per request, and stream tokens from the edge. The catch? Workers have strict CPU limits, no native Node APIs, and a unique fetch-based runtime. Most Kimi SDK examples assume a Node.js environment and break immediately.

This guide shows you how to bridge that gap—no hacks, no node_compat flags that bloat your bundle. Just clean, edge-native code that streams tool calls and handles state across requests.

Context Setup

What You'll Need

  • Cloudflare account with Workers enabled
  • Moonshot AI API key (Kimi 2.5 access)
  • Wrangler CLI installed (npm install -g wrangler)
  • Basic TypeScript knowledge

What We're Building

A stateless agent worker that:

  • Accepts user queries via HTTP POST
  • Streams Kimi 2.5 responses with Server-Sent Events (SSE)
  • Executes tool calls (web search, calculator) and returns results
  • Maintains conversation context using Cloudflare KV (optional)
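The SSE events this worker emits can be modeled as a small discriminated union. The sketch below mirrors the event names used later in this guide ("content", "tool_start", "tool_result") and shows how a client might decode one `data:` line; treat the event shapes as a convention of this guide, not a fixed API.

```typescript
// Event types emitted by the agent worker over SSE (per this guide's convention).
type AgentEvent =
  | { type: "content"; text: string }
  | { type: "tool_start"; count: number }
  | { type: "tool_result"; id: string; result: string };

// Decode a single SSE line of the form `data: {...}`; returns null for
// blank lines, the [DONE] sentinel, or malformed JSON.
function parseEventLine(line: string): AgentEvent | null {
  const trimmed = line.trim();
  if (!trimmed.startsWith("data: ") || trimmed === "data: [DONE]") return null;
  try {
    return JSON.parse(trimmed.slice(6)) as AgentEvent;
  } catch {
    return null;
  }
}
```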

Assumptions

  • You understand OpenAI-style chat completions and function calling
  • You've deployed at least one Cloudflare Worker before
  • Kimi 2.5's API is OpenAI-compatible with minor differences

Problem Breakdown

Why Standard SDKs Fail on Workers

| Problem | Symptom | Root Cause |
| --- | --- | --- |
| `fs`/`path` imports | Error: No such module | SDKs assume a Node.js file system |
| `axios` or `node-fetch` | `fetch is not a function` | Workers use native `fetch`, not polyfills |
| Streaming with `ReadableStream` | Empty responses or hangs | Incorrect stream handling for the Web Streams API |
| Tool call parsing | Malformed JSON in `function.arguments` | Kimi returns partial JSON during streaming |

The Hidden Complexity

Kimi 2.5 streams tool calls differently than OpenAI. The delta.tool_calls array may arrive in chunks across multiple SSE events, requiring manual reassembly. Most tutorials gloss over this—you'll see JSON.parse blow up on incomplete strings.
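To make the reassembly problem concrete, here is a self-contained sketch (type names are illustrative, not the real SDK types) of accumulating partial `tool_calls` deltas: the arguments string is appended chunk by chunk and only parsed once streaming finishes.

```typescript
// Illustrative shape of a streamed tool-call fragment.
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

// Accumulate fragments keyed by index; append arguments, never parse mid-stream.
function accumulate(deltas: ToolCallDelta[]): Map<number, { id: string; name: string; args: string }> {
  const pending = new Map<number, { id: string; name: string; args: string }>();
  for (const d of deltas) {
    const entry = pending.get(d.index) ?? { id: "", name: "", args: "" };
    if (d.id) entry.id = d.id;
    if (d.function?.name) entry.name = d.function.name;
    if (d.function?.arguments) entry.args += d.function.arguments;
    pending.set(d.index, entry);
  }
  return pending;
}

// Two deltas carry fragments of one call; '{"expr' alone is invalid JSON,
// but the concatenation parses cleanly.
const chunks: ToolCallDelta[] = [
  { index: 0, id: "call_1", function: { name: "calculator", arguments: '{"expr' } },
  { index: 0, function: { arguments: 'ession": "2+2"}' } },
];
const call = accumulate(chunks).get(0)!;
```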

Solution Overview

Our Approach: Native Web APIs + Manual Stream Parsing

Instead of fighting SDK compatibility, we use:

  • Native fetch with streaming Response.body
  • Web Streams API (ReadableStream, TransformStream) for SSE
  • Manual tool call accumulation (no fragile regex parsing)
  • Cloudflare KV for cross-request state (optional)

Why not the official Moonshot SDK? It pulls in Node-specific dependencies. Our approach adds ~2KB to your worker bundle versus ~150KB with node_compat workarounds.

Alternative considered: Using Cloudflare's AI Gateway. Good for observability, but adds latency and doesn't solve the streaming/tool-calling logic you need for agents.

Implementation Steps

Step 1: Initialize Worker Project

implementation-steps-1.sh
```sh
mkdir kimi-agent-worker && cd kimi-agent-worker
wrangler init --yes
npm install hono # lightweight, Workers-native framework
```

Edit wrangler.toml:

implementation-steps-2.toml
```toml
name = "kimi-agent"
main = "src/index.ts"
compatibility_date = "2024-06-01"

[vars]
MOONSHOT_API_KEY = "" # set via wrangler secret in production

[[kv_namespaces]]
binding = "AGENT_KV"
id = "your-kv-namespace-id" # optional, for state persistence
```

Step 2: Define Tool Schema and Types

Create src/types.ts:

implementation-steps-3.ts
```typescript
export interface Tool {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}

export interface ToolCall {
  id: string;
  type: "function";
  function: {
    name: string;
    arguments: string; // JSON string, may be partial during streaming
  };
}

export interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  tool_calls?: ToolCall[];
  tool_call_id?: string;
}
```

Step 3: Build the Streaming Client

Create src/kimi-client.ts:

implementation-steps-4.ts
```typescript
import type { Message, Tool } from "./types";

const MOONSHOT_BASE = "https://api.moonshot.cn/v1";

export interface StreamChunk {
  choices: Array<{
    delta: {
      content?: string;
      tool_calls?: Array<{
        index: number;
        id?: string;
        type?: "function";
        function?: {
          name?: string;
          arguments?: string;
        };
      }>;
    };
    finish_reason: string | null;
  }>;
}

export async function* streamChat(
  apiKey: string,
  messages: Message[],
  tools: Tool[],
  model = "kimi-k2-5"
): AsyncGenerator<StreamChunk, void, unknown> {
  const response = await fetch(`${MOONSHOT_BASE}/chat/completions`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      messages,
      tools: tools.map(t => ({ type: "function", function: t })),
      tool_choice: "auto",
      stream: true,
    }),
  });

  if (!response.ok) {
    throw new Error(`Kimi API error: ${response.status}`);
  }

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() || "";

    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed || trimmed === "data: [DONE]") continue;
      if (trimmed.startsWith("data: ")) {
        const json = trimmed.slice(6);
        try {
          yield JSON.parse(json) as StreamChunk;
        } catch {
          // Partial JSON, continue accumulating
        }
      }
    }
  }
}
```

Step 4: Implement Tool Execution

Create src/tools.ts:

implementation-steps-5.ts
```typescript
import type { Tool } from "./types";

export const tools: Tool[] = [
  {
    name: "calculator",
    description: "Evaluate mathematical expressions",
    parameters: {
      type: "object",
      properties: {
        expression: { type: "string", description: "Math expression to evaluate" },
      },
      required: ["expression"],
    },
  },
  {
    name: "web_search",
    description: "Search the web for current information",
    parameters: {
      type: "object",
      properties: {
        query: { type: "string", description: "Search query" },
      },
      required: ["query"],
    },
  },
];

export async function executeTool(name: string, args: string): Promise<string> {
  const parsed = JSON.parse(args);

  switch (name) {
    case "calculator":
      // SECURITY: Use a proper math parser in production
      // This is a naive implementation for demonstration
      try {
        // eslint-disable-next-line no-new-func
        const result = new Function(`return (${parsed.expression})`)();
        return String(result);
      } catch {
        return "Error: Invalid expression";
      }

    case "web_search": {
      // Integrate with your search provider (SerpAPI, Brave, etc.)
      // Example with Cloudflare's native fetch:
      const searchRes = await fetch(
        `https://api.bing.microsoft.com/v7.0/search?q=${encodeURIComponent(parsed.query)}`,
        { headers: { "Ocp-Apim-Subscription-Key": "YOUR_KEY" } }
      );
      const data = await searchRes.json();
      return JSON.stringify(data.webPages?.value?.slice(0, 3) || []);
    }

    default:
      return `Unknown tool: ${name}`;
  }
}
```
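The `new Function` calculator above is deliberately naive. If you cannot pull in a real math parser, a hedged middle ground (a suggestion, not part of the original tools module) is to validate the expression against a strict character whitelist before evaluating, so identifiers like `fetch` or `globalThis` can never reach the evaluator:

```typescript
// Hardened variant of the calculator tool: reject anything that is not
// plain arithmetic before evaluating. Still not a full parser; a proper
// expression-evaluator library is preferable in production.
function safeCalculate(expression: string): string {
  // Allow only digits, whitespace, and basic arithmetic characters.
  if (!/^[\d\s+\-*/().%]+$/.test(expression)) {
    return "Error: Invalid expression";
  }
  try {
    const result = new Function(`"use strict"; return (${expression})`)();
    return String(result);
  } catch {
    return "Error: Invalid expression";
  }
}
```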

Step 5: Assemble the Agent Handler

Create src/index.ts:

implementation-steps-6.ts
```typescript
import { Hono } from "hono";
import { streamText } from "./streaming";
import { streamChat } from "./kimi-client";
import { tools, executeTool } from "./tools";
import type { Message, ToolCall } from "./types";

type Bindings = {
  MOONSHOT_API_KEY: string;
  AGENT_KV?: KVNamespace;
};

const app = new Hono<{ Bindings: Bindings }>();

app.post("/agent", async (c) => {
  const { messages: userMessages, sessionId = crypto.randomUUID() } =
    await c.req.json<{ messages: Message[]; sessionId?: string }>();

  const apiKey = c.env.MOONSHOT_API_KEY;

  // Optional: Load previous context from KV
  let messages = userMessages;
  if (c.env.AGENT_KV) {
    const stored = await c.env.AGENT_KV.get(`session:${sessionId}`);
    if (stored) {
      messages = [...JSON.parse(stored), ...userMessages];
    }
  }

  return streamText(c, async (stream) => {
    const pendingToolCalls = new Map<number, Partial<ToolCall>>();
    let assistantMessage: Message = { role: "assistant", content: "" };
    let needsToolExecution = false;

    // First pass: stream and accumulate
    for await (const chunk of streamChat(apiKey, messages, tools)) {
      const delta = chunk.choices[0]?.delta;

      // Handle content streaming
      if (delta?.content) {
        assistantMessage.content += delta.content;
        await stream.write(`data: ${JSON.stringify({ type: "content", text: delta.content })}\n\n`);
      }

      // Handle tool call streaming (critical: accumulate partial calls)
      if (delta?.tool_calls) {
        needsToolExecution = true;
        for (const tc of delta.tool_calls) {
          const existing = pendingToolCalls.get(tc.index) || {
            id: tc.id || "",
            type: "function" as const,
            function: { name: "", arguments: "" },
          };

          if (tc.id) existing.id = tc.id;
          if (tc.function?.name) existing.function!.name = tc.function.name;
          if (tc.function?.arguments) existing.function!.arguments += tc.function.arguments;

          pendingToolCalls.set(tc.index, existing);
        }
      }

      if (chunk.choices[0]?.finish_reason) break;
    }

    // Execute tools if needed
    if (needsToolExecution && pendingToolCalls.size > 0) {
      const completedCalls = Array.from(pendingToolCalls.values())
        .filter((tc): tc is ToolCall =>
          tc.id !== undefined &&
          tc.function?.name !== undefined &&
          tc.function?.arguments !== undefined
        );

      assistantMessage.tool_calls = completedCalls;

      // Stream tool execution start
      await stream.write(`data: ${JSON.stringify({ type: "tool_start", count: completedCalls.length })}\n\n`);

      // Execute tools in parallel
      const toolResults = await Promise.all(
        completedCalls.map(async (tc) => {
          const result = await executeTool(tc.function.name, tc.function.arguments);
          return {
            role: "tool" as const,
            tool_call_id: tc.id,
            content: result,
          };
        })
      );

      // Stream tool results
      for (const result of toolResults) {
        await stream.write(`data: ${JSON.stringify({ type: "tool_result", id: result.tool_call_id, result: result.content })}\n\n`);
      }

      // Second call with tool results appended
      const finalMessages = [...messages, assistantMessage, ...toolResults];

      // Optional: Save to KV
      if (c.env.AGENT_KV) {
        await c.env.AGENT_KV.put(`session:${sessionId}`, JSON.stringify(finalMessages), {
          expirationTtl: 3600, // 1 hour
        });
      }

      // Stream final response
      for await (const chunk of streamChat(apiKey, finalMessages, tools)) {
        const content = chunk.choices[0]?.delta?.content;
        if (content) {
          await stream.write(`data: ${JSON.stringify({ type: "content", text: content })}\n\n`);
        }
        if (chunk.choices[0]?.finish_reason) break;
      }
    }

    await stream.write("data: [DONE]\n\n");
  });
});

export default app;
```

Step 6: Add SSE Streaming Utility

Create src/streaming.ts:

implementation-steps-7.ts
```typescript
import type { Context, Env } from "hono";

export function streamText<E extends Env>(
  c: Context<E>,
  generator: (stream: { write: (data: string) => Promise<void> }) => Promise<void>
): Response {
  const stream = new TransformStream();
  const writer = stream.writable.getWriter();
  const encoder = new TextEncoder();

  // Run generator without awaiting to start streaming immediately
  generator({
    write: (data) => writer.write(encoder.encode(data)),
  }).finally(() => writer.close());

  return c.newResponse(stream.readable, 200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
  });
}
```
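On the consuming side, browsers' `EventSource` only supports GET, so a POST-based client typically reads the stream manually. A hypothetical client (function and callback names are illustrative) can reuse the same buffered line-splitting technique the worker applies to the Kimi API:

```typescript
// Read an SSE response body line by line and surface "content" events.
async function consumeAgentStream(res: Response, onText: (t: string) => void): Promise<void> {
  const reader = res.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() || ""; // keep the trailing partial line
    for (const line of lines) {
      const t = line.trim();
      if (!t.startsWith("data: ") || t === "data: [DONE]") continue;
      const evt = JSON.parse(t.slice(6));
      if (evt.type === "content") onText(evt.text);
    }
  }
}
```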

Step 7: Deploy

implementation-steps-8.sh
```sh
# Set your API key securely
wrangler secret put MOONSHOT_API_KEY

# Deploy
wrangler deploy

# Test
curl -X POST https://kimi-agent.your-subdomain.workers.dev/agent \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"What is 15 * 23?"}]}'
```

Pro Tip: before taking Kimi 2.5 agents on Cloudflare Workers to production, verify the installation, run a real-world validation pass, and document rollback steps.