
Amit Hariyale
Full Stack Web Developer, Gigawave

Running AI agents with Kimi K2.5 on Cloudflare Workers matters in real projects because weak implementation choices create hard-to-debug failures and an inconsistent user experience. This guide uses focused, production-oriented steps and code examples grounded in official references.

What this guide covers:
- Core setup for running Kimi K2.5 agents on Cloudflare
- Implementation flow and reusable patterns
- Validation and optimization strategy
- Common pitfalls and recovery paths
- Production best practices
- Verification checklist for release

Problems it addresses:
- Unclear setup path for running Kimi K2.5 agents on Cloudflare
- Inconsistent implementation patterns
- Missing validation for edge cases

Guiding principles:
- Keep the implementation modular and testable
- Use one clear source of truth for configuration
- Validate behavior before optimizing

We start with a minimal setup, then move to implementation patterns and validation checkpoints. Treat the build as iterative: establish a working baseline first, then harden it for reliability and performance.
You built a chatbot. Now you need an agent: something that can call APIs, reason through steps, and act without hand-holding. Kimi K2.5 (Moonshot AI's latest) brings strong tool-calling and long-context reasoning, but running your agent on traditional servers means cold starts and regional latency.
Cloudflare Workers changes the game: deploy your agent to 300+ locations, pay per request, and stream tokens from the edge. The catch? Workers have strict CPU limits, no native Node APIs, and a unique fetch-based runtime. Most Kimi SDK examples assume a Node.js environment and break immediately.
This guide shows you how to bridge that gap—no hacks, no node_compat flags that bloat your bundle. Just clean, edge-native code that streams tool calls and handles state across requests.
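The result is a stateless agent Worker that streams responses from Kimi, reassembles streamed tool calls, executes the tools, and can optionally persist session history in KV. The table below lists what typically breaks when Node-oriented examples are moved to Workers: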
| Problem | Symptom | Root Cause |
|---|---|---|
| fs/path imports | Error: No such module | SDKs assume Node.js file system |
| axios or node-fetch | fetch is not a function | Workers use native fetch, not polyfills |
| Streaming with ReadableStream | Empty responses or hangs | Incorrect stream handling for Web Streams API |
| Tool call parsing | Malformed JSON in function.arguments | Kimi returns partial JSON during streaming |
Kimi streams tool calls incrementally, as OpenAI-compatible APIs generally do: the delta.tool_calls array may arrive in fragments across multiple SSE events, so the arguments string has to be reassembled manually before it is parsed. Most tutorials gloss over this, and you'll see JSON.parse blow up on incomplete strings.
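The reassembly step can be sketched in isolation. This is a minimal sketch, assuming the fragments follow the index/id/function shape used elsewhere in this guide; `assembleToolCalls` and `parseArguments` are hypothetical helper names, not part of any SDK.

```typescript
// One streamed tool-call fragment (assumed OpenAI-compatible delta shape).
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface AssembledToolCall {
  id: string;
  name: string;
  arguments: string; // raw JSON text, complete only once the stream has ended
}

// Merge fragments by index; argument text is concatenated in arrival order.
function assembleToolCalls(deltas: ToolCallDelta[]): AssembledToolCall[] {
  const byIndex = new Map<number, AssembledToolCall>();
  for (const d of deltas) {
    const call = byIndex.get(d.index) ?? { id: "", name: "", arguments: "" };
    if (d.id) call.id = d.id;
    if (d.function?.name) call.name = d.function.name;
    if (d.function?.arguments) call.arguments += d.function.arguments;
    byIndex.set(d.index, call);
  }
  return Array.from(byIndex.entries())
    .sort((a, b) => a[0] - b[0])
    .map(([, call]) => call);
}

// Parse only after the stream has finished; return null instead of throwing.
function parseArguments(raw: string): Record<string, unknown> | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      return null;
    }
    return parsed as Record<string, unknown>;
  } catch {
    return null;
  }
}

// Example: one call whose arguments arrive split across two events.
const calls = assembleToolCalls([
  { index: 0, id: "call_1", function: { name: "calculator", arguments: "" } },
  { index: 0, function: { arguments: '{"expre' } },
  { index: 0, function: { arguments: 'ssion":"15 * 23"}' } },
]);
console.log(calls[0].name, parseArguments(calls[0].arguments));
```

Parsing is deferred until every fragment has arrived, so a half-received argument string is never passed to JSON.parse.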
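Instead of fighting SDK compatibility, we use only what the Workers runtime already provides: native fetch, the Web Streams API, and Hono, a lightweight Workers-native framework.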
Why not the official Moonshot SDK? It pulls in Node-specific dependencies. Our approach adds ~2KB to your worker bundle versus ~150KB with node_compat workarounds.
Alternative considered: Using Cloudflare's AI Gateway. Good for observability, but adds latency and doesn't solve the streaming/tool-calling logic you need for agents.
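Most of the streaming work comes down to buffering text until a full SSE line has arrived. The following is a minimal sketch of that idea using only standard string handling; `createSseParser` is a hypothetical helper name, and it assumes each event carries a single `data:` line, as in the streams this guide consumes.

```typescript
// Incremental parser: feed decoded text chunks, get back complete "data:" payloads.
function createSseParser(): (chunk: string) => string[] {
  let buffer = "";
  return function feed(chunk: string): string[] {
    buffer += chunk;
    const lines = buffer.split("\n");
    // The last piece may be an incomplete line; keep it for the next chunk.
    buffer = lines.pop() ?? "";
    const payloads: string[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed.startsWith("data:")) continue;
      const payload = trimmed.slice(5).trim();
      // Skip empty payloads and the end-of-stream sentinel.
      if (payload && payload !== "[DONE]") payloads.push(payload);
    }
    return payloads;
  };
}

// Example: the second event is split across two network chunks.
const feed = createSseParser();
const first = feed('data: {"a":1}\ndata: {"b"');
const second = feed(':2}\n\ndata: [DONE]\n');
console.log(first, second);
```

Because the incomplete tail stays in the buffer, every payload handed to the caller is a whole line, whatever the chunk boundaries were.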
```bash
mkdir kimi-agent-worker && cd kimi-agent-worker
wrangler init --yes
npm install hono # lightweight, Workers-native framework
```

Edit wrangler.toml:
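```toml
name = "kimi-agent"
main = "src/index.ts"
compatibility_date = "2024-06-01"

# MOONSHOT_API_KEY is deliberately not listed under [vars]. Set it with
# `wrangler secret put MOONSHOT_API_KEY` (or in .dev.vars for local development);
# a plain-text var with the same name would conflict with the secret.

# Optional: remove this block if you don't need state persistence
[[kv_namespaces]]
binding = "AGENT_KV"
id = "your-kv-namespace-id"
```

Create src/types.ts: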
```typescript
export interface Tool {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}

export interface ToolCall {
  id: string;
  type: "function";
  function: {
    name: string;
    arguments: string; // JSON string, may be partial during streaming
  };
}

export interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
  tool_calls?: ToolCall[];
  tool_call_id?: string;
}
```

Create src/kimi-client.ts:
```typescript
import type { Message, Tool } from "./types";

const MOONSHOT_BASE = "https://api.moonshot.cn/v1";

export interface StreamChunk {
  choices: Array<{
    delta: {
      content?: string;
      tool_calls?: Array<{
        index: number;
        id?: string;
        type?: "function";
        function?: {
          name?: string;
          arguments?: string;
        };
      }>;
    };
    finish_reason: string | null;
  }>;
}

export async function* streamChat(
  apiKey: string,
  messages: Message[],
  tools: Tool[],
  model = "kimi-k2-5" // confirm the exact model ID against Moonshot's model list
): AsyncGenerator<StreamChunk, void, unknown> {
  const response = await fetch(`${MOONSHOT_BASE}/chat/completions`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      messages,
      tools: tools.map(t => ({ type: "function", function: t })),
      tool_choice: "auto",
      stream: true,
    }),
  });

  if (!response.ok) {
    throw new Error(`Kimi API error: ${response.status}`);
  }

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    // Decode incrementally and keep the trailing incomplete line in the buffer
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() || "";

    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed || trimmed === "data: [DONE]") continue;
      if (trimmed.startsWith("data: ")) {
        const json = trimmed.slice(6);
        try {
          yield JSON.parse(json) as StreamChunk;
        } catch {
          // Skip malformed events; incomplete lines never reach this point
          // because they stay in `buffer` until the rest arrives
        }
      }
    }
  }
}
```

Create src/tools.ts:
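```typescript
import type { Tool } from "./types";

export const tools: Tool[] = [
  {
    name: "calculator",
    description: "Evaluate mathematical expressions",
    parameters: {
      type: "object",
      properties: {
        expression: { type: "string", description: "Math expression to evaluate" },
      },
      required: ["expression"],
    },
  },
  {
    name: "web_search",
    description: "Search the web for current information",
    parameters: {
      type: "object",
      properties: {
        query: { type: "string", description: "Search query" },
      },
      required: ["query"],
    },
  },
];

// Workers disallow eval() and new Function(), so arithmetic is parsed by hand.
// Supports +, -, *, /, parentheses, and decimal numbers.
function evaluateExpression(input: string): number {
  const src = input.replace(/\s+/g, "");
  let pos = 0;

  function parseFactor(): number {
    if (src[pos] === "-") {
      pos++;
      return -parseFactor();
    }
    if (src[pos] === "(") {
      pos++;
      const value = parseSum();
      if (src[pos++] !== ")") throw new Error("Expected ')'");
      return value;
    }
    const match = /^\d+(\.\d+)?/.exec(src.slice(pos));
    if (!match) throw new Error("Expected a number");
    pos += match[0].length;
    return Number(match[0]);
  }

  function parseProduct(): number {
    let value = parseFactor();
    while (src[pos] === "*" || src[pos] === "/") {
      const op = src[pos++];
      const rhs = parseFactor();
      value = op === "*" ? value * rhs : value / rhs;
    }
    return value;
  }

  function parseSum(): number {
    let value = parseProduct();
    while (src[pos] === "+" || src[pos] === "-") {
      const op = src[pos++];
      const rhs = parseProduct();
      value = op === "+" ? value + rhs : value - rhs;
    }
    return value;
  }

  const result = parseSum();
  if (pos !== src.length) throw new Error("Unexpected input");
  return result;
}

export async function executeTool(name: string, args: string): Promise<string> {
  let parsed: Record<string, string>;
  try {
    parsed = JSON.parse(args);
  } catch {
    return "Error: tool arguments were not valid JSON";
  }

  switch (name) {
    case "calculator":
      try {
        return String(evaluateExpression(parsed.expression));
      } catch {
        return "Error: Invalid expression";
      }

    case "web_search": {
      // Placeholder endpoint and key: integrate with your search provider (SerpAPI, Brave, etc.)
      const searchRes = await fetch(
        `https://api.bing.microsoft.com/v7.0/search?q=${encodeURIComponent(parsed.query)}`,
        { headers: { "Ocp-Apim-Subscription-Key": "YOUR_KEY" } }
      );
      const data = (await searchRes.json()) as { webPages?: { value?: unknown[] } };
      return JSON.stringify(data.webPages?.value?.slice(0, 3) || []);
    }

    default:
      return `Unknown tool: ${name}`;
  }
}
```

Create src/index.ts: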
```typescript
import { Hono } from "hono";
import { streamText } from "./streaming";
import { streamChat } from "./kimi-client";
import { tools, executeTool } from "./tools";
import type { Message, ToolCall } from "./types";

type Bindings = {
  MOONSHOT_API_KEY: string;
  AGENT_KV?: KVNamespace;
};

const app = new Hono<{ Bindings: Bindings }>();

app.post("/agent", async (c) => {
  const { messages: userMessages, sessionId = crypto.randomUUID() } =
    await c.req.json<{ messages: Message[]; sessionId?: string }>();

  const apiKey = c.env.MOONSHOT_API_KEY;

  // Optional: load previous context from KV
  let messages = userMessages;
  if (c.env.AGENT_KV) {
    const stored = await c.env.AGENT_KV.get(`session:${sessionId}`);
    if (stored) {
      messages = [...JSON.parse(stored), ...userMessages];
    }
  }

  return streamText(c, async (stream) => {
    const pendingToolCalls = new Map<number, Partial<ToolCall>>();
    const assistantMessage: Message = { role: "assistant", content: "" };
    let needsToolExecution = false;

    // First pass: stream content and accumulate tool-call fragments
    for await (const chunk of streamChat(apiKey, messages, tools)) {
      // Some chunks carry no choices or no delta, so guard every access
      const delta = chunk.choices[0]?.delta;

      // Handle content streaming
      if (delta?.content) {
        assistantMessage.content += delta.content;
        await stream.write(`data: ${JSON.stringify({ type: "content", text: delta.content })}\n\n`);
      }

      // Handle tool call streaming (critical: accumulate partial calls by index)
      if (delta?.tool_calls) {
        needsToolExecution = true;
        for (const tc of delta.tool_calls) {
          const existing = pendingToolCalls.get(tc.index) || {
            id: tc.id || "",
            type: "function" as const,
            function: { name: "", arguments: "" },
          };

          if (tc.id) existing.id = tc.id;
          if (tc.function?.name) existing.function!.name = tc.function.name;
          if (tc.function?.arguments) existing.function!.arguments += tc.function.arguments;

          pendingToolCalls.set(tc.index, existing);
        }
      }

      if (chunk.choices[0]?.finish_reason) break;
    }

    // Conversation so far, including the assistant turn just streamed
    const history: Message[] = [...messages, assistantMessage];

    // Execute tools if needed
    if (needsToolExecution && pendingToolCalls.size > 0) {
      const completedCalls = Array.from(pendingToolCalls.values())
        .filter((tc): tc is ToolCall =>
          tc.id !== undefined &&
          tc.function?.name !== undefined &&
          tc.function?.arguments !== undefined
        );

      assistantMessage.tool_calls = completedCalls;

      // Stream tool execution start
      await stream.write(`data: ${JSON.stringify({ type: "tool_start", count: completedCalls.length })}\n\n`);

      // Execute tools in parallel
      const toolResults = await Promise.all(
        completedCalls.map(async (tc) => {
          const result = await executeTool(tc.function.name, tc.function.arguments);
          return {
            role: "tool" as const,
            tool_call_id: tc.id,
            content: result,
          };
        })
      );

      // Stream tool results
      for (const result of toolResults) {
        await stream.write(`data: ${JSON.stringify({ type: "tool_result", id: result.tool_call_id, result: result.content })}\n\n`);
      }

      history.push(...toolResults);

      // Second pass: stream the final answer with the tool results in context.
      // Tool calls requested during this pass are not executed.
      const finalMessage: Message = { role: "assistant", content: "" };
      for await (const chunk of streamChat(apiKey, history, tools)) {
        const content = chunk.choices[0]?.delta?.content;
        if (content) {
          finalMessage.content += content;
          await stream.write(`data: ${JSON.stringify({ type: "content", text: content })}\n\n`);
        }
        if (chunk.choices[0]?.finish_reason) break;
      }
      history.push(finalMessage);
    }

    // Optional: save the full conversation, including the final answer, to KV
    if (c.env.AGENT_KV) {
      await c.env.AGENT_KV.put(`session:${sessionId}`, JSON.stringify(history), {
        expirationTtl: 3600, // 1 hour
      });
    }

    await stream.write("data: [DONE]\n\n");
  });
});

export default app;
```

Create src/streaming.ts:
```typescript
import type { Context, Env } from "hono";

export function streamText<E extends Env>(
  c: Context<E>,
  generator: (stream: { write: (data: string) => Promise<void> }) => Promise<void>
): Response {
  const stream = new TransformStream();
  const writer = stream.writable.getWriter();
  const encoder = new TextEncoder();

  // Run generator without awaiting to start streaming immediately
  generator({
    write: (data) => writer.write(encoder.encode(data)),
  })
    // Surface failures (for example a Kimi API error) to the client as an SSE event
    .catch((err) =>
      writer.write(encoder.encode(`data: ${JSON.stringify({ type: "error", message: String(err) })}\n\n`))
    )
    .finally(() => writer.close())
    .catch(() => {
      // The client disconnected; there is nothing left to write to
    });

  return c.newResponse(stream.readable, 200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
  });
}
```

Deploy and test:

```bash
# Set your API key securely
wrangler secret put MOONSHOT_API_KEY

# Deploy
wrangler deploy

# Test
curl -X POST https://kimi-agent.your-subdomain.workers.dev/agent \
  -H "Content-Type: application/json" \
  -d '{"messages":[{"role":"user","content":"What is 15 * 23?"}]}'
```

Filename: wrangler.toml
Language: toml
Purpose: Worker configuration with KV binding for session state
Code:
```toml
name = "kimi-agent"
main = "src/index.ts"
compatibility_date = "2024-06-01"

# MOONSHOT_API_KEY is set with `wrangler secret put`, not under [vars]

[[kv_namespaces]]
binding = "AGENT_KV"
id = "your-kv-namespace-id"
```

Filename: src/kimi-client.ts
Language: typescript
Purpose: Native fetch-based streaming client for Kimi 2.5 API
Code:
```typescript
import type { Message, Tool } from "./types";

const MOONSHOT_BASE = "https://api.moonshot.cn/v1";

export interface StreamChunk {
  choices: Array<{
    delta: {
      content?: string;
      tool_calls?: Array<{
        index: number;
        id?: string;
        type?: "function";
        function?: { name?: string; arguments?: string };
      }>;
    };
    finish_reason: string | null;
  }>;
}

export async function* streamChat(
  apiKey: string,
  messages: Message[],
  tools: Tool[],
  model = "kimi-k2-5"
): AsyncGenerator<StreamChunk, void, unknown> {
  const response = await fetch(`${MOONSHOT_BASE}/chat/completions`, {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      messages,
      tools: tools.map(t => ({ type: "function", function: t })),
      tool_choice: "auto",
      stream: true,
    }),
  });

  if (!response.ok) throw new Error(`Kimi API error: ${response.status}`);

  const reader = response.body!.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split("\n");
    buffer = lines.pop() || "";

    for (const line of lines) {
      const trimmed = line.trim();
      if (!trimmed || trimmed === "data: [DONE]") continue;
      if (trimmed.startsWith("data: ")) {
        // Skip malformed events rather than aborting the stream
        try { yield JSON.parse(trimmed.slice(6)) as StreamChunk; } catch {}
      }
    }
  }
}
```

Filename: src/index.ts
Language: typescript
Purpose: Main Hono handler with tool call accumulation and recursive execution
Code:
```typescript
app.post("/agent", async (c) => {
  const { messages: userMessages, sessionId = crypto.randomUUID() } =
    await c.req.json();
  const apiKey = c.env.MOONSHOT_API_KEY;
  let messages = userMessages;
  return streamText(c, async (stream) => {
    const pendingToolCalls = new Map<number, Partial<ToolCall>>();
    let assistantMessage: Message = { role: "assistant", content: "" };
    let needsToolExecution = false;
    for await (const chunk of streamChat(apiKey, messages, tools)) {
      // Some chunks carry no choices or no delta, so guard every access
      const delta = chunk.choices[0]?.delta;
      if (delta?.content) {
        assistantMessage.content += delta.content;
        await stream.write(`data: ${JSON.stringify({ type: "content", text: delta.content })}\n\n`);
      }
      if (delta?.tool_calls) {
        needsToolExecution = true;
        for (const tc of delta.tool_calls) {
          const existing = pendingToolCalls.get(tc.index) || {
            id: tc.id || "", type: "function" as const,
            function: { name: "", arguments: "" },
          };
          if (tc.id) existing.id = tc.id;
          if (tc.function?.name) existing.function!.name = tc.function.name;
          if (tc.function?.arguments) existing.function!.arguments += tc.function.arguments;
          pendingToolCalls.set(tc.index, existing);
        }
      }
      if (chunk.choices[0]?.finish_reason) break;
    }
    if (needsToolExecution && pendingToolCalls.size > 0) {
      const completedCalls = Array.from(pendingToolCalls.values())
        .filter((tc): tc is ToolCall =>
          tc.id !== undefined && tc.function?.name !== undefined && tc.function?.arguments !== undefined