Streaming Responses
Receive Lawbot AI's responses token by token in real time via Server-Sent Events (SSE) streaming, for a smoother user experience.
How Streaming Works
Set stream: true in the request, and the API returns a series of data: events in Server-Sent Events format, each containing one JSON chunk. When the stream ends, a data: [DONE] event is sent.
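As a rough sketch of the parsing this implies (the chunk payloads below mirror the SSE format shown on this page; the parse_sse_line helper is illustrative, not part of any SDK):

```python
import json

def parse_sse_line(line: str):
    """Parse one SSE line; return the JSON chunk, or None for [DONE] and non-data lines."""
    if not line.startswith("data: "):
        return None
    data = line[len("data: "):]
    if data == "[DONE]":
        return None
    return json.loads(data)

# Sample lines in the same shape as the SSE response format below
lines = [
    'data: {"choices":[{"delta":{"role":"assistant"},"index":0}]}',
    'data: {"choices":[{"delta":{"content":"根據"},"index":0}]}',
    'data: {"choices":[{"delta":{"content":"勞動基準法"},"index":0}]}',
    'data: [DONE]',
]

# Concatenate the delta fragments into the full reply
text = ""
for line in lines:
    chunk = parse_sse_line(line)
    if chunk is None:
        continue
    text += chunk["choices"][0]["delta"].get("content", "")

print(text)  # → 根據勞動基準法
```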
cURL Streaming Example
terminal
curl http://127.0.0.1:8000/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "lawbot-pro",
    "stream": true,
    "messages": [
      { "role": "user", "content": "請說明勞動基準法中的資遣費計算方式" }
    ]
  }'

SSE response format:
sse_response.txt
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"role":"assistant"},"index":0}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":"根據"},"index":0}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":"勞動基準法"},"index":0}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{"content":"第 17 條"},"index":0}]}
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","choices":[{"delta":{},"finish_reason":"stop","index":0}]}
data: [DONE]

Python Streaming
Using the OpenAI Python SDK, which handles SSE parsing automatically:
streaming.py
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="http://127.0.0.1:8000",
)

# Pass stream=True to enable streaming
stream = client.chat.completions.create(
    model="lawbot-pro",
    stream=True,
    messages=[
        {
            "role": "user",
            "content": "請說明勞動基準法中的資遣費計算方式",
        }
    ],
)

# Process the response chunk by chunk
for chunk in stream:
    content = chunk.choices[0].delta.content
    if content is not None:
        print(content, end="", flush=True)
print()  # trailing newline

Node.js / TypeScript Streaming
Using the OpenAI Node.js SDK, iterating asynchronously with for await...of:
streaming.ts
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.LAWBOT_API_KEY,
  baseURL: "http://127.0.0.1:8000",
});

const stream = await client.chat.completions.create({
  model: "lawbot-pro",
  stream: true,
  messages: [
    {
      role: "user",
      content: "請說明勞動基準法中的資遣費計算方式",
    },
  ],
});

// Process the response chunk by chunk
for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}

Handling SSE with the Native Fetch API
If you are not using an SDK, you can parse the SSE stream manually with the native Fetch API:
raw_stream.ts
const response = await fetch("http://127.0.0.1:8000/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "lawbot-pro",
    stream: true,
    messages: [{ role: "user", content: "..." }],
  }),
});

const reader = response.body!.getReader();
const decoder = new TextDecoder();
let buffer = "";

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  // { stream: true } holds partial multi-byte characters until the next chunk
  buffer += decoder.decode(value, { stream: true });
  // A network chunk may end mid-line; keep the incomplete tail in the buffer
  const lines = buffer.split("\n");
  buffer = lines.pop() ?? "";
  for (const line of lines) {
    if (!line.startsWith("data: ")) continue;
    const data = line.slice(6); // strip the "data: " prefix
    if (data === "[DONE]") continue; // the reader reports done shortly after
    const parsed = JSON.parse(data);
    const content = parsed.choices[0]?.delta?.content;
    if (content) process.stdout.write(content);
  }
}

In streaming mode, the usage field usually appears only in the final chunk, or not at all. If you need an exact token count, use non-streaming mode instead.
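When the server does include usage, it typically arrives on the final chunk before [DONE]. A minimal sketch of reading it defensively (the chunk payload and token counts below are illustrative, not actual server output):

```python
import json

# A hypothetical final chunk; real servers may omit the usage field entirely
final_chunk = json.loads(
    '{"id":"chatcmpl-abc","object":"chat.completion.chunk",'
    '"choices":[{"delta":{},"finish_reason":"stop","index":0}],'
    '"usage":{"prompt_tokens":21,"completion_tokens":87,"total_tokens":108}}'
)

# usage may be absent, so fall back to None rather than assuming it exists
usage = final_chunk.get("usage")
if usage is not None:
    print(usage["total_tokens"])  # → 108
```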