Building a Translation Agent with LangGraph

In the first half of 2024, Andrew Ng released a project called "Translation Agent: Agentic translation using reflection workflow". At the time, reasoning models had not yet appeared, and the strongest models available were roughly GPT-4/GPT-4-Turbo. The core idea of the project is to use Chain of Thought and Reflection to turn a simple "translate X into Y" request into three steps (translate, reflect, refine), thereby improving translation quality. A similar idea appears in 《翻译 GPT 的提示词更新和优化》.

In this post, following the "Workflow: Prompt chaining" pattern from "Building Effective Agents", I use LangGraph to build a translation agent demo.

(Figure: The prompt chaining workflow)
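Before bringing in LangGraph, the pattern itself can be sketched in plain Python: each step is a function that reads a shared state dict and returns the keys it updates, and a gate decides whether the reflect/refine steps run at all. This is a dependency-free sketch with stub functions standing in for the LLM calls; all names here are illustrative, not part of the real implementation below.

```python
def translate_step(state: dict) -> dict:
    # Stub for the LLM translation call.
    return {"initial_translation": f"[draft of: {state['original_text']}]"}

def gate(state: dict) -> str:
    # Stub for the LLM quality verdict; the real gate asks a reviewer model.
    return "end_process" if state.get("quality_check") == "Good" else "continue_to_refine"

def reflect_step(state: dict) -> dict:
    # Stub for the LLM reflection call.
    return {"reflection": "tighten wording"}

def refine_step(state: dict) -> dict:
    # Stub for the LLM refinement call.
    return {"final_translation": state["initial_translation"] + " (refined)"}

def run(state: dict) -> dict:
    """Chain the steps, merging each step's partial output into the state."""
    state = {**state, **translate_step(state)}
    if gate(state) == "end_process":
        return {**state, "final_translation": state["initial_translation"]}
    state = {**state, **reflect_step(state)}
    return {**state, **refine_step(state)}

result = run({"original_text": "a quick fox", "quality_check": "Bad"})
```

The rest of the post replaces each stub with a real prompt-plus-model chain and lets LangGraph manage the state and routing.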

Defining the State

Let's create the state for the agent:

```python
from typing import TypedDict, Literal
from langchain_openai import ChatOpenAI


model = ChatOpenAI(
    model="gemini-2.5-flash",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
    api_key=api_key,  # assumed to be defined earlier: your Gemini API key
)


class GraphState(TypedDict):
    original_text: str
    source_language: str
    target_language: str
    initial_translation: str
    quality_check: Literal["Good", "Bad"]
    reflection: str
    final_translation: str
```

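A note on how this state is used: each node returns only the keys it wants to update, and LangGraph merges that partial dict into the shared state. For plain (non-reducer) fields this is conceptually just a dict merge; the following is a simplified sketch of that behavior, not LangGraph's actual internals.

```python
def apply_update(state: dict, node_output: dict) -> dict:
    # Plain channels: the node's returned value overwrites the existing one;
    # keys the node does not return are left untouched.
    return {**state, **node_output}

state = {"original_text": "hello", "initial_translation": ""}
state = apply_update(state, {"initial_translation": "你好"})
```

This is why the node functions below can return dicts like `{"initial_translation": ...}` without copying the whole state.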
Creating the Nodes

Next, we need to create several node functions.

Translation node

This node has the LLM translate the original text.

```python
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser


translate_prompt = PromptTemplate(
    template="""You are an expert translator. Translate the following text from {source_language} to {target_language}.
Provide only the translated text, without any additional explanations or preambles.

Original Text:
{text}
""",
    input_variables=["source_language", "target_language", "text"],
)

translator_chain = translate_prompt | model | StrOutputParser()


def translate(state: GraphState):
    """Take the original text and produce the first-pass translation."""
    initial_translation = translator_chain.invoke({
        "text": state["original_text"],
        "source_language": state["source_language"],
        "target_language": state["target_language"],
    })
    return {"initial_translation": initial_translation}
```

Gate node

This node has the LLM check the quality of the initial translation: if the translation is good, it is returned as-is; if it is bad, the workflow moves on to reflection and refinement.

```python
quality_check_prompt = PromptTemplate(
    template="""You are a strict translation quality reviewer.
Review the initial translation based on the original text.
Does the translation accurately convey the meaning, style, and nuances of the original?
Answer with a single word: "Good" if the translation is excellent and needs no changes, or "Bad" if it has any issues or could be improved.

Original Text ({source_language}):
{original_text}

Initial Translation ({target_language}):
{initial_translation}

Your single-word assessment:
""",
    input_variables=["original_text", "initial_translation", "source_language", "target_language"],
)

quality_check_chain = quality_check_prompt | model | StrOutputParser()


def check_quality(state: GraphState) -> Literal["continue_to_refine", "end_process"]:
    """Check the quality of the initial translation and decide where to go next."""
    quality_assessment = quality_check_chain.invoke({
        "original_text": state["original_text"],
        "initial_translation": state["initial_translation"],
        "source_language": state["source_language"],
        "target_language": state["target_language"],
    })
    if "Good" in quality_assessment:
        return "end_process"
    else:
        return "continue_to_refine"


def finalize_translation(state: GraphState):
    """If the initial translation is already good, use it as the final translation."""
    return {"final_translation": state["initial_translation"]}
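One caveat about the gate: the substring check `"Good" in quality_assessment` relies on the model obeying the single-word instruction, and it would misroute a reply like "Not Good" to the end of the workflow. A slightly more defensive parse of the verdict (a hypothetical helper, not part of the code above) could look like this:

```python
def parse_verdict(raw: str) -> str:
    """Map the reviewer's free-form reply onto the two routing labels."""
    words = raw.lower().split()
    # Strip surrounding quotes/punctuation from the first word of the reply.
    first = words[0].strip('."\',!:;') if words else ""
    # Route to the refine path unless the verdict clearly starts with "good".
    return "end_process" if first == "good" else "continue_to_refine"
```

Defaulting to `"continue_to_refine"` on anything ambiguous errs on the side of an extra reflection pass rather than shipping a doubtful translation.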

Reflection node

When the translation quality falls short, we want the LLM to reflect on the initial translation, for example checking whether the wording is accurate and the phrasing is fluent, and to offer suggestions based on that review.

```python
reflect_prompt = PromptTemplate(
    template="""You are a senior translation reviewer. Your task is to review a translation based on the original text.
Identify any potential issues in the translation regarding fluency, accuracy, terminology, and cultural nuances.
Provide a concise list of constructive feedback and suggestions for improvement.

Original Text ({source_language}):
{original_text}

Initial Translation ({target_language}):
{initial_translation}

Your Reflection and Suggestions:
""",
    input_variables=["original_text", "initial_translation", "source_language", "target_language"],
)

reflection_chain = reflect_prompt | model | StrOutputParser()


def reflect(state: GraphState):
    """Review the first-pass translation and propose improvements."""
    reflection_text = reflection_chain.invoke({
        "original_text": state["original_text"],
        "initial_translation": state["initial_translation"],
        "source_language": state["source_language"],
        "target_language": state["target_language"],
    })
    return {"reflection": reflection_text}
```

Refinement node

This node lets the LLM improve the initial translation based on the reflection.

```python
refine_prompt = PromptTemplate(
    template="""You are a master translator responsible for producing the final version of a translation.
Use the original text, the initial translation, and the reviewer's reflection to create a polished and high-quality final translation.
Integrate the suggestions from the reflection to improve upon the initial version.
Provide only the final, refined translated text.

Original Text ({source_language}):
{original_text}

Initial Translation ({target_language}):
{initial_translation}

Reviewer's Reflection and Suggestions:
{reflection}

Final Polished Translation:
""",
    input_variables=["original_text", "initial_translation", "reflection", "source_language", "target_language"],
)

refine_chain = refine_prompt | model | StrOutputParser()


def refine(state: GraphState):
    """Combine the original text, initial translation, and reflection into the final translation."""
    final_translation = refine_chain.invoke({
        "original_text": state["original_text"],
        "initial_translation": state["initial_translation"],
        "reflection": state["reflection"],
        "source_language": state["source_language"],
        "target_language": state["target_language"],
    })
    return {"final_translation": final_translation}
```

Building the Graph

Let's wire up the prompt chaining workflow.

```python
from langgraph.graph import StateGraph, START, END

workflow = StateGraph(GraphState)

# Add the nodes
workflow.add_node("translate", translate)
workflow.add_node("reflect", reflect)
workflow.add_node("refine", refine)
workflow.add_node("finalize_translation", finalize_translation)

# Wire up the edges
workflow.add_edge(START, "translate")

workflow.add_conditional_edges(
    "translate",
    check_quality,
    {
        "continue_to_refine": "reflect",        # needs improvement: go reflect first
        "end_process": "finalize_translation",  # quality is good: finish directly
    }
)

workflow.add_edge("reflect", "refine")
workflow.add_edge("refine", END)
workflow.add_edge("finalize_translation", END)

# Compile the graph
graph = workflow.compile()
```
(Figure: visualization of the compiled graph)

Running

We can invoke the agent in two ways, depending on the input.

When the input is just a piece of text, we can call the graph directly: `graph.invoke({"original_text": "a quick fox ...", "source_language": "English", "target_language": "Simplified Chinese"})`

More often, we want to pass in a URL and have the agent translate the article behind the link.

```python
from langchain_community.document_loaders import WebBaseLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

user_input = "https://..."

source_language = "English"
target_language = "Simplified Chinese"

loader = WebBaseLoader(user_input)
docs = loader.load()
content_to_translate = docs[0].page_content

text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=2000,           # max characters per chunk
    chunk_overlap=100,         # characters shared between neighboring chunks
    separators=["\n\n", "\n"]  # split on paragraphs, then line breaks
)
chunks = text_splitter.split_text(content_to_translate)

final_translated_text = []

for i, chunk in enumerate(chunks):
    print(f"===== Processing chunk {i+1}/{len(chunks)} =====")

    result_state = graph.invoke({
        "original_text": chunk,
        "source_language": source_language,
        "target_language": target_language
    })

    print("--- Original text ---")
    print(result_state.get('original_text', 'N/A'))

    print("--- Initial translation ---")
    print(result_state.get('initial_translation', 'N/A'))

    if result_state.get('reflection'):
        print("--- Reflection and suggestions ---")
        print(result_state['reflection'])
        print("--- Final translation (refined) ---")
    else:
        print("--- Final translation (no refinement needed) ---")

    print(result_state.get('final_translation', 'N/A'))

    final_translated_text.append(result_state.get('final_translation', ''))

print("Merged final translation:")
print("\n".join(final_translated_text))
```
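To build intuition for `chunk_size` and `chunk_overlap`, here is a toy fixed-stride splitter (assuming `chunk_overlap < chunk_size`; the real `RecursiveCharacterTextSplitter` additionally tries to cut at the listed separators rather than mid-word):

```python
def split_with_overlap(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    # Each chunk starts (chunk_size - chunk_overlap) characters after the
    # previous one, so neighboring chunks share chunk_overlap characters.
    step = chunk_size - chunk_overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
```

With the settings above (chunk_size=2000, chunk_overlap=100), each chunk repeats the last 100 characters of the previous one, which helps the translator keep some context across chunk boundaries.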

Todo

The code above is only a basic prototype of a translation agent and uses fairly simple prompts; borrowing the prompts from Andrew Ng's Translation Agent project should yield better output. We could also draw on the AI 味去除 (de-AI-flavor) project to make the model's output read more like a human translation.