
Commit e51825c

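fix(templates): sync the Chinese and English versions of the image optimization templates

## Main changes

### 1. Update the English templates to match the Chinese versions
- **image2image-optimize**: add the complete intent recognition system (add/delete/replace/enhance) and detailed creative guidance
- **chinese-model-optimize**: add structured output requirements (3-6 sentences, one core dimension per sentence, 2-3 modifiers per noun)
- **photography-optimize**: add structured output requirements and a recommended photography structure

### 2. Add a creative text-to-image template
- Add the creative-text2image template in Chinese and English
- Creative prompt generation based on essence deconstruction and fantastical reconstruction
- Update index.ts to import and export the new templates

### 3. Improve the existing Chinese templates
- Improve the intent recognition ability of the image-to-image template
- Unify the output format requirements

## Scope of impact
- Image optimization template system
- No effect on existing functionality; only template quality is improved

## Technical details
- All templates keep their Chinese and English content fully aligned
- Unified structured output standard
- Stronger natural-language expression, avoiding parameterized syntax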
1 parent 8ac4cd7 commit e51825c

11 files changed: +504 -206 lines changed


packages/core/src/services/template/default-templates/image-optimize/image2image/image2image-optimize.ts

Lines changed: 40 additions & 12 deletions
@@ -24,14 +24,22 @@ export const template: Template = {
 ## 任务理解
 你的任务是将用户的图像修改需求优化为自然语言的图生图提示词,确保在保持原图核心特征的基础上实现用户想要的修改效果。
 
+**关键原则:用户的提示词表达的是"想要改变/添加/删除的内容",而非"对原图已有内容的描述"。**
+
 ## Skills
-1. 图像分析与理解
-- 识别需要保留的核心元素
-- 理解用户的修改意图和程度
+1. 修改意图识别(核心能力)
+- **识别添加意图**:用户描述的新元素(人物、物体、效果)在原图中不存在,需要自然添加
+- **识别删除意图**:用户明确提到"去掉/移除/删除"某元素
+- **识别替换意图**:用户提到"改成/换成/变成",需要替换原有元素
+- **识别增强意图**:用户提到"更/加强/优化"某特征,原图已有但需增强
+- **默认保留原则**:用户未提及的原图元素,默认保留
+
+2. 图像编辑理解
 - 判断修改的可行性与影响
-- 预测整体效果的连贯性
+- 预测新旧元素的融合方式
+- 确保整体效果的连贯性
 
-2. 精确指令构建
+3. 精确指令构建
 - 明确指出保持不变的元素
 - 精确描述需要修改的部分
 - 提供具体的修改方向和程度
@@ -51,26 +59,46 @@ export const template: Template = {
 - 指令清晰、具体、可执行,仅使用自然语言
 
 ## 创作指引
-- 用自然语言清楚表达“保留/修改/增强”的边界
-- 强调与原图在风格、光线、透视与色彩上的自然衔接
-- 依据“Lens 自适应”调整措辞与细节重心(摄影/设计/国风/插画)
+- **首要任务:识别用户描述的是"添加/删除/替换/增强"哪种意图**
+- 用自然语言清楚表达"保留/添加/删除/增强"的边界
+- 对于**添加元素**:明确新元素的位置、大小、姿态、与原图的关系
+- 对于**删除元素**:说明如何自然填补删除后的空白
+- 对于**替换元素**:明确替换范围和新元素特征
+- 对于**增强元素**:说明增强的具体方面和程度
+- 强调新旧元素在风格、光线、透视与色彩上的自然衔接
+- 依据"Lens 自适应"调整措辞与细节重心(摄影/设计/国风/插画)
 - 简洁连贯,无需遵循固定步骤
 
 ## Output Requirements
 - 直接输出优化后的图生图提示词(自然语言、纯文本),推荐长度 3–6 句
 - 禁止添加任何前缀或解释;仅输出提示词本体
-- 明确区分“保留/修改/增强”元素,强调与原图在风格/光线/透视/色彩上的自然衔接
+- **必须明确说明是"添加/删除/替换/增强"操作**,让图生图模型理解修改意图
+- 明确区分"保留/添加/删除/增强"元素,强调与原图在风格/光线/透视/色彩上的自然衔接
 - 不使用任何参数/权重/负面清单
 - 当缺少明确线索时,优先保持画面简洁:注意力集中于主体、边缘干净、背景无杂物
-- 指令精确、可执行、效果自然`
+- 指令精确、可执行、效果自然
+
+## 意图识别示例
+**添加意图**:用户描述了原图不存在的新元素 → 输出应明确"添加XX元素,位置为...,与原图融合方式..."
+**删除意图**:用户说"去掉/移除背景" → 输出应明确"移除XX区域,保持主体完整,自然填补..."
+**替换意图**:用户说"把XX改成YY" → 输出应明确"将XX区域替换为YY,保持其他元素不变..."
+**增强意图**:用户说"让花朵更鲜艳" → 输出应明确"增强花朵的色彩饱和度和层次感,保持其他特征..."
+
+❌ 常见错误:假设原图已有用户描述的元素 → 导致输出"保留XX与YY的关系"(但原图根本没有XX)`
 },
 {
 role: 'user',
 content: `请将以下图像修改需求优化为自然语言的图生图提示词。
 
 重要说明:
-- 基于现有图像进行克制修改,保持原图核心特征
-- 明确“保留元素/修改元素/增强元素”,用自然语言具体描述
+- **用户的提示词是"期望的最终效果",而非"对原图的描述"**
+- **判断意图的关键**:用户描述的元素在原图中是否存在?
+* 若用户描述了原图没有的元素 → **添加意图**(如原图只有花,用户说"人拿着花" → 需添加人)
+* 若用户明确说"去掉/删除/移除" → **删除意图**
+* 若用户说"改成/换成/变成" → **替换意图**
+* 若用户说"更/加强/突出"某特征 → **增强意图**(该特征原图已有)
+- **不要臆测原图内容**:只基于用户提示词与常识判断,不要假设原图有未被提及的复杂元素
+- 明确"保留元素/添加元素/删除元素/增强元素",用自然语言具体描述
 - 不使用任何参数/权重/负面清单或强度数值
 - 修改后效果需与原图在风格、光照、透视上自然衔接
 
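The four intents this template asks the model to tell apart (add, delete, replace, enhance) can be illustrated with a small keyword check. This is only a sketch of the rules written in the prompt, not code from the repository: `EditIntent`, `INTENT_CUES`, and `classifyIntent` are hypothetical names, and the fallback to `'add'` is a simplifying assumption, since real intent recognition is left to the language model, which also weighs what the original image plausibly contains.

```typescript
// Hypothetical illustration of the intent rules stated in the template text.
// None of these names exist in the repository.
type EditIntent = 'add' | 'delete' | 'replace' | 'enhance';

// Cue words copied from the Chinese template and its English counterpart.
// Order matters: "remove" contains "more", so delete cues are checked first.
const INTENT_CUES: ReadonlyArray<[Exclude<EditIntent, 'add'>, readonly string[]]> = [
  ['delete', ['去掉', '移除', '删除', 'remove', 'delete', 'eliminate']],
  ['replace', ['改成', '换成', '变成', 'change to', 'replace with', 'turn into']],
  ['enhance', ['更', '加强', '优化', '突出', 'more', 'strengthen', 'optimize', 'highlight']],
];

function classifyIntent(request: string): EditIntent {
  const text = request.toLowerCase();
  for (const [intent, cues] of INTENT_CUES) {
    if (cues.some((cue) => text.includes(cue))) {
      return intent;
    }
  }
  // Assumption: with no explicit cue, treat the described content as new
  // material to add, mirroring the template's "don't assume the original
  // already contains it" rule.
  return 'add';
}

console.log(classifyIntent('让花朵更鲜艳')); // enhance
console.log(classifyIntent('人拿着花')); // add
```

Plain substring matching is deliberately naive (a cue such as "more" can match inside unrelated words), which is one reason the template delegates this judgment to the model rather than to fixed rules.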

packages/core/src/services/template/default-templates/image-optimize/image2image/image2image-optimize_en.ts

Lines changed: 64 additions & 36 deletions
@@ -1,7 +1,7 @@
 import { Template, MessageTemplate } from '../../../types';
 
 export const template: Template = {
-id: 'image2image-general-optimize_en',
+id: 'image2image-general-optimize-en',
 name: 'Image-to-Image Optimization',
 content: [
 {
@@ -12,67 +12,95 @@ export const template: Template = {
 - Author: prompt-optimizer
 - Version: 1.0.0
 - Language: English
-- Description: Natural-language Image-to-Image prompt optimization based on existing images; preserve core features and describe edits precisely without parameters or weights
+- Description: Specialized in Image-to-Image scenario prompt optimization, providing restrained and natural editing guidance based on existing images
 
 ## Background
-- Image-to-Image differs from Text-to-Image, requiring modifications while preserving original image characteristics
+- Editing based on existing images requires restrained modifications while preserving original image characteristics
 - Need to clearly specify what to preserve, what to modify, and what to enhance
-- Must consider original image composition, style, subjects, and other elements
-- Modification instructions need to be precise and specific, avoiding excessive changes to original intent
-- Need to balance maintaining original image features with achieving user's modification requirements
+- Must consider consistency of original image's composition, style, subject, lighting and color
+- Instructions need to be precise and specific, avoiding excessive changes to original intent
+- Need to balance "preserving original features" with "achieving modification requirements"
 
 ## Task Understanding
-Your task is to optimize simple modification requests into precise Image-to-Image prompts, ensuring user's desired modifications are achieved while maintaining core characteristics of the original image.
+Your task is to optimize user's image modification requests into natural-language Image-to-Image prompts, ensuring desired modifications are achieved while maintaining core characteristics of the original image.
+
+**Key Principle: User's prompt expresses "what to change/add/remove", not "description of what's already in the original image".**
 
 ## Skills
-1. Image Analysis and Understanding
-- Identify core elements that need preservation
-- Understand user's modification intent and degree
-- Judge feasibility and reasonableness of modifications
-- Predict impact of modifications on overall effect
+1. Modification Intent Recognition (Core Ability)
+- **Recognize Addition Intent**: New elements (people, objects, effects) described by user don't exist in original image and need to be naturally added
+- **Recognize Deletion Intent**: User explicitly mentions "remove/delete/eliminate" certain elements
+- **Recognize Replacement Intent**: User mentions "change to/replace with/turn into", need to replace existing elements
+- **Recognize Enhancement Intent**: User mentions "more/strengthen/optimize" certain features, already present in original but need enhancement
+- **Default Preservation Principle**: Elements in original image not mentioned by user are preserved by default
+
+2. Image Editing Understanding
+- Judge feasibility and impact of modifications
+- Predict how new and old elements will blend
+- Ensure coherence of overall effect
 
-2. Precise Instruction Construction
+3. Precise Instruction Construction
 - Clearly specify elements to keep unchanged
 - Precisely describe parts needing modification
 - Provide specific modification direction and degree
-- Use natural language to describe expected style and effects (no parameters/weights)
+- Use natural language to clearly describe expected style and effects (no parameters/weights/numbers)
 
 ## Goals
-- If the request targets a single-object, simple scene, default to: centered single object, clean background, soft ground shadow, clear material expression
+- If request involves single object or simple scene, default to: "centered single object composition, clean background, soft ground shadow, clear material expression"
 - Maintain original image's core composition and main features
 - Precisely achieve user's modification requirements
 - Avoid unnecessary excessive modifications
 - Ensure modified results are natural and harmonious
 
 ## Constrains
 - Must respect original image's basic composition and subjects
-- Modification amplitude should be moderate, avoid complete transformation
-- Maintain original image's overall style coherence
-- Ensure instructions are clear, specific, and executable
+- Modification amplitude should be moderate, avoid unrecognizable transformation
+- Maintain original image's consistency in style/lighting/color/perspective
+- Instructions clear, specific, executable, using natural language only
 
-## Guidance
-- Express preserved/modified/enhanced elements in natural language
-- Emphasize natural consistency with the original (style/lighting/perspective/color)
-- Use Lens Adaptation to shift vocabulary focus (photography/design/Chinese aesthetics/illustration)
-- Keep it concise; steps are not mandatory
+## Creative Guidance
+- **Primary Task: Identify whether user describes "add/delete/replace/enhance" intent**
+- Use natural language to clearly express boundaries of "preserve/add/delete/enhance"
+- For **added elements**: Specify position, size, posture, and relationship with original image
+- For **deleted elements**: Explain how to naturally fill the blank after deletion
+- For **replaced elements**: Specify replacement scope and new element characteristics
+- For **enhanced elements**: Specify enhancement aspects and degree
+- Emphasize natural integration of new and old elements in style, lighting, perspective and color
+- Adjust wording and detail focus based on "Lens Adaptation" (photography/design/Chinese aesthetics/illustration)
+- Concise and coherent, no need to follow fixed steps
 
 ## Output Requirements
-- Directly output optimized Image-to-Image prompt
-- Clearly distinguish preserved elements from modified elements
-- Include specific modification guidance in natural language only (no parameters/weights/negative lists)
-- Ensure instructions are precise, executable, and yield natural results
-- Suitable for mainstream Image-to-Image models`
+- Directly output optimized Image-to-Image prompt (natural language, plain text), recommended length 3–6 sentences
+- Do not add any prefixes or explanations; output only the prompt itself
+- **Must explicitly state "add/delete/replace/enhance" operations** to help Image-to-Image model understand modification intent
+- Clearly distinguish "preserve/add/delete/enhance" elements, emphasize natural integration with original in style/lighting/perspective/color
+- Do not use any parameters/weights/negative lists
+- When explicit clues are lacking, prioritize keeping scene simple: focus attention on subject, clean edges, background without clutter
+- Instructions precise, executable, with natural effects
+
+## Intent Recognition Examples
+**Addition Intent**: User describes new elements not in original → Output should clearly state "add XX element, position at..., blend with original by..."
+**Deletion Intent**: User says "remove/delete background" → Output should clearly state "remove XX area, keep subject intact, naturally fill..."
+**Replacement Intent**: User says "change XX to YY" → Output should clearly state "replace XX area with YY, keep other elements unchanged..."
+**Enhancement Intent**: User says "make flowers more vibrant" → Output should clearly state "enhance color saturation and depth of flowers, maintain other characteristics..."
+
+❌ Common Mistake: Assuming original has elements user described → Results in output "preserve relationship between XX and YY" (but original doesn't have XX at all)`
 },
 {
 role: 'user',
-content: `Please optimize the following simple image modification request into a precise Image-to-Image prompt.
+content: `Please optimize the following image modification request into natural-language Image-to-Image prompt.
 
 Important Notes:
-- This is modification based on existing image, need to maintain core characteristics of original image
-- Please clearly specify elements to preserve and parts to modify
-- Modification instructions should be specific and precise, avoid vague expressions
-- Do not use parameters/weights/negative lists or intensity numbers
-- Ensure modified results are natural and harmonious
+- **User's prompt is "desired final effect", not "description of original image"**
+- **Key to judging intent**: Do elements user describes exist in original image?
+* If user describes elements not in original → **Addition Intent** (e.g., original has only flower, user says "person holding flower" → need to add person)
+* If user explicitly says "remove/delete/eliminate" → **Deletion Intent**
+* If user says "change to/replace with/turn into" → **Replacement Intent**
+* If user says "more/strengthen/highlight" certain feature → **Enhancement Intent** (feature already in original)
+- **Don't speculate original content**: Judge only based on user's prompt and common sense, don't assume original has complex elements not mentioned
+- Clearly state "preserve elements/add elements/delete elements/enhance elements", describe specifically in natural language
+- Do not use any parameters/weights/negative lists or intensity numbers
+- Modified effect needs natural integration with original in style, lighting, perspective
 
 Modification request to optimize:
 {{originalPrompt}}
@@ -82,9 +110,9 @@ Please output precise Image-to-Image optimization prompt:`
 ] as MessageTemplate[],
 metadata: {
 version: '1.0.0',
-lastModified: 1704067200000, // 2024-01-01 00:00:00 UTC (fixed)
+lastModified: 1704067200000, // 2024-01-01 00:00:00 UTC (fixed value, built-in template cannot be modified)
 author: 'System',
-description: 'Image-to-Image specialized prompt optimization template, focused on precise modification guidance based on existing images',
+description: 'Image-to-Image specialized prompt optimization template, using natural language for restrained editing guidance, avoiding parameter and weight syntax',
 templateType: 'image2imageOptimize',
 language: 'en'
 },
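The user message in this template carries a `{{originalPrompt}}` placeholder that is filled in before the messages are sent to the model. The rendering engine the project actually uses is not part of this diff, so the following is only a minimal sketch of the idea. The local `MessageTemplate` interface is an assumed shape based on the `role` and `content` fields visible above, and `renderMessages` is a hypothetical helper.

```typescript
// Assumed shape, inferred from the `role` / `content` fields in the diff;
// the real type is imported from '../../../types' and may differ.
interface MessageTemplate {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Hypothetical helper: replaces {{name}} placeholders with values from
// `variables`, leaving unknown placeholders untouched.
function renderMessages(
  messages: MessageTemplate[],
  variables: Record<string, string>,
): MessageTemplate[] {
  return messages.map((message) => ({
    ...message,
    content: message.content.replace(/\{\{\s*(\w+)\s*\}\}/g, (match, name: string) =>
      Object.prototype.hasOwnProperty.call(variables, name) ? variables[name] : match,
    ),
  }));
}

const rendered = renderMessages(
  [{ role: 'user', content: 'Modification request to optimize:\n{{originalPrompt}}' }],
  { originalPrompt: 'make flowers more vibrant' },
);
console.log(rendered[0].content);
```

Leaving unknown placeholders in place, rather than replacing them with an empty string, makes a missing variable visible in the rendered prompt instead of silently dropping it.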

packages/core/src/services/template/default-templates/image-optimize/text2image/chinese-model-optimize.ts

Lines changed: 10 additions & 7 deletions
@@ -54,24 +54,27 @@ export const template: Template = {
 2. **文化融合**: 识别可以融入的中国文化元素
 3. **语境优化**: 使用地道的中文表达和语言习惯
 4. **意境营造**: 添加符合中式美学的意境描述
-5. **细节完善**: 补充色彩、光线、构图等视觉细节
+5. **细节完善**: 采用3-6句结构化叙述,每句专注1个核心维度
 
 ## Output Requirements
-- 直接输出优化后的提示词(自然语言、纯文本),建议 4–8 句,连贯自然
-- 禁止添加任何前缀(如“优化后的提示词:”)或对提示词的解释说明;仅输出提示词本体
+- 直接输出优化后的提示词(自然语言、纯文本)
+- 禁止添加任何前缀(如"优化后的提示词:")或对提示词的解释说明;仅输出提示词本体
+- 输出结构:3-6个独立但连贯的句子
+- 每句专注1个核心维度(主体、意境、光线/色彩、氛围等)
+- 每个关键名词配2-3个精准修饰词,强调中式美学特征
 - 使用地道中文表达,不使用参数/权重/负面清单
-- 适度融入文化元素,营造中式意境
-- 描述具体生动、富有画面感`
+- 适度融入文化元素,营造中式意境`
 },
 {
 role: 'user',
 content: `请将以下简单的图像描述优化为适合中文图像生成模型的提示词。
 
 重要说明:
 - 中文模型对中文语境和文化元素有更好的理解
-- 请使用地道的中文表达和语言习惯
+- 使用地道的中文表达和语言习惯
 - 可以融入适当的中国文化元素和传统美学
-- 考虑使用水墨、工笔等中式艺术风格
+- 输出3-6个结构化的句子,每句专注1个核心维度
+- 每个关键名词配2-3个精准修饰词
 - 营造富有中式意境的氛围和情感
 
 需要优化的图像描述:
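The new output requirement (3-6 sentences, each focused on one core dimension) is enforced only through the prompt text. If a mechanical check on model output were wanted, the sentence-count part could be approximated as below. This is a sketch under assumptions: `countSentences` and `meetsSentenceRange` are hypothetical helpers that do not exist in the repository, and splitting on sentence-final punctuation is a rough heuristic that ignores ellipses, quoted speech, and decimal points.

```typescript
// Hypothetical helpers, not part of the repository.
// Counts sentences by splitting on Chinese and ASCII sentence-final punctuation.
function countSentences(text: string): number {
  return text
    .split(/[。!?.!?]+/)
    .map((part) => part.trim())
    .filter((part) => part.length > 0).length;
}

// Checks the 3-6 sentence range stated in the template's output requirements.
function meetsSentenceRange(text: string, min = 3, max = 6): boolean {
  const count = countSentences(text);
  return count >= min && count <= max;
}

const sample = '青山如黛,云雾缭绕。一叶扁舟泊于江心。暮色柔和,水面泛着金光。';
console.log(countSentences(sample), meetsSentenceRange(sample));
```

The "one core dimension per sentence" and "2-3 modifiers per key noun" requirements are semantic and are not covered by a check like this.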
