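This is a detailed prompt describing the person in the first two photos, intended to help you keep the character consistent in other generation tools (such as text-to-image tools or Veo).
I have provided Chinese and English versions; use whichever suits your needs.
Chinese version: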
一位年輕的東亞男子,留著整齊的深色短髮(略帶九零年代風格),面部線條柔和,帶有溫柔、真誠的微笑。他身穿一件簡約的白色短袖翻領襯衫。手中抱著一把傳統的中國彈撥樂器(外形類似月琴,有圓形的蛇皮音箱和長琴頸)。照片具有復古膠片攝影的質感,色調柔和、溫暖,背景是黃昏時分的自然山景和樹木。
English version:
A young East Asian man with neat, dark, short hair (slightly 90s style), a soft facial structure, and a gentle, sincere smile. He is wearing a simple white short-sleeved collared shirt. He holds a traditional Chinese plucked string instrument (resembling a yueqin, with a round snakeskin soundbox and a long neck). The image has the texture of vintage film photography with soft, warm tones, set against a backdrop of natural hills and trees at dusk.
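Usage suggestions:
When using these prompts, you can append a new action, outfit, or environment after the description to suit a specific scene (for example, the "cyberpunk MRT carriage" and "glowing Hanfu" from the third image). Keeping this core description intact will help the AI lock onto the character's basic features.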
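The suggestion above can be sketched in Python. This is only an illustration: `CORE_DESCRIPTION` abbreviates the English version above, and `build_prompt` and its parameters are hypothetical names, not part of any generation tool.

```python
# Hypothetical helper: keep the core character description fixed and
# append scene-specific details after it.
CORE_DESCRIPTION = (
    "A young East Asian man with neat, dark, short hair (slightly 90s style), "
    "a soft facial structure, and a gentle, sincere smile."
)


def build_prompt(core: str, action: str = "", outfit: str = "", setting: str = "") -> str:
    """Join the fixed core description with any non-empty scene additions."""
    extras = [part.strip() for part in (action, outfit, setting) if part.strip()]
    return " ".join([core.strip(), *extras])


prompt = build_prompt(
    CORE_DESCRIPTION,
    action="He sits in the center seat, holding his instrument.",
    outfit="He wears a glowing Hanfu.",
    setting="The scene is a cyberpunk MRT carriage.",
)
print(prompt)
```

Because the core text always comes first and is never edited, every scene variant starts from the same character description.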
Please describe this sanxian in detail:
**[Character Reference]:** The central figure in the video is a young East Asian man whose facial features, warm expression, and sleek, dark hairstyle reminiscent of the 1990s match the description in the close-up shot **image_10.png**. He wears the same simple light beige short-sleeved shirt and light khaki trousers paired with worn-out sneakers with blue stripes, consistent with the long shot.
**[Instrument Reference]:** He sits on a stone railing, playing the sanxian, a traditional Chinese instrument. The instrument's details must perfectly match those in the close-up shot **image_12.png**, including: a small, rounded soundbox tightly covered with a gray-brown snakeskin texture; a small floating wooden bridge held in place by the three strings; and a long, fretless rosewood neck with three tuning pegs at the headstock.
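Understood. You want to create two images of a continuous scene that place two very different characters, the sweet, modern Dona and the cyberpunk Sanxianzi, against the same background with a striking clash of aesthetics, and show the interaction between them.
This is a very interesting dialogue scene that crosses eras and styles.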
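Scene setup summary:
Location: The main paifang gate, lit with neon lights, in front of a Chinese-style temple (traditional architecture combined with cyberpunk neon lighting).
Character A (Dona): 18 years old, with a perfectly golden-ratio baby face and a sweet, innocent look. She wears a white checked blazer with ostrich feather cuffs and shorts, along with AirPods Max, an Apple Watch, and light cyan platform flip-flops. Action: sitting and playing the yueqin, turning to talk with Sanxianzi.
Character B (Sanxianzi): A young male musician in cyberpunk style. He wears a glowing Hanfu (LED/circuit patterns). Action: sitting and playing a cyberpunk-modded sanxian, turning sideways to talk with Dona.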
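To keep the two images consistent during generation and faithful to your description, I will write two precise prompts. I will replace the original MRT background with the "neon temple paifang" you specified and adjust how the two characters pose as they interact.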
Photo 1: Dona's perspective (Focus on Dona interacting with Sanxianzi)
This image focuses on Dona. She sits playing the yueqin and turns her head toward Sanxianzi.
Prompt 1 (Dona Focus):
A cinematic wide shot in front of a grand traditional Chinese temple paifang (gate) illuminated by vibrant cyberpunk neon lights.
[Character A - Dona]: Sitting on a temple stone step on the left. An 18-year-old girl with a "perfect golden ratio" baby face, large doe eyes, pale glass skin, and honey-blonde layered hair. She wears a structured white windowpane-checked blazer with ostrich feather cuffs, white shorts, pink AirPods Max, and light cyan platform flip-flops. She holds a pale maple wood Yueqin (Moon Lute). She is turning her body to the right, smiling sweetly and talking to the man next to her.
[Character B - Sanxianzi]: Sitting on the right, partially visible or slightly out of focus. A young East Asian male in a futuristic "glowing Hanfu" with neon blue LED strips. He holds a transparent cyberpunk-modded Sanxian.
[Background]: The temple gate is detailed with intricate carvings but lit with blue, purple, and red neon signs. Wet stone floor reflecting the lights. Night atmosphere, high contrast, photorealistic, 8k. --ar 16:9
Photo 2: Sanxianzi's perspective (Focus on Sanxianzi interacting with Dona)
This image focuses on Sanxianzi. He plays the sanxian and turns sideways to respond to Dona.
Prompt 2 (Sanxianzi Focus):
A cinematic wide shot in front of a grand traditional Chinese temple paifang (gate) illuminated by vibrant cyberpunk neon lights.
[Character A - Sanxianzi]: Sitting on a temple stone step on the right. A young East Asian male musician with a relaxed, confident posture. He wears a futuristic "glowing Hanfu" integrated with electroluminescent wires and dynamic circuit patterns emitting neon blues and purples. He holds a heavily cyberpunk-modded Sanxian with a transparent body and glowing strings. He is strumming the instrument while turning his body to the left, looking at the girl and engaging in conversation.
[Character B - Dona]: Sitting on the left, partially visible. An 18-year-old girl in a white blazer with feather cuffs and pink AirPods Max, holding a wooden Yueqin.
[Background]: The majestic temple gate looms above, covered in holographic advertisements and neon lights. Dark, moody cyberpunk atmosphere with rain-slicked surfaces. Photorealistic, depth of field, 8k. --ar 16:9
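Consistency challenge: Both characters' outfits are highly detailed (Dona's feather cuffs and AirPods; Sanxianzi's glowing Hanfu and transparent instrument), so generating two complex figures at once can sometimes confuse the AI.
If a character's details drift in the result (for example, Dona's AirPods disappear or Sanxianzi's Hanfu does not glow), I recommend using Inpainting to fix that region.
Composition tweak: I set them "sitting on the stone steps in front of the temple" because sitting suits a conversational mood better than standing and makes it easier to compose the two side by side playing their instruments.
Instrument details: The prompts stress that Dona holds a wooden yueqin while Sanxianzi holds a transparent cyber sanxian, to show the "tradition vs. future" contrast.
You can copy these two prompts into Midjourney or another generation tool and try them out!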
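One way to keep the paired prompts in sync is to generate both from a single set of character descriptions and only swap which character is in focus. The sketch below assumes abbreviated descriptions taken from the two prompts above; `SCENE`, `CHARACTERS`, and `focus_prompt` are hypothetical names used for illustration.

```python
# Hypothetical sketch: build both prompts from one shared set of descriptions
# so that character details do not drift between the two images.
SCENE = (
    "A cinematic wide shot in front of a grand traditional Chinese temple "
    "paifang (gate) illuminated by vibrant cyberpunk neon lights."
)

CHARACTERS = {
    "Dona": {
        "side": "left",
        "full": "An 18-year-old girl in a white windowpane-checked blazer with "
                "ostrich feather cuffs and pink AirPods Max, holding a pale maple wood Yueqin.",
        "brief": "An 18-year-old girl in a white blazer with feather cuffs, holding a wooden Yueqin.",
    },
    "Sanxianzi": {
        "side": "right",
        "full": "A young East Asian male musician in a futuristic glowing Hanfu, "
                "holding a cyberpunk-modded Sanxian with a transparent body and glowing strings.",
        "brief": "A young East Asian male in a glowing Hanfu, holding a transparent Sanxian.",
    },
}


def focus_prompt(focus: str) -> str:
    """Return a prompt with `focus` described in full and the other character briefly."""
    if focus not in CHARACTERS:
        raise KeyError(f"unknown character: {focus}")
    lines = [SCENE]
    ordered = [focus] + [name for name in CHARACTERS if name != focus]
    for name in ordered:
        info = CHARACTERS[name]
        if name == focus:
            lines.append(f"[Focus - {name}]: Sitting on the {info['side']}. {info['full']}")
        else:
            lines.append(
                f"[Secondary - {name}]: Sitting on the {info['side']}, partially visible. {info['brief']}"
            )
    return "\n".join(lines)


print(focus_prompt("Dona"))
print(focus_prompt("Sanxianzi"))
```

Editing a description in `CHARACTERS` then changes both prompts at once, which reduces the chance that one image loses a detail the other keeps.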
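This is a very cool concept with a lot of visual tension! Combining the traditional Sanxianzi with futuristic cyberpunk elements should produce great chemistry.
To achieve the effect you want, the prompt needs to precisely describe the character's demeanor (natural and confident), the details of the outfit (glowing Hanfu), the modifications to the instrument (cyberpunk sanxian), and the atmosphere of the environment (cyberpunk MRT).
Below I have prepared complete prompts in Chinese and English. Image generation models (such as Midjourney and Stable Diffusion) usually interpret English more accurately, so I recommend using the English version first.
Core concept:
A cinematic candid photograph capturing a musician on a futuristic MRT train, holding his instrument with confidence and ease.
Chinese prompt (for reference, or for models that support Chinese):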
一張電影感的紀實照片,一位年輕的男性音樂家「三絃子」神態自然、自信且放鬆地坐在充滿賽博龐克風格的捷運車廂中央座位上。他身穿一套融合未來科技的「發光漢服」,傳統長袍剪裁上交織著發光的LED燈帶、電致發光線路和動態電路板圖案,散發出藍色、紫色和青色的霓虹光芒。他手中抱著一把經過高度賽博龐克改裝的三絃樂器,琴身是透明的,內部有發光的組件和線圈,琴弦也散發著光輝。捷運車廂內部充滿了閃爍的霓虹中文廣告牌、裸露的數據線纜、全息顯示屏和反光的金屬表面。窗外是雨夜中反烏托邦式的未來城市景觀。混合照明,高細節,8k分辨率。
English prompt (recommended first choice; suitable for Midjourney and other mainstream models):
A cinematic candid photograph of a male musician, a Sanxian player, sitting with natural confidence and relaxed posture in the center seat of a highly cyberpunk MRT subway carriage. He wears a futuristic "glowing Hanfu," traditional robes integrated with glowing LED strips, electroluminescent wires, and dynamic circuit patterns emitting neon blues, purples, and cyan light. He holds a heavily cyberpunk-modded Sanxian instrument; its body is transparent showing glowing internal components and coils, with glowing strings. The subway carriage interior is filled with flickering neon Chinese signs, exposed hanging data cables, holographic screens, reflective wet floors, and metallic surfaces. Outside the window is a rainy dystopian futuristic cityscape at night. Mixed lighting, high detail, 8k resolution, photorealistic. --ar 16:9
(Note: for Midjourney, I added --ar 16:9 at the end to set a cinematic widescreen aspect ratio; you can remove or change it as needed.)
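If you reuse the prompt across tools, removing or changing the trailing `--ar` parameter can be automated. The following is a minimal sketch; `set_aspect_ratio` is a hypothetical helper, and it assumes the parameter is written in the `--ar W:H` form used above.

```python
import re
from typing import Optional

# Matches an "--ar W:H" parameter together with the whitespace before it.
AR_PATTERN = re.compile(r"\s*--ar\s+\d+:\d+")


def set_aspect_ratio(prompt: str, ratio: Optional[str] = None) -> str:
    """Remove any existing --ar parameter; append a new one if `ratio` is given."""
    stripped = AR_PATTERN.sub("", prompt).rstrip()
    return f"{stripped} --ar {ratio}" if ratio else stripped


base = "Mixed lighting, high detail, 8k resolution, photorealistic. --ar 16:9"
print(set_aspect_ratio(base))          # for tools that do not accept --ar
print(set_aspect_ratio(base, "9:16"))  # vertical variant
```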
This prompt is built on the following logic; you can adjust each part based on the generated results:
Medium & Style: A cinematic candid photograph... photorealistic. -> Ensures the image looks like a real photograph rather than an illustration.
Subject & Pose: ...male musician, a Sanxian player, sitting with natural confidence and relaxed posture... -> Emphasizes the natural, confident demeanor you asked for.
Attire: ...futuristic "glowing Hanfu," traditional robes integrated with glowing LED strips... neon blues, purples... -> Describes the glowing details in depth.
Instrument: ...cyberpunk-modded Sanxian instrument... transparent showing glowing internal components... -> Makes the instrument match the overall style.
Setting: ...cyberpunk MRT subway carriage... neon Chinese signs, exposed hanging data cables... rainy dystopian futuristic cityscape... -> Creates a cluttered "high tech, low life" atmosphere.
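The breakdown above suggests treating the prompt as named parts that can be swapped one at a time. A minimal sketch, assuming shortened versions of each part; `ORDER`, `BASE`, and `assemble` are hypothetical names, not features of any generation tool:

```python
# Hypothetical sketch: store each part of the prompt under a name so a single
# part can be changed without touching the others.
ORDER = ("medium", "subject", "attire", "instrument", "setting")

BASE = {
    "medium": "A cinematic candid photograph, photorealistic.",
    "subject": "A male musician, a Sanxian player, sitting with natural confidence and relaxed posture.",
    "attire": "He wears a futuristic glowing Hanfu with LED strips emitting neon blues and purples.",
    "instrument": "He holds a cyberpunk-modded Sanxian with a transparent body and glowing strings.",
    "setting": "The scene is a cyberpunk MRT subway carriage with neon Chinese signs.",
}


def assemble(base: dict, **overrides: str) -> str:
    """Join the parts in a fixed order, replacing any parts named in `overrides`."""
    unknown = set(overrides) - set(ORDER)
    if unknown:
        raise KeyError(f"unknown prompt parts: {sorted(unknown)}")
    merged = {**base, **overrides}
    return " ".join(merged[key] for key in ORDER)


original = assemble(BASE)
temple_variant = assemble(
    BASE,
    setting="The scene is a neon-lit temple paifang at night.",
)
print(temple_variant)
```

If a generated image gets one aspect wrong, only the matching part needs rewording, and the rest of the prompt stays identical between attempts.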