GPT-4V 進階:透過 GPT-4V 讓 DALL-E 3 以文生圖創造風格近似的作品

GPT-4V 和 DALL-E 3 已經成為最引人注目的技術之一。GPT-4,作為一種先進的語言預測模型,能夠理解和生成極其自然且有深度的文本內容。另一方面, DALL-E 3 是一個圖像生成模型,能夠根據文本描述創建令人驚嘆的圖像。這兩者的結合打開了創意表達和視覺藝術的新疆界,使我們能夠創造出風格近似的作品,這在過去是不可想像的。

利用 GPT-4V 來引導 DALL-E 3 生成風格近似的圖像不僅僅是一個技術挑戰,同時也是一個藝術創作的過程。這種方法可以被廣泛應用於廣告、設計、藝術創作等領域,為我們提供了一個全新的方式來表達創意並實現視覺上的創新。

本文的主要目標是探討如何使用 GPT-4V 來生成描述性文本,進而指導 DALL-E 3 創建風格一致且富有創意的圖像作品。

實現風格近似的策略

當我們找到一張理想的參考圖時(如本文首圖),我們先儲存它,並上傳至 GPT-4V (感謝新版本提供的多模態讀圖能力!) ,並輸入以下 prompt :

Please elaborate on the details of this image to be used as a prompt input for DALL-E.

在句子中我們使用 DALL-E 而不是 DALL-E 3 是因為在 GPT-4V 的離線版本中它的最新知識是截至2023年4月,這時 DALL-E 3 並沒有發佈,所以我們在句子中只提到 DALL-E ,模型相同並不會有溝通語法上的差異。

輸出

Certainly! Here's a detailed description of the image to use as a prompt for DALL-E:

"A cosmic-themed illustration of a mystical witch. The witch has flowing, wavy hair that seamlessly merges with the vastness of the universe. Her wide-brimmed hat spirals elegantly upwards, dotted with stars, galaxies, and celestial motifs. The hat encompasses a moon and a radiant butterfly, which appears to be a part of the cosmic tapestry. The witch's skin has a porcelain hue, and her eyes have an enigmatic and alluring gaze. She holds a shimmering star between her fingers, emitting a faint glow. Her attire and form are engulfed in a whirl of galactic swirls, stardust, and luminous splatters, representing the vast and mysterious expanse of the cosmos."

I hope this description aids in your endeavor with DALL-E!

我們知道在 AI 生成繪圖中,溝通的細節越完整, AI 創作出來的圖片層次就越豐富。一般而言,圖片生成需要考量的變數:

  • 主題
  • 環境
  • 氣氛
  • 燈光
  • 色彩
  • 構圖
  • 藝術風格

我們整合這些變數,針對首圖輸入具備更多細節的 prompt :

Please describe this image in detail, including the theme, setting, atmosphere, lighting, color, composition, and artistic style, to be used to output the DALL-E prompt.

這時 GPT-4V 就給了非常詳盡的 prompt 敘述:

Certainly!

**Theme**: The image captures the essence of cosmic mysticism, blending a portrait of a sorceress or witch with celestial motifs, suggesting her deep connection or embodiment of the universe.

**Setting**: The setting is abstract, not rooted in any tangible place, creating an ethereal and otherworldly atmosphere that's enhanced by the presence of stars, galaxies, and a luminous butterfly.

**Atmosphere**: The overall feel is one of enchantment and wonder, emphasizing the vast mysteries of the cosmos and the magic it holds. The image carries a sense of harmony between the ethereal witch and the universe around her.

**Lighting**: There isn't a specific light source; instead, the entire image appears illuminated from within. The luminescence of the star the witch holds, the glow of her hat, and the various scattered stars provide a soft, dreamy light.

**Color**: The palette is anchored in rich blues and teals reminiscent of the night sky or deep space. Pops of white and light teal, especially within the stars and the celestial butterfly, add contrast and vibrancy.

**Composition**: The witch is central and is the primary focus. Her gaze, the position of her hand holding the star, and the spiral of her hat all guide the viewer's eyes in a circular motion, creating a dynamic and balanced composition. The flowing elements, like her hair and the galactic patterns, give the artwork movement and fluidity.

**Artistic Style**: This is a fusion of illustrative and graphic art. There's a high level of detail, particularly in the line work of the witch's features and the intricate patterns of the cosmos. While there's a touch of realism in the rendering of the face, the celestial motifs introduce an abstract, surreal dimension.

**DALL-E Prompt**: "An ethereal witch with wavy hair set against an abstract backdrop of the cosmos. The central figure holds a glowing star and wears a spiraled hat adorned with galaxies, a luminous butterfly, and celestial elements. The color scheme revolves around deep blues, teals, and radiant whites, creating a dreamy and magical atmosphere. The art style is a blend of detailed illustration and graphic surrealism."

I hope this description will be helpful for your DALL-E prompt!

它先針對我們所提出的變數做細節描述,最後再給出一個 prompt 。由於對細節的描述多寡會決定最後圖片生成的細節程度,所以我們決定將它所提供的 prompt 及各個變數的描述分別作為一個 prompt 進行測試比較差異性。

實際操作

版本1

輸入

A cosmic-themed illustration of a mystical witch. The witch has flowing, wavy hair that seamlessly merges with the vastness of the universe. Her wide-brimmed hat spirals elegantly upwards, dotted with stars, galaxies, and celestial motifs. The hat encompasses a moon and a radiant butterfly, which appears to be a part of the cosmic tapestry. The witch's skin has a porcelain hue, and her eyes have an enigmatic and alluring gaze. She holds a shimmering star between her fingers, emitting a faint glow. Her attire and form are engulfed in a whirl of galactic swirls, stardust, and luminous splatters, representing the vast and mysterious expanse of the cosmos.

輸出

Here are the cosmic-themed illustrations of the mystical witch based on your description. I hope you find them captivating! If you'd like any changes or further details, please let me know.

圖片1-1

prompt: Illustration of a mystical witch with a cosmic theme. The witch has flowing, wavy hair that seamlessly merges with the universe, depicting stars, galaxies, and nebulae. Her wide-brimmed hat spirals upwards, adorned with celestial motifs, stars, and galaxies. A moon and a radiant butterfly are incorporated into the hat’s design. The witch’s skin is porcelain-toned, and her captivating eyes reflect the mysteries of the cosmos. She delicately holds a shimmering star that emits a soft glow. Her form is surrounded by galactic swirls, stardust, and luminous splatters, symbolizing the vastness of the universe.

圖片1-2

prompt: Cosmic-themed illustration portraying a witch whose essence intertwines with the universe. Her long, wavy hair transitions into a canvas of stars and galaxies. A spiraling, wide-brimmed hat sits atop her head, dotted with celestial elements, including a moon and a glowing butterfly. The witch’s porcelain skin contrasts with her mysterious eyes, which seem to hold the secrets of the cosmos. In her hand, she gracefully holds a star that casts a gentle light. Her entire being is enveloped in a dance of galactic swirls and stardust, representing the infinite beauty of the cosmos.

版本2

輸入

An ethereal witch with wavy hair set against an abstract backdrop of the cosmos. The central figure holds a glowing star and wears a spiraled hat adorned with galaxies, a luminous butterfly, and celestial elements. The color scheme revolves around deep blues, teals, and radiant whites, creating a dreamy and magical atmosphere. The art style is a blend of detailed illustration and graphic surrealism.

輸出

Here are the illustrations of the ethereal witch set against a cosmic backdrop, based on your description. The color scheme and art style have been captured as per your request. I hope you find them to your liking! If you have any feedback or further requests, please let me know.

圖片2-1

prompt: Detailed illustration with graphic surrealism of an ethereal witch against an abstract cosmic background. The witch has wavy hair and holds a glowing star in her hand. She wears a spiraled hat decorated with galaxies, a luminous butterfly, and other celestial motifs. The color palette consists of deep blues, teals, and radiant whites, creating a dreamy and enchanting ambiance.

圖片2-2

prompt: Artistic blend of detailed illustration and graphic surrealism depicting an ethereal witch with wavy hair. She stands against a mesmerizing abstract backdrop of the cosmos, holding a radiant star. Her spiraled hat is adorned with galaxies, a glowing butterfly, and celestial elements. The dominant colors are deep blues, teals, and luminous whites, setting a magical and dreamlike tone to the piece.

版本3

輸入

Theme: The image captures the essence of cosmic mysticism, blending a portrait of a sorceress or witch with celestial motifs, suggesting her deep connection or embodiment of the universe.

Setting: The setting is abstract, not rooted in any tangible place, creating an ethereal and otherworldly atmosphere that's enhanced by the presence of stars, galaxies, and a luminous butterfly.

Atmosphere: The overall feel is one of enchantment and wonder, emphasizing the vast mysteries of the cosmos and the magic it holds. The image carries a sense of harmony between the ethereal witch and the universe around her.

Lighting: There isn't a specific light source; instead, the entire image appears illuminated from within. The luminescence of the star the witch holds, the glow of her hat, and the various scattered stars provide a soft, dreamy light.

Color: The palette is anchored in rich blues and teals reminiscent of the night sky or deep space. Pops of white and light teal, especially within the stars and the celestial butterfly, add contrast and vibrancy.

Composition: The witch is central and is the primary focus. Her gaze, the position of her hand holding the star, and the spiral of her hat all guide the viewer's eyes in a circular motion, creating a dynamic and balanced composition. The flowing elements, like her hair and the galactic patterns, give the artwork movement and fluidity.

Artistic Style: This is a fusion of illustrative and graphic art. There's a high level of detail, particularly in the line work of the witch's features and the intricate patterns of the cosmos. While there's a touch of realism in the rendering of the face, the celestial motifs introduce an abstract, surreal dimension.

輸出

Here are the artworks that capture the essence of cosmic mysticism, blending the portrait of a sorceress with celestial motifs based on your detailed description. I hope these illustrations resonate with the enchanting and magical atmosphere you envisioned. If you have any feedback or further requests, please let me know.

圖片3-1

prompt: Illustrative and graphic art fusion of a sorceress embodying cosmic mysticism. Set against an abstract and ethereal background, the central witch figure conveys a deep connection with the universe. The atmosphere exudes enchantment, with stars, galaxies, and a luminous butterfly reinforcing the sense of wonder. Illumination comes from the glow of the star in her hand, her hat, and scattered stars, creating a dreamy ambiance. The rich blues and teals of deep space dominate, with contrasting whites and light teals highlighting the celestial elements. The composition emphasizes the witch’s centrality, her gaze, and the flow of galactic elements, offering movement and balance.

圖片3-2

prompt: Artwork capturing the essence of cosmic mysticism through a portrait of a witch intertwined with celestial motifs. The setting is abstract and otherworldly, with stars, galaxies, and a glowing butterfly enhancing the ethereal atmosphere. The sense of magic and mystery is palpable, emphasizing the harmony between the witch and the cosmos. The image radiates light from within, thanks to the luminescence of the star, the witch’s hat, and various celestial elements. Deep blues and teals contrast with vibrant whites and light teals. The witch, central in the composition, guides the viewer’s gaze in a circular motion, with flowing elements adding fluidity. The style blends detailed illustration with surreal graphics.

結語

從上面的實際操作中我們可以觀察到隨著 prompt 描述的細節越多, DALL-E 3 的圖像生成細節層次就豐富,而且 DALL-E 3 在我們沒有指定輸出比例格式的狀況下,隨著 prompt 描述的細節,它也會跟著圖片的狀態去調整最佳輸出比比例,不限定 1:1 格式。最後,不管是版本1、版本2、版本3,圖片生成的風格大抵都與原圖有著近似的風格與元素。讀者也可以根據我們提供的範例配合自身的需求多加測試及調整。

進一步閱讀
  1. 直接複製!使用 DALL-E 3 新增的67種圖片風格,解放 AI 創作潛能 (附實例)
  2. DALL-E 3 新功能與實踐指南
  3. 如何用 Gen ID 讓 DALL-E 3 輸出相同風格的圖片
  4. 如何在 DALL-E 3 中將數種不同物體融合為一
  5. 如何在 DALL-E 3 的圖片上增加文字
  6. DALL-E 3 中的光線參數 (上)
  7. 六種適用於東方特色的 DALL-E 3 創作風格
  8. 如何利用 Negative Prompt 優化 DALL-E 3 圖像生成
  9. 如何透過修改 Seed 讓 DALL-E 3 逐步生成完美圖片
  10. DALL-E 3 中的光線參數 (下)
  11. DALL-E 3 的角度參數及其影響