Playing around in ComfyUI.
-
papa-lenin
- Posts: 35
- Joined: Sun Jul 19, 2009 5:24 pm
Re: Playing around in ComfyUI.
Amy loves her job. Amy never disappoints her clients. So when three inebriated goth girls show up asking for "extreme trail tour" she knows exactly where to lead them. Being the swamp tourist guide is never boring
You do not have the required permissions to view the files attached to this post.
-
bogbud
- Posts: 1015
- Joined: Sat May 30, 2015 12:43 am
Re: Playing around in ComfyUI.
papa-lenin wrote:
So I can do a prompt like this and it actually understands it (this is a prompt for the second image posted here, notice how it mostly fits what's in the picture):Code: Select all
....
This is outstanding. The code alone is a full story in itself
papa-lenin wrote:Amy loves her job. Amy never disappoints her clients. So when three inebriated goth girls show up asking for "extreme trail tour" she knows exactly where to lead them. Being the swamp tourist guide is never boring![]()
Oh, and we love Amy! Hope to see more of her adventures!
-
papa-lenin
- Posts: 35
- Joined: Sun Jul 19, 2009 5:24 pm
Re: Playing around in ComfyUI.
bogbud wrote:This is outstanding. The code alone is a full story in itself![]()
Haha, I can see it may appear so. I use basically 2 models on my computer - one is (obviously) the image generation model, but the other one is a normal text LLM (kinda like a small locally ran chatgpt-like model) - so I can create a system prompt which tells the LLM how to behave, show it for example a sketch or a previous image I made, give it pointers on what to do, and then, according to the instructions I wrote, it'll create the actual prompt for the image generation model.
One such "instruction set" I wrote makes these very "poetic" descriptions
The other one is using what is called "JSON-style prompting". This one is really interesting because it is almost like programming the image in a way, using natural language, and is very precise when it comes to making what you want, gives you a big level of control. Only little thing to keep in mind if you want to use Z-Image for this style is to always remove all quotation marks from the prompt, as Z-Image treats everything inside quotations as text to place on the image, so standard JSON prompt will fill the whole image with nonsensical ramblings and code bits
So how does that look like in practice? Something like this (careful, it's huge, lol):
Code: Select all
No text present on the image
{
image_context_profile: {
metadata: {
title: Isobel: The Drowned Swamp Sorceress,
style_classification: Analog Photography / Fantasy Realism,
resolution_hints: High-fidelity, sharp focus on foreground subject with soft depth of field.,
artistic_influences: [
Nature-inspired fantasy art,
Organic texture rendering (mud, fabric),
Soft natural lighting techniques
]
},
environment: {
setting_description: A serene yet dynamic swamp landscape characterized by rich, black clay soil.,
atmosphere: The air feels damp and heavy. The environment is alive with slow-moving organic processes, particularly the oozing of the earth.,
lighting_conditions: {
source: Overhead natural sunlight (diffused),
direction: Top-down with slight lateral fill,
quality: Soft, even illumination minimizing harsh shadows, enhancing skin and fabric textures.
},
background_elements: [
{
element_name: Tall Marsh Reeds,
visual_details: Dense clusters of green grasses rising behind the subject, providing a vertical natural frame.,
positioning: Background layer (Z-depth: Far)
},
{
element_name: Oozing Mud Impressions,
visual_details: Deep cavities in the soft ground filled with clear water, reflecting the sky and surrounding flora. The mud surface appears wet and glossy.,
positioning: Mid-ground to background
}
]
},
people: {
main_character: {
name: Isobel,
role: Swamp Sorceress / Drowned Guardian,
demographics: {
estimated_age_range: Young Adult (20s-30s),
gender: Female
},
physical_appearance: {
facial_features: Possesses distinct elf-like features including pointed ears and a serene yet expressive countenance. Her eyes are framed by delicate wire-rimmed glasses.,
hair_style: A rich, chestnut-red mane styled in a high ponytail with loose tendrils framing the face.
},
attire_details: {
garment_type: Form-fitting bodice with dark leather straps and shoulder armor plating.,
accessories: [
{
item_name: Protective Gauntlets,
material: Black leather or rubberized fabric, heavily textured.
},
{
item_name: Glistening Skin Surface,
description: Her skin and clothing are partially submerged in the mud, creating a 'drowning' effect where earth clings to her form like a second skin.
}
]
}
},
pose_and_activity: {
primary_pose: Standing upright yet anchored, leaning slightly forward with hands gripping a wooden staff.,
expression_analysis: Her gaze is directed downward or slightly inward, conveying deep concentration and an emotional connection to the swamping earth. Her expression balances serenity with the intensity of her immersion.
}
},
objects: {
primary_artifact: {
name: The Anchoring Staff,
description: A sturdy, rustic wooden pole held vertically by Isobel.,
role: Serves as a stabilizing element against the fluid swamp forces, symbolizing her connection to nature's roots.
},
supporting_elements: [
{
name: Mud-Coated Surfaces,
description: The black clay ground exhibits visible stratification and water pockets.,
material_properties: Viscous texture, dark earthy tones (anthracite/brown).
}
]
},
composition: {
framing_technique: Medium shot capturing the subject from mid-thigh to head height.,
focal_point_analysis: Isobel is positioned slightly off-center (Rule of Thirds), drawing the viewer's eye to her interaction with the staff and the surrounding mud.,
visual_balance: {
horizontal: Balanced by the contrast between the subject on the right and the expansive reed bed on the left.,
vertical: Stabilized by the vertical lines of the staff and background flora against the horizontal spread of the swamp.
},
color_palette_analysis: {
dominant_hues: [
Deep Forest Green (Flora),
Rich Chestnut Red (Hair & Accents),
Dark Earth Brown/Black (Mud & Armor)
],
tonal_mood: Natural, grounded, and slightly melancholic.
}
},
symbolism_and_story: {
narrative_arc: The scene captures a pivotal moment of transformation where Isobel is not merely standing in the swamp but actively merging with it. The 'drowning' motif suggests a narrative of sacrifice, resilience, or the acquisition of ancient earth powers.,
emotional_resonance: The image evokes feelings of tranquility amidst change, highlighting themes of growth through grounding and the enduring strength found within nature's depths.
}
}
}And that's a recipe for this:
bogbud wrote:Oh, and we love Amy! Hope to see more of her adventures!
Hah, that sounds good, maybe I'll make her a reappearing actress in the future
I'm really glad you guys are having fun with these, I did not anticipate to have so much entertainment while making these for you
You do not have the required permissions to view the files attached to this post.
- dabringer157
- Posts: 2
- Joined: Mon May 13, 2024 9:55 pm
Re: Playing around in ComfyUI.
papa-lenin wrote:dabringer157 wrote:Bro these are all fantastic! I've built a few workflows myself in ComfyUI, but they're all configured to use Illustrious and Pony type diffusion models since they have decently made mud and quicksand LORAs. What model(s) have you been using?
Hi, thanks for commenting![]()
I use Z-Image Turbo with a LoRA I have trained myself. The biggest thing about ZiT is that it has 1024 tokens long context window, for comparison the SDXL derivatives like Pony or IL have.. 77.
Ahh okay, that explains a lot! I’ve played around with both Z-Image Turbo and Z-Image Base, but on their own they don’t seem to understand the concept of mud and quicksand. Was training the LORA pretty difficult? That’s one part of comfyUI that I still have to learn…
-
papa-lenin
- Posts: 35
- Joined: Sun Jul 19, 2009 5:24 pm
Re: Playing around in ComfyUI.
dabringer157 wrote:Ahh okay, that explains a lot! I’ve played around with both Z-Image Turbo and Z-Image Base, but on their own they don’t seem to understand the concept of mud and quicksand. Was training the LORA pretty difficult? That’s one part of comfyUI that I still have to learn…
It's pretty easy, took about 2 hours maybe less on an RTX4090. I'd say the most important thing is to do good tagging of the images you pick, and high quality of those images too. I used about 40 pictures of various different angles and depths, and tagged them using a vision model LLM, then fixed all tags manually by adding relevant stuff to them. I'm quite pleased with the result, because not only it does QS images really really well, on top of it it gives them a very natural "photograph" look which I enjoy very much
You can find a YouTube tutorial if you search for "How to Train a Z-Image-Turbo LoRA with AI Toolkit" there.
-
papa-lenin
- Posts: 35
- Joined: Sun Jul 19, 2009 5:24 pm
Re: Playing around in ComfyUI.
A different perspective:
Bizarre Date
Bizarre Date
You do not have the required permissions to view the files attached to this post.
Who is online
Users browsing this forum: Ds-Qs-Vr-50, mjw and 3 guests