These are great! How are you running your SD-XL instance btw? I've been using 1.5 on Paperspace with A1111, but from what I gather you need to run e.g. ComfyUI to handle SD-XL?
This additional sentence is meaningless. I am writing it to get past the forum filter because it's mistaking my abbreviations of technology for the use of capslock. Bit of a ridiculous filter IMO
messy_ai said: These are great! How are you running your SD-XL instance btw? I've been using 1.5 on Paperspace with A1111, but from what I gather you need to run e.g. ComfyUI to handle SD-XL?
This additional sentence is meaningless. I am writing it to get past the forum filter because it's mistaking my abbreviations of technology for the use of capslock. Bit of a ridiculous filter IMO
Thanks. I'm still running A1111 for all SDXL; I sometimes use ComfyUI when playing with animations and video, but all image stuff is in A1111. I run it locally on my own PC, but a cloud GPU solution would work as well.
deadpool said: I've used a prompt about a messy waterpark slide, but this looks way different (and a lot more fun)
How do you make the "pool" at the bottom of the slide? What's the prompt for that?
With most of my images the prompt is not very complex, because I'm training Lora files to create these images. For this one I first created some general slide images with no mess or gameshow scene. Then, once I was happy with some base images of slides, I created some images of the slide in combination with my custom-trained gunge Lora. This generated some OK images, of which I took the best and used them to train two new Loras (one front view and one at an angle).
Then my prompt might look something like "hot woman on GungeslideXL sat at the end going into a pool of green gunge, gunge pouring on her head from above, TV gameshow background". But this then references both of my gungeslide Lora files and my normal gunge Lora to tell it what GungeslideXL and Gunge actually mean as terms.
Sorry for the technical description, but this is why my prompt is not super relevant, as it would be full of custom terms that only mean something to the software when it uses my Lora files. Obviously with a commercial model like Bing or DALL-E the job is getting the prompt to create the scene by trying different prompts, whereas with custom Loras in SD that is not necessary.
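For anyone curious about the mechanics: A1111 activates a Lora by a `<lora:filename:weight>` tag inside the prompt text. A minimal sketch of building that kind of prompt (the Lora filenames and weights here are made up for illustration, not the actual ones used above):

```python
def build_prompt(scene: str, loras: dict[str, float]) -> str:
    """Append A1111-style <lora:name:weight> activation tags to a scene prompt."""
    tags = "".join(f" <lora:{name}:{weight}>" for name, weight in loras.items())
    return scene + tags

# Hypothetical Lora filenames/weights, just to show the tag syntax.
prompt = build_prompt(
    "woman on GungeslideXL sat at the end going into a pool of green gunge, "
    "TV gameshow background",
    {"gungeslide_front": 0.8, "gunge_v2": 0.7},
)
```

The weight controls how strongly each Lora pulls the generation, so stacking two Loras usually means lowering both weights a little from 1.0.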
So if I'm understanding correctly,
1) Generate general images of a woman on a slide using a base model or reference pics from elsewhere.
2) Inpaint or img2img the general images with your Gunge Lora.
3) Use the output images from 2) to create a new Lora.
I applaud the effort. I love the flexibility of SD, but that process is long-winded. ELLA was released yesterday with increased prompt understanding, and I wonder if using ELLA, a good base model and your Lora would yield similar results. The downside is it's only SD1.5 atm, and whilst they've said the SDXL version isn't going to be released, someone has already reverse-engineered it and is working on an SDXL version. SD3 should also be a massive improvement in prompt understanding, although recreating Loras for it will take some time getting the captioning right.
Bing/DALL-E 3's prompt understanding is excellent but hamstrung by the censorship. Thankfully ELLA, LaVi-Bridge, SD3 etc. are making good inroads into prompt understanding, so we won't need the verbal gymnastics currently required. A good base Lora or finetune on enough images should soon be more than enough.
It's not quite that complex; it's more:
1. Generate or find slide images and train slide lora (single repeat and only a few images, approx 20 mins to train)
2. Combine gunge lora with slide lora and generate images
3. Make a GungeslideXL lora using the best ones that can now make endless perfect images.
The whole process takes less than an hour and can generate hundreds of perfect outputs.
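For anyone wanting to try step 3 themselves: kohya-style Lora trainers read training images from a folder named `<repeats>_<concept>`, so a single-repeat run just means a `1_` prefix. A small sketch of staging the hand-picked generations into that layout (paths and the concept name are illustrative, and the folder convention assumes kohya-ss sd-scripts):

```python
import shutil
from pathlib import Path

def prepare_dataset(best_images: list[Path], train_root: Path,
                    concept: str, repeats: int = 1) -> Path:
    """Copy hand-picked generations into a kohya-style '<repeats>_<concept>' folder."""
    dataset_dir = train_root / f"{repeats}_{concept}"
    dataset_dir.mkdir(parents=True, exist_ok=True)
    for img in best_images:
        shutil.copy(img, dataset_dir / img.name)
    return dataset_dir
```

With only a couple of dozen images and a single repeat, this is the part of the pipeline that keeps the ~20-minute training time realistic.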
ELLA and LaVi-Bridge look great but will still need a good gunge Lora, as the original model just lacks the context and info for what good gunge images look like. Also, as they are brand new, I'm not sure the A1111 implementations are there yet.
This process, though, is only necessary if I want to create something quite specific like this slide scene. For general messy images my gunge Lora is perfectly good enough and doesn't require any extra steps apart from prompting. I'm happy with things at the moment and want to create images now with the tools available. There is always promise in new things coming, and that will always be the case. As they become available I'll no doubt try them out and see if they work well.
MMasia said: It's not quite that complex; it's more:
1. Generate or find slide images and train slide lora (single repeat and only a few images, approx 20 mins to train)
2. Combine gunge lora with slide lora and generate images
3. Make a GungeslideXL lora using the best ones that can now make endless perfect images.
The whole process takes less than an hour and can generate hundreds of perfect outputs.
ELLA and LaVi-Bridge look great but will still need a good gunge Lora, as the original model just lacks the context and info for what good gunge images look like. Also, as they are brand new, I'm not sure the A1111 implementations are there yet.
This process, though, is only necessary if I want to create something quite specific like this slide scene. For general messy images my gunge Lora is perfectly good enough and doesn't require any extra steps apart from prompting. I'm happy with things at the moment and want to create images now with the tools available. There is always promise in new things coming, and that will always be the case. As they become available I'll no doubt try them out and see if they work well.
Yeah, that's not too bad; smaller image batches for specific scenes/poses won't take too long.
I think the long term looks promising. SD3 will no doubt be crippled for NSFW, but from the samples shown, the prompt interpretation is near DALL-E 3 levels. ELLA and the other tools will evolve quickly to fill the gap for SDXL/1.5 and may even surpass DALL-E 3 alongside SD3. The speed the tech is evolving at is staggering. The key, as you say, is making sure your main Gunge or Messy Lora is well selected/captioned for the new tech. You'll be in a great position to tweak the image set captions as the new tech evolves.
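On the re-captioning point: kohya-style trainers read a sidecar `.txt` caption per image, so updating a whole image set's captions for new tech is very scriptable. A hedged sketch (the trigger word and caption text are made up; it assumes the one-caption-file-per-image convention):

```python
from pathlib import Path

def recaption(image_dir: Path, trigger: str, base_caption: str) -> int:
    """Write/overwrite a sidecar .txt caption for every .png, prefixed with the trigger word."""
    count = 0
    for img in sorted(image_dir.glob("*.png")):
        caption_file = img.with_suffix(".txt")
        caption_file.write_text(f"{trigger}, {base_caption}")
        count += 1
    return count
```

A real run would vary the caption per image (amount of mess, pose, setting) rather than using one base caption, but the mechanics are the same.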
Once you build your model (Lora?), you can create a series of similar images using the same scene, model, clothing etc.
The question is: is there a way to get a 'time series' of progression? Like images of the slime falling in sequence, or is it all just kind of random?
Currently a Lora will create similar types of images, though the original training may have accounted for some difference in the amount and coverage of slime in the captions, which would let you use prompts like "lots of slime", "small amount of slime", "slime just starting to pour over head". But it's not going to generate a time sequence from clean to messy.
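The closest workaround to a time series with that kind of captioning is to render one image per "stage", keeping the seed and scene fixed and varying only the mess-level phrase. A sketch (the stage phrases assume the Lora's training captions actually covered them):

```python
# Mess-level phrases assumed to exist in the Lora's training captions.
STAGES = [
    "completely clean, no slime",
    "slime just starting to pour over head",
    "small amount of slime",
    "lots of slime, fully covered",
]

def stage_prompts(base: str) -> list[str]:
    """One prompt per stage; rendering each with the same seed approximates a progression."""
    return [f"{base}, {stage}" for stage in STAGES]

prompts = stage_prompts("woman in gameshow booth, green gunge")
```

It's an approximation, not true temporal consistency, since each frame is still an independent generation that happens to share a seed and scene.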
However... mine can do the same person clean and messy as a split image with the same scene and setup (see pics below). I'm looking at training a model that can better show a 4-way split from clean to messy in one wide image; that should work OK. Another option is inpainting with img2img, but this is a lot of manual work and could result in the base image being exactly the same, just with more mess applied to it, which is not a great option. Img2img can allow more "imagination" with each new generation though, which might work, but I'd rather not do manual work on individual images if I can avoid it.
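An alternative to training the model to emit one wide 4-way image is composing the split after generation: render four same-size frames and stitch them side by side. A sketch with Pillow (the solid-color frames here are placeholders standing in for the clean-to-messy sequence):

```python
from PIL import Image

def stitch_horizontal(frames: list[Image.Image]) -> Image.Image:
    """Paste same-height frames left to right into one wide strip."""
    width = sum(f.width for f in frames)
    height = max(f.height for f in frames)
    strip = Image.new("RGB", (width, height))
    x = 0
    for f in frames:
        strip.paste(f, (x, 0))
        x += f.width
    return strip

# Placeholder frames; in practice these would be the four generated stage images.
frames = [Image.new("RGB", (512, 768), color)
          for color in ("white", "yellow", "orange", "green")]
strip = stitch_horizontal(frames)
```

The trade-off is the same one noted above: stitching keeps each panel independent, so consistency between panels still depends on seed/scene control rather than the model understanding the progression.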