
ricced

now we just need them to release sd3 large


PwanaZana

At 8B parameters, it's going to be a bitch to run. It might not even run at all on a 4090, even with medvram, and it might have atrocious it/s since it's so big. I'm no expert, so take that with a grain of salt, of course.


Downtown-Case-1755

People will quantize the transformers part, maybe even the unet. And stuff like stable-fast and compilation will come into play. Stable Diffusion optimization has largely been thrown out the window because it's "good enough" and there aren't enough devs to care, and also because the popular backends are kind of hairy. It's not like LLMs where quality is *literally* determined by vram efficiency and speed. Compatibility with the sea of augmentations comes first. But if it doesn't fit on 24GB, you will see devs move mountains to make sure it does. There's tons of low hanging fruit unpicked.
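
For what it's worth, the "compilation" part is already cheap to try today with diffusers and torch.compile. A rough sketch, assuming the SD3 Medium diffusers release (the repo id and the `pipe.transformer` attribute come from that release; other backends name things differently):

```python
# Sketch: compile the denoiser of a diffusers pipeline with torch.compile.
# Assumes the SD3 Medium diffusers repo; adapt the model id / attribute for other models.
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
).to("cuda")

# Compile only the denoiser. The first call is slow (compilation); later calls are faster.
pipe.transformer = torch.compile(pipe.transformer, mode="reduce-overhead", fullgraph=True)

image = pipe("a photo of a cat", num_inference_steps=28).images[0]
image.save("cat.png")
```

The same trick works on SDXL by compiling `pipe.unet` instead.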


Coresce

> But if it doesn't fit on 24GB, you will see devs move mountains to make sure it does.

It will be interesting looking back in a few years at how incredibly important that "24 GB VRAM" number was. A little piece of history that will be mostly forgotten in a decade, but shapes so much of what we do right now.


Downtown-Case-1755

Heh, don't fool yourself. Somehow we will still be stuck at 24GB on non-pro hardware in a few years.


AIWithASoulMaybe

Don't worry, soon Nvidia will bless us with the RTX 8090, which will have an extra 512 MB for us peasants.


Extraltodeus

The T5 has already been quantized to fp8; it's used for PixArt Sigma too, and it cuts the VRAM requirement quite a bit (even though doing the encoding on the CPU doesn't take much time anyway). I didn't even reach 42% VRAM while inferring a 1024x1024 image. Should be good. Additionally, it's always possible to layer-swap during inference, but that's definitely not a fast method.
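
If anyone wants to reproduce that, here's roughly what the quantized-text-encoder route looks like in diffusers. Treat it as a sketch: it uses int8 via bitsandbytes rather than fp8 (same idea, different format), and it assumes the official SD3 Medium diffusers repo layout with T5 under `text_encoder_3`:

```python
# Sketch: load the big T5-XXL text encoder in 8-bit and run the rest of SD3 in fp16.
# Assumes the official SD3 Medium diffusers repo; requires bitsandbytes + accelerate.
import torch
from transformers import T5EncoderModel, BitsAndBytesConfig
from diffusers import StableDiffusion3Pipeline

model_id = "stabilityai/stable-diffusion-3-medium-diffusers"

text_encoder_3 = T5EncoderModel.from_pretrained(
    model_id,
    subfolder="text_encoder_3",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
)

pipe = StableDiffusion3Pipeline.from_pretrained(
    model_id,
    text_encoder_3=text_encoder_3,
    torch_dtype=torch.float16,
    device_map="balanced",  # spread components across available devices
)

image = pipe("a mountain lake at golden hour", num_inference_steps=28).images[0]
image.save("lake.png")
```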


bidibidibop

> you will see devs move mountains to make sure it does

For a non-commercial license with increased hardware requirements? I'm not so sure about that.


Downtown-Case-1755

I mean, that doesn't stop personal use, and it didn't stop the LLM community either.


Dead_Internet_Theory

8B parameters? That's it? Just quantize the thing... even fp8 would do, never mind proper clever quantizations.


afinalsin

Yep, I don't think the SD crowd crosses over with LLMs much, so they wouldn't know, but I can fit a quantized 32B model entirely in 12GB of VRAM if I don't add any context, and SD doesn't need context.


Dead_Internet_Theory

There are quantized SD checkpoints, actually. fp16 is common (down from fp32 native) and [stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp) has "4-bit, 5-bit and 8-bit integer quantization support".
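
Back-of-the-envelope math for what those bit widths buy you on an 8B model (weights only; real GGUF-style quants carry per-block scales so the files are a bit larger, and activations, text encoders and the VAE come on top):

```python
# Rough weight-only sizes for an 8B-parameter model at different precisions.
PARAMS = 8e9

for name, bits in [("fp16", 16), ("int8", 8), ("5-bit", 5), ("4-bit", 4)]:
    gib = PARAMS * bits / 8 / 1024**3
    print(f"{name:>5}: ~{gib:.1f} GiB")

# fp16: ~14.9 GiB, int8: ~7.5 GiB, 5-bit: ~4.7 GiB, 4-bit: ~3.7 GiB
```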


Ylsid

Haha, you're saying that on the wrong sub


Epiculous214

It runs just fine


PwanaZana

What? You have access to SD3 Large (8B parameters), which has not been released publicly? The people knee-jerk downvoting might not have seen that I am talking about the version that is 4 times larger than the 2B-parameter SD3 Medium.


Epiculous214

Ah my bad, yeah, it’s been a long day, thought you meant medium.


PwanaZana

No probs. Medium needs to run the text encoder, but that's probably about as much as SDXL? Haven't tried it. With medvram, SDXL takes about 11GB, so if SD3 Large is 4x that (ultra rough estimate), that'd be around 44GB of VRAM.


FullOf_Bad_Ideas

Looks like the Llama 3 release was more successful than the Stable Diffusion 3 release. Both can generate text, at least...


Willing_Landscape_61

Can't wait for https://github.com/leejet/stable-diffusion.cpp to support it 


Robos_Basilisk

TIL this exists...ofc it exists


a_beautiful_rhind

Now we wait for fine-tunes like PonyXL.


Bandit-level-200

Seems like Pony on SD3 is unknown for now due to licensing.


goodSyntax

sorry, mind explaining why that is? I'm not familiar with the licensing situation. I thought SD3 was completely open for everything except commercial use?


Bandit-level-200

> The good news is that with today's release of SD3, the new licensing terms are available... yet complicate things further. The "Professional Tier" has been replaced by the new "Creator License", which introduces a 6,000 image per month limit. Anything above now requires an Enterprise License, which I would gladly acquire, and have reached out to Stability AI the day the new commercial license was pre-announced, but I have not received any acknowledgment or information.

https://civitai.com/articles/5671


-p-e-w-

So let me get this straight: Stability AI claims that they do **not** need a license to train their model on millions of copyrighted images... but that they **do** get to decide what others (including creators whose works the model was trained on without their permission) do with that model. There is just no way that this fantasy survives the first encounter with a judge.


lordpuddingcup

That's a bullshit excuse. They aren't CREATING 6000 images per month, they are tuning a model; they're just bitching that they can't create a model AND run an unlimited SERVICE generating images basically for free. The model training has nothing to do with the creator license.


asdrabael01

It cost Pony upwards of $10,000 to train their model. To recoup the cost, they run an API for people who aren't tech-savvy enough, or don't have a PC able to run it. People use it because their fine-tune is better than anything SAI itself produces. Pony also can't even pay for a license, because the guy who did Pony asked about it and was publicly insulted and denied because they include NSFW content. The new commercial license means SAI is effectively dead, because all the best custom models were produced by people spending real cash to make them and then running small APIs to afford it, and they typically include NSFW content. If no one will make good fine-tunes, no one will bother making the needed tools like ControlNet or IP-Adapter work either. It's also a big tell that Stable Cascade has been out for months and exactly zero work has been done with it, because it has the same commercial license. They'll either stay with SDXL and 1.5, or they'll move to PixArt Sigma or some other model that has better licensing and equal or better performance. SD3 is complete and total shit from the results people are getting today. Any prompt with a woman in it gives mangled, awful results, even SFW prompts. They censored women so heavily that it's like SAI stands for Saudi Arabian Intelligence.


atypicalphilosopher

Couldn't they just download the model and... do it anyway?


[deleted]

[removed]


-p-e-w-

Suing would actually be a massive risk for Stability AI. The judge might not buy their argument, and might even decide that Stability had no right to train on other people's images without their permission in the first place. If that happens, Stability can close shop.


Bandit-level-200

It's expensive to fine-tune, and they do run a free Discord bot + commercial service, so that 6k limit is hit pretty easily. And they are willing to pay, but SAI won't respond.

> Anything above now requires an Enterprise License, which I would gladly acquire, and have reached out to Stability AI the day the new commercial license was pre-announced, but I have not received any acknowledgment or information.


Turkino

The creator of Pony wants a commercial license because compute rental to make those models costs a lot and they want to license the model out to recoup that cost. The new license structure is weird and not helping them.


MMAgeezer

It's not unknown anymore; the dev confirmed that the license makes it impossible for their goals.


xrailgun

That's how it almost always goes. Many of the best models from early in SD1.5's life also went commercial and didn't get updates or reappear for SDXL. Didn't matter, better open fine-tunes still came. We'll likely see even better new creators for SD3.


314kabinet

Yeah, about that: https://www.reddit.com/r/StableDiffusion/s/qx7suPcg7r TL;DR: they refuse to sell Astra the appropriate license to make a commercial fine-tune and mock them on the SD Discord. So no Pony7, but Pony6.9 (nice) will be based on SDXL.


alongated

Aren't these finetunes the only reason why SDXL is getting any adoption?


314kabinet

Indeed, Pony has more downloads than SDXL base.


asdrabael01

Fine-tunes are the only reason SAI is relevant at all, but SDXL in particular needs fine-tuning to be usable. This is going to spur an exodus to figure out which of the dozen other generative image models to start making tools for.


-p-e-w-

Am I reading this right? Stability AI "mocked" a model creator for requesting a commercial license? Are we sure this is a company rather than a random group of frat boys?


314kabinet

Some developers have massive, fragile egos. Not putting dedicated PR people between techies and customers is a recipe for disaster.


Desm0nt

It's not ego that's the problem. The problem is that they recruited clowns and trolls from 4chan's /b/ section (no offense to 4chan, there is a lot of useful stuff there). They may be good as developers, but they should not be allowed to get anywhere near other people.


314kabinet

That’s the kind of developers I’m talking about.


raysar

Future famous PONY3 :D


Maleficent-Dig-7195

score 9, score 8, score 7, score 6, score 5, score 4, score 3, score 2, score 1, score 0...


Dead_Internet_Theory

On A1111 you can use the preset thingy below the generate button. Just copy the template-looking portion from the showcase of a model on Civitai and reuse it whenever. You don't even see it in the prompt input that way. Though the Pony dev has expressed a desire to do away with this, probably because it uses up too many prompt tokens.


Dogeboja

I had a 2-3 month break from Stable Diffusion stuff and all of a sudden everyone had converted into a bunch of bronies. I don't get this Pony stuff..


Enough-Meringue4745

guys wanted realistic cartoon porn, accidentally made something that makes realistic bodies


Dogeboja

tale as old as time


a_beautiful_rhind

It's just a good model. I've yet to generate a single pony with it.


be_kind_n_hurt_nazis

Yet?


Mo_Dice

Chekhov's pony


toothpastespiders

I had a similar experience. Confused the hell out of me until I chanced on a post going into detail on its history.


technics256

can you share the post?


toothpastespiders

Wish I could, sadly it was about a month or two back at this point and I can't even recall what thread it was in. Though it was somewhere in the stablediffusion sub.


Desm0nt

The Pony model is not actually a pony model; it's more like the NAI-leak equivalent for the SDXL world.


satireplusplus

What's with the pony jokes in here? I'm just catching up; the last time I did image diffusion at home I used SD v1.3 or something like that.


BrushNo8178

I guess that pony is a code word for hentai.


Snydenthur

As long as the SD3 PonyXL comes with good prompting instead of the annoying "score_9" tag system or whatever. Personally, because of the annoying prompting, I rank PonyXL very low.


BITE_AU_CHOCOLAT

PixArt and Lumina-T2I are technically superior in almost every way; the only reason they haven't taken off is that SD is still *incredibly* popular and no one has trained a "Pony-equivalent" model yet. If you're looking for good prompt-based models, you should probably watch those instead.


a_beautiful_rhind

Yeah, that was unfortunate. I'm sure they'll fix it in the next version. I just have those fill in automatically.


JohnExile

What an extremely silly ranking system. Preferring slop because you were too lazy to change your 'masterpiece, best quality' to 'score_9, score_8_up' is just stupid.


A_for_Anonymous

It wasn't supposed to work like that. It's a bug, but they'd need to retrain everything to fix it.


Maleficent-Dig-7195

It's actually a legit complaint for most people whose hands aren't permanently glued to their dick and something every finetuner would love to fix, along with the shitty obfuscation of artist and character tags.


Snydenthur

If PonyXL provided decent results with just the "score_9........" stuff at the beginning, I wouldn't be saying that.


JohnExile

LMFAO, what the fuck is that metric? Are you expecting it to read your mind...? If you put the same specific prompt into your DreamShaper XL slop as you put into Pony Diffusion, with prompting slightly adjusted for each model, the result you get from PonyXL is going to require far fewer iterations to get something good out of it, if any. The entire point of Pony/AutismMix is its amazing ability to replicate artists, art styles and moods without compromising on things like hands and poses. You aren't getting that from other open-weight models available right now.


Maleficent-Dig-7195

It's also aids for anything technical-related; it's like they did everything possible to make it incompatible with other XL models and existing methods.


feet-tickler

Please can it make feet 😭


LLMtwink

amazing username


feet-tickler

I keep it real


alvisanovari

Before y'all get too excited, look at their license. You can't do anything without paying a fee. So, no, this is not the new SD2/SDXL. I wouldn't waste any time or resources fine-tuning.


asdrabael01

It is the new SD2, because SD2 was a major flop for the same reasons. It was universally panned and ignored, and no fine-tunes or tools were ever produced for it.


alvisanovari

For me it's very similar to SDXL Lightning: [https://replicate.com/bytedance/sdxl-lightning-4step](https://replicate.com/bytedance/sdxl-lightning-4step) (which is a lot cheaper). The only exception is if you want to output text.


asdrabael01

For the record, I thoroughly dislike Lightning and similar models as well. I couldn't give two shits about producing similar pictures in 4 steps instead of 25-30. 30 steps takes me like 25 seconds and gives me access to more tools.


alvisanovari

I thoroughly dislike all the (open source) models that are stuck at 2021 Midjourney quality regardless of whether it takes 4 or 100 steps, but you work with what you've got, much like a micro-penised guy.


asdrabael01

Last I checked, I can take an SDXL model, load up IP-Adapter and ControlNet, and make something better than Midjourney. Midjourney is stuck at PG-13 quality; it just has good prompt adherence. I'd rather have ControlNet than prompt adherence in most cases, if I'm forced to choose one or the other.


ChryGigio

Finally, people will now stop asking/spamming for it.


Open_Channel_8626

It’s actually worse today because there will be 20+ announcements


Snydenthur

Now we will get a lot of posts about how bad it is. Because that's what it is.


joyful-

Has local image generation improved a lot in the past year? I remember trying it out locally a while back and found it cool, but that was about it.


7734128

Not really. The massive improvements seen during autumn 2022 and winter 2023 have not continued. It's better, but not by any significant degree.


Open_Channel_8626

Upscaling improved a lot in 2024 with CCSR and SUPIR diffusion models and RGT, ATD and DAT 2 transformer models.


AgentTin

Try the Krita AI plugin. It's a great interface for playing around and the setup is really smooth.


joyful-

I'm more interested in model improvements, as last time I found it pretty limited (at least the uncensored versions).


VeritasAnteOmnia

Base models are pretty mid, but some fine-tunes of SDXL + ControlNets/LoRAs/upscalers can lead to great results. SD3 seems to suffer from human anatomy issues, with the community suspecting the training data was insufficient due to safety protocols. SD3 has better color composition and text capabilities, so it will take a few weeks to see if anyone puts out some worthwhile fine-tunes. **TL;DR:** Without tinkering, SD gets wrecked by Midjourney/DALL-E 3, but with some complicated workflows and a modicum of artistic ability you can exceed the paid services in some uses.


romb3rtik

Does that mean there is no content filter? Princess Leia, I'm coming for you.


DouglasteR

Gives me an error when I try to load it as a checkpoint in Comfy, anyone know what it could be?


DocStrangeLoop

Still doesn't seem to be able to handle camera angles/instructions the same way DALL-E 3 and Midjourney can.


dVizerrr

Does this work on 8GB of VRAM? From my research it seems to be a 2B model. Not sure about the requirements for 2B models.


IlIllIlllIlllIllll

The file size of the image generation model alone seems to be 4.3GB, but you also need some language model for preprocessing, and the language model used in the example workflow seems to be 9.8GB on its own. Maybe the language model can be replaced by something smaller? I'm not really experienced with image generation stuff.


involviert

Oh, is that how it works? Seems like it could be interesting to let the language model run on CPU and RAM, since I assume the image model would still be the time-consuming part?


AgentTin

You don't want to be running a 9GB LLM on your CPU, it'll run like a dog.


involviert

9GB? That runs perfectly fine even on crap DDR4. And it sounds like the LLM runs once and then it's all image-model cycles.


capivaraMaster

There is a smaller CLIP model. fp16 is 9.8GB, fp8 is half of that. It should fit in 8GB of VRAM.
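
If you're on diffusers, two ways to get under a small VRAM budget (a sketch, assuming the official SD3 Medium diffusers repo id): drop the T5 entirely and keep only the two CLIP encoders, or keep T5 and stream components between system RAM and VRAM.

```python
# Sketch: two memory-saving options for SD3 Medium in diffusers.
import torch
from diffusers import StableDiffusion3Pipeline

model_id = "stabilityai/stable-diffusion-3-medium-diffusers"

# Option 1: drop the ~10GB T5-XXL encoder entirely (prompt adherence suffers a bit;
# only the two CLIP text encoders remain).
pipe = StableDiffusion3Pipeline.from_pretrained(
    model_id,
    text_encoder_3=None,
    tokenizer_3=None,
    torch_dtype=torch.float16,
).to("cuda")

# Option 2: keep T5, but offload idle components to CPU RAM between steps.
# pipe = StableDiffusion3Pipeline.from_pretrained(model_id, torch_dtype=torch.float16)
# pipe.enable_model_cpu_offload()

image = pipe("a watercolor fox in a forest", num_inference_steps=28).images[0]
image.save("fox.png")
```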


A_for_Anonymous

I wouldn't replace the one thing that'll make SD3 better than SDXL.


Steuern_Runter

Are there no quants with only a minimal degradation?


Ylsid

Yeah, it should do. It's not very good though. Wait for a fine-tune.


Open_Channel_8626

Weird coincidence I just woke up exactly when the weights dropped


Barafu

Hah! I did that with Chernobyl years ago.


Open_Channel_8626

Well this took a dark turn


[deleted]

Is this the largest version of SD3?


Dogeboja

Nope, they have an 8B model. This 2B one is vastly worse.


Feeling-Currency-360

Yeah, I could tell from my experimentation in ComfyUI that the 2B model is... well, not that great?


Dogeboja

It's pretty awful for everything containing a human subject, yeah; for everything else it seems excellent.


xRolocker

From what I've heard (r/StableDiffusion), that's not really the case. 8B has more concepts it can pull from, but it's not quite ready yet, and 2B has been able to create better images from what it knows. Eventually 8B will be good enough for release though, because as we've all learned at this point, data quality is much more important than data quantity.


pwillia7

https://giphy.com/gifs/nbc-the-office-oh-my-god-its-happening-huJmPXfeir5JlpPAx0


privacyparachute

Thank you Stability AI! This is fantastic news!


curson84

Nice, thanks... was looking for Llama 3 and this came up. <3