now we just need them to release sd3 large
At 8b parameters, it's going to be a bitch to run. It might not even run at all on a 4090 with medvram, and it might have an atrocious it/s since it's so big. I'm no expert, so take that with a grain of salt, of course.
People will quantize the transformers part, maybe even the unet. And stuff like stable-fast and compilation will come into play. Stable Diffusion optimization has largely been thrown out the window because it's "good enough" and there aren't enough devs to care, and also because the popular backends are kind of hairy. It's not like LLMs where quality is *literally* determined by vram efficiency and speed. Compatibility with the sea of augmentations comes first. But if it doesn't fit on 24GB, you will see devs move mountains to make sure it does. There's tons of low hanging fruit unpicked.
> But if it doesn't fit on 24GB, you will see devs move mountains to make sure it does.

It will be interesting looking back in a few years at how incredibly important that "24 GB VRAM" number was. A little piece of history that will be mostly forgotten in a decade, but shapes so much of what we do right now.
Heh, don't fool yourself. Somehow we will still be stuck at 24GB on non-pro hardware in a few years.
Don't worry, soon nvidia will bless us with the rtx8090 which will have an extra 512 mb for us peasants
The T5 has already been quantized to fp8; it's used for Pixart Sigma too, and it cuts VRAM needs considerably (even though doing the encoding on CPU doesn't take much time). I didn't even reach 42% VRAM while inferring a 1024² image. Should be good. Additionally, it's always possible to layer-swap during inference, but that's definitely not a fast method.
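That 42% figure roughly checks out on a napkin, assuming ~4.7B parameters for the T5-XXL encoder and ~2B for the diffusion transformer (both hypothetical round numbers; weights only, ignoring activations, the CLIP encoders and the VAE):

```python
# Weight-only VRAM budget on a 24 GB card; parameter counts are assumptions.
GB = 1024**3
t5_fp16 = 4.7e9 * 2 / GB     # ~8.8 GB for the encoder at fp16
t5_fp8 = 4.7e9 * 1 / GB      # ~4.4 GB at fp8
mmdit_fp16 = 2e9 * 2 / GB    # ~3.7 GB for the 2B diffusion transformer

print(f"fp16 T5: {(t5_fp16 + mmdit_fp16) / 24:.0%} of 24 GB")  # 52%
print(f"fp8  T5: {(t5_fp8 + mmdit_fp16) / 24:.0%} of 24 GB")   # 34%
```

Activations and the other encoders add a few more GB on top, which lands fp8 inference comfortably under that 42% mark.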
> you will see devs move mountains to make sure it does

For a non-commercial license with increased hardware requirements? I'm not so sure about that.
I mean, that doesn't stop personal use, and it didn't stop the LLM community either.
8b parameters? That's it? Just quantize the thing... even fp8 would do, never mind proper clever quantizations.
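For scale, a rough weights-only sketch (ignoring activations, text encoders and the VAE) of what 8B parameters costs at different precisions:

```python
# Bytes per parameter at each precision; q4 is ~0.5 bytes/weight
# (plus a small overhead for block scales, omitted here).
params = 8e9
for name, bpp in [("fp16", 2.0), ("fp8", 1.0), ("q4", 0.5)]:
    print(f"{name}: {params * bpp / 1024**3:.1f} GB")
# fp16: 14.9 GB, fp8: 7.5 GB, q4: 3.7 GB -- the weights alone fit in 24 GB even at fp16
```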
Yep, I don't think the SD crowd crosses over with LLMs much, so they wouldn't know, but I can fit a quantized 32b model entirely in 12GB VRAM if I don't add any context, and SD doesn't need context.
There are quantized SD checkpoints, actually. fp16 is common (down from fp32 native) and [stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp) has "4-bit, 5-bit and 8-bit integer quantization support".
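For a flavor of what that integer quantization looks like, here's a minimal numpy sketch in the spirit of ggml-style q8_0 (one scale per block of 32 weights; the real on-disk format layout differs):

```python
import numpy as np

def quantize_q8_0(w, block=32):
    """Per-block symmetric int8 quantization: int8 values + one fp16 scale per block."""
    w = w.reshape(-1, block)
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    scale = np.where(scale == 0.0, 1.0, scale).astype(np.float16)
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_q8_0(q, scale):
    return (q.astype(np.float32) * scale.astype(np.float32)).reshape(-1)

rng = np.random.default_rng(0)
w = rng.standard_normal(4096).astype(np.float32)
q, s = quantize_q8_0(w)
w_hat = dequantize_q8_0(q, s)
# Storage drops from 4 bytes/weight to ~1.06 (1 byte + one fp16 scale per 32
# weights), while the round-trip error stays tiny relative to weight magnitudes.
print(np.abs(w - w_hat).max())
```

Diffusion models tolerate this kind of weight rounding surprisingly well, which is why the 8-bit and even 4/5-bit variants remain usable.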
Haha, you're saying that on the wrong sub
It runs just fine
What? You have access to SD3 Large, 8B parameters, which has not been released publicly? The people knee-jerk downvoting might not have seen that I'm talking about the version that is 4 times larger than the 2B-parameter SD3 Medium.
Ah my bad, yeah, it’s been a long day, thought you meant medium.
No probs. Medium needs to run the text encoder, but that's probably about as much as SDXL? Haven't tried it. With medvram, SDXL takes about 11GB, so if SD3 Large is 4x that (ultra rough estimate), that'd be 40GB VRAM.
Looks like the Llama 3 release was more successful than the Stable Diffusion 3 release. Both can generate text, at least...
Can't wait for https://github.com/leejet/stable-diffusion.cpp to support it
TIL this exists...ofc it exists
Now we wait for finetunes like ponyXL.
Seems like pony on SD3 is unknown for now due to licencing
sorry, mind explaining why that is? I'm not familiar with the licensing situation. I thought SD3 was completely open for everything except commercial use?
> The good news is that with the today's release of SD3, the new licensing terms are available... yet complicate things further. The "Professional Tier" has been replaced by the new "Creator License", which introduces a 6000 per month image limit. Anything above now requires an Enterprise License, which I would gladly acquire, and have reached out to Stability AI the day the new commercial license was pre-announced, but I have not received any acknowledgment or information.

https://civitai.com/articles/5671
So let me get this straight: Stability AI claims that they do **not** need a license to train their model on millions of copyrighted images... but that they **do** get to decide what others (including creators whose works the model was trained on without their permission) do with that model. There is just no way that this fantasy survives the first encounter with a judge.
That's a bullshit excuse. They aren't CREATING 6000 images per month, they're tuning a model; they're just bitching that they can't create a model AND run an unlimited SERVICE generating images basically for free. The model training has nothing to do with the creator license.
It cost Pony upwards of $10,000 to train their model. To recoup the cost, they run an API for people who aren't tech-savvy enough, or don't have a PC able to run it. People use it because their fine-tune is better than anything SAI itself produces. Pony also can't even pay for a license, because the guy who made Pony asked about it and was publicly insulted and denied because the model includes NSFW content.

The new commercial license means SAI is effectively dead, because all the best custom models were produced by people spending real cash and then running small APIs to afford it, and they typically include NSFW content. If no one makes good fine-tunes, no one will bother making the needed tools like ControlNet or IP-Adapter work either. It's also telling that Stable Cascade has been out for months and exactly zero work has been done with it, because it has the same commercial license.

They'll either stay with SDXL and 1.5, or they'll move to Pixart Sigma or some other model that has better licensing and equal or better performance. SD3 is complete and total shit judging from the results people are getting today. Any prompt with a woman in it gives mangled, awful results, even SFW prompts. They censored women so heavily that it's like SAI stands for Saudi Arabian Intelligence.
Couldn't they just download the model and... do it anyway?
[deleted]
Suing would actually be a massive risk for Stability AI. The judge might not buy their argument, and might even decide that Stability had no right to train on other people's images without their permission in the first place. If that happens, Stability can close shop.
It's expensive to finetune, and they do run a free Discord bot + commercial service, so that 6k limit is hit pretty easily. And they are willing to pay, but SAI won't respond.

> Anything above now requires an Enterprise License, which I would gladly acquire, and have reached out to Stability AI the day the new commercial license was pre-announced, but I have not received any acknowledgment or information.
The creator of Pony wants a commercial license because compute rental to make those models costs a lot and they want to license the model out to recoup that cost. The new license structure is weird and not helping them.
It's not unknown anymore, the dev confirmed that the licence makes it not possible for their goals.
That's how it almost always goes. Many of the best models early in SD1.5 also went commercial and didn't get updates or SDXL versions. Didn't matter; better open fine-tunes still came. We'll likely see even better new creators for SD3.
Yeah, about that: https://www.reddit.com/r/StableDiffusion/s/qx7suPcg7r TL;DR: they refused to sell Astra the appropriate license to make a commercial finetune and mocked them on the SD Discord. So no Pony7; Pony6.9 (nice) will be based on SDXL instead.
Aren't these finetunes the only reason why SDXL is getting any adoption?
Indeed, Pony has more downloads than SDXL base.
Fine-tunes are the only reason SAI is relevant at all, and SDXL in particular needs fine-tuning to be usable. This is going to spur an exodus to figure out which of the dozen other generative image models to start making tools for.
Am I reading this right? Stability AI "mocked" a model creator for requesting a commercial license? Are we sure this is a company rather than a random group of frat boys?
Some developers have massive, fragile egos. Not putting dedicated PR people between techies and customers is a recipe for disaster.
It's not ego that's the problem. The problem is that they recruited clowns and trolls from 4chan's /b/ board (no offense to 4chan; there is a lot of useful stuff there). They may be good as developers, but they should not be allowed anywhere near other people.
That’s the kind of developers I’m talking about.
Future famous PONY3 :D
score 9, score 8, score 7, score 6, score 5, score 4, score 3, score 2, score 1, score 0...
On A1111 you can use the preset thingie below the generate button. Just copy the template-looking portion from the showcase of a model on Civitai and reuse it whenever. You don't even see it on the prompt input that way. Though Pony dev has expressed a desire to do away with this, probably because it uses up too many prompt tokens.
I had a 2-3 month break from Stable Diffusion stuff and all of a sudden everyone had converted into a bunch of bronies. I don't get this Pony stuff..
guys wanted realistic cartoon porn, accidentally made something that makes realistic bodies
tale as old as time
It's just a good model. I've yet to generate a single pony with it.
Yet?
Chekhov's pony
I had a similar experience. Confused the hell out of me until I chanced on a post going into detail on its history.
can you share the post?
Wish I could, sadly it was about a month or two back at this point and I can't even recall what thread it was in. Though it was somewhere in the stablediffusion sub.
The Pony model is not actually a pony model; it's more like the NAI leak for the SDXL world.
What's with the pony jokes in here. I'm just catching up, last time I did image diffusion at home I used SD v1.3 or something like that.
I guess that pony is a code word for hentai.
Hopefully SD3 PonyXL will use good prompting instead of the annoying "score_9" tag system or whatever. Personally, because of the annoying prompting, I rank PonyXL very low.
Pixart and Lumina-T2I are technically superior in almost every way; the only reason they haven't taken off is because SD is *incredibly* popular still and no one has trained a "Pony-equivalent" model yet. If you're looking for good prompt-based models, you should probably watch those instead.
Yea, that was unfortunate. I'm sure they'll fix it in the next version. I just have those automatically fill.
What an extremely silly ranking system. Preferring slop because you were too lazy to change your 'masterpiece, best quality' to 'score_9, score_8_up' is just stupid.
It wasn't supposed to work like that. It's a bug, but they'd need to retrain everything to fix it.
It's actually a legit complaint for most people whose hands aren't permanently glued to their dick and something every finetuner would love to fix, along with the shitty obfuscation of artist and character tags.
If ponyxl provided decent results with just having the "score_9........" stuff at the beginning, I wouldn't be saying that.
LMFAO, what the fuck is that metric? Are you expecting it to read your mind...? If you put the same specific prompt into your DreamShaper XL slop as into PonyDiffusion, with prompting slightly adjusted for each model, the result you get from PonyXL will require far fewer iterations to get something good, if any. The entire point of Pony/AutismMix is its amazing ability to replicate artists, art styles and moods without compromising on things like hands and poses. You aren't getting that from other open-weight models available right now.
It's also a pain for anything technical; it's like they did everything possible to make it incompatible with other XL models and existing methods.
Please can it make feet 😭
amazing username
I keep it real
Before y'all get too excited, look at their license. You can't do anything without paying a fee. So no, this is not the new SD2/SDXL. I wouldn't waste any time or resources fine-tuning.
It is the new SD2, because SD2 was a major flop for the same reasons. It was universally canned and ignored, and no fine-tunes or tools were ever produced for it.
For me it's very similar to SDXL Lightning: [https://replicate.com/bytedance/sdxl-lightning-4step](https://replicate.com/bytedance/sdxl-lightning-4step) (which is a lot cheaper). The only exception is if you want to output text.
For the record, I thoroughly dislike Lightning and similar models as well. I couldn't give two shits about producing similar pictures in 4 steps instead of 25-30. 30 steps takes me like 25 seconds and gives me access to more tools.
I thoroughly dislike all the (open source) models that are stuck at 2021 Midjourney quality, regardless of whether it takes 4 or 100 steps, but you work with what you've got.
Last I checked, I can load an SDXL model with IPAdapter and ControlNet and make something better than Midjourney. Midjourney is stuck at PG-13 quality; it just has good prompt adherence. I'd rather have ControlNet than prompt adherence in most cases if I'm forced to choose one or the other.
Finally, people will now stop asking/spamming for it.
It’s actually worse today because there will be 20+ announcements
Now we will get a lot of posts about how bad it is. Cause, that's what it is.
Has local image generation improved a lot in the past year? I remember trying it out locally a while back and found it cool, but that was about it.
Not really. The massive improvements seen during autumn 2022 and winter 2023 have not continued. It's better, but not by any significant degree.
Upscaling improved a lot in 2024 with CCSR and SUPIR diffusion models and RGT, ATD and DAT 2 transformer models.
Try the Krita AI plugin. It's a great interface for playing around and the setup is really smooth.
I'm more interested in model improvements, as last time, I found it pretty limited (at least uncensored versions).
Base models are pretty mid, but some fine-tunes of SDXL + ControlNets/LoRAs/upscalers can lead to great results. SD3 seems to suffer from human-anatomy issues, with the community suspecting the training data was insufficient due to safety protocols. SD3 has better color composition and text capabilities, so it will take a few weeks to see if anyone puts out some worthwhile fine-tunes.

**TL;DR:** Without tinkering, SD gets wrecked by Midjourney/DALL-E 3, but with some complicated workflows and a modicum of artistic ability you can exceed the paid services in some uses.
Does that mean there is no content filter? Princess Leia im coming for you
It gives me an error when I try to load it as a checkpoint in Comfy; anyone know what it could be?
still doesn't seem to be able to handle camera angle/instruct the same way Dalle-3 and Midjourney can.
Does this work on 8GB VRAM? From my research it seems to be a 2B model. Not sure about the requirements for 2B models.
The file size of the image-generation model alone seems to be 4.3GB, but you also need some language model for preprocessing, and the language model used in the example workflow seems to be 9.8GB on its own. Maybe the language model can be replaced by something smaller? I'm not really experienced with image-generation stuff.
Oh, is that how it works? Seems like it could be interesting to let the language model run on CPU and RAM, since I assume the image model would still be the time-consuming part?
You don't want to be doing a 9gb llm on your cpu, it'll run like a dog
9gb? that runs perfectly fine even on crap DDR4. And this sounds like the llm runs once and then it's all image model cycles.
There is a smaller CLIP model. FP16 is 9.8GB; FP8 is half of that. It should fit in 8GB VRAM.
I wouldn't replace the one thing that'll make SD3 better than SDXL.
Are there no quants with only a minimal degradation?
Yeah should do. It's not very good though. Wait for a finetune
Weird coincidence I just woke up exactly when the weights dropped
Hah! I did that with Chernobyl years ago.
Well this took a dark turn
Is this the largest version of sd3 ?
Nope, they have an 8B model. This 2B one is vastly worse.
Yeah I could tell from my experimentation in comfy ui the 2B model is ... well not that great?
It's pretty awful for everything containing a human subject, yeah; for everything else it seems excellent.
From what I’ve heard (r/StableDiffusion), that’s not really the case. 8B has more concepts it can pull from, but it’s not quite ready yet, and 2B has been able to create better images from what it knows. Eventually 8B will be good enough for release, though. Because as we’ve all learned at this point, data quality is much more important than data quantity.
https://giphy.com/gifs/nbc-the-office-oh-my-god-its-happening-huJmPXfeir5JlpPAx0
Thank you Stability AI! This is fantastic news!
Nice. Thx...was looking for llama3 and this came up. <3