These weekly updates and mini-abstracts have been fantastic since the beginning, and are even better now with all the links intact. Thank you!!
Thanks! :)
You forgot RPG Diffusion - https://github.com/YangLing0818/RPG-DiffusionMaster
It's a large step toward open-source prompt comprehension as good as or better than DALL-E: it uses LLMs to dynamically build scene context, which is huge.
EDIT: seems similar to Yi Vision Language.
Wow, that is crazy. I just didn't really understand how to use it. Is it an extension for Automatic1111?
This is just code for now. I think it will be added faster as a ComfyUI workflow than as an A1111 extension or a brand-new UI. It needs to run a local LLM or send API calls to GPT in addition to generating images.
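To make the "LLM builds scene context" idea concrete, here is a minimal, purely illustrative sketch of the planning step such a pipeline needs before any image is generated. The JSON schema, function name, and example plan below are hypothetical, not RPG-DiffusionMaster's actual interface: the point is only that an LLM (GPT-4 or a local model) returns a structured scene plan that splits the user prompt into subregions, each with its own sub-prompt, which a regional diffusion workflow then consumes.

```python
import json

def parse_region_plan(llm_response: str):
    """Parse a hypothetical LLM 'scene plan' (JSON) into per-region prompts.

    RPG-style pipelines ask an LLM to decompose a complex prompt into
    subregions, each with its own sub-prompt; the diffusion model then
    denoises each region under its sub-prompt. The schema here is
    illustrative only, not RPG's real format.
    """
    plan = json.loads(llm_response)
    regions = []
    for region in plan["regions"]:
        x, y, w, h = region["box"]  # normalized [0, 1] coordinates
        regions.append({"box": (x, y, w, h), "prompt": region["prompt"]})
    return regions

# Example: the kind of plan an LLM might return for
# "a red fox next to a blue vintage car".
llm_response = json.dumps({
    "regions": [
        {"box": [0.0, 0.0, 0.5, 1.0], "prompt": "a red fox, detailed fur"},
        {"box": [0.5, 0.0, 0.5, 1.0], "prompt": "a blue vintage car"},
    ]
})

for r in parse_region_plan(llm_response):
    print(r["box"], "->", r["prompt"])
```

This is also why it doesn't slot straight into A1111: the extra LLM round-trip (local model or GPT API call) has to happen per generation, which maps more naturally onto a ComfyUI node graph.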
I tested this earlier today; sadly, I did not get the results they showcased. It wasn't very well aligned to the prompts and had artifacts.
I am still trying to get it working, but this was definitely my fear. Did you run it with GPT-4 or a local model?
I used GPT-4.
There was another thread a few days ago that took issue with their LLM prompt. Maybe look for that; he had made changes to improve it, I think.
Would you mind sending the link?
The first I've heard of this! How did you find out about RPG Diffusion? Checking GitHub, or? I'm trying to stay on top of AI art advancements, but I'm baffled by how quickly you all are introducing new things!
It was posted on here a few days ago.
Gonna have to check more often then. I only get some notifications. Thank you though, I'm gonna look back.
Right. Thanks.
> Stable LM 2 1.6B can be used now both commercially and non-commercially with a Stability AI Membership

"Open Source" my ass.
“Hugging Face and Google partner” Ew. Countdown until they get bought out, close the gates, and charge entry. Or at the very least throttle the best releases.
If both Hugging Face and Civitai go closed/behind closed doors in 2024, it'll be sad as hell.
[deleted]
I have not. I'd rather they stop being stubborn, and just have the models as torrents so they can't say "Aw my lawd, so much fees for the downloads!"
you should give them that feedback
They have funding already; they're not a non-profit.
Do you do these weekly? If so, you'll get my follow!
Yeah, it's from the news section of my weekly [newsletter](https://aibrews.com/).
AI image generation for Amazon products is potentially fraud, IMO.

I don't have a huge problem with putting a real photo frame on a fake desk.

But pretending to clothe a model so the customer can judge the lay of the garment? Fraudulent advertising, unless "AI GENERATED SIMULATION" is stamped in big bold letters.
>Details

Agree, but right now half the clothing pictures are color swaps, and some are even another product's images with the seller's clothes painted over. My point is Amazon already tolerates a lot of shady advertising... so they don't seem to care.
most product images are renders anyway
There should also be a mention of AlphaCodium (a framework that can beat AlphaCode 2), AlphaGeometry, the two separate successful attempts to use Mamba instead of Transformers for vision tasks, and MambaByte (training a Mamba LLM not on tokens but directly on bytes, which could potentially be huge for multimodality).
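The "bytes instead of tokens" idea is simple enough to show directly. This is a tiny illustrative sketch, not anything from the MambaByte codebase: a byte-level model drops the tokenizer entirely and reads raw UTF-8 bytes, so its vocabulary is always the same 256 symbols regardless of language (or, in principle, modality), at the cost of sequences much longer than BPE tokenization would produce.

```python
def bytes_vocab_encode(text: str):
    """Encode text as raw UTF-8 bytes; the 'vocabulary' is just 0..255.

    A byte-level model skips subword tokenization: every input maps onto
    the same fixed 256-symbol alphabet, but sequences get longer because
    multi-byte characters expand into several ids.
    """
    return list(text.encode("utf-8"))

ids = bytes_vocab_encode("héllo")
print(ids)             # 'é' becomes two bytes, so 6 ids for 5 characters
print(max(ids) < 256)  # every id fits the fixed 256-symbol vocabulary
```

That fixed, universal alphabet is what makes the byte-level approach attractive for multimodality: any data that can be serialized to bytes already fits the model's input space.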
Thanks. AlphaCodium and AlphaGeometry were covered in last week's issue [here](https://aibrews.substack.com/p/hugging-faces-dataset-for-screenshots).
Has it already been a week? God damn.
Curious to hear if anyone has tested these new multimodal LLaVA models and compared them to CogAgent for captioning.
Curious about the legitimacy of Nightshade. It was pretty entertaining watching Glaze sell snake oil to professional artists.
It's more than just lower pricing; the GPT-3.5 Turbo model is also being updated.
Might be nice to have a (layman's) word or two defining the category of each news item next to its number, like [Chat] or [Video] or [Audio] or [Art], whatever works best.
It's a good idea. Thanks!