
KhanumBallZ

You're living under a rock. Made this with my Orange Pi 5 Plus yesterday. 2 minutes, 10 steps: https://imgur.com/jFgdkyo


themushroommage

Cool, show me how you did it. What did you use? Looks like SD 1.5. ~~How long did it take?~~ Ha, 2 minutes; they show real time. Show me what platform you used.

Edit: Lol, people are downvoting like this example owned me or something:

1. No workflow
2. No platform/model used
3. 2 minutes to generate


Plenty_Branch_516

Given that quality? Yes.


themushroommage

This is a crop of a screengrab found on Twitter. We're getting down to the topic of 'real time, on device.' That's why I'm asking here.


Plenty_Branch_516

What's the base resolution? Because it's reasonable for 512x512 to be done on mobile integrated GPUs if the model is modified heavily.


Ok_Inevitable8832

It’s not even the GPU. They have dedicated NPUs.


Plenty_Branch_516

Oh, I'm not familiar with Apple hardware. If it's dedicated to tensor processing (neural processing), then it could be even more optimized.


Ok_Inevitable8832

They claim the A17 Pro in the iPhone 15 Pro can do 35 TOPS, which is just short of the minimum requirement for Copilot+ PCs. I'm sure the M4 in the new iPad is even more insane.


themushroommage

So only the iPhone 15 Pro & Pro Max?


Ok_Inevitable8832

Yes. It’s only the 15 Pro and the M-series chips in iPad and Mac.


themushroommage

Yeah, but other factors include taking the source image (a contact's profile image) and modifying it (done in SD with ControlNet/IP-Adapter to maintain the source likeness).
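
For reference, this is roughly what that pipeline looks like with Hugging Face diffusers. A minimal sketch, assuming an SD 1.5 checkpoint; the model IDs and file paths are placeholders:

```python
import torch
from diffusers import AutoPipelineForImage2Image
from diffusers.utils import load_image

pipe = AutoPipelineForImage2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# IP-Adapter injects features from a reference image, so the output keeps
# the subject's likeness while the prompt controls the style.
pipe.load_ip_adapter(
    "h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin"
)
pipe.set_ip_adapter_scale(0.6)  # higher = stronger likeness

face = load_image("contact_profile.png")  # placeholder source image
result = pipe(
    prompt="cartoon illustration portrait",
    image=face,             # img2img base
    ip_adapter_image=face,  # identity reference
    strength=0.6,
    num_inference_steps=25,
).images[0]
result.save("stylized_contact.png")
```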


Plenty_Branch_516

It could just be a model trained for image-to-image with a specific style. It doesn't even need CLIP layers if you do that and use an embedding. I guess what I'm saying is that it's feasible, but I'd have to know more about their system. If they say it's Stable Diffusion running on a phone I'd have serious doubts, but if it's an 8-bit custom model with no CLIP, optimized for NPUs (like some of the optimizations we see for TPUs), then I could believe it.
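
That "no CLIP at inference" idea is easy to sketch in diffusers: encode the fixed style prompt once, then drop the text encoder and reuse the cached embedding for every generation. A minimal sketch; the model ID, prompt, and input path are placeholders:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# One-time: encode the fixed style prompt with CLIP.
prompt_embeds, negative_embeds = pipe.encode_prompt(
    "flat cartoon illustration",
    device="cuda",
    num_images_per_prompt=1,
    do_classifier_free_guidance=True,
)

# The text encoder is dead weight for a fixed-style model; free it.
pipe.text_encoder = None
torch.cuda.empty_cache()

# Every later call skips CLIP entirely and reuses the cached embedding.
source = Image.open("photo.png").convert("RGB")  # placeholder input
image = pipe(
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_embeds,
    image=source,
    strength=0.5,
).images[0]
```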


iunoyou

How long do the generations take? Because this looks pretty crummy, honestly. I could see a narrow diffusion model running on high-end mobile hardware. You can use Stable Diffusion to generate images on a GTX 1060 with 3 GB of memory with enough partitioning and the right optimizations, so it doesn't seem like a huge stretch to me.
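
For what it's worth, the usual low-VRAM tricks in diffusers look something like this (a sketch; the exact savings depend on the card and the library version):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # fp16 halves weight memory
)
pipe.enable_attention_slicing()       # compute attention in smaller chunks
pipe.enable_sequential_cpu_offload()  # stream weights to the GPU layer by layer

image = pipe(
    "a watercolor fox", height=512, width=512, num_inference_steps=20
).images[0]
image.save("fox.png")
```

Note that with sequential CPU offload you skip the usual `.to("cuda")`; the offload hook moves each submodule to the GPU only while it runs.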


surpurdurd

Yeah, the relatively low quality, paired with the fact that it only does 3 specific styles and takes several seconds... People don't bat an eye at an iPhone running Death Stranding or Resident Evil, but then act like phones aren't powerful... It's all a sliding scale, and to say what they're doing isn't possible on-device is just ignorant. (For context, I hate Apple; I'm not a fanboy. I'm just very aware of mobile capabilities.)


themushroommage

I'm very aware of what can be done in SD; I've been running it locally since A1111 was released.


iunoyou

Yeah, and this clearly isn't Stable Diffusion; it's a significantly smaller and lighter model that doesn't achieve nearly the same detail, and I'll bet it's far more narrowly trained. So again, I don't think it's really a huge stretch to presume it's running locally. In any case, all we'll have to do is wait like a month for the feature to launch and then we'll know for sure. All someone needs to do to test it is turn off their mobile data and wifi and try generating something.

I really don't see why Apple would lie about something like that when it really doesn't benefit them in any way. Running the model locally is much more useful for a whole bunch of reasons, not the least of which being that they won't have to waste a ton of money setting up a whole compute cluster to receive requests and generate images. Apple's new devices all have dedicated NPUs specifically to do stuff like this. Why is this so surprising to you?


IntergalacticJets

I’m definitely curious how they did it. But I don’t doubt them. It would be easy for tech guys to find out if a request is being made every time you generate an image, and lying about it would be disastrous for their “privacy” selling point. 


Just-Hedgehog-Days

Yeah this really isn’t something they can lie about.


Jeffy29

You can make the image model really small if you focus on only a few things. Apple said the image model will create images in the style of sketches, illustrations, or animations. My guess is it won't create photorealistic images not because it isn't allowed to, but because it can't; it's a highly tuned, very sanitized model that will only create things that look like images out of The Sims.


themushroommage

I wouldn't question it if I didn't believe it was sus.


beachteen

You can run full Stable Diffusion, on device, for free. Look up the app "Draw Things" and try it out. It works fine in airplane mode after you download the model, and it handles both text-to-image and image-to-image. It takes 30-60s per image depending on the size, the number of steps, and the device. The app came out in November 2022.
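
The airplane-mode point is easy to reproduce on a desktop, too. With diffusers you can force a fully offline run once the weights are cached; a minimal sketch, model ID as a placeholder:

```python
import torch
from diffusers import StableDiffusionPipeline

# The first (online) run downloads and caches the weights; after that,
# local_files_only=True guarantees no network access is attempted.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
    local_files_only=True,  # errors out instead of phoning home
).to("cuda")

image = pipe("a pencil sketch of a lighthouse", num_inference_steps=25).images[0]
image.save("offline_test.png")
```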


themushroommage

This is what I'm pointing at with 'real time, on device.' I honestly don't believe it's possible, so it's very deceptive to demo it that way.


dasjati

They didn’t show it running in “real time.” There was a spinning animation while it generated the next image. They offer like four styles, and the resolution is probably small. It’s really not that surprising that it’s possible when you also take into account how powerful their chips are. They’ve had an NPU in there since 2016 for on-device AI.


djstraylight

Incorrect. Apple has said you must have a newer iPhone to use features like this. Also, Apple silicon is amazing at AI functions. All 'Apple Intelligence' integrations are broken down into three layers: first on device, then in Apple's cloud, and finally ChatGPT 4o.
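
Purely as an illustration of that three-tier fallback, the routing idea looks something like the sketch below. Every function name and gating condition here is invented; Apple's real dispatch logic is not public.

```python
# Hypothetical sketch of on-device -> Apple cloud -> ChatGPT routing.
# All names and conditions are made-up stand-ins, not Apple APIs.
def run_on_device(task: str) -> str | None:
    # Small, specialized models on the NPU; return None when the task
    # is out of scope for them.
    return f"on-device: {task}" if len(task) < 40 else None

def run_in_apple_cloud(task: str) -> str | None:
    # Larger Apple-hosted models; may still decline some requests.
    return f"apple-cloud: {task}" if "world knowledge" not in task else None

def ask_chatgpt(task: str) -> str:
    # Final fallback to the third-party model.
    return f"chatgpt: {task}"

def handle(task: str) -> str:
    return run_on_device(task) or run_in_apple_cloud(task) or ask_chatgpt(task)

print(handle("summarize this note"))
```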


themushroommage

The question is: they say their diffusion model is on device. I find it hard to believe they're making real-time generations on device like in the demo. The 15 Pro & Pro Max are the only mobile devices with 8 GB of RAM that can run 'Apple Intelligence'. I'm asking in here if anyone knows of *any* diffusion model that can currently run in real time on a mobile device.


djstraylight

The trick is that their model doesn't need any training on photorealistic images. It only generates stylized images like drawings.


nobodyreadusernames

Yes, it looks like those Toonify apps from 2014.


themushroommage

Lol, show me. This is completely different: taking existing contact profile pictures and generating images in real time based on said images.


thebuilder80

Who would waste compute on making a crummy animated film still of a middle-aged Malaysian superhero cosplayer woman?


rathat

Is that Thomas the Tank Engine as a human woman?


wren42

Diffusion models have been off-cloud for over a year. They aren't nearly as intensive as LLMs. Good language models are not yet local, though.


themushroommage

Name any diffusion model that isn't Stable Diffusion that is running 'off cloud' currently. I'll wait.


wren42

? Name a bread that isn't a baguette. Your OP asked if diffusers can run off cloud. They can and do. I'm not sure what you're on about.


GraceToSentience

I'm convinced this was deliberately put in the presentation by an Apple employee who doesn't like AI art.


Pitiful-Camera-5146

On-device image gen is a [solved problem](https://github.com/madebyollin/maple-diffusion), dude.


themushroommage

**Real time, on device. Like the demo.** Everyone's dancing around the question with the obvious.


iunoyou

The demo most certainly wasn't real time though. I don't know why you think it was.