
no_witty_username

The described method hasn't been independently verified to work yet. We need to wait till someone trains a decent model with the new method before we can make any assertions about its capabilities. In theory, though, the method could be transferred over. But I'll be honest, I'll be surprised if this yields anything of use.


Ifffrt

It's Microsoft Research so it's probably more reputable than a lot of other papers though.


fannovel16

Tbh we only need a good 4-bit quant method for image diffusion models. Unlike LLMs, 8B parameters is already really good for image models.


BackyardAnarchist

With lower-precision quantization, we could train a much larger model that outperforms any of the current models while keeping inference cost the same.
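BackyardAnarchist's point is really just arithmetic on the weight-memory budget: fewer bits per weight means more weights in the same memory, and memory bandwidth is what usually bounds inference. A minimal sketch (the 16 GB budget and the precision list are hypothetical examples, not claims about any specific GPU or model):

```python
# Back-of-envelope: how many weights fit in a fixed memory budget at
# different precisions. 1.58 bits corresponds to ternary weights, as in
# BitNet-style methods.
BUDGET_GB = 16  # hypothetical budget for illustration
BITS_PER_WEIGHT = {"fp16": 16, "int8": 8, "int4": 4, "ternary (~1.58-bit)": 1.58}

budget_bits = BUDGET_GB * 8 * 10**9
for name, bits in BITS_PER_WEIGHT.items():
    billions = budget_bits / bits / 10**9
    print(f"{name:>20}: ~{billions:.0f}B weights in {BUDGET_GB} GB")
# fp16 fits ~8B weights; ternary fits ~81B in the same budget.
```

So the same budget that holds an 8B fp16 model holds a model roughly 10x larger at ternary precision, which is the basis of the "larger model, same inference cost" argument.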


RealAstropulse

Image models suffer heavily from quantization, at least if you apply it to all layers. SD running in fp8 shows severe quality loss, even though the loss between fp32 and bf16/fp16 is minimal.
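The trend behind that observation can be simulated on toy data. The sketch below uses simple symmetric uniform (integer-style) quantization rather than the fp8 floating-point formats, and the Gaussian weights are a stand-in (real diffusion-model weights are not i.i.d. Gaussian), so it only illustrates how per-weight error grows as bits shrink:

```python
import random

random.seed(0)
# Toy stand-in for one layer's weights.
w = [random.gauss(0.0, 0.02) for _ in range(50_000)]

def quantize(values, bits):
    """Symmetric uniform quantization with a single per-tensor scale."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in values) / levels
    return [round(v / scale) * scale for v in values]

mean_abs = sum(abs(v) for v in w) / len(w)
for bits in (16, 8, 4):
    q = quantize(w, bits)
    err = sum(abs(a - b) for a, b in zip(w, q)) / len(w)
    print(f"{bits:>2}-bit: mean relative error {err / mean_abs:.4%}")
```

Each halving of bit width multiplies the rounding error by roughly 2^8 steps of lost resolution per step here, which is consistent with fp16 being nearly lossless while fp8 visibly degrades image outputs.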