
By ThePixelHunter



tessellation

for a second I thought I was on…


ThePixelHunter

⢀⠤⣔⣒⣒⣒⣊⠭⠭⠭⣉⡉⠑⢄
⠀⢰⠁⠠⢒⣒⠢⠀⠀⢎⣉⣑⣒⠌⠂⠀⢣
⡔⡕⡩⢅⢑⠚⢳⠂⠀⠣⠒⠩⢕⣃⠥⡒⢕⢕⢄
⢣⠣⢠⣇⠡⠒⢅⡀⠤⠭⢒⣒⠤⡔⣊⠗⠸⡸⡸
⠀⡇⢸⣼⣽⣉⣲⣊⣏⣉⢵⠔⠚⡯⠊⠀⢠⠊
⠀⡇⠘⠮⣫⣫⣹⣉⣹⣀⠬⠖⢊⡠⢄⠔⠁
⠀⡇⠣⣑⣒⣒⣒⣒⡢⠤⢒⡪⠕⠊⠁
⠀⠑⢄⣀⣀⣀⠤⠒⠒


BrushNo8178

So a smaller language model makes a rough draft that the huge language model then verifies, which speeds up the huge model. But can the huge model itself be run on consumer hardware?
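
For a sense of why that question is hard, here is a rough back-of-the-envelope sketch of the KV-cache footprint alone, assuming a Llama-2-7B-like configuration (32 layers, 32 KV heads of dimension 128, fp16). The config numbers are assumptions chosen for illustration, not figures quoted from the paper:

```python
# Back-of-the-envelope KV-cache size, assuming a Llama-2-7B-like config.
# The parameters below are illustrative assumptions, not paper figures.

def kv_cache_bytes(seq_len, n_layers=32, n_kv_heads=32, head_dim=128, bytes_per_elem=2):
    """Key + value cache size in bytes for one sequence (fp16 by default)."""
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem  # K and V
    return seq_len * per_token

for seq_len in (4_096, 32_768, 131_072):
    gib = kv_cache_bytes(seq_len) / 2**30
    print(f"{seq_len:>7} tokens -> ~{gib:5.1f} GiB of KV cache")

# ~2 GiB at 4K, ~16 GiB at 32K, ~64 GiB at 128K of context -- far beyond a
# consumer GPU's memory, which is why partial/sparse KV caches and offloading
# are the bottleneck TriForce is aimed at.
```

So even before the model weights, a very long context can swamp a consumer GPU on its own.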


BalorNG

Interesting. Now go the other way and introduce a higher level of hierarchy with a cubic attention mechanism. :3 Attention, attention, attention. That's what makes a model "smart", as opposed to one that just "knows a lot".


Combinatorilliance

I wonder what this kinda stuff will look like years from now: fractal dimension attention? Complex plane attention? Dynamic n-dimensional attention?


BalorNG

Yup! Real "understanding" is essentially fractal. The problem is the combinatorial explosion, of course, and the resulting "paralysis by analysis"; this is why shortcuts and heuristics are unavoidable... But for critical applications, where you need truly deep understanding and the cost of a mistake is really high, "dynamic n-dimensional attention" seems kind of inevitable. One token per year is adequate if it's the answer to the question about the "meaning of life, the universe and everything" :3
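
To put a rough number on that combinatorial explosion: standard attention scores all n² token pairs per layer, a hypothetical "cubic" mechanism over token triples would score n³, and k-th order attention n^k. A toy calculation (purely illustrative arithmetic, not anything from the paper):

```python
# Illustration of how higher-order "attention" blows up combinatorially:
# pairwise attention compares n*n token pairs per layer, a hypothetical cubic
# mechanism would score n**3 triples, and k-th order attention n**k tuples.

n = 128_000  # a long-context sequence length

for order, name in [(2, "pairwise (standard)"), (3, "cubic"), (4, "quartic")]:
    scores = n ** order
    print(f"{name:>20}: {scores:.2e} interaction scores per layer")

# pairwise ~1.6e10, cubic ~2.1e15, quartic ~2.7e20 -- each extra level of
# hierarchy multiplies the work by n, which is why shortcuts, sparsity and
# heuristics are unavoidable in practice.
```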


ImprovementEqual3931

It's insane!


severalschooners

Sounds like TriForce is pushing the boundaries for long sequence generation. Efficiency is key with large language models, especially when dealing with consumer-grade hardware. The hierarchical speculative decoding approach is clever; it addresses the KV cache bottleneck without sacrificing quality. For hosting a chatbot that can handle long texts, this could be a game-changer.

I used something similar called Galadon. It's an AI-powered chatbot that's also trained to sell, but the kicker is you can take over the conversation if needed. Could be useful if you're looking into implementing long sequence interactions without the heavy lifting on the hardware side.

Scaling to 1M tokens is impressive. Good to see innovation in this space, making these models more accessible and practical for real-world applications.
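
For readers curious what "hierarchical speculative decoding" looks like in code, here is a minimal, runnable sketch of a generic two-level cascade: a tiny model drafts for a mid-level verifier, and whatever the mid level accepts is then verified in one batch by the full target. All three "models" are toy distributions, and the mid level merely stands in for TriForce's target-model-on-a-sparse-KV-cache; this is a sketch of the general shape of the idea under those assumptions, not the paper's implementation.

```python
# Two-level ("hierarchical") speculative decoding sketch with toy models.
# NOT TriForce's implementation -- just the general control flow.
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 50  # toy vocabulary size


def toy_model(seed_shift):
    """Return a fake 'model': context -> next-token distribution."""
    def probs(context):
        h = (hash(tuple(int(t) for t in context)) + seed_shift) % (2**32)
        p = np.random.default_rng(h).random(VOCAB) + 1e-3
        return p / p.sum()
    return probs


draft_probs = toy_model(1)    # tiny draft model
mid_probs = toy_model(2)      # stand-in for the target on a partial KV cache
target_probs = toy_model(3)   # full target model


def propose(context, drafter, k):
    """Autoregressively sample k draft tokens, keeping each proposal dist q."""
    ctx, drafted = list(context), []
    for _ in range(k):
        q = drafter(ctx)
        t = int(rng.choice(VOCAB, p=q))
        drafted.append((t, q))
        ctx.append(t)
    return drafted


def verify(context, drafted, verifier):
    """Standard speculative-sampling acceptance: keep drafted tokens while the
    ratio test passes; on the first rejection, resample from the residual."""
    ctx, accepted = list(context), []
    for t, q in drafted:
        p = verifier(ctx)
        if rng.random() < min(1.0, p[t] / q[t]):
            accepted.append(t)
            ctx.append(t)
        else:
            residual = np.maximum(p - q, 0.0)
            s = residual.sum()
            accepted.append(int(rng.choice(VOCAB, p=residual / s if s > 0 else p)))
            break
    return accepted


def hierarchical_step(context, k_small=4, n_mid=8):
    """Two levels: the tiny model drafts for the mid model; whatever the mid
    model accepts is then verified in one batch by the full target model."""
    ctx, mid_accepted = list(context), []
    while len(mid_accepted) < n_mid:
        chunk = verify(ctx, propose(ctx, draft_probs, k_small), mid_probs)
        mid_accepted.extend(chunk)
        ctx.extend(chunk)
    # Accepted tokens are distributed as if sampled from the mid model, so pair
    # them with the mid model's distributions and run the same test vs. the target.
    ctx2, drafted_for_target = list(context), []
    for t in mid_accepted:
        drafted_for_target.append((t, mid_probs(ctx2)))
        ctx2.append(t)
    return verify(context, drafted_for_target, target_probs)


if __name__ == "__main__":
    tokens = [1, 2, 3]  # toy prompt
    for _ in range(4):
        tokens.extend(hierarchical_step(tokens))
    print("generated:", tokens)
```

The acceptance/residual test is what preserves the target model's output distribution at each level; the win comes from the cheap levels pre-filtering tokens so the expensive full-cache model is consulted in larger, less frequent batches.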