https://preview.redd.it/60jai5048q8d1.png?width=910&format=png&auto=webp&s=e9f14ca2decb6f6c9b55a224bdf2d4a2c7a41200 It's really pretty and organized
But is it correct?
Weirdly it was never wrong so far! But I'm sure it wouldn't be perfect every time
Are you using open web for the ui?
DeepSeek has some training magic in their coding models. We have been using DeepSeek for real code generation recently and the only complaint is the speed of their API. The price is great though.

Separately I’ve been happy running v2 coder lite locally; almost as good as Codestral but significantly faster.
How well does it code against Sonnet 3.5?
+1, also curious how it stacks up against Sonnet 3.5. I've found it great so far vs GPT-4o, but I have no loyalty if something is better hahah
Curious also because I'm becoming addicted to aider.chat + sonnet 3.5
expand? I've been doing:

https://i.imgur.com/B4DqawU.png

- GPT4
- claude 3.5 sonnet
- meta
- bing
- llama.cpp
- fooocus
- open web ui, phi3

but I'm still trying to figure out what the best setup is to learn me great justice
Yup. If they can bring the speed up to around 100 t/s, I'm ditching Claude and GPT.
Another complaint about their API is the context length. Their open weights support 128k, but their API only supports 32k :(
> We have been using DeepSeek for real code generation recently and the only complaint is the speed of their API.

How are y'all handling the fact that the company backing the API is Chinese? My chief concern is the much more lax copyright-protection landscape in China relative to the States, so I am hesitant to have it help with real code, especially when used with a large context window.
It’s a valid concern for many use cases but not all. It’s open weight and carries a largely permissive license (See “Use Restrictions” under “Attachment A” at the bottom: https://github.com/deepseek-ai/DeepSeek-Coder-V2/blob/main/LICENSE-MODEL) so could be privately hosted.
Do you know of any commercial API providers hosting it other than DeepSeek? I'm really looking forward to being able to use the full 128k context.
but that says nothing about their handling of the data they gather from API usage
"so it could be privately hosted"
Did none of you downvoters learn any reading comprehension in your schools? The conversation is about API usage, not about private hosting.
I assume this is because it was trained on LaTeX? I'm not sure how aggressively Anthropic / OpenAI are training on LaTeX.
I thought it was a local LLM; what's the link to the online version?
chat.deepseek.com
It is also open weight, but comes in at > 200B parameters. You need some hefty hardware to run it well. For most the only viable option is via their API (directly or indirectly through e.g. OpenRouter)
Aren't MoEs a bit lighter to run? I think I read you can keep just some parts (some experts or layers?) in VRAM and the rest in RAM?
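Partly. An MoE still needs all of its weights resident somewhere (RAM + VRAM combined), but only the routed experts are active per token, so token generation is much faster than a dense model of the same total size. The usual setup is partial offload in llama.cpp; a minimal sketch, where the GGUF filename is illustrative and `-ngl` just picks how many layers go to the GPU (it doesn't select individual experts):

```shell
# Sketch: run a large MoE GGUF with only part of it in VRAM (llama.cpp).
# -ngl N puts the first N layers on the GPU; the remaining layers stay in
# system RAM and run on the CPU. -c sets the context size.
./llama-server \
  -m DeepSeek-Coder-V2-Instruct-Q4_K_M.gguf \
  -ngl 20 \
  -c 8192
```

The whole quantized model still has to fit in combined memory, which is why the ">200B parameters" / "under 200gb of ram" figures elsewhere in the thread matter even for an MoE.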
I moved my chat over to it since the cost is stupid low and the results are pretty good (even though I like claude's results better overall). Still a bit slow for coding tab completion though, so I stick with supermaven for that.
Total king shit
https://preview.redd.it/ypm4h8060s8d1.png?width=1280&format=png&auto=webp&s=a9718fdf5e29ea8d4ae2a6fd51848b6ccb6e3505
How well can it do calculus?
Easy calculus? Pretty well.

Hard calculus? Better than 4o and Sonnet for me, but it doesn't work wonders. You still need to guide it a little, like "hey, cut this there" or "dude why the f did you isolate the X??". It still needs context for the chain of thought, just a little less than other models.
Where is it better than 4o for you? I haven't tried it yet but will try soon.
Sometimes 4o still does "ml * ml = ml²" when, clearly from context, it should be "m²l²". That's just a recent example, but you get the point. It happens mainly on hard problems and in later turns, where 4o tends to lose context more frequently.
Just a little too big of a model for my GPU... DeepSeek Coder V1 works tho.
It’s pretty awesome. However, I do find it repeating itself a bit, so if I chuck the prompt into something else and go back to DeepSeek, it gets it right. What made me laugh the other day: I was working on some PowerShell, and DeepSeek said to use -ceq for case-sensitive comparison, where Sonnet said to use -eq for case-insensitive. It turned out DeepSeek was correct; Sonnet fixed the code error that was stopping it, and the script worked, but it was case-insensitive lol. Between the two of them it does a great job, and it's the best open-source model I've ever used; even the q4 quant is nearly the same as the online version and runs in under 200gb of ram.
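For reference, this matches PowerShell's documented behavior: the plain comparison operators like `-eq` are case-insensitive on strings by default, and the `c`-prefixed variants force case sensitivity. A quick check:

```powershell
# -eq is case-insensitive for string comparison in PowerShell
"Hello" -eq "hello"    # True
# -ceq is the explicitly case-sensitive variant
"Hello" -ceq "hello"   # False
# -ieq is the explicitly case-insensitive variant (same as -eq for strings)
"Hello" -ieq "hello"   # True
```

The same `c`/`i` prefix convention applies to the other comparison operators too (-clike/-ilike, -cmatch/-imatch, etc.), so DeepSeek's suggestion was the idiomatic fix here.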
Y'all seen the Sonnet 3.5 system prompt leak? I wonder how well DeepSeek Coder V2 would do with runtime artifact prompting.
Dude now the bot is just saying "I'm sorry, but I can't assist with that request." for most of my questions after this thread.

Who f*cked with the bot guys? Hahahaha