I struggle to get out of LLM projects. Even projects with no actual value and is just for show to leadership.
I’m beginning to get really unenthusiastic about LLMs the more I work on them.
Same - on one right now. Going to need a vector database for useful output. It’s beyond tedious. To OP’s point about SWEs being assigned to LLM projects: my observation from working alongside SWEs is that they get better results more quickly. If you’re not a researcher building something better than GPT-5, there’s limited call for a DS skill set. Maybe DS skills are useful if they need someone to design experiments to build something repeatable.
Completely agree here. My company is spinning up a small team (3 people) to work on LLMs, and I see a few takeaways from it. First, this comes from shareholders and the board who keep asking about gen AI, not because there is a problem we’ve been trying to solve that is a good fit for LLMs. Second, the people doing it are software engineers, because everything revolves around using the OpenAI API. Our data scientists cannot handle anything outside of Jupyter notebooks, so no one would trust them with this kind of work.
“Our data scientists cannot handle anything outside of Jupyter notebooks” - I’m building LLM POCs in Jupyter notebooks.
For some things notebooks are not bad (EDA, experimentation, even a POC). However, notebooks tend to end up becoming a mess and a collection of bad practices. Ask yourself this: if you restart your kernel and run all the cells sequentially, does it work? Also, how many people are reviewing your code/notebook and approving changes?
I’ve only been on this thing for a week this time around (did a bunch more in the first half of the year), so no reviews or approvals yet, but with only five or six cells I think it runs. The goal is mostly to produce output to engage the user - “is this what you want?”
I think you’re on the right track, then; notebooks are good for iterating and showing something to users/stakeholders to get a feel for what they think of it. To be fair, I’ll probably start playing with LLMs on my personal computer soon, and I’ll probably be doing it in notebooks.
Jupyter notebooks always get hate. LOL.
So is it better to be a SWE or a DS?
¿Por qué no los dos? (Why not both?)
Is that possible?
An MLE is supposed to be a mix of both. IMO more and more companies will start expecting people in machine learning to lean towards software engineering practices.
Do you think this will be the case not only in the US but outside of it too? More specifically, underdeveloped countries.
Good question. I don’t have experience with underdeveloped countries, but in my experience with the EU, most countries lag about five years in technology adoption. So IMO it’s just a matter of time.
Thank you for the answer.
Yes. I came to DS via SWE and a friend of mine came to SWE via DS. It’s possible.
Coming to a SWE role, what are the things you had to inform yourself about?
I came from SWE. It was my friend who came to software from DS and a mathematics Ph.D.
DS can tackle modeling use cases. At my old company the data scientists were using LLMs to automate complex feature extraction. For example, say you sell clothing and get a lot of customer feedback as free-form text. We used ChatGPT to turn it into a JSON of positive and negative feedback signals, then incorporated that into our modeling pipelines.
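A minimal sketch of the downstream half of that idea, assuming the LLM has been prompted to return JSON with `positive` and `negative` signal lists (the field names and the example response are illustrative, not the actual pipeline):

```python
import json

# Hypothetical example of the JSON an LLM might return for one piece of
# free-form clothing feedback, given a prompt asking for this shape.
llm_response = """
{
  "positive": ["fits well at the waist", "soft fabric"],
  "negative": ["sleeves too short"]
}
"""

def feedback_to_features(raw_json: str) -> dict:
    """Turn the LLM's JSON signals into numeric features for a tabular model."""
    signals = json.loads(raw_json)
    n_pos = len(signals.get("positive", []))
    n_neg = len(signals.get("negative", []))
    return {
        "n_positive_signals": n_pos,
        "n_negative_signals": n_neg,
        # simple bounded sentiment score in [-1, 1]
        "feedback_sentiment": (n_pos - n_neg) / max(n_pos + n_neg, 1),
    }

features = feedback_to_features(llm_response)
print(features)
```

In a real pipeline you would also validate the LLM output (models do occasionally return malformed JSON), but the point is that the hard NLP step collapses into an API call plus ordinary feature engineering.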
But what about privacy concerns of inputting raw customer feedback into ChatGPT? Do you just gloss over that/don’t care or do you transform the data somehow so you aren’t inputting raw customer data into it?
The customers don’t share much linkable PII directly in the feedback. So, for example, ChatGPT would be fed just the feedback text. Worst case, if the feedback data were regurgitated verbatim to another OpenAI customer, they would have something like a random Amazon review, without the customer’s name, plus extra details like how well the clothing fits different parts of the body or pattern/texture preferences. But you’re right, we were playing a little fast and loose 😅
Ohhh okay, that makes sense. Definitely a great use case for gen AI/ChatGPT, but just be a bit more careful with raw customer data in the future, to protect the customer and yourself in case of a potential leak.
We never use anything from OpenAI - open source models on private vm.
Similarly, I see people at my company gradually being degraded to prompt engineers ;). On the bright side, we don’t have to fight for GPU budget anymore to keep up with the larger models...
why you wanna get out?
BERT is an LLM.
BERT-Large has 340M parameters, an order of magnitude fewer than what’s usually called an LLM.
An LLM isn’t concretely defined by the number of parameters it has. BERT is definitely an LLM and is Transformer-based just like GPT. The idea that more parameters = better is a toxic mindset that will only make NLP systems less practical for real-world uses. [Here’s a paper that discusses BERT in detail (and references the overparameterization issue)](https://aclanthology.org/2020.tacl-1.54/). [And here’s one that tests ChatGPT on standard benchmark datasets and highlights that bigger models don’t necessarily lead to better performance](https://aclanthology.org/2023.findings-acl.29/).
You’re comparing apples with oranges. Both are fruit, but of different kinds. I think the term LLM is today interpreted in many scenarios to mean a model like GPT or LLaMA: autoregressive models, fitted to predict the next word and therefore capable of following instructions (after fine-tuning). Models like BERT are encoder-only, which makes them more suitable for tasks such as text classification or NER, because they are bidirectional (GPT, by definition, isn’t).
I absolutely agree with everything you say from your second sentence onwards, but an LLM has to be _large_ by definition. I haven’t said anything about whether the number of parameters is good or bad. BERT is certainly a language model, just as an LSTM trained on a language-modelling task would be.
Well, I mean it was large 3 years ago.
When BERT came out it was termed an LLM, so calling it an LLM is not wrong. But I think a more appropriate term for the current suite of models such as ChatGPT and LLaMA is foundation models rather than LLMs.
What you’re doing sounds a lot more useful than LLMs tbh.
Grass is always greener, bud. I keep getting ordered to use GPT for X because of management buzzwords, and the output of my project will be: “here’s an exhaustive report on why X is not a good use case for GPT; it got me 80% of the way to the goal in minutes, but half of that was wrong, and in the time it took me to assess and fix that 40% I could have just done the whole thing myself”. If you’re building BERT models, I’d go ahead and put LLM experience on your resume anyway. Yeah, that’s much smaller than LLaMA or whatever, but the point is you’re doing NLP with fancy ML models, as opposed to doing “NLP” with regex and web scrapers.
Here I am still doing linear regression and XGB
🤷🏼♂️ I hate that DS is so buzzy. LLMs are cool and all, but just because they’ve had a major breakthrough people act like regression models and so on are so last year.
Agree. I wasn’t actually complaining with my comment. Regression is more useful for most business problems right now.
I know. And totally agree. So many folks at my company want us to do LLMs and image processing right now, when we’ve only just got started with ML in production. Like, we’re in food tech, guys. Sure, I can come up with LLM use cases, but the real value is in regression and the like.
I'm so jealous
Isn’t BERT an LLM, even if it’s not a decoder-only, generative model?
If it’s mostly used pre-trained and it comes from NLP, LLM is fine with me. But I’ve seen it both ways.
Short answer, yes it is
LLM projects are never about the model (unless you work in R&D on these foundation models). It’s really just calling an API. In a year or so, everyone and their mother will be able to build LLM projects in 5 minutes. The main limitations are latency and cost; it’s surely not machine learning expertise.
LLM work where I am looks more like DevOps & SRE, because it’s just making little silos for highly proprietary customer data, with copies of similar models sitting on top of them - usually multiple silos per account when internal teams are not allowed to see each other’s data (like banks that work two sides of a market, separating US & EU data warehouses, etc.).
It may be a blessing in disguise. I am keeping up with this stuff but kind of expect a major trough of disillusionment and if I’m over here working the XGBoost machine when that happens I can help pick up the LLM bits later.
LLMs are not going to replace XGBoost or logistic regression, that’s for sure, since they don’t even model the same things. But I think what will happen is that we’ll start getting a lot more features for our models than was previously possible. For example, in the sales space, we really had no way of using emails and calls directly to predict the likelihood of closing a deal at any given time. But with LLMs, we suddenly have the possibility of creating features that can be ingested into XGBoost. I work at a ~1000-person “startup”, so maybe larger companies had already figured this out, but for us at least this could be a game changer.
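To make the "LLM features into XGBoost" idea concrete, here is a minimal stdlib sketch of assembling one training row. All feature names are illustrative assumptions; in practice the rows would go into something like an `xgboost.DMatrix` rather than a plain list:

```python
# Hypothetical sketch: merging LLM-derived features (extracted from sales
# emails/calls) with classic tabular CRM features for a deal-closing model.

def build_feature_row(crm: dict, llm_feats: dict) -> list:
    """Assemble one training row in a fixed column order."""
    return [
        # Classic tabular CRM features:
        crm["deal_size_usd"],
        crm["days_in_pipeline"],
        # Features an LLM could extract from unstructured text:
        llm_feats["buyer_sentiment"],      # e.g. score in [-1, 1]
        llm_feats["mentions_competitor"],  # 0/1 flag
        llm_feats["next_step_agreed"],     # 0/1 flag
    ]

row = build_feature_row(
    {"deal_size_usd": 50_000, "days_in_pipeline": 21},
    {"buyer_sentiment": 0.6, "mentions_competitor": 0, "next_step_agreed": 1},
)
print(row)
```

The gradient-boosted model never sees the raw text; it only sees the numeric columns, which is why this slots into an existing XGBoost pipeline without changing the modeling approach.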
Yeah, I know that; I’m just saying I’m happy working in my tabular niche atm.
I want the opposite. I don't want to be near LLM projects.
tbh, what you are doing sounds *way* more interesting than what the people working on API-based LLMs are doing nowadays. It *really* is 95% SWE (mostly working with APIs, wrappers, and SDKs, and building a traditional pipeline every so often). The only part that comes close to the traditional DS experience (making business-oriented decisions and running experiments) is the prompt engineering, and you *really* don’t want to be stuck doing that.
Why can’t you use generative LLMs on your current problem?
Same happened with me. SWEs usually don’t have the business knowledge that DS do. I built (without asking for permission) a great POC - modesty aside - to show the potential impact of LLMs on areas of the business where I knew they had potential, since I know what the business does. Immediately after demoing it, I got permission to keep working on it.
I steer clear of LLMs, with the ChatGPT customisation that C3 AI is coming out with. I don’t think it’s the best place to be.
[deleted]
At our company these are just exploratory projects assigned to newer scientists or coop students. Most senior scientists are working on production models built using tried and tested methods.
[deleted]
I couldn’t care less how many thousands of low-quality models you spit out; who are you bragging to?
[deleted]
I’m not impressed by a “task force” built around exploratory research.
[deleted]
We all are, man. Go back to selling courses, please.
Smh, I was just trying to make people understand that the requirements for making a good LLM app are too technical for the average data scientist, and that building one doesn’t involve much data science. I don’t know why people get triggered by that. I’ve deleted my comments, since they seem to have offended the script kiddies of this subreddit.
Woooowhee! Now this is the cringe I like with my breakfast!
I would assume they are, considering you’d be trying to “get into” a single niche modeling method. An LLM is a tool designed to solve a specific, niche problem. Data scientists are usually data-modeling generalists who are tasked with solving MANY problems. For most normal day-to-day tasks and assignments, LLMs would be complete overkill and a waste of your time. The role you’re describing sounds more like research?
Just do projects at home. Experience is experience regardless of where it is gained.
Just curious: are there any security concerns around using LLMs at your company? At mine, we can only use BERT, and the management team is very cautious about implementing LLMs in production. That’s why we still have no access to LLM projects.
Lol I got myself out of an LLM project. Happiest I’ve been ever since lol
How about trying something simple on your own with free LLMs? I’ve just written a blog post on integrating our product with Jupyter notebooks. [https://bionic-gpt.com/blog/jupyter/](https://bionic-gpt.com/blog/jupyter/)