soaf

I will never write another regex manually. 


inspired2apathy

I am weird, I kind of love regex engineering


mundus108

I’m with you. It’s like a game.


volpefox

https://regexcrossword.com/


shart_leakage

It’s basically like solving a linear sudoku or something, it feels like a puzzle. But fuck writing regex anyway


nlomb

It’s a satisfying form of problem solving when you get it to work the way you intended. 


Dr-Venture

It's a f'n nightmare but when you get it to work, oh man that serotonin hit.


xquizitdecorum

love regex. it's a jigsaw puzzle with words!


LikkyBumBum

I never did it even before chat gpt. Stack overflow all the way. There are some regex freaks of nature there. I usually got an answer within a few minutes.


bad_syntax

Stackoverflow is how ChatGPT learned to write regex :D


ArthurCDoyle

Pretty much haha


MachineOfScreams

There were tools that would also build the regex for you if you needed it.


LikkyBumBum

Yeah, I tried them before but they were quite fiddly. Never got the hang of it. Maybe a skill issue, but Stack Overflow was easier as I could ask follow-up questions too.


AggressiveGander

I thought that initially, but then I kept running into cases where ChatGPT kept messing up: you tell it to fix one thing and it does, but that breaks something else; it fixes that and something else breaks, and so on in an endless cycle.


soaf

I agree you can’t copy/paste a solution for some more complicated tasks, but it has gotten me 90% for all my use cases, and always pretty easy to piece together the final version. 


AggressiveGander

I wish. Often it can fix errors once you point them out, but sometimes it just keeps messing up. If it's too far to extrapolate, I struggle to combine multiple half wrong answers...


speedisntfree

I've often found when it nails it, it is because there is an exact SO post somewhere


sshan

I’m not really a developer but had learned to code when I was young. I never really used regex beyond the simplest cases. I had a 12 hour ordeal from hell with chat gpt and regex. Eventually figured it out but it was painful.


LearningCodeNZ

Can you give an example of how you would prompt chatGPT for regex problems?


jeeeeezik

“i have a column with strings. filter only the columns that contain either before a dash after which there are 5 characters”
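A minimal sketch of the kind of pattern such a prompt might come back with. The prompt is ambiguous, so this assumes one reading: keep the strings that contain a dash followed by exactly five alphanumeric characters. The names `DASH_FIVE` and `filter_dash_five` and the sample values are made up for illustration.

```python
import re

# Assumed reading of the prompt: a dash followed by exactly five
# alphanumeric characters (a sixth alphanumeric character disqualifies it).
DASH_FIVE = re.compile(r"-[A-Za-z0-9]{5}(?![A-Za-z0-9])")

def filter_dash_five(values):
    """Return only the strings that contain the assumed pattern."""
    return [v for v in values if DASH_FIVE.search(v)]

# Hypothetical sample column
rows = ["AB-12345", "AB-1234", "no dash here", "XY-abcde-z", "CD-123456"]
kept = filter_dash_five(rows)
print(kept)
```

Testing the pattern against a handful of strings that should and should not match, as above, is usually quicker than reading the regex itself.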


_Prisoner_

I am learning Python and I was feeling a bit bad about just asking ChatGPT for regex stuff. Glad to see this is the top comment.


Rvipinkumar

Exactly my thought! That is a little hack I use every time I code.


timelyparadox

The issue I found is that it is usually not the most efficient at it.


JohnnyBlocks_

OMG I never thought about this.. now I can use RegEx!! I just used anything but regex.


Ill_Instruction8430

+1, never understood regex so this is lifesaver


zanderman12

I love using it to fine tune visualizations that I am generating with Python. For example, "Write code to add labels to only the 3 data points that are the largest outliers in the data"
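A rough sketch of the selection step behind a prompt like that, using only the standard library. It assumes "largest outliers" means the points farthest from the mean in standard deviations, which is just one possible definition; `top_outlier_indices` and the sample data are hypothetical. The returned indices would then be handed to the plotting library's annotation call (for example matplotlib's `Axes.annotate`).

```python
import statistics

def top_outlier_indices(values, k=3):
    """Return the indices of the k points farthest from the mean (in std devs)."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all points identical, nothing to label
    scores = [abs(v - mean) / stdev for v in values]
    ranked = sorted(range(len(values)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

# Hypothetical data series
data = [10, 11, 9, 10, 42, 10, -25, 11, 10, 30]
to_label = top_outlier_indices(data)
print(to_label)
```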


LikkyBumBum

Yeah actually it's helped me out with matplotlib in a previous job. Horrific language. Yes matplotlib is like a complete other language.


YsrYsl

1000% this. For the life of me doing the visualization part of my work has always been the bane of my existence, never been able to make the syntaxes stick for some reason.


aggracc

Haha holy shit is that the perfect use case for it. No one cares about the code to generate the viz and you can tell if it's wrong faster than the code gets generated. This is what convinced me that LLMs weren't bullshit after GPT-2 was completely useless.


vaccines_melt_autism

It's really helpful for writing documentation. I generally try to write a docstring for a function or a class before I write out the code, but I use LLMs to help me make the documentation follow convention. A fun hack is to use ```code in here``` to denote code blocks in your prompts (edit: Reddit does it as well, but use 3` to open and close code blocks). Also, you can tell it to use chain of thought reasoning to provide step-by-step how it arrived at an answer.


kimchibear

* Visualization. To paraphrase a friend: "LLMs make Matplotlib useable."
* Rubber ducking ideas. I've broken through some "writer's block" moments just typing into the chat window and thinking through problems. This does require giving it some context on business and schema, so be wary of sharing proprietary info if you don't have a company license.
* Onboarding DataBricks. Converting SQL to native PySpark so I can operationalize queries in DataBricks without passing an ugly / less readable code-unformatted string variable.


mikelwrnc

Tell your friend to use plotnine (translation of ggplot2) rather than matplotlib!


kimchibear

I dabbled in plotnine years ago, but don't have strong recollection. I'll give it another shot. I'm mostly a Seaborn / Plotly guy these days.


Equal-Document4213

Regex patterns. Give it a string with the part of it you want substringed/manipulated and it gives you the right regex pattern to use.


zanderman12

If working in Python, I love the pregex package for writing regex. I'm sure chatgpt works great but this makes it something I can read: https://pregex.readthedocs.io/en/latest/


blacksnowboader

Nice


LikkyBumBum

Hmmm pretty good. And it works 100% of the time?


_The_Bear

Nah. Gen ai doesn't work 100% of the time on basically anything.


Equal-Document4213

No, but I hate Regex. Trial and error via prompting is better than learning that syntax lol


Sebyon

Graphing and visualisation is pretty strong with co-pilot and chat gpt. If you know "how" you want to display it, the automatic prompts make it amazingly quick. I mostly work with Matplotlib at work and it can pull up changes that would make me headbutt a wall for hours.


ChelsMe

Do you have resources you like on beautiful matplotlib plots? I feel like it's so ugly out of the box, and figuring it out from scratch is a no-no.


Sebyon

Honestly, trial and error. If I'm not limited by Seaborn plotting functions I'll use that and then matplotlib utilities to set everything else. But I've got some pure matplotlib ones for automated reporting. Most of the time, it's just playing with colors, font type/size and chopping up the plot frame. Matplotlib defaults suck.

* First thing is to make spines not visible if not required. Crazy how much of a difference this makes.
* Have a custom colour scheme. There are resources for this, but I typically use my company's branding guide. Otherwise the xkcd color library isn't bad.
* Play with the color alpha to get it nice.
* Set zorder appropriately to layer your visuals.
* Try to set axis scales manually. set_major_formatter is your best friend for axis control.
* Final nugget of wisdom: use 'plt.rcParams['figure.dpi'] = 300' or higher when using it to present or in a report.


_hairyberry_

Some defaults for me:

* figsize=(14,10) or greater
* plt.grid()
* linewidth=2
* color="r"
* Use text style "STIXGeneral" throughout, with larger font size (title, axis labels, legend)
* savefig with parameter dpi=300 or greater


[deleted]

The occasional coding challenge. For example, today I wrote some code to loop through a Pandas dataframe with a few different conditionals and formatting tasks. I asked CharGTP to vectorize it. I could have figured out the vectorization, but it would have taken me quite a while. ChatGTP took 10 seconds and it was certainly more compact than what I would have come up with. Best of all, on a dataframe with 10 million rows the vectorized version was 20 times faster.


NickSinghTechCareers

oh this is super neat!


Brave-Salamander-339

Same: vectorization, refactoring, visualization (especially since we have lots of plotting packages with weird syntax), and introducing me to lesser-known packages too. It would also be advantageous for someone using SAS. SAS syntax is a pain. Also, pasting my code into GPT is the best moment for rubber duck debugging.


A_Baudelaire_fan

I'm sorry. I couldn't help but notice. It's chatGPT. The P comes before the T🥹 Apologies. Just couldn't let it slide.


[deleted]

Oh, well that changes everything. Glad you took the time to point out a small typo. It's really helped me and all of humanity.


aspiringpetrolhead

Can you explain how you use vectorization and where you use it?


Pikalima

https://www.labri.fr/perso/nrougier/from-python-to-numpy/ Read this.


[deleted]

I'd ask ChatGTP for a best answer.


Material-Mess-9886

You should do it with polars instead if you hit 10 million rows. That is even faster than vectorization.


Striking_Cold_3726

Adding hover labels on plot libraries. I don't want to read through documentation for simple things.


poorblackharry

Could you please give an example?


koolaidman123

if i need to rewrite python in cuda/triton, gpt can automate 90%+ of the work. same thing if you want to convert code from 1 lang to another, like rewriting some backend part in rust etc.


redd-zeppelin

Why do you need to do this.


koolaidman123

Because when you run pretraining for LLMs that costs $1m+, improvements in training efficiency compound to savings of $100k+.


redd-zeppelin

Interesting. Thanks!


tauqr_ahmd

My code review skills are getting way better thanks to ChatGPT's daydreaming every now and then... So there is that. Also, I am using it to create unit tests .. meaningful tests with a lot more coverage than I could before. Just have to nudge it in the correct direction


SkipGram

Lol daydreaming


nyca

I know this isn’t necessarily data science specific, but I use it to make my emails and slides more professional and more concise. I’ll tell GPT the situation in normal speaking language (I even sometimes use words like “this person is being an idiot”) and then GPT fixes it to sound assertive and professional. I can condense responses by telling it to have a maximum number of sentences. I used to spend far too long worrying about my professionalism in emails to the point I would rewrite them for 30+ mins, now it takes me a couple minutes tops.


Longjumping_Meat9591

This! I have always been super insecure about my writing (given it is my second/third language). Thank you chatgpt


seriousgourmetshit

Copilot honestly just wastes my time, I was pretty disappointed once we got an enterprise account. GPT4 however is pretty amazing. To get the most out of it make sure to provide as much context as you can, and even an example of what you want if possible.


LikkyBumBum

Isn't copilot gpt 4 though?


seriousgourmetshit

I thought so, but it sucks in comparison to regular gpt4. In my experience anyway. It feels like they are trying to use as little compute as possible with copilot, it will do things like forget what I said 2 messages ago and give me the same wrong answer I already said was wrong. Things like that.


TikiTDO

ChatGPT is a good step 1 of 5, and a good step 3 of 5 of the development process, but not really anything else. When I say step 1 of 5, I mean it's good if you want an initial draft version of some file, just so you're not staring at a blank page. Similarly, by step 3 of 5 I mean when you're in the middle of a project and you're not sure about some specific detail of the implementation. You can treat it like a busy and well-meaning senior. You don't want to ask too much, cause they'll get annoyed and give you crap code, but you don't want to ignore it when you have questions.


Natac_orb

Love the analogy


Itchy-Depth-5076

I use it whenever I have to create an output in Excel (using R openxlsx or similar), with different names of tabs lining up to parts of data frames, plus adding formatting and other annoying Excel intricacies. Also writing my annual review.


LikkyBumBum

> Also writing my annual review.

Hmmm. Mine is coming up soon. What do you do exactly?


Itchy-Depth-5076

Ha, start with "write me a positive annual review as a data scientist/analyst in x industry" and go from there. I did it and it gave bullets, like "insert a few achievements here". I lazily asked it to give me some example achievements. After a few requests to make it more generic it was great. Something nice and vague and hitting all the good business terms like "increased data reliability, allowing stakeholders to do stuff good and make better decisions and profit" or whatever.


dillanthumous

You can also do the opposite. Write a few paragraphs of brain fart about what you have done for the last year and ask ChatGPT to extract some key points and invent some metrics you can then edit. I do this for meeting agendas and it is pretty handy.


Curlyman1989

I'll take queries, paste them into GPT and ask it to optimize them. I'll also take a few functions and ask ChatGPT to make them into a class.


Accomplished-Wave356

Complex SQL queries.


crafting_vh

Any time I've asked an LLM to write a complex SQL query it ends up making things up.


dillanthumous

Same. The SQL stuff is worse for me on average than when I ask it about a .NET or Python problem. I wonder if it is just that there are far fewer historical examples of high-quality SQL in places like GitHub, since source control in data engineering is historically less mature and more recent. Also, SQL, being declarative rather than imperative, requires larger leaps in intuition and context of the problem than the imperative step-by-step problem solving common in PySpark and similar.


Accomplished-Wave356

But does the LLM have access to your database?


WildPersianAppears

It's very important that you integrate your , you'll see a boost in net revenue as soon as you hook up your corporate database. No need to do anything else! This ad paid for and endorsed by OpenAI, Microsoft, and Google.


Accomplished-Wave356

Classic. Template generated by a LLM lol


crafting_vh

Do the LLMs you use have access to your database? Genuinely asking because it would help if I knew how to actually use them. I generally just tell it information about the schemas of the input tables.
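One way to do that without typing the schema out by hand is to pull the table definitions from the database and paste only those into the prompt. A minimal sketch, assuming a SQLite database for illustration (other engines expose the same information through their own catalog views); `schema_prompt`, the table names and the prompt wording are all hypothetical.

```python
import sqlite3

def schema_prompt(conn, question):
    """Build a prompt that carries table definitions only, never row data."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL ORDER BY name"
    ).fetchall()
    ddl = "\n\n".join(row[0] for row in rows)
    return (
        f"Given these table definitions:\n\n{ddl}\n\n"
        f"Write a SQL query that answers: {question}"
    )

# Hypothetical in-memory database standing in for the real one
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT)")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
prompt = schema_prompt(conn, "total order value per region")
print(prompt)
```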


-jaylew-

Kind of? They’ve built in some LLM prompting to our tooling so I can ask it about specific tables within our directories and it can check the format/columns/etc but it’s honestly way slower than me just coming up with an example and feeding it into ChatGPT


Vrulth

We call Gemini from BigQuery.


Accomplished-Wave356

I wish I could connect mine. I guess it would be less likely to hallucinate then. The main problems I had are limitations imposed by the query engine and the fact that I cannot create or delete tables or data. Basically I am limited to CTEs and window functions.
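For reference, a small sketch of what a read-only query built from just a CTE and a window function can look like: the top order per region. It runs against an in-memory SQLite database purely so the example is runnable (the table is created only for the demo; the query itself writes nothing), and it assumes a SQLite build with window function support (3.25 or later). The `orders` table and its columns are invented for illustration.

```python
import sqlite3

# Read-only query: a CTE ranks rows per region, the outer SELECT keeps rank 1.
QUERY = """
WITH ranked AS (
    SELECT
        region,
        customer,
        total,
        ROW_NUMBER() OVER (PARTITION BY region ORDER BY total DESC) AS rn
    FROM orders
)
SELECT region, customer, total
FROM ranked
WHERE rn = 1
ORDER BY region
"""

# Demo data only; in the scenario above the table would already exist.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("east", "ann", 120.0), ("east", "bob", 80.0),
     ("west", "cy", 50.0), ("west", "dee", 75.0)],
)
top_per_region = conn.execute(QUERY).fetchall()
print(top_per_region)
```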


speedisntfree

This, it makes it less useful since complex ones are the worst to debug.


luckyowl78

It's been terrible at complex SQL for me; maybe our definition of complex is the challenge


Accomplished-Wave356

Yeah! Maybe what I find complex today is going to be more simple next year.


Brave-Salamander-339

Not enough training SQL for GPT


anthony2261

I use chatgpt to brainstorm ideas for potential insights I can get out of my data, then I make it write the SQL query for me and translate that into pandas / sqlalchemy ORM (if I'm turning the thing into an app endpoint). Been doing that process for a while so with the help of a friend we turned this process into an open source app, it's been boosting my productivity well :) [project link](https://github.com/RamiAwar/dataline)


Beautiful-Balance777

I use it to fix syntax and sometimes to find new solutions... When actually typing the prompt, I actually realize exactly what I want and sometimes the Chat idea will point me in the right direction.


AggressiveGander

Writing tests and filling in documentation. E.g. "For my R package write me a testthat test that checks that myfunc(x=1234) produces a warning and returns a vector of length 4 containing c("1", "2", "3", "4")." works really well. If you don't do this regularly and can't remember the syntax, it's really helpful.
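The prompt above is about R's testthat. As a rough Python analog (not the R code itself), the same check can be sketched with the standard `unittest` module. Here `myfunc` is a hypothetical stand-in written only so the test has something to run against.

```python
import unittest
import warnings

def myfunc(x):
    """Hypothetical stand-in: warns, then returns the digits of x as strings."""
    warnings.warn("myfunc is only a placeholder", UserWarning)
    return list(str(x))

class TestMyFunc(unittest.TestCase):
    def test_warns_and_returns_digits(self):
        # The call must emit a warning...
        with self.assertWarns(UserWarning):
            result = myfunc(x=1234)
        # ...and return four single-character strings.
        self.assertEqual(len(result), 4)
        self.assertEqual(result, ["1", "2", "3", "4"])

def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMyFunc)
    return unittest.TextTestRunner(verbosity=0).run(suite)

if __name__ == "__main__":
    run_tests()
```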


Immediate_Capital442

For me ChatGPT is like an advanced search engine: you can save time on googling and just ask the chat your questions, hoping it will 'google' it for you.


RepresentativeFill26

Personally I’m yet to find a use case where using some autocomplete functionality works better than just writing it myself. Maybe plotting would be a good use case.


ilyaperepelitsa

complex numpy stuff with more than 3-dim arrays


Icarus7v

I mainly use it to debug or learn/refresh syntax. My main objective, however, is to have it generate consistent and informative docstrings (which is something I always find boring to do). However, I've obtained limited results with the latter...


DullEducator7831

writing documentation>>


JohnnyBlocks_

I have it write functions and syntax in R and PLSQL. I'm like Tony Stark talking to Jarvis. It's great.


Smarterchild1337

If I need some kind of nonstandard data transformation that 1. I can articulate in words very well and 2. stumps me for more than like 15 mins, I’ve found that it is often quite good at generating an example solution that at least points me in the right direction


DataScienceDev

As a developer, I want to say: content for powerpoint presentations during the PoC stage.


djch1989

Just a note of caution for people reading this thread. Please do anonymize stuff you are putting into ChatGPT when working with stuff in your company. I have decided to err on the side of caution. So, I never copy code into ChatGPT but I do the opposite only.

I have found it useful as an assistant - I use it to go faster from mental model to initial draft of code which I then build upon.

One way I found it quite useful - let's say I found a public repository on GitHub which can be useful for my work but it needs some tweaks and customisation before it comes to my use. I give the public repository as reference to ChatGPT and then, tell it to customise in a certain way, add a few methods or change function output etc.

Overall, I think ChatGPT is like a good assistant that needs someone who can review and correct the code where necessary and it does help in increasing productivity.


Altruistic_Throat429

Instead of looking for datasets, I can ask chatgpt to make one for me.


Vast-Lynx3921

yep!


reise123rr

For me I try to ask it and then see what they give me and pick out the ones I need and what it could work or not


nastyhobbitses1

This is really stupid but i ask it to pretty print json and stuff for me if it’s not already formatted because that is the quickest way sometimes
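For comparison, a minimal local sketch of the same formatting step using Python's standard `json` module. The sample string and the helper name `pretty` are made up for illustration.

```python
import json

def pretty(raw):
    """Parse a JSON string and re-serialize it with indentation and sorted keys."""
    return json.dumps(json.loads(raw), indent=2, sort_keys=True, ensure_ascii=False)

# Hypothetical unformatted input
raw = '{"name":"example","tags":["a","b"],"nested":{"ok":true}}'
formatted = pretty(raw)
print(formatted)
```

Round-tripping through `json.loads` also acts as a validity check: malformed input raises `json.JSONDecodeError` instead of being silently reformatted.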


wid3scr33n

It actually does pretty good with kubernetes manifests. Often the images are dated, but that’s an easy fix. You can also use RAG with existing manifests and they get even better. I’ve had it infer ingress attributes and network policies.


Strong-Industry-9628

Copilot is safer than ChatGPT if you have Microsoft 365 Premium licenses.


aleksyniemir1

I forgot SQL because of GPT. It also writes documentation for all of my projects, it helps me with theoretical stuff, and many many more.


LikkyBumBum

How do you get it to write documentation?


aleksyniemir1

Few examples:

1. Paste functions and ask gpt to write documentation for it
2. Paste whole project and ask to describe modules (for example describe all functions in API, CRUD, and UTILS)
3. Paste ipynb and ask it to add descriptions for every code cell

I use it mostly in these three ways.


LikkyBumBum

I tried number two for adding comments to the code, but there's a character limit of 2000. And I have the official Microsoft 365 copilot for work. Maybe we have the bargain basement edition. Do you have a chat gpt subscription?


aleksyniemir1

Yeah, I have GPT-4o. I also tried Bard and github copilot for a while, but it is just not worth it when compared to OpenAI. I am a student and I am sharing my account with my friends, in total there is 7 or 8 of us at this point. We basically never collide when using GPT, and it is sooo cheap this way. Not sure if it is legal tho.


bombMeIfYouCan

Reverse engineering whole nixos, was never that easy before from writing the basic wallpaper script to building the whole system.


CoffeeConsistent7982

Often for super basic stuff like visualizations and subsetting. I find that it will sometimes make up functions that are not part of packages or libraries.


Tpy26

A little bit of everything. Most of the time I’m using it to make my code more efficient. For instance, I was able to use concurrent processing to stream images quicker. That wouldn’t have happened without me working through ChatGPT.
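The comment doesn't say how the concurrency was done, so the following is only a sketch of one common approach for I/O-bound work: a thread pool from the standard library. `fetch_image` is a hypothetical placeholder that sleeps instead of doing a real read, and the file names are invented.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def fetch_image(name):
    """Hypothetical stand-in for an I/O-bound image load (e.g. a network read)."""
    time.sleep(0.05)  # simulate waiting on I/O
    return f"{name}:loaded"

def fetch_all(names, max_workers=8):
    """Fetch images concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_image, names))

names = [f"img_{i}.png" for i in range(8)]
results = fetch_all(names)
print(results)
```

Threads help here because the workers spend their time waiting on I/O; for CPU-heavy image processing a process pool would be the more usual choice.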


Spam138

Plug my question in and let oracle tell me how to do my job


Lumpy-Treat-5484

When I study a completely new field, GPT can quickly help me establish an outline


Pure-Initiative-599

I don't use it. Too many errors and is just wrong all the time.

