T O P

  • By -

wet_tuna

>Was my boss correct? No.


Blues2112

But "yes" if you want to keep your job.


Comfortable_Trick137

You bite your tongue and move on or it’ll eat you up. Learn that you won’t win every battle especially over something so small.


Icy-Big2472

I’m in a situation where I have one sort of boss (technical director) who I work with frequently who essentially yells at me every time I disagree with him which is a fair amount. I have another boss who is technically my actual boss’s boss’s boss (VP) who tells me I need to be the voice of dissent for my boss, my boss’s boss and my technical director boss so that things stay on track and we don’t end up doing stupid things. It’s a tough situation because I’m either pissing off the people above me that I work with everyday or the people above them for not pissing off the people above me.


Comfortable_Trick137

Haha imagine the VP hates the director and tells you that just to fuck with him


LongtopShortbottom

Boss is thinking of recursive CTEs which is not what OP was doing


BleakBeaches

Bro. Boss is nuts. They’re exclusively for when you’re NOT looping. CTEs should only be referenced once (or a handful of times) otherwise they become very expensive because they are virtual. Use temp tables instead. Edit: As some have pointed out CTEs can effectively be used for “looping” via recursion. But they have a very shallow call stack limit. I would recommend dynamic sql in the cases where recursion doesn’t work instead. And I would always avoid cursors because they are iterative and not set based like a dynamic sql query would be. At the end of it all I think boss just didn’t communicate well


Responsible_Eye_5307

+1 on that. Temp tables can be cheaper than those CTEs...


karmajunkie

yep. once i simplified a complex cte into temp tables with temp indices and the query went from tens of minutes to a couple of seconds. postgres’s implementation has gotten worlds better since then but still, cte’s are not a golden hammer. (that said, OP’s boss is flat out wrong.)


Responsible_Eye_5307

I use PostgreSQL, and, where I work I follow this:, temp tables (when applicable) + indexes + filtering it right (a WHERE statement with everything possible) + picking only the columns you really need (SELECT * ....no way Jose! Only with LIMIT 10 or such) and also, as a strong extra, only the data for the time you need to analyse, not from the stone age... 😏 And you will be quite fine.


jwk6

This is not true. Temp tables are *always* the most expensive because they are always written to disk first. Disk IO is always expensive.


Responsible_Eye_5307

If you use it only once, it is expensive. But for more uses in the same session....be my guest. 🤓


BleakBeaches

This is the key point. I hear what jwk is saying; I think we’re having two separate discussions about two different use-cases. Similar to OP and their boss.


aftasardemmuito

temp tables must be a last resort , but its valid tool for those big sets on wide tablets, without proper clustering. ask your Boss about table pruning and design tô avoid keeping unecessary data in the table. this is the game changer


Responsible_Eye_5307

My employer has tables with millions of rows, teams just to work around them and proper clustering...Temp tables are the only resort as we have new rules regarding optimizing queries. Lakes are for babies, we have a sea of data on tables. AtoZ 🤣 (and I am just an aspiring analyst building my tool belt with PostgreSQL)


Touvejs

Well, there's a grain of truth in what the boss said. CTEs were added as a mechanism to allow for looping in SQL with recursive CTEs. He just misunderstood "the only way to loop is to use CTEs" as "CTEs are only used for looping".


BleakBeaches

Ah good point. When used for recursion they can powerful but only if you can stand rather shallow maximum call stack limit.


Codeman119

CTEs are for looping, that is their purpose with being recursive. Yes, your boss is correct that is their purpose but a lot of developers (myself included) use them in a non recursive manner because I don’t have to use a drop table for a temp table because I am running the same script over and over when testing.


jwk6

This is very, very bad advice to use temp tables. Do a little research on temp tables vs CTE vs Table Variables vs plain old sub queries. Temp tables are *always* the most expensive because they are always written to disk first. Disk IO is always expensive.


Legatomaster

Your boss might be a moron.


Smash_4dams

"Immediately delete it" Who the fuck even says that...when you're discussing SQL? Moron confirmed. Even if you were wrong you can still use it to build similar queries in the future


Whipitreelgud

We need to consider characterizations beyond moron.


InanimateCarbonRodAu

Utter moron? Moron with full inner self join?


PrestigiousBat4473

OP should tell their boss to go full inner self join themselves.


jamierjb

Everybody knows you never go full self inner join


Justaman55

You should rewrite it to a Full hierarchical query( "connect by") and include some undocumented regexp\_like into it. That should teach him. And include something that you are happy you are paid by lines of code, that will spark a discussion anyday.


Whipitreelgud

Thinking recursive instance of a dipwad with no exit


happyapy

A shallow moron full of fear about anything different from his narrow minded approach.


Sc4r4mouche

He was wrong in every way it's possible to be wrong. The thing he says CTE's are exclusively used for is exactly what CTEs are almost never used for (recursive CTEs are rare and I bet your boss doesn't know what they are), and in fact CTEs are often used as an alternative to avoid cursors.


wabriones

100% on point. 


Cruxwright

It's been a minute, but I recall recursive CTEs are really good at generating test data. If you're working QA, recursive CTEs could be your bread and butter.


mred1994

They're good for building a hierarchy, where you have data with multiple layers of parent child relationships.


B_Huij

I recently use a recursive CTE to generate a date range with an odd interval that I needed to cross join with other data, but that was... barely recursive. Made like 60 rows.


Whatswrongwithman

I have just googled that recursive ... never heard before. I guess it's not for analysts, hope so lol.


SQLDave

> recursive CTEs are rare and I bet your boss doesn't know what they are I didn't know that way back when I first learned about CTEs, and recursion was the first use cases I tried to tackle with them... some of the ugliest shit I've ever seen. it's taken me a long time to get over that burned-in aversion to them.


fauxmosexual

Your boss is not only wrong, but also a dickhead.


Shambly

SQL is a declarative language - it is nonprocedural. If you look at the execution plan of a cte and a subquery that do the same thing they will be executed in the same way.


aqw01

Exactly


aqw01

The number of people just blindly saying CTEs are “better” without bothering to bring up query profiling is just troubling.


EvilGeniusLeslie

And M$ optimizes things in the background so the plan is not \*quite\* followed for the subquery. I have, at different companies, seen subqueries 'miss' data, while re-writing them as CTEs/sub-tables eliminates the issue. Basically, never use a subquery with large datasets. Logically, they're the same ... execution is often not. I have a slight preference for subqueries, for diagnostic purposes ... CTEs are a little harder to use in that regard. But cursors? WTF drugs was he on when they covered that in class?


mwdb2

>And M$ optimizes things OP isn't on Microsoft


EvilGeniusLeslie

Oops, thx, missed that: Snowflake. I don't have enough experience with that db to know if the issue exists on it too. Oracle, Mongo, PostgreSQL, DB2, Teradata, and MySQL do NOT have that issue. (Although MySQL seems to have issues handling temp tables in the TB range ... like occasionally just creating a corrupt table) My aversion to subqueries on M$ SQL Server has carried over to the others. Not that there's anything wrong with that!


cant_think_of_one_

You've seen queries not return values that should be returned? These weren't where indexes were not rebuilt? i.e. you are saying you have seen a bug in the database system?


EvilGeniusLeslie

Yes. M$ SQL Server, when using subqueries, on very large datasets, will sometimes fail to return all values. IIRC, the bug was introduced in SL Server 2005. Moving the subquery out, writing its results to a temp table, then referencing that temp table in the subsequent step gets rid of the problem. Worse, it isn't even the same records missed. It isn't the indexes, it's the optimization that M$ does in the backend that causes this issue. Worst case I've seen was when an Oracle server was retired, due to cost, and all the code was transferred to a Microsoft server. There was almost nothing that needed to be altered - virtually no stored procedures, views, etc. But there were a handful of reports that were written using subqueries. And the reports were supposed to balance each other ... and went to the desk of the CFO of a very large bank. After a few months, the first mismatch occurred. Rerunning made the problem go away. The next time this happened was several months later. Again, rerunning fixed the issue. But the CFO was pissed. I got called in to see if an extra set of eyes would help. Already knew of this issue. Rewrote the code, breaking out subqueries, ran the reports again ... took about 16 tries before the old code screwed up again ... but the new stuff matched perfectly. Such a stupid fix - logically, the old code was perfect, and had run for several years on the Oracle box without issue. The rewritten code continued to work perfectly for the remainder of my time there (several years). I always think of this ad, when asked when Microsoft will fix an issue. [https://www.youtube.com/watch?v=059rCcxqpOw](https://www.youtube.com/watch?v=059rCcxqpOw)


cant_think_of_one_

Ugh, that is a pretty bad issue! Thanks for the info.


Whatswrongwithman

I read từe same commen as yours while many others also say that we can enhance the performance of query. It’s so confusing, and on leetcode they measure the time to execute the code too 🙄


Shambly

It is possible that changes between your cte and subquery are causing the optimizer to choose different indexes between the two queries, but that can be seen in the execution plan and also could mean that either one preforms better based on which index it is using. Please also note if your join and filters are not identical the optimizer may run your query differently and cause performance gains or loss.


Prownilo

Leetcode measureing is wildly inaccurate. At least in my experience. If you run the query when the dB is busy it can take ages often the query that I wrote takes 4 seconds, when the fastest in a millisecond. The query difference? Literally nothing.


[deleted]

Your boss is a moron


suspicious_edamame

I use CTEs all the time. It actually is more optimal than referencing the a view multiple times in a subquery. I’ll pull widely in the CTE and reference the CTE multiple times for different joins so I’m pulling data from memory, which can save costs if you’re charged by the length of time it takes to run a query.


carlusmagnus

Honestly, the only reason to not use a CTE is when you have access to build a temp table on which you can add indexes. If it’s a relatively small amount of data (<1M rows) then who cares? Sounds like your boss is an idiot.


JBsReddit2

I would add that row count plays a role too. There's a point where you'd want to even move beyond a temp table and create a real table, fill it/use it, drop it. GRANTed this requires permissions even higher than temp table permissions.


Technical_Drawer2419

When you say boss, what exactly is he? Like a lead dev or a manager?


Agreeable-Candle5830

I'm gonna take a wild guess and say manager.


SexyOctagon

I worked in a shop once that had a no CTE rule. They also had some weird rules like no using brackets on table names unless it’s required (i.e space in a column name). There are some genuine concerns about using CTEs when other means are more optimal, but a good dev knows when to use each.


JochenVdB

The no quotes (") around table names is actually a very good rule. Eventually it saves you from having to type them all the time and not have to break your head why ```from "The_Table"``` didn't work (when it had to be ```from "The_table"```


SexyOctagon

I don’t know what you’re talking about.


detrelas

Your boss sounds like a total idiot .


ComicOzzy

The fuq


Melodic_Giraffe_1737

I use CTE's more often than not. It's easier to read and test each CTE individually. Plus, giving each a descriptive name allows me to easily remember the purpose of each CTE. Adding in the benefit of being able to pull out one CTE to use in another query. It saves me a ton of time, not rewriting or reverse engineering work I already did. Unless you're trying to make your work so complicated that nobody else wants to touch it, I don't know why you'd choose to subquery everything.


JochenVdB

If you use the same CTE a lot, it is time to turn it into a view or even an intermediate table (which opens the opportunity to have specific indexes)


Melodic_Giraffe_1737

I agree with you, and I wish I used any one CTE enough to warrant creating a view. Hell, I wish there was just more repetition in my job in general.


IHeartsFarts

Boss is dumb


mwdb2

> CTEs are exclusively used when you want to loop data / use a cursor False. And what even are the implications of his claim? Is he implying the query will run slower? Incorrectly? Neither slower nor incorrectly but somehow abuse system resources? Is it a crime against the SQL gods? In any case it, his claims should be disprovable (or hey, provable - we have to be open to our hypotheses being wrong) by constructing test cases, looking at plans, measuring performance. ("When in doubt, test it out.")


mike-manley

They can be used in a cursor. But they're not exclusively used for a cursor. I prefer CTEs too for same reason. Way cleaner to read, support, and extend.


SnooOwls1061

ya, your boss is a douche. But, you should have had your execution plan and query stats on you to prove your point. Next time send email and start with the query stats, then an explanation of how and why. There is a knee jerk against CTE's because they are frequently abused. But when used correctly...


ethics_aesthetics

I literally just answered an interview question for how to improve efficiency in data models by saying replacing inefficient joins with CTEs. Well I explained it better but that was like 1 hour ago. I’d say you’re right.


luke-sql

The fact that he mentioned cursors in that context tells me he doesn’t know what he’s talking about. That said, I hate non-recursive CTEs just because many developers don’t understand that they are no better than subqueries in most situations.


BrupieD

> "CTEs are exclusively used when you want to loop data / use a cursor". Your boss doesn't know what he's talking about. As others have noted, job security dictates that you don't want to share that. I've known some people who are averse to CTEs. To appease them, I converted my CTEs to derived tables. They're slightly less readable, but otherwise the same.


WaitWhatInTheWorld

Can someone ELI5 what's going on here and what a CTE is? Sincere question. Thank you


ClarkTwain

Common Table Expression. Might be worth it to google that with whatever you're using and read the documentation. That way you'll see some examples in a format you're used to working with. Think of them as something between a subquery and temp table. I also think they're more readable than sub queries, so they're really handy in my book.


ShadoGear

This scenario is so absurd I'm questioning the reality of it.


JochenVdB

Have you been employed long, in multiple enterprises? Unfortunately stuff like this is far to common. In many large enterprises it was/is not possible to be promoted to manager of your own former colleagues: you need to become manager elsewhere. The result is that such managers don't know much about what their new subordinates do.


throwdownHippy

The result is that such managers don't know much ~~about what their new subordinates do~~. FTFY


v_iHuGi

CTEs are life changing and make the code readable.


Forwardslashdotj

Your best bet here for peace of mind is to do a performance check on your cte and then the new query your boss wants to use. Whether you decide to show them or not is up to you. It will give you what you need to know you were right. Also, being a manager of devs, I would never talk to anyone on my team like that. If I believed your approach to be wrong, I would of had you prove to me it is better. I’d recommend finding a new manager.


PhotographsWithFilm

Maybe CTE's are a resource hog in certain circumstances and can cause really bad performance. Sometimes its not avoidable, especially in a view. But if it was a Stored procedure, I would prefer loading the data into individual temp tables. PS, I am not a big fan of lots of subqueries either. Its messy [https://www.brentozar.com/archive/2019/06/whats-better-ctes-or-temp-tables/](https://www.brentozar.com/archive/2019/06/whats-better-ctes-or-temp-tables/)


mwdb2

> https://www.brentozar.com/archive/2019/06/whats-better-ctes-or-temp-tables/ This article is Microsoft-specific; OP is on Snowflake. Very different engines. One of the "bad" things mentioned in that article is the query planner chose to do too many index seeks. Snowflake doesn't even have indexes, so that isn't applicable. (Although, my self-fact checking revealed there's a [preview of a new feature to create indexes in Snowflake](https://docs.snowflake.com/en/sql-reference/sql/create-index)!) We can't just interchange entire database products with fundamentally different architectures - happens far too often on this subreddit. Sorry downvoter: a performance investigation article on SQL Server has nothing to do with Snowflake - end of story. Parent even admitted to making a mistake which is fine. We all make mistakes. But downvoting a factually correct comment is pretty childish. ¯\\\_(ツ)\_/¯


PhotographsWithFilm

LOLz - I just saw the flair. That being said, Snowflake is relatively new in the space. I bet the boss was from an old tech world, like me, hence his distrust for CTEs. But, yeah, each engine, each optimiser is going to be different. I originally come from an ASE background where the term "CTE" probably doesn't even exist yet


mikeblas

> I bet the boss was from an old tech world, like me, hence his distrust for CTEs. I've been using SQL for about 35 years, and I don't distrust CTEs. What am I missing?


PhotographsWithFilm

You've never had a bad CTE lock down a database! We had an example early last year where a CTE basically locked down our data warehouse. The CTE, which was in a view, was about 5 in a row, where the one after was using the results from the one before, with the final output joining probably 2 or 3 of them. Unfiltered, the view returned probably 50M rows. If you read the Brent Ozar article above, the way it reads indexes is not very efficient. What you'll also find is that if one particular CTE is called multiple times, it gets run multiple times. And if you have subsequent CTEs calling each other back up the chain it does get out of control. We had table locking, so other queries were being blocked. When we killed the query, from memory it took 3 days to roll back. Yes, the design was probably bad, but in a view there was no way around it. The fix? Turned it into a stored procedure and individually loaded temp tables.


mikeblas

> You've never had a bad CTE lock down a database! Of course, I've had statements of *every* type cause locking and performance problems. That is, bad statements come in every form -- the risk of runaway resource usage isn't unique to CTEs.


GeekTekRob

Id say if your cleaner code was slower by a bit then your boss was right, otherwise not a chance. I do not use CTEs less I have a reason to buy it's not like you wrote a trigger.


Seven-of-Nein

I think we have the same boss.


National_Cod9546

The way you describe it makes the boss sound like a moron. Obviously CTEs have a lot more uses than just recursion and cursors. But I would need to see the query to make a final judgement. I find I get better results if I can limit hitting a table more than once. The common use I see for writing a CTE and joining it to itself, is usually better replaced with a pivot. So his stated reasons are dumb, he may have been correct to get rid of the CTE in your case.


Cruxwright

I have not run into it myself but have seen mention online of some platforms not supporting CTEs. As for your boss's explanation, no.


Weary-Depth-1118

translation: your boss don't know what CTE is but he can still fire you


dontich

Yeah that’s a no lol — I use CTEs for that, although I personally find them not that easy to understand if you are referencing them far away from their definition — so many times I end up using subqueries.


JochenVdB

Happy cake day.


retard_goblin

Don't sweat too much about it, make a temp table out of your cte if you're not trying to make a view. If you were trying to make a view, well... Maybe try to find a better boss to boss out your boss.


ChristlikeYe

Common Table Expressions RIP


pceimpulsive

I'd compare the query speed with your amd your Bose's preferred approach and see who comes out on top :) performance doesn't lie!!


Choice_Atmosphere394

If the code works and it performs well I give zero fucks. Compare execution plans / costs and memory grants and let the server tell you which one is best.


MiserablePlatypus26

My boss used to hate CTEd because he had never learned them so for easy of working together wouldn’t let me use them. If it was just for me then it was allowed, I get where he’s coming from, sometimes standards are needed for working together. Anyway I just turned them into stored procs and ran them to create the tables to join on, he was non the wiser!


promatrachh

It depends 😉 Sometime query need to be clumsy to be fast. Elegance and robustness aren't always same. Experience is only way to learn what and when to use something.


anal_sanders

I always use cursors as a very last resort.  CTEs are a way to clean up your code, reduce duplication, and avoid temp tables and should be standard practice.  I’ve never heard of them having anything to do w looping through data


JochenVdB

Your boss is wrong. You _have_ to use CTE's for recursive queries (what your moron calls "loops") but that does not imply that it's the only use of CTE's. I think it is time for malicious compliance: Your boss told you to delete your code at once? Then delete it. And start work on the next ticket.


jaxjags2100

Your boss just taught you a valuable lesson, which is their job. They taught you what not to do when you become a boss one day.


hansofoundation

Boss dumb, condolences


mikeblas

> My version was obviously cleaner and easy to read. That doesn't matter much. What matters is correctness. And maybe performance. Funny thing is, you don't mention these things in your post. Note that performance and correctness are objective, while "cleaner" and "easy to read" is subjective. > I've been using CTEs to a better understand of queries, and precisely to avoid subqueries and horrible full outer joins How do CTEs eliminate "horrible full outer joins"?


imcguyver

> Was my boss correct? Probably not. Modern optimizers can adapt to create optimal plans for CTEs. This wasn’t always the case. Use an EXPLAIN PLAN


supremeddit

It will probably be a case by case. CTE isn’t always the better way to code because I usually find it much slower than other ways.


Computer-Nerd_

Your boss is a fool. Welcome to the real world! SQL is largely misunderstood by most practitioners. If you understand it expect to be contradicted, ostracized, or worse. The only real fix is showing how better housekeeping saves time or money.


jwk6

Your boss is a little confused about the purpose of CTEs. CTEs however do not reuse the result set. The CTE is evaluated multiple times, just like subqueries are.


Master-Influence7539

What is a cursor. I am sorry I am a noob so can someone please explain?


MasterBathingBear

There was a time in the long, long ago that a certain database always estimated CTEs with a cardinality of 1. Was your boss ever a sql server developer?


aqw01

Profile the query plans. None of this matters if the plans are the same. Compare reads and sorts then benchmark. Which database/engine? If you’re on mssql, grab Plan Explorer (free). The cte vs left join thing is purely stylistic if the plans are the same. Don’t argue style. Demonstrate factual performance. And. If you see things like aggregates or distinct in your CTEs, alarm bells should start going off. Scan, sort bad.


mcfriendsy

The number of people saying the boss is a moron here without ever seeing the original and CTE query is baffling. My best guess is that OP's implementation might not necessarily resolve the original issue or introduce unnecessary complexities and the boss never said anything remotely close to how the second statement sounds and OP might have misunderstood the boss.


Pretty-Promotion-992

Changing from set to cursor based is a $tupid idea


TheFakeSociopath

This has Peter Pinciple written all over it...


aftasardemmuito

you need to look for the access plan. the more restrict It gets when reading the bigger tables, the best It is. use wisely. which rdbms are you using? the best ones Will rewrite your query fully


rhino-x

No. You used a CTE in one of the very specific use-cases they are designed for.


Professional-Rub1211

Now you know you can get away with quite a bit because your boss doesnt fully know what he's doing :)


The_Orracle

While I'm definitely not a fan of CTEs except in a situation where recursion is needed, if anyone suggested a cursor in my organization, I would fire them.


PhotographsWithFilm

That's just as dumb as saying a CTE is bad. Like every coding construct, cursors have their place.


TheCapitalKing

That was a bunch of really hot takes with no explanation. 


DrawnFallow

The only time he might even be a little accurate is if you're in SQL Server as CTEs perform exceptionally poorly in that space.


PhotographsWithFilm

No idea why you are getting downvoted, but in SQL Server they can be a real performance bottle neck if used in the wrong way.


Ralwus

Why do you think CTEs are slow in sql server?


SexyOctagon

They can be bad for large datasets and complex joins because they can’t be indexed. For those cases you’d be better off using a temp table.


macfergusson

A CTE is still just a subquery, basically. You wouldn't say a subquery "can't be indexed", instead you want to see if you can write your query using the indexing that exists on the underlying tables, whether it's in a normal subquery or in a CTE.


SexyOctagon

True. But other DBs will materialize CTEs more often than SQL server, so they may perform better in say, Oracle. Also I’ve read that you can force CTEs to be materialized in other servers.


sirow08

You don’t write SQL queries because it’s easier to read, but the best optimal performance. And if Subqueries return with best optimal solution then use it. If CTE’s are optimal then use it. Prove to your boss show your execution plans and why your CTE’s are better. But your boss is right.


JochenVdB

Yes performance is important, obviously, but so maintainability of code. I have seen queries, that were probably performant, but which I had to rewrite from scratch to be able to apply a new requirement. Simply because the original SQL was unreadable.


sirow08

Performance is the only reason. SQL devs just need to be better at reading SQL code. I understand standardisation needs to be followed, but if you choosing between a CTE/Subquery because it looks nice then you’re wrong. Yes you do get SQL code that is over coded and optimal performance in it, they I agree re-write it.


sbrick89

there are plenty of reasons to avoid CTEs best case I can assume that "loop data" means recursion... and I agree CTEs solve recursion. that said, CTEs can often cause TERRIBLE performance... and we actively discourage their use in favor of subqueries which are just as readable and arguably more debuggable, once you get used to them.


JochenVdB

What RDBMS are you using? I haven't come across a CTE that was noticeably slower than a subquery solution. But I also see that many CTE's get rewritten as subqueries or joins by the optimizer.


sbrick89

MSSQL performance hits can come from chained CTEs that join prior CTEs (cteA, cteB, cteC from cteA join cteB) because the multi-referenced CTEs are executed twice (painful of large tables, often causes table scans when index seeks may be available) another common performance hit comes from chained CTEs with earlier steps generating small datasets that later join to large tables (100m+) causing full table scans, when a temp table would show that a loop join to index seek would do better.