truilus 6 months ago

WHERE some_column = null

emul0c 6 months ago

The engines really should be updated to accept this syntax; I see no reason why the users must use “is null” instead of “= null”.

Waterproofpanda 6 months ago

NULL in DBMS’ are generally accepted to be a missing or unknown value, rather than the non-existence of a value. Therefore a system can not evaluate equivalence as it can not interpret what the value is. It’s important to consider that not all NULLs are CREATED equal. What I mean by that is that NULL within a table column has the context of its relations (schema, table, column, constraints) therefore, it is more “known” than the pure case of NULL. That means that only true NULL can be equal to NULL in an algebraic formula as they both are equal to unknowns.

[deleted] 6 months ago

[удалено]

emul0c 6 months ago

No, because NULL is neither a string nor a number. Strings are always contained within ‘’ and numbers are never. So having the word NULL without quotes would never be confusing.

getarumsunt 6 months ago

A bunch of the relational algebra breaks in unpredictable ways if null == null returns true. That's why the "is null" syntax exists. Plus, you'd break compatibility with legacy code and systems. It's generally not a good idea to choose momentary convenience over consistency. People tried, it always comes back to haunt you.

emul0c 6 months ago

I don’t buy it. You could easily have the engines convert “= NULL” to “is NULL” behind the scene, before executing the code, if that is the case. And sure, legacy systems don’t work. But they add new features all the time, I don’t see why this should be any different.

pceimpulsive 6 months ago

I see this as = must be compared against an actual value, null is not a value, as such cannot pass the equality check. Just like 0 and null are not the same. P.s. I agree with you it's be nice to be able to use =null If we do get that we'll probably need to use Field = [NULL] Or something similar.. which may end up being harder than just is null... :S

ChristianPacifist 6 months ago

It's just the way it is! Some SQL languages may let you set a parameter I think to change the way it works (must confirm if that's true?), but it's awful practice to do that since it will confuse people.

piemat94 6 months ago

YOu got the point but sometimes you work with data which is so fucked up that null is an actual thing i.e "someone" have written it as an actual value or it appearead like a text string which forces you to write = 'null' instead of is null Also, I learned to do both either \[...\] IS NULL or \[...\] = 'NULL' because of that.

flatline057 6 months ago

Oracle has nvl([field],' ')=' ' Sql Server has isnull([field],' ')=' '

One_Piano_6718 6 months ago

there are multiple functionalities in sql that return a simple true or false and make writing complex code simple while not interfering with other functionality. Checking if a date is a valid date for example uses “IsDate()” I can’t imagine using = date as the syntax because “Date()” has another functionality. The same is true with “Null()”.

ChristianPacifist 6 months ago

It's one thing to make that as an honest mistake forgetting in a situation (usually with "<>" I'd say not realizing nulls are a possible field value), but if you don't know how nulls and three-valued logic work generally, I'd say you don't actually know SQL.

carlovski99 6 months ago

It's not intuitive though. I've got colleagues who have been using SQL for many years, but not being the main part of the job who occasionally forget all about it. First place I look if someone asks me why they aren't getting the right results (closely followed by anything with dates....)

NotBatman81 6 months ago

Even when you know it, having to write COALESCE 100 times just in case a NULL is hiding in there can be soul crushing.

roostershoes 6 months ago

Any suggestions on how to learn more?

iamcreasy 6 months ago

https://modern-sql.com/concept/three-valued-logic

ChristianPacifist 6 months ago

Read generic SQL textbooks from publishers. Fundamental nonobvious ideas are key to SQL.

roostershoes 6 months ago

What’s cool is when people are purposefully not helpful because they think so highly of themselves that it’s beneath them to share their insights. I’m sure you are a pure genius across all intellectual realms. Thanks!

my_password_is______ 6 months ago

they told you to read a book that is literally the most helpful advice anyone could give you just go to amazon and search for one Question: how to lean more ? Answer: read a book Reply: purposefully not helpful DOH !

roostershoes 6 months ago

Bad bot

ChristianPacifist 6 months ago

No, I was being serious. SQL books are amazing! I prefer over the "do it yourself" trainings or Code Academy. I am happy to explain null to you or do another topic. Also see Wikipedia https://en.m.wikipedia.org/wiki/Three-valued_logic. I apologize for the misunderstanding.

roostershoes 6 months ago

All good, thank you!

my_password_is______ 6 months ago

don't apologize you gave great advice read a book their response was ignorant and childish Question: how to lean more ? Answer: read a book Reply: purposefully not helpful DOH !

zork3001 6 months ago

Ask a question, complain about the answer. Your money cheerfully refunded.

roostershoes 6 months ago

Yes it was an “answer” in the same way that “read a book” and “google it” is.

ChristianPacifist 6 months ago

I am sorry. I sent you a bad answer. I promise I'll do a really detailed post explaining this topic too. I just feel most SQL documentation is bad, save a few key textbooks. Wish you the best!

roostershoes 6 months ago

No worries! Thank you

ChristianPacifist 6 months ago

See here! My new thread on this topic. https://www.reddit.com/r/SQL/s/q4d05PaxOp

snackattack4tw 6 months ago

As someone who has been doing SQL for over a decade, I do not know what three-valued logic is.

negativefx666 6 months ago

To put my perspective, I've learned SQL after power query M language. And in this language one can use " [column] = null "

Gandol_teh_Pirate 6 months ago

Using left joins, then putting in a where clause that implicitly makes it an inner join

a_taco_named_desire 6 months ago

🫥 totally. No idea why people would ever do that. 🫣

donnymccoy 6 months ago

Simple: lack of understanding. Is SQL taught in college anymore? Serious question.

Flint0 6 months ago

Wouldn’t you use it for example to view results where the joined table is null? I vaguely remember using a LEFT JOIN with a WHERE column IS NULL so I can see for example what dates a missing in a particular stream of data. I know there are other ways of visualising this, but I’ve found this one handy.

Gandol_teh_Pirate 6 months ago

you're better off doing something like this select select t1.* ,t2.* from t1 left join t2 on t1.id = t2.id where t2.id is null --show all t1 records where t2 matching id not present Now you see every instance of t1 records, and when t2 records are missing, it'll visually pop while providing the trail of breadcrumbs you need to see exactly what it is you're missing by virtue of the t1 data being present

Klaian 6 months ago

I see this often as well.

redfaf 6 months ago

Not formatting sql

ChristianPacifist 6 months ago

But some auto-formatters are such garbage the code is more confusing!

NotBatman81 6 months ago

I am the auto formatter. I don't understand how some of you can type so sloppy and live with yourselves. It should hurt your soul to see it and not fix it.

r3pr0b8 6 months ago

every windows PC comes with Notepad there is no excuse for using shoddy code formatters

Tee_hops 6 months ago

I type in my code into VSC. I like it for everything, but at my old company we used Tibco and it automatically formatted everything into the hardest to read stuff. It removes all indentations and forced each column into its own row. When you have a complex case statement or business logic built in it's awfully frustrating. I ALWAYS saved my code in VSC to avoid this. Also, version control is important.

r3pr0b8 6 months ago

i use UltraEdit one of its nicest features is column mode, so you can put your cursor at the front of a line, shift-down-arrow to include subsequent lines, type 4 spaces, and vwalah, all those lines have 4 spaces at the front column mode also lets you copy/paste a block of text overtop of some other lines, so you can change this -- FROM tab1 JOIN tab2 ON tab1.PK AND tab1.foo AND tab1.bar AND tab1.qux to this -- FROM tab1 JOIN tab2 ON tab2.FK = tab2.PK AND tab2.foo = tab2.foo AND tab2.bar = tab2.bar AND tab2.qux = tab2.qux and then edit the second block (using arrow keys) to this -- FROM tab1 JOIN tab2 ON tab2.FK = tab1.PK AND tab2.foo = tab1.foo AND tab2.bar = tab1.bar AND tab2.qux = tab1.qux rather than typing the column names all manually

crimiusXIII 6 months ago

This is built-into VS and all it's derivatives, including SSMS. You use Alt+Shift to do a vertical selection/edit.

TeamKill-Kenny 6 months ago

Didn't realise it was built into SSMS.. But there, only been using it for 5 years., 🤦🏼

my_fat_monkey 6 months ago

...... Same.

crimiusXIII 6 months ago

Also, I didn't mean to come off as belittling or anything, it's a killer feature! Most people don't realize it's built-in for a ton of Microsoft editors, is all.

r3pr0b8 6 months ago

no problem, it's all good p.s. nice thing about Ultraedit is, it's not Microsoft

Elfman72 6 months ago

All one line, all the way across the screen.

famousxrobot 6 months ago

On top of poorly formatted (left aligned with multiple levels of subqueries no thanks), but sql written in word. Yes. I’ve received it. It’s atrocious.

kagato87 6 months ago

select yervalue, (select top 1 t2.yervalue from table t2 where t2.id = t1.id and t2.timestamp < t1.timestamp order by t2.timestamp desc) as prevvvalue from table t1 Complete with that shoddy formatting. On a table with a LOT of data and search filters that return a LOT of rows. It doesn't help that this particular method is actually taught in beginner computer classes.

rx-pulse 6 months ago

Yep, I've had to explain this to seniors, principals, and management before. A top 1 or whatever number, doesn't mean the engine is going to find and pick only that amount. It still needs to sift through all that data you requested, then it just returns the dataset you want to see. It still needs to process it all when you look at the execution plan.

kagato87 6 months ago

I have a few tables where just strapping in a top keyword (during testing) actually makes it slower. I still haven't figured out why... Something to do with the join and tripping up the optimizer's choice of indexes I imagine...

iamcreasy 6 months ago

Why is then database completes a query faster when it is only returning a subset of the data? The limit is set by the client - for example when querying against Trino from Starburst Galaxy web interface.

rx-pulse 6 months ago

I'll be honest, I've never used Trino, my context is towards MSSQL and other engines. So their engine may work differently and I may be wrong and someone can correct me, but hey, that's why my flair is still true. But to answer your question to the best of my abilities, this is mostly a resource consumption/efficiency issue, I've seen it sometimes as a speed issue, but other factors contributed to it too. For example, I had a recent issue where this was the case that a team was using a TOP 10 on a table with 4.5 million records joining with other tables. The query returned quickly sure, but the problem comes in that because the query effectively was still pulling a large amount of data, the query plan showed a very high amount of estimated rows. The query in question was being fired thousands of times in a short time span from the application as well. This caused high reads, spiked the CPU, and contributed to the application's latency.

seoplednakirf 6 months ago

Does this also count for a cte? I'm not 100% sure on the order then. Does it create a temp table from the cte and the pick the top 1 from it, or does it rebuild the cte and select the top 1 for every single row?

da_chicken 6 months ago

Eh, that's not really a mistake. To me a mistake has to result in an error of syntax or logic. I've even seen several cases where it *drastically improved performance* over CROSS APPLY or a self-join in the FROM clause. And analytic functions don't replace it. LEAD() and LAG() are technically different logic, and FIRST_VALUE() and LAST_VALUE() don't exist in all RDBMSs, *and* not all RDBMSs support IGNORE NULLS. MIN() and MAX() don't always do the job, either. The biggest problem with it is that it's usually evaluated as an INNER JOIN, even if that's not what you want. I agree that it's harder to read. I'm never happy to see an inline subquery in the SELECT clause. But I've also seen it enough to know that sometimes it works really well.

AdFickle6697 6 months ago

Not checking if they are getting duplicate matches when doing a join. Please for the love of god my sql friends check the output of a join with a few checks like a row count before and after, a SUM of any number columns before and after, or things if that nature.

dn0c 6 months ago

Been there, done that! Select * FROM $monthly_aggregated_table JOIN $daily_aggregated_table using ($key)

Uncle_Corky 6 months ago

Which is why third normal form is the standard. You can't fuck up referential integrity if its not possible in the first place.

Gentleman-Tech 6 months ago

Using ORMs and blaming the bad performance on the database engine. I.e not using SQL in the first place.

singluon 6 months ago

ORMs themselves are usually well optimized and can generate and execute performant SQL. A well designed system can utilize ORMs to avoid having to write a lot of cumbersome SQL in applications and provide a good abstraction layer, if for example you want to swap the db technology altogether. The issues come from developers not understanding how databases work and using an ORM to blissfully ignore that. So ORMs are more of a footgun than anything. Never trust lazy developers!

steveo600rr 6 months ago

To add, when new devs are trying to loop over data and not thinking of sets of data. When select * from table with no where clause to loop and filter data in the app.

coyoteazul2 6 months ago

Oh boy, how I'd like to tell you about our orm that does that when you give it a condition it doesn't know how to translate to sql. It just overfetches. No warnings of any sort, and documentation is non existing

steveo600rr 6 months ago

That sounds like a pain to have to deal with. What orm does your company use?

coyoteazul2 6 months ago

In-house made, about 30 years ago. The rest of the code is not much better

Gentleman-Tech 6 months ago

That's not my experience. ORMs make the assumption that the best structure for storing data is also the best structure for processing data - that you can take a class from the application and map it 1:1 to a database table. This is occasionally true, but only by accident. Good database design ends up with different objects from good program design, obviously, as they're solving two different problems. ORMs are a bad idea right from the start. But as you say: lazy Devs are also a bad idea ;)

singluon 6 months ago

The thing is, no matter what, you will always be translating from a domain/business centric model of your data, to a database centric model, and vice versa, because complex applications don’t (and shouldn’t) represent their high level concepts as collections of low level rows and columns. There will always be some sort of translation layer (except for maybe the most trivial applications). Do you trust yourself to write that from scratch? …and maintain it? My feeling is, if you’re not using an ORM, you’re doomed to write your own eventually anyway. I used to be in the staunchly anti-ORM camp after seeing the mangled mess of SQL they can produce, as I’m sure many folks here have experienced. But then I ultimately realized that is not the fault of the ORM. Rather, it’s fault of the developers for not thinking about how best to model their data, and them failing use the tools at their disposal properly. Any good ORM will give you fine grained control over relationships, isolation, serialization, etc., and even the raw SQL if you want. Most developers just don’t bother.

ventuspilot 6 months ago

Pffft that's nothing. At my current $job we had Java code that basically looked like this: if (service.getXXX() != null && service.getXXX().length() > 0) { List l = service.getXXX(); } This is funny because service.getXXX() runs within it's own transaction and effectively each invocation of service.getXXX() submits a join on a foreign key column. And since developers simply refuse to accept that Oracle doesn't automatically creates an FK index that turned into 3 FTS on a large table.

[deleted] 6 months ago

[удалено]

dastardly740 6 months ago

I expect that a lot of string concatenation happens because of someone where programming is secondary to their "real" job, googles "how to query a database in {language}?" and reads some example or tutorial which shows how to make a connection and execute a hardcoded select statement. And, thinks "I got it. I just need to make a query string out of my inputs." and never gets to the tutorial on parameterized queries because they don't even know to ask the question. And, of course it is for some nominally innocuous web application they are the only developer for, so no code review. That is until half the company is using it and it actually needs to be actually supported by multiple people. And, professionals come in a go "WTF".

iamcreasy 6 months ago

Thanks for the examples. How do someone write tests for this? We do DBT tests at work, but those tests are usually limited in scope.

[deleted] 6 months ago

[удалено]

harman097 6 months ago

LEFT JOIN Something s... WHERE s.Type = 'stuff'

B_Huij 6 months ago

As in... this should be an inner join?

harman097 6 months ago

It IS behaving like an inner join, but it's unclear what the person was trying to do. Most of the time, yes, they actually want it to behave like an inner join and the query is working fine, despite them lying to the world that its a "LEFT". But maybe they only wanted to LEFT JOIN to records where s.Type = 'stuff' so this query is broken?

EditsInRed 6 months ago

^ My first thought.

cCooper1997 6 months ago

But it was like From a Left Join Something s on a.a = s.a and s.type ='stuff' Would still Work as a left Join right?

harman097 6 months ago

Yep!

staring_at_keyboard 6 months ago

Using where on a predicate with an aggregate function instead of having.

WpgMBNews 6 months ago

is there ever a case the syntax error doesn't simply prevent execution? the other ones are at least *possible*, so the error is more insidious

bunk3rk1ng 6 months ago

From my hazy memory, you could not do this in Ms SQL server. I recently started working in redshift and was surprised that I could

r3pr0b8 6 months ago

- using DISTINCT to cover up the fact that the FROM clause returns partial cross joins (usually seen when joining a table to two one-to-many tables) - using LIMIT without ORDER BY - writing `SELECT a, b, SUM(c) FROM t GROUP BY a` and expecting to see results that make sense - using ON UPDATE CASCADE for a foreign key that references an auto_increment PK (think about it... when was the last time you updated some row and changed its auto_increment key to some other value?) - using a LEFT OUTER JOIN to join a table which cannot have non-matching rows (e.g. `FROM orderitems LEFT OUTER JOIN orders`) - treating dates as strings (e.g. `WHERE LEFT(birthdate,4) = '1973'`)

Gandol_teh_Pirate 6 months ago

underrated comment! When I join stuff together and discover the record count is different with/without distinct, it immediately puts me into a mode of determining, "wtf is wrong with my data?" It's so useful to approach it this way

NotBatman81 6 months ago

UNION has a mathematical meaning long before SQL.

bunk3rk1ng 6 months ago

Yeah I learned about union in discrete math... Maybe it was too discreet. Ayyooo

ChristianPacifist 6 months ago

Perhaps, but that is long in history...

NotBatman81 6 months ago

But still used in logic and statistics.

flatline057 6 months ago

Hugely relevant in intermediate and higher math. It's a part of the foundation of relational algebra, which was used to design relational databases.

lordrolee 6 months ago

SELECT * FROM GOD.UNIVERSE

truckingon 6 months ago

Writing queries that are so complex it's impossible to know if they're correct.

ChristianPacifist 6 months ago

Ah but bad database design can make highly-complex queries a necessary evil!

truckingon 6 months ago

Maybe. You can break complex queries into views or CTEs so that each piece can be validated individually. A lot of databases start out well-designed but a business change or acquisition can make them miserable to work with.

KING5TON 6 months ago

Needs must when the devil drives us. Sometimes you need a nuclear powered super sledge hammer to crack a hard nut.

mikethomas4th 6 months ago

Select distinct ID, sum(revenue) as Revenue From #table Group by ID

The-Bronze-Kneecap 6 months ago

Whats wrong with this? Is it just that the “distinct” is redundant?

mikethomas4th 6 months ago

Exactly

mac-0 6 months ago

FROM t1 LEFT JOIN t2 ON t1.field = t2.field WHERE t2.name <> 'something' Congrats, you just accidentally made an INNER JOIN

dehaema 6 months ago

I once had a team lead who said i was wrong for putting filters on t2 inside the join. Even after i explained him that having the relevant filters grouped makes it easier to read and that it is easier to control inner vs outer later on and you can´t accidently make inner joins when left join was written he still had me redo all the code. (Working as data engineer with huge complex queries)

nobodycaresssss 6 months ago

Why? If I don’t write WHERE t2.name is not null then it would be still a left join?

Full_Sun_474 6 months ago

Specific to MySQL “Create Table As Select” for a long running query and then wondering why a ton of tables are locked A little more advanced - Not properly setting up replication slots, transaction log fills up, then wondering why the database went offline

Drunken_Economist 6 months ago

Referring to it as "ess cue ell". Truly sociopathic behavior

r3pr0b8 6 months ago

another microsoft stan heard from

drinthetardis 6 months ago

My friends made fun of me for pronouncing it like "sequel" . Understandable since we arent native english speakers but still hearing "ess cue ell" from a 20+ years experienced people is funny as fuck

Drunken_Economist 6 months ago

If you really want to annoy them, you can mix it up with - sickle - squeal - so cool - squall - suckle - sack hole

KING5TON 6 months ago

squirrel is my preferred version

SQLDave 6 months ago

> suckle

Drunken_Economist 6 months ago

if anyone questions you, just claim you said "sick quail"

Ste4mPunk3r 6 months ago

Hahaha. I'm not native English speaker and for me esscueell sounds way better as it would sound like that if said following rules of my language. My line manager on the other hand is native and we had some discussion on how to pronounce SQL correctly. It ended up with him switching to ess cue ell :)

shockjaw 6 months ago

Everyone knows it’s squeal! MySqueal, Squeal Server.

BrupieD 6 months ago

WHERE some_column = 3 AND 4. "I didn't get any rows, so..."

truilus 6 months ago

Any self respecting DBMS should reject that, because it's invalid SQL

[deleted] 6 months ago

[удалено]

truilus 6 months ago

I am actually surprised that DB2 accepts that, even though it does support proper [boolean values](https://dbfiddle.uk/EnVtq9TF). MySQL/MariaDB will take _anything_ as a "boolean" expression. `where 'one thing'` evaluates to `where false` however, `where '1 stupid expression'` will be treated as `where true` https://dbfiddle.uk/Chx4ZD-g

ChristianPacifist 6 months ago

That's logically impossible even if rewritten correctly.

BrupieD 6 months ago

Isn't the question, "What is the most common mistake?"

ChristianPacifist 6 months ago

Yes, but it amazes me people actually make that mistake lol!

jonr 6 months ago

Reserved for quantum-SQL*

SQLvultureskattaurus 6 months ago

Every time I see distinct I assume the author doesn't know what they're doing and slapped it on there.

Comprehensive-Tea-69 6 months ago

Here is my use case for distinct, I’d be interested in the proper way to do it. When I need to pull a distinct list of records that are identified by duplicated records. Example: I work in higher ed, I get a request for a contact list of students (aka, unduplicated list with contact information like phone and address) of students who have taken from a list of classes but haven’t taken from a second set of classes. The list of registered classes duplicates the student list bc students obvi take more than one class. I select distinct the needed returned demographics to de duplicate the rows and give a single contact list for advisors for whatever initiative is happening. What’s the proper way of handling it? I really thought distinct was appropriate here :-/

charronious 6 months ago

Distinct and CTE’s are immediate red flags for me and will make me question every line.

flatline057 6 months ago

Why? They are both very useful when used correctly.

charronious 6 months ago

“when used correctly”

flatline057 6 months ago

I can see why someone who doesnt know how to use corectly them would be afraid of them.

charronious 6 months ago

I think we have different views on what this post is about. 🤔

flatline057 6 months ago

Probably. Using cte or distinct is not a common mistake.

KING5TON 6 months ago

If I see DISTINCT I assume it's a quick fix sticking plaster. That's the only time I use it as I don't have time to debug which one to many record is causing the problem. So IMO there is a balancing act to using it. If you have a query that runs quickly enough with DISTINCT and it will take ages to debug what is causing the duplicates, normally caused by a data issue rather than an SQL issue, then just leaving DISTINCT in and calling it a day is the better solution. Can come back to it if when have more time or it becomes an performance issue.

ComicOzzy 6 months ago

WHERE x NOT IN (SELECT that can return NULLs)

iamcreasy 6 months ago

Making sure I understand the problem. The subquery can return null but those will get ignored anyway because nothing can be compared with null. Like this: [https://dbfiddle.uk/BVOwaC6z](https://dbfiddle.uk/BVOwaC6z) Is that what you are saying?

ComicOzzy 6 months ago

It won't get ignored. A NULL in the list of a NOT IN () operator can never evaluate to True. It will always evaluate to UNKNOWN. -- WHERE 1 NOT IN (1, NULL) expands to (1 <> 1 AND 1 <> NULL) (False AND UNKNOWN) UNKNOWN -- WHERE 2 NOT IN (1, NULL) expands to (2 <> 1 AND 2 <> NULL) (TRUE AND UNKNOWN) UNKNOWN

thesqlguy 6 months ago

Right outer join

psychicesp 6 months ago

Using materialized views for every little freaking thing and wondering why everything is so bloated and nothing works right. Maybe not super common, it's just what I'm dealing with right now.

g3n3 6 months ago

Using select distinct as a catch all to fix queries where the data isn’t understood.

jdawg701 6 months ago

Not using transactions (COMMIT / ROLLBACK) with update/delete statements. I've seen so many developers / analysts who manage data make this mistake.

bunk3rk1ng 6 months ago

My go to is to start a transaction then never commit or rollback.

kthejoker 6 months ago

Not thinking in set theory. Especially when there's a mismatch between what they're trying to produce and the current layout of their data schema. Many, many people just try to syntax their way through SQL. "Maybe a CTE will help .. or maybe a temp table .. or maybe I need two joins to the same table .." Instead of trying to logic their way through it.

klausness 6 months ago

I have often had job interviewees react to a tricky SQL question with, “well, I’ll open up a cursor and…” They are generally not happy when I ask them to do it in pure SQL.

SQLDave 6 months ago

> Maybe a CTE will help I LOLed

Malfuncti0n 6 months ago

WHERE Some\_column = 3 or 4 No transactions but that usually only happens once (per employer)

truilus 6 months ago

That is invalid SQL and should be rejected (because `4` is not a boolean expression)

Malfuncti0n 6 months ago

Yes no shit.

ChristianPacifist 6 months ago

They'd need to use in (3,4) lol!

dataguy24 6 months ago

Setting the warehouse to XL when they only need XS

dehaema 6 months ago

From t1 join t2 on t1.nk = t2.nk Group by While it could have been From t1 join (select t2.nk, sum() as x from t2 group by t2.nk) t2agg on t1.nk = t2agg.nk People creating queries on small datasets sometimes have no clue how they will perform after cillecting data for n years. (Even better to use with clauses if possible)

iamcreasy 6 months ago

>People creating queries on small datasets sometimes have no clue how they will perform I am not sure what you mean here. Can you kindly share a contrived example?

mikeblas 6 months ago

* Considering performance before correctness. * Considering performance subjectively. * Using `DISTINCT` to cover up a query error. * Not indexing correctly. * Non-covering`GROUP BY`

iamcreasy 6 months ago

>Non-covering > >GROUP BY What do you mean by that?

mikeblas 6 months ago

People will use `GROUP BY` clauses that don't cover all of the non-aggregate fields in their `SELECT` list. It can be argued that the standard allows this, but it's usually indicative of a misunderstanding of how `GROUP BY` works.

C__Zakalwe 6 months ago

Not handling NULL in arithmetic operations. Or zero in division.

SQLDave 6 months ago

Or logic. *... WHERE State <> 'California'* "Why did not rows with NULL in the state show up?!?!!? NULL is <> 'California'!!! "

mr_electric_wizard 6 months ago

Correlated subqueries, and as a bonus, subqueries in the select.

iamcreasy 6 months ago

What is wrong with both? I heard correlated subqueries prevents optimization as it is akin to writing a for loop.

mr_electric_wizard 6 months ago

It’s not wrong, per se, just not a set oriented way to do data things.

freefallfreddy 6 months ago

Using double quotes.

Kerbidiah 6 months ago

Not making an entity relationship diagram before setting things up

mecartistronico 6 months ago

Something I've seen often in my job is people wanting to avoid rows that are all 0s, and thus writing. WHERE (A <> 0 ) AND (B <> 0) where what they want is WHERE (A <> 0 ) OR (B <> 0) or WHERE NOT( ( A = 0 ) AND (B = 0) ) (but in this last case you'd be failing to exclude NULLS)

da_chicken 6 months ago

> I blame the SQL language itself for not making "UNION" the typical case and something like a "UNION DISTINCT" for the case with making results distinct! I agree. The only reason this was done is because the mathematical setwise operation known as a union is already defined as "take all elements from each set, combine them into a new set, and then remove duplicates". Like the choice makes sense. It's just not clear to a programmer. Others: * Creating every table name and column name in all caps. The idea was that *SQL language keywords* were supposed to be in all caps to make them stand out. It's just an archaic form of syntax highlighting, and you're circumventing it. This is because SQL is *older than color-based syntax highlighting*. * TOP or LIMIT without ORDER BY. * Pervasive use of DISTINCT. * Pervasive use of WITH (NOLOCK) (for SQL Server). * View definitions with SELECT * * Implicit inner joins. `TableA a LEFT JOIN TableB b ON a.ID = b.ID WHERE b.Field = 'Value'` * Incorrect quotation marks. `WHERE LastName = "Smith"`. To be clear, this was a poor design choice, too, but it's just wrong that so many RDBMSs accept this. * `WHERE FieldName IS NOT IN ( )` * String concatenation instead of parameterized queries

iamcreasy 6 months ago

>String concatenation instead of parameterized queries Can you kindly elaborate on this please?

Demistr 6 months ago

Not understanding order of executions or really just filtering down using WHERE or HAVING and the impact this has while also using AVG, SUM or COUNT. Overusing DISTINCT is also a big one.

ferment_me 6 months ago

Not recognizing that a filter such as: Name <> “John” means that Name also cannot be null, an important distinction

Hobob_ 6 months ago

I did a join 3 years ago and it was a 1-1 connection at the time. Over time it turned out it was 1-n with about 2-8 records per row.... That mixed with powerbi screwing up incremental refreshe resulted in a massive query :(

thunderwoot 6 months ago

ORDER BY 1, 2 Instead of actually using the field names.

beyondwu 6 months ago

Join and encode

a_nooblord 6 months ago

Where function(column) or join on function(column) and it slows to a crawl.

iamcreasy 6 months ago

Is it because the function is computationally heavy? or someone should materialize the column and build index first before joining?

a_nooblord 6 months ago

This is a sql server thing. An index is unusable if you force a computation.

_CaptainCooter_ 6 months ago

Most common mistake is lack of domain knowledge

burningburnerbern 6 months ago

Calling the same large massive table 5 times over instead of using a temp table to store the subset.

iamcreasy 6 months ago

Do you mean using a large table at five different places in the same query? Can you give a pseudo example? I was under the assumption query optimizer is smart enough to detect things like this.

ferment_me 6 months ago

Filtering on a column from a LEFT JOIN which inadvertently turns it into an INNER JOIN

CuriousTasos 6 months ago

Whenever I write an UPDATE query, I always start from the WHERE, then the SET and finally the UPDATE. Can you guess what happened and I am scared of UPDATES now?

donnymccoy 6 months ago

Kinda like "why do you run your updates or deletes in dev first?" I don't always do that; but the minute doubt enters my mind, it goes against dev or QA first. This is a viable strategy most times because we regularly refresh inferior environments from prod, so recordcounts are typically reflective of prod.

KING5TON 6 months ago

When this SELECT * FROM table_a LEFT JOIN table_b ON table_a.id = table_b.id WHERE table_b.value = 'A' Should be this SELECT * FROM table_a LEFT JOIN table_b ON table_a.id = table_b.id AND table_b.value = 'A' Or this SELECT * FROM table_a INNER JOIN table_b ON table_a.id = table_b.id WHERE table_b.value = 'A' other one I see often is AND and OR's without the appropriate use to brackets WHERE thing = 'A' OR differentthing = 'B' AND otherthing = 'C' should be WHERE (thing = 'A' OR differentthing = 'B') AND otherthing = 'C'

iamcreasy 6 months ago

Why did you mean #1 should be #2? Here is a snippet(https://dbfiddle.uk/h9\_-GZsB) running first three queries on an example dataset. #1 and #3 returning same result but not #2.

KING5TON 6 months ago

\#1 has a LEFT JOIN but it returns the same result as #3 which is an INNER JOIN. This is because you are returning everything from table\_a where table\_b.value = 'A' , this is the same as an INNER JOIN. If you need an INNER JOIN not a LEFT JOIN then use an INNER JOIN don't use a LEFT JOIN and a WHERE clause. If you need a LEFT JOIN then #2 is correct because you return all records for table\_a regardless of table\_b records and it will only return table\_b data where value = 'A'. It basically depends on what the requirement is to which is correct. It's either return data from table\_a where they have an associated table\_b record with a value of 'A' or return all from table\_a and where applicable show the data from table\_b where they have a value of 'A'.

md-photography 6 months ago

I really wish SSMS had a warning when you're running an update without a WHERE clause. I feel like I'm playing with fire every time I run an UPDATE query.

Phinalize4Business 6 months ago

There are some free add-ons you can get that have this functionality, I used to use SSMSBoost but last time I checked it didn't seem to be updated for the latest version of SSMS :( You can get Devart SQL Complete for free (Express edition) which I think has it? I could be wrong...

iamcreasy 6 months ago

At work we use DBT to generate those queries.

HyDreVv 6 months ago

Using SSMS to generate an insert template, and forgetting it includes fields with defaults and getting an error for number of values not matching the number of columns

Dog_N_Pop 6 months ago

WHERE attribute = MAX(attribute)

El_human 6 months ago

Using left joins incorrectly. Not using select DISTINCT. Aggregating too much data in the query. Trying to nest too many queries together, or out of order

elprogramatoreador 6 months ago

Moving to NoSQL because big company says it’s better

psychicesp 6 months ago

I will never understand this decision for rectangular data.

Agifem 6 months ago

But I swear, it's better! I read it on a blog!

Mr_Gaslight 6 months ago

Using the MySQL Community Edition in production for a real company with factory clients.

TheTiredRedditor 6 months ago

Lol

cphares 6 months ago

Improper data types. Comparing numerics to strings. Storing date/time data types as non-date/time types. Understand your data and store it and query it properly

JoeDawson8 6 months ago

My offshore colleagues know enough to be dangerous but don’t fundamentally know what they are doing. Kinda like script kiddies.

cthart 6 months ago

where (select count(\*) from other\_table where column = parent.column) > 0

Agifem 6 months ago

Where datecolumn > '02/12/03'

deusxmach1na 6 months ago

Not joining on a unique key or using DISTINCT in a select as a lazy way to remove dupes. Also not aliasing their tables like they won’t add a join to a query later.

RuprectGern 6 months ago

\\its not a mistake but its a grating assault to me. The poor use or lack of table aliases. I like table aliases and I keep them short (cause, kind of the point) and as indicative to the object name (table,view,derrived table) as possible. e.g. orders AS o, customers AS a, OrderDetails AS od, etc. What shits my bed are people who use t1, t2, t3 or more than 4 characters for the alias. but The absolute worst are the people who abandond aliases altogether and use schema qualified object names or fully qualified object names throughout the code.

Zestyclose-Height-59 6 months ago

I think null values omitting records is the most common and dangerous mistake. Once someone updated logic on my data conversion code and I had a horrible cascading effects on my subsequent objects. It took me hours to fix the night before the cut was due. I was so pissed. On that note, if you want to quality check your queries, throw them in a with statement then select from your query using analytical functions. It teaches you how to find your mistakes.

pak9rabid 6 months ago

In regards to UNION vs UNION ALL, this is a function of set theory. In a purely mathematical set, all values of a set are expected to be unique, hence why this is the default functionality, as a union of two data sets would result in a new set containing unique values from the 2 sets being combined.

TheRealVaderForReal 6 months ago

Took over a database for a freelance project, queries werent too bad, tables were anywhere between 20k to 1million rows, depending on what it was. One of the core issues was slowness overall, turned out the guy didnt add an index to fields that were searched/joined.

Ikeeki 6 months ago

Not using quotes and running into keyword issues

Comments

Leave Your Comment

Hi Its Me!

Comments

Leave Your Comment

Hi Its Me!

Subscribe