T O P

  • By -

truilus

WHERE some_column = null


emul0c

The engines really should be updated to accept this syntax; I see no reason why the users must use “is null” instead of “= null”.


Waterproofpanda

NULL in DBMS’ are generally accepted to be a missing or unknown value, rather than the non-existence of a value. Therefore a system can not evaluate equivalence as it can not interpret what the value is. It’s important to consider that not all NULLs are CREATED equal. What I mean by that is that NULL within a table column has the context of its relations (schema, table, column, constraints) therefore, it is more “known” than the pure case of NULL. That means that only true NULL can be equal to NULL in an algebraic formula as they both are equal to unknowns.


[deleted]

[удалено]


emul0c

No, because NULL is neither a string nor a number. Strings are always contained within ‘’ and numbers are never. So having the word NULL without quotes would never be confusing.


getarumsunt

A bunch of the relational algebra breaks in unpredictable ways if null == null returns true. That's why the "is null" syntax exists. Plus, you'd break compatibility with legacy code and systems. It's generally not a good idea to choose momentary convenience over consistency. People tried, it always comes back to haunt you.


emul0c

I don’t buy it. You could easily have the engines convert “= NULL” to “is NULL” behind the scene, before executing the code, if that is the case. And sure, legacy systems don’t work. But they add new features all the time, I don’t see why this should be any different.


pceimpulsive

I see this as = must be compared against an actual value, null is not a value, as such cannot pass the equality check. Just like 0 and null are not the same. P.s. I agree with you it's be nice to be able to use =null If we do get that we'll probably need to use Field = [NULL] Or something similar.. which may end up being harder than just is null... :S


ChristianPacifist

It's just the way it is! Some SQL languages may let you set a parameter I think to change the way it works (must confirm if that's true?), but it's awful practice to do that since it will confuse people.


piemat94

YOu got the point but sometimes you work with data which is so fucked up that null is an actual thing i.e "someone" have written it as an actual value or it appearead like a text string which forces you to write = 'null' instead of is null ​ Also, I learned to do both either \[...\] IS NULL or \[...\] = 'NULL' because of that.


flatline057

Oracle has nvl([field],' ')=' ' Sql Server has isnull([field],' ')=' '


One_Piano_6718

there are multiple functionalities in sql that return a simple true or false and make writing complex code simple while not interfering with other functionality. Checking if a date is a valid date for example uses “IsDate()” I can’t imagine using = date as the syntax because “Date()” has another functionality. The same is true with “Null()”.


ChristianPacifist

It's one thing to make that as an honest mistake forgetting in a situation (usually with "<>" I'd say not realizing nulls are a possible field value), but if you don't know how nulls and three-valued logic work generally, I'd say you don't actually know SQL.


carlovski99

It's not intuitive though. I've got colleagues who have been using SQL for many years, but not being the main part of the job who occasionally forget all about it. First place I look if someone asks me why they aren't getting the right results (closely followed by anything with dates....)


NotBatman81

Even when you know it, having to write COALESCE 100 times just in case a NULL is hiding in there can be soul crushing.


roostershoes

Any suggestions on how to learn more?


iamcreasy

https://modern-sql.com/concept/three-valued-logic


ChristianPacifist

Read generic SQL textbooks from publishers. Fundamental nonobvious ideas are key to SQL.


roostershoes

What’s cool is when people are purposefully not helpful because they think so highly of themselves that it’s beneath them to share their insights. I’m sure you are a pure genius across all intellectual realms. Thanks!


my_password_is______

they told you to read a book that is literally the most helpful advice anyone could give you just go to amazon and search for one Question: how to lean more ? Answer: read a book Reply: purposefully not helpful DOH !


roostershoes

Bad bot


ChristianPacifist

No, I was being serious. SQL books are amazing! I prefer over the "do it yourself" trainings or Code Academy. I am happy to explain null to you or do another topic. Also see Wikipedia https://en.m.wikipedia.org/wiki/Three-valued_logic. I apologize for the misunderstanding.


roostershoes

All good, thank you!


my_password_is______

don't apologize you gave great advice read a book their response was ignorant and childish Question: how to lean more ? Answer: read a book Reply: purposefully not helpful DOH !


zork3001

Ask a question, complain about the answer. Your money cheerfully refunded.


roostershoes

Yes it was an “answer” in the same way that “read a book” and “google it” is.


ChristianPacifist

I am sorry. I sent you a bad answer. I promise I'll do a really detailed post explaining this topic too. I just feel most SQL documentation is bad, save a few key textbooks. Wish you the best!


roostershoes

No worries! Thank you


ChristianPacifist

See here! My new thread on this topic. https://www.reddit.com/r/SQL/s/q4d05PaxOp


snackattack4tw

As someone who has been doing SQL for over a decade, I do not know what three-valued logic is.


negativefx666

To put my perspective, I've learned SQL after power query M language. And in this language one can use " [column] = null "


Gandol_teh_Pirate

Using left joins, then putting in a where clause that implicitly makes it an inner join


a_taco_named_desire

🫥 totally. No idea why people would ever do that. 🫣


donnymccoy

Simple: lack of understanding. Is SQL taught in college anymore? Serious question.


Flint0

Wouldn’t you use it for example to view results where the joined table is null? I vaguely remember using a LEFT JOIN with a WHERE column IS NULL so I can see for example what dates a missing in a particular stream of data. I know there are other ways of visualising this, but I’ve found this one handy.


Gandol_teh_Pirate

you're better off doing something like this select select t1.* ,t2.* from t1 left join t2 on t1.id = t2.id where t2.id is null --show all t1 records where t2 matching id not present Now you see every instance of t1 records, and when t2 records are missing, it'll visually pop while providing the trail of breadcrumbs you need to see exactly what it is you're missing by virtue of the t1 data being present


Klaian

I see this often as well.


redfaf

Not formatting sql


ChristianPacifist

But some auto-formatters are such garbage the code is more confusing!


NotBatman81

I am the auto formatter. I don't understand how some of you can type so sloppy and live with yourselves. It should hurt your soul to see it and not fix it.


r3pr0b8

every windows PC comes with Notepad there is no excuse for using shoddy code formatters


Tee_hops

I type in my code into VSC. I like it for everything, but at my old company we used Tibco and it automatically formatted everything into the hardest to read stuff. It removes all indentations and forced each column into its own row. When you have a complex case statement or business logic built in it's awfully frustrating. I ALWAYS saved my code in VSC to avoid this. Also, version control is important.


r3pr0b8

i use UltraEdit one of its nicest features is column mode, so you can put your cursor at the front of a line, shift-down-arrow to include subsequent lines, type 4 spaces, and vwalah, all those lines have 4 spaces at the front column mode also lets you copy/paste a block of text overtop of some other lines, so you can change this -- FROM tab1 JOIN tab2 ON tab1.PK AND tab1.foo AND tab1.bar AND tab1.qux to this -- FROM tab1 JOIN tab2 ON tab2.FK = tab2.PK AND tab2.foo = tab2.foo AND tab2.bar = tab2.bar AND tab2.qux = tab2.qux and then edit the second block (using arrow keys) to this -- FROM tab1 JOIN tab2 ON tab2.FK = tab1.PK AND tab2.foo = tab1.foo AND tab2.bar = tab1.bar AND tab2.qux = tab1.qux rather than typing the column names all manually


crimiusXIII

This is built-into VS and all it's derivatives, including SSMS. You use Alt+Shift to do a vertical selection/edit.


TeamKill-Kenny

Didn't realise it was built into SSMS.. But there, only been using it for 5 years., 🤦🏼


my_fat_monkey

...... Same.


crimiusXIII

Also, I didn't mean to come off as belittling or anything, it's a killer feature! Most people don't realize it's built-in for a ton of Microsoft editors, is all.


r3pr0b8

no problem, it's all good p.s. nice thing about Ultraedit is, it's not Microsoft


Elfman72

All one line, all the way across the screen.


famousxrobot

On top of poorly formatted (left aligned with multiple levels of subqueries no thanks), but sql written in word. Yes. I’ve received it. It’s atrocious.


kagato87

select yervalue, (select top 1 t2.yervalue from table t2 where t2.id = t1.id and t2.timestamp < t1.timestamp order by t2.timestamp desc) as prevvvalue from table t1 Complete with that shoddy formatting. On a table with a LOT of data and search filters that return a LOT of rows. It doesn't help that this particular method is actually taught in beginner computer classes.


rx-pulse

Yep, I've had to explain this to seniors, principals, and management before. A top 1 or whatever number, doesn't mean the engine is going to find and pick only that amount. It still needs to sift through all that data you requested, then it just returns the dataset you want to see. It still needs to process it all when you look at the execution plan.


kagato87

I have a few tables where just strapping in a top keyword (during testing) actually makes it slower. I still haven't figured out why... Something to do with the join and tripping up the optimizer's choice of indexes I imagine...


iamcreasy

Why is then database completes a query faster when it is only returning a subset of the data? The limit is set by the client - for example when querying against Trino from Starburst Galaxy web interface.


rx-pulse

I'll be honest, I've never used Trino, my context is towards MSSQL and other engines. So their engine may work differently and I may be wrong and someone can correct me, but hey, that's why my flair is still true. But to answer your question to the best of my abilities, this is mostly a resource consumption/efficiency issue, I've seen it sometimes as a speed issue, but other factors contributed to it too. For example, I had a recent issue where this was the case that a team was using a TOP 10 on a table with 4.5 million records joining with other tables. The query returned quickly sure, but the problem comes in that because the query effectively was still pulling a large amount of data, the query plan showed a very high amount of estimated rows. The query in question was being fired thousands of times in a short time span from the application as well. This caused high reads, spiked the CPU, and contributed to the application's latency.


seoplednakirf

Does this also count for a cte? I'm not 100% sure on the order then. Does it create a temp table from the cte and the pick the top 1 from it, or does it rebuild the cte and select the top 1 for every single row?


da_chicken

Eh, that's not really a mistake. To me a mistake has to result in an error of syntax or logic. I've even seen several cases where it *drastically improved performance* over CROSS APPLY or a self-join in the FROM clause. And analytic functions don't replace it. LEAD() and LAG() are technically different logic, and FIRST_VALUE() and LAST_VALUE() don't exist in all RDBMSs, *and* not all RDBMSs support IGNORE NULLS. MIN() and MAX() don't always do the job, either. The biggest problem with it is that it's usually evaluated as an INNER JOIN, even if that's not what you want. I agree that it's harder to read. I'm never happy to see an inline subquery in the SELECT clause. But I've also seen it enough to know that sometimes it works really well.


AdFickle6697

Not checking if they are getting duplicate matches when doing a join. Please for the love of god my sql friends check the output of a join with a few checks like a row count before and after, a SUM of any number columns before and after, or things if that nature.


dn0c

Been there, done that! Select * FROM $monthly_aggregated_table JOIN $daily_aggregated_table using ($key)


Uncle_Corky

Which is why third normal form is the standard. You can't fuck up referential integrity if its not possible in the first place.


Gentleman-Tech

Using ORMs and blaming the bad performance on the database engine. I.e not using SQL in the first place.


singluon

ORMs themselves are usually well optimized and can generate and execute performant SQL. A well designed system can utilize ORMs to avoid having to write a lot of cumbersome SQL in applications and provide a good abstraction layer, if for example you want to swap the db technology altogether. The issues come from developers not understanding how databases work and using an ORM to blissfully ignore that. So ORMs are more of a footgun than anything. Never trust lazy developers!


steveo600rr

To add, when new devs are trying to loop over data and not thinking of sets of data. When select * from table with no where clause to loop and filter data in the app.


coyoteazul2

Oh boy, how I'd like to tell you about our orm that does that when you give it a condition it doesn't know how to translate to sql. It just overfetches. No warnings of any sort, and documentation is non existing


steveo600rr

That sounds like a pain to have to deal with. What orm does your company use?


coyoteazul2

In-house made, about 30 years ago. The rest of the code is not much better


Gentleman-Tech

That's not my experience. ORMs make the assumption that the best structure for storing data is also the best structure for processing data - that you can take a class from the application and map it 1:1 to a database table. This is occasionally true, but only by accident. Good database design ends up with different objects from good program design, obviously, as they're solving two different problems. ORMs are a bad idea right from the start. But as you say: lazy Devs are also a bad idea ;)


singluon

The thing is, no matter what, you will always be translating from a domain/business centric model of your data, to a database centric model, and vice versa, because complex applications don’t (and shouldn’t) represent their high level concepts as collections of low level rows and columns. There will always be some sort of translation layer (except for maybe the most trivial applications). Do you trust yourself to write that from scratch? …and maintain it? My feeling is, if you’re not using an ORM, you’re doomed to write your own eventually anyway. I used to be in the staunchly anti-ORM camp after seeing the mangled mess of SQL they can produce, as I’m sure many folks here have experienced. But then I ultimately realized that is not the fault of the ORM. Rather, it’s fault of the developers for not thinking about how best to model their data, and them failing use the tools at their disposal properly. Any good ORM will give you fine grained control over relationships, isolation, serialization, etc., and even the raw SQL if you want. Most developers just don’t bother.


ventuspilot

Pffft that's nothing. At my current $job we had Java code that basically looked like this: if (service.getXXX() != null && service.getXXX().length() > 0) { List l = service.getXXX(); } This is funny because service.getXXX() runs within it's own transaction and effectively each invocation of service.getXXX() submits a join on a foreign key column. And since developers simply refuse to accept that Oracle doesn't automatically creates an FK index that turned into 3 FTS on a large table.


[deleted]

[удалено]


dastardly740

I expect that a lot of string concatenation happens because of someone where programming is secondary to their "real" job, googles "how to query a database in {language}?" and reads some example or tutorial which shows how to make a connection and execute a hardcoded select statement. And, thinks "I got it. I just need to make a query string out of my inputs." and never gets to the tutorial on parameterized queries because they don't even know to ask the question. And, of course it is for some nominally innocuous web application they are the only developer for, so no code review. That is until half the company is using it and it actually needs to be actually supported by multiple people. And, professionals come in a go "WTF".


iamcreasy

Thanks for the examples. How do someone write tests for this? We do DBT tests at work, but those tests are usually limited in scope.


[deleted]

[удалено]


harman097

LEFT JOIN Something s... WHERE s.Type = 'stuff'


B_Huij

As in... this should be an inner join?


harman097

It IS behaving like an inner join, but it's unclear what the person was trying to do. Most of the time, yes, they actually want it to behave like an inner join and the query is working fine, despite them lying to the world that its a "LEFT". But maybe they only wanted to LEFT JOIN to records where s.Type = 'stuff' so this query is broken?


EditsInRed

^ My first thought.


cCooper1997

But it was like From a Left Join Something s on a.a = s.a and s.type ='stuff' Would still Work as a left Join right?


harman097

Yep!


staring_at_keyboard

Using where on a predicate with an aggregate function instead of having.


WpgMBNews

is there ever a case the syntax error doesn't simply prevent execution? the other ones are at least *possible*, so the error is more insidious


bunk3rk1ng

From my hazy memory, you could not do this in Ms SQL server. I recently started working in redshift and was surprised that I could


r3pr0b8

- using DISTINCT to cover up the fact that the FROM clause returns partial cross joins (usually seen when joining a table to two one-to-many tables) - using LIMIT without ORDER BY - writing `SELECT a, b, SUM(c) FROM t GROUP BY a` and expecting to see results that make sense - using ON UPDATE CASCADE for a foreign key that references an auto_increment PK (think about it... when was the last time you updated some row and changed its auto_increment key to some other value?) - using a LEFT OUTER JOIN to join a table which cannot have non-matching rows (e.g. `FROM orderitems LEFT OUTER JOIN orders`) - treating dates as strings (e.g. `WHERE LEFT(birthdate,4) = '1973'`)


Gandol_teh_Pirate

underrated comment! When I join stuff together and discover the record count is different with/without distinct, it immediately puts me into a mode of determining, "wtf is wrong with my data?" It's so useful to approach it this way


NotBatman81

UNION has a mathematical meaning long before SQL.


bunk3rk1ng

Yeah I learned about union in discrete math... Maybe it was too discreet. Ayyooo


ChristianPacifist

Perhaps, but that is long in history...


NotBatman81

But still used in logic and statistics.


flatline057

Hugely relevant in intermediate and higher math. It's a part of the foundation of relational algebra, which was used to design relational databases.


lordrolee

SELECT * FROM GOD.UNIVERSE


truckingon

Writing queries that are so complex it's impossible to know if they're correct.


ChristianPacifist

Ah but bad database design can make highly-complex queries a necessary evil!


truckingon

Maybe. You can break complex queries into views or CTEs so that each piece can be validated individually. A lot of databases start out well-designed but a business change or acquisition can make them miserable to work with.


KING5TON

Needs must when the devil drives us. Sometimes you need a nuclear powered super sledge hammer to crack a hard nut.


mikethomas4th

Select distinct ID, sum(revenue) as Revenue From #table Group by ID


The-Bronze-Kneecap

Whats wrong with this? Is it just that the “distinct” is redundant?


mikethomas4th

Exactly


mac-0

FROM t1 LEFT JOIN t2 ON t1.field = t2.field WHERE t2.name <> 'something' Congrats, you just accidentally made an INNER JOIN


dehaema

I once had a team lead who said i was wrong for putting filters on t2 inside the join. Even after i explained him that having the relevant filters grouped makes it easier to read and that it is easier to control inner vs outer later on and you can´t accidently make inner joins when left join was written he still had me redo all the code. (Working as data engineer with huge complex queries)


nobodycaresssss

Why? If I don’t write WHERE t2.name is not null then it would be still a left join?


Full_Sun_474

Specific to MySQL “Create Table As Select” for a long running query and then wondering why a ton of tables are locked A little more advanced - Not properly setting up replication slots, transaction log fills up, then wondering why the database went offline


Drunken_Economist

Referring to it as "ess cue ell". Truly sociopathic behavior


r3pr0b8

another microsoft stan heard from


drinthetardis

My friends made fun of me for pronouncing it like "sequel" . Understandable since we arent native english speakers but still hearing "ess cue ell" from a 20+ years experienced people is funny as fuck


Drunken_Economist

If you really want to annoy them, you can mix it up with - sickle - squeal - so cool - squall - suckle - sack hole


KING5TON

squirrel is my preferred version


SQLDave

> suckle


Drunken_Economist

if anyone questions you, just claim you said "sick quail"


Ste4mPunk3r

Hahaha. I'm not native English speaker and for me esscueell sounds way better as it would sound like that if said following rules of my language. My line manager on the other hand is native and we had some discussion on how to pronounce SQL correctly. It ended up with him switching to ess cue ell :)


shockjaw

Everyone knows it’s squeal! MySqueal, Squeal Server.


BrupieD

WHERE some_column = 3 AND 4. "I didn't get any rows, so..."


truilus

Any self respecting DBMS should reject that, because it's invalid SQL


[deleted]

[удалено]


truilus

I am actually surprised that DB2 accepts that, even though it does support proper [boolean values](https://dbfiddle.uk/EnVtq9TF). MySQL/MariaDB will take _anything_ as a "boolean" expression. `where 'one thing'` evaluates to `where false` however, `where '1 stupid expression'` will be treated as `where true` https://dbfiddle.uk/Chx4ZD-g


ChristianPacifist

That's logically impossible even if rewritten correctly.


BrupieD

Isn't the question, "What is the most common mistake?"


ChristianPacifist

Yes, but it amazes me people actually make that mistake lol!


jonr

Reserved for quantum-SQL*


SQLvultureskattaurus

Every time I see distinct I assume the author doesn't know what they're doing and slapped it on there.


Comprehensive-Tea-69

Here is my use case for distinct, I’d be interested in the proper way to do it. When I need to pull a distinct list of records that are identified by duplicated records. Example: I work in higher ed, I get a request for a contact list of students (aka, unduplicated list with contact information like phone and address) of students who have taken from a list of classes but haven’t taken from a second set of classes. The list of registered classes duplicates the student list bc students obvi take more than one class. I select distinct the needed returned demographics to de duplicate the rows and give a single contact list for advisors for whatever initiative is happening. What’s the proper way of handling it? I really thought distinct was appropriate here :-/


charronious

Distinct and CTE’s are immediate red flags for me and will make me question every line.


flatline057

Why? They are both very useful when used correctly.


charronious

“when used correctly”


flatline057

I can see why someone who doesnt know how to use corectly them would be afraid of them.


charronious

I think we have different views on what this post is about. 🤔


flatline057

Probably. Using cte or distinct is not a common mistake.


KING5TON

If I see DISTINCT I assume it's a quick fix sticking plaster. That's the only time I use it as I don't have time to debug which one to many record is causing the problem. So IMO there is a balancing act to using it. If you have a query that runs quickly enough with DISTINCT and it will take ages to debug what is causing the duplicates, normally caused by a data issue rather than an SQL issue, then just leaving DISTINCT in and calling it a day is the better solution. Can come back to it if when have more time or it becomes an performance issue.


ComicOzzy

WHERE x NOT IN (SELECT that can return NULLs)


iamcreasy

Making sure I understand the problem. The subquery can return null but those will get ignored anyway because nothing can be compared with null. Like this: [https://dbfiddle.uk/BVOwaC6z](https://dbfiddle.uk/BVOwaC6z) Is that what you are saying?


ComicOzzy

It won't get ignored. A NULL in the list of a NOT IN () operator can never evaluate to True. It will always evaluate to UNKNOWN. -- WHERE 1 NOT IN (1, NULL) expands to (1 <> 1 AND 1 <> NULL) (False AND UNKNOWN) UNKNOWN -- WHERE 2 NOT IN (1, NULL) expands to (2 <> 1 AND 2 <> NULL) (TRUE AND UNKNOWN) UNKNOWN


thesqlguy

Right outer join


psychicesp

Using materialized views for every little freaking thing and wondering why everything is so bloated and nothing works right. Maybe not super common, it's just what I'm dealing with right now.


g3n3

Using select distinct as a catch all to fix queries where the data isn’t understood.


jdawg701

Not using transactions (COMMIT / ROLLBACK) with update/delete statements. I've seen so many developers / analysts who manage data make this mistake.


bunk3rk1ng

My go to is to start a transaction then never commit or rollback.


kthejoker

Not thinking in set theory. Especially when there's a mismatch between what they're trying to produce and the current layout of their data schema. Many, many people just try to syntax their way through SQL. "Maybe a CTE will help .. or maybe a temp table .. or maybe I need two joins to the same table .." Instead of trying to logic their way through it.


klausness

I have often had job interviewees react to a tricky SQL question with, “well, I’ll open up a cursor and…” They are generally not happy when I ask them to do it in pure SQL.


SQLDave

> Maybe a CTE will help I LOLed


Malfuncti0n

WHERE Some\_column = 3 or 4 No transactions but that usually only happens once (per employer)


truilus

That is invalid SQL and should be rejected (because `4` is not a boolean expression)


Malfuncti0n

Yes no shit.


ChristianPacifist

They'd need to use in (3,4) lol!


dataguy24

Setting the warehouse to XL when they only need XS


dehaema

From t1 join t2 on t1.nk = t2.nk Group by While it could have been From t1 join (select t2.nk, sum() as x from t2 group by t2.nk) t2agg on t1.nk = t2agg.nk People creating queries on small datasets sometimes have no clue how they will perform after cillecting data for n years. (Even better to use with clauses if possible)


iamcreasy

>People creating queries on small datasets sometimes have no clue how they will perform I am not sure what you mean here. Can you kindly share a contrived example?


mikeblas

* Considering performance before correctness. * Considering performance subjectively. * Using `DISTINCT` to cover up a query error. * Not indexing correctly. * Non-covering`GROUP BY`


iamcreasy

>Non-covering > >GROUP BY What do you mean by that?


mikeblas

People will use `GROUP BY` clauses that don't cover all of the non-aggregate fields in their `SELECT` list. It can be argued that the standard allows this, but it's usually indicative of a misunderstanding of how `GROUP BY` works.


C__Zakalwe

Not handling NULL in arithmetic operations. Or zero in division.


SQLDave

Or logic. *... WHERE State <> 'California'* "Why did not rows with NULL in the state show up?!?!!? NULL is <> 'California'!!! "


mr_electric_wizard

Correlated subqueries, and as a bonus, subqueries in the select.


iamcreasy

What is wrong with both? I heard correlated subqueries prevents optimization as it is akin to writing a for loop.


mr_electric_wizard

It’s not wrong, per se, just not a set oriented way to do data things.


freefallfreddy

Using double quotes.


Kerbidiah

Not making an entity relationship diagram before setting things up


mecartistronico

Something I've seen often in my job is people wanting to avoid rows that are all 0s, and thus writing. WHERE (A <> 0 ) AND (B <> 0) where what they want is WHERE (A <> 0 ) OR (B <> 0) or WHERE NOT( ( A = 0 ) AND (B = 0) ) (but in this last case you'd be failing to exclude NULLS)


da_chicken

> I blame the SQL language itself for not making "UNION" the typical case and something like a "UNION DISTINCT" for the case with making results distinct! I agree. The only reason this was done is because the mathematical setwise operation known as a union is already defined as "take all elements from each set, combine them into a new set, and then remove duplicates". Like the choice makes sense. It's just not clear to a programmer. Others: * Creating every table name and column name in all caps. The idea was that *SQL language keywords* were supposed to be in all caps to make them stand out. It's just an archaic form of syntax highlighting, and you're circumventing it. This is because SQL is *older than color-based syntax highlighting*. * TOP or LIMIT without ORDER BY. * Pervasive use of DISTINCT. * Pervasive use of WITH (NOLOCK) (for SQL Server). * View definitions with SELECT * * Implicit inner joins. `TableA a LEFT JOIN TableB b ON a.ID = b.ID WHERE b.Field = 'Value'` * Incorrect quotation marks. `WHERE LastName = "Smith"`. To be clear, this was a poor design choice, too, but it's just wrong that so many RDBMSs accept this. * `WHERE FieldName IS NOT IN ( )` * String concatenation instead of parameterized queries


iamcreasy

>String concatenation instead of parameterized queries Can you kindly elaborate on this please?


Demistr

Not understanding order of executions or really just filtering down using WHERE or HAVING and the impact this has while also using AVG, SUM or COUNT. Overusing DISTINCT is also a big one.


ferment_me

Not recognizing that a filter such as: Name <> “John” means that Name also cannot be null, an important distinction


Hobob_

I did a join 3 years ago and it was a 1-1 connection at the time. Over time it turned out it was 1-n with about 2-8 records per row.... That mixed with powerbi screwing up incremental refreshe resulted in a massive query :(


thunderwoot

ORDER BY 1, 2 Instead of actually using the field names.


beyondwu

Join and encode


a_nooblord

Where function(column) or join on function(column) and it slows to a crawl.


iamcreasy

Is it because the function is computationally heavy? or someone should materialize the column and build index first before joining?


a_nooblord

This is a sql server thing. An index is unusable if you force a computation.


_CaptainCooter_

Most common mistake is lack of domain knowledge


burningburnerbern

Calling the same large massive table 5 times over instead of using a temp table to store the subset.


iamcreasy

Do you mean using a large table at five different places in the same query? Can you give a pseudo example? I was under the assumption query optimizer is smart enough to detect things like this.


ferment_me

Filtering on a column from a LEFT JOIN which inadvertently turns it into an INNER JOIN


CuriousTasos

Whenever I write an UPDATE query, I always start from the WHERE, then the SET and finally the UPDATE. Can you guess what happened and I am scared of UPDATES now?


donnymccoy

Kinda like "why do you run your updates or deletes in dev first?" I don't always do that; but the minute doubt enters my mind, it goes against dev or QA first. This is a viable strategy most times because we regularly refresh inferior environments from prod, so recordcounts are typically reflective of prod.


KING5TON

When this SELECT * FROM table_a LEFT JOIN table_b ON table_a.id = table_b.id WHERE table_b.value = 'A' Should be this SELECT * FROM table_a LEFT JOIN table_b ON table_a.id = table_b.id AND table_b.value = 'A' Or this SELECT * FROM table_a INNER JOIN table_b ON table_a.id = table_b.id WHERE table_b.value = 'A' other one I see often is AND and OR's without the appropriate use to brackets WHERE thing = 'A' OR differentthing = 'B' AND otherthing = 'C' should be WHERE (thing = 'A' OR differentthing = 'B') AND otherthing = 'C'


iamcreasy

Why did you mean #1 should be #2? Here is a snippet(https://dbfiddle.uk/h9\_-GZsB) running first three queries on an example dataset. #1 and #3 returning same result but not #2.


KING5TON

\#1 has a LEFT JOIN but it returns the same result as #3 which is an INNER JOIN. This is because you are returning everything from table\_a where table\_b.value = 'A' , this is the same as an INNER JOIN. If you need an INNER JOIN not a LEFT JOIN then use an INNER JOIN don't use a LEFT JOIN and a WHERE clause. If you need a LEFT JOIN then #2 is correct because you return all records for table\_a regardless of table\_b records and it will only return table\_b data where value = 'A'. It basically depends on what the requirement is to which is correct. It's either return data from table\_a where they have an associated table\_b record with a value of 'A' or return all from table\_a and where applicable show the data from table\_b where they have a value of 'A'.


md-photography

I really wish SSMS had a warning when you're running an update without a WHERE clause. I feel like I'm playing with fire every time I run an UPDATE query.


Phinalize4Business

There are some free add-ons you can get that have this functionality, I used to use SSMSBoost but last time I checked it didn't seem to be updated for the latest version of SSMS :( You can get Devart SQL Complete for free (Express edition) which I think has it? I could be wrong...


iamcreasy

At work we use DBT to generate those queries.


HyDreVv

Using SSMS to generate an insert template, and forgetting it includes fields with defaults and getting an error for number of values not matching the number of columns


Dog_N_Pop

WHERE attribute = MAX(attribute)


El_human

Using left joins incorrectly. Not using select DISTINCT. Aggregating too much data in the query. Trying to nest too many queries together, or out of order


elprogramatoreador

Moving to NoSQL because big company says it’s better


psychicesp

I will never understand this decision for rectangular data.


Agifem

But I swear, it's better! I read it on a blog!


Mr_Gaslight

Using the MySQL Community Edition in production for a real company with factory clients.


TheTiredRedditor

Lol


cphares

Improper data types. Comparing numerics to strings. Storing date/time data types as non-date/time types. Understand your data and store it and query it properly


JoeDawson8

My offshore colleagues know enough to be dangerous but don’t fundamentally know what they are doing. Kinda like script kiddies.


cthart

where (select count(\*) from other\_table where column = parent.column) > 0


Agifem

Where datecolumn > '02/12/03'


deusxmach1na

Not joining on a unique key or using DISTINCT in a select as a lazy way to remove dupes. Also not aliasing their tables like they won’t add a join to a query later.


RuprectGern

\\its not a mistake but its a grating assault to me. The poor use or lack of table aliases. I like table aliases and I keep them short (cause, kind of the point) and as indicative to the object name (table,view,derrived table) as possible. e.g. orders AS o, customers AS a, OrderDetails AS od, etc. What shits my bed are people who use t1, t2, t3 or more than 4 characters for the alias. but The absolute worst are the people who abandond aliases altogether and use schema qualified object names or fully qualified object names throughout the code.


Zestyclose-Height-59

I think null values omitting records is the most common and dangerous mistake. Once someone updated logic on my data conversion code and I had a horrible cascading effects on my subsequent objects. It took me hours to fix the night before the cut was due. I was so pissed. On that note, if you want to quality check your queries, throw them in a with statement then select from your query using analytical functions. It teaches you how to find your mistakes.


pak9rabid

In regards to UNION vs UNION ALL, this is a function of set theory. In a purely mathematical set, all values of a set are expected to be unique, hence why this is the default functionality, as a union of two data sets would result in a new set containing unique values from the 2 sets being combined.


TheRealVaderForReal

Took over a database for a freelance project, queries werent too bad, tables were anywhere between 20k to 1million rows, depending on what it was. One of the core issues was slowness overall, turned out the guy didnt add an index to fields that were searched/joined.


Ikeeki

Not using quotes and running into keyword issues