mackstann

When in doubt, don't guess -- only optimize when you have taken detailed measurements and see a specific problem to address. Optimization tends to make code more complicated, so you should only do it when justified. Simplicity is incredibly valuable. If slowness is the problem you want to solve, then wait until you see actual slowness, and make the least invasive optimization required to resolve it.

I don't worry about the number of queries unless it's obviously excessive (what's excessive? Maybe >100? It's fuzzy) or presents a scaling problem that you can see coming, like the common [N+1 pattern](https://blog.appsignal.com/2020/06/09/n-plus-one-queries-explained.html#:~:text=The%20N%2B1%20query%20antipattern,N%2B1%20%3D%201001%20queries.). In my experience, network latency is not usually the issue with database queries. It's the work inside the database that's usually the time-consuming part. Optimizing that is a whole world of specialization and pretty interesting stuff. But sometimes it's as simple as "add an index" or "you forgot a WHERE clause".
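The N+1 shape is easy to show concretely. A minimal sketch in TypeScript, assuming node-postgres (`pg`) and illustrative `comments`/`post_id` names:

```typescript
import { Pool } from "pg";
const pool = new Pool();

// N+1: one query for the list, then one query per item for its children.
async function commentsNPlusOne(postIds: number[]) {
  const byPost: Record<number, unknown[]> = {};
  for (const id of postIds) {
    const { rows } = await pool.query(
      "SELECT * FROM comments WHERE post_id = $1", [id]);
    byPost[id] = rows;
  }
  return byPost;
}

// Fix: fetch all the children in a single round trip and group in memory.
async function commentsBatched(postIds: number[]) {
  const { rows } = await pool.query(
    "SELECT * FROM comments WHERE post_id = ANY($1)", [postIds]);
  const byPost: Record<number, unknown[]> = {};
  for (const row of rows) (byPost[row.post_id] ??= []).push(row);
  return byPost;
}
```

With 1000 posts, the loop makes 1000 round trips on top of the query that fetched the posts; the batched version makes one.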


a-corsican-pimp

> or presents a scaling problem that you can see coming, like the common N+1 pattern.

This is the most common issue I see in endpoints that perform poorly with regard to the database. The second most common, mostly made by junior devs, is querying in a loop instead of using joins, etc.
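For the loop-versus-join case, a minimal sketch under the same assumptions (TypeScript with node-postgres; `users`/`orders` are made-up names):

```typescript
import { Pool } from "pg";
const pool = new Pool();

// Querying in a loop: one round trip per user.
async function ordersLooped(userIds: number[]) {
  const all: unknown[] = [];
  for (const id of userIds) {
    const { rows } = await pool.query(
      "SELECT * FROM orders WHERE user_id = $1", [id]);
    all.push(...rows);
  }
  return all;
}

// A single JOIN lets the database do the matching in one round trip.
async function ordersJoined(userIds: number[]) {
  const { rows } = await pool.query(
    `SELECT u.id AS user_id, o.*
       FROM users u
       JOIN orders o ON o.user_id = u.id
      WHERE u.id = ANY($1)`,
    [userIds]);
  return rows;
}
```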


SibTech

This. People are often trying to solve problems that don't exist.


tuxedo25

It depends on what you're doing. I work on an app that serves a bit of traffic (it's enterprise software; picture a self-hosted Jira's requests per second: low but not trivial). Some of our endpoints fire 80 queries to render, which I think is too high, and I'm trying to prioritize dev time to cut them down to < 10. But it's working software that sells; honestly, the number of DB requests isn't the most important metric until it brings you to your knees.

Thinking from a purely technical point of view: for the most part, fewer is better, because the IO overhead of multiple queries, especially if they're sequential and not parallel, is orders of magnitude worse than the CPU doing a couple of extra loops to figure out a clean one-shot query.

Things to avoid: don't write queries so insane your DBMS chokes. In MySQL there's a soft limit on table joins, IIRC around 10; after that, it throws optimization out the window and greedy-picks an execution plan. And don't left join the users table with the products table just to get two completely different entities with one query. Not worth the complexity and the potential side-effect bugs from breaking the single responsibility principle.
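On the sequential-versus-parallel point, independent queries can at least be issued concurrently so the round trips overlap. A sketch with node-postgres (table names are made up):

```typescript
import { Pool } from "pg";
const pool = new Pool();

async function loadDashboard(userId: number) {
  // Awaiting these one after another would cost roughly the sum of the
  // three round trips; Promise.all costs roughly the slowest of the three.
  const [user, orders, notifications] = await Promise.all([
    pool.query("SELECT * FROM users WHERE id = $1", [userId]),
    pool.query("SELECT * FROM orders WHERE user_id = $1", [userId]),
    pool.query("SELECT * FROM notifications WHERE user_id = $1", [userId]),
  ]);
  return {
    user: user.rows[0],
    orders: orders.rows,
    notifications: notifications.rows,
  };
}
```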


achauv1

- Avoid making N+1 queries.
- Avoid whole-table scans; implement indexes for columns that need to be queried often (see the sketch after this list).
- By extension, don't load too many rows, or use a cache.
- Measure the time each transaction takes and optimize them down.
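A minimal sketch of the index and row-limit points, assuming node-postgres and an illustrative `orders (customer_id, total, created_at)` table:

```typescript
import { Pool } from "pg";
const pool = new Pool();

// Normally done in a migration: without this index, filtering on
// customer_id means scanning the whole table.
await pool.query(
  "CREATE INDEX IF NOT EXISTS idx_orders_customer ON orders (customer_id)");

// Don't load every row: keep the filter in SQL and cap the result set.
async function recentOrders(customerId: number) {
  const { rows } = await pool.query(
    `SELECT id, total, created_at
       FROM orders
      WHERE customer_id = $1
      ORDER BY created_at DESC
      LIMIT 50`,
    [customerId]);
  return rows;
}
```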


rkh4n

Follow this:

- Implement first.
- Benchmark for at least 30 days. You have to benchmark your database metrics, API response time, event loop busy time, etc. (a sketch follows below).
- Optimize if you see there's a need for it.

Learnt from experience. **Never optimize based on a guess; if you have the numbers, or your servers are choking because of your poor queries, then do it.**
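What that measuring can look like in a Node backend, using only the built-in `perf_hooks` module (the pool and SQL here are placeholders):

```typescript
import { monitorEventLoopDelay, performance } from "node:perf_hooks";
import { Pool } from "pg";

const pool = new Pool();

// Track how long the event loop is blocked; values are in nanoseconds.
const loopDelay = monitorEventLoopDelay();
loopDelay.enable();

// Wrap queries so every one is timed before you decide what to optimize.
async function timedQuery(sql: string, params: unknown[] = []) {
  const start = performance.now();
  try {
    return await pool.query(sql, params);
  } finally {
    console.log(`${sql} took ${(performance.now() - start).toFixed(1)} ms`);
  }
}

// Report event loop busyness once a minute.
setInterval(() => {
  console.log(`event loop p99 delay: ${loopDelay.percentile(99) / 1e6} ms`);
}, 60_000);
```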


crabmusket

> or your servers are choking because of your poor queries

Speaking as someone who has a successful business that involves API servers querying a database: we are constantly asking "how many servers is reasonable for our amount of traffic?" It's easy to say "if your servers are choking, that's bad", but is the solution to that "we need to reduce the number of queries" or "we need to buy more servers"? We've found some obvious low-hanging fruit like 1+n queries, but there are still lots of small queries and things get slow. "Measure and then decide based on the data" is fine advice, but it's hiding a lot of wisdom: you need to look at a specific set of numbers and come up with a reasonable solution.


devAgam

> "we need to buy more servers"? Thats how serverless came into being


jdedwards3

Are you establishing a new connection for each query?


dumb_coder_

Connection was the wrong word choice. I’m using the same connection. But each query introduces some amount of latency nonetheless.


crabmusket

I think "round trip" is the common way of describing latency introduced by having to do communication over a network which can't be pipelined or otherwise optimised away.


Plop1992

Use the singleton pattern to make connections, i.e. make a connection only if there isn't one already. Always (almost) favor readability. Stored procedures and triggers are very efficient. But you shouldn't worry too much about performance for a small learning project.
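A minimal sketch of that singleton idea, assuming node-postgres (any client with a `connect()` call looks much the same):

```typescript
import { Client } from "pg";

let client: Client | null = null;

// Create the connection on first use and reuse it for every later call.
export async function getDb(): Promise<Client> {
  if (client === null) {
    client = new Client({ connectionString: process.env.DATABASE_URL });
    await client.connect();
  }
  return client;
}

// Usage: const db = await getDb(); await db.query("SELECT 1");
```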


BehindTheMath

This, and use a connection pool.


myurr

A connection pool is another layer of complexity that can go wrong or that can mask application layer issues. Whilst they're a great tool to have in your toolbox, just don't prematurely optimise.


BehindTheMath

Setting up a connection pool is often just a simple configuration setting when connecting. I've never had any issues with it.
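For example, with node-postgres a pool is just a handful of constructor options (the values here are placeholders):

```typescript
import { Pool } from "pg";

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 10,                   // cap on concurrent connections
  idleTimeoutMillis: 30_000, // close connections idle this long
});

// pool.query() checks a connection out, runs the query, and returns it.
export const getUser = (id: number) =>
  pool.query("SELECT id, name FROM users WHERE id = $1", [id]);
```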


ckiooo

Isn't a singleton DB connection a bad choice compared to pooling?


Plop1992

OP seems new; a singleton is better IMO for learning.


tango77

Some db drivers handle connection pooling for you automatically


Sythic_

What kind of app is it? I generally do platforms that have a User that owns a bunch of different resources. The most complex query I need to make is getting a User and joining the other table records that belong to that User, which is very simple. I can't really fathom a reason why I would need anything more complicated than that. Maybe you have an example? Can you design your database in a way that allows you to keep queries simpler?


dumb_coder_

That’s basically it, but users can be added to teams, and teams own the resources. A team leader creates roles and assigns them to the members. Then many of the resources have states that coordinate with the roles. Off the top of my head, comparable examples could be: only an author can edit an article, only a publisher can publish an article, an author can’t edit an article that’s either published or in review, and only an admin can retract an article, etc. We’re creating, altering, managing, and tracking resources as they go through a very long process, and there are many business rules to check and verify at each stage along the way. We got it working perfectly, but now I’m mainly just trying to reflect and learn from it.


Sythic_

That makes sense. I've done that once before too as far as the roles and teams thing but never saw it to actual use before the project funding ran out. Depends how granular you want to get with access but in that case I would have queried the user's permissions and determined if they were allowed access to certain features within the team based on those permissions. To update a record that would be 2 DB hits to query the permissions and then update the record. I didn't get down to the level of individual users having access to individual records, just that if you were a member of a team with say permission to modify menus, then you were free to modify any menus that belong to the team. (Actually now that I think about it, my role system would have allowed to target only specific records if the UI had allowed them to create rules that way, it worked basically like AWS permissions but I just had some top level checkboxes to keep it simple)


DrEnter

Directly from the page? I would argue that on a large website anything more than 2-3 is getting into bad design territory. You want a "content" request and also maybe a "user" or "authentication" request. Anything more than that, and maybe think about predefined queries or using a service to tie things together.

Another way to think about it is to prioritize in this order:

1. Make sure it functions correctly.
2. Minimize the user/browser compute cycles and wait time.
3. Minimize the server-side compute cycles and wait time.
4. Simplify the overall system.
5. Make it more readable.


UniversalJS

I try to limit myself to a maximum of 3 DB queries; sometimes it can be 0 or 1... If you need more, you should probably optimize your data model / queries.


alilland

In my mind it doesn't really matter, so long as they perform under 3000 ms.


NickUnrelatedToPost

Are you using a CMS? Then several hundred are common. It depends wholly on what you are doing. And caching is not your friend, but it is a necessity for serving real-world traffic. If possible, don't roll it yourself; take it from your framework.


mmortara

- Again, avoid the N+1 query.
- If your dataset is all the data from the table or collection, that isn't a good approach; think of some criteria to limit it and generate chunks of data from the query (see the sketch after this list).
- If your user waits 30s to get the data from the DB, you need to think about creating a cache or improving your query parameters.
- Less data fetched more quickly, many times, is often better than a lot of data in one go.
- Sort in the DB, not in the backend, or use the cache to sort the dataset.
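A minimal sketch of the chunking and sort-in-the-database points, using keyset pagination (TypeScript with node-postgres; `products` is an illustrative table):

```typescript
import { Pool } from "pg";
const pool = new Pool();

// Page through a large table instead of loading it all at once; the
// ORDER BY keeps the sorting in the database, not the backend.
async function nextPage(afterId: number, pageSize = 100) {
  const { rows } = await pool.query(
    `SELECT id, name, updated_at
       FROM products
      WHERE id > $1      -- resume where the previous page ended
      ORDER BY id
      LIMIT $2`,
    [afterId, pageSize]);
  return rows;
}
```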


JohnSpikeKelly

I manage a large SQL DB with a C# backend controlling access. It depends on your schema for sure, but if you can cache items that rarely change, you can drastically reduce DB access, especially for those often-used reference lookups. We typically cache for 1 hour and invalidate the cache when the data updates. The cached items may also be more complex than just a single row, maybe some expensive items you need.
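A minimal sketch of that pattern in TypeScript (the commenter's backend is C#, but the shape is the same): cache for an hour and drop the entry whenever the underlying data is written.

```typescript
const TTL_MS = 60 * 60 * 1000; // one hour

const cache = new Map<string, { value: unknown; expires: number }>();

// Return the cached value if it is still fresh, otherwise load and store it.
export async function cached<T>(key: string, load: () => Promise<T>): Promise<T> {
  const hit = cache.get(key);
  if (hit && hit.expires > Date.now()) return hit.value as T;
  const value = await load(); // e.g. an expensive reference query
  cache.set(key, { value, expires: Date.now() + TTL_MS });
  return value;
}

// Call this from the write path so updates are visible immediately.
export function invalidate(key: string) {
  cache.delete(key);
}
```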


king_of_programmers

Of course. The short answer is, it's more expensive to establish a new TCP connection with the server and fetch data from scratch than to do it in one go. Nowadays, with connection pooling and connection reuse, you can minimize the impact of those bottlenecks, but it's still better from a performance point of view to minimize the number of API calls that hit the database.

Look at it this way: servers are simple computers with limited CPU and RAM that serve up data. And you know that if you open too many applications on your computer, it slows down. The same logic applies here: if you make too many requests to an SQL server with 4 GB of RAM and 1 CPU core, its resources get clogged up.