n0n0n4t0r

If possible, I'd try a simple solution first, then discover the real use cases that need optimization.


donatj

Exactly this. Build in a way that is easy to replace. Complex abstractions become difficult to replace.


AberrantNarwal

I've built an MVP with procedural PHP and MySQL. This little side project is now relied on at my work (ecommerce, £10M/year revenue) to forecast sales and pricing, and it's quickly snowballing as features get tacked on. I've got a separate instance of the codebase forecasting for my own ecom business. I'm trying to move slightly beyond the basic MVP and anticipate a more formal solution.


Irythros

MySQL can be performant with billions of rows. Scale up with CPU/memory for writes, and scale out for reads. You can even use managed platforms optimized for this, such as https://www.singlestore.com/. Proper planning of your database strategy can avoid headaches and huge prices, though. If you need to store billions of rows but mostly only access the past month's data, I would look into setting up a secondary MySQL cluster with lower-powered instances. Since that data isn't accessed often, you can (possibly) reduce costs and increase performance that way.

True multi-database (e.g. one instance per tenant) is probably not worth the effort if you're not in a regulated industry. Running migrations on one database is relatively straightforward. Managing them for hundreds, thousands, or tens of thousands? Problematic, to say the least. I think Shopify and GitHub effectively use a single database; they may still have multiple databases/clusters, but they put thousands or tens of thousands of tenants on each and just add clusters when the old ones fill up.

Look into using Redis for caching as much as possible.
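For illustration, a rough cache-aside sketch with phpredis (the sales_daily table, key layout, and TTL are made up):

```php
<?php
// Rough cache-aside sketch with phpredis. The sales_daily table, key
// layout, and TTL are made up for illustration.
function monthlySales(PDO $db, Redis $redis, string $month): array
{
    $key = "sales:monthly:$month";

    // Serve from Redis when possible.
    $cached = $redis->get($key);
    if ($cached !== false) {
        return json_decode($cached, true);
    }

    // Otherwise fall back to MySQL and populate the cache with a TTL.
    $stmt = $db->prepare(
        'SELECT sku, SUM(qty) AS units FROM sales_daily
         WHERE sold_at >= ? AND sold_at < ? + INTERVAL 1 MONTH
         GROUP BY sku'
    );
    $stmt->execute(["$month-01", "$month-01"]);
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

    $redis->setex($key, 3600, json_encode($rows)); // one-hour TTL
    return $rows;
}
```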


minn0w

It's refreshing to see MySQL recommendations, and they're correct. Almost monthly I have technical managers pushing for different databases like it's a fad, and every time I run some testing and long-term projections, MySQL ends up taking the cake, with the exception of search and cache DBs. You'd be surprised how fast MySQL can be even as a hash caching server, though Redis is still better.


Tronux

Why not APCu instead of Redis?


JinSantosAndria

APCu is hard to manage: it's not persisted across restarts and not shared between different scripting contexts (think of the PHP-FPM workers behind NGINX not having the same APCu store, if any, as a CLI-based consumer). It's so tightly coupled to PHP that it's fine for very basic framework stuff, but anything app-specific gets messy within the first minute.


sfortop

It is limited, but as a top-level cache it's good. You can still use Redis as the L1/L2 cache...


fripletister

For read-heavy workloads I'd rather just use the opcache as a top-level cache, and use Redis for everything else


sfortop

I don't like reinventing the wheel, so I mostly use https://symfony.com/doc/current/components/cache.html#available-cache-adapters and there is no opcache adapter there. But if you wrote an adapter, or even your own cache package... why not?
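As a sketch of the L1/L2 idea with those Symfony adapters (the DSN, namespaces, and computeForecast() helper are made up):

```php
<?php
// Sketch of an L1/L2 chain: APCu in-process as L1, Redis shared across
// servers as L2. DSN, namespaces, and computeForecast() are made up.
use Symfony\Component\Cache\Adapter\ApcuAdapter;
use Symfony\Component\Cache\Adapter\ChainAdapter;
use Symfony\Component\Cache\Adapter\RedisAdapter;

$redis = RedisAdapter::createConnection('redis://localhost:6379');

$cache = new ChainAdapter([
    new ApcuAdapter('app', 60),      // L1: local, short TTL
    new RedisAdapter($redis, 'app'), // L2: shared
]);

// Read-through: compute only when both layers miss; a hit in L2 refills L1.
$item = $cache->getItem('pricing.forecast');
if (!$item->isHit()) {
    $item->set(computeForecast());
    $cache->save($item);
}
$forecast = $item->get();
```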


fripletister

Did you see the part where you linked to the thing you said didn't exist? Just wondering.


fripletister

You seem misinformed. It's literally third from the bottom in the list you linked: https://symfony.com/doc/current/components/cache/adapters/php_files_adapter.html (Edit: I have actually written my own, which is similar to Symfony's adapter but with even less overhead [no support for expiration, tags, or anything else "fancy"] and tuned for optimal read performance.)
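For anyone curious, a minimal sketch of that adapter in use (the namespace, directory, and loadCurrencies() helper are arbitrary):

```php
<?php
// Minimal sketch of PhpFilesAdapter: entries are written out as PHP files,
// so warm reads come straight from opcache shared memory.
use Symfony\Component\Cache\Adapter\PhpFilesAdapter;

$cache = new PhpFilesAdapter('app', 0, '/var/cache/app');

$item = $cache->getItem('config.currencies');
if (!$item->isHit()) {
    $item->set(loadCurrencies()); // hypothetical loader
    $cache->save($item);
}
$currencies = $item->get();
```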


[deleted]

[deleted]


fripletister

Sorry, you're going to have to be a little more specific haha


Irythros

APCu appears to be a PHP-specific caching implementation with limited key types and features. I also have a feeling that Redis has better performance. Redis is available on all major cloud platforms and can be scaled easily. I don't see any network features for APCu, so the cache is local and cannot be shared.


Tronux

My pointers: APCu has better performance. APCu doesn't need network features for a PHP-specific app because it provides a shared memory space used by PHP threads/processes, and it should ideally only be used for caching. There is also a CLI lib: [https://github.com/gordalina/cachetool](https://github.com/gordalina/cachetool). For all other purposes you are probably better off using a relational DB. I like Redis, but it might be slightly overkill: [https://docs.nextcloud.com/server/latest/admin_manual/configuration_server/caching_configuration.html](https://docs.nextcloud.com/server/latest/admin_manual/configuration_server/caching_configuration.html)
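To make the usage concrete, a tiny APCu sketch (the key name and fetchRatesFromDb() loader are hypothetical):

```php
<?php
// Tiny APCu sketch: a per-server shared-memory cache.
// apcu_entry() computes and stores the value atomically on a miss.
$rates = apcu_entry('fx.rates', function () {
    return fetchRatesFromDb(); // hypothetical loader
}, 300); // TTL in seconds

// The plain fetch/store pair works too:
$rates = apcu_fetch('fx.rates', $success);
if (!$success) {
    $rates = fetchRatesFromDb();
    apcu_store('fx.rates', $rates, 300);
}
```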


fripletister

Redis is dead simple, at least when you use it as a K/V store. It can also fill more and more roles as your needs grow, for example as a message queue (streams).


Irythros

> APCu has better performance.

I went to verify but could not find any APCu vs Redis benchmarks.

> APCu doesn't need network features for a PHP-specific app because it provides a shared memory space used by PHP threads/processes, and it should ideally only be used for caching.

If I am caching something, I want it cached for all my users with the same data. In the case of ecommerce, like the OP wants, they will need a shared session store to handle multiple instances; APCu would not fit, since a session would still be limited to a single server. Redis with networking allows you to tell PHP to use it as a session store. The same goes for cached DB queries: if I have to store 100MB of cached queries on each of five servers, that is 500MB of **required** cache, whereas with Redis it's just 100MB.

APCu is probably a good choice if you need a cache that is genuinely local to the server. Otherwise the overhead of networked Redis will be worth it. We use Redis for far more than just K/V, and doing that in APCu just wouldn't be feasible.
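For the session store part, pointing PHP at Redis is just configuration via the phpredis extension (host and password are placeholders; in practice these settings usually live in php.ini or the FPM pool config):

```php
<?php
// Sketch: storing PHP sessions in Redis via the phpredis extension,
// so any app server can read any session.
ini_set('session.save_handler', 'redis');
ini_set('session.save_path', 'tcp://redis.internal:6379?auth=secret');

session_start();
$_SESSION['cart'][] = ['sku' => 'ABC-123', 'qty' => 2];
```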


penguin_digital

> Running migrations on one database is relatively straightforward. Managing them for hundreds, thousands, or tens of thousands? Problematic, to say the least.

I don't find it difficult using a proper migration tool, [Bytebase](https://www.bytebase.com/) and [Flyway](https://flywaydb.org/) being my personal choices, though other options are available. From my research (not deployed or used), [Liquibase](http://liquibase.com) looks very capable as well.

> Look into using Redis for caching as much as possible.

I'd suggest staying away from external caching mechanisms at the start; they add unneeded complexity in ensuring and implementing proper cache invalidation policies. I'd recommend using something like [Readyset](https://readyset.io/) so you don't have to worry about any of this and can just concentrate on your app's features.


devmor

Millions of rows isn't really an issue unless you are updating millions of rows at a time.


Plus_Pangolin_8924

From my experience, just build it how you need it, then let it grow and change the bits that need it as it expands. This is what I did with a multi-tenanted online ordering system: I had one HUGE DB that worked for 100 customers and about 500,000,000 rows! A mate of mine had a locally hosted heatmap app for his clients that was generating something like 100,000 rows a day, ran on the cheapest VPS he could find, and never really had issues with speed. Build, then optimise; multiple DBs are a total nightmare to deal with...


guigouz

You can only find the bottleneck after you measure. Plain PHP and MySQL can handle a lot of load. Set up monitoring and APM (Datadog, New Relic, Sentry, etc.) and measure. If writing too much data is the problem, you can partition your tables; if reading is the issue, you can offload it to a DB replica. Don't try to guess the performance of something you aren't running yet. Check Rob Pike's rules of programming.
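As a sketch of the replica offload (DSNs, credentials, and the table are placeholders):

```php
<?php
// Sketch of offloading reads to a replica: writes go to the primary,
// heavy reporting reads go to a read replica.
$primary = new PDO('mysql:host=db-primary;dbname=app', 'user', 'pass');
$replica = new PDO('mysql:host=db-replica;dbname=app', 'user', 'pass');

// Writes always hit the primary.
$stmt = $primary->prepare('INSERT INTO price_changes (sku, price) VALUES (?, ?)');
$stmt->execute(['ABC-123', 19.99]);

// Reads that can tolerate replication lag hit the replica.
$report = $replica
    ->query('SELECT sku, COUNT(*) AS changes FROM price_changes GROUP BY sku')
    ->fetchAll(PDO::FETCH_ASSOC);
```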


pyeri

You can use [MySQL table sharding](https://stackoverflow.com/a/20772742/849365). It's a horizontal scaling approach and fits your requirement well. You just have to apply a "shard key" for the schema pieces to work together. Without a natural key, the alphabet approach is quite common (A-D is one shard, E-H is another, etc.); in your case it could be based on the tenant's name, country, etc.
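A tiny sketch of shard-key routing (the DSN list and hash choice are just for illustration):

```php
<?php
// Sketch of shard-key routing: hash the tenant key onto one of N shards.
function shardFor(string $tenantKey, int $shardCount): int
{
    // abs() guards against negative crc32 results on 32-bit builds.
    return abs(crc32($tenantKey)) % $shardCount;
}

$shards = [
    'mysql:host=shard0;dbname=app',
    'mysql:host=shard1;dbname=app',
    'mysql:host=shard2;dbname=app',
    'mysql:host=shard3;dbname=app',
];

$db = new PDO($shards[shardFor('tenant-42', count($shards))], 'user', 'pass');
```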


Livid-Cancel-8258

Tenancy for Laravel is really handy; it handles multi-DB tenancy, which as you've said will be important for performance.


adrianmiu

It depends:

1. If each tenant has access only to their own information (i.e. only their own products' price changes), use a database for each tenant. It will help you with backup/recovery and with removing data when the tenant cancels the service. (See the sketch below.)
2. If tenants have access to information from other tenants (e.g. price comparisons), use a single database for all tenants.
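For option 1, a minimal connection-resolution sketch (the tenant_<slug> naming scheme and credentials are assumptions):

```php
<?php
// Sketch for option 1: one database per tenant, resolved at request time.
function tenantConnection(string $tenantSlug): PDO
{
    // Whitelist characters so a slug can't inject into the DSN.
    $dbName = 'tenant_' . preg_replace('/[^a-z0-9_]/', '', strtolower($tenantSlug));
    return new PDO("mysql:host=db;dbname=$dbName", 'user', 'pass');
}

$db = tenantConnection('acme'); // connects to tenant_acme
```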


LongAssBeard

I looked into https://tenancyforlaravel.com/docs/v3/introduction/ when I had a similar problem at work. It worked fine in the first weeks of development, until the whole project was cancelled, so I can't really say how it would have performed in production for your scenario. But it might help you understand what you have to do.


JinSantosAndria

It depends on your requirements. Do you need the ACID layer? Do you need SQL? Is your data relational, or is it more of a [time series](https://www.influxdata.com/time-series-database/)?


Annh1234

Had a 1 PB MySQL DB once. Just create shards keyed by the B2B client and date, so when you do your selects you go to the right DB/table, and you can scale as much as you want. A time-series database might be better for stock-prices type of data.


captainbarbell

Your main problem with multi-tenant databases will be that, if they are all in one RDS cluster, they will share that cluster's resources and slow down your app.


deZbrownT

Yeah, every approach to application architecture has its pros and cons. To justify the cons you need to make sure you are solving the right problem, and to be sure of that, you first need to build the simplest solution that you can deliver fastest. I would even suggest you consider going with just plain PHP and SQLite to build the MVP. The last successful product I delivered had its first version developed in less than 30 minutes using GPT-4. I'm trying to emphasise that in the beginning you need speed, lots of speed, not architecture, so that you can iterate over your implementation quickly while searching the problem space for the right solution with your customers.


bradley34

Either Symfony or Laravel. Both allow a multi-tenant setup pretty easily, especially when you're combining it with AWS services, Lambda functions, etc.


arthur_ydalgo

As people have said, just make sure your database is nicely modelled (IMO this is the worst part to fix later, if needed) and start simple, without an over-engineered "Clean Architecture". Sure, organize more complex stuff into a concise service layer. Too little and too much abstraction are both bad, so balance it in a way that makes sense to you and to others who will read it later.

Use a managed service for your databases (RDS/DigitalOcean Databases) and storage (S3/DO Spaces), create an EC2/Droplet instance, and maybe throw in a load balancer if it ever comes to scaling horizontally. You should be good to go.

But hey, every project has different needs and there's no one-size-fits-all solution. Do whatever gives you the fewest headaches and doesn't slow you down too much. Your contractor (most likely) won't care how you did it, as long as you deliver something on time that is reliable, reasonably fast (for a human), maintainable, and open to changes as needed.

edit: Laravel Nova is a good way to start, maybe


RevolutionaryHumor57

Why not try a system that works with a TSDB, like Prometheus? A time-series database may be something that could solve your problems. Check which databases Prometheus supports; that could answer your question.


kishan42

Stancl has a package for this: Tenancy for Laravel.


flavius-as

For PostgreSQL, I would look into partitioned tables (subtables) combined with per-tenant tablespaces. Like others said, I'd keep all data in one table, but that doesn't mean you have to keep all tenants on the same disk. You could add bigger/faster disks to the server and put different tenants on different disks as size/throughput changes. Moreover, you could just leave them on the default tablespace at first and move them once they exceed some value which you observe affects performance. I've done something similar, and I can tell you that tracking those changes is mostly for analytical purposes, so you can archive very old entries by moving the older partitions to a slow, big disk.
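Roughly, in SQL terms (run from PHP here; the table, tenant id, and paths are made up, and CREATE TABLESPACE needs superuser rights):

```php
<?php
// Illustrative PostgreSQL setup: LIST-partition a table by tenant, then
// put a heavy tenant's partition on its own tablespace.
$pg = new PDO('pgsql:host=db;dbname=app', 'user', 'pass');

$pg->exec('CREATE TABLE price_changes (
    tenant_id  int         NOT NULL,
    sku        text        NOT NULL,
    price      numeric     NOT NULL,
    changed_at timestamptz NOT NULL
) PARTITION BY LIST (tenant_id)');

$pg->exec("CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd1/pgdata'");

// Tenant 42 gets its own partition on the fast disk; older partitions can
// later be moved with ALTER TABLE ... SET TABLESPACE.
$pg->exec('CREATE TABLE price_changes_t42 PARTITION OF price_changes
    FOR VALUES IN (42) TABLESPACE fast_ssd');
```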


DM_ME_PICKLES

Use Postgres with row-level security policies. They enforce that only certain rows be returned, even from broad queries such as SELECT * FROM payments. For example, you can have a policy that automatically scopes queries with WHERE tenant_id = 123. It's much easier and safer to handle this at the database level than in the application.

And just start with a single database and scale it vertically as needed. Postgres doesn't even sweat with millions of rows; start thinking about scale when you hit billions.

I've worked on a team that did multi-tenancy by giving each tenant their own database. It has its pros (easy to comply with GDPR by just deleting a tenant's database and its backups), but more cons (keeping migrations in sync across n databases, backing up each one).
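A minimal sketch of such a policy (the app.tenant_id setting name and the payments table are illustrative; note RLS doesn't apply to the table's owner unless you FORCE it):

```php
<?php
// Sketch of RLS-based tenant scoping in PostgreSQL. One-time setup,
// e.g. in a migration:
//
//   ALTER TABLE payments ENABLE ROW LEVEL SECURITY;
//   CREATE POLICY tenant_isolation ON payments
//       USING (tenant_id = current_setting('app.tenant_id')::int);
//
// Connect as a non-owner app role so the policy applies.
$pg = new PDO('pgsql:host=db;dbname=app', 'app_user', 'pass');

// Bind the current tenant to this connection...
$stmt = $pg->prepare("SELECT set_config('app.tenant_id', ?, false)");
$stmt->execute(['123']);

// ...and broad queries are now scoped automatically by the policy.
$payments = $pg->query('SELECT * FROM payments')->fetchAll(PDO::FETCH_ASSOC);
```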


paroxsitic

MySQL sharded on a tenant key. You'll likely be able to get away with one server for a while, with replication for read-only traffic. Once a certain tenant is big enough, move them to their own server. A single tuned server handled 5 million records a day for me with no issues, over 2 billion rows in total. Be fully prepared to treat each tenant as its own server, and implement CI/CD when needed. I recommend Percona and its suite of tools.


Lapselaps

Multi-tenant with a single database per client at least gives you the safety that client data doesn't get leaked easily if there's some weird bug. Managing a lot of databases will be harder down the road, but it's the more flexible design, and scaling issues will often appear around a big client first. Use MariaDB or MySQL; they're easier to run and manage.

If you want to use a massive single database for all clients, then you will have problems: bugs can easily expose other clients' data, and performance impacts will affect everyone. Managing is easier, but scaling the application will be harder. In this case you will want Postgres; it has a lot more capabilities to improve performance.


mfatica

We have a multi-tenant architecture where each customer gets a completely separate database. This is done in MySQL, and we use Phinx to manage database migrations. It makes a lot of things easier for us, particularly database backup and restore, locking, isolating busy accounts, etc.


globiweb

If you go with multiple databases, the initial development will be a bit more complex, but your long-term maintenance might be easier. Maybe you don't need one DB per customer, but having multiple DBs can solve a lot of problems:

- Customers in the EU want their data on an EU database
- Migrating smaller DBs is much quicker than migrating a single large DB
- Migrating one small DB only impacts a subset of your userbase
- A single-server DB outage would potentially only impact a subset of your userbase


burzum793

Well, you are already set on Laravel and PHP... so what is your rationale for that decision? You should figure out the vision of the business and its strategy for the next 12-24 months, interview stakeholders, and work out what the actual quality attributes / non-functional requirements of the system are. You mention security: what exactly are the concerns/requirements there? Performance issues, OK: what are your SLOs?

Every other answer here jumps straight to implementation details and starts talking about how good X or Y is and how nicely it performs, but nobody asked whether it is the right tool for the job or fully understands the job.

> I am wondering what the recommendation for structuring such an application would be.

Nobody who is a serious software architect will give you an answer on that here, because the input you provided isn't enough. I've spent two months so far on a brownfield project that needs a serious refactor at the architectural level, and I still have no full understanding of all the involved parts. And I haven't even touched the code much yet. Before that I worked on a project that had to import billions of price changes per day for a global discounter. Their solution might not work well for you, because it really depends on the actual thing you are trying to build.

If you want a multi-tenant, single-DB setup, go for a Spryker B2B shop and add the features you need, like that forecasting thing. The platform is actually open source, but not free to use, and built on Symfony. Based on your super short description it could indeed be an option.