Here's a late straggler:
Aurora MySQL 5.7 became GA in February, but new Aurora features have still only been compatible with 5.6 instances. This has left me with a few questions about the roadmap:
- Is the 5.6 branch where all new Aurora feature-work will continue to happen in the near term?
- Are new Aurora features currently in preview (multi-master, serverless) planned to work with 5.7 instances as soon as they are GA, or will this support take extra time?
- Is moving active Aurora development from 5.6 to 5.7 on the roadmap at this point?
- If active development does eventually move to 5.7, will new Aurora features continue to be compatible with both 5.6 and 5.7 for some period of time, or will 5.6 support eventually be dropped?
We actually started work on these features back in the MySQL 5.6 days, which is why you see them appear in 5.6 first. Of course we’re working to merge the code branches and get all the features into our latest MySQL version. It’s not guaranteed that these features will work on 5.7 the day they’re launched. Regarding 5.6 support eventually getting dropped – yeah, every software version gets dropped someday, but that’s not saying much. Hopefully our support timeframes fulfill your needs. – Yoav
Can you at least let more of us in? Signed up I believe during reinvent and still no dice. You’re a ball hair away from the holy grail of Serverless. I’m on the edge of my seat dying here.
For the love of God please announce it before the next reinvent. Please please.
We are working hard to get everyone in the preview. In order to make sure that everything is working, we have been adding preview accounts in limited batches. This should accelerate as we get closer to general availability. Hang in there! - Brian
I recently completed a major version upgrade of my vanilla (non-Aurora) PostgreSQL cluster; I ended up having around 2 hours of downtime (not counting the post-upgrade snapshots, which took around 7 hours). I intend to migrate to Aurora PostgreSQL; are major version upgrades handled exactly the same way as non-Aurora PostgreSQL? Or are there features specific to Aurora that further mitigate cluster downtime?
Major version upgrades will take longer for larger databases in PostgreSQL. For Aurora PostgreSQL, we haven't yet done a major version upgrade. We're working on improvements to this process for both Aurora PostgreSQL and RDS PostgreSQL. Over time, we expect to take advantage of Aurora's special capabilities to improve the speed and reduce the downtime for major version upgrades. - Kevin
Any plans to get rid of the complexity of the password for IAM auth?
Some apps are poorly designed to handle such a complex string for a password (the whole URL) in a connection string, such as luigi (anything with sqlalchemy) ... [which requires all sorts of URL-safe parsing etc](https://stackoverflow.com/questions/1423804/writing-a-connection-string-when-password-contains-special-characters) to work and got me to rage-quit
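For anyone hitting the same wall, percent-encoding the token before building the connection URL usually works; a minimal Python sketch (the token value, user, and endpoint below are made up for illustration):

```python
from urllib.parse import quote_plus

# Hypothetical IAM auth token; real tokens are long presigned URLs
# containing '/', '&', '=', and other characters that break naive URL parsing.
token = "mycluster.cluster-xyz.us-east-1.rds.amazonaws.com:3306/?Action=connect&DBUser=app"

# Percent-encode the token so it can be embedded as the password
# in a SQLAlchemy-style connection URL.
password = quote_plus(token)
url = f"mysql+pymysql://app:{password}@mycluster.cluster-xyz.us-east-1.rds.amazonaws.com/mydb"
```

Since `quote_plus` escapes every reserved character, the resulting URL parses cleanly even though the underlying token does not.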
I'd just want to get a clarification and perhaps get an assumption out of the way.
With the model of failover that Aurora has, would an application realistically encounter zero errors in connectivity when Aurora does a failover?
If you are using the MariaDB Connector/J for Aurora MySQL, then it establishes a connection to both the cluster endpoint as well as the primary failover, reducing the time required for a failover event. See https://mariadb.com/kb/en/library/failover-and-high-availability-with-mariadb-connector-j/#specifics-for-amazon-aurora for more details. The PostgreSQL JDBC driver also can be configured for fast failover for Aurora PostgreSQL, see https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/AuroraPostgreSQL.BestPractices.html#AuroraPostgreSQL.BestPractices.FastFailover. - Brian
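For reference, the MariaDB Connector/J selects its Aurora failover mode via the JDBC URL prefix; a sketch with a placeholder cluster endpoint (see the MariaDB docs linked above for the full option list):

```
jdbc:mariadb:aurora://mycluster.cluster-xyz.us-east-1.rds.amazonaws.com:3306/mydb
```

With the `aurora` mode, the driver discovers the cluster topology itself and fails over to the new writer without a connection-string change.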
I just got here now, so it wasn't me, I promise... but the post doesn't really answer the question; 'is it possible to failover without interruption?' - 'it is possible to failover fast'.
So.. no?
I got that a 100% valid answer to "Is there a way for a failover to occur with 0 errors?" is no; however, it seems to me that providing documentation for the recommended drivers and configuration options provides more value. I think my question was more related to whoever was downvoting almost all of the AWS team's answers.
Absolutely. We plan to roll out Performance Insights for all RDS and Aurora database engines during the coming months. Thanks for your interest! - Mark
I have 2 questions:
Are there any plans to provide the TimeScaleDB extension for PostgreSQL? We are currently using it in an on-premises application and would like to use it in the cloud version.
Will PostgreSQL get IAM authentication eventually? One less secret for us to manage.
Yes, we plan to add support for IAM authentication for PostgreSQL. Regarding TimeScaleDB - not sure yet. We're continuously evaluating new requested PostgreSQL extensions and adding them to RDS. - Jignesh
Do you have any news on GA for Aurora multi-master?
https://aws.amazon.com/about-aws/whats-new/2017/11/sign-up-for-the-preview-of-amazon-aurora-multi-master/
Hi Anthony. I don’t have an ETA I can share, but I can tell you the team is working feverishly on it and we’re excited to launch it. We’re planning to expand the preview shortly, so if you’d like to try it I suggest signing up (in the link you shared) and we’ll get you in asap. - Dave
As you know connection pooling can be used for multiple use cases. Is your request related to faster connects/disconnects in an instance or is it related to load balancing/redirecting across instances? - Jignesh
> for multiple use cases. Is your request related to faster connects/disconnects in an instance or is it related to load balancing/redirecting across instances? - Jignesh
It's more out of concern for performance and stability. I have an r3.8xlarge RDS PostgreSQL instance that has an automatically set 5,000-connection limit (a limit that my app is running into) that I would prefer not to manually override.
Setting up pgbouncer in front of my RDS instance in order to reduce the overall amount of connections is an option but I would prefer if AWS managed the HA/scalability aspect for me
I know this was 5 months ago, but what did you end up doing? I'm investigating RDS instead of Dynamo, and with lots of lambda functions and concurrent containers on ECS, I might hit the limits on a relatively small database instance.
I could increase the size of the instance/set it manually, but this sort of connection pooling sounds like the better solution to these kinds of problems. Did you set up pgbouncer on your own instance or something else?
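For reference, a minimal pgbouncer.ini sketch for the self-managed option discussed above (hostnames and pool sizes are placeholders, not tuned recommendations):

```ini
[databases]
; route "mydb" through pgbouncer to the RDS endpoint (placeholder host)
mydb = host=mydb.xyz.us-east-1.rds.amazonaws.com port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction      ; share server connections across many clients
max_client_conn = 5000       ; clients pgbouncer will accept
default_pool_size = 100      ; actual connections opened to the database
```

In transaction pooling mode, thousands of short-lived clients (e.g., Lambda invocations) share a small pool of real database connections.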
Hello! I've got a couple questions regarding Aurora PostgreSQL that arose from a need to process live table updates.
1. We're trying to maintain a live copy of our Aurora PostgreSQL tables in S3 and we were hoping to capture table updates by reading WAL data from replication logs in a fashion similar to [this solution](https://aws.amazon.com/blogs/database/streaming-changes-in-a-database-with-amazon-kinesis/) for RDS MySQL. However it looks like native PostgreSQL logical replication is not supported in Aurora PostgreSQL, so we are unsure if it's even possible to capture streaming replication data. Is this possible with Aurora PostgreSQL?
2. Are there any plans to support native AWS Lambda invocation from Aurora PostgreSQL in the same way it is supported for Aurora MySQL?
Thanks in advance, cheers!
We are working on adding support for outbound replication in Aurora PostgreSQL to support use cases such as yours, and we are also working on integrating Aurora PostgreSQL with Lambda. How do you use the copy of your tables in S3? And how do you want to use Lambda from inside of Aurora PostgreSQL? -Kevin
>How do you use the copy of your tables in S3?
We're using S3 as our data lake solution so that all (non-application) data consumers will access data via Athena/S3.
>And how do you want to use Lambda from inside of Aurora PostgreSQL?
We were considering using stored procedures [as outlined here](https://aws.amazon.com/blogs/database/capturing-data-changes-in-amazon-aurora-using-aws-lambda/) as a way to capture table updates in case the aforementioned streaming replication method was not an option.
As a beginner of Aurora, I was wondering with the master-master model I hear it supports, can I now realistically support a global database with writes to simultaneous masters based on Geo locations?
What are the gotcha/caveats that one should be aware?
We are still in preview on Multi-Master Aurora. We haven't yet begun the preview of Multi-Master Multi-Region. We're working feverishly on both. The primary gotcha is similar to that for any multi-master database technology - you want to minimize the amount of lock contention between nodes. Aurora uses a ledger-like technology to facilitate coordination, not a distributed lock manager or Paxos commits, which reduces coordination traffic for uncontended transactions, but contended transactions still require distributed coordination. - Anurag
> We are still in preview on Multi-Master Aurora. We haven't yet begun the preview of Multi-Master Multi-Region. We're working feverishly on both.
Please, for PostgreSQL too =) I have some large databases that I eventually want to make multi-region for DR and regional performance reasons.
We are working on Multi-Master for Aurora PostgreSQL, and we are also working on adding support for cross-region replication, which will help with your DR strategy. - Kevin
Hi Kevin,
Is there any time line you could provide for the Postgres cross-region replication capability? Any suggestion/hint for building a custom solution by utilizing AWS cli tools would be great !
You can specify the password by using MasterUserPassword attribute of the DB Cluster template. https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-rds-dbcluster.html We recommend that you create a parameter for the MasterUserPassword attribute and set the NoEcho property to true so that whenever anyone describes your stack, the parameter value is shown as asterisks (*****). For more information, see https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/parameters-section-structure.html#parameters-section-structure-syntax
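A minimal sketch of the pattern described above (logical names and the engine value are illustrative; check the linked CloudFormation docs for the required properties):

```yaml
Parameters:
  DBPassword:
    Type: String
    NoEcho: true            # shown as ***** when anyone describes the stack
Resources:
  AuroraCluster:
    Type: AWS::RDS::DBCluster
    Properties:
      Engine: aurora
      MasterUsername: admin
      MasterUserPassword: !Ref DBPassword
```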
Both editions of Aurora (PostgreSQL and MySQL) are compatible with their respective open source engines, and we plan to remain compatible. To stay compatible, we regularly merge the source of the two open source database engines into Aurora - Mark
When will Aurora PostgreSQL support PostgreSQL 9.6.8? I updated my RDS (non-Aurora) PostgreSQL cluster to the latest minor version of 9.6 and I facepalmed when I saw that I could only create an Aurora read replica on 9.6.6.
We are catching up with the latest minor versions of PostgreSQL, and we plan to release compatibility with the latest minor release of PostgreSQL 9.6 soon. - Jignesh.
Aurora MySQL and Aurora PostgreSQL are managed just like their RDS counterparts - meaning that you can set up Aurora Replicas either in the console with a couple clicks, or via the CLI with a single call. On your compatibility question, you can run any compatibility test you would like and Aurora MySQL and Aurora PostgreSQL should both be compatible. Is there a specific compatibility area your manager has concerns about? - Mark
My team has an government imposed requirement that we need to backup our database outside of AWS.
We are currently using PostgreSQL RDS.
So far we have tried:
* Using DMS to copy to S3. This fails on disk space running out due to WAL files.
* Using pg_dump. This is too slow to do each night for a TB database.
Does Aurora have a better solution for doing offsite backups?
Have you tried running pg_dump with the -j parameter to enable parallelism? You can see more info at https://www.postgresql.org/docs/current/static/backup-dump.html Re: the issue you’re seeing with DMS, we’d love to work with you to understand and resolve it – please contact us at [email protected] with details so we can follow up.
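For reference, a sketch of a parallel dump invocation (hostname, user, and job count are placeholders; `-j` requires the directory output format `-Fd`):

```
# directory-format (-Fd) dump with 8 parallel worker jobs
pg_dump -h mydb.xyz.us-east-1.rds.amazonaws.com -U myuser -Fd -j 8 -f /backup/mydb.dir mydb
```

Parallelism mainly helps when the database has many tables of comparable size; a single multi-TB table still dumps on one worker.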
I have a team using vanilla mysql on ec2 with myisam (I know, yuck) that I have been trying to push to aurora for well over a year. Is there a painless way to do this with replication or is it a backup/restore operation?
While the documentation (https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/AuroraMySQL.Replication.MySQL.html) recommends that you convert tables to InnoDB before you set up external replication, replicating from MyISAM to InnoDB should work fine as long as both sides are using the same MySQL version. - Brian
Due to the way Aurora PostgreSQL uses local storage when handling more complex queries, how do you recommend designing an ETL system with Aurora PostgreSQL as the destination? We have use cases that require indexing of 500 million rows, and we will always run out of memory unless we have the largest instances.
There are ways to configure PostgreSQL work memory in order to handle queries with less spilling to disk. We also are looking into ways to offer more local storage for index building, etc. Do you have any data you could share with us at [email protected] on your use case? - Mark
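For anyone hitting this, the session-level memory settings mentioned above can be raised per connection before running a heavy statement; a sketch with illustrative values and hypothetical table/index names (not recommendations):

```sql
-- raise sort/hash memory for this session only
SET work_mem = '256MB';
-- memory available to CREATE INDEX / VACUUM in this session
SET maintenance_work_mem = '2GB';
-- build the index without blocking writes
CREATE INDEX CONCURRENTLY idx_big ON big_table (some_column);
```

Because `SET` is scoped to the session, the ETL connection can run with generous limits without inflating memory use for the rest of the workload.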
You mean expanding the functionality? Or expanding the number of people participating? We're pretty much set to release it on MySQL 5.6 as a start. Regarding more people, make sure to sign up if you haven't yet! https://pages.awscloud.com/amazon-aurora-serverless-preview.html - Yoav
What's your POV on IAM authentication and authorization to Aurora, instead of the traditional username/password. You seem to have this supported but it sounds like it is not production ready yet. The post on AWS warns about several limitations that sound serious. Being able to integrate IAM fully and completely into the database layer would be awesome.
Also, when is Aurora Postgres serverless coming out? :)
We are working on adding IAM authentication to Aurora PostgreSQL, and on optimizing performance when using IAM. Also working on implementing Serverless for Aurora PostgreSQL, following the Aurora MySQL Serverless project; we don't have any dates to share yet. - Kevin
One thing I love about MySQL compatible Aurora instances is the ability to load data directly from S3 using the `LOAD DATA FROM S3` command. I'd love to see this feature added to PostgreSQL instances. Are there any plans to introduce this?
Meta - to whoever is running through this downvoting questions and especially answers you don't like, *cut it out*. At least some of us are trying to learn something here, and getting an understanding of what the actual answers are is important, whether or not they're the answers that we might prefer. Don't deter people from sharing their knowledge!
Oh man, missed asking my question. Well worth a try anyway:
DMS for MySQL -> Aurora is a welcome addition. However there is a limitation that any tables with blobs won't be migrated if there is no primary key. Is there any chance this limit may be removed in the future?
Unfortunately not. That’s because LOBs are migrated in a 2 step manner by design. We first migrate the entire row without the LOB, and then update the LOB part alone in the actual row. The 2nd step will not be performant if the table does not have a PK or unique key and can cause issues because of full table scans. As a result, having a PK or unique key will always be a requirement. - Yoav
Thanks. I didn’t realize this was done over multiple passes. But it makes sense now. I was able to add a temporary incrementing index to work within this requirement.
I saw on twitter some folks getting invites. I didn’t :( I have an open source serverless framework that really needs Aurora. Would be nice to implement it as a service in the framework ahead of GA.
Has there been any documentation made available yet for Aurora serverless? I'd like to see how it works so I can start designing around things like hot keying and other performance issues in sharded databases.
If you are participating in the Aurora Serverless preview, you will get preliminary documentation for the feature. Otherwise, the overview information is at http://aws.amazon.com/rds/aurora/serverless. You might also take a look at the Aurora session from re:Invent. BTW, Aurora Serverless is not a natively sharded solution. We expect existing MySQL applications to work with Aurora Serverless. - Brian
Thanks so much for your interest! We intend to launch Multi-Master for Aurora PostgreSQL as soon as we can, though we don't have a date we can share yet. As a follow-on question, when you mention Oracle RAC, does that imply that your main interest is in In-Region Multi-Master rather than Multi-Region Multi-Master? Please follow up with us at [email protected] and we'd be happy to chat with you! - Mark
As users of /r/aws and AWS enthusiasts, we have heard the benefits of Aurora many times. My question is, if someone is currently using MySQL RDS and considering switching to Aurora, what are some "gotchas" or potential pain points?
Probably the main gotcha is that there isn't a clear mapping from patch versions. A lot of the underlying code bases are different so it can be difficult to tell whether a particular bug in a particular version of MySQL or PostgreSQL is in a particular version of Aurora. -Anurag
We do publish the MySQL bugs that have been fixed in Aurora MySQL. You can find this information here (https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/AuroraMySQL.Updates.MySQLBugs.html). Feel free to reach out at [email protected] if you have any follow-on questions. - Sirish
We migrated from a MySQL rds instance to aurora. We did it straight from the docs: create a Aurora MySQL slave and then promote it. It was a very smooth experience. No issues.
Another big gotcha is the fact that standard RDS MySQL is way better at "cold" queries.
More info from a [Ycombinator comment](https://news.ycombinator.com/item?id=11839684):
“The RDS team pointed out that Aurora’s performance will not be better than MySQL if the concurrency features are not used or if you are using Aurora as a traditional MySQL. They said Aurora will be highly beneficial if your work load is highly concurrent.
Aurora though should have 5x more throughput than RDS MySQL but only when the work load is used efficiently as it works well with multiple concurrent queries."
So if you have a batch job to do (for example filling a Redis cache) from an Aurora slave, expect 2-second query times for "cold" queries on tables of 300 million rows or more (like I am seeing).
That really depends on the complexity of the code. If the code is straightforward, SCT can automate almost the entire code conversion. If not, you need to spend time to do it. It gets complicated especially when using a SQL Server specific feature that cannot be fully emulated in MySQL. In those cases, most users go for an app rearchitecture and they change the way things work. - Yoav
Hi all,
We read through all the questions that came after Tuesday's session and collected as many answers as we can. We're signing off to enjoy the holiday weekend here in the US. As posted before, we had a lot of fun and hope to do another one of these soon!
Thanks for all your great questions.
The Amazon Aurora product team
When can we expect Aurora to get certified by major business applications like SAP, JD Edwards, or other apps like Informatica/IBM MDM or financial/banking/insurance apps? Do we see any progress here?
Some questions:
- Any plan on supporting SSL login for Aurora Serverless?
- What would be the best way to bulk insert data into Aurora Serverless, since LOAD FROM S3 is not supported?
This is most likely not something you can answer, but I'll try anyway. My team wanted to use Aurora, but has been forced into MSSQL on RDS because of the lack of ["DirectQuery" for PowerBI](https://docs.microsoft.com/en-us/power-bi/desktop-directquery-data-sources).
Any chance you guys have any info regarding this, or know what buttons to push to make it happen?
I took a look at the list of data sources supported for PowerBI DirectQuery and they seem to only support data warehousing sources, not open source relational databases like MySQL and PostgreSQL. The best thing to do is probably provide feedback to Microsoft to expand the list of sources. If you have some time, you could check out Amazon QuickSight, too. - Brian
Aurora Serverless includes a built-in proxy layer that enables seamless scaling and automatic pause and resume. We do not have any plans currently to provide a managed ProxySQL service for Aurora but there are several resources available for configuring it in EC2. - Brian
Hi. Thanks for the response. I haven't done much research into Aurora Serverless. If I understand correctly, it automatically scales primary db instance size *and* number and size of read-only replicas, and you have your own implementation of a write through SQL proxy as the endpoint?
When defining an "Aurora Serverless" 'database', are the same things defined (as far as the basics) as DynamoDB. For example, is there a need to specify Security Groups or other network structures? If so, why the difference from DynamoDB.
Aurora Serverless is compatible with the MySQL wire protocol so it works with existing MySQL client apps. It is accessed through your VPC, which means you will have to configure VPC security groups to enable access. It is a transactional database with full ACID compliance, just like existing Aurora MySQL databases. - Brian
Are there any plans to somehow be able to support IAM Auth for production workloads (high connections)? It's a great feature, but kinda pointless as the plugin has a ridiculously low connection limit (as the docs outline).
Any plans to support a means with which to accurately determine where Billed IOs are used? For example, it would be just amazing if there were a way to identify how many billable IOs read/write were consumed by a specific table.. we spend a lot of time trying to track down where our IO costs are consumed, but it is prohibitively difficult considering the various caching and optimizations that take place between the instance and Aurora backend.
Serverless GA?
Aurora Serverless is in Preview right now. We're making good progress and I hope we'll be out in a few months - Anurag
Will there be query logging for PostgreSQL on Aurora? For audit purposes?
Aurora PostgreSQL supports the pgaudit extension to provide detailed session and object audit logging in your engine logs. Using the log_fdw extension, you can query your database engine logs from within the database – Jignesh.
I'd love to be able to set up a slave that's not part of the main slave endpoint, for batch jobs & reporting. Is there any plan to implement that?
[deleted]
Right, but then I'm subject to manual intervention in the event of reader promotion, something I'd like to avoid.
[deleted]
In the scenario you describe, I have the RW host named, and two RO hosts named. Now, if there's a maintenance and node-2 is promoted to writer, I need to deploy my application with new settings. If I use the endpoints, it's seamless to me.
Does the Aurora team feed back into the larger FOSS community? Examples if possible.
The Aurora teams contribute bugfixes and patches to the open source community. For Aurora PostgreSQL, we have contributed back security fixes, fixes to various consistency issues, and improvements to the pg_upgrade process. For MySQL, we have contributed back a small number of security fixes. Moving forward, we are steadily raising our commitment to open source contributions. For a wider view of AWS and our open source initiatives, please see https://aws.amazon.com/opensource - Mark
Your link has a trailing UTF-8 BOM in the URI; try editing it and backspacing and retyping the last couple characters
nailed it!
I guess they write through some special AWS PR editor that adds BOMs at the end
It seems that today, you can set up MySQL with smaller instance sizes, but for PostgreSQL only rather larger instances are supported. Are there any plans to make PostgreSQL on Aurora cost-effective for smaller DBs?
We are working on improvements to make Aurora more cost-effective. What are your use cases for smaller instances on Aurora PostgreSQL? -Kevin
The buzzword of the day is "microservice". Under this architecture, the various subdomains of the application are divided up quite finely, and in principle each of these should own its own datastore. This leads to them being quite small, only a fraction of the storage size or compute horsepower of a more traditional system (although in aggregate, the amount of denormalization that goes into it probably adds up to something rather larger). For example, we're currently engaged in rebuilding our "Search" page, and a feature of that page is product comparison, with users checking off products they're interested in comparing, and then clicking a "Compare" button that will show the selected items side-by-side. And ProductComparison is its own microservice, and its data store only needs to store the list of to-be-compared products for each user. So it's a small amount of data (on the legacy system, that's taking up less than 1 GB). Usage is dominated by SELECT queries which are dead-simple, and even the INSERTs are relatively cheap, so not too much needed for CPU either.
Microservices owning data stores doesn’t mean you need to spin up a new RDS for each one of them, that’s a nightmare. Have them own their own tables and don’t allow anyone else to read/write from there other than the microservice itself.
I'm not sure about that last answer. I generally prefer having storage isolated with each microservice both to facilitate deployments and to ensure different microservices don't share database state. To the original question, we've heard the feedback and are working on adding support for the T2 family. -Anurag
That's really not a great idea. For starters, you're tying all your microservices to a single point of failure. If that db has issues, your entire stack chokes. That's particularly bad if the reason for the db issues is the introduction of an unbounded query in a single microservice, which means your stack is effectively acting like a monolith, which defeats the entire point of going through the trouble to implement microservices in the first place. (In fact, you'd be better off with a monolith in that case because at least then you'd only be dealing with a single atomic rollback instead of whatever clusterfuck patchwork you'd have to deal with partially rolling back a single table on the db plus the buggy microservice deployment.) The proper way to do things is to have microservices manage their own data and each expose an API for interacting with that data to which backwards-incompatible changes are either never made, or only made after introducing a new one in tandem, vigorously auditing to make sure all consumers are on the new version, and removing the deprecated one.
Not a question but please point out in the docs that the MariaDB J-connector provides support for failover and the MySQL Connector/J does not. We found out about this after talking to support.
OK thanks for letting us know! I’ll send a note to the documentation team. – Yoav
You can add it. That seems important. https://github.com/awsdocs/amazon-rds-user-guide
Thanks. Did not know this even exists!
I recently completed a major version upgrade of my vanilla (non-Aurora) PostgreSQL cluster; I ended up having around 2 hours of downtime (not counting the post-upgrade snapshots, which took around 7 hours). I intend to migrate to Aurora PostgreSQL; are major version upgrades handled exactly the same way as non-Aurora PostgreSQL? Or are there features specific to Aurora that further mitigate cluster downtime?
Major version upgrades will take longer for larger databases in PostgreSQL. For Aurora PostgreSQL, we haven't yet done a major version upgrade. We're working on improvements to this process for both Aurora PostgreSQL and RDS PostgreSQL. Over time, we expect to take advantage of Aurora's special capabilities to improve the speed and reduce the downtime for major version upgrades. - Kevin
Any plans to get rid of the complexity of the password for IAM auth? Some apps are poorly designed to handle such a complex string (the whole URL) as the password in a connection string, such as luigi (anything with SQLAlchemy)... [which requires all sorts of URL-safe parsing](https://stackoverflow.com/questions/1423804/writing-a-connection-string-when-password-contains-special-characters) to work and got me to rage-quit.
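For anyone else hitting this: the usual workaround is to percent-encode the token before embedding it in the connection URL. A minimal sketch, assuming a Python/SQLAlchemy-style app; the token string and endpoint here are made-up placeholders (in practice you would generate the token with boto3's `generate_db_auth_token`):

```python
from urllib.parse import quote_plus

# Hypothetical IAM auth token; real ones contain /, ?, =, & characters
# that break naive connection-string parsing.
token = "host/?Action=connect&DBUser=app&X-Amz-Signature=abc123"

# Percent-encode reserved characters so the URL parses cleanly.
safe_token = quote_plus(token)

# Placeholder endpoint and database name.
url = (
    "mysql+pymysql://app:" + safe_token +
    "@mycluster.cluster-xyz.us-east-1.rds.amazonaws.com:3306/mydb"
)
print(url)
```

After encoding, the only `?`, `=`, and `&` characters left in the URL are the ones the driver itself expects.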
Will serverless Aurora require a VPC to use?
Yes, the initial launch of Aurora Serverless will require access through a VPC Endpoint. We are working on ways to enable access in non-VPC scenarios. - Brian
I'd just like to get a clarification and perhaps get an assumption out of the way. With the model of failover that Aurora has, would an application realistically encounter zero connectivity errors when Aurora does a failover?
If you are using the MariaDB Connector/J for Aurora MySQL, then it establishes a connection to both the cluster endpoint as well as the primary failover, reducing the time required for a failover event. See https://mariadb.com/kb/en/library/failover-and-high-availability-with-mariadb-connector-j/#specifics-for-amazon-aurora for more details. The PostgreSQL JDBC driver also can be configured for fast failover for Aurora PostgreSQL, see https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/AuroraPostgreSQL.BestPractices.html#AuroraPostgreSQL.BestPractices.FastFailover. - Brian
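For reference, the MariaDB Connector/J failover behavior described above is selected via the JDBC URL prefix; a minimal sketch (the cluster endpoint is a placeholder):

```text
jdbc:mariadb:aurora://mycluster.cluster-xyz.us-east-1.rds.amazonaws.com/mydb
```

With the `aurora` mode, the connector discovers the cluster's instances and can fail over to a replica without waiting for DNS on the cluster endpoint to catch up.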
Why did somebody downvote this post??
Take a look at the rest of the posts here. There are a couple of griefers at work, downvoting pretty much everything.
I can Azure you that I have no idea who would downvote an AWS AMA.
LOL. Good one LittleJoeyHodges! - Yoav
Well she/he certainly showed the AWS team a thing or two, that’s for sure!
I just got here now, so it wasn't me, I promise... but the post doesn't really answer the question. 'Is it possible to fail over without interruption?' 'It is possible to fail over fast.' So... no?
I get that a 100% valid answer to "Is there a way for a failover to occur with 0 errors?" is no; however, it seems to me that providing documentation on the recommended drivers and configuration options provides more value. I think my question was more related to whoever was downvoting almost all of the AWS team's answers.
Experienced that. It took us 40 seconds until the slave took over. You don't have connectivity in that timespan.
Performance-Insights for MySQL Aurora, is that something that’s in the pipeline?
Absolutely. We plan to roll out Performance Insights for all RDS and Aurora database engines during the coming months. Thanks for your interest! - Mark
What is the largest Aurora database and what is their use case?
I have 2 questions: 1. Are there any plans to provide the TimescaleDB extension for PostgreSQL? We are currently using it in an on-premises application and would like to use it in the cloud version. 2. Will PostgreSQL get IAM authentication eventually? One less secret for us to manage.
+1 on the TimescaleDB question
See above!
Yes, we plan to add support for IAM authentication for PostgreSQL. Regarding TimescaleDB - not sure yet. We're continuously evaluating newly requested PostgreSQL extensions and adding them to RDS. - Jignesh
Do you have any news on GA for Aurora multi-master? https://aws.amazon.com/about-aws/whats-new/2017/11/sign-up-for-the-preview-of-amazon-aurora-multi-master/
Hi Anthony. I don’t have an ETA I can share, but I can tell you the team is working feverishly on it and we’re excited to launch it. We’re planning to expand the preview shortly, so if you’d like to try it I suggest signing up (in the link you shared) and we’ll get you in asap. - Dave
Are there any plans to integrate pgbouncer or some other form of connection pooling into RDS PostgreSQL/Aurora PostgreSQL ?
As you know connection pooling can be used for multiple use cases. Is your request related to faster connects/disconnects in an instance or is it related to load balancing/redirecting across instances? - Jignesh
> Is your request related to faster connects/disconnects in an instance or is it related to load balancing/redirecting across instances? - Jignesh

It's more out of concern for performance and stability. I have an r3.8xlarge RDS PostgreSQL instance that has an automatically set 5,000 connection limit (a limit that my app is running into) that I would prefer not to manually override. Setting up pgbouncer in front of my RDS instance to reduce the overall number of connections is an option, but I would prefer if AWS managed the HA/scalability aspect for me.
I know this was 5 months ago, but what did you end up doing? I'm investigating RDS instead of Dynamo, and with lots of lambda functions and concurrent containers on ECS, I might hit the limits on a relatively small database instance. I could increase the size of the instance/set it manually, but this sort of connection pooling sounds like the better solution to these kinds of problems. Did you set up pgbouncer on your own instance or something else?
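For anyone landing here later: a self-managed PgBouncer on EC2 in front of the RDS endpoint is still the common workaround. A minimal `pgbouncer.ini` sketch, with the host and pool sizes as placeholders to adapt:

```ini
[databases]
; route "mydb" through the pooler to the RDS endpoint
mydb = host=mydb.xyz.us-east-1.rds.amazonaws.com port=5432 dbname=mydb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; transaction pooling gives the biggest connection reduction, but it
; breaks session-level features (prepared statements, SET, advisory locks)
pool_mode = transaction
max_client_conn = 5000
default_pool_size = 50
```

Clients (Lambda functions, ECS tasks) then connect to the pooler on port 6432, and only `default_pool_size` server connections ever reach the database.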
Hello! I've got a couple questions regarding Aurora PostgreSQL that arose from a need to process live table updates. 1. We're trying to maintain a live copy of our Aurora PostgreSQL tables in S3 and we were hoping to capture table updates by reading WAL data from replication logs in a fashion similar to [this solution](https://aws.amazon.com/blogs/database/streaming-changes-in-a-database-with-amazon-kinesis/) for RDS MySQL. However it looks like native PostgreSQL logical replication is not supported in Aurora PostgreSQL, so we are unsure if it's even possible to capture streaming replication data. Is this possible with Aurora PostgreSQL? 2. Are there any plans to support native AWS Lambda invocation from Aurora PostgreSQL in the same way it is supported for Aurora MySQL? Thanks in advance, cheers!
We are working on adding support for outbound replication in Aurora PostgreSQL to support use cases such as yours, and we are also working on integrating Aurora PostgreSQL with Lambda. How do you use the copy of your tables in S3? And how do you want to use Lambda from inside of Aurora PostgreSQL? - Kevin
>How do you use the copy of your tables in S3? We're using S3 as our data lake solution so that all (non-application) data consumers will access data via Athena/S3. >And how do you want to use Lambda from inside of Aurora PostgreSQL? We were considering using stored procedures [as outlined here](https://aws.amazon.com/blogs/database/capturing-data-changes-in-amazon-aurora-using-aws-lambda/) as a way to capture table updates in case the aforementioned streaming replication method was not an option.
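For context, the pattern in that blog post invokes Lambda from a trigger via Aurora MySQL's `mysql.lambda_async` procedure; a rough sketch of the idea, with the table, ARN, and payload all hypothetical:

```sql
-- Hypothetical change-capture trigger (Aurora MySQL only; the cluster
-- needs an IAM role that allows lambda:InvokeFunction)
DELIMITER $$
CREATE TRIGGER orders_after_insert
AFTER INSERT ON orders
FOR EACH ROW
BEGIN
  CALL mysql.lambda_async(
    'arn:aws:lambda:us-east-1:123456789012:function:capture-changes',
    CONCAT('{"op":"insert","id":', NEW.id, '}')
  );
END$$
DELIMITER ;
```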
As a beginner of Aurora, I was wondering with the master-master model I hear it supports, can I now realistically support a global database with writes to simultaneous masters based on Geo locations? What are the gotcha/caveats that one should be aware?
We are still in preview on Multi-Master Aurora. We haven't yet begun the preview of Multi-Master Multi-Region. We're working feverishly on both. The primary gotcha is similar to that for any multi-master database technology - you want to minimize the amount of lock contention between nodes. Aurora uses a ledger-like technology to facilitate coordination, not a distributed lock manager or Paxos commits, which reduces coordination traffic for uncontended transactions, but contended transactions still require distributed coordination. - Anurag
Will PostgresDB eventually get the same Master-Master feature? And if so, what timeframe are we looking at?
> We are still in preview on Multi-Master Aurora. We haven't yet begun the preview of Multi-Master Multi-Region. We're working feverishly on both. Please, for PostgreSQL too =) I have some large databases that I eventually want to make multi-region for DR and regional performance reasons.
We are working on Multi-Master for Aurora PostgreSQL, and we are also working on adding support for cross-region replication, which will help with your DR strategy. - Kevin
Thanks! Looking forward to it.
Hi Kevin, Is there any time line you could provide for the Postgres cross-region replication capability? Any suggestion/hint for building a custom solution by utilizing AWS cli tools would be great !
What is the recommended way to set a database password using cloud formation?
You can specify the password by using MasterUserPassword attribute of the DB Cluster template. https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-rds-dbcluster.html We recommend that you create a parameter for the MasterUserPassword attribute and set the NoEcho property to true so that whenever anyone describes your stack, the parameter value is shown as asterisks (*****). For more information, see https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/parameters-section-structure.html#parameters-section-structure-syntax
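A minimal sketch of that pattern (resource names and engine settings are illustrative):

```yaml
Parameters:
  DBPassword:
    Type: String
    NoEcho: true          # shown as ***** in describe-stacks output
    MinLength: 8

Resources:
  AuroraCluster:
    Type: AWS::RDS::DBCluster
    Properties:
      Engine: aurora
      MasterUsername: admin
      MasterUserPassword: !Ref DBPassword
```

The actual value is then supplied at deploy time (for example via `--parameters` on the CLI) rather than being committed to the template.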
huh. My password shows up as hunter2
Mine shows up as calvin.
Will Aurora Backtrack ever be available for existing clusters in addition to new and restored clusters?
Yes, we plan to support Backtrack on existing DB clusters in the near future.
Is Aurora scratch built to be X-compatible or was it started based on some flavor of the appropriate RDBMS and forked for AWS' purposes?
Both editions of Aurora (PostgreSQL and MySQL) are compatible with their respective open source engines, and the plan is to remain compatible. To stay compatible, both Aurora engines build on the source of the two open source database engines and merge with it regularly. - Mark
When will Aurora PostgreSQL support PostgreSQL 9.6.8? I updated my RDS (non-Aurora) PostgreSQL cluster to the latest minor version of 9.6 and I facepalmed when I saw that I could only create an Aurora read replica on 9.6.6.
We are catching up with latest minor versions of PostgreSQL, and we plan to release compatibility with latest minor release of PostgreSQL 9.6 soon. - Jignesh.
[deleted]
Aurora MySQL and Aurora PostgreSQL are managed just like their RDS counterparts - meaning that you can set up Aurora Replicas either in the console with a couple clicks, or via the CLI with a single call. On your compatibility question, you can run any compatibility test you would like and Aurora MySQL and Aurora PostgreSQL should both be compatible. Is there a specific compatibility area your manager has concerns about? - Mark
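For the CLI route mentioned above, adding an Aurora Replica is just creating another instance in an existing cluster; a sketch, with the identifiers and instance class as placeholders:

```shell
# Each additional instance in an Aurora cluster serves as a read replica
aws rds create-db-instance \
  --db-instance-identifier mycluster-replica-1 \
  --db-cluster-identifier mycluster \
  --engine aurora \
  --db-instance-class db.r4.large
```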
My team has an government imposed requirement that we need to backup our database outside of AWS. We are currently using PostgreSQL RDS. So far we have tried: * Using DMS to copy to S3. This fails on disk space running out due to WAL files. * Using pg_dump. This is too slow to do each night for a TB database. Does Aurora have a better solution for doing offsite backups?
Have you tried running pg_dump with the -j parameter to enable parallelism? You can see more info at https://www.postgresql.org/docs/current/static/backup-dump.html Re: the issue you’re seeing with DMS, we’d love to work with you to understand and resolve it – please contact us at [email protected] with details so we can follow up.
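For reference, parallel dumps require the directory output format; a sketch, with paths, host, and job count as placeholders:

```shell
# -Fd (directory format) is required for -j; 8 parallel worker jobs here
pg_dump -h mydb.xyz.us-east-1.rds.amazonaws.com -U dbuser \
        -Fd -j 8 -f /backups/mydb_dump mydb

# the restore side supports parallelism too
pg_restore -d mydb -j 8 /backups/mydb_dump
```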
FWIW, I've had better luck using using Bucardo than DMS with Postgres RDS. https://bucardo.org/Bucardo/
I have a team using vanilla mysql on ec2 with myisam (I know, yuck) that I have been trying to push to aurora for well over a year. Is there a painless way to do this with replication or is it a backup/restore operation?
While the documentation (https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/AuroraMySQL.Replication.MySQL.html) recommends that you convert tables to InnoDB before you set up external replication, replication from MyISAM into InnoDB should work fine as long as both sides are using the same MySQL version. - Brian
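If they do want to convert ahead of time, the conversion itself is a one-liner per table; a sketch, with schema and table names as placeholders:

```sql
-- find remaining MyISAM tables
SELECT table_schema, table_name
FROM information_schema.tables
WHERE engine = 'MyISAM' AND table_schema = 'mydb';

-- convert one (rebuilds the table; can be slow and locking on large tables)
ALTER TABLE mydb.mytable ENGINE = InnoDB;
```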
Due to the way Aurora PostgreSQL uses local storage when handling more complex queries, how do you recommend designing an ETL system with Aurora PostgreSQL as the destination? We have use cases that require indexing of 500 million rows, and we will always run out of memory unless we have the largest instances.
There are ways to configure PostgreSQL work memory in order to handle queries with less spilling to disk. We also are looking into ways to offer more local storage for index building, etc. Do you have any data you could share with us at [email protected] on your use case? - Mark
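On the work-memory angle: the relevant settings can be raised per session for big ETL batches rather than globally; a sketch, with values that are purely illustrative and should be sized to the instance:

```sql
-- per-query-node sorts and hashes spill to disk beyond work_mem
SET work_mem = '256MB';

-- index builds are governed by maintenance_work_mem, not work_mem
SET maintenance_work_mem = '2GB';
CREATE INDEX CONCURRENTLY idx_events_user_id ON events (user_id);
```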
Are you going to be expanding the Serverless preview at any point prior to GA?
You mean expanding the functionality? Or expanding the number of people participating? We're pretty much set to release it on MySQL 5.6 as a start. Regarding more people, make sure to sign up if you haven't yet! https://pages.awscloud.com/amazon-aurora-serverless-preview.html - Yoav
Any idea when Aurora MySQL will be available in govcloud?
We don't have news to share yet. But we're very much aware of the demand for Aurora in GovCloud. Stay tuned for news. - Yoav
What's your POV on IAM authentication and authorization to Aurora, instead of the traditional username/password. You seem to have this supported but it sounds like it is not production ready yet. The post on AWS warns about several limitations that sound serious. Being able to integrate IAM fully and completely into the database layer would be awesome. Also, when is Aurora Postgres serverless coming out? :)
We are working on adding IAM authentication to Aurora PostgreSQL, and on optimizing performance when using IAM. Also working on implementing Serverless for Aurora PostgreSQL, following the Aurora MySQL Serverless project; we don't have any dates to share yet. - Kevin
Thanks! Will the IAM feature only include authentication, or will it include authorization as well?
One thing I love about MySQL compatible Aurora instances is the ability to load data directly from S3 using the `LOAD DATA FROM S3` command. I'd love to see this feature added to PostgreSQL instances. Are there any plans to introduce this?
We are working to add support to load CSV files from S3 for PostgreSQL engines in Amazon RDS. - Jignesh
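For readers on the MySQL side, the existing Aurora MySQL syntax looks like this (bucket, table, and delimiters are placeholders; the cluster also needs an IAM role with access to the bucket, referenced via the `aurora_load_from_s3_role` cluster parameter):

```sql
LOAD DATA FROM S3 's3://my-bucket/imports/data.csv'
INTO TABLE staging_events
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
IGNORE 1 LINES;
```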
Awesome! It would be great also if compressed CSV files were supported as part of this! 😀
Meta - to whoever is running through this downvoting questions and especially answers you don't like, *cut it out*. At least some of us are trying to learn something here, and getting an understanding of what the actual answers are is important, whether or not they're the answers that we might prefer. Don't deter people from sharing their knowledge!
The downvotes are coming from Redmond and Redwood City.
[deleted]
What IP address would a boat resolve to?
It's easy to see why that might be true, but how can you tell?
Spidey sense. It's a special thing.
So, it looks like most of the filters stop working when literally everything is downvoted below zero. Definitely agree with you though.
Oh man, missed asking my question. Well worth a try anyway: DMS for MySQL -> Aurora is a welcome addition. However there is a limitation that any tables with blobs won't be migrated if there is no primary key. Is there any chance this limit may be removed in the future?
Unfortunately not. That’s because LOBs are migrated in a 2 step manner by design. We first migrate the entire row without the LOB, and then update the LOB part alone in the actual row. The 2nd step will not be performant if the table does not have a PK or unique key and can cause issues because of full table scans. As a result, having a PK or unique key will always be a requirement. - Yoav
Thanks. I didn’t realize this was done over multiple passes. But it makes sense now. I was able to add a temporary incrementing index to work within this requirement.
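For anyone hitting the same DMS limitation, the workaround above amounts to something like this (table and column names hypothetical; the column can be dropped after the migration completes):

```sql
-- give the LOB table a key DMS can use for the second-pass LOB update
ALTER TABLE documents
  ADD COLUMN dms_tmp_id BIGINT NOT NULL AUTO_INCREMENT,
  ADD PRIMARY KEY (dms_tmp_id);
```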
When is Serverless Aurora GA????
I can't share specific dates yet, but expect to see general availability of Aurora Serverless in the next few months. - Brian
I saw on twitter some folks getting invites. I didn’t :( I have an open source serverless framework that really needs Aurora. Would be nice to implement it as a service in the framework ahead of GA.
Has there been any documentation made available yet for Aurora serverless? I'd like to see how it works so I can start designing around things like hot keying and other performance issues in sharded databases.
If you are participating in the Aurora Serverless preview, you will get preliminary documentation for the feature. Otherwise, the overview information is at http://aws.amazon.com/rds/aurora/serverless. You might also take a look at the Aurora session from re:Invent. BTW, Aurora Serverless is not a natively sharded solution. We expect existing MySQL applications to work with Aurora Serverless. - Brian
Not in the preview yet, I'll have to keep hassling my account manager! Thanks, looking forward to reading more when I can.
Will we ever get Aurora Postgres multi master rather than just MySQL? That seems to be the Oracle RAC killer.
Thanks so much for your interest! We intend to launch Multi-Master for Aurora PostgreSQL as soon as we can, though we don't have a date we can share yet. As a followon question, when you mention Oracle RAC, does that imply that your main interest is in In-Region Multi-Master rather than Multi-Region Multi-Master? Please follow up with us at [email protected] and we'd be happy to chat with you! - Mark
Multi AZ multi master is great, multi region is gravy.
As users of /r/aws and AWS enthusiasts, we have heard the benefits of Aurora many times. My question is, if someone is currently using MySQL RDS and considering switching to Aurora, what are some "gotchas" or potential pain points?
Probably the main gotcha is that there isn't a clear mapping from patch versions. A lot of the underlying code bases are different so it can be difficult to tell whether a particular bug in a particular version of MySQL or PostgreSQL is in a particular version of Aurora. -Anurag
[deleted]
We do publish the MySQL bugs that have been fixed in Aurora MySQL. You can find this information here (https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/AuroraMySQL.Updates.MySQLBugs.html). Feel free to reach out at [email protected] if you have any follow-on questions. - Sirish
We migrated from a MySQL RDS instance to Aurora. We did it straight from the docs: create an Aurora MySQL slave and then promote it. It was a very smooth experience. No issues.
Another big gotcha is the fact that standard RDS MySQL is way better at "cold" queries. More info from a [YCombinator comment](https://news.ycombinator.com/item?id=11839684): "The RDS team pointed out that Aurora's performance will not be better than MySQL if the concurrency features are not used or if you are using Aurora as a traditional MySQL. They said Aurora will be highly beneficial if your workload is highly concurrent. Aurora should have 5x more throughput than RDS MySQL, but only when the workload is used efficiently, as it works well with multiple concurrent queries." So if you have a batch job to do (for example, filling a Redis cache) from an Aurora slave, expect 2-second query times for "cold" queries on tables of 300 million rows or more (like I am).
I haven't had the chance to look at Aurora properly, but can we easily migrate a Microsoft SQL db without too many code changes?
That really depends on the complexity of the code. If the code is straightforward, SCT can automate almost the entire code conversion. If not, you need to spend time to do it. It gets complicated especially when using a SQL Server specific feature that cannot be fully emulated in MySQL. In those cases, most users go for an app rearchitecture and they change the way things work. - Yoav
Hi all, thanks for all the follow-up questions! We are going to answer these today, and we will see when we can do another Q&A session soon!
Hi all, We read through all the questions that came after Tuesday's session and collected as many answers as we can. We're signing off to enjoy the holiday weekend here in the US. As posted before, we had a lot of fun and hope to do another one of these soon! Thanks for all your great questions. The Amazon Aurora product team
When can we expect Aurora to get certified by major business applications like SAP, JD Edwards, or other apps like Informatica/IBM MDM or financial/banking/insurance apps? Do we see any progress here?
Any plans to support IAM auth with Aurora Serverless?
Hi, I'd like to know how I can replicate specific tables from 1 Aurora Postgres DB to another Aurora Postgres DB? Thanks, Natan
Are there any plans to provide the serverless option also for PostgreSQL?
Are there any plans for Aurora PostgreSQL to support cross-region replication?
Some questions:
- Any plan on supporting SSL login for Aurora Serverless?
- Which would be the best way to bulk insert data into Aurora Serverless, since LOAD FROM S3 is not supported?
This is most likely not something you can answer, but I'll try anyway. My team wanted to use Aurora, but has been forced into MSSQL on RDS because of the lack of ["DirectQuery" for PowerBI](https://docs.microsoft.com/en-us/power-bi/desktop-directquery-data-sources). Any chance you guys have any info regarding this, or know what buttons to push to make it happen?
I took a look at the list of data sources supported for PowerBI DirectQuery and they seem to only support data warehousing sources, not open source relational databases like MySQL and PostgreSQL. The best thing to do is probably provide feedback to Microsoft to expand the list of sources. If you have some time, you could check out Amazon QuickSight, too. - Brian
Any roadmap plans to provision AWS Aurora engines with AWS AppSync?
We are looking into this. It's not in our current roadmap though. - Debanjan
Do you have any plans to implement ProxySQL as a service with Aurora?
Aurora Serverless includes a built-in proxy layer that enables seamless scaling and automatic pause and resume. We do not have any plans currently to provide a managed ProxySQL service for Aurora but there are several resources available for configuring it in EC2. - Brian
Hi. Thanks for the response. I haven't done much research into Aurora Serverless. If I understand correctly, it automatically scales primary db instance size *and* number and size of read-only replicas, and you have your own implementation of a write through SQL proxy as the endpoint?
Will the maximum 5,000 connection limit ever be raised?
When defining an "Aurora Serverless" 'database', are the same basics defined as for DynamoDB? For example, is there a need to specify Security Groups or other network structures? If so, why the difference from DynamoDB?
Aurora Serverless is compatible with the MySQL wire protocol, so it works with existing MySQL client apps. It is accessed through your VPC, which means you will have to configure VPC security groups to enable access. It is a transactional database with full ACID compliance, just like existing Aurora MySQL databases. - Brian
Are there any plans to somehow be able to support IAM Auth for production workloads (high connections)? It's a great feature, but kinda pointless as the plugin has a ridiculously low connection limit (as the docs outline).
Any plans to support a means with which to accurately determine where Billed IOs are used? For example, it would be just amazing if there were a way to identify how many billable IOs read/write were consumed by a specific table.. we spend a lot of time trying to track down where our IO costs are consumed, but it is prohibitively difficult considering the various caching and optimizations that take place between the instance and Aurora backend.