Connect your app to existing MySQL and PostgreSQL database

Amplify's native integration supports any MySQL or Postgres database, no matter if they're hosted on AWS within a VPC or outside of AWS with a 3rd party hosted database provider.

You must create a connection string using the following database information to get started:

Database hostname
Database port
Database username
Database user password
Database name

Step 1 - Set secrets for database connection

First, provide all the database connection information as secrets, you can use the Amplify sandbox's secret functionality to set them or go to the Amplify console to set secrets in a shared environment:

npx ampx sandbox secret set SQL_CONNECTION_STRING

Connection string format for MySQL

mysql://user:password@hostname:port/db-name

Connection string format for PostgreSQL

postgres://user:password@hostname:port/db-name

Step 2 - Generate TypeScript representation of your database schema

Run the following command to generate the Data schema with your database connection information. It'll infer an a.model() representation for all database tables that have primary key specified.

npx ampx generate schema-from-database --connection-uri-secret SQL_CONNECTION_STRING --out amplify/data/schema.sql.ts

Info

Connecting to a database behind the VPC

If your RDS database exists within a VPC, it must be configured to be Publicly accessible. This does not mean the instance needs to be accessible from all IP addresses on the Internet, but this flag is required to allow your local machine to connect via an Inbound Rule rather than being inside your VPC or connected to the VPC via VPN.

To generate the TypeScript representation of your database schema:

If your database is protected by a VPC, you will need to add an Inbound Rule for the database port from your local IP address. The npx ampx generate schema-from-database command connects to your database from your local workstation to read schema information.

If you are connecting to an RDS Proxy, the machine you run the generate schema-from-database command must be in the same VPC as the proxy itself, or you must connect to it via VPN. Simply opening an Inbound Rule for the database port is not sufficient.

To connect to your database during runtime: When you specify connection information, we'll compare the hostname you supply against a list of RDS instances, clusters, and proxies in your account in the same region as your project. If we find a match, we'll automatically detect the VPC configuration for your database and provision a SQL Lambda function to connect to the database and retrieve the schema.

The VPC security group in which your database instance, cluster, or proxy is installed must have Inbound rules that allow traffic from the following TCP ports:

The specified database port (e.g., 3306 for MySQL databases or 5432 for Postgres databases).
Port 443 (HTTPS) to allow the Lambda function to connect to AWS Systems Manager to retrieve configuration parameters.

Finally, the security group must have Outbound rules that allow traffic on those same ports from the security group itself.

Info

Handling of implicit fields (id, createdAt, updatedAt)

When creating new DynamoDB-backed data models via a.model(), a set of a implicit fields, such as id, createdAt, and updatedAt are added by default. When connecting to an existing SQL database, these fields are not implicitly added as they are not part of the underlying data source. If createdAt and updatedAt are valid columns in the underlying database table, then Amplify Data will automatically populate those fields with their respective values upon create and update mutations.

Learn more

RDS Proxy for improved connectivity

Consider adding an RDS Proxy in front of the cluster to manage database connections.

When using Amplify GraphQL API with a relational database like Amazon RDS, each query from your application needs to open a separate connection to the database.

If there are a large number of queries occurring concurrently, it can exceed the connection limit on the database and result in errors like "Too many connections". To avoid this, Amplify can use an RDS Proxy when connecting your GraphQL API to a database.

The RDS Proxy acts as an intermediary sitting in front of your database. Instead of each application query opening a direct connection to the database, they will connect through the Proxy. The Proxy helps manage and pool these connections to avoid overwhelming your database cluster. This improves the availability of your API, allowing more queries to execute concurrently without hitting connection limits.

However, there is a tradeoff of increased latency - queries may take slightly longer as they wait for an available connection from the Proxy pool. There are also additional costs associated with using RDS Proxy. Please refer to the pricing page for RDS Proxy to learn more.

Connecting to a database with a custom SSL certificate

Amplify creates an AWS Lambda function using a Node.js runtime to connect your AppSync API to your SQL database. The Lambda function connects to the database using Secure Socket Layer (SSL) or Transport Layer Security (TLS) to protect data in transit. Amplify automatically uses the correct root certificate authority (CA) certificates for Amazon RDS databases, and the Node.js runtime includes root CAs from well-known certificate providers to connect to non-RDS databases.

However, if your database uses a custom or self-signed SSL certificate, you can upload the PEM-encoded public CA certificate of 4 KB or less to your Amplify project as a secret when you generate the database configuration, and specify that secret when generating the schema from your database:

Terminal

npx ampx sandbox secret set CUSTOM_SSL_CERT < /path/to/custom/ssl/public-ca-cert.pem
npx ampx generate schema-from-database --connection-uri-secret SQL_CONNECTION_STRING --ssl-cert-secret CUSTOM_SSL_CERT --out amplify/data/schema.sql.ts

The Lambda function will then use the specified root CA to validate connections to the database.

When deploying your app to production, you need to add the PEM-encoded public CA certificate as a secret. Make sure to add the certificate with the same secret name you used in the sandbox environment. For example, we used CUSTOM_SSL_CERT above. Make sure to preserve the newlines and the ------BEGIN CERTIFICATE------ and ------END CERTIFICATE------ delimiters in the value. Finally, make sure the size of the entire value does not exceed 4KB.

This creates a new schema.sql.ts with a schema reflecting the types of your database. Do not edit the schema.sql.ts file directly. Import the schema to your amplify/data/resource.ts file and apply any additive changes there. This ensures that you can continuously regenerate the TypeScript schema representation of your SQL database without losing any additive changes that you apply out-of-band.

import { type ClientSchema, a, defineData } from '@aws-amplify/backend';
import { schema as generatedSqlSchema } from './schema.sql';
// Add a global authorization rule
const sqlSchema = generatedSqlSchema.authorization(allow => allow.guest())
// Relational database sources can coexist with DynamoDB tables managed by Amplify.
const schema = a.schema({
  Todo: a.model({
    content: a.string(),
  }).authorization(allow => [allow.guest()])
// Use the a.combine() operator to stitch together the models backed by DynamoDB
// and the models backed by Postgres or MySQL databases.
const combinedSchema = a.combine([schema, sqlSchema]);
// Don't forget to update your client types to take into account the types from
// both schemas.
export type Schema = ClientSchema<typeof combinedSchema>;
export const data = defineData({
  // Update the data definition to use the combined schema, instead of just
  // your DynamoDB-backed schema
  schema: combinedSchema

Step 3 - Fine-grained authorization rules

Use the .setAuthorization() modifier to set model-level and field-level authorization rules for the SQL-backed data models. Review Customize your auth rules for detailed authorization rule options.

// Add an authorization rule to the schema
const sqlSchema = generatedSqlSchema.setAuthorization((models) => [
  // Model-level authorization rules
  models.event.authorization((allow) => [allow.publicApiKey()]),
  // Field-level authorization rules
  models.event.fields.id.authorization(allow => [allow.publicApiKey(), allow.guest()]),
  models.event.fields.created_at.authorization(allow => [allow.publicApiKey(), allow.guest()])

Step 4 - Deploy your Data resources using the cloud sandbox

Finally, you can now validate your Data resources with your cloud sandbox:

npx ampx sandbox

On the client side, you can now make create, read, update, delete, and subscribe to your SQL-backed data models:

const { data: events } = await client.models.event.list()

Step 5 - Configuring database connection for production

When deploying your app to production, you need to add the database connection string as a secret. Make sure to add the appropriate database connection string with the same secret name you used in the sandbox environment. For example, we used SQL_CONNECTION_STRING above.

Rename generated models and fields

To improve the ergonomics of your API, you might want to rename the generate fields or types to better accommodate your use case. Use the renameModels() modifier to rename the auto-inferred data models.

// Rename models or fields to be more idiomatic for frontend code
const sqlSchema = generatedSqlSchema.authorization(allow => allow.guest())
  .renameModels(() => [
    //⌄⌄⌄⌄⌄ existing model name based on table name
    ['event', 'Event']
    //        ^^^^^^ renamed data model name

Add relationships between tables

const sqlSchema = generatedSqlSchema
  .authorization(allow => allow.guest())
  .setRelationships((models) => [
    models.Note.relationships({
      comments: a.hasMany("Comment", "note_id"),
    models.Comment.relationships({
      note: a.belongsTo("Note", "note_id")

Add custom queries, mutations, subscriptions auto-generated SQL data schema

Use the .addToSchema(...) to add in additional queries, mutations, and subscriptions to your auto-generated SQL data schema.

Note: you can't add additional data models via a.model(). They should be exclusively generated via npx ampx generate schema-from-database.

Use an inline SQL statement as a query or mutation handler

// Add custom mutations or queries that execute SQL statements
const sqlSchema = generatedSqlSchema.authorization(allow => allow.guest())
  .addToSchema({
    listEventsWithDecodedLatLong: a.query()
      // reference custom types added to the schema
      .returns(a.ref("EventWithDecodedCoord").array())
      .handler(a.handler.inlineSql(
          `SELECT
            name,
            address,
            ST_X(geom) AS longitude,
            ST_Y(geom) AS latitude
          FROM locations;`
      .authorization(allow => [allow.guest()]),
      // Define custom types to provide end-to-end typed results
      // for custom queries / mutations
      EventWithDecodedCoord: a.customType({
        id: a.integer(),
        name: a.string(),
        address: a.string(),
        longitude: a.float(),
        latitude: a.float(),

Reference an existing SQL file as a query or mutation handler

You can define the different SQL handlers in separate .sql files and reference them in your custom queries / mutations.

First, configure the custom query or mutation in your amplify/data/resource.ts file:

const sqlSchema = generatedSqlSchema.authorization(allow => allow.guest())
  .addToSchema({
    createNewLocationWithLongLat: a.mutation()
      .arguments({
        lat: a.float().required(),
        long: a.float().required(),
        name: a.string().required(),
        address: a.string().required()
      .returns(a.json().array())
      .authorization(allow => allow.authenticated())
      .handler(a.handler.sqlReference('./createNewLocationWithLongLat.sql'))

Next, add a corresponding sql file to handle the request:

createNewLocationWithLongLat.sql
INSERT INTO locations (name, address, geom)
VALUES (:name, :address, ST_GEOMFROMTEXT(CONCAT('POINT (', :long, ' ', :lat, ')'), 4326));

createNewLocationWithLongLat.sql
INSERT INTO locations (name, address, geom)
VALUES (:name, :address, ST_SetSRID(ST_MakePoint(:long, :lat), 4326))

The return type for custom queries and mutations expecting to return row data from SQL statements must be an array of the corresponding model. This is true even for custom get queries, where a single row is expected.

Example

getPostBySlug: a
  .arguments({
    slug: a.string().required(),
  .returns(a.ref("Post").array())
  .handler(
    a.handler.inlineSql(`
    SELECT id, title, slug, content, created_at, updated_at
    FROM posts
    WHERE slug = :slug;

How does it work?

The Amplify uses AWS Lambda functions to enable features like querying data from your database. To work properly, these Lambda functions need access to common logic and dependencies.

Amplify provides this shared code in the form of Lambda Layers. You can think of Lambda Layers as a package of reusable runtime code that Lambda functions can reference.

When you deploy an Amplify API, it will create two Lambda functions:

SQL Lambda

This allows you to query and write data to your database from your API.

NOTE: If the database is in a VPC, this Lambda function will be deployed in the same VPC as the database. The usage of Amazon Virtual Private Cloud (VPC) or VPC peering, with AWS Lambda functions will incur additional charges as explained, this comes with an additional cost as explained on the Amazon Elastic Compute Cloud (EC2) on-demand pricing page.

Updater Lambda

This automatically keeps the SQL Lambda up-to-date by managing its Lambda Layers.

A Lambda layer that includes all the core SQL connection logic lives within the AWS Amplify service account but is executed within your AWS account, when invoked by the SQL Lambda. This allows the Amplify service team to own the ongoing maintenance and security enhancements of the SQL connection logic.

This allows the Amplify team to maintain and enhance the SQL Layer without needing direct access to your Lambdas. If updates to the Layer are needed, the Updater Lambda will receive a signal from Amplify and automatically update the SQL Lambda with the latest Layer.

Mapping of SQL data types to field types for auto-generated schema

Note: MySQL does not support time zone offsets in date time or timestamp fields. Instead, we will convert these values to datetime, without the offset. Unlike MySQL, PostgreSQL does support date time or timestamp values with an offset.

SQL	Mapped field types
String
char	a.string()
varchar	a.string()
tinytext	a.string()
text	a.string()
mediumtext	a.string()
longtext	a.string()
Geometry
geometry	a.string()
point	a.string()
linestring	a.string()
geometryCollection	a.string()
Numeric
smallint	a.integer()
mediumint	a.integer()
int	a.integer()
integer	a.integer()
bigint	a.integer()
tinyint	a.integer()
float	a.float()
double	a.float()
decimal	a.float()
dec	a.float()
numeric	a.float()
Date and Time
date	a.date()
datetime	a.datetime()
timestamp	a.datetime()
time	a.time()
year	a.integer()
Binary
binary	a.string()
varbinary	a.string()
tinyblob	a.string()
blob	a.string()
mediumblob	a.string()
longblob	a.string()
Others
bool	a.boolean()
boolean	a.boolean()
bit	a.integer()
json	a.json()
enum	a.enum()

Troubleshooting

Debug Mode

To return the actual SQL error instead of a generic error from underlying API responses, an environment variable DEBUG_MODE can be set to true on the Amplify-generated SQL Lambda function. You can find this Lambda function in the AWS Lambda console with the naming convention of: <stack-name>-<api-name>-SQLLambdaFunction<hash>.

My SQL table doesn't get generated when running `npx ampx generate schema-from-database`

This is likely because the table doesn't have a designated primary key. A primary key is required for npx ampx generate schema-from-database to infer the table structure and create a create, read, update, and delete API.

Connect to external Amazon DynamoDB data sources