Prerequisites

  • An up-and-running Amazon Redshift instance. Instructions for creating a Amazon Redshift destination are outside the scope of this tutorial; our instructions assume that you have an instance up and running. For help getting started with Amazon Redshift, refer to Amazon’s documentation.

  • Permissions in Amazon Web Services (AWS) that allow you to:

    • Create/manage Security Groups, which is required to whitelist Stitch’s IP addresses.
    • View database details, which is required for retrieving the database’s connection details.
  • Superuser privileges, or privileges that allow you to create users and grant privileges. This is required to create a database user for Stitch.


Step 1: Verify your Stitch account's data pipeline region

First, you’ll log into Stitch and verify the data pipeline region your account is using. Later in this guide, you’ll need to grant Stitch access by whitelisting our IP addresses.

The IP addresses you’ll whitelist depend on the Data pipeline region your account is in.

  1. Sign into your Stitch account, if you haven’t already.
  2. Click User menu (your icon) > Edit User Settings and locate the Data pipeline region section to verify your account’s region.
  3. Locate the list of IP addresses for your region:

Keep this list handy - you’ll need it later.


Step 2: Configure database connection settings

In this step, you’ll configure the database server to allow traffic from Stitch to access it. There are two ways to connect your database:

  • A direct connection will work if your database is publicly accessible.
  • An SSH tunnel is required if your database isn’t publicly accessible. This method uses a publicly accessible instance, or an SSH server, to act as an intermediary between Stitch and your database. The SSH server will forward traffic from Stitch through an encrypted tunnel to the private database.

Click the option you’re using below and follow the instructions.

For Stitch to successfully connect with your database instance, you’ll need to add our IP addresses to the appropriate Security Group via the AWS management console.

Security Groups must reside in the same VPC as the instance. Use the instructions below to create a security group for Stitch and grant access to the VPC.

  1. Log into your AWS account.
  2. Navigate to the Security Group Management page, typically Services > Compute > EC2.
  3. Click the Security Groups option, under Network & Security in the menu on the left side of the page.
  4. Click Create Security Group.
  5. In the window that displays, fill in the fields as follows:
    • Security group name: Enter a unique name for the Security Group. For example: Stitch
    • Description: Enter a description for the security group.
    • VPC: Select the VPC that contains the database you want to connect to Stitch. Note: The Security Group and database must be in the same VPC, or the connection will fail.
  6. In the Inbound tab, click Add Rule.
  7. Fill in the fields as follows:
    • Type: Select Custom TCP Rule
    • Port Range: Enter the port your database uses. (5439 by default)
    • CIDR, IP or Security Group: Paste one of the Stitch IP addresses for your Stitch data pipeline region that you retrieved in Step 1.
  8. Click Add Rule to add an additional Inbound rule.
  9. Repeat steps 6-8 until all the IP addresses for your Stitch data pipeline region have been added.

    This is what a Security Group using Stitch’s North America IP addresses looks like:

    Whitelisting Stitch North America IP addresses through Inbound Security Group rules

  10. When finished, click Create to create the Security Group.
  1. Follow the steps in the Setting up an SSH Tunnel for a database in Amazon Web Services guide to set up an SSH tunnel for Amazon Redshift.
  2. Complete the steps in this guide after the SSH setup is complete.

Step 3: Create a Stitch Amazon Redshift database user

In the following tabs are the instructions for creating a Stitch Amazon Redshift database user and explanations for the permissions Stitch requires.

  1. If you haven’t already, connect to your Amazon Redshift instance using your SQL client.

  2. After connecting, run this command to create a user named stitch. Replace <password> with a secure password:

    CREATE USER stitch WITH PASSWORD '<password>';
  3. Next, you’ll assign the CREATE permissions to the Stitch user. For <database_name>, enter the name of the database where all Stitch-replicated data should be loaded.

    Note: This must be a pre-existing database.

    GRANT CREATE ON DATABASE <database_name> TO stitch
  4. If you restricted access to the system tables, you’ll also need to run the following commands to grant the Stitch user SELECT permissions.

    GRANT SELECT ON ALL TABLES IN SCHEMA information_schema TO stitch
    
    GRANT SELECT ON ALL TABLES IN SCHEMA pg_catalog TO stitch

In the table below are the database user privileges Stitch requires to connect to and load data into Amazon Redshift.

Privilege name Reason for requirement
CREATE ON DATABASE

Required to create the necessary database objects to load and store your data.

CREATE permissions on the database are required to successfully load data. When Stitch loads data, it will run a CREATE SCHEMA IF NOT EXISTS command, which will create a schema if it doesn’t already exist. To run this command, the Stitch user must have the CREATE ON DATABASE permission.

Note: The CREATE ON SCHEMA permission is not a sufficient alternative for CREATE ON DATABASE. As outlined in Amazon Redshift’s documentation, this permission only allows a user to create objects within a schema, but not the schema itself.

Ownership of all schemas and tables that Stitch will load to

Stitch occasionally needs to run COMMENT and ALTER TABLE commands to properly insert data.

Note: COMMENT and ALTER TABLE commands are used to set table comments. Stitch uses table comments to store Primary Key information. Altering or removing table comments will lead to replication issues.

If the database user you use to connect to Amazon Redshift creates the schemas and tables, you will not need to grant explicit ownership permissions. If another user creates the schemas and tables prior to Stitch loading data into them, you will need to alter ownership to the Stitch database user. Failure to do so will result in replication issues.

SELECT ON ALL TABLES IN information_schema

Required to select rows from tables in the information_schema schema. Prior to loading data, Stitch will use the data in this schema to verify the existence and structure of integration schemas and tables.

Note: Stitch will only ever read data from systems tables.

SELECT ON ALL TABLES IN pg_catalog

Required to select rows from tables in the pg_catalog schema. Prior to loading data, Stitch will use the data in this schema to verify the existence and structure of integration schemas and tables.

Note: Stitch will only ever read data from systems tables.


Step 4: Connect Stitch

To complete the setup, you need to enter your Amazon Redshift connection details into the Destination Settings page in Stitch.

Step 4.1: Locate the Amazon Redshift connection details

  1. Sign into the AWS Console, if needed.
  2. Navigate to the Redshift option.
  3. On the Redshift Dashboard page, click the Clusters option on the left side of the page. This will open the Redshift Clusters page.

  4. In the list of clusters, locate and click on the instance you want to connect to Stitch. This will open the Cluster Details page.

  5. Locate the following fields:
    • Endpoint
    • Database Name: This field contains the name of the database used to launch the instance. You’ll only need this info if you want to connect this specific database to Stitch.

      You can connect this database to Stitch, or another database within Amazon Redshift.

    • Port: This is the port used by the database.

Leave this page open for now - you’ll need it to complete the setup.

Step 4.2: Enter connection details into Stitch

  1. If you aren’t signed into your Stitch account, sign in now.
  2. Click the Destination tab.

  3. Locate and click the Amazon Redshift icon.
  4. Fill in the fields as follows:

    • Display Name: Enter a display name for your destination, to distinguish various connections of the same type.

    • Description (optional): Enter a description for your destination.

    • Host (Endpoint): Paste the Endpoint address from the Amazon Redshift Details page in AWS into this field. Don’t include the port number, if it’s appended to the end of the endpoint string - this will cause errors.

    • Port: Enter the port used by the Amazon Redshift instance. The default is 5439.

    • Username: Enter the Stitch Amazon Redshift database user’s username.

    • Password: Enter the password for the Stitch Amazon Redshift database user.

    • **: Enter the name of the Amazon Redshift database you want to connect to Stitch.

Step 4.3: Define SSH connection details

If you’re using an SSH tunnel to connect your Amazon Redshift database to Stitch, you’ll also need to define the SSH settings. Refer to the Setting up an SSH Tunnel for a database in Amazon Web Services guide for assistance with completing these fields.

  1. Click the Encryption Type menu.
  2. Select SSH to display the SSH fields.

  3. Fill in the fields as follows:

    • Remote Address: Paste the Public DNS of the SSH sever (EC2 instance) into this field. Refer to the Amazon SSH guide for instructions on retrieving this info.

    • SSH Port: Enter the SSH port of the SSH server (EC2 instance) into this field. This will usually be 22.

    • SSH User: Enter the Stitch Linux (SSH) user’s username.

Step 4.4: Save the destination

When finished, click Check and Save.

Stitch will perform a connection test to the Amazon Redshift database; if successful, a Success! message will display at the top of the screen. Note: This test may take a few minutes to complete.


Questions? Feedback?

Did this article help? If you have questions or feedback, feel free to submit a pull request with your suggestions, open an issue on GitHub, or reach out to us.