Extracting data | Stitch Documentation

Documentation and guides for configuring and managing data replication for your Stitch integrations.

Select data - Select the tables and columns you want Stitch to replicate from your integration.
Replication Scheduling - Set the replication schedule for an integration, which defines when and how often Stitch should run replication jobs.
Replication Methods - Replication Methods define the approach Stitch takes when extracting data from a source during a replication job.
Replication Keys - Replication Keys are source table columns that Stitch uses to identify new and updated data when using an incremental Replication Method.
Replication progress - Monitor the status of an integration’s replication job, including extraction and loading progress.

Select data

Select the tables and columns you want Stitch to replicate from your integration.

Setting Tables and Columns to Replicate

After you connect an integration and Stitch performs a structure sync, the next thing you’ll do is select tables and columns to replicate.

Replicating Database Views

Replicating a database view is almost the same as replicating a database table. In this guide, we’ll cover the database integrations that support views and the additional steps required to replicate a database view.

Syncing New and Additional Columns on Already-Syncing Tables

What happens when you add a brand-new column in a data source or you want to sync additional columns on an already-syncing table? How will your row count be impacted? In this guide, we cover how Stitch handles new columns, what you can expect for existing rows, and how to backfill data.

Understanding Data Typing

Data typing in Stitch.

Replication Scheduling

Set the replication schedule for an integration, which defines when and how often Stitch should run replication jobs.

Replication Scheduling

Create a replication schedule for your integrations using Stitch’s Replication Frequency and Anchor Time features.

Replication Frequency

Replication Frequency is a type of replication scheduling that runs replication jobs based on a time interval you specify.

Anchor Scheduling

Anchor Scheduling is a type of replication scheduling that ‘anchors’ the start time of extraction jobs to a time you select. This allows you to establish predictable replication and ensure that your downstream processes run as scheduled with the most up-to-date data.
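The idea of anchoring can be sketched in a few lines of Python. This is an illustration only, assuming that start times fall on a fixed grid of `anchor + k * frequency`. The function name `next_runs` and its parameters are hypothetical, and the sketch does not describe how Stitch computes its schedule internally.

```python
from datetime import datetime, timedelta


def next_runs(anchor: datetime, frequency: timedelta, now: datetime, count: int = 3) -> list:
    """Return the next `count` start times on the grid anchor + k * frequency
    that are at or after `now`. Hypothetical helper, not a Stitch API."""
    if frequency <= timedelta(0):
        raise ValueError("frequency must be positive")
    if now <= anchor:
        k = 0
    else:
        elapsed = now - anchor
        # Ceiling division: timedelta // timedelta yields an int.
        k = -(-elapsed // frequency)
    return [anchor + (k + i) * frequency for i in range(count)]


# Example: anchor at 06:00 with a 6-hour frequency, evaluated at 13:30.
runs = next_runs(
    anchor=datetime(2024, 1, 1, 6, 0),
    frequency=timedelta(hours=6),
    now=datetime(2024, 1, 1, 13, 30),
)
for run in runs:
    print(run.isoformat())
```

Because every start time is derived from the anchor rather than from when the previous job finished, the schedule does not drift, which is what makes downstream timing predictable.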

Advanced Replication Scheduling Using Cron Expressions

The Advanced Scheduler feature allows you to specify granular start times for data extraction. Using cron expressions, you can specify the exact times, days of the week, or even days of the month data extraction should begin.
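To show how a cron expression pins down start times, here is a small Python sketch of a matcher. It assumes the common five-field Unix layout (minute, hour, day of month, month, day of week) and supports only `*`, `*/n`, and comma-separated numbers. Stitch's Advanced Scheduler may expect a different cron syntax, so treat the expressions below as illustrative and check the Advanced Scheduler guide for the exact format. The function names are hypothetical.

```python
from datetime import datetime


def field_matches(spec: str, value: int) -> bool:
    """Match one cron field: '*', '*/n', or comma-separated integers."""
    if spec == "*":
        return True
    if spec.startswith("*/"):
        # Simplified step handling: matches values divisible by n.
        return value % int(spec[2:]) == 0
    return value in {int(part) for part in spec.split(",")}


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if `when` satisfies a simplified five-field cron expression.
    Unlike standard cron, all five fields are combined with AND."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected five fields: minute hour day month weekday")
    minute, hour, day, month, weekday = fields
    # Cron convention: 0 = Sunday. Python's weekday(): 0 = Monday.
    cron_weekday = (when.weekday() + 1) % 7
    return all([
        field_matches(minute, when.minute),
        field_matches(hour, when.hour),
        field_matches(day, when.day),
        field_matches(month, when.month),
        field_matches(weekday, cron_weekday),
    ])


# "0 6 * * 1": 06:00 on Mondays. 2024-01-01 was a Monday.
print(cron_matches("0 6 * * 1", datetime(2024, 1, 1, 6, 0)))
```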

Replication Scheduling for Tables

A workaround for replicating sets of tables on different schedules.

Replication Methods

Replication Methods define the approach Stitch takes when extracting data from a source during a replication job.

Replication Methods

Replication Methods define the approach Stitch takes when extracting data from a source during a replication job. Replication Methods can also affect how data is loaded into your destination and your overall row usage. This guide contains an overview of each method, how it compares to Stitch’s other methods, and links to detailed documentation about the method.

Full Table Replication

Full Table Replication is a replication method in which all rows in a table - including new, updated, and existing - are replicated during every replication job. This guide contains an overview of how Full Table Replication works, when it should be used, its limitations, and how to enable it for an integration.

Key-based Incremental Replication

Key-based Incremental Replication is a replication method in which Stitch identifies new and updated data using a column called a Replication Key. This guide contains an overview of how Key-based Incremental Replication works, when it should be used, its limitations, and how to enable it for an integration.
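The role of a Replication Key can be sketched as a bookmark that is saved after each job and used to filter the next one. The Python below is a minimal illustration over an in-memory list of rows. The function name `extract_incremental`, the `updated_at` column, and the inclusive `>=` comparison are assumptions made for the example, not a description of Stitch's implementation.

```python
def extract_incremental(rows, replication_key, bookmark):
    """Return (selected_rows, new_bookmark).

    Selects rows whose replication key is at or after the saved bookmark.
    The inclusive comparison is an assumption of this sketch; it means the
    row that set the bookmark is selected again on the next run.
    """
    selected = [
        row for row in rows
        if bookmark is None or row[replication_key] >= bookmark
    ]
    selected.sort(key=lambda row: row[replication_key])
    new_bookmark = selected[-1][replication_key] if selected else bookmark
    return selected, new_bookmark


# Hypothetical source table with an integer `updated_at` column.
source = [
    {"id": 1, "updated_at": 10},
    {"id": 2, "updated_at": 20},
    {"id": 3, "updated_at": 30},
]

# First job: no bookmark yet, so every row is extracted.
first, bookmark = extract_incremental(source, "updated_at", None)

# A new row arrives, then the second job extracts only from the bookmark on.
source.append({"id": 4, "updated_at": 40})
second, bookmark = extract_incremental(source, "updated_at", bookmark)
print([row["id"] for row in second])
```

A row deleted from `source` would simply stop appearing in later results, which is why a key-based approach on its own cannot signal deletions.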

Log-based Incremental Replication

Available for select database integrations, Log-based Incremental Replication is a replication method in which Stitch identifies modifications to records - including inserts, updates, and deletes - using a database’s binary log files. This guide contains an overview of how Log-based Incremental Replication works, when it should be used, its limitations, and how to enable it for a supporting database integration.
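As a rough illustration of the log-based approach, the Python sketch below replays an ordered list of change events against an in-memory copy of a table. The event shape (`op`, `id`, `row`) and the function name `apply_changes` are hypothetical, and real database logs are binary and database-specific. How Stitch represents a delete in the destination is not modeled here; this sketch simply removes the record.

```python
def apply_changes(replica: dict, events: list) -> dict:
    """Apply ordered insert/update/delete events to `replica`, keyed by id."""
    for event in events:
        op, key = event["op"], event["id"]
        if op in ("insert", "update"):
            replica[key] = event["row"]
        elif op == "delete":
            # Assumption of this sketch: a delete removes the record outright.
            replica.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return replica


# Hypothetical change log read since the last replication job.
log = [
    {"op": "insert", "id": 1, "row": {"name": "alpha"}},
    {"op": "insert", "id": 2, "row": {"name": "beta"}},
    {"op": "update", "id": 1, "row": {"name": "alpha-2"}},
    {"op": "delete", "id": 2},
]

replica = apply_changes({}, log)
print(replica)
```

Because the log records deletes as events of their own, this approach can observe them without comparing whole tables.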

Deleted Record Handling

Stitch’s detection of deleted records depends on how records are deleted in the source and the Replication Method being used. In this guide, we explain the different deletion methods and how each one works with each of Stitch’s supported Replication Methods.

Replication Keys

Replication Keys are source table columns that Stitch uses to identify new and updated data when using an incremental Replication Method.

Replication Keys for Database Integrations

Replication Keys are columns that Stitch uses to identify new and updated data for replication. These columns are one of the most important components of Stitch, as they enable Stitch to correctly capture new and updated data. In this guide, we’ll walk you through what a Replication Key is, what its requirements are, and how to choose the best column for the job.

Replication Keys for MongoDB Integrations

Replication Keys for MongoDB integrations have their own set of quirks and gotchas, owing to how MongoDB itself is designed. In this guide, we’ll explain what to watch out for and how to choose the best field for the job.

Syncing Historical SaaS Data & Resetting SaaS Replication Keys

By default, a historical replication job for most SaaS integrations will go back one year. While the Start Date setting allows you to define historical data loads, it can also reset an integration’s Replication Keys when you need to re-replicate data.

Replication progress

Monitor the status of an integration’s replication job, including extraction and loading progress.

Monitoring Replication Progress

It can be difficult to be patient when all you want is your data. On the Integration Details page for each integration, you can check that integration’s Replication Stats. This section gives you a better idea of where your data is in the replication process.

Extraction Logs

Extraction logs provide detail about the extraction portion of the replication process for a given integration.

Loading Reports

Loading reports provide detail about the loading portion of the replication process for a given integration.