Depending on the Replication Method being used and how records are deleted in the source, deletes may not be captured during the replication process.
Deletion methods
There are two methods that can be used to delete a source record:
- Soft deletes, which will leave a record in the source and use a flag to indicate deletion, such as
is_deleted
ordeleted_on
. If the delete event updates the record’s Replication Key value, Stitch will detect and replicate the changes. - Hard deletes, which completely remove records from the source. It’s as if the record never existed. If using Key-based Incremental Replication, this will remove the record’s Replication Key value, which Stitch uses to identify new and updated records. Without a Replication Key value to check, Stitch can’t identify the change and update the record in the destination.
Delete support overview
In the table below are each of Stitch’s Replication Methods and the level at which each deletion method is supported.
Click the Replication Method name to check out examples of how each deletion method works with that specific Replication Method.
Replication Method | Soft deletes | Hard deletes |
Full Table |
Supported Record will have an updated deleted flag value, if a flag is available in the source. |
Sometimes supported Record may remain or be removed. Refer to the Full Table Replication section for more info and examples. |
Key-based Incremental |
Supported Record will have an updated deleted flag value, if a flag is available in the source. |
Not supported Record will have an updated deleted flag value, if a flag is available in the source. |
Log-based Incremental |
Supported Record will have an updated deleted flag value, if a flag is available in the source. |
Sometimes supported Record will remain in the destination. The Note: Only deletes completed as part of a supported event type will be captured. |
Full Table Replication
Click the tabs below to view examples for how each deletion method is handled using Full Table Replication.
In this example, we’ll demonstrate what happens when two records are soft-deleted in the source. This table uses the is_deleted
column as a deletion flag, which will be true
if the record is deleted.
During the initial job:
-
In the source: A table contains three records:
id: 1
,id: 2
, andid: 3
. -
During the job: Stitch replicates the entire table.
-
In the destination: Stitch loads the entire table in full.
Source
|
Destination
|
During the second job:
-
In the source: Records
1
and3
are soft deleted (is_deleted: true
). -
During the job: Stitch replicates the entire table.
-
In the destination: Stitch overwrites the table in the destination, resulting in the
is_deleted
values for records1
and3
being updated.
Source
|
Destination
|
In this example, we’ll demonstrates what happens when a record is hard-deleted - or removed entirely - from the source.
Important: Append-Only destinations
If using a destination that’s configured to use Append-Only loading, this example is relevant only to integrations that support configurable Replication Methods.
For other integrations, see the Hard deletes with Append-Only loading tab for an example of how hard deletes are handled.
During the initial job:
-
In the source: A table contains three records:
id: 1
,id: 2
, andid: 3
. -
During the job: Stitch replicates the entire table.
-
In the destination: Stitch loads the table in full.
Source
|
Destination
|
During the second job:
-
In the source: Before the second job begins, record
2
is hard deleted and removed entirely from the source. -
During the job: Stitch replicates the table in full.
-
In the destination: Stitch deletes and recreates the table in a single transaction, loading the replicated table in full.
Source
|
Destination
|
In this example, we’ll demonstrates what happens when a record is hard-deleted from the source and loaded into a destination using Append-Only loading.
How hard deletes are handled in an Append-Only destination depends on the integration:
- If you can define Replication Methods for an integration, Stitch will delete and re-create tables in a single transaction. This is the scenario described in the Hard deletes tab for this section.
- Otherwise, Stitch will append the table in full to the destination table. This is the scenario we’ll outline here.
During the initial job:
-
In the source: A table contains three records:
id: 1
,id: 2
, andid: 3
. -
During the job: Stitch replicates the entire table.
-
In the destination: Stitch loads the table in full.
Source
|
Destination
|
During the second job:
-
In the source: Before the second job begins, record
2
is hard deleted and removed entirely from the source. -
During the job: Stitch replicates the table in full.
-
In the destination: Stitch appends the replicated data to the end of the table in the destination. In this case, records
1
and3
would be appended.
Source
|
Destination
|
Key-based Incremental Replication
Click the tabs below to view examples for how each deletion method is handled using Key-based Incremental Replication.
In this example, we’ll demonstrate what happens when two records are soft-deleted in the source. This table uses the is_deleted
column as a deletion flag, which will be true
if the record is deleted.
During the initial job:
-
In the source: A table contains three records:
id: 1
,id: 2
, andid: 3
. -
During the job: Depending on the type of integration, the initial extraction will vary slightly:
- Database integration: Stitch replicates the table in full.
- SaaS integration: Stitch uses the Start Date defined in the Integration Settings as the initial Replication Key (
updated_at
) value. Records with anupdated_at
value greater than or equal to the Start Date are replicated.
For all integrations, Stitch saves
updated_at: 2018-07-27 12:07:00
as the maximum Replication Key value. -
In the destination: Stitch loads the table.
Source
|
Destination
|
During the second job:
-
In the source: Records
1
and3
are soft deleted (is_deleted: true
). The Replication Key value (updated_at
) is updated. -
During the job: Stitch replicates all records with an
updated_at
value>= 2018-07-27 12:07:00
. -
In the destination: The
is_deleted
andupdated_at
values are updated for records1
and3
.
Source
|
Destination
|
In this example, we’ll demonstrates what happens when a record is hard-deleted - or removed entirely - from the source.
Note: To remove hard deletes from destination tables replicated in this way, you’ll need to:
- Temporarily drop the table in the destination
- Reset the table’s Replication Keys
- Wait for Stitch to re-create the table during the next replication job
These steps must be done in this order, or you may cause issues with replication.
Hard delete: Initial jobDuring the initial job:
-
In the source: A table contains three records:
id: 1
,id: 2
, andid: 3
. -
During the job: Depending on the type of integration, the initial extraction will vary slightly:
- Database integration: Stitch replicates the table in full.
- SaaS integration: Stitch uses the Start Date defined in the Integration Settings as the initial Replication Key value. Records with a Replication Key value greater than or equal to the Start Date are replicated.
For all integrations, Stitch saves
updated_at: 2018-07-27 12:07:00
as the maximum Replication Key value. -
In the destination: Stitch loads the table.
Source
|
Destination
|
During the second job:
-
In the source: Before the second job begins, record
2
is hard deleted and removed entirely from the source. Records1
and3
are updated. -
During the job: Stitch replicates all records with an
updated_at
value>= 2018-07-27 12:07:00
. -
In the destination: The
updated_at
values for records1
and3
are updated.Record
2
remains in the destination table.
Source
|
Destination
|
Log-based Incremental Replication
Click the tabs below to view examples for how each deletion method is handled using Log-based Incremental Replication.
In this example, we’ll demonstrate what happens when two records are soft-deleted in a source table using Log-based Incremental.
Soft delete: Initial jobDuring the initial job:
-
In the source: A table contains three records:
id: 1
,id: 2
, andid: 3
. -
During the job: Stitch uses
SELECT
replication to replicate the table in full. -
In the destination: Stitch loads the table in full.
Source
|
Destination
|
During the second job:
-
In the source: Records
1
and3
are soft deleted (is_deleted: true
) using anUPDATE
event.As
UPDATE
is a supported event type for Log-based Replication, log messages describing the modifiedis_deleted
values for records1
and3
are written to the database’s binary log. -
During the job: After the historical replication job, Stitch will read updates for the table from the database’s binary logs.
Stitch reads the log messages for records
1
and3
and replicates the updated records. -
In the destination: The
is_deleted
andupdated_at
values are updated for records records1
and3
.
Source
|
Destination
|
In this example, we’ll demonstrate what happens when a record is hard deleted - or removed entirely - from a source table using Log-based Incremental. In this situation, Stitch will use a TIMESTAMP
column named _sdc_deleted_at
to indicate when records have been identified as deleted in the source.
Note: Identifying hard deleted records is dependent on how the record is deleted, as not all event types are recorded by databases in binary logs. For example: If a record is deleted using TRUNCATE
in a MySQL database, Stitch will not detect the deletion.
The following event types will be detected by Log-based Incremental for database integrations that support this Replication Method:
INSERT
UPDATE
DELETE
During the initial job:
-
In the source: A table contains three records:
id: 1
,id: 2
, andid: 3
. -
During the job: Stitch uses
SELECT
replication to replicate the table in full. -
In the destination: Stitch loads the table in full.
Source
|
Destination
|
During the second job:
-
In the source: Record
2
is hard deleted using aDELETE
event.As
DELETE
is a supported event type for Log-based Replication, log messages describing the delete of record2
are written to the database’s binary log. -
During the job: After the historical replication job, Stitch will read updates for the table from the database’s binary logs.
Stitch reads the log message for record
2
and identifies the record as deleted. -
In the destination: Record
2
is overwritten with the exception of its Primary Key value. Stitch will place a timestamp in the_sdc_deleted_at
column to indicate when the deletion took place.
Source
|
Destination
|
Related | Troubleshooting |
Questions? Feedback?
Did this article help? If you have questions or feedback, feel free to submit a pull request with your suggestions, open an issue on GitHub, or reach out to us.