From time to time, Stitch may run into problems when attempting to load data into your destination. For example: A table contains more columns than the destination’s supported limit.
On your end, this will usually look like you’re missing data. When Stitch is unable to load data, however, the occurrence will be logged in a table named _sdc_rejected
. Every integration schema created by Stitch will include this table as well as the other tables in the integration.
This rejection log can be useful for investigating data discrepancies and troubleshooting errors surfaced during the data loading process.
In this guide, we’ll cover:
Reasons for data rejection
Reasons for rejection will depend on the type of destination you’re using, as each has its own data requirements and restrictions.
There are some common causes for rejection, however:
- Column names contain data type suffixes (ex:
__bigint
), which are reserved by Stitch - Table and/or column names contain
_sdc
or_rjm
prefixes, which are reserved by Stitch - Table or column names exceed the supported character limit
- Integer data that falls outside the range supported by the data warehouse
For a detailed rollup of how each destination handles data - including what situations will result in rejected records - refer to the Data Loading guide for the destination you’re using.
_sdc_rejected table schema
In every integration schema created by Stitch is a table named _sdc_rejected
which acts as a log for a particular integration’s rejected records.
This table contains information about when and why a data rejection occurred. The _sdc_rejected
table contains the following columns:
Column name | Data type | Description |
_sdc_rejected_at | TIMESTAMP |
Timestamp indicating when the loading error occurred. Example data:2016-11-29 10:50:55.816
|
table_name | STRING |
The name of the table the record was destined for. Example data:customers
|
reason | STRING |
The reason the record was rejected. Example data:Field collision on id
|
record | STRING |
As much of the raw Transit record data as Stitch can fit. Example data:"{""id"":1,""ID"":""Paul""}"
|
Take a look at the sample data in the last column. If Stitch was attempting to load this record into a Redshift destination, it would be rejected. Why?
In this case, it’s because Redshift is case-insensitive. Because id
and Id
canonicalize to the same name - that is, they differ only by case - a collision error surfaced when Stitch attempted to load the data.
Resolving data rejection issues
In some cases, you may be able to pinpoint and resolve the root cause of the rejection.
Consider the id
and Id
example from the previous section. If these fields came from a database integration, you could re-name the columns - for example: customer_id
and first_order_id
- in the source database and re-replicate the data. This would resolve the field collision error and allow Stitch to load the data.
However, it may not be possible to resolve every rejected record issue. While you may be able to resolve the issue in a database integration, the majority of SaaS integrations don’t provide users with the ability to define and control the structure of their data.
Related | Troubleshooting |
Questions? Feedback?
Did this article help? If you have questions or feedback, feel free to submit a pull request with your suggestions, open an issue on GitHub, or reach out to us.