AWS & Redshift I/O Connection Errors

An I/O error occurred while sending to the backend.

I/O errors arise from connection issues between Stitch and your data warehouse and are typically transient. If they persist, they may indicate a larger issue and should be addressed promptly so that Stitch can continue loading your data.

Note: While we’ve only seen this issue affect Amazon Redshift users, it is possible that PostgreSQL destinations may also be affected.

I/O Errors, Explained

When Stitch connects to your destination to perform a connection check or load data, an I/O error can arise if the connection to the destination is severed. This is typically caused by a timeout.

For example: Stitch attempts to load a large amount of data into your destination. Due to the data volume, Stitch’s query takes a long time to run. While the query runs, the connection appears idle and, as a result, the server closes it.

The timeout settings on the Redshift cluster and on the network components between Stitch and the Redshift cluster determine how long a connection can remain idle before it’s terminated. If Stitch is unable to complete its query before the timeout limits, the connection will be terminated and an I/O error may occur.
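As a rough illustration of the paragraph above, the sketch below finds which component on the path would close a silent connection first. The component names and timeout values are hypothetical, and the model assumes no traffic (including keepalives) crosses the connection while the query runs.

```python
def first_idle_limit(idle_timeouts_s):
    """Return (component, seconds) for the shortest idle timeout on the path."""
    name = min(idle_timeouts_s, key=idle_timeouts_s.get)
    return name, idle_timeouts_s[name]


def connection_survives(query_seconds, idle_timeouts_s):
    """True if a silent query finishes before any component's idle limit."""
    _, limit = first_idle_limit(idle_timeouts_s)
    return query_seconds < limit


# Hypothetical idle timeouts, in seconds, for each hop between the
# client and the cluster. Real values depend on your own setup.
path = {"cluster": 3600, "load_balancer": 350, "firewall": 900}

component, limit = first_idle_limit(path)
print(f"Tightest idle limit: {component} at {limit}s")
print("120s query survives:", connection_survives(120, path))
print("600s query survives:", connection_survives(600, path))
```

The shortest limit on the path is the one that matters: raising the cluster's timeout has no effect if a firewall in between still closes idle connections sooner.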

Potential Causes and Recommendations

To work through this section, you’ll need some technical expertise and familiarity with Amazon Web Services. If need be, we suggest looping in a developer or a member of your engineering team to help out.

Server Firewall Timeout Settings

One potential source of timeout issues is the destination server’s firewall timeout settings. If the connection originates from a computer other than an Amazon EC2 instance, these settings govern how long the connection may remain inactive before the firewall terminates it.

Stitch sends a TCP keepalive signal within 200 seconds of a connection going idle, and every 75 seconds thereafter. This is to ensure that Stitch’s connection isn’t prematurely terminated.
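For readers who want to see what such a keepalive schedule looks like from a client's side, here is a minimal sketch using Python's standard socket module. It is not Stitch's implementation. The 200 and 75 second values come from the text above; the function names are hypothetical, and the per-socket timing options (TCP_KEEPIDLE, TCP_KEEPINTVL) are only applied where the platform exposes them.

```python
import socket

KEEPALIVE_IDLE_S = 200     # idle seconds before the first probe (from the text)
KEEPALIVE_INTERVAL_S = 75  # seconds between later probes (from the text)


def enable_keepalive(sock, idle_s=KEEPALIVE_IDLE_S, interval_s=KEEPALIVE_INTERVAL_S):
    """Turn on TCP keepalive and return the options that were applied."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {"SO_KEEPALIVE": 1}
    # The timing options are platform specific, so apply only those
    # that this Python build exposes.
    for name, value in (("TCP_KEEPIDLE", idle_s), ("TCP_KEEPINTVL", interval_s)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
            applied[name] = value
    return applied


def keepalive_beats_timeout(idle_timeout_s, idle_s=KEEPALIVE_IDLE_S,
                            interval_s=KEEPALIVE_INTERVAL_S):
    """True if the longest gap between packets is shorter than the idle timeout."""
    return idle_timeout_s > max(idle_s, interval_s)


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print(enable_keepalive(sock))
sock.close()
```

Under this simplified model, a firewall that drops connections after less than 200 idle seconds could still close the connection before the first keepalive is sent.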

Server Command/Query Settings

In addition to the destination server’s firewall timeout settings, the statement_timeout and WLM (Workload Management) timeout settings are potential causes.

Statement timeout: The statement_timeout setting defines how long, in milliseconds, a statement may take to complete before it is aborted by the server. For example: If statement_timeout is set to 100 milliseconds, any query that takes longer than 100 milliseconds to complete will be canceled.

If the current value of this setting is very short (1 millisecond, for example), we suggest increasing it to ensure Stitch’s queries can complete successfully.

Note: This parameter applies to the entire cluster.

WLM timeout: The WLM timeout parameter (max_execution_time) is functionally similar to statement_timeout, but only applies to a single queue in a WLM configuration. For more info on query queues, refer to Amazon’s documentation.
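To reason about how the two settings interact, the sketch below models them in plain Python. It rests on two assumptions not stated in this article and worth confirming in Amazon’s documentation: a value of 0 means no limit, and when both settings are present the lower one takes effect. Both values are in milliseconds, and the function names are hypothetical.

```python
def effective_timeout_ms(statement_timeout_ms, wlm_max_execution_time_ms=0):
    """Return the limit that applies to a query, or 0 if there is none.

    Assumes 0 disables a setting and that the lower of the two
    non-zero limits wins.
    """
    limits = [t for t in (statement_timeout_ms, wlm_max_execution_time_ms) if t > 0]
    return min(limits) if limits else 0


def would_be_canceled(query_ms, statement_timeout_ms, wlm_max_execution_time_ms=0):
    """True if a query of the given duration would exceed the effective limit."""
    limit = effective_timeout_ms(statement_timeout_ms, wlm_max_execution_time_ms)
    return limit > 0 and query_ms > limit


# The example from the text: statement_timeout of 100 ms.
print(would_be_canceled(150, statement_timeout_ms=100))
print(would_be_canceled(50, statement_timeout_ms=100))
# A hypothetical queue limit that is stricter than the cluster setting.
print(effective_timeout_ms(60_000, wlm_max_execution_time_ms=30_000))
```

If this model holds, raising statement_timeout alone will not help when the queue that runs Stitch’s queries has a lower max_execution_time.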

Implement Performance and Workload Management

In addition to increasing the values of the timeout settings, you should also consider implementing performance improvement features like Encodings, SORT, and DIST keys. These features can improve query efficiency, thus reducing the time it takes for Stitch’s queries to complete.

Refer to our Encodings, SORT, and DIST Keys guide for more info and application instructions.

Next Steps

If increasing the destination server’s timeout settings or applying performance improvement features doesn’t resolve the I/O errors, we recommend reaching out to your provider’s support team:

For Redshift destinations, contact Amazon Support
For Panoply destinations, contact Panoply Support

Questions? Feedback?

Did this article help? If you have questions or feedback, feel free to submit a pull request with your suggestions, open an issue on GitHub, or reach out to us.