Google BigQuery is a fully managed, cloud-based big data analytics web service for processing very large read-only data sets. BigQuery was designed for analyzing data on the order of billions of rows, using a SQL-like syntax.
For more information, check out Google’s Google BigQuery overview.
This guide serves as a reference for version 1 of Stitch’s Google BigQuery destination.
Details and features
Stitch features
High-level details about Stitch’s implementation of Google BigQuery, such as supported connection methods, availability on Stitch plans, etc.
Release status |
Deprecated |
Stitch plan availability |
All Stitch plans |
Stitch supported regions |
Operating regions determine the location of the resources Stitch uses to process your data. Learn more. |
Supported versions |
Not applicable |
Connect API availability |
Unsupported
This version of the Google BigQuery destination is not currently available in Stitch’s Connect API. |
SSH connections |
Unsupported
Stitch does not support using SSH tunnels to connect to Google BigQuery destinations. |
SSL connections |
Supported
Stitch will attempt to use SSL to connect by default. No additional configuration is needed. |
VPN connections |
Unsupported
Virtual Private Network (VPN) connections may be implemented as part of a Premium plan. Contact Stitch Sales for more info. |
Static IP addresses |
Supported
This version of the Google BigQuery destination has static IP addresses that can be whitelisted. |
Default loading behavior |
Append-Only |
Nested structure support |
Supported
|
Destination details
Details about the destination, including object names, table and column limits, reserved keywords, etc.
Note: Exceeding the limits noted below will result in loading errors or rejected data.
Maximum record size |
20MB |
Table name length |
1,024 characters |
Column name length |
128 characters |
Maximum columns per table |
10,000 |
Maximum table size |
None |
Maximum tables per database |
None |
Case sensitivity |
Insensitive |
Reserved keywords |
Refer to the Reserved keywords documentation. |
Supported Google Cloud Storage regions
When you set up a Google BigQuery destination, you’ll select a Google Storage location. This determines the location of the internal Google Storage bucket Stitch uses during the replication process.
Stitch supports the following Google Cloud Storage regions for version 1 of the Google BigQuery destination:
Region description | Region name | |
Americas | United States |
US
|
Europe | European Union |
EU
|
London |
europe-west2
|
|
Asia Pacific | Singapore |
asia-southeast1
|
Sydney |
australia-southeast1
|
|
Tokyo |
asia-northeast1
|
Google BigQuery pricing
Unlike many other cloud-based data warehouse solutions, Google BigQuery’s pricing model is based on usage and not a fixed-rate. This means that your bill can vary over time.
Before fully committing yourself to using Google BigQuery as your data warehouse, we recommend familiarizing yourself with the Google BigQuery pricing model and how using Stitch may impact your costs.
Learn more about Stitch & Google BigQuery pricing
Replication
Loading behavior
Stitch will use Append-Only replication when loading data into this version of the Google BigQuery destination.
In Append-Only replication, existing rows aren’t updated. Multiple versions of a row can exist in a table, creating a log of how a row changed over time. Note: While this may look like a discrepancy, it is intended functionality for Google BigQuery v1 destinations.
Because of this loading strategy, querying may require a different strategy than usual. Using some of the system columns Stitch inserts into tables will enable you to locate the latest version of a record at query time.
Refer to the Understanding loading behavior guide for more info and examples.
Primary Keys
Google BigQuery destinations don’t have native support for Primary Keys. While Primary Key columns will be present in destination tables, no constraints will be applied to the columns.
Incompatible sources
No compatibility issues have been discovered between Google BigQuery and Stitch's integration offerings.
See all destination and integration incompatibilities.
Transformations
System tables and columns
Stitch will create the following tables in each integration’s dataset:
Additionally, Stitch will insert system columns (prepended with _sdc
) into each table.
Data typing
Stitch converts data types only where needed to ensure the data is accepted by Google BigQuery. In the table below are the data types Stitch supports for Google BigQuery destinations, and the Stitch types they map to.
- Stitch type: The Stitch data type the source type was mapped to. During the Extraction and Preparing phases, Stitch identifies the data type in the source and then maps it to a common Stitch data type.
- Destination type: The destination-compatible data type the Stitch type maps to. This is the data type Stitch will use to store data in Google BigQuery.
- Notes: Details about the data type and/or its allowed values in the destination, if available. If a range is available, values that exceed the noted range will be rejected by Google BigQuery.
Stitch type | Destination type | Notes |
ARRAY | ARRAY |
|
BIGINT | INTEGER |
|
BOOLEAN | BOOLEAN | |
DATE | TIMESTAMP |
|
DOUBLE | FLOAT | |
FLOAT | FLOAT | |
INTEGER | INTEGER |
|
NUMBER | NUMERIC |
|
RECORD | RECORD | |
STRING | STRING |
|
JSON structures
Google BigQuery supports nested records within tables, whether it’s a single record or repeated values. This means that when nested data structures are loaded into Google BigQuery, they will be maintained.
Column names
Column names in Google BigQuery:
- Must contain only letters (a-z, A-Z), numbers (0-9), or underscores (
_
) - Must begin with a letter or an underscore
-
Must be less than the maximum length of 128 characters. Columns that exceed this limit will be rejected by Google BigQuery.
- Must not be prefixed or suffixed with any of Google BigQuery’s or Stitch’s reserved keyword prefixes or suffixes
Stitch will perform the following transformations to ensure column names adhere to the rules imposed by Google BigQuery:
Transformation | Source column | Destination column |
Convert uppercase and mixed case to lowercase |
CUSTOMERID or cUsTomErId
|
customerid
|
Convert spaces to underscores |
customer id
|
customer_id
|
Convert special characters to underscores |
customer#id or !customerid
|
customer_id and _customerid
|
Convert leading numbers to underscores |
4customerid
|
_customerid
|
Timezones
Google BigQuery will store the value in UTC as TIMESTAMP
.
More info about timestamp data types can be found in BigQuery’s documentation.
Compare destinations
Not sure if Google BigQuery is the data warehouse for you? Check out the Choosing a Stitch Destination guide to compare each of Stitch’s destination offerings.
Related | Troubleshooting |
Questions? Feedback?
Did this article help? If you have questions or feedback, feel free to submit a pull request with your suggestions, open an issue on GitHub, or reach out to us.