This integration is powered by Singer's GitLab tap. For support, visit the GitHub repo or join the Singer Slack.
GitLab integration summary
Stitch’s GitLab integration replicates data using the GitLab REST API. Refer to the Schema section for a list of objects available for replication.
GitLab feature snapshot
A high-level look at Stitch's GitLab (v1) integration, including release status, useful links, and the features supported in Stitch.
STITCH | |
---|---|
Release status | Released on March 1, 2017 |
Supported by | |
Stitch plan | Standard |
API availability | Available |
Singer GitHub repository | |

REPLICATION SETTINGS | |
---|---|
Anchor Scheduling | Supported |
Advanced Scheduling | Supported |
Table-level reset | Unsupported |
Configurable Replication Methods | Unsupported |

DATA SELECTION | |
---|---|
Table selection | Unsupported |
Column selection | Unsupported |
Select all | Unsupported |

TRANSPARENCY | |
---|---|
Extraction Logs | Supported |
Loading Reports | Supported |
Connecting GitLab
GitLab setup requirements
To set up GitLab in Stitch, you need:
- Access to any projects you want to replicate data from. Stitch will only be able to access the same projects as the user who creates the integration.
Step 1: Create a GitLab token
- Sign into your GitLab account.
- Click the user menu (your icon) > Settings.
- Click the Access Tokens tab.
- In the Name field, enter Stitch. This will allow you to easily identify what application is using the token.
- In the Scopes section, check the api box. This will allow Stitch to access your API and replicate your GitLab data.
- Click Create Personal Access Token.
- The new Access Token will display at the top of the page. Copy the token before navigating away from the page - GitLab won’t display it again.
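Before pasting the token into Stitch, it can help to confirm that it authenticates. The following is a minimal Python sketch, assuming the gitlab.com API URL used in Step 2 and GitLab's `PRIVATE-TOKEN` request header. The function names are hypothetical, and this is not how Stitch itself validates the token.

```python
import json
import urllib.request

# Assumed API URL; self-managed GitLab instances use their own host.
API_URL = "https://gitlab.com/api/v4"


def build_user_request(token, api_url=API_URL):
    """Build a GET request for the profile of the user who owns the token."""
    return urllib.request.Request(
        f"{api_url}/user",
        headers={"PRIVATE-TOKEN": token},
    )


def fetch_username(token):
    """Return the username the token belongs to (performs a network call)."""
    with urllib.request.urlopen(build_user_request(token)) as resp:
        return json.load(resp)["username"]
```

Calling `fetch_username` with a valid token should return your GitLab username; an invalid token causes `urlopen` to raise an HTTP error.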
Step 2: Add GitLab as a Stitch data source
- Sign into your Stitch account.
- On the Stitch Dashboard page, click the Add Integration button.
- Click the GitLab icon.
- Enter a name for the integration. This is the name that will display on the Stitch Dashboard for the integration; it’ll also be used to create the schema in your destination.
  For example, the name “Stitch GitLab” would create a schema called stitch_gitlab in the destination. Note: Schema names cannot be changed after you save the integration.
- In the API URL field, enter https://gitlab.com/api/v4.
- In the Private Token field, paste the Personal Access Token you created in the previous section.
- In the Projects and Groups to Track fields, enter the projects and/or groups you want to track as a space-separated list.
  For example: stitchdata/group-a, or stitchdata/project-a stitchdata/project-b
  Note: A value for one of these fields must be provided. Additionally, the way you define these settings determines how some data is replicated:
  - If groups are provided but projects aren’t, all group projects will be replicated.
  - If groups and projects are provided, the selected projects of the listed groups will be replicated.
  - If projects are provided but groups aren’t, all listed projects will be replicated.
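The three rules above can be sketched as a small Python function. This is only an illustration of the documented behavior, not the tap's actual code; the function name and the mode labels are hypothetical.

```python
def replication_scope(groups_field, projects_field):
    """Describe what is replicated for the given space-separated field values."""
    groups = groups_field.split()
    projects = projects_field.split()

    if not groups and not projects:
        # A value for at least one of the two fields must be provided.
        raise ValueError("Provide at least one group or project")
    if groups and not projects:
        # All projects in the listed groups are replicated.
        return ("all_group_projects", groups, [])
    if groups and projects:
        # Only the selected projects of the listed groups are replicated.
        return ("selected_group_projects", groups, projects)
    # Projects only: every listed project is replicated.
    return ("listed_projects", [], projects)
```

For example, `replication_scope("", "stitchdata/project-a stitchdata/project-b")` returns the `listed_projects` mode with both project paths.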
Step 3: Define the historical replication start date
The Sync Historical Data setting defines the starting date for your GitLab integration. This means that data equal to or newer than this date will be replicated to your data warehouse.
Change this setting if you want to replicate data beyond GitLab’s default setting of 1 year. For a detailed look at historical replication jobs, check out the Syncing Historical SaaS Data guide.
Step 4: Create a replication schedule
In the Replication Frequency section, you’ll create the integration’s replication schedule. An integration’s replication schedule determines how often Stitch runs a replication job, and the time that job begins.
GitLab integrations support the following replication scheduling methods:
- Advanced Scheduling using Cron (Advanced or Premium plans only)
To keep your row usage low, consider setting the integration to replicate less frequently. See the Understanding and Reducing Your Row Usage guide for tips on reducing your usage.
Initial and historical replication jobs
After you finish setting up GitLab, its Sync Status may show as Pending on the Stitch Dashboard or on the Integration Details page.
For a new integration, a Pending status indicates that Stitch is in the process of scheduling the initial replication job for the integration. This may take some time to complete.
Initial replication jobs with Anchor Scheduling
If using Anchor Scheduling, an initial replication job may not kick off immediately. This depends on the selected Replication Frequency and Anchor Time. Refer to the Anchor Scheduling documentation for more information.
Free historical data loads
The first seven days of replication, beginning when data is first replicated, are free. Rows replicated from the new integration during this time won’t count towards your quota. Stitch offers this as a way of testing new integrations, measuring usage, and ensuring historical data volumes don’t quickly consume your quota.
GitLab table reference
Schemas and versioning
Schemas and naming conventions can change from version to version, so we recommend verifying your integration’s version before continuing.
The schema and info displayed below is for version 1 of this integration.
This is the latest version of the GitLab integration.
Table and column names in your destination
Depending on your destination, table and column names may not appear as they are outlined below.
For example: Object names are lowercased in Redshift (CusTomERs > customers), while case is maintained in PostgreSQL destinations (CusTomERs > CusTomERs). Refer to the Loading Guide for your destination for more info.
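The casing difference can be illustrated with a short Python sketch. It covers only the two destinations named in the example above; the function name and the destination labels are hypothetical, and other destinations may apply further rules described in their Loading Guides.

```python
def destination_name(name, destination):
    """Return how an object name would appear in the given destination."""
    if destination == "redshift":
        # Redshift lowercases object names.
        return name.lower()
    if destination == "postgresql":
        # PostgreSQL destinations maintain the original case.
        return name
    raise ValueError(f"Casing rule not covered in this sketch: {destination}")
```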
branches
The branches table contains high-level info about repository branches in your projects.
Note: To replicate branch data, you must set this table and the projects table to replicate. Data for this table will only be replicated when the associated project (in the projects table) is also updated.
Replication Method | Key-based Incremental |
Primary Keys | project_id, name |
Useful links |
Join branches with | on |
---|---|
commits | branches.project_id = commits.project_id AND branches.commit_id = commits.id |
issues | branches.project_id = issues.project_id |
milestones | branches.project_id = milestones.project_id |
projects | branches.project_id = projects.id |
commit_id STRING |
developers_can_merge BOOLEAN |
developers_can_push BOOLEAN |
merged BOOLEAN |
name STRING |
project_id INTEGER |
protected BOOLEAN |
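As an illustration of the branches-to-commits join listed above, the following Python sketch uses an in-memory SQLite database in place of a real destination. The sample rows are made up, and only a few of the columns are included.

```python
import sqlite3

# In-memory database standing in for the destination (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE branches (project_id INTEGER, name TEXT, commit_id TEXT);
    CREATE TABLE commits (id TEXT, project_id INTEGER, title TEXT);
    -- Made-up sample rows.
    INSERT INTO branches VALUES (1, 'main', 'abc123'), (1, 'feature-x', 'def456');
    INSERT INTO commits VALUES ('abc123', 1, 'Initial commit'), ('def456', 1, 'Add feature x');
""")

# Look up the head commit of each branch using both join conditions.
BRANCH_HEADS_SQL = """
    SELECT b.name, c.title
    FROM branches AS b
    JOIN commits AS c
      ON b.project_id = c.project_id
     AND b.commit_id = c.id
    ORDER BY b.name
"""
branch_heads = conn.execute(BRANCH_HEADS_SQL).fetchall()
```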
commits
The commits table contains info about repository commits in a project.
Note: To replicate commit data, you must set this table and the projects table to replicate. Data for this table will only be replicated when the associated project (in the projects table) is also updated.
Replication Method | Key-based Incremental |
Primary Key | id |
Useful links |
Join commits with | on |
---|---|
branches | commits.project_id = branches.project_id AND commits.id = branches.commit_id |
issues | commits.project_id = issues.project_id |
milestones | commits.project_id = milestones.project_id |
projects | commits.project_id = projects.id |
allow_failure BOOLEAN |
author_email STRING |
author_name STRING |
committer_email STRING |
committer_name STRING |
created_at DATE-TIME |
id STRING |
message STRING |
project_id INTEGER |
short_id STRING |
title STRING |
groups
The groups table contains info about the groups in your GitLab account.
Replication Method | Full Table |
Primary Key | id |
Useful links |
avatar_url STRING |
description STRING |
full_name STRING |
full_path STRING |
id INTEGER |
lfs_enabled BOOLEAN |
name STRING |
path STRING |
projects ARRAY |
request_access_enabled BOOLEAN |
visibility_level INTEGER |
web_url STRING |
issues
The issues table contains info about issues contained within projects.
Replication Method | Key-based Incremental |
Primary Key | id |
Replication Key | updated_at |
Useful links |
Join issues with | on |
---|---|
branches | issues.project_id = branches.project_id |
commits | issues.project_id = commits.project_id |
milestones | issues.project_id = milestones.project_id AND issues.milestone_id = milestones.id |
projects | issues.project_id = projects.id |
assignee_id INTEGER |
author_id INTEGER |
confidential BOOLEAN |
created_at DATE-TIME |
description STRING |
due_date STRING |
id INTEGER |
iid INTEGER |
labels ARRAY |
milestone_id INTEGER |
project_id INTEGER |
state STRING |
subscribed BOOLEAN |
title STRING |
updated_at DATE-TIME |
user_notes_count INTEGER |
web_url STRING |
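The issues-to-milestones join can be used, for example, to count open issues per milestone. The Python sketch below runs the query against an in-memory SQLite database standing in for a destination; the sample rows are made up, and only a few columns are included.

```python
import sqlite3

# In-memory database standing in for the destination (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE milestones (id INTEGER, project_id INTEGER, title TEXT);
    CREATE TABLE issues (id INTEGER, project_id INTEGER, milestone_id INTEGER, state TEXT);
    -- Made-up sample rows.
    INSERT INTO milestones VALUES (10, 1, 'v1.0'), (11, 1, 'v1.1');
    INSERT INTO issues VALUES
        (100, 1, 10, 'opened'),
        (101, 1, 10, 'closed'),
        (102, 1, 11, 'opened'),
        (103, 1, 10, 'opened');
""")

# Count issues still open under each milestone.
OPEN_ISSUES_SQL = """
    SELECT m.title, COUNT(*) AS open_issues
    FROM issues AS i
    JOIN milestones AS m
      ON i.project_id = m.project_id
     AND i.milestone_id = m.id
    WHERE i.state = 'opened'
    GROUP BY m.title
    ORDER BY m.title
"""
open_issues = conn.execute(OPEN_ISSUES_SQL).fetchall()
```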
milestones
The milestones table contains info about project milestones.
Note: To replicate milestone data, you must set this table and the projects table to replicate. Data for this table will only be replicated when the associated project (in the projects table) is also updated.
Replication Method | Key-based Incremental |
Primary Key | id |
Replication Key | updated_at |
Useful links |
created_at DATE-TIME |
description STRING |
due_date STRING |
group_id INTEGER |
id INTEGER |
iid INTEGER |
project_id INTEGER |
start_date STRING |
state STRING |
title STRING |
updated_at DATE-TIME |
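Tables that use Key-based Incremental Replication with an updated_at Replication Key are replicated by comparing that column against a saved bookmark. The following Python sketch shows the general idea only. The function name, the inclusive comparison, and the sample timestamps are assumptions, not the tap's actual implementation.

```python
def incremental_sync(rows, bookmark):
    """Select rows at or after the bookmark and return them with the new bookmark.

    Timestamps are ISO 8601 strings in one fixed UTC format, so plain
    string comparison orders them chronologically.
    """
    selected = [
        row for row in rows
        if bookmark is None or row["updated_at"] >= bookmark
    ]
    # The new bookmark is the largest Replication Key value seen so far.
    new_bookmark = max((row["updated_at"] for row in selected), default=bookmark)
    return selected, new_bookmark


# Made-up sample rows.
milestones = [
    {"id": 1, "updated_at": "2024-01-01T00:00:00Z"},
    {"id": 2, "updated_at": "2024-02-01T00:00:00Z"},
]
first_run, bookmark = incremental_sync(milestones, None)
second_run, bookmark = incremental_sync(milestones, bookmark)
```

In this sketch the first run selects both rows, and the second run selects only the row whose updated_at equals the bookmark, because the comparison is inclusive.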
projects
The projects table contains info about specific projects.
Replication Method | Key-based Incremental |
Primary Key | id |
Replication Key | last_activity_at |
Useful links |
Join projects with | on |
---|---|
branches | projects.id = branches.project_id |
commits | projects.id = commits.project_id |
issues | projects.id = issues.project_id |
milestones | projects.id = milestones.project_id |
users | projects.creator_id = users.id |
approvals_before_merge INTEGER |
archived BOOLEAN |
avatar_url STRING |
builds_enabled BOOLEAN |
container_registry_enabled BOOLEAN |
created_at DATE-TIME |
creator_id INTEGER |
default_branch STRING |
description STRING |
forks_count INTEGER |
http_url_to_repo STRING |
id INTEGER |
issues_enabled BOOLEAN |
last_activity_at DATE-TIME |
lfs_enabled BOOLEAN |
merge_requests_enabled BOOLEAN |
name STRING |
name_with_namespace STRING |
namespace OBJECT |
only_allow_merge_if_all_discussions_are_resolved BOOLEAN |
only_allow_merge_if_build_succeeds BOOLEAN |
open_issues_count INTEGER |
owner_id INTEGER |
path STRING |
path_with_namespace STRING |
permissions OBJECT |
public BOOLEAN |
public_builds BOOLEAN |
request_access_enabled BOOLEAN |
shared_runners_enabled BOOLEAN |
shared_with_groups ARRAY |
snippets_enabled BOOLEAN |
ssh_url_to_repo STRING |
star_count INTEGER |
tag_list ARRAY |
visibility_level INTEGER |
web_url STRING |
wiki_enabled BOOLEAN |
users
The users table contains info about the users in your GitLab account.
Replication Method | Full Table |
Primary Key | id |
Useful links |
Join users with | on |
---|---|
projects | users.id = projects.creator_id |
avatar_url STRING |
id INTEGER |
name STRING |
state STRING |
username STRING |
web_url STRING |
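The users-to-projects join listed above makes it possible to look up who created each project. The Python sketch below runs it against an in-memory SQLite database standing in for a destination; the sample rows are made up, and only a few columns are included.

```python
import sqlite3

# In-memory database standing in for the destination (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER, username TEXT);
    CREATE TABLE projects (id INTEGER, name TEXT, creator_id INTEGER);
    -- Made-up sample rows.
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO projects VALUES (10, 'api', 2), (11, 'web', 1);
""")

# Pair each project with the username of its creator.
PROJECT_CREATORS_SQL = """
    SELECT p.name, u.username
    FROM projects AS p
    JOIN users AS u
      ON p.creator_id = u.id
    ORDER BY p.name
"""
project_creators = conn.execute(PROJECT_CREATORS_SQL).fetchall()
```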