GitLab (v1) | Stitch Documentation

GitLab is supported by the Singer community
This integration is powered by Singer's GitLab tap. For support, visit the GitHub repo or join the Singer Slack.

GitLab integration summary

Stitch’s GitLab integration replicates data using the GitLab REST API. Refer to the Schema section for a list of objects available for replication.

GitLab feature snapshot

A high-level look at Stitch's GitLab (v1) integration, including release status, useful links, and the features supported in Stitch.

STITCH
Release status	Released on March 1, 2017	Supported by	Singer Community
Stitch plan	Standard	API availability	Available
Singer GitHub repository	singer-io/tap-gitlab
REPLICATION SETTINGS
Anchor Scheduling	Supported	Advanced Scheduling	Supported
Table-level reset	Unsupported	Configurable Replication Methods	Unsupported
DATA SELECTION
Table selection	Unsupported	Column selection	Unsupported
Select all	Unsupported
TRANSPARENCY
Extraction Logs	Supported	Loading Reports	Supported

Connecting GitLab

GitLab setup requirements

To set up GitLab in Stitch, you need:

Access to any projects you want to replicate data from. Stitch will only be able to access the same projects as the user who creates the integration.

Step 1: Create a GitLab token

Sign into your GitLab account.
Click the user menu (your icon) > Settings.
Click the Access Tokens tab.
In the Name field, enter Stitch. This will allow you to easily identify what application is using the token.
In the Scopes section, check the api box. This will allow Stitch to access your API and replicate your GitLab data.
Click Create Personal Access Token.
The new Access Token will display at the top of the page. Copy the token before navigating away from the page - GitLab won’t display it again.
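
Before pasting the token into Stitch, it can be useful to confirm that it is accepted by the API. The following is a minimal sketch, assuming the gitlab.com base URL used in Step 2 and GitLab's PRIVATE-TOKEN request header. The function name and the placeholder token are hypothetical, and the request is only built here, not sent.

```python
import urllib.request

API_URL = "https://gitlab.com/api/v4"  # same base URL entered in Stitch in Step 2


def build_projects_request(token: str, api_url: str = API_URL) -> urllib.request.Request:
    """Build (but do not send) a request listing projects the token's user belongs to."""
    url = f"{api_url}/projects?membership=true&per_page=5"
    # GitLab accepts personal access tokens in the PRIVATE-TOKEN header.
    return urllib.request.Request(url, headers={"PRIVATE-TOKEN": token})


request = build_projects_request("glpat-EXAMPLE")  # placeholder, not a real token
# To call the API for real: urllib.request.urlopen(request), then inspect the JSON body.
```

If the call succeeds and the returned projects match the ones you expect, the same token should give Stitch access to the same projects.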

Step 2: Add GitLab as a Stitch data source

Sign into your Stitch account.
On the Stitch Dashboard page, click the Add Integration button.
Click the GitLab icon.
Enter a name for the integration. This is the name that will display on the Stitch Dashboard for the integration; it’ll also be used to create the schema in your destination.

For example, the name “Stitch GitLab” would create a schema called stitch_gitlab in the destination. Note: Schema names cannot be changed after you save the integration.
In the API URL field, enter https://gitlab.com/api/v4.
In the Private Token field, paste the Personal Access Token you created in the previous section.
In the Projects and Groups to Track fields, enter the projects and/or groups you want to track as a space-separated list.

For example: stitchdata/group-a for a single group, or stitchdata/project-a stitchdata/project-b for two projects.

Note: A value for one of these fields must be provided. Additionally, the way you define these settings determines how some data is replicated:
- If groups are provided but projects aren’t, all group projects will be replicated.
- If groups and projects are provided, the selected projects of the listed groups will be replicated.
- If projects are provided but groups aren’t, all listed projects will be replicated.
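
The three rules in the note can be sketched as a small decision function. This is an illustration only: the function name and the returned descriptions are hypothetical, and it models the rules as stated here rather than the tap's actual code.

```python
def replication_scope(groups_field: str, projects_field: str) -> str:
    """Describe what gets replicated for the two space-separated field values."""
    groups = groups_field.split()
    projects = projects_field.split()
    if not groups and not projects:
        # The note requires a value for at least one of the two fields.
        raise ValueError("Provide a value for Projects or Groups to Track")
    if groups and not projects:
        return "all projects in the listed groups"
    if groups and projects:
        return "the listed projects of the listed groups"
    return "all listed projects"


print(replication_scope("stitchdata/group-a", ""))
print(replication_scope("", "stitchdata/project-a stitchdata/project-b"))
```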

Step 3: Define the historical replication start date

The Sync Historical Data setting defines the starting date for your GitLab integration. Data dated on or after this date will be replicated to your data warehouse.

Change this setting if you want to replicate data beyond GitLab’s default setting of 1 year. For a detailed look at historical replication jobs, check out the Syncing Historical SaaS Data guide.
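
To make the inclusive boundary concrete, here is a minimal sketch of a start-date filter. The record layout and the use of an updated_at field are assumptions for illustration; the actual field compared depends on the table.

```python
from datetime import datetime, timezone


def on_or_after_start(records, start_date):
    """Keep records dated on or after the start date (inclusive boundary)."""
    return [r for r in records if r["updated_at"] >= start_date]


start = datetime(2023, 1, 1, tzinfo=timezone.utc)
sample = [  # made-up records
    {"id": 1, "updated_at": datetime(2022, 12, 31, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2023, 1, 1, tzinfo=timezone.utc)},
    {"id": 3, "updated_at": datetime(2023, 6, 15, tzinfo=timezone.utc)},
]
kept_ids = [r["id"] for r in on_or_after_start(sample, start)]
```

In this sample, the record dated exactly on the start date is kept and the one from the day before is dropped.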

Step 4: Create a replication schedule

Replication schedules affect the time Extraction begins, not the time it takes for data to be loaded. Refer to the Replication Scheduling documentation for more information.

In the Replication Frequency section, you’ll create the integration’s replication schedule. An integration’s replication schedule determines how often Stitch runs a replication job, and the time that job begins.

GitLab integrations support the following replication scheduling methods:

Replication Frequency
Anchor Scheduling
Advanced Scheduling using Cron (Advanced or Premium plans only)

To keep your row usage low, consider setting the integration to replicate less frequently. See the Understanding and Reducing Your Row Usage guide for tips on reducing your usage.

Initial and historical replication jobs

After you finish setting up GitLab, its Sync Status may show as Pending on the Stitch Dashboard or on the Integration Details page.

For a new integration, a Pending status indicates that Stitch is in the process of scheduling the initial replication job for the integration. This may take some time to complete.

Initial replication jobs with Anchor Scheduling

If using Anchor Scheduling, an initial replication job may not kick off immediately. This depends on the selected Replication Frequency and Anchor Time. Refer to the Anchor Scheduling documentation for more information.
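
As a rough illustration of why the first job may not start right away, the sketch below computes the next start time from an anchor time and a frequency. It assumes jobs start at the anchor time plus whole multiples of the frequency. That model and the function name are assumptions, so refer to the Anchor Scheduling documentation for the actual behavior.

```python
from datetime import datetime, timedelta, timezone


def next_job_start(now, anchor, frequency):
    """Return the first anchor-aligned start time at or after `now`."""
    if now <= anchor:
        return anchor
    elapsed = now - anchor
    intervals = -(-elapsed // frequency)  # ceiling division on timedeltas
    return anchor + intervals * frequency


anchor = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
now = datetime(2024, 1, 1, 7, 30, tzinfo=timezone.utc)
first_run = next_job_start(now, anchor, timedelta(hours=6))
```

With a six-hour frequency anchored at midnight, an integration saved at 07:30 would wait until 12:00 under this model.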

Free historical data loads

The first seven days of replication, beginning when data is first replicated, are free. Rows replicated from the new integration during this time won’t count towards your quota. Stitch offers this as a way of testing new integrations, measuring usage, and ensuring historical data volumes don’t quickly consume your quota.

Replication will continue after the seven days are over. If you’re no longer interested in this source, be sure to pause or delete the integration to prevent unwanted usage.

GitLab table reference

Schemas and versioning

Schemas and naming conventions can change from version to version, so we recommend verifying your integration’s version before continuing.

The schema and info displayed below is for version 1 of this integration.

This is the latest version of the GitLab integration.

Table and column names in your destination

Depending on your destination, table and column names may not appear as they are outlined below.

For example: Object names are lowercased in Redshift (CusTomERs > customers), while case is maintained in PostgreSQL destinations (CusTomERs > CusTomERs). Refer to the Loading Guide for your destination for more info.
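
The example above can be sketched as a lookup that covers only the two destinations mentioned. The function name and the destination labels are hypothetical, and other destinations have their own rules, which are described in their Loading Guides.

```python
def destination_name(name: str, destination: str) -> str:
    """Return how an object name would appear in the given destination."""
    if destination == "redshift":
        return name.lower()  # Redshift lowercases object names
    if destination == "postgresql":
        return name  # PostgreSQL destinations maintain case
    raise ValueError(f"No naming rule sketched for {destination!r}")


print(destination_name("CusTomERs", "redshift"))    # customers
print(destination_name("CusTomERs", "postgresql"))  # CusTomERs
```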

branches

The branches table contains high-level info about repository branches in your projects.

Note: To replicate branch data, you must set this table and the projects table to replicate. Data for this table will only be replicated when the associated project (in the projects table) is also updated.
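
The dependency described in the note can be pictured as follows. This is a sketch under assumptions: it uses last_activity_at (the Replication Key of the projects table) to decide whether a project counts as updated, compares it inclusively against a saved bookmark, and uses a hypothetical function name and made-up sample data.

```python
from datetime import datetime, timezone


def projects_with_child_sync(projects, bookmark):
    """Return IDs of projects whose child tables (such as branches) would be synced."""
    return [p["id"] for p in projects if p["last_activity_at"] >= bookmark]


bookmark = datetime(2024, 3, 1, tzinfo=timezone.utc)
projects = [  # made-up rows
    {"id": 1, "last_activity_at": datetime(2024, 2, 20, tzinfo=timezone.utc)},
    {"id": 2, "last_activity_at": datetime(2024, 3, 5, tzinfo=timezone.utc)},
]
updated_ids = projects_with_child_sync(projects, bookmark)
```

Here only project 2 has activity after the bookmark, so only its branches would be picked up in that run.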

Replication Method	Key-based Incremental
Primary Keys	project_id, name
Useful links	branches schema on GitHub GitLab API method

branches table foreign keys

Join branches with	on
commits	branches.project_id = commits.project_id; branches.commit_id = commits.id
issues	branches.project_id = issues.project_id
milestones	branches.project_id = milestones.project_id
projects	branches.project_id = projects.id
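
As an illustration of the commits join, the sketch below uses Python's built-in sqlite3 module as a stand-in for a destination. The rows are made up, only a few columns from each table are included, and combining both key pairs with AND is an assumption about how you would want to match a branch to its latest commit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE branches (project_id INTEGER, name TEXT, commit_id TEXT, protected INTEGER);
CREATE TABLE commits (id TEXT, project_id INTEGER, title TEXT, author_name TEXT);
INSERT INTO branches VALUES (1, 'main', 'abc123', 1), (1, 'feature-x', 'def456', 0);
INSERT INTO commits VALUES ('abc123', 1, 'Initial commit', 'Ada'),
                           ('def456', 1, 'Add feature', 'Grace');
""")

# Match each branch to the commit it points at, within the same project.
branch_commits = conn.execute("""
    SELECT b.name, c.title, c.author_name
    FROM branches AS b
    JOIN commits AS c
      ON b.project_id = c.project_id
     AND b.commit_id = c.id
    ORDER BY b.name
""").fetchall()
```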

branches table schema

commit_id

STRING

developers_can_merge

BOOLEAN

developers_can_push

BOOLEAN

merged

BOOLEAN

name

STRING

project_id

INTEGER

protected

BOOLEAN

commits

The commits table contains info about repository commits in a project.

Note: To replicate commit data, you must set this table and the projects table to replicate. Data for this table will only be replicated when the associated project (in the projects table) is also updated.

Replication Method	Key-based Incremental
Primary Key	id
Useful links	commits schema on GitHub GitLab API method

commits table foreign keys

Join commits with	on
branches	commits.project_id = branches.project_id; commits.id = branches.commit_id
issues	commits.project_id = issues.project_id
milestones	commits.project_id = milestones.project_id
projects	commits.project_id = projects.id

commits table schema

allow_failure

BOOLEAN

author_email

STRING

author_name

STRING

committer_email

STRING

committer_name

STRING

created_at

DATE-TIME

id

STRING

message

STRING

project_id

INTEGER

short_id

STRING

title

STRING

groups

The groups table contains info about the groups in your GitLab account.

Replication Method	Full Table
Primary Key	id
Useful links	groups schema on GitHub GitLab API method

groups table schema

avatar_url

STRING

description

STRING

full_name

STRING

full_path

STRING

id

INTEGER

lfs_enabled

BOOLEAN

name

STRING

path

STRING

projects

ARRAY

projects.id

INTEGER

request_access_enabled

BOOLEAN

visibility_level

INTEGER

web_url

STRING

issues

The issues table contains info about issues contained within projects.

Replication Method	Key-based Incremental
Primary Key	id
Replication Key	updated_at
Useful links	issues schema on GitHub GitLab API method
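
Key-based Incremental replication with updated_at as the Replication Key can be sketched as below. This assumes an inclusive comparison against the bookmark saved by the previous job. The function name and sample rows are hypothetical, and this is not the tap's actual code.

```python
from datetime import datetime, timezone


def select_updated(rows, bookmark):
    """Return rows at or after the bookmark, plus the new bookmark value."""
    selected = [row for row in rows if bookmark is None or row["updated_at"] >= bookmark]
    new_bookmark = max((row["updated_at"] for row in selected), default=bookmark)
    return selected, new_bookmark


issues = [  # made-up rows
    {"id": 1, "updated_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2024, 1, 2, tzinfo=timezone.utc)},
    {"id": 3, "updated_at": datetime(2024, 1, 3, tzinfo=timezone.utc)},
]
first_batch, bookmark = select_updated(issues, datetime(2024, 1, 2, tzinfo=timezone.utc))
second_batch, bookmark = select_updated(issues, bookmark)
```

Because the comparison is inclusive in this sketch, the row holding the maximum updated_at value is selected again on the following run. Since id is the Primary Key, such a row can be matched to the one already loaded.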

issues table foreign keys

Join issues with	on
branches	issues.project_id = branches.project_id
commits	issues.project_id = commits.project_id
milestones	issues.project_id = milestones.project_id; issues.milestone_id = milestones.id
projects	issues.project_id = projects.id
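
The milestones join can be tried out the same way with Python's built-in sqlite3 module standing in for a destination. The rows are made up, only a few columns are included, and the LEFT JOIN is a design choice of this sketch so that issues without a milestone are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE issues (id INTEGER, project_id INTEGER, milestone_id INTEGER, title TEXT);
CREATE TABLE milestones (id INTEGER, project_id INTEGER, title TEXT);
INSERT INTO issues VALUES (10, 1, 100, 'Fix login'), (11, 1, NULL, 'Update docs');
INSERT INTO milestones VALUES (100, 1, 'v1.0');
""")

# Pair each issue with its milestone title; unassigned issues get NULL (None in Python).
issue_milestones = conn.execute("""
    SELECT i.title, m.title
    FROM issues AS i
    LEFT JOIN milestones AS m
      ON i.project_id = m.project_id
     AND i.milestone_id = m.id
    ORDER BY i.id
""").fetchall()
```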

issues table schema

assignee_id

INTEGER

author_id

INTEGER

confidential

BOOLEAN

created_at

DATE-TIME

description

STRING

due_date

STRING

id

INTEGER

iid

INTEGER

labels

ARRAY

milestone_id

INTEGER

project_id

INTEGER

state

STRING

subscribed

BOOLEAN

title

STRING

updated_at

DATE-TIME

user_notes_count

INTEGER

web_url

STRING

milestones

The milestones table contains info about project milestones.

Note: To replicate milestone data, you must set this table and the projects table to replicate. Data for this table will only be replicated when the associated project (in the projects table) is also updated.

Replication Method	Key-based Incremental
Primary Key	id
Replication Key	updated_at
Useful links	milestones schema on GitHub GitLab API method

milestones table foreign keys

Join milestones with	on
branches	milestones.project_id = branches.project_id
commits	milestones.project_id = commits.project_id
issues	milestones.project_id = issues.project_id; milestones.id = issues.milestone_id
projects	milestones.project_id = projects.id

milestones table schema

created_at

DATE-TIME

description

STRING

due_date

STRING

group_id

INTEGER

iid

INTEGER

project_id

INTEGER

start_date

STRING

state

STRING

title

STRING

updated_at

DATE-TIME

projects

The projects table contains info about specific projects.

Replication Method	Key-based Incremental
Primary Key	id
Replication Key	last_activity_at
Useful links	projects schema on GitHub GitLab API method

projects table foreign keys

Join projects with	on
branches	projects.id = branches.project_id
commits	projects.id = commits.project_id
issues	projects.id = issues.project_id
milestones	projects.id = milestones.project_id
users	projects.creator_id = users.id

projects table schema

approvals_before_merge

INTEGER

archived

BOOLEAN

avatar_url

STRING

builds_enabled

BOOLEAN

container_registry_enabled

BOOLEAN

created_at

DATE-TIME

creator_id

INTEGER

default_branch

STRING

description

STRING

forks_count

INTEGER

http_url_to_repo

STRING

id

INTEGER

issues_enabled

BOOLEAN

last_activity_at

DATE-TIME

lfs_enabled

BOOLEAN

merge_requests_enabled

BOOLEAN

name

STRING

name_with_namespace

STRING

namespace

OBJECT

namespace.id

INTEGER

namespace.kind

STRING

namespace.name

STRING

namespace.path

STRING

only_allow_merge_if_all_discussions_are_resolved

BOOLEAN

only_allow_merge_if_build_succeeds

BOOLEAN

open_issues_count

INTEGER

owner_id

INTEGER

path

STRING

path_with_namespace

STRING

permissions

OBJECT

permissions.group_access

OBJECT

permissions.project_access

OBJECT

public

BOOLEAN

public_builds

BOOLEAN

request_access_enabled

BOOLEAN

shared_runners_enabled

BOOLEAN

shared_with_groups

ARRAY

shared_with_groups.group_access_level

INTEGER

shared_with_groups.group_id

INTEGER

shared_with_groups.group_name

STRING

snippets_enabled

BOOLEAN

ssh_url_to_repo

STRING

star_count

INTEGER

tag_list

ARRAY

visibility_level

INTEGER

web_url

STRING

wiki_enabled

BOOLEAN

users

The users table contains info about the users in your GitLab account.

Replication Method	Full Table
Primary Key	id
Useful links	users schema on GitHub GitLab API method

users table foreign keys

Join users with	on
projects	users.id = projects.creator_id

users table schema

avatar_url

STRING

id

INTEGER

name

STRING

state

STRING

username

STRING

web_url

STRING
