Using an Airbyte Server (advanced)

This page is for users who have decided that the Airbyte server is the right deployment option for their production instance.

Setting up an Airbyte Server

With a full Airbyte server, you get infrastructure that manages the scheduled runs of your connections, as well as an Airbyte UI that helps you manage them.

  1. [recommended] Spin up an EC2 instance running the Airbyte app and a corresponding RDS instance. Follow along with the linked Airbyte docs to set up your server.
  2. If you anticipate scaling issues hosting your Airbyte server on EC2 and have expertise in managing Kubernetes, you can instead deploy Airbyte on Kubernetes.

Securely Storing Secrets

By default, Airbyte Server stores secrets, such as API keys and other credentials, as unencrypted strings in the Airbyte database. Airbyte offers alternative, more secure options for storing secrets. We recommend HashiCorp Vault. To configure your Airbyte Server to use Vault, perform the following steps:

  1. Create a new Key/Value Secrets Engine in your Vault instance. Both Engine versions 1 and 2 should work.

  2. If you're running Airbyte on EC2, add the following environment variables to the bootloader, server, and worker Docker services:

    SECRET_PERSISTENCE=VAULT
    VAULT_ADDRESS=<VAULT URL>
    VAULT_PREFIX=<Secrets Engine Name>/
    VAULT_AUTH_TOKEN="<VAULT TOKEN>"
    
  3. If you're running Airbyte on Kubernetes, add the following entries to your Helm values.yaml file, or to a custom Helm values file passed with the -f or --values option during deployment:

    airbyte-bootloader:
      extraEnv:
        - name: SECRET_PERSISTENCE
          value: VAULT
        - name: VAULT_ADDRESS
          value: <VAULT URL>
        - name: VAULT_PREFIX
          value: <Secrets Engine Name>/
        - name: VAULT_AUTH_TOKEN
          value: "<VAULT TOKEN>"
    server:
      extraEnv:
        - name: SECRET_PERSISTENCE
          value: VAULT
        - name: VAULT_ADDRESS
          value: <VAULT URL>
        - name: VAULT_PREFIX
          value: <Secrets Engine Name>/
        - name: VAULT_AUTH_TOKEN
          value: "<VAULT TOKEN>"
    worker:
      extraEnv:
        - name: SECRET_PERSISTENCE
          value: VAULT
        - name: VAULT_ADDRESS
          value: <VAULT URL>
        - name: VAULT_PREFIX
          value: <Secrets Engine Name>/
        - name: VAULT_AUTH_TOKEN
          value: "<VAULT TOKEN>"
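
The same four variables are repeated for each of the three services above, so it can be convenient to generate the block from a single set of values. The following is a minimal sketch of that idea, not part of Airbyte or Faros tooling: the service keys and variable names are copied from the steps above, while the function names and the sample address, engine name, and token are hypothetical.

```python
import json

# Service keys as they appear in the Helm values shown above.
SERVICES = ("airbyte-bootloader", "server", "worker")


def vault_env(address, engine_name, token):
    """Return the four Vault-related environment variables as a dict."""
    return {
        "SECRET_PERSISTENCE": "VAULT",
        "VAULT_ADDRESS": address,
        # The steps above show the prefix as the engine name plus a trailing slash.
        "VAULT_PREFIX": engine_name.rstrip("/") + "/",
        "VAULT_AUTH_TOKEN": token,
    }


def render_helm_values(env):
    """Render an extraEnv block for every service in SERVICES as YAML text."""
    lines = []
    for service in SERVICES:
        lines.append(f"{service}:")
        lines.append("  extraEnv:")
        for name, value in env.items():
            lines.append(f"    - name: {name}")
            # json.dumps yields a double-quoted string, which is also a valid YAML scalar.
            lines.append(f"      value: {json.dumps(value)}")
    return "\n".join(lines) + "\n"


# Hypothetical sample values, for illustration only.
sample = vault_env("https://vault.example.com:8200", "airbyte", "example-token")
helm_values = render_helm_values(sample)
```

Writing `helm_values` to a file and passing that file with -f keeps the three service blocks in sync when the Vault address or token changes.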
    

You can also store secrets using GCP Secret Manager or AWS Secrets Manager if you're already integrated into one of those cloud platforms. See the Airbyte documentation for the environment variables required for each option.
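
If you use Vault, it can help to confirm that the address and token are usable before wiring them into Airbyte. The sketch below calls Vault's token self-lookup endpoint (`/v1/auth/token/lookup-self`) with the `X-Vault-Token` header. The function names and timeout are hypothetical, and the check only verifies that Vault accepts the token; it does not check that the token's policies allow access to your secrets engine.

```python
import urllib.error
import urllib.request


def build_lookup_request(vault_address, token):
    """Build a GET request to Vault's token self-lookup endpoint."""
    url = vault_address.rstrip("/") + "/v1/auth/token/lookup-self"
    return urllib.request.Request(url, headers={"X-Vault-Token": token})


def token_is_valid(vault_address, token, timeout=5.0):
    """Return True if Vault accepts the token, False if it answers with an HTTP error.

    Network failures (for example, an unreachable address) are not caught
    here and surface as urllib.error.URLError.
    """
    request = build_lookup_request(vault_address, token)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except urllib.error.HTTPError:
        return False
```

For example, `token_is_valid("<VAULT URL>", "<VAULT TOKEN>")` can be run from the machine that will host Airbyte, which also confirms that Vault is reachable from there.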

Using your Airbyte Server

Instead of running your Airbyte connections through the command line, you’ll need to recreate them in your new Airbyte server. Fortunately, all the parameters stay the same.

Create a Faros Destination

This only needs to be done once and will be used by all sources.

  1. Add a new Faros destination definition
  2. Choose Faros Destination
  3. Configure the Faros destination (Note: select "v2" for GraphQL API version)

Add your sources to the UI

The example below reproduces the GitHub CLI command in the Airbyte server UI.

  1. Add the connector to your sources: Settings > Sources > New connector

    1. Connector display name: Faros Feeds
    2. Docker repository name: farosai/airbyte-faros-feeds-source
    3. Docker image tag: latest
       Note: this pulls the latest version at the time you create the connector. It does not update to the latest version each time the source runs.
    4. Connector Documentation URL: <https://docs.faros.ai>
      

  2. Create a new source to pull from GitHub Enterprise using the connector you created above: Sources > New source

    1. Source type: Faros Feeds (or whatever you named the connector in step 1)
    2. Feed type: Choose 'github' from the dropdown. This shows the GitHub-specific configuration fields.
    3. Authentication: Choose your authentication method and complete accordingly.
    4. Repos Query Mode: Select GitHub Org and enter a list of repositories to pull.
    5. GitHub API URL: Enter the GitHub Enterprise API URL (defaults to the GitHub API URL).
    6. Cutoff days: Enter the number of days to look back. Entities updated within that window are fetched (defaults to 90 days).
    7. Feed command line arguments: Leave blank. This field passes extra feed arguments that aren’t present in the UI.
    8. Enable debug logs if desired.
      


  3. Create a connection between the source and the Faros destination

    1. Destination Stream Prefix: Enter a prefix matching faros_feeds (e.g., ghefaros_feeds). The first part (ghe in this example) is used as the origin for your records. The second part (faros_feeds) is used by the destination connector to convert records emitted by the source into Faros models.
    2. There's a single stream: faros_feed. Activate it and select the sync mode.
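
To make the Cutoff days setting from step 2 concrete, the sketch below computes the earliest update time that falls inside the window for a given number of days. It illustrates the date arithmetic only. How the connector itself applies the cutoff is an assumption here, and the function and constant names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_CUTOFF_DAYS = 90  # default stated in the source configuration above


def cutoff_datetime(cutoff_days=DEFAULT_CUTOFF_DAYS, now=None):
    """Return the start of the window: `cutoff_days` days before `now` (UTC)."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now - timedelta(days=cutoff_days)


# With the default of 90 days, a sync on 2024-04-01 would look back to 2024-01-02.
example = cutoff_datetime(now=datetime(2024, 4, 1, tzinfo=timezone.utc))
```

A smaller value shortens the first sync at the cost of leaving older entities out.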