Airbyte Deployment Options

After completing the above, you will have successfully pulled data from your source into your Faros graph. As you move forward with your Faros implementation, you will want to continually sync your source data into Faros. This can be done in the following ways:

1. Manually re-run the commands from your machine when you want fresh data

We recommend starting with this option. It requires the least overhead and works well for initially getting data into Faros.

2. Embed the commands in a scheduled job

If you have an existing scheduling/orchestration system or are comfortable implementing one on your own, add your airbyte-cli command to the scheduled job. This will sync your data on a regular cadence. Note: for incremental syncs, you will need to manage the state file yourself.

3. Set up an Airbyte Server to schedule it for you

This solution offers better visibility, state management, and a helpful UI to manage and maintain several Airbyte connections in one place. Because of the overhead required to set up the server, it is only recommended for production implementations. An Airbyte server can run either on a single host/VM (e.g. EC2) or on Kubernetes.
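
For option 2, managing the state file yourself mostly means making sure a failed run does not leave a partial or corrupt state behind. The following is a minimal sketch of one way to do that, not the documented Faros or Airbyte procedure. The function name `run_sync`, the `.bak` backup suffix, and the `./sync.sh` wrapper script are all hypothetical; substitute the exact command you ran manually and the path where that command reads and writes its state.

```python
import shutil
import subprocess
from pathlib import Path


def run_sync(command: list[str], state_path: Path) -> bool:
    """Run one sync and keep the previous state file if the sync fails.

    `command` is whatever you already run by hand (assumed here to read and
    update `state_path` on its own). Returns True when the command exits 0.
    """
    backup = state_path.with_name(state_path.name + ".bak")

    # Snapshot the last known-good state before the sync touches it.
    if state_path.exists():
        shutil.copy2(state_path, backup)

    result = subprocess.run(command)
    if result.returncode == 0:
        return True

    # The sync failed: roll back so the next scheduled run resumes
    # from the last state that is known to be good.
    if backup.exists():
        shutil.copy2(backup, state_path)
    return False


if __name__ == "__main__":
    # Hypothetical wrapper script and state location; replace with your own.
    ok = run_sync(["./sync.sh"], Path("state.json"))
    raise SystemExit(0 if ok else 1)
```

A scheduler such as cron can then invoke this script on whatever cadence you need. The non-zero exit code on failure lets the scheduler's own alerting pick up broken runs.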

Summary comparison

| Airbyte deployment option | Infrastructure | Orchestration / Scheduling | Incremental syncs state management | Logs storage | Web UI & API | Setup complexity |
| --- | --- | --- | --- | --- | --- | --- |
| Manually execute airbyte-local-cli | Your machine | None | Custom | Custom | No | Simple |
| Scheduled job to execute airbyte-local-cli | Source and destination containers on a single host/VM (e.g. EC2) | Custom | Custom | Custom | No | Simple |
| Airbyte Server on a single host/VM | All containers on a single host/VM (e.g. EC2) | Airbyte | Database | Internal | Yes | Moderate |
| Airbyte Server on Kubernetes | Containers on custom/managed k8s cluster | Airbyte | Database | Internal or external | Yes | Complex |