After completing the steps above, you will have pulled data from your source into your Faros graph. As your Faros implementation matures, you will want to keep syncing your source data into Faros on an ongoing basis. You can do this in the following ways:
Manually execute airbyte-local-cli: We recommend starting here. This option requires the least overhead and works well for getting data into Faros initially.
Scheduled job to execute airbyte-local-cli: If you have an existing scheduling/orchestration system, or are comfortable implementing one yourself, add your airbyte-local-cli command to a scheduled job so that your data syncs on a regular cadence. Note: for incremental syncs, you need to manage the state file yourself.
Airbyte Server: This option offers better visibility, built-in state management, and a UI for managing and maintaining several Airbyte connections in one place. Because of the overhead of setting up the server, we recommend it only for production implementations. An Airbyte server can run either on a single host/VM (e.g., EC2) or on Kubernetes.
| Airbyte deployment option | Infrastructure | Orchestration / Scheduling | Incremental sync state management | Logs storage | Web UI & API | Setup complexity |
|---|---|---|---|---|---|---|
| Manually execute airbyte-local-cli | Your machine | None | Custom | Custom | No | Simple |
| Scheduled job to execute airbyte-local-cli | Source and destination containers on a single host/VM (e.g., EC2) | Custom | Custom | Custom | No | Simple |
| Airbyte Server on a single host/VM | All containers on a single host/VM (e.g., EC2) | Airbyte | Database | Internal | Yes | Moderate |
| Airbyte Server on Kubernetes | Containers on custom/managed | Airbyte | Database | Internal or external | Yes | Complex |
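For the scheduled-job option, "Custom" state management means your job is responsible for keeping the incremental state file intact between runs. The following is a minimal sketch of one way to do that, not an official Faros or Airbyte utility. It assumes that your sync command reads and updates the state file in place; check how your airbyte-local-cli invocation actually handles state before relying on this. The function name `run_sync` and the `.bak` backup naming are hypothetical.

```python
import shutil
import subprocess
import sys
from pathlib import Path
from typing import Sequence


def run_sync(command: Sequence[str], state_file: Path) -> bool:
    """Run one sync and keep the previous state if the sync fails.

    Assumption: `command` reads and updates `state_file` in place.
    """
    backup = state_file.with_name(state_file.name + ".bak")
    had_state = state_file.exists()
    if had_state:
        # Snapshot the last known-good state before the run touches it.
        shutil.copy2(state_file, backup)

    result = subprocess.run(list(command))
    if result.returncode == 0:
        return True

    # Failed sync: roll back so the next run resumes from the last good state.
    if had_state:
        shutil.copy2(backup, state_file)
    elif state_file.exists():
        # No prior state existed, so discard any partial state that was written.
        state_file.unlink()
    return False


# Entry point for a scheduler: first argument is the state file path,
# the remaining arguments are your own sync command, passed through unchanged.
if __name__ == "__main__" and len(sys.argv) > 2:
    succeeded = run_sync(sys.argv[2:], Path(sys.argv[1]))
    sys.exit(0 if succeeded else 1)
```

A scheduler such as cron can then call this script with the state file path followed by the sync command you already run by hand. Rolling back on failure means a crashed run is retried from the last good checkpoint instead of from a partially written state.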