🚧
C-CORE is experimenting with Prefect for workflow automation.
# Why Prefect?
A number of workflow automation tools exist, including Prefect, Airflow, Luigi, Argo Workflows, and Kubeflow. Given our small team and our experience with Python, Prefect presented the least friction. A helpful Reddit reply: https://www.reddit.com/r/dataengineering/comments/qq3lvl/comment/hk0keli/?utm_source=share&utm_medium=web2x&context=3
# Orchestration API
The Prefect orchestration server is open source, and we could run it on our own cloud resources. However, the managed Prefect API costs a negligible amount per month, so we use that instead.
# Compute
The Prefect API only orchestrates jobs; it does not provide any compute resources for executing tasks. Instead, we deploy Prefect Agents to our own compute resources, and the agents execute the tasks. During development, the LocalAgent can execute tasks on a local laptop. For production, we deploy the KubernetesAgent on our GKE Autopilot Kubernetes cluster. This pattern was inspired by a Reddit comment: https://www.reddit.com/r/dataengineering/comments/oqbiiu/comment/h6cx7aj/?utm_source=share&utm_medium=web2x&context=3
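
The split between orchestration and compute can be illustrated with a toy model. This is a minimal sketch in plain Python, not Prefect code: the `Orchestrator`, `Agent`, and `FlowRun` classes, the label names, and the label-matching rule are all hypothetical stand-ins chosen for illustration. It only shows the idea that the API queues work while agents running on our own machines pull and execute it.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, FrozenSet, List, Optional


@dataclass
class FlowRun:
    """A unit of scheduled work (hypothetical stand-in, not a Prefect class)."""
    name: str
    labels: FrozenSet[str]
    fn: Callable[[], str]


@dataclass
class Orchestrator:
    """Stands in for the managed API: it queues runs but never executes them."""
    queue: Deque[FlowRun] = field(default_factory=deque)

    def schedule(self, run: FlowRun) -> None:
        self.queue.append(run)

    def claim(self, agent_labels: FrozenSet[str]) -> Optional[FlowRun]:
        # Hand out the first queued run whose labels the agent covers.
        # This matching rule is a simplifying assumption for the sketch.
        for run in list(self.queue):
            if run.labels <= agent_labels:
                self.queue.remove(run)
                return run
        return None


@dataclass
class Agent:
    """Stands in for an agent process running on compute that we own."""
    name: str
    labels: FrozenSet[str]

    def poll(self, api: Orchestrator) -> List[str]:
        # The agent, not the orchestrator, is where run.fn() actually executes.
        results = []
        while (run := api.claim(self.labels)) is not None:
            results.append(f"{self.name} ran {run.name}: {run.fn()}")
        return results


api = Orchestrator()
api.schedule(FlowRun("smoke-test", frozenset({"dev"}), lambda: "ok"))
api.schedule(FlowRun("nightly-job", frozenset({"prod"}), lambda: "done"))

laptop = Agent("laptop-agent", frozenset({"dev"}))    # development machine
cluster = Agent("cluster-agent", frozenset({"prod"}))  # production cluster

dev_results = laptop.poll(api)
prod_results = cluster.poll(api)
print(dev_results)
print(prod_results)
```

In this sketch the laptop agent only picks up the run labelled `dev` and the cluster agent only picks up the run labelled `prod`, which mirrors how development and production work are kept on separate compute.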
# Dependencies
The Prefect documentation describes which dependencies should be included in the Dockerfile: https://docs.prefect.io/orchestration/flow_config/docker.html#dependency-requirements
# Further Reading
Mixing GKE Autopilot with Cloud Run using Crossplane and Ambassador: https://chaordic.io/blog/gke-autopilot-a-serverless-game-changer/