Data Engineering Collection
Repositories tagged with "data-engineering"
superset
apache
“Apache Superset is a Data Visualization and Data Exploration Platform”
Made-With-ML
GokuMohandas
“Learn how to develop, deploy and iterate on production-grade ML applications.”
airflow
apache
“Apache Airflow - A platform to programmatically author, schedule, and monitor workflows”
data-engineering-zoomcamp
DataTalksClub
“Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here”
applied-ml
eugeneyan
“Papers & tech blogs by companies sharing their work on data science & machine learning in production.”
prefect
PrefectHQ
“Prefect is a workflow orchestration framework for building resilient data pipelines in Python.”
airbyte
airbytehq
“Open-source data movement for ELT pipelines and AI agents - from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.”
taipy
Avaiga
“Turns Data and AI algorithms into production-ready web applications in no time.”
argo-workflows
argoproj
“Workflow Engine for Kubernetes”
dagster
dagster-io
“An orchestration platform for the development, production, and observation of data assets.”
Cookbook
andkret
“The Data Engineering Cookbook”
data-engineer-roadmap
datastacktv
“Roadmap to becoming a data engineer in 2021”
great_expectations
great-expectations
“Always know what to expect from your data.”
cocoindex
cocoindex-io
“Incremental engine for long horizon agents. Star if you like it!”
xonsh
xonsh
“Python-powered shell. Full-featured, cross-platform and AI-friendly.”
risingwave
risingwavelabs
“Event streaming platform for agentic AI. Continuously ingest, transform, and serve event streams in real time, at scale.”
mage-ai
mage-ai
“Build, run, and manage data pipelines for integrating and transforming data.”
connect
redpanda-data
“Fancy stream processing made operationally mundane”