Data Quality Collection
Repositories tagged with "data-quality"
Made-With-ML
GokuMohandas
"Learn how to develop, deploy and iterate on production-grade ML applications."
applied-ml
eugeneyan
"Papers & tech blogs by companies sharing their work on data science & machine learning in production."
OpenMetadata
open-metadata
"OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration."
fg-data-profiling
Data-Centric-AI-Community
"1 Line of code data quality profiling & exploratory data analysis for Pandas and Spark DataFrames."
great_expectations
great-expectations
"Always know what to expect from your data."
cleanlab
cleanlab
"Cleanlab's open-source library is the standard data-centric AI package for data quality and machine learning with messy, real-world data and labels."
fiftyone
voxel51
"Refine high-quality datasets and visual AI models"
evidently
evidentlyai
"Evidently is an open-source ML and LLM observability framework. Evaluate, test, and monitor any AI-powered system or data pipeline. From tabular data to Gen AI. 100+ metrics."
feast
feast-dev
"The Open Source Feature Store for AI/ML"
lakeFS
treeverse
"lakeFS - Data version control for your data lake | Git for data"
mlops-course
GokuMohandas
"Learn how to design, develop, deploy and iterate on production-grade ML applications."
data-diff
datafold
"Compare tables within or across databases"
whylogs
whylabs
"An open-source data logging library for machine learning models and data pipelines. Provides visibility into data quality & model performance over time. Supports privacy-preserving data collection, ensuring safety & robustness."
soda-core
sodadata
"Data Contracts engine for the modern data stack. https://www.soda.io"
featureform
featureform
"The Virtual Feature Store. Turn your existing data infrastructure into a feature store."
feathr
feathr-ai
"Feathr - A scalable, unified data and AI engineering platform for enterprise"
Curator
NVIDIA-NeMo
"Scalable data pre-processing and curation toolkit for LLMs"
re-data
re-data
"re_data - fix data issues before your users & CEO would discover them"