Feathr – A scalable, unified data and AI engineering platform for enterprise
Updated Apr 4, 2024 - Scala
Automated data quality suggestions and analysis with Deequ on AWS Glue
Test data management tool for any data source, batch or real-time. Generate, validate and clean up data all in one tool.
The Lightning Catalog is an open-source data catalog designed for preparing data at any scale in ad-hoc analytics, data virtualization, data warehousing, lake houses, and ML projects.
Data quality control tool built on Spark and Deequ
A library for Spark that helps to standardize any input data (DataFrame) to adhere to the provided schema.
Data generation and validation tool for any data source
Example API implementation for Data Caterer
A Quality Spark DQ Library
Simple Spark wrapper for validating data
An extensible and configurable ETL tool built on top of Apache Spark