Sharing the techniques that worked (and did not work) in the competition organized by Andrew Ng & DeepLearning.AI
Link to Medium writeup: https://towardsdatascience.com/data-centric-ai-competition-tips-and-tricks-of-a-top-5-finish-9cacc254626e
Data is food for AI, and shifting from a model-centric to a data-centric approach offers vast potential for improving model performance. That is the motivation behind the recent Data-Centric AI Competition organized by Andrew Ng and DeepLearning.AI.
In this repo, I unveil the methods (and code) behind my Top 5% submission (~84% accuracy, ranked 24th), including the various techniques that worked and did not work for me. Do check out the Medium article for a more in-depth look at my thought process and the methods behind the submission.
- Link to competition page: https://https-deeplearning-ai.github.io/data-centric-comp/
- A collaboration between DeepLearning.AI and Landing AI, the Data-Centric AI Competition aims to elevate data-centric approaches to improving the performance of machine learning models.
- In most machine learning competitions, you are asked to build a high-performance model given a fixed dataset.
- However, machine learning has matured to the point that high-performance model architectures are widely available, while approaches to engineering datasets have lagged.
- The Data-Centric AI Competition inverts the traditional format and instead asks you to improve a dataset given a fixed model. We will provide you with a dataset to improve by applying data-centric techniques such as fixing incorrect labels, adding examples that represent edge cases, applying data augmentation, etc.
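One of the techniques mentioned above, data augmentation, can be illustrated with a minimal sketch. The example below is not the competition's or my submission's actual pipeline; it assumes grayscale images represented as plain lists of pixel rows and applies a small random translation, a common augmentation for MNIST-style digit images:

```python
import random

def shift_image(img, dx, dy, fill=0):
    """Shift a 2D grayscale image (list of pixel rows) by (dx, dy),
    filling newly exposed pixels with `fill`."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out

def random_shift(img, max_shift=2):
    """Randomly translate an image by up to `max_shift` pixels per axis."""
    dx = random.randint(-max_shift, max_shift)
    dy = random.randint(-max_shift, max_shift)
    return shift_image(img, dx, dy)

# Tiny 3x3 example: shift a single bright pixel one step to the right
img = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
print(shift_image(img, 1, 0))  # [[0, 0, 0], [0, 0, 255], [0, 0, 0]]
```

In practice, a library such as torchvision or Albumentations would handle this, but the idea is the same: each augmented copy is added to the training set alongside the original.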
- Full_Notebook_Best_Submission.ipynb (Complete walkthrough code for my best submission to the competition)
- experiment_tracker.csv (Spreadsheet tracker I used to monitor my various experiments)
- /data (Public Roman MNIST dataset released by the competition)