Chapter 4: Getting to Know Your Data

Here is what you will learn as part of this chapter:

  1. Improving data integrity with Delta Live Tables (DLT)
  2. Monitoring data quality with Databricks Lakehouse Monitoring
  3. Exploring data with Databricks Assistant
  4. Generating data profiles with AutoML
  5. Using embeddings for machine-readable data
  6. Enhancing data retrieval with Databricks Vector Search
  7. Applying our learning

Technical requirements

Here are the technical requirements needed to complete the hands-on examples in this chapter:

  • The Databricks Assistant is a newer feature that an administrator can enable. We demonstrate the Assistant in this chapter.
  • We use the missingno library to visualize missing values in our project data.
  • In the section using AutoML, we reference the AutoML-generated notebook, which you can find in the GitHub repository.

Links

In the chapter

Further Reading