Here is what you will learn in this chapter:
- Revisiting the Medallion architecture pattern
- Transforming data to Delta with Auto Loader
- Delta Live Tables starting with Bronze
- Maintaining and optimizing Delta Tables
- Applying our learning
The Databricks ML Runtime comes with many libraries for machine learning and data science pre-installed, so we will use clusters running an ML Runtime.
Further Reading
- Use liquid clustering for Delta tables
- Spark Structured Streaming
- Delta Live Tables
- DLT Databricks Demo
- Auto Loader options
- Schema evolution with Auto Loader
- Common loading patterns with Auto Loader
- Stream processing with Apache Kafka and Databricks
- Blog: Context length in LLMs: All you need to know
- How We Performed ETL on One Billion Records For Under $1 With Delta Live Tables
- Create tables - Managed vs External
- Take full advantage of the auto-tuning available
- Import Python modules from Databricks repos
- Deletion Vectors
- Databricks ML Runtime
- Cluster advanced options
- Deploy provisioned throughput Foundation Model APIs
- Scaling Deep Learning Using Delta Lake Storage Format on Databricks
- DeltaTorchLoader