TOC-recommended.md

Overview

Azure HDInsight and Hadoop Architecture

Release notes

Get Started

How To

Import and export data

Batch process data

Use Hadoop for batch processing

Use Hive for batch queries

Use Pig for batch processing

Use Spark for batch processing

ACTION: TODO: Where are the ADLS/WASB paths covered?

Use Spark SQL for batch queries

Interactively query data

Use Spark with notebooks

Process data in real time

Use Spark for stream processing

Use BI tools with HDInsight

Build data processing pipelines

Use Azure Data Factory

Use Oozie

Perform Machine Learning

Use R Server

Use Spark for Machine Learning

ACTION: MIGRATE suggest migrating content from here

Perform Deep Learning

Use HBase

Use Phoenix

Use Storm

Use domain-joined HDInsight (Preview)

Use Kafka (Preview)

Develop

Develop MapReduce programs

Develop Hive applications

Hive samples

Develop Spark applications

Spark samples

Develop Machine Learning solutions with HDInsight

SparkML samples

Spark MLlib samples

R Server on HDInsight samples

Mahout on HDInsight samples

Serialize and deserialize data

Analyze big data

Deep Dives

Extend clusters

Build HDInsight applications

Secure

Manage

Manage Clusters

Manage Linux Clusters

Troubleshoot

Troubleshooting Spark on HDInsight

Reference

Related

Migrating from Windows clusters

Resources