Hadoop Security Guide outline

Roman V Shaposhnik edited this page Mar 14, 2017 · 5 revisions

This is based on:

Outline #1 (scroll below for a different outline based on the Hortonworks guide)

  1. Security Overview

    1. Introduction to Security
    2. Potential Security Risks
    3. Managing Security
    4. Vulnerability Assessment
      1. Thinking Like the Enemy
      2. Defining Assessment and Testing
      3. Evaluating the Tools
    5. Security Threats
      1. Threats to Data
      2. Threats to ...
    6. Common exploits and attacks
    7. What to do when you get exploited
  2. Securing Ambari

    1. Setting up HTTPS with a self-signed certificate for the Ambari web interface
    2. Setting up HTTPS with an authority certificate for the Ambari web interface
    3. Setting up two-way SSL between the Ambari server and Ambari agents
  3. Securing your Hadoop Cluster

    1. User and group management
    2. Authentication
    3. Authorization
    4. Kerberos
      1. Enabling Hadoop user interface security through SPNEGO
    5. ACL
      1. ACL Management for HDFS
      2. ACL Management for YARN
      3. Hive authorization
    6. Tag based policies
    7. Key Management (Ranger KMS)
  4. Securing your Hadoop Cluster perimeter

    1. Knox
    2. Manual configuration of security (SSL, etc.) for REST APIs
      1. Ambari
      2. WebHDFS/HttpFS
      3. YARN
      4. Hive
      5. Oozie
    3. Securing JDBC connection via SSL support for HiveServer2
  5. Securing your Hadoop Cluster "control plane"

    1. Configuring SSL support for
      1. HDFS
      2. YARN (including Job History)
      3. Ambari
  6. Data Protection

    1. Data at Rest
      1. Enabling transparent data encryption
    2. Data in Motion
  7. General Practices

    1. Managing security across the cluster (Ranger)
      1. Enabling and Verifying PAM authentication for Ranger
    2. Secure Gateway (Knox)
    3. Separation of Duties
  8. Secure Install

    1. Security Technical Implementation Guide (STIG)
    2. OpenSCAP
    3. SELinux
    4. Private IP
    5. Non-root install
    6. Disk partitioning
    7. Umask
  9. Hadoop securely talking to non-Hadoop services

    1. Enabling Hadoop services to use a credential keystore file
  10. Auditing

    1. Audit Facility
    2. Managing Audit Log
  11. Cross-component data lineage

    1. Falcon
  12. Federal Standards and Regulations

    1. Introduction
    2. HIPAA, PCI, STIG, ...
  13. Extending Hadoop security beyond open source capabilities

    1. Protegrity?
  14. Writing secure applications for Hadoop

    1. Delegation tokens in Apache Hadoop
    2. Take a look at Slider and how they handle long running services
  15. Appendix

    1. Encryption Standards
      1. Symmetric Encryption
      2. Public-key Encryption
    2. Audit System Reference
      1. Audit Event Fields
      2. Audit Record Types

Outline #2 (this one is based on the Hortonworks guide)

  1. HDP Security Overview
  2. Understanding Data Lake Security
  3. HDP Security Features

    1. Administration
    2. Authentication and Perimeter Security
    3. Authorization
    4. Audit
    5. Data Protection
  4. Authentication
  5. Enabling Kerberos Authentication Using Ambari
  6. Configuring Ambari Authentication with LDAP or AD
  7. Advanced Security Options for Ambari
  8. Enabling SPNEGO Authentication for Hadoop
  9. Setting Up Kerberos Authentication for Non-Ambari Clusters
  10. Perimeter Security with Apache Knox
  11. Configuring Authorization in Hadoop
  12. Installing Ranger Using Ambari
  13. Using Ranger to Provide Authorization in Hadoop
  14. Data Protection: Wire Encryption
  15. Enabling RPC Encryption
  16. Enabling Data Transfer Protocol
  17. Enabling SSL: Understanding the Hadoop SSL Keystore Factory
  18. Creating and Managing SSL Certificates
  19. Enable SSL for WebHDFS, HttpFS, MapReduce Shuffle, and YARN
  20. Enable SSL for HttpFS
  21. Enable SSL on HiveServer2
  22. SPNEGO setup for WebHCat
  23. Configure SSL for Knox
  24. Set Up SSL for Ambari
  25. Configure Ambari Ranger SSL
  26. Configure Non-Ambari Ranger SSL
  27. Connecting to SSL-Enabled Components
  28. Data Protection: HDFS Encryption
  29. Ranger KMS Administration Guide
  30. HDFS "Data at Rest" Encryption
  31. Auditing in Hadoop
  32. Using Apache Solr for Ranger Audits
  33. Manually Enabling Audit Settings in Ambari Clusters
  34. Enabling Audit Logging in Non-Ambari Clusters
  35. Managing Auditing in Ranger