OSHA Open Source Hadoop Administration

This open source course gives participants a comprehensive understanding of the steps needed to install, configure, operate, and maintain Hadoop. The course begins with an overview of the Big Data landscape and then moves to a system administrator's working view of running Hadoop.

Contact us: Kurs@sgpartner.no

Course code: OSHA

COURSE OBJECTIVE:
Upon successful completion of this course, participants should be able to:
• Describe the fundamental concepts of using Big Data
• Identify where Hadoop fits into a Big Data strategy
• Plan a Hadoop cluster
• Describe HDFS features
• Get data into HDFS
• Work with MapReduce
• Install and configure Hadoop
• Perform cluster maintenance

 

TARGET AUDIENCE:
This course is intended for system administrators, DevOps engineers, and software developers responsible for managing and maintaining Hadoop clusters.

COURSE PREREQUISITES:
Not available. Please contact us.

COURSE CONTENT:
The content of this course is designed to support the course objectives.

Hadoop Introduction
• A Brief History of Hadoop
• Core Hadoop Components
• Fundamental Concepts

Planning Your Hadoop Cluster
• General Planning Considerations
• Choosing Hardware
• Network Considerations
• Configuring Nodes
• Planning for Cluster Management

HDFS
• HDFS Features
• Writing and Reading Files
• NameNode Considerations
• HDFS Security
• NameNode Web UI
• Hadoop File Shell

Getting Data into HDFS
• Pulling Data from External Sources with Flume
• Importing Data from Relational Databases with Sqoop
• REST Interfaces
• Best Practices

MapReduce
• MapReduce Overview
• Features of MapReduce
• Architectural Overview
• YARN MapReduce Version 2
• Failure Recovery
• The JobTracker Web UI

Hadoop Installation & Initial Configuration
• Configuration & Deployment Types
• Installing Hadoop
• Specifying the Hadoop Configuration
• Initial HDFS & MapReduce Configuration
• Log Files

Installing/Configuring Hive, Impala, and Pig
• Hive
• Impala
• Pig

Hadoop Clients
• What is a Hadoop Client?
• Installing and Configuring Hadoop Clients
• Installing and Configuring Hue
• Hue Authentication and Configuration

Advanced Cluster Configuration
• Advanced Configuration Parameters
• Configuring Hadoop Ports
• Explicitly Including and Excluding Hosts
• Configuring HDFS for Rack Awareness & HDFS High Availability

Hadoop Security
• Why Hadoop Security Is Important
• Hadoop's Security System Concepts
• What Kerberos Is and How it Works
• Securing a Hadoop Cluster with Kerberos

Managing and Scheduling Jobs
• Managing Running Jobs
• Scheduling Hadoop Jobs
• Configuring the FairScheduler

Cluster Maintenance
• Checking HDFS Status
• Copying Data Between Clusters
• Adding/Removing Cluster Nodes
• Rebalancing the Cluster
• NameNode Metadata Backup
• Cluster Upgrades

Cluster Monitoring and Troubleshooting
• General System Monitoring
• Managing Hadoop's Log Files
• Monitoring the Clusters
• Common Troubleshooting Issues

FOLLOW ON COURSES:
Not available. Please contact us.
