Administrator Training for Apache Hadoop


TRAINING SHEET


Take your knowledge to the next level with Cloudera’s Apache Hadoop Training and Certification

Cloudera University’s three-day administrator training course for Apache Hadoop provides System Administrators a comprehensive understanding of all the steps necessary to operate and manage Hadoop clusters. From installation and configuration, through load balancing and tuning your cluster, Cloudera’s Administration course has you covered. Through lecture and interactive, hands-on exercises, attendees will cover topics such as:

• Introduction to Apache Hadoop and HDFS
• Apache Hadoop architecture
• Proper cluster configuration and deployment
• Populating HDFS using Apache Sqoop
• Management and monitoring tools
• Job scheduling
• Best practices for maintaining Apache Hadoop in Production
• Installing and managing other Apache Hadoop projects
• Diagnosing, tuning and solving Apache Hadoop issues

“Cloudera Administrator Training for Apache Hadoop helped me to advance my use of Apache Hadoop and cultivate a better understanding of the platform’s inner workings. The course material, interactive labs and exercises really helped cement together all the little bits and pieces that I had bumped into prior to the class into a useful mental model of how Apache Hadoop works.”
Eric Marshall, Senior System Administrator

Audience

This course is designed for people with at least a basic level of Linux system administration experience. Prior knowledge of Hadoop is not required.

Cloudera, Inc. 210 Portage Avenue, Palo Alto, CA 94306 USA | 1-888-789-1488 or 1-650-362-0488 | cloudera.com ©2011 Cloudera, Inc. All rights reserved. Cloudera and the Cloudera logo are trademarks or registered trademarks of Cloudera Inc. in the USA and other countries. All other trademarks are the property of their respective companies. Information is subject to change without notice.


Course Outline: Cloudera Administrator Training for Apache Hadoop

• An Introduction To Hadoop And HDFS
  o Why Hadoop?
  o HDFS
  o MapReduce
  o Hive, Pig, HBase and other ecosystem projects
  o Hands-On Exercise: Installing a pseudo distributed cluster
• Planning Your Hadoop Cluster
  o General Planning Considerations
  o Choosing The Right Hardware
  o Node Topologies
  o Choosing The Right Software
• Deploying Your Cluster
  o Installing Hadoop
  o Using SCM Express for easy installation
  o Typical Configuration Parameters
  o Configuring Rack Awareness
  o Using Configuration Management Tools
  o Hands-On Exercise: Installing a Hadoop Cluster
• Managing and Scheduling Jobs
  o Starting and stopping MapReduce jobs
  o Hands-On Exercise: Managing jobs
  o The FIFO Scheduler
  o The Fair Scheduler
  o Hands-On Exercise: Using the FairScheduler
• Cluster Maintenance
  o Checking HDFS with fsck
  o Hands-On Exercise: Breaking the Cluster
  o Copying data with distcp
  o Rebalancing cluster nodes
  o Adding and removing cluster nodes
  o Hands-On Exercise: Verifying the Cluster’s Self Healing Features
  o Backup And Restore
  o Upgrading and Migrating
  o Hands-On Exercise: Backing Up and Restoring the NameNode Metadata
• Cluster Monitoring, Troubleshooting and Optimizing
  o Hadoop Log Files
  o Using the NameNode and JobTracker Web UIs
  o Interpreting Job Logs
  o Monitoring with Ganglia
  o Other monitoring tools
  o General Optimization Tips
  o Benchmarking Your Cluster
• Populating HDFS From External Sources
  o Using Sqoop
  o Using Flume
  o Best Practices for Data Ingestion
• Installing And Managing Other Hadoop Projects
  o Hive
  o Pig
  o HBase
  o Hands-On Exercise: Configuring the Hive Shared Metastore
• Cloudera Certified Administrator Exam

Cloudera Certified Administrator for Apache Hadoop (CCAH)

Establish yourself as a trusted and valuable resource by completing the online certification exam for Apache Hadoop Administrators. The exam is demanding and is designed to test your fluency with concepts and terminology in the following areas:

Apache Hadoop Cluster Overview

Daemons and normal operation of an Apache Hadoop cluster, both in data storage and in data processing. The current features of computing systems that motivate a system like Apache Hadoop

Apache Hadoop Cluster Planning

Principal points to consider in choosing the hardware and operating systems to host an Apache Hadoop cluster

Apache Hadoop Cluster Management

Cluster handling of disk and machine failures. Regular tools for monitoring and managing the Apache Hadoop file system

Job Scheduling

How the default scheduler and the fair scheduler handle the tasks in a mix of jobs running on a cluster

Monitoring and Logging

Basic functionality and features of Apache Hadoop’s logging and monitoring systems
