Hadoop Jump Start
This course provides a well-grounded introduction to Hadoop and its building blocks – HDFS, Map/Reduce, Pig, Hive, HBase, Sqoop and Flume. Anyone with a basic understanding of programming and databases (SQL) can benefit from it. Prior knowledge of Java is useful but not essential. As part of the lab work, students build their own development cluster and implement a wide range of technical use cases.
The environment can also be used for further learning and experimentation after the course. Depending on their prior background and experience, students should be able to perform basic tasks on a Hadoop project.
Register Now
or call us now on +91 9850033661
Highlights of Hadoop Jump Start Training
- 32 hours of top-quality instructor-led training (16 hours of theory and 16 hours of lab work)
- Instructor with 25+ years of experience
- 100+ Hadoop workshops completed
- 2000+ professionals trained
- 20+ hands-on exercises, culled from real-world examples
- Set up your own cluster and work on a production-grade 7-node cluster
- Ideal groundwork for professionals aiming for Hadoop certifications
- Covers 50 top interview questions and includes one practical exam
Who Should Attend
- Senior Managers, Managers and Team Leaders responsible for Big Data analytics
- Professionals seriously considering Data Analytics as a career option
- Project Managers who would like to bid for Data Analytics projects and lead the delivery teams
- IT administrators likely to be tasked with the responsibility of Hadoop installation and maintenance.
- Marketing and Sales professionals who would like to understand what Big Data, Data Analytics and Machine Learning are all about – and package such offerings
Exercises
- Setting up a multi-node Apache Hadoop cluster from scratch
- Performing file I/O using HDFS
- Implementing an end-to-end data pipeline with Hive
- Creating User Defined Functions in Hive
- Working with HBase Shell and loading data from Hive
- Ingesting sensor data and log files using Flume
- Importing/exporting data from various RDBMSs using Sqoop
Training Benefits
BENEFITS FOR INDIVIDUALS
- Get a good grip on Big Data and Data Analytics
- Gain the ability to identify and classify data analysis problems – and identify solutions
- Hands-on introduction to Hadoop software components and their applications
- Apply the acquired concepts to practical situations – such as Business Data Analysis – to support decision making
- Get a practical introduction to the Hadoop ecosystem components
BENEFITS FOR ORGANIZATIONS
- Build a team that can harness available data for business growth
- Offer data-oriented products and services
- Create experts who can spread their learning further across the organisation
Curriculum
Introduction to Big Data Analytics
- What is Big Data? – The 3V Paradigm
- Limitations of Conventional Technologies
- Essentials of Distributed Computing
- Introduction to Hadoop & Its Ecosystem
Hadoop Distributed File System (HDFS)
- HDFS Architecture
- Anatomy of File Read/Write
- Fault Tolerance in HDFS
Setting Up a Hadoop Cluster
- Exercise : Installing & Configuring Hadoop 2.7.1 Cluster
- Exercise : Working with HDFS through Shell & Web UI
Map/Reduce
- Map/Reduce Concepts
- Map/Reduce Job Execution Lifecycle
- Exercise: Running a Map/Reduce Job – Word Count Example
- Map/Reduce API Overview
- Exercise: Using Eclipse to Build Map/Reduce Applications
- Exercise: Deploying Map/Reduce Jobs on the Cluster
- Advanced Map/Reduce Examples
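To give a taste of the word-count exercise above, the map and reduce phases that Hadoop distributes across a cluster can be sketched in plain Python. This is a toy, single-machine illustration of the programming model only – not the actual Hadoop Java API used in the lab:

```python
from itertools import groupby
from operator import itemgetter

def mapper(line):
    # Map phase: emit a (word, 1) pair for every word in the input line
    for word in line.lower().split():
        yield (word, 1)

def reducer(word, counts):
    # Reduce phase: sum all counts emitted for the same word
    return (word, sum(counts))

def word_count(lines):
    # Shuffle/sort step: group intermediate pairs by key, as Hadoop does
    pairs = sorted(p for line in lines for p in mapper(line))
    return dict(reducer(k, (c for _, c in g))
                for k, g in groupby(pairs, key=itemgetter(0)))

print(word_count(["big data big wins", "big data"]))
# → {'big': 3, 'data': 2, 'wins': 1}
```

On a real cluster, the mapper and reducer run as separate tasks on many nodes, and the framework handles the shuffle, sort and fault tolerance.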
Pig
- Pig Introduction & Basic Concepts
- Pig Latin Language Overview
- Exercise: Analyzing Stock Market Data using Pig Latin
- Exercise: Working with Complex Data Types
Hive
- Hive Basics & Architecture
- Hive Query Language
- Exercise: Working with Hive
- Exercise: Analyzing Weather Data using HiveQL
Advanced Hive
- Hive Formats & SerDes
- Exercise: Working with ORC, XML & RegEx SerDes
- Exercise: Optimizing Hive Queries using Partitions & Clusters
- Overview of Hive Functions
- Exercise: Creating User Defined Functions (UDFs)
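The partitioning exercise above rests on one idea: Hive stores a partitioned table as one directory per partition-key value, so a query that filters on the key reads only the matching partitions ("partition pruning"). A toy Python sketch of that idea, with illustrative names that are not part of any Hive API:

```python
# Toy illustration of Hive-style partitioning: rows are bucketed by a
# partition key, mimicking one directory per distinct key value on HDFS.
def partition_by(rows, key):
    parts = {}
    for row in rows:
        parts.setdefault(row[key], []).append(row)
    return parts

# Hypothetical weather rows, echoing the weather-data exercise
weather = [
    {"year": 2015, "city": "Pune",   "temp": 31},
    {"year": 2016, "city": "Pune",   "temp": 33},
    {"year": 2016, "city": "Mumbai", "temp": 30},
]

parts = partition_by(weather, "year")
# A query with WHERE year = 2016 scans only that partition:
print(len(parts[2016]))  # → 2 rows read instead of all 3
```

The lab exercise applies the real mechanism (`PARTITIONED BY` in the table DDL) to speed up the same kind of filtered query.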
Introduction to HBase
- Need for Low-Latency Queries
- Introduction to HBase & NoSQL Databases
- HBase Data Model & Architecture
- Exercise: Working with the HBase Shell
- Role of ZooKeeper
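The HBase data model covered above is essentially a sparse, sorted map: row key → column family → column qualifier → value. A hypothetical Python dict gives the intuition (real access goes through the HBase shell or a client API, and the row keys here are invented for illustration):

```python
# Toy model of an HBase table: row key -> column family -> qualifier -> value.
# Rows are kept sorted by key; missing columns simply don't exist (sparse).
table = {
    "sensor-001#20160101": {
        "reading": {"temp": "31.4", "humidity": "40"},
        "meta":    {"location": "Pune"},
    },
    "sensor-001#20160102": {
        "reading": {"temp": "29.8"},  # sparse: no humidity cell stored
    },
}

# Lookup by row key is the primary, low-latency access path
row = table["sensor-001#20160101"]
print(row["reading"]["temp"])  # → 31.4
```

Designing the row key (here a sensor id plus date) is what makes the low-latency lookups from the exercise possible, since HBase can seek straight to the sorted key.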
Sqoop
- Using Sqoop to Extract Data from MySQL
- Exercise: Loading Data into HDFS, Hive & HBase in Various Formats
Flume
- Flume Architecture & Data Model
- Configuring Flume Agents to Build Custom Data Flows
- Exercise: Ingesting Sensor Data into HDFS
- Exercise: Aggregating Weblogs into HDFS
Building an End-to-End Hadoop Application
- Exercise: Running HQL Queries through a JDBC Client
- Exercise: Reading from and Writing to HBase
- Exercise: Hive-HBase Integration
- Exercise: Displaying the Results on a Dashboard