Course Details

Hadoop 2.X - Bigdata Analytics

Fundamentals

1.     Java

·         Overview of Java

·         Classes and Objects

·         Garbage Collection and Modifiers

·         Inheritance, Aggregation, Polymorphism

·         Command-line arguments

·         Abstract class and Interfaces

·         String Handling

·         Exception Handling, Multithreading

·         Serialization and Advanced Topics

·         Collection Framework, GUI, JDBC

2.     Linux

·         Unix History & Overview

·         Command line file-system browsing

·         Bash/Korn Shell

·         Users, Groups and Permissions

·         VI Editor

·         Introduction to Process

·         Basic Networking

·         Shell Scripting live scenarios

3.     SQL

·         Introduction to SQL, Data Definition Language (DDL)

·         Data Manipulation Language (DML)

·         Operator and Sub Query

·         Various Clauses, SQL Key Words

·         Joins, Stored Procedures, Constraints, Triggers

·         Cursors, Loops, IF/ELSE, TRY/CATCH, Index

·         Data Manipulation Language (Advanced)

·         Constraints, Triggers

·         Views, Index Advanced

Hadoop - Basic

1.     Introduction to Big Data

·         Introduction and relevance

·         Uses of Big Data analytics in industries such as Telecom, E-commerce, Finance and Insurance

·         Problems with Traditional Large-Scale Systems

2.     Hadoop (Big Data) Ecosystem

·         Motivation for Hadoop

·         Different types of projects by Apache

·         Role of projects in the Hadoop Ecosystem

·         Key technology foundations required for Big Data

·         Limitations and Solutions of existing Data Analytics Architecture

·         Comparison of traditional data management systems with Big Data management systems

·         Evaluate key framework requirements for Big Data analytics

·         Hadoop Ecosystem & Hadoop 2.x core components

·         Explain the relevance of real-time data

·         Explain how to use big and real-time data as a Business planning tool

3.     Building Blocks

·         Quick tour of Java (Hadoop is written in Java, so this helps in understanding it better)

·         Quick tour of Linux commands (basic commands to navigate the Linux OS)

·         Quick tour of RDBMS concepts (to use Hive and Impala)

·         Quick hands-on experience with SQL

·         Introduction to Cloudera VM and usage instructions

4.     Hadoop Cluster Architecture – Configuration Files

·         Hadoop Master-Slave Architecture

·         The Hadoop Distributed File System - data storage

·         Explain different types of cluster setups (Fully distributed/Pseudo etc.)

·         Hadoop Cluster Setup - Installation

·         Hadoop 2.x Cluster Architecture

·         A Typical enterprise cluster – Hadoop Cluster Modes

5.     Hadoop Core Components – HDFS & MapReduce (YARN)

6.     HDFS Overview & Data storage in HDFS

·         Getting data into Hadoop from the local machine and vice versa (Data Loading Techniques)

·         MapReduce Overview (traditional way vs. MapReduce way)

·         Concept of Mapper & Reducer

·         Understanding MapReduce Program Skeleton

·         Running a MapReduce job from the command line/Eclipse

·         Develop a MapReduce Program in Java

·         Develop a MapReduce Program with the streaming API

·         Test and Debug a MapReduce Program at Design Time

·         How Partitioners and Reducers Work Together

·         Writing Custom Partitioners

·         Data Input and Output

·         Creating Custom Writable and Writable Comparable Implementations

7.      Data Integration Using Sqoop and Flume

·         Integrating Hadoop into an Existing Enterprise

·         Loading Data from an RDBMS into HDFS by Using Sqoop

·         Managing Real-Time Data Using Flume

·         Accessing HDFS from Legacy Systems with FuseDFS and HttpFS

·         Introduction to Talend (community system)

·         Data loading to HDFS using Talend

8.     Data Analysis using PIG

·         Introduction to Hadoop Data Analysis Tools

·         Introduction to Pig - MapReduce vs. Pig, Pig Use Cases

·         Pig Latin Program & Execution

·         Pig Latin: Relational Operators, File Loaders, Group Operator, COGROUP Operator, Joins and COGROUP, Union, Diagnostic Operators, Pig UDF

·         Use Pig to automate the design and implementation of MapReduce applications

·         Data Analysis using PIG

9.     Data Analysis using HIVE

·
