Big Data Hadoop Course in Chandigarh




Hadoop is a free, open-source, Java-based framework for processing large data sets in parallel across a distributed cluster. It is a project of the Apache Software Foundation. Hadoop makes it possible to run applications on clusters with thousands of nodes and thousands of terabytes of data, a scale that is not feasible with traditional systems.

Course Detail

Duration: 6 Months
Version: Latest
Regular: 2 Hours per day
Weekends: 2 - 3 Hours per day
Weekdays: Monday - Friday
Weekend: Saturday and Sunday
Online Training: Not Available

Understanding Big Data

  • What is Big Data
  • Data Facts
  • Aspects / Principles of Big Data
  • Difference between Big Data & Traditional BI
  • Examples: Where to use Big Data
  • Big Data Business Opportunities
  • Distributed File System computation with Facebook Example

Understanding Linux

  • Understanding how the File System works
  • Basic Linux commands
  • Shell scripting
  • Use Cases - Assignments

Understanding Java

  • Introduction to OOP concepts
  • Understanding Data Types
  • Functions
  • Methods
  • Setup of Eclipse
  • Coding examples
  • Use Cases - Assignments

Understanding of Python

  • Concepts of Python
  • Data Types in Python
  • Exception Handling in Python
  • File Handling in Python

Hadoop 1.x Architecture

  • Understanding Hadoop Architecture
  • Basic understanding of Hadoop core components
  • In-depth understanding of HDFS
  • Understanding HDFS services - NameNode & DataNode
  • Understanding File System Read & Write
  • Real-time Cluster setup based on requirements

Hadoop 2.x Architecture

  • Understanding YARN Architecture
  • Architectural differences between Hadoop 1.x & Hadoop 2.x
  • Understanding File System Read & Write

Cluster Installation

  • Hadoop 1.x
    • Environment Settings
    • Pseudo Mode Installation
    • Distributed Mode Installation
    • Basic configuration of Hadoop properties
    • Understanding in-built scripts
    • Running Basic Map Reduce code
  • Hadoop 2.x
    • Environment Settings
    • Distributed Mode Installation
    • Configuration of Hadoop properties
    • Running Basic Map Reduce Code
  • Hadoop File system commands

Understanding of Map Reduce

  • Understanding of Map Reduce services - JobTracker & TaskTracker
  • Map Reduce Flow Chart
  • Map Reduce Phases
    • Mapper
    • Reducer
    • Splitting
    • Sorting
    • Shuffling
    • Combiner
    • Partitioning
  • Developing Map Reduce applications - JAVA Code
  • Developing Map Reduce applications - Python Code
  • Discussion on Input File Formats
  • Difference between Old MR API & New MR API
  • Use Cases - Assignments
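
The phases listed above can be illustrated with a small in-memory word count in plain Python. This is only a teaching sketch, not Hadoop code: the function names (`mapper`, `combiner`, `partition`, `reducer`, `word_count`) and the partitioning rule are assumptions chosen for illustration, and a real job would run on a cluster through the Hadoop MapReduce API or Hadoop Streaming.

```python
from collections import defaultdict


def mapper(line):
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def combiner(pairs):
    """Combiner: pre-aggregate counts locally, per split, before the shuffle."""
    local = defaultdict(int)
    for word, count in pairs:
        local[word] += count
    return list(local.items())


def partition(word, num_reducers):
    """Partitioning: pick the reducer for a key (illustrative rule only)."""
    return sum(word.encode("utf-8")) % num_reducers


def reducer(word, counts):
    """Reduce phase: sum all counts collected for one key."""
    return word, sum(counts)


def word_count(lines, num_splits=2, num_reducers=2):
    # Splitting: divide the input into chunks, one per map task.
    splits = [lines[i::num_splits] for i in range(num_splits)]

    # Map + combine, run independently on each split.
    map_outputs = []
    for split in splits:
        pairs = [pair for line in split for pair in mapper(line)]
        map_outputs.append(combiner(pairs))

    # Shuffling: route every pair to its reducer's bucket, grouped by key.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for output in map_outputs:
        for word, count in output:
            buckets[partition(word, num_reducers)][word].append(count)

    # Sorting + reduce: each reducer handles its keys in sorted order.
    result = {}
    for bucket in buckets:
        for word in sorted(bucket):
            key, total = reducer(word, bucket[word])
            result[key] = total
    return result
```

For example, `word_count(["big data big", "data hadoop", "big"])` counts `big` three times, `data` twice, and `hadoop` once. The combiner does not change the result; it only reduces how many pairs have to be moved during the shuffle.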

HIVE - Data Warehouse

  • Introduction to HIVE Architecture
  • Setup of Hive
  • Basic queries in HIVE
  • Advanced Features of HIVE
    • Partitioning
    • Bucketing
    • Serialization & Deserialization (SerDe)
  • Query optimization in Hive
  • Use Cases - Assignments

PIG - Data Flow Language

  • Introduction to PIG Latin
  • Setup of Pig
    • Local Mode
    • Map Reduce Mode
  • Basic commands in Pig
  • Functions in Pig
  • Developing UDFs in Java
  • Use Cases - Assignments

SQOOP

  • Introduction to Sqoop
  • Setup of Sqoop
  • Sqoop Import commands
  • Sqoop Export commands
  • Formats in Sqoop
  • Use Cases - Assignments

FLUME

  • Introduction to Flume
  • Setup of Flume Components
    • Source
    • Sink
    • Channel
    • Agents
  • Use Cases - Assignments

SPARK

  • Introduction to SPARK
  • Understanding of RDD, Contexts
  • Developing Application in SPARK
  • Use Cases - Assignments

Understanding Big Data

  • What is Big Data
  • Data Facts
  • Aspects / Principles of Big Data
  • Difference between Big Data & Traditional BI
  • Examples: Where to use Big Data
  • Big Data Business Opportunities
  • Distributed File System computation with Facebook Example

Understanding AWS

  • Initializing EC2 Instances
  • Creating Volumes & Snapshots
  • Security Groups
  • Creating AMIs
  • Elastic IPs
  • Difference between Dedicated & Reserved Instances
  • Use Cases - Assignments

Understanding Linux

  • Installation of VMware
  • Installation of Linux
  • Understanding how the File System works
  • Basic Linux commands
  • Shell scripting
  • Use Cases - Assignments

Hadoop 1.x Architecture

  • Understanding Hadoop Architecture
  • Basic understanding of Hadoop core components - HDFS & Map Reduce
  • Understanding Hadoop services
  • Networking Concepts
  • Understanding File System Read & Write
  • Real-time Cluster setup based on requirements

Hadoop 2.x Architecture

  • Understanding YARN Architecture
  • Architectural differences between Hadoop 1.x & Hadoop 2.x
  • Understanding File System Read & Write
  • Understanding Containers & Application Masters

Cluster Installation

  • Hadoop 1.x
    • Environment Settings
    • Pseudo Mode Installation
    • Distributed Mode Installation
    • Basic configuration of Hadoop properties
    • Understanding in-built scripts
    • Running Basic Map Reduce code
  • Hadoop 2.x
    • Environment Settings
    • Distributed Mode Installation
    • Configuration of Hadoop properties
    • Running Basic Map Reduce Code
  • Hadoop File system commands

Understanding HDFS commands

  • Hadoop File system commands
  • File System check
  • File System Reporting
  • Understanding Web UI of Hadoop

Performance Tuning

  • Setup of Block Size
  • Setup of Replication
  • Buffer Size
  • Scheduler Options
  • Logging
  • Safe Mode
  • Configuring Hadoop ports

Setup of Hadoop Components with discussion of Basic functionality

  • HIVE
  • SQOOP
  • HBASE
  • PIG
  • FLUME
  • NiFi