Big Data Hadoop Developer Training Chennai

(Hadoop, Spark, NoSQL, Cloud)

Duration: 60 hrs + 15 hrs

Big data hadoop

Module 1

Introduction to Big Data

Characteristics

The Why, How and What of Big Data

Existing OLTP, ETL, DWH, OLAP

Module 2

Introduction to Hadoop Ecosystem

Architecture-HDFS

Sharding, Distribution and Replication factor (SDR)

Daemons

MapReduce (MRv1) and YARN

Hadoop v1 and v2

Hadoop Data Federation

Module 3

Prerequisite for Installation

Single-node, pseudo-distributed and multinode cluster

Virtual machine using Linux (Ubuntu/CentOS)

Installation of Hadoop in the cloud (Azure/AWS)

Installation of Java, SSH, Eclipse

Installation and configuration of Hadoop, HDFS, Daemons, YARN Daemons

High Availability (Active and Standby)

Automatic and manual failover

Hadoop fs shell commands

Writing Data to HDFS

Reading Data from HDFS

Module 4

Rack awareness policy and Replica placement Strategy

Failure Handling

Namenode

Datanode

Block-Safe mode

Rebalancing and load optimization

Troubleshooting and error rectification

Hadoop fs shell commands, Unix and Java basics

Assessment 1

Module 5

Introduction to MapReduce

Architecture of MapReduce

Execution of MapReduce in YARN

App Master, Resource Manager and Node Manager

Input format, input split and key-value pairs

Classes and methods of the MapReduce paradigm

Mapper

Reducer

Partitioner

Custom and Default partition

Shuffle and Sort

Combiner-Scheduler

App Master/Manager

Container-Node manager

Module 6

MapReduce hands-on

Word count program / log analytics

Hadoop streaming in R/Python

Data processing Transformations

Map-only jobs and uber jobs

Inverted index and searches
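
As a rough illustration of the word count exercise listed in this module, the sketch below mimics the map, shuffle-and-sort and reduce phases in plain Python, in the style of a Hadoop streaming job. It runs locally without a cluster; the function names and the sample lines are assumptions made for this example and are not taken from the course material.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit a (word, 1) pair for every word, as a streaming mapper would."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def reducer(sorted_pairs):
    """Sum the counts of each word; the input must already be sorted by key."""
    for word, group in groupby(sorted_pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def word_count(lines):
    # sorted() stands in for the shuffle-and-sort phase between map and reduce.
    return dict(reducer(sorted(mapper(lines))))


# Hypothetical sample input, invented for illustration.
sample = ["big data hadoop", "hadoop spark", "big data"]
counts = word_count(sample)
print(counts)
```

In an actual streaming job the mapper and reducer would typically be separate scripts that read from standard input and write tab-separated pairs to standard output, with Hadoop performing the sort between them.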

Module 7

MR Programs 2

Structured and Unstructured Data handling

Optimizing using Combiner

Partitioner

Single and multiple column

Inverted Index

XML (semi-structured)

Map-side joins

Reduce-side joins
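
One way to picture the reduce-side join topic above: each mapper tags its records with the source they came from, the shuffle groups them by join key, and the reducer pairs the two sides. The plain-Python sketch below follows that idea under simplifying assumptions; the customer and order records, the "C"/"O" tags and all function names are hypothetical.

```python
from collections import defaultdict


def map_customers(records):
    """Tag each customer record with "C", keyed by customer id."""
    for cust_id, name in records:
        yield cust_id, ("C", name)


def map_orders(records):
    """Tag each order record with "O", keyed by customer id."""
    for order_id, cust_id, amount in records:
        yield cust_id, ("O", (order_id, amount))


def reduce_join(tagged_pairs):
    """Group by key (the shuffle), then pair customers with their orders."""
    grouped = defaultdict(list)
    for key, value in tagged_pairs:
        grouped[key].append(value)
    for key in sorted(grouped):
        names = [v for tag, v in grouped[key] if tag == "C"]
        orders = [v for tag, v in grouped[key] if tag == "O"]
        for name in names:
            for order_id, amount in orders:
                yield key, name, order_id, amount


# Hypothetical sample data, invented for illustration.
customers = [(1, "Asha"), (2, "Ravi")]
orders = [(100, 1, 250.0), (101, 2, 75.5), (102, 1, 40.0)]

tagged = list(map_customers(customers)) + list(map_orders(orders))
joined = list(reduce_join(tagged))
print(joined)
```

A map-side join avoids the shuffle by loading the smaller dataset into memory on every mapper, which is usually preferable when one side is small enough to fit.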

Module 8

Introduction to Hive Data warehouse

Installation of Hive and metastore database

Configure metastore to MySQL

HiveQL Commands

Module 9

Manipulation and analytical functions in Hive

Managed table and external tables

Partitioning and Bucketing

Complex data types and Unstructured data

Advanced HQL commands

UDF and UDAF

Integration with HBase

SerDe / Regular Expression

File formats: Parquet, SequenceFile, RCFile, ORC file

Assessment 2

Module 10

Introduction to Pig

Installation-Bags and collections

Commands and Scripts

Pig UDF

Module 11

Introduction to NoSQL

ACID/CAP/BASE

Key value pair

Map reduce

Column family

HBase

Document

MongoDB

Graph DB

Neo4j

Module 12

Introduction to HBase and installation

The HBase Data Model

The HBase Shell

HBase Architecture

Schema Design

The HBase API

HBase Configuration and Tuning

Module 13

Ingest data from RDB

Introduction to Sqoop and installation

Import and export data from and to RDB

Bulk loading, Incremental load, Split by, Conditional query

Sqoop validation and jobs

Module 14

Ingest streaming data

Flume Architecture

Agent, source, sink, channel

Ingest log file

Collecting data from Twitter for sentiment analysis

Assessment 3

Module 15

Integrate With ETL

Talend Big data edition – Components of big data

Module 16

Big data Analytics

Dimensional modelling

Data Visualization

Tableau – Hive and Spark SQL connectors

Module 17

Spark core and Components

Spark Shell

Create RDD from HDFS/local

Creating new RDD-Transformations on RDD

Lineage Graph – DAG

Actions on RDD

RDD Concepts on Persist and Cache-Lazy evaluation of RDD

Hands on and core concepts of map() transformation

Hands on and core concepts of filter() transformation

Hands on and core concepts of flatMap() transformation

Compare map and flatMap transformation

Hands on and core concepts of reduce() action

Hands on and core concepts of fold() action

Hands on and core concepts of aggregate() action

Basics of Accumulator

Hands on and core concepts of collect() action

Hands on and core concepts of take() action

Apache Spark Execution Model

How Spark executes a program

Concepts of RDD partitioning

RDD data shuffling and performance issues
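
To give a feel for the map(), flatMap(), filter(), reduce() and fold() topics in this module, the sketch below reproduces their behaviour on an ordinary Python list. It is a local analogy only, not Spark code: no RDD is created, and the sample lines and variable names are assumptions made for this example.

```python
from functools import reduce
from itertools import chain

# Hypothetical sample input, invented for illustration.
lines = ["big data", "hadoop spark nosql"]

# map: exactly one output element per input element (here, a list of words).
mapped = list(map(str.split, lines))

# flatMap: each input may yield many outputs, flattened into one sequence.
flat_mapped = list(chain.from_iterable(map(str.split, lines)))

# filter: keep only the elements that satisfy a predicate.
long_words = [w for w in flat_mapped if len(w) > 4]

# reduce: combine all elements pairwise into a single value.
lengths = [len(w) for w in flat_mapped]
total = reduce(lambda a, b: a + b, lengths)

# fold: like reduce, but starts from an explicit zero value.
folded = reduce(lambda a, b: a + b, lengths, 0)

print(mapped, flat_mapped, long_words, total, folded)
```

One difference worth keeping in mind: on a real cluster, Spark's fold applies the zero value once per partition and again when merging partition results, so the zero value should be neutral for the operation (0 for addition, 1 for multiplication).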

Module 18

DataFrames and Datasets

Spark SQL

PySpark

Module 19

Spark jobs

Build Scala program using SBT/Maven

spark-submit and Spark application

Module 20

Kafka – Publisher/Subscriber

Consumer and producer

Module 21

HUE

Monitoring and scheduling

Module 22

Oozie – Workflow and Coordinator

Module 23

Distribution Installation or Sandbox

Cloudera – Cloudera Manager

Hortonworks – Ambari Server

MapR – MCS

Module 24

Introduction to Data Science

Machine learning

Statistical analysis

Sentiment analysis

Module 25

Multinode cluster setup

High Availability

Hadoop data federation

Commissioning and decommissioning

Automatic and manual failover

ZooKeeper failover controller

Module 26

Use cases, case studies and Proof of Concept

Working on different distributions

Module 27 (Certification guidance)

Big Data Hadoop Developer certification

CCA Spark and Hadoop Developer Exam (CCA175)

CCP Data Engineer (DE575)

HDPCD Certification

HDP Certified Apache Spark Developer
