Big Data Hadoop Developer Training Chennai

(Hadoop, Spark, NoSQL, Cloud)

Module 1

Introduction to Big Data

Characteristics

The Why, How and What of Big Data

Existing OLTP, ETL, DWH, OLAP

Module 2

Introduction to Hadoop Ecosystem

Architecture-HDFS

Sharding, Distribution and Replication factor (SDR)

Daemons

MapReduce (MRv1) and YARN

Hadoop v1 and v2

Hadoop Data federation

Module 3

Prerequisite for Installation

Single node, Pseudo-distributed and Multinode cluster

Virtual machine using Linux Ubuntu/CentOS

Installation of Hadoop in cloud (Azure/AWS)

Installation of Java, SSH, Eclipse

Installation and configuration of Hadoop, HDFS, Daemons, YARN Daemons

High Availability (Active and Standby)

Automatic and manual failover

Hadoop fs shell commands

Writing Data to HDFS

Reading Data from HDFS

Module 4

Rack awareness policy and Replica placement Strategy

Failure Handling

Namenode

Datanode

Block-Safe mode

Rebalancing and load optimization

Troubleshooting and error rectification

Hadoop fs shell commands-Unix and Java-Basics

Assessment 1

Module 5

Introduction to MapReduce

Architecture of MapReduce

Execution of MapReduce in YARN

App Master, Resource Manager and Node Manager

Input format, Input split and Key-Value Pairs

Classes and methods of the MapReduce paradigm

Mapper

Reducer

Partitioner

Custom and Default partition

Shuffle and Sort

Combiner-Scheduler

App Master/manager

Container-Node manager
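
The default and custom partitioning topics in this module can be illustrated with a minimal plain-Python sketch. It assumes string keys and uses zlib.crc32 as a stand-in for the key hash; Hadoop's default HashPartitioner uses the key's hashCode() modulo the number of reduce tasks. The function names and the first-letter routing rule are hypothetical and chosen only for illustration.

```python
import zlib


def default_partition(key, num_reducers):
    """Default-style rule: hash of the key modulo the number of reducers.

    zlib.crc32 is an assumed stand-in for the key hash so that the
    result is the same on every run.
    """
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def custom_partition(key, num_reducers):
    """Hypothetical custom rule: keys starting with a-m go to reducer 0,
    all other keys go to reducer 1 (or 0 when only one reducer exists).
    Assumes a non-empty key.
    """
    first = key[0].lower()
    if "a" <= first <= "m":
        return 0
    return min(1, num_reducers - 1)


keys = ["hadoop", "spark", "hive", "pig"]
default_assignment = {k: default_partition(k, 4) for k in keys}
custom_assignment = {k: custom_partition(k, 2) for k in keys}
```

Either rule sends every occurrence of a key to the same reducer, which is what lets the reducer see all values for that key.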

Module 6

MapReduce Hands-on

Word count program / log analytics

Hadoop streaming in R/Python

Data processing Transformations

Map-only jobs and uber jobs

Inverted index and searches
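
The word count program covered in this module can be sketched in plain Python to show the map, shuffle/sort and reduce phases in one process. This is an illustrative sketch, not Hadoop code: the function names and the sample input are assumptions, and a real job would run the phases across nodes.

```python
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle and sort: group all values by key, with keys in sorted order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle_phase(mapped))


# Assumed sample input, standing in for lines read from HDFS.
sample_lines = ["big data hadoop", "hadoop spark", "big data"]
counts = word_count(sample_lines)
```

With Hadoop streaming, the mapper and reducer would instead be separate scripts that read from standard input and write tab-separated key/value lines to standard output.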

Module 7

MR Programs 2

Structured and Unstructured Data handling

Optimizing using Combiner

Partitioner

Single and multiple column

Inverted Index

XML - semi-structured data

Map-side joins

Reduce-side joins

Module 8

Introduction to Hive Data warehouse

Installation of Hive and metastore database

Configure metastore to MySQL

HiveQL Commands

Module 9

Manipulation and analytical functions in Hive

Managed table and external tables

Partitioning and Bucketing

Complex data types and Unstructured data

Advanced HQL commands

UDF and UDAF

Integration with HBase

SerDe / Regular Expression

File formats: Parquet, SequenceFile, RCFile, ORC file

Assessment 2

Module 10

Introduction to Pig

Installation-Bags and collections

Commands and Scripts

Pig UDF

Module 11

Introduction to NoSQL

ACID/CAP/BASE

Key-value pair

MapReduce

Column family

HBase

Document

MongoDB

Graph DB

Neo4j

Module 12

Introduction to HBase and installation

The HBase Data Model

The HBase Shell

HBase Architecture

Schema Design

The HBase API

HBase Configuration and Tuning

Module 13

Ingest data from RDB

Introduction to Sqoop and installation

Import and export data from and to RDB

Bulk loading, Incremental load, Split by, Conditional query

Sqoop validation and jobs

Module 14

Ingest streaming data

Flume Architecture

Agent, Source, Sink, Channel

Ingest log file

Collecting data from Twitter for Sentiment analysis

Assessment 3

Module 15

Integrate With ETL

Talend Big data edition – Components of big data

Module 16

Big data Analytics

Dimensional modelling

Data Visualization

Tableau – Hive and Spark SQL connectors

Module 17

Spark core and Components

Spark Shell

Create RDD from HDFS/Local

Creating new RDD-Transformations on RDD

Lineage Graph – DAG

Actions on RDD

RDD Concepts on Persist and Cache-Lazy evaluation of RDD

Hands on and core concepts of map() transformation

Hands on and core concepts of filter() transformation

Hands on and core concepts of flatMap() transformation

Compare map and flatMap transformation

Hands on and core concepts of reduce() action

Hands on and core concepts of fold() action

Hands on and core concepts of aggregate() action

Basics of Accumulator

Hands on and core concepts of collect() action

Hands on and core concepts of take() action

Apache Spark Execution Model

How Spark executes a program

Concepts of RDD partitioning

RDD data shuffling and performance issues
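
The difference between map() and flatMap(), and between the reduce(), fold() and aggregate() actions listed in this module, can be shown with a plain-Python analogy. This sketch does not use the Spark API: lists stand in for RDDs, everything runs in a single process, and the variable names and sample data are assumptions for illustration.

```python
from functools import reduce
from itertools import chain

# Assumed sample data, standing in for an RDD of text lines.
lines = ["big data", "hadoop spark nosql"]

# map(): exactly one output element per input element (here, one list per line).
mapped = [line.split() for line in lines]

# flatMap(): each input may yield zero or more elements, flattened into one sequence.
flat_mapped = list(chain.from_iterable(line.split() for line in lines))

lengths = [len(word) for word in flat_mapped]

# reduce(): combine all elements with a binary function.
total = reduce(lambda a, b: a + b, lengths)

# fold(): like reduce(), but starts from a supplied zero value.
folded = reduce(lambda a, b: a + b, lengths, 0)

# aggregate(): the result type may differ from the element type;
# here a (sum, count) tuple is built from integers.
sum_count = reduce(lambda acc, x: (acc[0] + x, acc[1] + 1), lengths, (0, 0))

# take(n): return only the first n elements to the driver.
first_two = flat_mapped[:2]
```

In Spark itself, fold() and aggregate() apply the zero value within each partition and then merge the per-partition results, so the zero value should be neutral for the operation (0 for addition, for example).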

Module 18

DataFrames and Datasets

Spark SQL

PySpark

Module 19

Spark jobs

Build Scala program using SBT/Maven

Spark submit and Spark application

Module 20

Kafka-Publisher/Subscriber

Consumer and producer

Module 21

HUE

Monitoring and scheduling

Module 22

Zeppelin

Oozie-Workflow and Coordinator

Module 23

Distribution Installation on cloud or Sandbox

Cloudera – Cloudera Manager

Hortonworks – Ambari server

MapR – MCS

Module 24

Introduction to Data science: Machine learning, Statistical Analysis, Sentiment Analysis

Module 25

Use Multinode cluster setup, High Availability, Hadoop data federation, Commissioning and decommissioning, Automatic and manual failover, ZooKeeper failover controller

Module 26

Use cases, Case studies and Proof of Concept-Working on different Distributions

Module 27 (Certification guidance)

CCA Spark and Hadoop Developer Exam (CCA175)

CCP Data Engineer (DE575)

HDPCD CERTIFICATION

HDP CERTIFIED APACHE SPARK DEVELOPER
