Big Data Course

Why Learn Big Data:

Big Data is one of the fastest-growing technologies in the IT industry. With hyper-digitization taking over our lives, data has become a company's most valuable asset, and Big Data platforms such as Hadoop play a central role in storing, securing and analyzing volumes of data that commonly used relational databases like Oracle and MySQL cannot handle. It is now a necessity for large data-driven companies such as Yahoo, Facebook, Google and many other organizations. NASSCOM (National Association of Software and Service Companies) projects that India will need about 50,000 to 150,000 Big Data professionals for domestic and international operations within the next few years.

Training Schedule:

Duration: 60 Hrs.
Course Fees: 16,000

Module Structure:

Module 1: Introduction to Big Data, Hadoop & Hadoop Ecosystem Projects

(What is Big Data, Advent of Hadoop, Motivation for Hadoop, Big Data & the future of analytics, Basic Hadoop components, Hadoop ecosystem projects, Basic installation tips)

Module 2: Hadoop Distributed File System (HDFS) Architecture

(HDFS concepts, File storage on Hadoop, HDFS features, NameNode and DataNode, Secondary NameNode, JobTracker & TaskTracker, Basic concepts of Hadoop I/O, Accessing HDFS, Hadoop daemons, Cluster configuration, Basic Linux and HDFS commands, Hands-on exercises on HDFS)
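The NameNode/DataNode split covered in this module can be previewed with a small sketch. HDFS breaks a file into fixed-size blocks (128 MB by default in Hadoop 2.x) and replicates each block across DataNodes. The toy model below is plain Python, not Hadoop code; the function name and round-robin placement are purely illustrative (real HDFS placement is rack-aware).

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # default HDFS block size in Hadoop 2.x (bytes)
REPLICATION = 3                 # default HDFS replication factor

def plan_blocks(file_size_bytes, datanodes):
    """Toy model: split a file into HDFS-style blocks and assign each
    block's replicas round-robin across DataNodes (illustrative only)."""
    n_blocks = max(1, math.ceil(file_size_bytes / BLOCK_SIZE))
    plan = []
    for b in range(n_blocks):
        replicas = [datanodes[(b + r) % len(datanodes)] for r in range(REPLICATION)]
        plan.append((b, replicas))
    return plan

# A 300 MB file needs 3 blocks (128 MB + 128 MB + 44 MB)
plan = plan_blocks(300 * 1024 * 1024, ["dn1", "dn2", "dn3", "dn4"])
print(len(plan))   # 3
print(plan[0][1])  # ['dn1', 'dn2', 'dn3']
```

The point of the sketch is the course's key HDFS idea: no single machine holds the whole file, and losing one DataNode loses no data because every block lives on several nodes.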

Module 3: Map Reduce Workflow

(Introduction to MapReduce, Features of MapReduce, MapReduce flow and representation, Mapper and Reducer, Input file formats (TextInputFormat, KeyValueTextInputFormat, SequenceFileInputFormat, etc.), MapReduce sample word count Java code walkthrough and hands-on, Debugging, Submitting a MapReduce job, IdentityMapper, IdentityReducer, Combiners, Partitioners, Hands-on exercises on IdentityMapper, IdentityReducer, Combiner & Partitioner)
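The word-count walkthrough in this module is done in Java; as a language-neutral preview, the same map → shuffle/sort → reduce flow can be simulated in plain Python. This is a local sketch of the dataflow only (the function names are ours, not Hadoop's API), showing what the framework does between the Mapper and the Reducer.

```python
from collections import defaultdict

def mapper(line):
    # Emit a (word, 1) pair for every word, like the WordCount Mapper
    for word in line.lower().split():
        yield (word, 1)

def shuffle(pairs):
    # Group values by key -- the step the Hadoop framework performs
    # between the map and reduce phases
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    # Sum the counts for one word, like the WordCount Reducer
    return (key, sum(values))

lines = ["big data is big", "hadoop handles big data"]
pairs = [kv for line in lines for kv in mapper(line)]
counts = dict(reducer(k, vs) for k, vs in shuffle(pairs).items())
print(counts["big"])  # 3
```

A Combiner, also covered in this module, would simply apply the same summing logic on each mapper's local output before the shuffle, cutting network traffic.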

Module 4: Hive

(Introduction to Hive, Hive data types, Hive metastore, Physical data layout, Managed and external tables, Loading operations, HiveQL, Data definition, Data manipulation, Hive query execution, User defined functions, Comparison between Pig and Hive, Hands-on exercises on Hive)
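HiveQL is deliberately SQL-like, so much of the query syntax taught in this module can be previewed on any SQL engine. The sketch below uses Python's built-in sqlite3 purely to illustrate the kind of GROUP BY query written in HiveQL; the table and column names are invented for the example, and Hive-specific DDL (ROW FORMAT, STORED AS, partitioning) has no sqlite equivalent.

```python
import sqlite3

# In Hive this table would be created with Hive DDL, e.g.
# CREATE TABLE sales (...) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 100), ("west", 250), ("east", 50)])

# The SELECT itself reads the same in HiveQL as in standard SQL
rows = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
print(rows)  # [('east', 150), ('west', 250)]
```

The practical difference covered in class is execution: Hive compiles such a query into MapReduce jobs over HDFS files rather than running it against a local database file.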

Module 5: Pig

(Introduction to Pig, Pig concepts and features, Pig's data model, Pig Latin language constructs, Input and output operations, Relational operations, User defined functions, Running Pig code, Sample Pig code walkthrough, Hands-on exercises on Pig Latin)
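Pig Latin expresses analysis as a dataflow: LOAD, then GROUP, then FOREACH ... GENERATE. The snippet below mimics that pipeline in plain Python (it is not Pig itself, and the sample data is invented) so the data model taught in this module, bags of tuples grouped by a key, is visible before writing real Pig Latin.

```python
from itertools import groupby
from operator import itemgetter

# LOAD: tuples of (user, clicks), as Pig would load from a delimited file
records = [("alice", 3), ("bob", 5), ("alice", 2)]

# GROUP records BY user: each group becomes a "bag" of tuples sharing a key
records.sort(key=itemgetter(0))
grouped = {user: [clicks for _, clicks in rows]
           for user, rows in groupby(records, key=itemgetter(0))}

# FOREACH grouped GENERATE user, SUM(bag): aggregate each bag per key
totals = {user: sum(bag) for user, bag in grouped.items()}
print(totals)  # {'alice': 5, 'bob': 5}
```

In real Pig each of these relational operations compiles down to MapReduce stages, which is why Pig Latin scripts stay short compared with hand-written Java jobs.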

Module 6: HBase

(Introduction, Architecture, Installation, General commands, Admin API, Create Table, Enable & Disable Table, Describe & Alter, Shutting Down HBase, Client API, CRUD Operations on HBase, HBase Security, Hands-on exercises on HBase)
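HBase stores data as a sorted map: row key → column family → column qualifier → value. Before touching the shell or the client API, that model can be sketched with nested Python dicts. This is an illustration of the data model and CRUD vocabulary only, not the HBase client API; the table contents are invented.

```python
# Toy HBase-style table: {row_key: {"family:qualifier": value}}
table = {}

def put(row, column, value):
    """Create/update one cell, in the spirit of `put` in the HBase shell."""
    table.setdefault(row, {})[column] = value

def get(row, column):
    """Read one cell, in the spirit of `get` in the HBase shell."""
    return table.get(row, {}).get(column)

def delete(row, column):
    """Delete one cell; missing cells are simply absent, never NULL rows."""
    table.get(row, {}).pop(column, None)

put("user1", "info:name", "Asha")
put("user1", "info:city", "Kolkata")
print(get("user1", "info:name"))  # Asha
delete("user1", "info:city")
print(get("user1", "info:city"))  # None
```

The sketch also shows why HBase suits sparse data: each row stores only the cells it actually has, rather than a fixed set of relational columns.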

Module 7: Other Ecosystem Projects

(Sqoop (Introduction, Database imports, Working with imported data, Importing large objects, Performing exports, Hands-on exercise on Sqoop), Flume (Introduction, Event data handling using Flume, Hands-on exercise on Flume), ZooKeeper (Introduction, Architecture, Hands-on exercises on ZooKeeper), Oozie (Introduction, Workflow, Hands-on exercise on Oozie))

Module 8: Hadoop Administration

(Hands-on installation of Hadoop on a single-node cluster, Introduction to Cloudera CDH 4.x/5.x, Hortonworks HDP administration, Basic concepts of the MapR Hadoop sandbox)

Real Time Case studies on Hadoop:

Introduction to real-time Big Data use cases; hands-on data collection and ingestion, with processing and analysis through BI tools. Introduction to basic Big Data analytics (clickstream, weblog and sensor data) in healthcare, retail, e-commerce & IoT (Internet of Things).


Certification as Trainee Big Data Architect:

On successful completion of the training and the assigned project work, you will be certified as a Trainee Big Data Architect by SysAlgo Technology, the software division of Ejobindia. We will also provide full guidance toward other valuable certifications, such as Cloudera's.

Start Your Journey to a Successful Career

Please fill out the form below to get a free career consultation callback.