Wednesday, June 12, 2013

Big Data Training: Day-wise Curriculum @Geoinsyssoft


                                         


For more details on the course curriculum, duration and fees, see www.geoinsyssoft.com/courses
Classroom and online training:

For a demo, call 9884218531 or mail info@geoinsyssoft.com





Big Data Training Day-wise Content:


Day 1:
Introduction to Big Data
Real-time use cases
Volume, Variety, Velocity, Value
Comparison with existing OLTP, ETL, DWH, OLAP
Day 2:
Introduction to Hadoop 1.0 and Hadoop 2.0
Architecture
HDFS Cluster – Data storage framework
MapReduce – Data processing framework
HBase – NoSQL database
Hive warehouse
Pig Latin data flow scripts
Sqoop – Bulk data transfer for relational databases
Flume – Streaming logs

Day 3:
Setup – VM Linux/Ubuntu/CentOS
Java
Hadoop setup and configuration – version 1.1.2 and 2.05
Hadoop 1.0 cluster and daemons
NameNode – Metadata, fsimage, edit log, block reports
Rack awareness policy
Safe mode, rebalancing and load optimization
DataNode – Writing, reading and replication of blocks
JobTracker – Initialization, execution, IO, failure
TaskTracker – Initialization, progress, failure
Secondary NameNode – Not a backup
Day 4:
Installation and configuration of Hadoop 2.0 – YARN
ResourceManager – Resource and job management
Application Manager
Scheduler – Fair, Capacity, Priority
NodeManager
ApplicationMaster
Container – YARN child and task execution
Uber job
Failure of Application, RM, AM, NM

Day 5:
Unix and Java basics
HDFS file operations – fs shell

 
Day 6:
Introduction to MapReduce
Architecture of MR v1 and v2
Key-value pairs
Mapper – setup/config, init, map, cleanup, close
Shuffle and sort
Combiner
Partitioner
Reducer

Day 7:
MapReduce word count program

Structured and unstructured data handling
Data processing
Map-only jobs
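
As a rough illustration of the Day 7 word count program, the sketch below simulates the three MapReduce stages (map, shuffle and sort, reduce) in plain Python. It does not use the Hadoop API; the function names `map_phase`, `shuffle_and_sort`, `reduce_phase` and `word_count` are assumptions made for this example only.

```python
from collections import defaultdict

def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle_and_sort(pairs):
    """Shuffle and sort: group all values by key, with keys in sorted order."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())

def reduce_phase(key, values):
    """Reducer: sum the counts collected for one word."""
    return key, sum(values)

def word_count(lines):
    """Run the three stages in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle_and_sort(pairs))
```

Calling `word_count` on a list of text lines returns a dictionary of lower-cased words and their counts. On a real cluster the same three stages run distributed across nodes rather than in a single process.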

Day 8 and Day 9:
MR Programs 2:
Combiner and Partitioner
Single and multiple column
Inverted index
XML – semi-structured data
Map-side joins
Reduce-side joins
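
The inverted index topic from the list above can be sketched in the same map and reduce style. This is a minimal in-memory illustration in plain Python, not Hadoop code; `map_doc` and `build_inverted_index` are hypothetical names chosen for the example.

```python
from collections import defaultdict

def map_doc(doc_id, text):
    """Mapper: emit a (word, doc_id) pair for every word in a document."""
    return [(word.lower(), doc_id) for word in text.split()]

def build_inverted_index(docs):
    """Group document ids by word, which is the job of the reduce step."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word, source_id in map_doc(doc_id, text):
            index[word].add(source_id)
    # Sort the ids so each posting list has a stable order.
    return {word: sorted(ids) for word, ids in index.items()}
```

Given a dictionary that maps document ids to text, the result maps each word to the sorted list of documents that contain it.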

Day 10:
Introduction to Hive data warehouse
Architecture and installation
Basic HQL commands
Load, external table
Join
Partitioning
Bucketing
Advanced HQL commands
Beeswax – Web console
Word count in Hive

Day 11:
Introduction to Pig
Installation
Data flow scripts
Handling structured and unstructured data

Day 12:
Introduction to NoSQL
ACID/CAP/BASE
Key-value pair – MapReduce
Column family – HBase
Document – MongoDB
Graph DB – Neo4j

Day 13:
Introduction to HBase and installation
The HBase Data Model
The HBase Shell
HBase Architecture
Schema Design
The HBase API
HBase Configuration and Tuning

Day 14:
Introduction to Sqoop and installation
Bulk loading
Hadoop Streaming

Day 15:
Flume NG
Source, Sink, Channel – Agent
Avro
ZooKeeper
Chukwa and Oozie

Day 16:
Integration with ETL
Talend Data Studio

Day 17:
Big data analytics – Visualization
Tableau or Jaspersoft
Cloudera/Hortonworks/Greenplum

Day 18:
Introduction to data science
Data mining – Machine learning
Statistical analysis – Predictive modelling
Sentiment analysis or opinion mining

Day 19:
Use cases, case studies and proofs of concept

Day 20 and Day 21 (Optional):

CCD-410 – Cloudera certification questions discussion





                                           www.geoinsyssoft.com/courses
