Apache Spark with Scala Online Training by Industry Experts
Scala Essentials | Traits and OOPs in Scala | Functional Programming in Scala
Introduction to Big Data and Spark | Spark Baby Steps | Playing with RDDs
Shark – When Spark meets Hive (Spark SQL) | Spark Streaming
Spark MLlib | Spark GraphX | Project and Installation
Spark & Scala Online Training with Faculty Having 15+ Years of Experience
Duration of Spark Scala Training : 32 hrs
Batch type : Weekdays/Weekends
Mode of Training : Classroom/Online/Corporate Training
Spark & Scala Online Training & Certification in Pune
Realtime Projects, Scenarios & Assignments
COURSE CONTENT :
Module 1 :
Introduction to Scala
Learning Objectives – In this module, you will understand the basic concepts of Scala
and the motivation for learning a new language, and get your setup ready.
Module 2 :
Scala Essentials
Learning Objectives – In this module, you will learn the essentials of Scala that are
needed to work with it.
Module 3 :
Traits and OOPs in Scala
Learning Objectives – In this module, you will understand how object-oriented programming
(OOP) concepts are implemented in Scala and how to use Traits as Mixins.
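As a rough illustration of what "Traits as Mixins" means, the sketch below composes one class from two traits. It is not taken from the course material; the names Greeter, Logger, Student and MixinDemo are hypothetical and chosen only for this example.

```scala
// Hypothetical example: all names here are illustrative, not from the course.
trait Greeter {
  def name: String                          // abstract member, supplied by the class
  def greet(): String = s"Hello, $name"     // concrete behaviour provided by the trait
}

trait Logger {
  def log(message: String): String = s"[LOG] $message"
}

// A class mixes in several traits with `extends ... with ...`.
class Student(val name: String) extends Greeter with Logger

object MixinDemo {
  def main(args: Array[String]): Unit = {
    val s = new Student("Asha")
    println(s.greet())          // Hello, Asha
    println(s.log(s.greet()))   // [LOG] Hello, Asha
  }
}
```

The class itself contains no logic; it gains greet() from one trait and log() from the other, which is the sense in which traits act as mixins.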
Module 4 :
Functional Programming in Scala
Learning Objectives – In this module, you will gain functional programming
know-how for Scala.
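To give a flavour of the functional style this module refers to, here is a minimal sketch using a pure function, a higher-order function, and standard collection operations. The names FunctionalDemo, square and applyTwice are hypothetical and are not part of the course material.

```scala
// Hypothetical example: names are illustrative only.
object FunctionalDemo {
  // A pure function: the result depends only on the argument.
  def square(x: Int): Int = x * x

  // A higher-order function: it takes another function as a parameter.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  def main(args: Array[String]): Unit = {
    val numbers = List(1, 2, 3, 4, 5)

    // Keep the even numbers, then square each one; `numbers` is not modified.
    val evenSquares = numbers.filter(_ % 2 == 0).map(square)

    // Fold the list into a single value, starting from 0.
    val total = numbers.foldLeft(0)(_ + _)

    println(evenSquares)            // List(4, 16)
    println(total)                  // 15
    println(applyTwice(square, 3))  // 81
  }
}
```

The same filter, map and fold vocabulary reappears later in Spark's collection-like APIs, which is one reason the functional style is usually covered before Spark itself.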
Module 5 :
Introduction to Big Data and Spark
Learning Objectives – In this module, you will understand what Big Data is, its
associated challenges, and the various frameworks available, and you will get a first-hand introduction to Spark.
Module 6 :
Spark Baby Steps
Learning Objectives – In this module, you will learn how to invoke the Spark shell and
use it for various common operations.
Module 7 :
Playing with RDDs
Learning Objectives – In this module, you will learn about one of the building blocks of
Spark, RDDs, and the related manipulations for implementing business logic.
Module 8 :
Shark – When Spark meets Hive (Spark SQL)
Learning Objectives – In this module, you will see various offshoots of Spark such as
Shark, Spark SQL and MLlib. This session is primarily interactive, discussing industrial use
cases of Spark and the latest developments in this area.
Module 9 :
Spark Streaming
Learning Objectives – In this module, you will learn about the major APIs that Spark
offers. You will get an opportunity to work on Spark Streaming, which makes it easy to build
scalable, fault-tolerant streaming applications.
Module 10 :
Spark MLlib
Learning Objectives – In this module, you will learn about machine learning
concepts in Spark.
Module 11 :
Spark GraphX
Learning Objectives – In this module, you will learn about graph analysis concepts in Spark.
Module 12 :
Project and Installation
DataQubez University creates meaningful Big Data and Data Science certifications that are recognized in the industry as a reliable measure of qualified, capable Big Data experts. How do we accomplish that mission? DataQubez certifications are exclusively hands-on, performance-based exams that require you to complete a set of tasks. Demonstrate your expertise with the most sought-after technical skills. Big Data success requires professionals who can prove their mastery of the tools and techniques of the Hadoop stack. However, experts predict a major shortage of advanced analytics skills over the next few years. At DataQubez, we’re drawing on our industry leadership and early corpus of real-world experience to address the Big Data and Data Science talent gap.
How to Become a Certified Apache Spark Developer
Certification Code – DQCP – 504
Certification Description – DataQubez Certified Professional Apache – Spark Developer
Define and deploy a rack topology script, Change the configuration of a service using Apache Hadoop, Configure the Capacity Scheduler, Create a home directory for a user and configure permissions, Configure the include and exclude DataNode files
Demonstrate ability to find the root cause of a problem, optimize inefficient execution, and resolve resource contention scenarios, Resolve errors/warnings in a Hadoop cluster, Resolve performance problems/errors in cluster operation, Determine the reason for an application failure, Configure the Fair Scheduler to resolve application delays, Restart a cluster service, View an application’s log file, Configure and manage alerts, Troubleshoot a failed job
Configure NameNode, Configure ResourceManager, Copy data between two clusters, Create a snapshot of an HDFS directory, Recover a snapshot, Configure HiveServer2
Import data from a table in a relational database into HDFS, Import the results of a query from a relational database into HDFS, Import a table from a relational database into a new or existing Hive table, Insert or update data from HDFS into a table in a relational database, Given a Flume configuration file, start a Flume agent, Given a configured sink and source, configure a Flume memory channel with a specified capacity
Frame big data analysis problems as Apache Spark scripts, Optimize Spark jobs through partitioning, caching, and other techniques, Develop distributed code using the Scala programming language, Build, deploy, and run Spark scripts on Hadoop clusters, Transform structured data using Spark SQL and DataFrames
Use MLlib to produce a recommendation engine, Run the PageRank algorithm, Use DataFrames with MLlib, Machine Learning with Spark
Process stream data using Spark Streaming
Introduction to Linear Regression, Introduction to Regression Section, Linear Regression Documentation, Alternate Linear Regression Data CSV File, Linear Regression Walkthrough, Linear Regression Project
For Exam Registration of Apache – Spark Developer, Click here:
The Spark & Scala trainer has 17 years of experience in IT, including 10 years in data warehousing and ETL. For the past six years he has worked extensively with Big Data ecosystem tool sets for several banking, retail and manufacturing clients. He is a certified HDP Spark Developer and a Cloudera Certified HBase Specialist. He has also conducted corporate sessions and seminars both in India and abroad. Recently he was engaged by Pune University to deliver 40 hours of sessions on Big Data analytics to senior professors in Pune.
All faculty members at our organization currently work with these technologies at reputed organizations. The curriculum is not just theory or a talk accompanied by slides. We structure each session so that the lessons are delivered in plain language and the content is well absorbed by the candidates. The sessions are backed by hands-on assignments. Because the faculty members have industry experience, they also share practical stories from their own work during the course.
ML and GraphX, ’R’ Language
Data Analytics / Science
Cloudera Certified Professional (CCP)
CCP Data Engineer