Radical Technologies
Call :+91 8055223360 | 8103400400

DATASCIENCE – PG DIPLOMA CERTIFICATION

Satisfied Learners

Event batch schedules:

Location | Duration | Start Date | Price
Pune | 352 days | 15/09/2020 | ₹ 0.00

PG DIPLOMA CERTIFICATION IN DATASCIENCE & AI

This program consists of highly practical learning of Statistics, Data Science, AI, Python, R, Big Data Science, Apache Spark & Scala, TensorFlow, Tableau, SoftMax function, Autoencoder Neural Networks, Restricted Boltzmann Machine (RBM), K-Means Clustering, Decision Trees, Random Forest and Naive Bayes using R, Pandas, NumPy, Matplotlib, Spark RDD, Spark SQL, Spark MLlib and Spark Streaming, Scala programming language, HDFS, Sqoop, Flume, Spark GraphX and messaging systems such as Kafka.

50 Projects – 8 Major and 42 Mini Projects | 500 Hrs of Training Content | 300+ Assignments | 20+ Use Case Studies

Gain the Knowledge of a Data Scientist with 3+ Years of Experience

1 Global Certification is free with this course. All materials are provided free with this course. The Master program is integrated with the International PG Certificate Program from the UK. After 6 months of successful completion of the Masters Program, you will start getting interview calls. Once you are placed in an organization, you should complete the assessments and remaining projects on weekends or in online mode.

Certificates Achieved: You will receive 3 certifications after successful completion.

Global Certification – Google Certified Data Engineer; RCITP after 6 months; and International PG Diploma in Data Science and AI – UK after 1 year of successful completion of the course. You have to complete multiple projects and assignments to achieve the International PG Diploma. After successful completion of the course, there will be 12 online assessments on topics covered during the course.

Major courses are Statistics, Data Science, AI, Python, R, Big Data Science, Apache Spark & Scala, TensorFlow, Tableau, SoftMax function, Autoencoder Neural Networks, Restricted Boltzmann Machine (RBM), K-Means Clustering, Decision Trees, Random Forest and Naive Bayes using R, Pandas, NumPy, Matplotlib, Spark RDD, Spark SQL, Spark MLlib and Spark Streaming, Scala programming language, HDFS, Sqoop, Flume, Spark GraphX and messaging systems such as Kafka. 30+ tools are covered as part of this training.

The average salary of a Data Scientist goes up to 20 lakhs per annum

Our Highlights
 18000+ Students Empowered till now
 Online/Classroom/Self-Paced
 12 months Intensive Training

 Triple Certification – Global Certification + RCITP after 6 months of successful completion of Masters + International PG Diploma from UK after 1 year
 Start Date – Every Month One Batch ( Per Batch 25 Students )
 400+ Hiring Partners
 Certified by OTHM–UK, recognised by ofqual.gov.uk
 Minimum 5 interview calls, continuing until you get a job
 What you get after the training – the knowledge of a professional with at least 3+ years of experience

About the Program
Project-driven industry mentorship, dedicated career support, and 30+ tools related to Data Science & AI.
Build expertise in Data Science & AI through multiple assignments and projects. Dedicated trainers with ample industry experience. Project-based IT training and certification programs.
Provides Level 7 Certification, which is equal to a Master’s program in the rest of the world. Worldwide recognised certification from the UK. Candidates and educational institutions can verify the certification online.

Program Overview – Key Highlights

 Designed for Freshers to Working Professionals
 Eligible for Ofqual-regulated qualifications – UK certified and world recognised programs
 30+ Data Science & AI tools
 50+ Industry Projects, 300+ Live Assignments, 150+ Coding Solutions
 8 Major Projects, 42 Minor Projects & Use Case Studies, Job Oriented Scenarios

 Ofqual – UK validated PG Diploma from UK
 360 Degree Career Support
 One-on-One with Industry Mentors
 Dedicated Student Mentor
 Placements with Top Firms
 No Cost EMI Option – 0% EMI Option available

Top Skills You Will Learn

Predictive Analytics using Python, Machine Learning, Data Visualization, Big Data, Natural Language Processing, Statistics, Data Science, AI, Python, R, Apache Spark & Scala, TensorFlow, Tableau, SoftMax function, Autoencoder Neural Networks, Restricted Boltzmann Machine (RBM), K-Means Clustering, Decision Trees, Random Forest and Naive Bayes using R, Pandas, NumPy, Matplotlib, Spark RDD, Spark SQL, Spark MLlib and Spark Streaming, Scala programming language, HDFS, Sqoop, Flume, Spark GraphX, messaging systems such as Kafka, and Big Data Science

Job Opportunities
Data Analyst, Data Scientist, Data Engineer, Product Analyst, Machine Learning Engineer,
Decision Scientist, Python Developer etc

Who Is This Program For?

Audience: Freshers, any graduates, and upskilling enthusiasts with 2 to 4 years of experience. Third-year students who are going to attend campus interviews. These courses are designed to be suitable for all branches of Engineering and all types of graduates, science and non-science. An option to customise the subjects according to the interest of the candidate is also available.

Minimum Eligibility
Any Bachelor’s degree, completed or not completed. No coding experience required. If you have an educational gap or any other career gap, you can do this program to boost your career. The qualifications provided by the UK regulatory board are equal to a Level 7 Masters Degree in the UK.

Programming Languages and Tools Covered

Post Graduate Diploma without quitting your job

How You Benefit From This Program
 Post Graduate Diploma without Spending full time in College
 3 Additional Certifications / Qualifications to tag with your academic degree – Global Certification in Data Engineering, RCITP (Radical Certified IT Professional) and International PG Diploma Certificate in Data Science & AI
 Eligible for Ofqual-regulated qualifications – UK certified and world recognised programs
 Level 7 Program recognised world-wide for your higher studies
 Get recognised with a high-value, world-recognised qualification equal to a UK Master’s Degree
 Career transition with up to 70% average salary hike

Frequently Asked Questions

1. What are the eligibility criteria?
Anyone who is interested in statistics and programming, including those who dropped out and those looking for higher studies in the UK or other foreign countries.
2. Is this a job-guaranteed program?
Yes, we give guaranteed interview calls until you find a job: a minimum of 5 interview calls, continuing until you get a job you are satisfied with.
3. What is RCITP?
Radical Certified IT Professional. After 6 months of completion of the training, you will be awarded the Masters certificate from Radical: Masters in Data Science & AI – RCITP.
4. Is the Master’s program integrated with the International PG Certificate program?
Yes, it is. 6 months after enrolling for the Masters course, and after successfully completing the RCITP program, you need to undergo more projects and assessments to obtain the PG Program. By default, all Master programs are integrated with, and converted into, the International PG Certificate Program.
5. Selection procedure
Once you have enrolled for the course, you have to complete the fee payment within one month. If you need a loan facility, inform us before enrolling so that the necessary arrangements can be made. The enrolment process for the PG Diploma program starts after 1 month. The necessary documents should be provided to enrol in the UK PG Program.

Curriculum

1. Statistics For Data Science – Using Python

Mode Of Training :- Classroom/ Online

Python scripting allows programmers to build applications easily and rapidly. This course is an introduction to Python scripting that focuses on the concepts of Python and helps you perform operations on variable types using PyCharm. You will learn the importance of Python in a real-time environment and will be able to develop applications based on Object Oriented Programming concepts. By the end of this course, you will be able to develop networking applications with a suitable GUI.

1.1 Understanding the Data
Goal: In this module, you will be introduced to data and its types, sample data accordingly, and derive meaningful information from the data in terms of different statistical parameters.
Objectives: At the end of this Module, you should be able to:
 Understand various data types
 Learn Various variable types
 List the uses of variable types
 Explain Population and Sample
 Discuss sampling techniques
 Understand Data representation

Topics:
 Introduction to Data Types
 Numerical parameters to represent data
 Mean
 Mode
 Median
 Sensitivity
 Information Gain

 Entropy
 Statistical parameters to represent data

Hands-On/Demo
 Estimating mean, median and mode using python
 Calculating Information Gain and Entropy
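The hands-on items above (mean, median, mode, entropy and information gain) can be sketched as follows. This is a minimal illustration using only the Python standard library; the sample values and the helper names entropy and information_gain are assumptions made for the example, not course material.

```python
from collections import Counter
from math import log2
from statistics import mean, median, mode


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(labels, groups):
    """Entropy of `labels` minus the size-weighted entropy of `groups`."""
    total = len(labels)
    remainder = sum(len(g) / total * entropy(g) for g in groups)
    return entropy(labels) - remainder


ages = [21, 22, 22, 25, 30]             # made-up sample
print(mean(ages), median(ages), mode(ages))

labels = ["yes", "yes", "no", "no"]     # made-up target column
split = [["yes", "yes"], ["no", "no"]]  # hypothetical split that separates the classes
print(entropy(labels), information_gain(labels, split))
```

A split that separates the classes completely recovers all of the entropy of the original labels, which is why the information gain equals the entropy in this toy case.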
1.2 Probability and its uses
Goal: In this module, you should learn about probability, interpret & solve real-life
problems using probability. You will get to know the power of probability with
Bayesian Inference.
Objectives: At the end of this Module, you should be able to:
 Understand rules of probability
 Learn about dependent and independent events
 Implement conditional, marginal and joint probability using Bayes Theorem
 Discuss probability distribution
 Explain Central Limit Theorem

Topics:
 Uses of probability
 Need of probability
 Bayesian Inference
 Density Concepts
 Normal Distribution Curve

Hands-On/Demo:
 Calculating probability using python
 Conditional, Joint and Marginal Probability using Python
 Plotting a Normal distribution curve
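As an illustration of conditional, joint and marginal probability with Bayes Theorem, here is a minimal sketch. The scenario (a test with a 1% base rate, 95% detection rate and 5% false positive rate) and the function name bayes_posterior are assumptions chosen for the example.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A | B) given P(A), P(B | A) and P(B | not A)."""
    joint = likelihood * prior                            # P(A and B)
    marginal = joint + false_positive_rate * (1 - prior)  # P(B)
    return joint / marginal                               # Bayes Theorem


# Made-up numbers: 1% base rate, 95% detection rate, 5% false positives.
posterior = bayes_posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(round(posterior, 3))
```

Even with a fairly accurate test, the posterior stays well below the detection rate because the event itself is rare.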

1.3 Statistical Inference
Goal: Draw inferences from present data and construct predictive models using
different inferential parameters (as a constraint).
Objectives: At the end of this Module, you should be able to:
 Understand the concept of point estimation using confidence margin
 Draw meaningful inferences using margin of error
 Explore hypothesis testing and its different levels

Topics:
 Point Estimation
 Confidence Margin
 Hypothesis Testing
 Levels of Hypothesis Testing

Hands-On/Demo:
 Calculating and generalizing point estimates using python
 Estimation of Confidence Intervals and Margin of Error
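A point estimate, its margin of error and the resulting confidence interval could be computed along these lines. The sketch assumes a normal approximation (for small samples a t-distribution is usually more appropriate), and the sample values and the function name confidence_interval are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, confidence=0.95):
    """Point estimate and margin of error for the population mean
    (normal approximation)."""
    point_estimate = mean(sample)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin_of_error = z * stdev(sample) / sqrt(len(sample))
    return point_estimate, margin_of_error


sample = [10, 12, 14, 16, 18]  # made-up measurements
estimate, margin = confidence_interval(sample)
print(f"{estimate} +/- {margin:.2f}")
```

Lowering the confidence level narrows the interval, and a larger sample does the same through the square root of the sample size in the denominator.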
1.4 Testing the Data
Goal: In this module, you should learn the different methods of testing the
alternative hypothesis.
Objectives: At the end of this module, you should be able to:
 Understand Parametric and Non-parametric Testing
 Learn various types of parametric testing
 Discuss experimental designing
 Explain A/B testing

Topics:
 Parametric Test

 Parametric Test Types
 Non- Parametric Test
 Experimental Designing
 A/B testing

Hands-On/Demo:
 Perform p test and t tests in python
 A/B testing in python
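One common way to evaluate an A/B test is a two-proportion z-test. The sketch below is an assumption about how such a demo might look, not the course's own code; the conversion counts are invented and the function name ab_test_p_value is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def ab_test_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a two-proportion z-test."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # rate under the null hypothesis
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Made-up experiment: variant A converts 100 of 1000, variant B 130 of 1000.
p_value = ab_test_p_value(100, 1000, 130, 1000)
print(f"p-value: {p_value:.4f}")
```

A p-value below the chosen significance level (often 0.05) would lead to rejecting the null hypothesis that both variants convert at the same rate.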
1.5 Data Clustering
Goal: Get an introduction to Clustering as part of this Module which forms the
basis for machine learning.
Objectives: At the end of this module, you should be able to:
 Understand the concept of association and dependence
 Explain causation and correlation
 Learn the concept of covariance
 Discuss Simpson’s paradox
 Illustrate Clustering Techniques

Topics:
 Association and Dependence
 Causation and Correlation
 Covariance
 Simpson’s Paradox
 Clustering Techniques

Hands-On/Demo:
 Correlation and Covariance in python
 Hierarchical clustering in python
 K means clustering in python
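Covariance and correlation, listed in the hands-on above, can be computed from their definitions. This is a minimal sketch with made-up data; it uses the sample (n - 1) form of covariance, which is an assumption about the convention intended.

```python
from math import sqrt
from statistics import mean


def covariance(xs, ys):
    """Sample covariance of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))


hours = [1, 2, 3, 4, 5]        # made-up study hours
scores = [52, 55, 61, 64, 68]  # made-up exam scores
print(covariance(hours, scores), round(correlation(hours, scores), 4))
```

Covariance depends on the units of both variables, while correlation is unit-free and always lies between -1 and 1.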

1.6 Regression Modelling
Goal: Learn the roots of Regression Modelling using statistics.
Objectives: At the end of this module, you should be able to:
 Understand the concept of Linear Regression
 Explain Logistic Regression
 Implement WOE
 Differentiate between heteroscedasticity and homoscedasticity
 Learn the concept of residual analysis

Topics:
 Logistic and Regression Techniques
 Problem of Collinearity
 WOE and IV
 Residual Analysis
 Heteroscedasticity
 Homoscedasticity

Hands-On/Demo:
 Perform Linear and Logistic Regression in python
 Analyze the residuals using python
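A simple linear regression and its residuals can be sketched with the closed-form least-squares formulas. The data points and the function name fit_line are assumptions for illustration only.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]  # made-up observations

slope, intercept = fit_line(xs, ys)
# Residual = observed value minus fitted value.
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
print(round(slope, 2), round(intercept, 2))
```

With an intercept in the model, least-squares residuals sum to zero; plotting them against the fitted values is a usual first check for heteroscedasticity.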

2. Statistics for Data Science – Using R

2.1 Understanding the Data

Goal: In this module, you will be introduced to data and its types and will
accordingly sample data and derive meaningful information from the data in terms
of different statistical parameters.
Objectives: At the end of this Module, you should be able to:
 Understand various data types
 Learn Various variable types
 List the uses of Variable types
 Explain Population and Sample
 Discuss Sampling techniques
 Understand Data representation

Topics:
 Introduction to Data Types
 Numerical parameters to represent data
 Mean
 Mode
 Median
 Sensitivity
 Information Gain
 Entropy
 Statistical parameters to represent data

Hands-On/Demo:
 Estimating mean, median and mode using R
 Calculating Information Gain and Entropy
2.2 Probability and its Uses
Goal: In this module, you will learn about probability, interpret & solve real-life
problems using probability. You will get to know the power of probability with
Bayesian Inference.
Objectives: At the end of this Module, you should be able to:

 Understand rules of probability
 Learn about dependent and independent events
 Implement conditional, marginal and joint probability using Bayes Theorem
 Discuss probability distribution
 Explain Central Limit Theorem

Topics:
 Uses of probability
 Need of probability
 Bayesian Inference
 Density Concepts
 Normal Distribution Curve

Hands-On/Demo:
 Calculating probability using R
 Conditional, Joint and Marginal Probability using R
 Plotting a Normal distribution curve
2.3 Statistical Inference
Goal: In this module, you will be able to draw inferences from present data and
construct predictive models using different inferential parameters (as the
constraint).
Objectives: At the end of this Module, you should be able to:
 Understand the concept of point estimation using confidence margin
 Demonstrate the use of Level of Confidence and Confidence Margin
 Draw meaningful inferences using margin of error
 Explore hypothesis testing and its different levels

Topics:
 Point Estimation
 Confidence Margin

 Hypothesis Testing
 Levels of Hypothesis Testing

Hands-On/Demo:
 Calculating and generalizing point estimates using R
 Estimation of Confidence Intervals and Margin of Error
2.4 Testing the Data
Goal: In this module, you will learn the different methods of testing the alternative
hypothesis.
Objectives: At the end of this module, you should be able to:
 Understand Parametric and Non-Parametric testing
 Learn various types of Parametric testing
 Explain A/B testing

Topics:
 Parametric Test
 Parametric Test Types
 Non- Parametric Test
 A/B testing

Hands-On/Demo:
 Perform P test and T tests in R
2.5 Data Clustering
Goal: In this module, you will get an introduction to Clustering which forms the
basis for machine learning.
Objectives: At the end of this module, you should be able to:
 Understand the concept of Association and Dependence

 Explain Causation and Correlation
 Learn the concept of Covariance
 Discuss Simpson’s paradox
 Illustrate Clustering Techniques

Topics:
 Association and Dependence
 Causation and Correlation
 Covariance
 Simpson’s Paradox
 Clustering Techniques

Hands-On/Demo:
 Correlation and Covariance in R
 Hierarchical clustering in R
 K means clustering in R
2.6 Regression Modelling
Goal: In this module, you will be able to learn about the roots of Regression
Modelling using statistics.
Objectives: At the end of this module, you should be able to:
 Understand the concept of Linear Regression
 Explain Logistic Regression
 Implement WOE
 Differentiate between heteroscedasticity and homoscedasticity
 Learn concept of residual analysis

Topics:
 Logistic and Regression Techniques
 Problem of Collinearity
 WOE and IV

 Residual Analysis
 Heteroscedasticity
 Homoscedasticity

Hands-On/Demo:
 Perform Linear and Logistic Regression in R
 Analyze the residuals using R
 Calculation of WOE values using R
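The module's own hands-on uses R; purely for illustration, the WOE and IV calculation is sketched here in Python. It assumes the convention WOE = ln(share of goods / share of bads) per bin; some references use the inverse ratio. The bin counts and the function name woe_iv are made up.

```python
from math import log


def woe_iv(bins):
    """bins maps a bin name to (good_count, bad_count).
    Returns the per-bin WOE values and the total Information Value."""
    total_good = sum(good for good, _ in bins.values())
    total_bad = sum(bad for _, bad in bins.values())
    woe, iv = {}, 0.0
    for name, (good, bad) in bins.items():
        pct_good, pct_bad = good / total_good, bad / total_bad
        woe[name] = log(pct_good / pct_bad)  # fails if a bin has a zero count
        iv += (pct_good - pct_bad) * woe[name]
    return woe, iv


# Made-up counts of good and bad outcomes in two bins of one variable.
woe, iv = woe_iv({"low": (80, 20), "high": (20, 80)})
print(woe, round(iv, 4))
```

Bins with a zero count in either class need smoothing or merging before this formula applies, since the logarithm is undefined there.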

3. DATASCIENCE & MACHINE LEARNING WITH PYTHON

3.1 Introduction to Data Science with Python
 What is analytics & Data Science?
 Common Terms in Analytics
 Analytics vs. Data warehousing, OLAP, MIS Reporting
 Relevance in industry and need of the hour
 Types of problems and business objectives in various industries
 How leading companies are harnessing the power of analytics?
 Critical success drivers
 Overview of analytics tools & their popularity
 Analytics Methodology & problem solving framework
 List of steps in Analytics projects
 Identify the most appropriate solution design for the given problem statement
 Project plan for Analytics project & key milestones based on effort estimates
 Build Resource plan for analytics project
3.2 Python Essentials

 Why Python for data science?
 Overview of Python- Starting with Python
 Introduction to installation of Python
 Introduction to Python Editors & IDEs (Canopy, PyCharm, Jupyter, Rodeo, IPython, etc.)
 Understand Jupyter notebook & Customize Settings
 Concept of Packages/Libraries – Important packages(NumPy, SciPy, scikit-learn, Pandas, Matplotlib, etc)
 Installing & loading Packages & Name Spaces
 Data Types & Data objects/structures (strings, Tuples, Lists, Dictionaries)
 List and Dictionary Comprehensions
 Variable & Value Labels –  Date & Time Values
 Basic Operations – Mathematical – string – date
 Reading and writing data
 Simple plotting
 Control flow & conditional statements
 Debugging & Code profiling
 How to create class and modules and how to call them?
Scientific Distributions Used In Python For Data Science
NumPy, pandas, scikit-learn, statsmodels, NLTK
3.3 Accessing/Importing And Exporting Data Using Python Modules
 Importing Data from various sources (Csv, txt, excel, access etc)
 Database Input (Connecting to database)
 Viewing Data objects – subsetting Data, methods
 Exporting Data to various formats
 Important Python modules: Pandas, Beautiful Soup
3.4 Data Manipulation – Cleansing – Munging using python modules
 Cleansing Data with Python
 Data Manipulation steps (Sorting, filtering, duplicates, merging, appending, subsetting, derived variables, sampling, Data type conversions, renaming, formatting, etc.)
 Data manipulation tools(Operators, Functions, Packages, control structures, Loops, arrays etc)
 Python Built-in Functions (Text, numeric, date, utility functions)
 Python User Defined Functions
 Stripping out extraneous information
 Normalizing data
 Formatting data
 Important Python modules for data manipulation (Pandas, NumPy, re, math, string, datetime, etc.)
3.5 Data Analysis – Visualization Using Python
 Introduction to exploratory data analysis

 Descriptive statistics, Frequency Tables and summarization
 Univariate Analysis (Distribution of data & Graphical Analysis)
 Bivariate Analysis(Cross Tabs, Distributions & Relationships, Graphical Analysis)
 Creating Graphs (Bar/pie/line chart/histogram/boxplot/scatter/density, etc.)
 Important Packages for Exploratory Analysis (NumPy Arrays, Matplotlib, seaborn, Pandas, scipy.stats, etc.)
3.6 Introduction to Statistics
 Basic Statistics – Measures of Central Tendencies and Variance
 Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
 Inferential Statistics – Sampling – Concept of Hypothesis Testing
 Statistical Methods – Z/t-tests (One sample, independent, paired), Analysis of variance, Correlations and Chi-square
 Important modules for statistical methods: NumPy, SciPy, Pandas
3.7 Introduction to Predictive Modelling
 Concept of model in analytics and how it is used?
 Common terminology used in analytics & Modelling process
 Popular modelling algorithms
 Types of Business problems – Mapping of Techniques
 Different Phases of Predictive Modelling
3.8 Data Exploration For Modelling
 Need for structured exploratory data
 EDA framework for exploring the data and identifying any problems with the data (Data Audit Report)
 Identify missing data
 Identify outliers
 Visualize the data trends and patterns
3.9 Data Preparation
 Need of Data preparation
 Consolidation/Aggregation – Outlier treatment – Flat Liners – Missing values – Dummy creation – Variable Reduction
 Variable Reduction Techniques – Factor & PCA Analysis
3.10 Segmentation: Solving Segmentation Problems
 Introduction to Segmentation
 Types of Segmentation (Subjective Vs Objective, Heuristic Vs. Statistical)
 Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)
 Behavioural Segmentation Techniques (K-Means Cluster Analysis)

 Cluster evaluation and profiling – Identify cluster characteristics
 Interpretation of results – Implementation on new data
3.11 Linear Regression: Solving Regression Problems
 Introduction – Applications
 Assumptions of Linear Regression
 Building Linear Regression Model
 Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
 Assess the overall effectiveness of the model
 Validation of Models (Re running Vs. Scoring)
 Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc.)
 Interpretation of Results – Business Validation – Implementation on new data
3.12 Logistic Regression : Solving Classification Problems
 Introduction – Applications
 Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
 Building Logistic Regression Model (Binary Logistic Model)
 Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)
 Validation of Logistic Regression Models (Re running Vs. Scoring)
 Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers or variable importance, etc.)
 Interpretation of Results – Business Validation – Implementation on new data
3.13 Time Series Forecasting : Solving Forecasting Problems
 Introduction – Applications
 Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition
 Classification of Techniques (Pattern based – Pattern less)
 Basic Techniques – Averages, Smoothing, etc.
 Advanced Techniques – AR Models, ARIMA, etc
 Understanding Forecasting Accuracy – MAPE, MAD, MSE, etc
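The accuracy measures named above (MAD, MSE, MAPE) can be illustrated on a simple moving-average forecast. This is a minimal sketch; the sales series, the window size and both function names are assumptions for the example.

```python
def moving_average_forecast(series, window):
    """Forecast each point as the mean of the previous `window` values."""
    return [sum(series[i - window:i]) / window
            for i in range(window, len(series))]


def forecast_errors(actual, forecast):
    """Mean absolute deviation, mean squared error and MAPE (in percent)."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    return {
        "MAD": sum(abs(e) for e in errors) / n,
        "MSE": sum(e * e for e in errors) / n,
        "MAPE": 100 * sum(abs(e) / abs(a) for e, a in zip(errors, actual)) / n,
    }


sales = [100, 110, 120, 130, 140, 150]  # made-up monthly sales with a trend
forecast = moving_average_forecast(sales, window=3)
metrics = forecast_errors(sales[3:], forecast)
print(forecast, metrics)
```

Because the series trends upward, the moving average lags behind it by a constant amount, which shows up as the same error at every step.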
3.14 Machine Learning : Predictive Modelling
 Introduction to Machine Learning & Predictive Modelling
 Types of Business problems – Mapping of Techniques – Regression vs. classification vs. segmentation vs. Forecasting
 Major Classes of Learning Algorithms -Supervised vs Unsupervised Learning
 Different Phases of Predictive Modelling (Data Pre-processing, Sampling, Model Building, Validation)
 Overfitting (Bias-Variance Trade off) & Performance Metrics

 Feature engineering & dimension reduction
 Concept of optimization & cost function
 Overview of gradient descent algorithm
 Overview of Cross validation (Bootstrapping, K-Fold validation, etc.)
 Model performance metrics (R-square, Adjusted R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, confusion matrix)
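Several of the classification metrics listed above follow directly from the four cells of a confusion matrix. The sketch below assumes binary labels with 1 as the positive class; the label vectors and the function name classification_metrics are made up for illustration.

```python
def classification_metrics(actual, predicted, positive=1):
    """Precision, recall (sensitivity), specificity and accuracy
    from binary labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    return {
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),        # also called sensitivity
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


actual = [1, 1, 1, 0, 0, 0, 0, 1]     # made-up true labels
predicted = [1, 0, 1, 0, 1, 0, 0, 1]  # made-up model output
print(classification_metrics(actual, predicted))
```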
3.15 Unsupervised Learning : Segmentation
 What is segmentation & Role of ML in Segmentation?
 Concept of Distance and related math background
 K-Means Clustering
 Expectation Maximization
 Hierarchical Clustering
 Spectral Clustering and density-based clustering (DBSCAN)
 Principal Component Analysis (PCA)
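To make the K-Means Clustering topic concrete, here is a minimal sketch of the assign-then-update loop in plain Python. The points, the fixed starting centroids and the iteration count are assumptions chosen so the result is deterministic; library implementations typically use randomised initialisation and a convergence check.

```python
from math import dist


def k_means(points, centroids, iterations=10):
    """Repeat two steps: assign points to the nearest centroid,
    then move each centroid to the mean of its points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroid  # keep a centroid that attracted no points
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]  # made-up data
centroids, clusters = k_means(points, centroids=[(0, 0), (10, 10)])
print(centroids)
```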
3.16 Supervised Learning :- Decision Trees
 Decision Trees – Introduction – Applications
 Types of Decision Tree Algorithms
 Construction of Decision Trees through Simplified Examples; Choosing the “Best” attribute at each Non-Leaf node; Entropy; Information Gain, Gini Index, Chi Square, Regression Trees
 Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with Numerical Variables; other Measures of Randomness
 Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as Rules
 Decision Trees – Validation
 Overfitting – Best Practices to avoid
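Choosing the best attribute at a non-leaf node and dealing with numerical variables can be illustrated with the Gini Index. This is a minimal sketch under the assumption of a single numeric feature and a binary split; the data and the function names gini and best_threshold are invented for the example.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    total = len(labels)
    return 1 - sum((n / total) ** 2 for n in Counter(labels).values())


def best_threshold(values, labels):
    """Threshold on a numeric feature with the lowest weighted Gini impurity."""
    best = (None, float("inf"))
    for threshold in sorted(set(values))[:-1]:  # the largest value cannot split
        left = [l for v, l in zip(values, labels) if v <= threshold]
        right = [l for v, l in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


ages = [22, 25, 30, 35, 40, 50]                  # made-up feature
bought = ["no", "no", "no", "yes", "yes", "yes"]  # made-up target
print(gini(bought), best_threshold(ages, bought))
```

A tree builder would repeat this search over every candidate feature at each node and recurse into the two resulting subsets.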
3.17 Supervised Learning :- Ensemble Learning
 Concept of Ensembling
 Manual Ensembling Vs. Automated Ensembling
 Methods of Ensembling (Stacking, Mixture of Experts)
 Bagging (Logic, Practical Applications)
 Random forest (Logic, Practical Applications)
 Boosting (Logic, Practical Applications)
 Ada Boost
 Gradient Boosting Machines (GBM)
 XGBoost
3.18 Supervised Learning :- Artificial Neural Network – ANN
 Motivation for Neural Networks and Its Applications

 Perceptron and Single Layer Neural Network, and Hand Calculations
 Learning in a Multi-Layered Neural Net: Back Propagation and Conjugate Gradient Techniques
 Neural Networks for Regression
 Neural Networks for Classification
 Interpretation of outputs and fine-tuning the models with hyperparameters
 Validating ANN models
3.19 Supervised Learning :- Support Vector Machines
 Motivation for Support Vector Machine & Applications
 Support Vector Regression
 Support vector classifier (Linear & Non-Linear)
 Mathematical Intuition (Kernel Methods Revisited, Quadratic Optimization and Soft Constraints)
 Interpretation of outputs and fine-tuning the models with hyperparameters
 Validating SVM models
3.20 Supervised Learning :-KNN
 What is KNN & Applications?
 KNN for missing value treatment
 KNN For solving regression problems
 KNN for solving classification problems
 Validating KNN model
 Model fine tuning with hyper parameters
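KNN for classification can be sketched in a few lines: find the k closest training points and take a majority vote. The training data, the choice of k = 3 and the function name knn_predict are assumptions for illustration.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """train is a list of (features, label) pairs.
    Returns the majority label among the k nearest neighbours."""
    neighbours = sorted(train, key=lambda row: dist(row[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


train = [((1, 1), "A"), ((1, 2), "A"), ((2, 2), "A"),
         ((8, 8), "B"), ((9, 8), "B"), ((8, 9), "B")]  # made-up labelled points

print(knn_predict(train, (2, 1)), knn_predict(train, (7, 8)))
```

The value of k is the main hyperparameter: small values follow the training data closely, while larger values smooth the decision boundary.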
3.21 Supervised Learning :- Naive Bayes
 Concept of Conditional Probability
 Bayes Theorem and Its Applications
 Naïve Bayes for classification
 Applications of Naïve Bayes in Classifications
3.22 Text Mining And Analytics
 Taming big text, Unstructured vs. Semi-structured Data; Fundamentals of information retrieval, Properties of words; Creating Term-Document (TxD) Matrices; Similarity measures, Low-level processes (Sentence Splitting; Tokenization; Part-of-Speech Tagging; Stemming; Chunking)
 Finding patterns in text: text mining, text as a graph
 Natural Language processing (NLP)
 Text Analytics – Sentiment Analysis using Python
 Text Analytics – Word cloud analysis using Python
 Text Analytics – Segmentation using K-Means/Hierarchical Clustering
 Text Analytics – Classification (Spam/Not spam)
 Applications of Social Media Analytics

 Metrics (Measures, Actions) in social media analytics
 Examples & Actionable Insights using Social Media Analytics
 Important Python modules for Machine Learning (scikit-learn, statsmodels, SciPy, NLTK, etc.)
 Fine-tuning the models using hyperparameters, grid search, pipelines, etc.
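The tokenization, term-document matrix and similarity measure topics in this module can be illustrated with a small sketch. It assumes whitespace tokenization and raw term counts, which is far simpler than what NLP libraries provide; the three documents are made up.

```python
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lower-case the text and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[term] * b[term] for term in a)
    norm = (sqrt(sum(v * v for v in a.values()))
            * sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


docs = ["free prize claim now",
        "claim your free prize",
        "meeting agenda for monday"]                  # made-up documents
vectors = [Counter(tokenize(d)) for d in docs]        # one row per document

print(cosine_similarity(vectors[0], vectors[1]),
      cosine_similarity(vectors[0], vectors[2]))
```

Documents that share vocabulary score close to 1 and documents with no words in common score 0, which is the basis for clustering or classifying text by similarity.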

4. DATASCIENCE WITH R
4.1 Introduction to Data Science With R
 What is analytics & Data Science?
 Common Terms in Analytics
 Analytics vs. Data warehousing, OLAP, MIS Reporting
 Relevance in industry and need of the hour
 Types of problems and business objectives in various industries
 How leading companies are harnessing the power of analytics?
 Critical success drivers
 Overview of analytics tools & their popularity
 Analytics Methodology & problem solving framework
 List of steps in Analytics projects
 Identify the most appropriate solution design for the given problem statement
 Project plan for Analytics project & key milestones based on effort estimates
 Build Resource plan for analytics project
 Why R for data science?
4.2 Data Importing / Exporting
 Introduction to R/RStudio – GUI
 Concept of Packages – Useful Packages (Base & Other packages)
 Data Structure & Data Types (Vectors, Matrices, Factors, Data frames and Lists)
 Importing Data from various sources (txt, dlm, excel, sas7bdat, db, etc.)
 Database Input (Connecting to database)
 Exporting Data to various formats
 Viewing Data (Viewing partial data and full data)
 Variable & Value Labels –  Date Values
4.3 Data Manipulation
 Data Manipulation steps
 Creating New Variables (calculations & Binning)
 Dummy variable creation
 Applying transformations
 Handling duplicates

 Handling missing values
 Sorting and Filtering
 Subsetting (Rows/Columns)
 Appending (Row appending/column appending)
 Merging/Joining (Left, right, inner, full, outer etc)
 Data type conversions
 Renaming
 Formatting
 Reshaping data
 Sampling
 Data manipulation tools
 Operators
 Functions
 Packages
 Control Structures (if, if else)
 Loops (Conditional, iterative loops, apply functions)
 Arrays
 R Built-in Functions (Text, Numeric, Date, utility)
 Numerical Functions
 Text Functions
 Date Functions
 Utilities Functions
 R User Defined Functions
 R Packages for data manipulation (base, dplyr, plyr, data.table, reshape, car, sqldf, etc)
4.4 Data Analysis – Visualization
 Introduction to exploratory data analysis
 Descriptive statistics, Frequency Tables and summarization
 Univariate Analysis (Distribution of data & Graphical Analysis)
 Bivariate Analysis(Cross Tabs, Distributions & Relationships, Graphical Analysis)
 Creating Graphs (Bar/pie/line chart/histogram/boxplot/scatter/density, etc.)
 R Packages for Exploratory Data Analysis (dplyr, plyr, gmodels, car, vcd, Hmisc, psych, doBy, etc.)
 R Packages for Graphical Analysis (base, ggplot2, lattice, etc.)
4.5 Introduction To Statistics
 Basic Statistics – Measures of Central Tendencies and Variance
 Building blocks – Probability Distributions – Normal distribution – Central Limit Theorem
 Inferential Statistics -Sampling – Concept of Hypothesis Testing
 Statistical Methods – Z/t-tests( One sample, independent, paired), Anova, Correlations and Chi-square
4.6 Predictive Modelling

 Concept of model in analytics and how it is used?
 Common terminology used in analytics & modelling process
 Popular modelling algorithms
 Types of Business problems – Mapping of Techniques
 Different Phases of Predictive Modelling
4.7 Data Exploration For Modeling
4.8 Data Preparation
 Need of Data preparation
 Consolidation/Aggregation – Outlier treatment – Flat Liners – Missing values – Dummy creation – Variable Reduction
 Variable Reduction Techniques – Factor & PCA Analysis
4.9 Segmentation: Solving Segmentation Problems
 Introduction to Segmentation
 Types of Segmentation (Subjective Vs Objective, Heuristic Vs. Statistical)
 Heuristic Segmentation Techniques (Value Based, RFM Segmentation and Life Stage Segmentation)
 Behavioral Segmentation Techniques (K-Means Cluster Analysis)
 Cluster evaluation and profiling – Identify cluster characteristics
 Interpretation of results – Implementation on new data
4.10 Linear Regression: Solving Regression Problems
 Introduction – Applications
 Assumptions of Linear Regression
 Building Linear Regression Model
 Understanding standard metrics (Variable significance, R-square/Adjusted R-square, Global hypothesis, etc.)
 Assess the overall effectiveness of the model
 Validation of Models (Re running Vs. Scoring)
 Standard Business Outputs (Decile Analysis, Error distribution (histogram), Model equation, drivers etc.)
 Interpretation of Results – Business Validation – Implementation on new data
4.11 Logistic Regression: Solving Classification Problems
 Introduction – Applications
 Linear Regression Vs. Logistic Regression Vs. Generalized Linear Models
 Building Logistic Regression Model (Binary Logistic Model)
 Understanding standard model metrics (Concordance, Variable significance, Hosmer-Lemeshow Test, Gini, KS, Misclassification, ROC Curve, etc.)
 Validation of Logistic Regression Models (Re running Vs. Scoring)

 Standard Business Outputs (Decile Analysis, ROC Curve, Probability Cut-offs, Lift charts, Model equation, Drivers or variable importance, etc.)
 Interpretation of Results – Business Validation – Implementation on new data
4.12 Time Series Forecasting: Solving Forecasting Problems
 Introduction – Applications
  Time Series Components (Trend, Seasonality, Cyclicity and Level) and Decomposition
  Classification of Techniques (Pattern-based vs. Pattern-less)
  Basic Techniques – Averages, Smoothing, etc.
 Advanced Techniques – AR Models, ARIMA, etc
 Understanding Forecasting Accuracy – MAPE, MAD, MSE, etc
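
The accuracy measures named above can be sketched in a few lines of plain Python. This is an illustrative example only: the function name `forecast_errors` and the sample numbers are assumptions, and MAPE is undefined whenever an actual value is zero.

```python
def forecast_errors(actual, forecast):
    """Return (MAD, MSE, MAPE in percent) for two equal-length sequences."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    mad = sum(abs(e) for e in errors) / n          # mean absolute deviation
    mse = sum(e * e for e in errors) / n           # mean squared error
    # mean absolute percentage error; requires every actual value to be non-zero
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    return mad, mse, mape


# Hypothetical demand figures and forecasts
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
mad, mse, mape = forecast_errors(actual, forecast)
print(mad, mse, mape)
```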

4.13 Machine Learning – Predictive Modeling – Basics
 Introduction to Machine Learning & Predictive Modeling
  Types of Business problems – Mapping of Techniques – Regression vs. classification vs. segmentation vs. forecasting
  Major Classes of Learning Algorithms – Supervised vs. Unsupervised Learning
  Different Phases of Predictive Modeling (Data Pre-processing, Sampling, Model Building, Validation)
  Overfitting (Bias-Variance Trade-off) & Performance Metrics
 Feature engineering & dimension reduction
 Concept of optimization & cost function
 Overview of gradient descent algorithm
  Overview of Cross-validation (Bootstrapping, K-Fold validation, etc.)
  Model performance metrics (R-square, Adjusted R-square, RMSE, MAPE, AUC, ROC curve, recall, precision, sensitivity, specificity, confusion matrix)
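
To make the cost function and gradient descent topics in this module concrete, here is a small plain-Python sketch that fits a straight line by repeatedly stepping against the gradient of the mean squared error. The learning rate, epoch count and toy data are assumptions chosen so that the example converges; it is not a tuned implementation.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w*x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Cost J(w, b) = (1/n) * sum((w*x + b - y)^2); these are its partial derivatives
        grad_w = (2.0 / n) * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = (2.0 / n) * sum((w * x + b - y) for x, y in zip(xs, ys))
        # Step in the direction that reduces the cost
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4))
```

A learning rate that is too large makes the updates diverge, while one that is too small needs many more epochs, which is why it is usually treated as a hyperparameter.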
4.14 Unsupervised Learning: Segmentation
 What is segmentation & Role of ML in Segmentation?
 Concept of Distance and related math background
 K-Means Clustering
 Expectation Maximization
 Hierarchical Clustering
  Spectral Clustering and Density-Based Clustering (DBSCAN)
  Principal Component Analysis (PCA)
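
The K-Means item above alternates two steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. The sketch below shows that loop in plain Python. The starting centroids are fixed by hand for reproducibility (real implementations usually pick them randomly), and the sample points are invented.

```python
def kmeans(points, centroids, iterations=10):
    """Run a fixed number of K-Means iterations from the given starting centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            # Assignment step: nearest centroid by squared Euclidean distance
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - c) ** 2 for a, c in zip(p, centroids[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(1.0, 1.0), (8.0, 8.0)])
print(centroids)
```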
4.15 Supervised Learning: Decision Trees
 Decision Trees – Introduction – Applications
  Types of Decision Tree Algorithms
  Construction of Decision Trees through Simplified Examples; Choosing the “Best” attribute at each Non-Leaf node; Entropy; Information Gain, Gini Index, Chi Square, Regression Trees
  Generalizing Decision Trees; Information Content and Gain Ratio; Dealing with Numerical Variables; other Measures of Randomness
 Pruning a Decision Tree; Cost as a consideration; Unwrapping Trees as Rules
 Decision Trees – Validation
 Overfitting – Best Practices to avoid
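
Entropy, the Gini index and information gain, the split criteria named in this module, can be computed by hand for a small example. The sketch below is illustrative: the "yes"/"no" labels and the candidate split are made up.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def information_gain(parent, children):
    """Entropy of the parent minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical node with 5 "yes" and 5 "no", split into two child nodes
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4
gain = information_gain(parent, [left, right])
print(entropy(parent), gini(parent), round(gain, 4))
```

The attribute whose split gives the largest gain (or the lowest weighted impurity) is the usual choice for the "best" attribute at a non-leaf node.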

4.16 Supervised Learning: Ensemble Learning
 Concept of Ensembling
 Manual Ensembling Vs. Automated Ensembling
 Methods of Ensembling (Stacking, Mixture of Experts)
 Bagging (Logic, Practical Applications)
 Random forest (Logic, Practical Applications)
 Boosting (Logic, Practical Applications)
 Ada Boost
 Gradient Boosting Machines (GBM)
 XGBoost
4.17 Supervised Learning: Artificial Neural Networks (ANN)
 Motivation for Neural Networks and Its Applications
 Perceptron and Single Layer Neural Network, and Hand Calculations
  Learning in a Multi-Layered Neural Net: Back Propagation and Conjugate Gradient Techniques
 Neural Networks for Regression
 Neural Networks for Classification
  Interpretation of Outputs and Fine-tuning the Models with Hyperparameters
 Validating ANN models
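
The perceptron hand calculation mentioned in this module can be reproduced in code. The following sketch trains a single perceptron on the logical AND function using the classic error-correction rule. The learning rate of 1.0, the epoch count and the step activation are assumptions made for this illustration.

```python
def predict(weights, bias, x1, x2):
    """Step activation: output 1 when the weighted sum is positive."""
    return 1 if weights[0] * x1 + weights[1] * x2 + bias > 0 else 0


def train_perceptron(samples, lr=1.0, epochs=20):
    """Perceptron learning rule: nudge weights by lr * error * input."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - predict(weights, bias, x1, x2)
            weights[0] += lr * error * x1
            weights[1] += lr * error * x2
            bias += lr * error
    return weights, bias


# Truth table of logical AND (linearly separable, so the rule converges)
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
outputs = [predict(weights, bias, x1, x2) for (x1, x2), _ in and_gate]
print(weights, bias, outputs)
```

A single perceptron cannot learn a function such as XOR, which is one motivation for the multi-layer networks and back propagation covered next.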
4.18 Supervised Learning: Support Vector Machines
 Motivation for Support Vector Machine & Applications
 Support Vector Regression
 Support vector classifier (Linear & Non-Linear)
 Mathematical Intuition (Kernel Methods Revisited, Quadratic Optimization and Soft Constraints)
  Interpretation of Outputs and Fine-tuning the Models with Hyperparameters
 Validating SVM models
4.19 Supervised Learning: KNN
 What is KNN & Applications?
  KNN for missing value treatment
 KNN For solving regression problems
 KNN for solving classification problems
 Validating KNN model
 Model fine tuning with hyper parameters
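
As a rough sketch of KNN for classification, the example below labels a query point by majority vote among its k nearest training points. The training points, labels and k=3 are invented for illustration; `math.dist` requires Python 3.8 or later.

```python
from collections import Counter
from math import dist


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points closest to query."""
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical labelled points: (features, class)
train = [
    ((1.0, 1.0), "A"), ((1.0, 2.0), "A"), ((2.0, 1.0), "A"),
    ((6.0, 6.0), "B"), ((7.0, 6.0), "B"), ((6.0, 7.0), "B"),
]
near_a = knn_predict(train, (1.5, 1.5))
near_b = knn_predict(train, (6.5, 6.5))
print(near_a, near_b)
```

The value of k is the main hyperparameter: small values follow the training data closely, larger values smooth the decision boundary.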
4.20 Supervised Learning: Naïve Bayes
 Concept of Conditional Probability
 Bayes Theorem and Its Applications
 Naïve Bayes for classification
 Applications of Naïve Bayes in Classifications
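
Bayes' theorem, on which Naïve Bayes rests, can be checked with a short worked calculation. The probabilities below are made-up numbers for a spam example, not figures from the course.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """P(H | E) by Bayes' theorem for a hypothesis H and evidence E."""
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence


# Assumed values: P(spam) = 0.2, P(word | spam) = 0.6, P(word | not spam) = 0.05
p_spam_given_word = posterior(prior=0.2, likelihood=0.6, likelihood_given_not=0.05)
print(round(p_spam_given_word, 4))
```

Naïve Bayes applies the same rule with many words at once, multiplying the per-word likelihoods under the assumption that the words are conditionally independent given the class.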

4.21 Text Mining & Analytics
  Taming big text, Unstructured vs. Semi-structured Data; Fundamentals of information retrieval, Properties of words; Creating Term-Document (TxD) Matrices; Similarity measures, Low-level processes (Sentence Splitting; Tokenization; Part-of-Speech Tagging; Stemming; Chunking)
 Finding patterns in text: text mining, text as a graph
 Natural Language processing (NLP)
 Text Analytics – Sentiment Analysis using R
 Text Analytics – Word cloud analysis using R
 Text Analytics – Segmentation using K-Means/Hierarchical Clustering
 Text Analytics – Classification (Spam/Not spam)
 Applications of Social Media Analytics
 Metrics(Measures Actions) in social media analytics
 Examples & Actionable Insights using Social Media Analytics
  Important R packages for Machine Learning (caret, h2o, randomForest, nnet, tm, etc.)
  Fine-tuning the models using hyperparameters, grid search, piping, etc.

5 . AI With ML & DL

5.1 Introduction to Artificial Intelligence
  What is AI (Artificial Intelligence)?
  What types of intelligence are we talking about?
  Different definitions and the ultimate goal of AI
  What are the application areas for AI?
  History of AI and some real-life examples of AI

5.2 ML and other terms related to AI
  What is ML and how is it related to AI?
  What is NLP and how is it related to AI?
  What is DL and how is it related to ML and AI?
  What are ANNs and DNNs and how are they related to AI?
5.3 A working example of AI and ML
Project 1 – These simple tasks help you understand how AI and ML find applications in real life.
5.4 Python libraries for ML.
 What are Libraries, packages and Modules?
  What are the top Python libraries for ML?
5.5 Setting up Anaconda development environment.
  Why choose the Anaconda development environment?
 Setting up Anaconda development environment on Windows 10 PC.
 Verifying proper installation of Anaconda environment.
5.6 Getting into core development of ML.
 What is a classifier in ML?
 Important elements and flow of any ML projects.
 Let’s develop our first ML program – explanations
 Let’s develop our first ML program – development
Project 2 – These simple tasks give you hands-on experience with introductory Machine Learning programs, the “Hello world” programs of Machine Learning.
5.7 Different ML techniques.
  Which ML techniques are available?
  Evaluation methods for ML techniques
(IRIS flower project) Developing a complete ML project
 Developing complete ML project – understanding data set
 Developing complete ML project – understanding flow of project
  Developing complete ML project – visualizing data set through Python
 Developing complete ML project – development
 Developing complete ML project – concepts explanations
(Digit recognition project) Developing another ML project
Project 3 – After completing these projects, you will have built and understood multiple complete Machine Learning projects.
5.8 Introduction to AI with Deep Learning
 Installation
 CPU Software Requirements
 CPU Installation of PyTorch
 PyTorch with GPU on AWS
 PyTorch with GPU on Linux
 PyTorch with GPU on MacOSX
5.9 PyTorch Fundamentals: Matrices
 Matrix Basics
 Seed for Reproducibility
 Torch to NumPy Bridge
 NumPy to Torch Bridge
 GPU and CPU Toggling
 Basic Mathematical Tensor Operations
 Summary of Matrices
5.10 PyTorch Fundamentals: Variables and Gradients
 Variables
 Gradients
 Summary of Variables and Gradients
5.11 Linear Regression with PyTorch
 Linear Regression Introduction
 Linear Regression in PyTorch
 Linear Regression From CPU to GPU in PyTorch
 Summary of Linear Regression
5.12 Logistic Regression with PyTorch
  Logistic Regression Introduction
 Linear Regression Problems
 Logistic Regression In-depth
 Logistic Regression with PyTorch
 Logistic Regression From CPU to GPU in PyTorch
 Summary of Logistic Regression
5.13 Feedforward Neural Network with PyTorch
 Logistic Regression Transition to Feedforward Neural Network
 Non-linearity
 Feedforward Neural Network in PyTorch
 More Feedforward Neural Network Models in PyTorch
 Feedforward Neural Network From CPU to GPU in PyTorch
 Summary of Feedforward Neural Network
5.14 Convolutional Neural Network (CNN) with PyTorch
 Feedforward Neural Network Transition to CNN
 One Convolutional Layer, Input Depth of 1
 One Convolutional Layer, Input Depth of 3
 One Convolutional Layer Summary
 Multiple Convolutional Layers Overview
 Pooling Layers
 Padding for Convolutional Layers
 Output Size Calculation
 CNN in PyTorch
 More CNN Models in PyTorch
 CNN Models Summary
 Expanding Model’s Capacity
 CNN From CPU to GPU in PyTorch
 Summary of CNN
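
The output size calculation for convolution and pooling layers follows the standard formula O = floor((W - K + 2P) / S) + 1, where W is the input size, K the kernel size, P the padding and S the stride. The helper below is a small sketch of that arithmetic; the function name and the 28x28 input are illustrative assumptions.

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """Spatial output size of a convolution or pooling layer along one dimension."""
    return (input_size - kernel_size + 2 * padding) // stride + 1


# A 28x28 input through a 5x5 convolution without padding, then 2x2 max pooling
after_conv = conv_output_size(28, kernel_size=5)
after_pool = conv_output_size(after_conv, kernel_size=2, stride=2)
# The same convolution with padding of 2 preserves the input size
same_size = conv_output_size(28, kernel_size=5, padding=2)
print(after_conv, after_pool, same_size)
```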
5.15 Recurrent Neural Networks (RNN)
 Introduction to RNN
 RNN in PyTorch
 More RNN Models in PyTorch
 RNN From CPU to GPU in PyTorch
 Summary of RNN
5.16 Long Short-Term Memory Networks (LSTM)
 Introduction to LSTMs
  LSTM Equations
 LSTM in PyTorch
 More LSTM Models in PyTorch
 LSTM From CPU to GPU in PyTorch
 Summary of LSTM

6 Apache Spark and Scala
6.1 Introduction to Big Data Hadoop and Spark
Learning Objectives: Understand Big Data and its components such as HDFS. You
will learn about the Hadoop Cluster Architecture, get an introduction to Spark,
and understand the difference between batch processing and real-time
processing.
Topics:
 What is Big Data?
 Big Data Customer Scenarios
  Limitations and Solutions of Existing Data Analytics Architecture with Uber Use Case
 How Hadoop Solves the Big Data Problem?
 What is Hadoop?
 Hadoop’s Key Characteristics
 Hadoop Ecosystem and HDFS
 Hadoop Core Components
 Rack Awareness and Block Replication
 YARN and its Advantage
 Hadoop Cluster and its Architecture
 Hadoop: Different Cluster Modes
 Big Data Analytics with Batch & Real-time Processing
  Why is Spark needed?
  What is Spark?
  How does Spark differ from other frameworks?
 Spark at Yahoo!
6.2 Introduction to Scala for Apache Spark
Learning Objectives: Learn the basics of Scala that are required for programming
Spark applications. You will also learn about the basic constructs of Scala such as
variable types, control structures, collections such as Array, ArrayBuffer, Map, Lists,
and many more.
Topics:
 What is Scala?
 Why Scala for Spark?
 Scala in other Frameworks
 Introduction to Scala REPL
 Basic Scala Operations
 Variable Types in Scala
 Control Structures in Scala
 Foreach loop, Functions and Procedures
  Collections in Scala – Array, ArrayBuffer, Map, Tuples, Lists, and more
Hands-on:
 Scala REPL Detailed Demo
6.3 Functional Programming and OOPs Concepts in Scala
Learning Objectives: In this module, you will learn about object-oriented
programming and functional programming techniques in Scala.
Topics:
 Functional Programming
 Higher Order Functions
 Anonymous Functions
 Class in Scala
 Getters and Setters
 Custom Getters and Setters
 Properties with only Getters
 Auxiliary Constructor and Primary Constructor
  Singletons
 Extending a Class
 Overriding Methods
 Traits as Interfaces and Layered Traits

Hands-on:
 OOPs Concepts
 Functional Programming
6.4 Deep Dive into Apache Spark Framework
Learning Objectives: Understand Apache Spark and learn how to develop Spark
applications. At the end, you will learn how to perform data ingestion using Sqoop.
Topics:
 Spark’s Place in Hadoop Ecosystem
 Spark Components & its Architecture
 Spark Deployment Modes
 Introduction to Spark Shell
 Writing your first Spark Job Using SBT
 Submitting Spark Job
 Spark Web UI
 Data Ingestion using Sqoop

Hands-on:
 Building and Running Spark Application
 Spark Application Web UI
 Configuring Spark Properties
 Data ingestion using Sqoop
6.5 Playing with Spark RDDs
Learning Objectives: Get an insight into Spark RDDs and RDD-related
manipulations for implementing business logic (Transformations, Actions and
Functions performed on RDDs).
Topics:
 Challenges in Existing Computing Methods
 Probable Solution & How RDD Solves the Problem
  What is RDD, Its Operations, Transformations & Actions
 Data Loading and Saving Through RDDs
 Key-Value Pair RDDs
 Other Pair RDDs, Two Pair RDDs
 RDD Lineage
 RDD Persistence
 WordCount Program Using RDD Concepts
 RDD Partitioning & How It Helps Achieve Parallelization
 Passing Functions to Spark

Hands-on:
 Loading data in RDDs
 Saving data through RDDs
 RDD Transformations
 RDD Actions and Functions
 RDD Partitions
 WordCount through RDDs
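
The WordCount exercise usually follows a flatMap, map, reduceByKey pattern on an RDD. The sketch below mirrors those three steps in plain Python so that it runs without a Spark installation. It does not call any Spark API, and the input lines are invented.

```python
from collections import defaultdict

# Hypothetical input, standing in for the lines of a text file
lines = ["spark makes big data simple", "big data needs spark"]

# Step 1 (like flatMap): split every line into words and flatten the result
words = [word for line in lines for word in line.split()]

# Step 2 (like map): pair each word with a count of 1
pairs = [(word, 1) for word in words]

# Step 3 (like reduceByKey): add up the counts for each distinct word
counts = defaultdict(int)
for word, one in pairs:
    counts[word] += one

print(dict(counts))
```

In Spark the same steps are distributed across partitions, and nothing is computed until an action such as collecting or saving the result is triggered.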
6.6 Data Frames and Spark SQL
Learning Objectives: In this module, you will learn about Spark SQL, which is used
to process structured data with SQL queries, data frames and datasets, along
with the different kinds of SQL operations performed on data frames.
You will also learn about Spark and Hive integration.
Topics:
 Need for Spark SQL
 What is Spark SQL?
 Spark SQL Architecture
  SQL Context in Spark SQL
 User Defined Functions
 Data Frames & Datasets
 Interoperating with RDDs
 JSON and Parquet File Formats
 Loading Data through Different Sources
 Spark – Hive Integration

Hands-on:
 Spark SQL – Creating Data Frames
 Loading and Transforming Data through Different Sources
 Stock Market Analysis
 Spark-Hive Integration
6.7 Machine Learning using Spark MLlib
Learning Objectives: Learn why machine learning is needed, different Machine
Learning techniques/algorithms, and Spark MLlib.
Topics:
 Why Machine Learning?
 What is Machine Learning?
  Where is Machine Learning Used?
 Face Detection: USE CASE
 Different Types of Machine Learning Techniques
 Introduction to MLlib
 Features of MLlib and MLlib Tools
 Various ML algorithms supported by MLlib
6.8 Deep Dive into Spark MLlib
Learning Objectives: Implement various algorithms supported by MLlib such as
Linear Regression, Decision Tree, Random Forest and many more.
Topics:
  Supervised Learning – Linear Regression, Logistic Regression, Decision Tree, Random Forest
 Unsupervised Learning – K-Means Clustering & How It Works with MLlib
 Analysis on US Election Data using MLlib (K-Means)

Hands-on:
 Machine Learning MLlib
  K-Means Clustering
 Linear Regression
 Logistic Regression
 Decision Tree
 Random Forest
6.9 Understanding Apache Kafka and Apache Flume
Learning Objectives: Understand Kafka and its architecture. Also, learn about
Kafka Clusters and how to configure different types of Kafka Cluster. Get introduced to
Apache Flume, its architecture and how it is integrated with Apache Kafka for event
processing. At the end, learn how to ingest streaming data using Flume.
Topics:
 Need for Kafka
 What is Kafka?
 Core Concepts of Kafka
 Kafka Architecture
 Where is Kafka Used?
 Understanding the Components of Kafka Cluster
 Configuring Kafka Cluster
 Kafka Producer and Consumer Java API
 Need of Apache Flume
 What is Apache Flume?
 Basic Flume Architecture
 Flume Sources
 Flume Sinks
  Flume Channels
 Flume Configuration
 Integrating Apache Flume and Apache Kafka

Hands-on:
 Configuring Single Node Single Broker Cluster
 Configuring Single Node Multi Broker Cluster
 Producing and consuming messages
 Flume Commands
 Setting up Flume Agent
 Streaming Twitter Data into HDFS
6.10 Apache Spark Streaming – Processing Multiple Batches
Learning Objectives: Work on Spark streaming which is used to build scalable
fault-tolerant streaming applications. Also, learn about DStreams and various
Transformations performed on the streaming data. You will get to know about
commonly used streaming operators such as Sliding Window Operators and
Stateful Operators.
Topics:
 Drawbacks in Existing Computing Methods
 Why Streaming is Necessary?
 What is Spark Streaming?
 Spark Streaming Features
 Spark Streaming Workflow
 How Uber Uses Streaming Data
 Streaming Context & DStreams
 Transformations on DStreams
  Windowed Operators and Why They are Useful
 Important Windowed Operators
 Slice, Window and ReduceByWindow Operators
 Stateful Operators
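
The idea behind windowed operators is to apply a reduction over the last few micro-batches and to repeat it every slide interval. The plain-Python sketch below models this with batch counts instead of time durations. It does not use the Spark Streaming API, and the function name and data are assumptions for illustration.

```python
from functools import reduce


def reduce_by_window(batches, window_length, slide_interval, func):
    """Reduce the elements of the last window_length batches, every slide_interval batches."""
    results = []
    for end in range(window_length, len(batches) + 1, slide_interval):
        # Flatten the batches that fall inside the current window
        window = [x for batch in batches[end - window_length:end] for x in batch]
        if window:
            results.append(reduce(func, window))
    return results


# Five hypothetical micro-batches of numeric events
batches = [[1, 2], [3], [4, 5], [6], [7, 8]]
window_sums = reduce_by_window(batches, window_length=3, slide_interval=2,
                               func=lambda a, b: a + b)
print(window_sums)
```

Because consecutive windows overlap, the same event can contribute to more than one result, which is what distinguishes a sliding window from simple per-batch processing.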
6.11 Apache Spark Streaming – Data Sources
Learning Objectives: In this module, you will learn about the different streaming
data sources such as Kafka and Flume. At the end of the module, you will be able to
create a Spark Streaming application.
Topics:
 Apache Spark Streaming: Data Sources
 Streaming Data Source Overview
 Apache Flume and Apache Kafka Data Sources
 Example: Using a Kafka Direct Data Source
  Perform Twitter Sentiment Analysis Using Spark Streaming

Hands-on:
 Different Streaming Data Sources

7 Data Visualization And Analytics – Tableau

Audience:

This course is designed to provide you with the skills required to become a
Tableau power user. The course is designed for the professional who has solid working
experience with Tableau and wants to take it to the next level. You should have a deep
understanding of all the fundamental concepts of building worksheets and dashboards, but
may scratch your head when working with more complex issues.
7.1 Learning Objectives: At the end of this class, the student will be able to:
 Build advanced chart types and visualizations
 Build complex calculations to manipulate your data
 Work with statistics and statistical techniques
  Work with parameters and input controls
  Implement advanced geographic mapping techniques and use custom images to build spatial visualizations of non-geographic data
 Implement all options in working with data: Joining multiple tables, data blending,
performance considerations and working with the Data Engine, and understand when to
implement which connection method.
 Build better dashboards using techniques for guided analytics, interactive dashboard
design and visual best practices
 Implement many efficiency tips and tricks
 Understand the basics of Tableau Server and other options for sharing your results

7.2 Introduction and Getting Started
 Why Tableau? Why Visualization?
 The Tableau Product Line
 Level Setting – Terminology
 Getting Started – creating some powerful visualizations quickly
 Review of some Key Fundamental Concepts
7.3 Filtering, Sorting & Grouping
Filtering, Sorting and Grouping are fundamental concepts
when working with and analyzing data. We will briefly review these topics as they apply to
Tableau
 Advanced options for filtering and hiding
 Understanding your many options for ordering and grouping your data: Sort, Groups, Bins,
Sets
 Understanding how all of these options inter-relate
7.4 Working with Data – In the Advanced class, we will understand the difference between
joining and blending data, and when we should do each. We will also consider the implications
of working with large data sets, and consider options for when and how to work with extracts
and the data engine. We will also investigate best practices in “sharing” data sources for
Tableau Server users.
 Data Types and Roles
 Dimension versus Measures
 Data Types
 Discrete versus Continuous
 The meaning of pill colors
 Database Joins
  Data Blending
 Working with the Data Engine / Extracts and scheduling extract updates
 Working with Custom SQL
 Adding to Context
 Switching to Direct Connection
7.5 Working with Calculated Data and Statistics – In the Fundamentals class, we were
introduced to some basic calculations: basic string and arithmetic calculations and ratios and
quick table calculations. In the Advanced class, we will extend those concepts to understand
the intricacies of manipulating data within Tableau
7.6 A Quick Review of Basic Calculations
Arithmetic Calculations
String Manipulation
Date Calculations
Quick Table Calculations
Custom Aggregations
Custom Calculated Fields
Logic and Conditional Calculations
Conditional Filters
7.7 Advanced Table Calculations
Understanding Scope and Direction
Calculate on Results of Table Calculations
Complex Calculations
Difference From Average
Discrete Aggregations
Index to Ratios
7.8 Working with Parameters – In the Fundamentals class, we were introduced to
parameters – How to create a parameter and use it in a calculation. In the Advanced class, we
will go into more details on how we can use parameters to modify our title, create What-If
analysis, etc
Parameter Basics
Data types of parameters
Using parameters in calculated fields
Inputting parameter values and parameter control options
Advanced Usage of Parameters
Using parameters for titles, field selections, logic statements, Top X

7.9 Building Advanced Chart Types and Visualizations / Tips & Tricks – This topic
covers how to create some of the chart types and visualizations that may be less obvious in
Tableau. It also covers some of the more common tips & tricks / techniques that we use to
assist customers in solving some of their more complex problems.
 Bar in Bar
 Box Plot
 Bullet Chart
 Custom Shapes
 Gantt Chart
 Heat Map
 Pareto Chart
 Spark Line
 KPI Chart
7.10 Best Practices in Formatting and Visualizing
 Formatting Tips
 Drag to Legend
 Edit Legend
 Highlighting
 Labeling
 Legends
 Working with Nulls
 Table Options
 Annotations and Display Options
 Introduction to Visualization Best Practices

7.11 Introduction to Excel Data Handling
  Introduction to Excel Environment
 Formatting and Conditional Formatting
 Data Sorting, Filtering and Data Validation
 Understanding Name Ranges
7.12 Data Manipulation Using Functions
 Descriptive functions: sum, count, min, max, average, counta, countblank
 Logical functions: IF, and, or, not
  Relational operators: >, >=, <, <=, =, <>
 Nesting of functions
 Date and Time functions: today, now, month, year, day, weekday, networkdays, weeknum, time,
minute, hour
 Text functions: left, right, mid, find, length, replace, substitute, trim, rank, rank.avg, upper, lower,
proper
 Array functions: sumif, sumifs, countif, countifs, sumproduct
 Use and application of lookup functions in excel: Vlookup, Hlookup
 Limitations of lookup functions
 Using Index, Match, Offset, concept of reverse vlookup
7.13 Data Analysis And Reporting
 Data Analysis using Pivot Tables – use of row and column shelf, values and filters
 Difference between data layering and cross tabulation, summary reports, advantages and limitations
 Change aggregation types and summarisation
 Creating groups and bins in pivot data
 Concept of calculated fields, usage and limitations
 Changing report layouts – Outline, compact and tabular forms
 Show and hide grand totals and subtotals
 Creating summary reports using pivot tables
7.14 Data Visualization In Excel
  Overview of chart types – column and bar charts, line and area charts, pie charts, doughnut charts, scatter plots
 How to select right chart for your data
 Chart formatting
  Creating and customizing advanced charts – thermometer charts, waterfall charts, population pyramids
7.15 Overview Of Dashboards
  What is a dashboard? What is an Excel dashboard?
 Adding icons and images to dashboards
 Making dashboards dynamic
7.16 Create Dashboards In Excel – Using Pivot Controls
 Concept of pivot cache and its use in creating interactive dashboards in Excel
 Pivot table design elements – concept of slicers and timelines
 Designing sample dashboard using Pivot Controls
  Design principles for including charts in dashboards – dos and don’ts
7.17 Business Dashboard Creation
 Complete Management Dashboard for Sales & Services
 Best practices – Tips and Tricks to enhance dashboard designing
7.18 SQL: Understanding RDBMS
 Schema – Meta Data – ER Diagram
 Looking at an example of Database design
 Data Integrity Constraints & types of Relationships (Primary and foreign key)
 Basic concepts – Queries, Data types & NULL Values, Operators and Comments in SQL
7.19 SQL: Utilising The Object Explorer
 What is SQL – A Quick Introduction
 Installing MS SQL Server for windows OS
  Introduction to SQL Server Management Studio
 Understanding basic database concepts
7.20 SQL: Database Object Creation (DDL Commands)
 Creating, Modifying & Deleting Databases and Tables
 Drop & Truncate statements – Uses & Differences
 Alter Table & Alter Column statements
 Import and Export wizard to get the data in SQL server from excel files or delimited files

7.21 SQL: Data Manipulation (DML Commands)
 Insert, Update & Delete statements
  Select statement – Subsetting, Filters, Sorting, Removing Duplicates, Grouping and Aggregations, etc.
 Where, Group By, Order by & Having clauses
 SQL Functions – Number, Text, Date, etc
 SQL Keywords – Top, Distinct, Null, etc
 SQL Operators –  Relational (single valued and multi valued), Logical (and, or, not), Use of wildcard
operators and wildcard characters, etc
7.22 SQL: Accessing Data From Multiple Tables Using SELECT
  Append and Joins
  Union and Union All – Use & constraints
 Intersect and Except statements
 Table Joins – inner join, left join, right join, full join
 Cross joins/cartesian products, self joins, natural joins etc
 Inline views and sub-queries
 Optimizing your work
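
The difference between an inner join and a left join is easiest to see on a tiny data set. The sketch below uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for MS SQL Server. The `customers` and `orders` tables and their rows are hypothetical, and the SQL shown is standard enough to run on both engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi'), (3, 'Meera');
INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

# Inner join: only customers that have at least one matching order
inner_rows = conn.execute("""
    SELECT c.name, SUM(o.amount)
    FROM customers c
    INNER JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

# Left join: every customer, with an order count of 0 when nothing matches
left_rows = conn.execute("""
    SELECT c.name, COUNT(o.id)
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

conn.close()
print(inner_rows)
print(left_rows)
```

The customer without orders disappears from the inner join but is kept by the left join, where `COUNT(o.id)` ignores the NULL produced for the missing match.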

7.23 Tableau: Getting Started
  What is Tableau? What does the Tableau product suite comprise? How does Tableau work?
 Tableau Architecture
 Connecting to Data & Introduction to data source concepts
 Understanding the Tableau workspace
 Dimensions and Measures
 Data Types & Default Properties
 Tour of Shelves & Marks Card
 Using Show Me
 Saving and Sharing your work-overview
7.24 Tableau: Data Handling & Summaries
 Date Aggregations and Date parts
 Cross tab & Tabular charts
 Totals & Subtotals
 Bar Charts & Stacked Bars
 Line Graphs with Date & Without Date
 Tree maps
 Scatter Plots
 Individual Axes, Blended Axes, Dual Axes & Combination chart
 Parts of Views
 Sorting
 Trend lines/ Forecasting
 Reference Lines
 Filters/Context filters
 Sets
o In/Out Sets
o Combined Sets
 Grouping
 Bins/Histograms
 Drilling up/down – drill through
 Hierarchies
  View data
 Actions (across sheets)
7.25 Tableau: Building Advanced Reports/ Maps
 Explain latitude and longitude
 Default location/Edit locations
 Building geographical maps
 Using Map layers
7.26 Tableau: Calculated Fields
 Working with aggregate versus disaggregate data
 Explain – #Number of Rows
 Basic Functions (String, Date, Numbers etc)
 Usage of Logical conditions
7.27 Tableau: Table Calculations
 Explain scope and direction
 Percent of Total, Running / Cumulative calculations
 Introduction to LOD (Level of Detail) Expressions
 User applications of Table calculations
7.28 Tableau: Parameters
 Using Parameters in
o Calculated fields
o Bins
o Reference Lines
o Filters/Sets
 Display Options (Dynamic Dimension/Measure Selection)
 Create What-If/ Scenario analysis
7.29 Tableau: Building Interactive Dashboards
  Combining multiple visualizations into a dashboard (overview)
 Making your worksheet interactive by using actions
o Filter
o URL
o Highlight
 Complete Interactive Dashboard for Sales & Services
7.30 Tableau: Formatting
 Options in Formatting your Visualization
 Working with Labels and Annotations
 Effective Use of Titles and Captions
7.31 Tableau: Working With Data
 Multiple Table Joins
 Data Blending
 Difference between joining and blending data, and when we should do each
  Toggle between Direct Connection and Extracts

7.32 MS VBA
 Introducing VBA
 What is Logic?
 What Is VBA?
 Introduction to Macro Recordings, IDE
 How VBA Works with Excel
 Working In the Visual Basic Editor
 Introducing the Excel Object Model
 Using the Excel Macro Recorder
 VBA Sub and Function Procedures
  Key Components of Programming language
 Essential VBA Language Elements
 Keywords & Syntax
 Programming statements
 Variables & Data types
 Comments
 Operators
 Working with Range Objects
 A look at some commonly used code snippets
 Programming constructs in VBA
 Control Structures
 Looping Structures
  The With...End With Block
 Functions & Procedures in VBA – Modularizing your programs
 Worksheet & workbook functions
 Automatic Procedures and Events
 Arrays
 Objects & Memory Management in VBA
  The New and Set Keywords
 Destroying Objects – The Nothing Keyword
 Error Handling
 Controlling accessibility of your code – Access specifiers
 Code Reusability – Adding references and components to your code
 Communicating with Your Users
 Simple Dialog Boxes
 User Form Basics
  Using User Form Controls
 Add-ins
 Accessing Your Macros through the User Interface
 Retrieve information through Excel from Access Database using VBA
