Data Science

(Data Analytics / Artificial Intelligence / Machine Learning / GenAI)

DURATION
8 Months

MODE OF TRAINING
Online/Offline

LEVEL
Advanced

Data Science

This Data Science course provides comprehensive training in extracting insights from complex data, preparing participants for one of the most in-demand careers of the 21st century.
It introduces foundational languages such as Python, R, and SQL, which are essential for data manipulation and analysis.
Participants learn statistical methods, probability, and hypothesis testing to understand data patterns and trends. The curriculum also covers machine learning techniques, including supervised and unsupervised learning, enabling students to build predictive and classification models.

Learners gain expertise in data preprocessing, cleaning, and feature engineering, which are critical for handling real-world datasets.
Tools like Pandas, NumPy, and Scikit-learn are used extensively to perform data analysis.
Visualization tools such as Matplotlib, Seaborn, Tableau, and Power BI help present insights through dashboards and graphs.
The course also includes big data frameworks such as Hadoop and Spark, equipping students to work with large-scale data in distributed environments.

Advanced topics like deep learning, neural networks, and natural language processing (NLP) are integral parts of the curriculum, enabling learners to solve complex problems like image recognition, text generation, and sentiment analysis.
Frameworks such as TensorFlow, Keras, and PyTorch are used to implement AI models.
Real-world case studies and capstone projects provide hands-on experience, allowing learners to apply their knowledge in areas like fraud detection, recommendation systems, and market analysis.

The course also emphasizes practical applications across various industries, including healthcare, finance, e-commerce, and entertainment.
Participants explore concepts like big data analytics, cloud computing, and data security to stay aligned with industry trends.
Additionally, modules on statistical modeling, A/B testing, and advanced regression analysis prepare learners for strategic decision-making roles.

By the end of the course, students are equipped to handle roles such as data scientist, machine learning engineer, and data analyst.
With growing demand across industries, this course is ideal for individuals with a strong foundation in mathematics, statistics, and logical thinking, who are eager to build a lucrative career in data science.

Data Science Course Curriculum

Module 1: Python Core and Advanced
    • What is Python?
    • Why does Data Science require Python?
    • Installation of Anaconda
    • Understanding Jupyter Notebook
    • Basic commands in Jupyter Notebook
    • Understanding Python Syntax
    • Data Types and Data Structures
  • Variables and Strings
  • Lists, Sets, Tuples, and Dictionaries
  • Control Flow and Conditional Statements
    • Conditional Operators, Arithmetic Operators, and Logical Operators
    • If, Elif and Else Statements
    • While Loops
    • For Loops
    • Nested Loops and List and Dictionary Comprehensions
  • Functions
    • What is a Function and Types of Functions
    • Code Optimization and Function Arguments
    • Scope
    • Lambda Functions
    • Map, Filter, and Reduce
  • File Handling
    • Creating, Reading, and Writing Files and Other File Handling Operations
    • Errors and Exception Handling
  • Class and Objects
    • Create a class
    • Create an object
    • The __init__() Method
    • Modifying Objects
    • Object Methods
    • Self
    • Modify the Object Properties
    • Delete Object
    • The pass Statement
    • Modular Case Study: 1
    • Formative Assessment: 1
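
To give a flavour of the Module 1 topics listed above, here is a minimal sketch that combines a class with `__init__()`, an object method, a dictionary comprehension, a lambda, and `map`, `filter`, and `reduce`. The `Student` class, the names, and the scores are hypothetical and chosen only for illustration; this is not taken from the course material.

```python
from functools import reduce


class Student:
    """Hypothetical class used only for illustration."""

    def __init__(self, name, scores):
        # __init__ runs when an object is created; self refers to that object
        self.name = name
        self.scores = scores

    def average(self):
        # An object method that works on the data stored in self
        return sum(self.scores) / len(self.scores)


students = [
    Student("Asha", [82, 91, 77]),
    Student("Ravi", [58, 64, 70]),
    Student("Meera", [95, 88, 92]),
]

# Dictionary comprehension: map each name to its average score
averages = {s.name: s.average() for s in students}

# Lambda with filter: keep students whose average is at least 75
passed = list(filter(lambda s: s.average() >= 75, students))

# Map: extract only the names of the students who passed
passed_names = list(map(lambda s: s.name, passed))

# Reduce: fold every score of every student into one grand total
total = reduce(lambda acc, s: acc + sum(s.scores), students, 0)

print(averages)
print(passed_names, total)
```

The same results could be produced with plain loops; the comprehension and functional forms are shown because they appear as separate topics in the module.
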
Module 2: Exploratory Data Analysis using Python
    • NumPy – Numerical Python
      • Introduction to Array
      • Creation and Printing of an Array
      • Basic Operations in NumPy
      • Indexing
      • Mathematical Functions of NumPy
    • Data Manipulation with Pandas
      • Series and DataFrames
      • Data Importing and Exporting through Excel, CSV Files
      • Data Understanding Operations
      • Indexing, Slicing, and Filtering with Conditional Slicing
      • Group By, Pivot Table, and Cross Tab
      • Concatenating, Merging, and Joining
      • Descriptive Statistics
      • Removing Duplicates
      • String Manipulation
      • Missing Data Handling
    • Data Visualization
      • Data Visualization using Matplotlib and Pandas
      • Introduction to Matplotlib
      • Basic Plotting
      • Properties of Plotting
      • About Subplots
      • Line Plots
      • Pie Chart and Bar Graph
      • Histograms
      • Box and Violin Plots
      • Scatterplot
    • Case Study on Exploratory Data Analysis (EDA) and Visualizations
      • What is EDA?
      • Univariate Analysis
      • Bivariate Analysis
      • Seaborn-based Plotting (Pair Plots, Catplot, Heatmaps, Count Plot) Along with Matplotlib Plots
    • Unstructured Data Processing
      • Regular Expressions
      • Structured Data and Unstructured Data
      • Literals and Meta Characters
      • Using Regular Expressions with Pandas
      • Inbuilt Methods
      • Pattern Matching
    • Project on Web Scraping: Data Mining and Exploratory Data Analysis
      • Data Mining (Web Scraping)
      • Project Includes:
        • Collection of Raw Data from Different Sources
        • Conversion of Unstructured Data to a Structured Format
        • Application of Machine Learning and NLP Models
      • Main Steps of the Data Science Life Cycle:
        • Data Collection
        • Data Mining
        • Data Preprocessing
        • Data Visualization
      • Examples: Text, CSV, TSV, Excel Files, Matrices, Images
      • Modular Case Study: 2
      • Formative Assessment: 2
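
The step "Conversion of Unstructured Data to a Structured Format" in Module 2 can be sketched with regular expressions alone. The example below is an illustration under stated assumptions: the log lines, field names, and pattern are hypothetical, and it uses only the standard library, whereas the module itself would typically load the resulting records into a Pandas DataFrame for grouping.

```python
import re

# Hypothetical raw lines standing in for scraped or otherwise unstructured text
raw_lines = [
    "2024-01-15 order=1001 amount=250.50 city=Pune",
    "2024-01-16 order=1002 amount=99.00 city=Mumbai",
    "malformed line without fields",
    "2024-01-17 order=1003 amount=410.25 city=Pune",
]

# Literals (order=, amount=) combined with metacharacters (\d, +, \.) and named groups
pattern = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) order=(?P<order>\d+) "
    r"amount=(?P<amount>\d+\.\d+) city=(?P<city>\w+)"
)

records = []
for line in raw_lines:
    match = pattern.fullmatch(line)
    if match is None:
        continue  # skip lines that do not fit the expected structure
    row = match.groupdict()
    row["order"] = int(row["order"])      # convert text fields to numeric types
    row["amount"] = float(row["amount"])
    records.append(row)

# A simple group-by: total amount per city
totals = {}
for row in records:
    totals[row["city"]] = totals.get(row["city"], 0.0) + row["amount"]

print(records)
print(totals)
```

Lines that do not match the pattern are dropped here for brevity; in a real project they would usually be logged and inspected as part of data cleaning.
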
Module 3: Advanced Statistics
    • Data Types and Data Structures
    • Statistics in Data Science
      • What is Statistics?
      • How is Statistics used in Data Science?
      • Population and Sample
      • Parameter and Statistic
      • Variable and its Types
      • Data Gathering Techniques
    • Data Types
    • Data Collection Techniques
    • Sampling Techniques
      • Convenience Sampling
      • Simple Random Sampling
      • Stratified Sampling
      • Systematic Sampling
      • Cluster Sampling
    • Descriptive Statistics
      • What are Univariate and Bivariate Analysis?
      • Measures of Central Tendencies
      • Measures of Dispersion
      • Skewness and Kurtosis
      • Box Plots and Outlier Detection
      • Covariance and Correlation
    • Probability Distribution
      • Probability and Limitations
      • Discrete Probability Distributions
        • Bernoulli Distribution
        • Binomial Distribution
        • Poisson Distribution
      • Continuous Probability Distributions
        • Normal Distribution
        • Standard Normal Distribution
    • Inferential Statistics
      • Sampling Variability and Central Limit Theorem
      • Confidence Intervals
      • Hypothesis Testing
        • Z-test
        • T-test
        • Chi-Square Test
        • F-Test and ANOVA
        • Modular Case Study: 3
        • Formative Assessment: 3
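
As a rough illustration of how the descriptive and inferential topics in Module 3 fit together, the sketch below computes measures of central tendency and dispersion, then runs a one-sample Z-test and builds a 95% confidence interval. The sample values, the hypothesised mean `mu0`, and the assumption that the population standard deviation `sigma` is known are all hypothetical.

```python
import math
import statistics

# Hypothetical sample: delivery times in minutes
sample = [32, 35, 30, 38, 41, 29, 36, 34, 33, 37]

# Descriptive statistics
mean = statistics.mean(sample)
median = statistics.median(sample)
stdev = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# One-sample Z-test against a hypothesised population mean.
# Assumption: the population standard deviation is known (sigma = 4).
mu0 = 32
sigma = 4
standard_error = sigma / math.sqrt(len(sample))
z = (mean - mu0) / standard_error

# Two-sided p-value from the standard normal distribution
standard_normal = statistics.NormalDist()
p_value = 2 * (1 - standard_normal.cdf(abs(z)))

# 95% confidence interval for the mean under the same assumption
z_crit = standard_normal.inv_cdf(0.975)
ci = (mean - z_crit * standard_error, mean + z_crit * standard_error)

print(mean, median, round(stdev, 3))
print(round(z, 3), round(p_value, 4), ci)
```

When the population standard deviation is unknown and the sample is small, a T-test would normally replace the Z-test shown here.
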
Module 4: SQL for Data Analysis
    • Introduction to Databases
    • Basics of SQL
      • DML, DDL, DCL, and Data Types
      • Common SQL Commands: SELECT, FROM, and WHERE
    • Logical Operators in SQL
    • SQL Joins
      • INNER and OUTER Joins to Combine Data from Multiple Tables
      • RIGHT and LEFT Joins to Combine Data from Multiple Tables
    • Filtering and Sorting
      • Advanced Filtering Using IN, OR, and NOT
      • Grouping with GROUP BY and Sorting with ORDER BY
    • SQL Aggregations
      • Common Aggregations: COUNT, SUM, MIN, and MAX
      • CASE and DATE Functions
      • Working with NULL Values
    • Subqueries and Temp Tables
      • Subqueries to Run Multiple Queries Together
      • Temp Tables to Access a Table with More Than One Query
    • SQL Data Cleaning
      • Perform Data Cleaning Using SQL
      • Modular Case Study: 4
      • Formative Assessment: 4
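
Several Module 4 topics (DDL and DML, a LEFT join, aggregations, CASE, NULL handling, GROUP BY, and ORDER BY) can be tried out together from Python with the built-in `sqlite3` module. This is a minimal sketch: the `customers` and `orders` tables and their rows are hypothetical, and SQLite is used only because it needs no server; the course may use a different database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
cur = conn.cursor()

# DDL: define two hypothetical tables
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)")

# DML: insert sample rows (one customer has a NULL city and no orders)
cur.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Asha", "Pune"), (2, "Ravi", "Mumbai"), (3, "Meera", None)],
)
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 250.0), (2, 1, 100.0), (3, 2, 75.0)],
)

query = """
SELECT c.name,
       COALESCE(c.city, 'Unknown')  AS city,          -- replace NULL with a label
       COUNT(o.id)                  AS order_count,   -- counts only matched orders
       COALESCE(SUM(o.amount), 0)   AS total_amount,
       CASE WHEN COALESCE(SUM(o.amount), 0) >= 200
            THEN 'high' ELSE 'low' END AS tier
FROM customers AS c
LEFT JOIN orders AS o ON o.customer_id = c.id         -- keep customers without orders
GROUP BY c.id, c.name, c.city
ORDER BY total_amount DESC, c.name
"""
rows = cur.execute(query).fetchall()
conn.close()

for row in rows:
    print(row)
```

Swapping `LEFT JOIN` for `INNER JOIN` in this query would drop the customer who has no orders, which is a quick way to see the difference between the two join types.
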
Module 5: Machine Learning (Supervised Learning)
    • Introduction
      • What is Machine Learning?
      • Supervised Versus Unsupervised Learning
      • Regression Versus Classification Problems
      • Assessing Model Accuracy
    • Regression Techniques
      • Linear Regression
        • Simple Linear Regression:
          • Estimating the Coefficients
          • Assessing the Coefficient Estimates
          • R Squared and Adjusted R Squared
          • MSE and RMSE
        • Multiple Linear Regression:
          • Estimating the Regression Coefficients
          • OLS Assumptions
          • Multicollinearity
          • Feature Selection
          • Gradient Descent
        • Evaluating the Metrics of Regression Techniques:
          • Homoscedasticity and Heteroscedasticity of Error Terms
          • Residual Analysis
          • Q-Q Plot
          • Cook’s Distance and Shapiro-Wilk Test
          • Identifying the Line of Best Fit
        • Other Considerations in the Regression Model:
          • Qualitative Predictors
          • Interaction Terms
          • Non-linear Transformations of the Predictors
        • Polynomial Regression:
          • Why Polynomial Regression?
          • Creating Polynomial Linear Regression
          • Evaluating the Metrics
        • Regularization Techniques:
          • Lasso Regularization
          • Ridge Regularization
          • ElasticNet Regularization
        • Case Study on Simple Linear, Multiple Linear, and Polynomial Regression Using Python
      • Capstone Project: A use-case project that exercises Data Understanding, EDA, Data Processing, and Regression Techniques
    • Classification Techniques
      • Logistic Regression
        • An Overview of Classification
        • Difference Between Regression and Classification Models
        • Why Not Linear Regression?
        • Logistic Regression:
          • The Logistic Model
          • Estimating the Regression Coefficients and Making Predictions
          • Logit and Sigmoid Functions
          • Setting the Threshold and Understanding Decision Boundary
          • Logistic Regression for More Than Two Response Classes
        • Evaluation Metrics for Classification Models:
          • Confusion Matrix
          • Accuracy and Error Rate
          • TPR and FPR
          • Precision and Recall, F1 Score
          • AUC-ROC
          • Kappa Score
      • Naive Bayes
        • Principle of Naive Bayes Classifier
        • Bayes Theorem
        • Terminology in Naive Bayes:
          • Posterior Probability
          • Prior Probability of Class
          • Likelihood
        • Types of Naive Bayes Classifier:
          • Multinomial Naive Bayes
          • Bernoulli Naive Bayes
          • Gaussian Naive Bayes
      • Tree-Based Models
        • Decision Trees:
          • Basic Terminology in Decision Tree
          • Root Node and Terminal Node
          • Regression Trees and Classification Trees
          • Trees Versus Linear Models
          • Advantages and Disadvantages of Trees
          • Gini Index
          • Overfitting and Pruning
          • Stopping Criteria
          • Accuracy Estimation Using Decision Trees
        • Case Study: Decision Tree Using Python
        • Resampling Methods:
          • Cross-Validation
          • The Validation Set Approach
          • Leave-One-Out Cross-Validation
          • K-Fold Cross-Validation
          • Bias-Variance Trade-Off for K-Fold Cross-Validation
        • Ensemble Methods in Tree-Based Models:
          • What is Ensemble Learning?
          • Bootstrap Aggregation Classifiers
          • Random Forest:
            • What is it and How Does it Work?
            • Variable Selection Using Random Forest
          • Boosting:
            • AdaBoost
            • Gradient Boosting
            • Hyperparameter Tuning
            • Pros and Cons
        • Case Study: Random Forest Techniques Using Python
      • Distance-Based Models
        • K-Nearest Neighbors:
          • K-Nearest Neighbor Algorithm
          • Eager vs Lazy Learners
          • How the KNN Algorithm Works
          • Deciding the Number of Neighbors in KNN
          • Curse of Dimensionality
          • Pros and Cons of KNN
          • Improving KNN Performance
        • Case Study: KNN Using Python
        • Support Vector Machines:
          • The Maximal Margin Classifier
          • Hyperplane
          • Support Vector Classifiers and Support Vector Machines
          • Hard and Soft Margin Classification
          • Classification with Non-linear Decision Boundaries
          • Kernel Trick (Polynomial and Radial)
          • Tuning Hyperparameters for SVM (Gamma, Cost, and Epsilon)
          • SVMs with More Than Two Classes
        • Case Study: SVM Using Python
      • Capstone Project: A use-case project that exercises Data Understanding, EDA, Data Processing, and Classification Techniques
      • Modular Case Study: 5
      • Formative Assessment: 5
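
Two recurring ideas from Module 5, estimating simple linear regression coefficients and evaluating a classifier from its confusion matrix, can be worked through by hand in a few lines. The sketch below uses the closed-form ordinary least squares formulas and hypothetical data; in the course these steps would normally be done with Scikit-learn rather than written out like this.

```python
import math

# --- Simple linear regression on hypothetical data ---
x = [1.0, 2.0, 3.0, 4.0, 5.0]        # e.g. years of experience
y = [30.0, 35.0, 41.0, 44.0, 50.0]   # e.g. salary in thousands

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Closed-form OLS estimates of the slope and intercept
sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
sxx = sum((xi - x_mean) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_mean - slope * x_mean

# Residuals, R-squared, MSE, and RMSE
predictions = [intercept + slope * xi for xi in x]
ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, predictions))
ss_tot = sum((yi - y_mean) ** 2 for yi in y)
r_squared = 1 - ss_res / ss_tot
mse = ss_res / n
rmse = math.sqrt(mse)

# --- Classification metrics from hypothetical labels ---
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Confusion matrix cells
tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)              # also called the true positive rate (TPR)
f1 = 2 * precision * recall / (precision + recall)

print(slope, intercept, r_squared, rmse)
print(tp, tn, fp, fn, accuracy, precision, recall, f1)
```
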
Module 6: Machine Learning (Unsupervised Learning)
    • Introduction
      • Why Unsupervised Learning?
      • How It Differs from Supervised Learning
      • The Challenges of Unsupervised Learning
    • Principal Components Analysis (PCA)
      • Introduction to Dimensionality Reduction and Its Necessity
      • What Are Principal Components?
      • Demonstration of 2D PCA and 3D PCA
      • Eigenvalues, Eigenvectors, and Orthogonality
      • Projecting the Data onto Eigenvectors to Form a New Data Set
      • Proportion of Variance Explained in PCA
      • Case Study: PCA Using Python
    • K-Means Clustering
      • Centroids and Medoids
      • Deciding the Optimal Value of ‘K’ Using the Elbow Method
    • Hierarchical Clustering
      • Divisive and Agglomerative Clustering
      • Linkage Methods
      • Dendrograms and Their Interpretation
      • Applications of Clustering
      • Practical Issues in Clustering
      • Case Study: Clustering Using Python
    • Capstone Project
      • A use-case project that exercises Data Understanding, EDA, Data Processing, and Unsupervised Algorithms
    • Recommendation Systems
      • What Are Recommendation Engines?
      • How Does a Recommendation Engine Work?
      • Data Collection
      • Data Storage
      • Filtering the Data
      • Content-Based Filtering
      • Collaborative Filtering
      • Cold Start Problem
      • Matrix Factorization
      • Building a Recommendation Engine Using Matrix Factorization
      • Case Study
      • Modular Case Study: 6
      • Formative Assessment: 6
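
The K-Means topic in Module 6 comes down to two alternating steps: assign each point to its nearest centroid, then move each centroid to the mean of its points. The following is a minimal sketch of that loop in plain Python. The points and starting centroids are hypothetical and picked by hand so the run is deterministic; library implementations usually choose starting centroids randomly or with a smarter seeding scheme.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Plain k-means on 2D points, starting from the given centroids."""
    clusters = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [math.dist(p, c) for c in centroids]
            clusters[distances.index(min(distances))].append(p)

        # Update step: move each centroid to the mean of its cluster
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(
                    tuple(sum(values) / len(cluster) for values in zip(*cluster))
                )
            else:
                new_centroids.append(old)  # leave an empty cluster's centroid in place

        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return centroids, clusters


# Hypothetical data: two visibly separate groups of points
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])

print(centroids)
print(clusters)
```

Running the function for several values of K and comparing the within-cluster distances is the basis of the Elbow Method mentioned in the module.
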
Module 7: Artificial Intelligence and Deep Learning
    • Introduction to Neural Networks
      • Introduction to Perceptron & History of Neural Networks
      • Activation Functions
        • Sigmoid
        • ReLU
        • Softmax
        • Leaky ReLU
        • Tanh
      • Gradient Descent
      • Learning Rate and Tuning
      • Optimization Functions
    • Introduction to TensorFlow
    • Introduction to Keras
    • Backpropagation and Chain Rule
    • Fully Connected Layer
    • Cross Entropy
    • Weight Initialization
    • Regularization
    • TensorFlow 2.0
      • Introducing Google Colab
      • TensorFlow Basic Syntax
      • TensorFlow Graphs
      • TensorBoard
    • Artificial Neural Network with TensorFlow
      • Neural Network for Regression
      • Neural Network for Classification
      • Evaluating the ANN
      • Improving and Tuning the ANN
      • Saving and Restoring Graphs
      • Modular Case Study: 7
      • Formative Assessment: 7
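
Before moving to TensorFlow and Keras, the core Module 7 ideas (a sigmoid activation, cross-entropy loss, gradient descent, and a learning rate) can be seen in a single neuron trained in plain Python. This sketch learns the logical AND function; the task, the zero initial weights, the learning rate of 0.5, and the 5000 epochs are illustrative assumptions rather than course settings.

```python
import math


def sigmoid(z):
    """Sigmoid activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Training data for the logical AND function
inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]

w1, w2, b = 0.0, 0.0, 0.0   # weights and bias, initialised to zero
learning_rate = 0.5
losses = []

for epoch in range(5000):
    grad_w1 = grad_w2 = grad_b = 0.0
    loss = 0.0
    for (x1, x2), y in zip(inputs, targets):
        p = sigmoid(w1 * x1 + w2 * x2 + b)            # forward pass
        loss += -(y * math.log(p) + (1 - y) * math.log(1 - p))  # cross-entropy
        error = p - y   # gradient of the loss with respect to the pre-activation
        grad_w1 += error * x1                          # chain rule back to each weight
        grad_w2 += error * x2
        grad_b += error

    # Batch gradient descent step using the average gradient
    n = len(inputs)
    w1 -= learning_rate * grad_w1 / n
    w2 -= learning_rate * grad_w2 / n
    b -= learning_rate * grad_b / n
    losses.append(loss / n)

# Threshold the neuron's output at 0.5 to get class predictions
predictions = [1 if sigmoid(w1 * x1 + w2 * x2 + b) >= 0.5 else 0 for x1, x2 in inputs]

print(predictions)
print(losses[0], losses[-1])
```

A framework such as TensorFlow automates the gradient computation shown in the inner loop, which is what makes networks with many layers practical.
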
Module 8: Computer Vision (CV)
    • Working with Images & CNN Building Blocks
      • Working with Images: Introduction
      • Working with Images: Reshaping, Understanding Size of Image, Pixels, Digitization
      • Sampling and Quantization
      • Working with Images: Filtering
      • Hands-on Python Demo: Working with Images
    • Introduction to Convolutions
      • 2D Convolutions for Images
      • Convolution: Backward
      • Transposed Convolution and Fully Connected Layer as a Convolution
    • Pooling
      • Max Pooling
      • Other Pooling Options
    • CNN Architectures and Transfer Learning
      • CNN Architectures and LeNet Case Study
      • Case Study: AlexNet
      • Case Study: ZFNet and VGGNet
      • Case Study: GoogLeNet
      • Case Study: ResNet
      • GPU vs CPU
      • Transfer Learning Principles and Practice
      • Hands-on Keras Demo: SVHN Transfer Learning from MNIST Dataset
      • Transfer Learning Visualization (Run Package, Occlusion Experiment)
      • Hands-on Demo: t-SNE
    • Object Detection
      • CNNs at Work: Object Detection with Region Proposals
      • CNNs at Work: Object Detection with YOLO and SSD
      • Hands-on Demo: Bounding Box Regressor
    • CNNs at Work: Semantic Segmentation
      • Semantic Segmentation Process
      • U-Net Architecture for Semantic Segmentation
      • Hands-on Demo: Semantic Segmentation Using U-Net
      • Other Variants of Convolutions
    • Inception and MobileNet Models
    • CNNs at Work: Siamese Network for Metric Learning
      • Metric Learning
      • Siamese Network as Metric Learning
      • How to Train a Neural Network in a Siamese Setup
      • Hands-on Demo: Siamese Network
      • Modular Case Study: 8
      • Formative Assessment: 8
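
The two basic CNN building blocks from Module 8, a 2D convolution and max pooling, are easy to follow when written out as loops. The sketch below assumes no padding, a stride of 1 for the convolution, and non-overlapping pooling windows; the 6x6 image and the vertical-edge kernel are hypothetical. As in most deep learning libraries, the "convolution" here does not flip the kernel, so strictly speaking it is a cross-correlation.

```python
def conv2d(image, kernel):
    """2D convolution as used in CNN layers: no padding, stride 1, kernel not flipped."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel with the image patch under it and sum the result
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling with a size x size window."""
    pooled = []
    for i in range(0, len(feature_map) - size + 1, size):
        row = []
        for j in range(0, len(feature_map[0]) - size + 1, size):
            window = [
                feature_map[i + di][j + dj]
                for di in range(size)
                for dj in range(size)
            ]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Hypothetical 6x6 grayscale image: dark left half, bright right half
image = [[0, 0, 0, 9, 9, 9] for _ in range(6)]

# Kernel that responds to dark-to-bright transitions from left to right
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d(image, kernel)   # 4x4 map, strongest where the edge lies
pooled = max_pool2d(feature_map)      # 2x2 map after pooling

print(feature_map)
print(pooled)
```
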
Module 9: Natural Language Processing (NLP)
    • Introduction to Statistical NLP Techniques
      • Introduction to NLP
      • Preprocessing in NLP: Tokenization, Stop Words, Normalization, Stemming, and Lemmatization
      • Preprocessing in NLP: Bag of Words, TF-IDF as Features
      • Language Model: Probabilistic Models, n-gram Model, and Channel Model
      • Hands-on NLTK
    • Word Embedding
      • Word2Vec
      • GloVe
      • POS Tagger
      • Named Entity Recognition (NER)
      • POS with NLTK
      • TF-IDF with NLTK
    • Sequential Models
      • Introduction to Sequential Models
      • Introduction to RNN
      • Introduction to LSTM
      • LSTM Forward Pass
      • LSTM Backpropagation Through Time
      • Hands-on Keras LSTM
    • Applications
      • Sentiment Analysis
      • Sentence Generation
      • Machine Translation
      • Advanced LSTM Structures
      • Keras – Machine Translation
      • ChatBot
      • Modular Case Study: 9
      • Formative Assessment: 9
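
The preprocessing topics at the start of Module 9 (tokenization, stop words, Bag of Words, and TF-IDF) can be sketched without any NLP library. The example below is an illustration only: the three documents and the tiny stop-word list are hypothetical, and the IDF formula used is the plain log(N / document frequency), whereas libraries often apply smoothed variants, so their numbers will differ.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (real lists are much longer)
STOP_WORDS = {"the", "is", "a", "and", "on"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


documents = [
    "The cat sat on the mat",
    "The dog sat on the log",
    "The cat chased the dog",
]

tokenized = [tokenize(doc) for doc in documents]

# Bag of Words: one term-count mapping per document
bags = [Counter(tokens) for tokens in tokenized]

vocabulary = sorted({term for tokens in tokenized for term in tokens})
n_docs = len(documents)

# Document frequency: in how many documents each term appears
doc_freq = {term: sum(term in bag for bag in bags) for term in vocabulary}


def tf_idf(term, bag):
    """Term frequency within the document times inverse document frequency."""
    tf = bag[term] / sum(bag.values())
    idf = math.log(n_docs / doc_freq[term])
    return tf * idf


weights = [{term: tf_idf(term, bag) for term in bag} for bag in bags]

print(vocabulary)
for doc_weights in weights:
    print(doc_weights)
```

Terms that occur in only one document, such as "mat", receive a higher weight than terms shared across documents, which is the property that makes TF-IDF useful as a feature.
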
Module 10: Power BI
    • Introduction to Power BI
      • What is Business Intelligence?
      • Power BI Introduction
      • Quadrant Report
      • Comparison with other BI tools
      • Power BI Desktop Overview
      • Power BI Workflow
      • Addressing Installation Queries
    • Data Import and Visualizations
      • Data Import Options in Power BI
      • Import from Web (Hands-on)
      • Why Visualization?
      • Visualization Types
    • Data Visualization (Contd.)
      • Categorical Data Visualization
      • Visuals for Filtering
      • Slicer Details and Use
      • Formatting Visuals
      • KPI Visuals
      • Tables and Matrix
    • Power Queries
      • Power Query Introduction
      • Data Transformation and Its Benefits
      • Queries Panel
      • M Language Briefing
      • Power BI Datatypes
      • Changing Datatypes of Columns
    • Power Queries (Contd.)
      • Filtering
      • Inbuilt Column Transformations
      • Inbuilt Row Transformations
      • Combine Queries
      • Merge Queries
    • Power Pivot and Introduction to DAX
      • Power Pivot
      • Intro to Data Modelling
      • Relationship and Cardinality
      • Relationship View
      • Calculated Columns vs Measures
      • DAX Introduction and Syntax
      • Modular Case Study: 10
      • Formative Assessment: 10
