

Data Science and Machine Learning Using Python
Learn Data Science & Machine Learning, Deep Learning with Python Language
Trainer: Experienced Data Science Consultant
Duration: 3 Months
Become a Data Scientist
One-time classroom registration fee: 1000/-
Data Science, Deep Learning, & Machine Learning with Python & R Language With Live Machine Learning & Deep Learning Projects
- Project 1 Build your own image recognition model with TensorFlow
- Project 2 Predict fraud with data visualization & predictive modeling!
- Project 3 Spam Detection
- Project 4 Build your own Recommendation System
- Project 5 Build your own Python predictive modeling, regression analysis & machine learning Model
- Getting Started
- Course Introduction
- Course Material & Lab Setup
- Installation
- Python Basic – Part – 1
- Python Basic – Part – 2
- Advanced Python – Part – 1
- Advanced Python – Part – 2
● Statistics and Probability Refresher, and Python Practice
- Types of Data
- Mean, Median, Mode
- Using mean, median, and mode in Python
- Variation and Standard Deviation
- Probability Density Function; Probability Mass Function
- Common Data Distributions
- Percentiles and Moments
- A Crash Course in matplotlib
- Covariance and Correlation
- Conditional Probability
- Exercise Solution: Conditional Probability of Purchase by Age
- Bayes’ Theorem
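As a small taste of the conditional probability and Bayes' theorem topics above, here is a minimal Python sketch. The probabilities are made-up illustration values, not course data, and the function name `bayes_posterior` is hypothetical.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Return P(A | B) using Bayes' theorem.

    prior               = P(A)
    likelihood          = P(B | A)
    false_positive_rate = P(B | not A)
    """
    # Total probability of observing B, summed over A and not-A.
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Assumed numbers: 20% of emails are spam, a keyword appears in 90% of
# spam emails and in 10% of non-spam emails.
posterior = bayes_posterior(prior=0.2, likelihood=0.9, false_positive_rate=0.1)
print(f"P(spam | keyword) = {posterior:.3f}")
```

With these assumed inputs, the posterior is well above the 20% prior, which is the kind of update the Bayes' theorem session works through by hand.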
● Predictive Models
- Linear Regression
- Polynomial Regression
- Multivariate Regression, and Predicting Car Prices
- Multi-Level Models
● Machine Learning with Python
- Supervised vs. Unsupervised Learning, and Train/Test
- Using Train/Test to Prevent Overfitting a Polynomial Regression
- Bayesian Methods: Concepts
- Implementing a Spam Classifier with Naive Bayes
- K-Means Clustering
- Clustering people based on income and age
- Measuring Entropy
- Install GraphViz
- Decision Trees: Concepts
- Decision Trees: Predicting Hiring Decisions
- Ensemble Learning
- Support Vector Machines (SVM) Overview
- Using SVM to cluster people using scikit-learn
● Recommender Systems
- User-Based Collaborative Filtering
- Item-Based Collaborative Filtering
- Finding Movie Similarities
- Improving the Results of Movie Similarities
- Making Movie Recommendations to People
- Improve the recommender’s results
● More Data Mining and Machine Learning Techniques
- K-Nearest-Neighbors: Concepts
- Using KNN to predict a rating for a movie
- Dimensionality Reduction; Principal Component Analysis
- PCA Example with the Iris data set
- Data Warehousing Overview: ETL and ELT
- Reinforcement Learning
● Dealing with Real-World Data
- Bias/Variance Tradeoff
- K-Fold Cross-Validation to avoid overfitting
- Data Cleaning and Normalization
- Cleaning web log data
- Normalizing numerical data
- Detecting outliers
● Experimental Design
- A/B Testing Concepts
- T-Tests and P-Values
- Hands-on With T-Tests
- Determining How Long to Run an Experiment
- A/B Test Gotchas
● Deep Learning and Neural Network
● Statistics and Data Science in R
● Introduction
- Introduction to R
- R and R studio Installation & Lab Setup
- Descriptive Statistics
● Descriptive Statistics
- Mean, Median, Mode
- Our first foray into R : Frequency Distributions
- Draw your first plot : A Histogram
- Computing Mean, Median, Mode in R
- What is IQR (Inter-quartile Range)?
- Box and Whisker Plots
- The Standard Deviation
- Computing IQR and Standard Deviation in R
● Inferential Statistics
- Drawing inferences from data
- Random Variables are ubiquitous
- The Normal Probability Distribution
- Sampling is like fishing
- Sample Statistics and Sampling Distributions
● Case studies in Inferential Statistics
● Diving into R
- Harnessing the power of R
- Assigning Variables
- Printing an output
- Numbers are of type numeric
- Characters and Dates
- Logicals
● Vectors
- Data Structures are the building blocks of R
- Creating a Vector
- The Mode of a Vector
- Vectors are Atomic
- Doing something with each element of a Vector
- Aggregating Vectors
- Operations between vectors of the same length
- Operations between vectors of different length
- Generating Sequences
- Using conditions with Vectors
- Find the lengths of multiple strings using Vectors
- Generate a complex sequence (using recycling)
- Vector Indexing (using numbers)
- Vector Indexing (using conditions)
- Vector Indexing (using names)
● Arrays
- Creating an Array
- Indexing an Array
- Operations between 2 Arrays
- Operations between an Array and a Vector
- Outer Products
● Matrices
- A Matrix is a 2-Dimensional Array
- Creating a Matrix
- Matrix Multiplication
- Merging Matrices
- Solving a set of linear equations
● Factors
- What is a factor?
- Find the distinct values in a dataset (using factors)
- Replace the levels of a factor
- Aggregate factors with table()
- Aggregate factors with tapply()
● Lists and Data Frames
- Introducing Lists
- Introducing Data Frames
- Reading Data from files
- Indexing a Data Frame
- Aggregating and Sorting a Data Frame
- Merging Data Frames
● Regression quantifies relationships between variables
- Linear Regression in Excel : Preparing the data.
- Linear Regression in Excel : Using LINEST()
● Linear Regression in R
- Linear Regression in R : Preparing the data
- Linear Regression in R : lm() and summary()
- Multiple Linear Regression
- Adding Categorical Variables to a linear model
- Robust Regression in R : rlm()
- Parsing Regression Diagnostic Plots
○ Predictive Models
- Linear Regression
- Polynomial Regression
- Multivariate Regression, and Predicting Car Prices
- Multi-Level Models
○ Machine Learning with R
- Supervised vs. Unsupervised Learning, and Train/Test
- Using Train/Test to Prevent Overfitting a Polynomial Regression
- Bayesian Methods: Concepts
- Implementing a Spam Classifier with Naive Bayes
- K-Means Clustering
- Clustering people based on income and age
- Measuring Entropy
- Install GraphViz
- Decision Trees: Concepts
- Decision Trees: Predicting Hiring Decisions
- Ensemble Learning
- Support Vector Machines (SVM) Overview
- Using SVM to cluster people using scikit-learn
○ Recommender Systems
- User-Based Collaborative Filtering
- Item-Based Collaborative Filtering
- Finding Movie Similarities
- Improving the Results of Movie Similarities
- Making Movie Recommendations to People
- Improve the recommender’s results
○ More Data Mining and Machine Learning Techniques
- K-Nearest-Neighbors: Concepts
- Using KNN to predict a rating for a movie
- Dimensionality Reduction; Principal Component Analysis
- PCA Example with the Iris data set
- Data Warehousing Overview: ETL and ELT
- Reinforcement Learning
○ Dealing with Real-World Data
- Bias/Variance Tradeoff
- K-Fold Cross-Validation to avoid overfitting
- Data Cleaning and Normalization
- Cleaning web log data
- Normalizing numerical data
- Detecting outliers
○ Experimental Design
- A/B Testing Concepts
- T-Tests and P-Values
- Hands-on With T-Tests
- Determining How Long to Run an Experiment
- A/B Test Gotchas
● Data Visualization in R
- Data Visualization
- The plot() function in R
- Control color palettes with RColorBrewer
- Drawing bar plots
- Drawing a heatmap
- Drawing a Scatterplot Matrix
- Plot a line chart with ggplot
This section will be updated soon.
Machine Learning Using Python
Basic Python
1 Introduction
1.1 What is Python?
1.2 A brief history of Python
1.3 Installing Python
1.4 How to execute a Python program
- Using the Python Interpreter
- 1. Invoking the Interpreter
- 1.1. Argument Passing
- 1.2. Interactive Mode
- 2. The Interpreter and Its Environment
- 2.1. Source Code Encoding
- An Informal Introduction to Python
- 1. Using Python as a Calculator
- 1.1. Numbers
- 1.2. Strings
- 1.3. Lists
- 2. First Steps Towards Programming
- More Control Flow Tools
- 1. if Statements
- 2. for Statements
- 3. The range() Function
- 4. break and continue Statements, and else Clauses on Loops
- 5. pass Statements
- 6. Defining Functions
- 7. More on Defining Functions
- 7.1. Default Argument Values
- 7.2. Keyword Arguments
- 7.3. Arbitrary Argument Lists
- 7.4. Unpacking Argument Lists
- 7.5. Lambda Expressions
- 7.6. Documentation Strings
- 7.7. Function Annotations
- 8. Intermezzo: Coding Style
- Data Structures
- 1. More on Lists
- 1.1. Using Lists as Stacks
- 1.2. Using Lists as Queues
- 1.3. List Comprehensions
- 1.4. Nested List Comprehensions
- 2. The del statement
- 3. Tuples and Sequences
- 4. Sets
- 5. Dictionaries
- 6. Looping Techniques
- 7. More on Conditions
- 8. Comparing Sequences and Other Types
- Modules
- 1. More on Modules
- 1.1. Executing modules as scripts
- 1.2. The Module Search Path
- 1.3. “Compiled” Python files
- 2. Standard Modules
- 3. The dir() Function
- 4. Packages
- 4.1. Importing * From a Package
- 4.2. Intra-package References
- 4.3. Packages in Multiple Directories
- Input and Output
- 1. Fancier Output Formatting
- 1.1. Formatted String Literals
- 1.2. The String format() Method
- 1.3. Manual String Formatting
- 1.4. Old string formatting
- 2. Reading and Writing Files
- 2.1. Methods of File Objects
- 2.2. Saving structured data with json
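To illustrate the Input and Output topics above (formatted string literals and saving structured data with json), here is a minimal sketch. The record contents are made up for illustration.

```python
import json

record = {"name": "Asha", "score": 91.5}  # made-up sample record

# Formatted string literal: left-align the name in 6 columns,
# show the score in 7 columns with 2 decimal places.
line = f"{record['name']:<6}|{record['score']:7.2f}"

# Serialize the record to a JSON string, then parse it back.
encoded = json.dumps(record, sort_keys=True)
decoded = json.loads(encoded)

print(line)
print(encoded)
print(decoded == record)
```

The round trip through `json.dumps()` and `json.loads()` gives back an equal dictionary, which is why JSON is a convenient format for saving simple structured data.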
Data Science with Python
- Install Anaconda Distribution as per OS from https://www.anaconda.com/distribution/ (Python 3.7 version)
- Sign Up for account creation on https://www.hackerrank.com/
- Sign up for account creation on https://www.kaggle.com/
- Sign up for account creation on https://github.com/
- Git Bash Utility – https://git-scm.com/downloads
Module 1: Statistics and Probability
- Descriptive Statistics:
- Central tendency: Mean, Median, Mode
- Sample variance
- Standard deviation
- Random Variables: Discrete, Continuous
- Probability density functions
- Binomial distribution
- Expected Value, E(X)
- Poisson Process
- Law of large numbers
- Standard normal distribution and empirical rule
- Z-score
- Inferential Statistics:
- Central limit theorem
- Sampling distribution of the sample mean
- Standard error of the mean
- Mean and variance of Bernoulli distribution
- Margin of error 1
- Margin of error 2
- Confidence interval
- Hypothesis testing and p-value
- One-tailed and two-tailed tests
- Z-statistics and T-statistics
- Type 1 error
- Squared error of regression line
- Coefficient of determination
- Chi-square distribution
- Pearson’s chi square test (goodness of fit)
- Correlation and causality.
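The descriptive statistics topics in this module can be tried out with Python's standard `statistics` module. The sketch below uses a made-up sample and computes central tendency, sample variance, standard deviation and a z-score.

```python
import statistics

sample = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample for illustration

mean = statistics.mean(sample)
median = statistics.median(sample)
mode = statistics.mode(sample)
variance = statistics.variance(sample)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(sample)      # square root of the sample variance

# Z-score: how many standard deviations the value 9 lies from the mean.
z_score = (9 - mean) / std_dev

print(mean, median, mode)
print(round(variance, 3), round(std_dev, 3), round(z_score, 3))
```

Note that `statistics.variance()` and `statistics.stdev()` are the sample versions; the population versions are `statistics.pvariance()` and `statistics.pstdev()`.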
Module 2: Data Analysis using Python
- Numpy
- Numpy Vector and Matrix
- Functions – arange(), zeros(), ones(), linspace(), eye(), reshape(), random(), max(), min(), argmax(), argmin(), shape and dtype attribute
- Indexing and Selection
- Numpy Operations – Array with Array, Array with Scalars, Universal Array Functions
- Pandas
- Pandas Series
- Pandas Data-Frame
- Missing Data (Imputation)
- Group by Operations
- Merging, Joining and Concatenating Data-Frame.
- Pandas Operations
- Data input and output in a wide variety of formats such as CSV, Excel, database and HTML.
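A minimal sketch of the NumPy topics listed in this module, assuming NumPy is installed (for example through the Anaconda distribution mentioned above). The array values are arbitrary.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)    # integers 0..5 arranged as 2 rows x 3 columns
grid = np.linspace(0.0, 1.0, 5)   # 5 evenly spaced points from 0 to 1
identity = np.eye(3)              # 3 x 3 identity matrix

scaled = a * 10                                # array with scalar
summed = a + np.ones((2, 3), dtype=a.dtype)    # array with array
roots = np.sqrt(a)                             # universal function, element-wise

evens = a[a % 2 == 0]             # selection with a boolean condition

print(a.shape, a.dtype, a.max(), a.argmax())
print(evens)
```

Here `shape` and `dtype` are attributes, while `max()` and `argmax()` are methods; `argmax()` returns the position of the largest value in the flattened array.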
Module 3: Data Visualization using Python – Matplotlib, Seaborn, Pandas built-in, Plotly and Cufflinks
- Matplotlib
- plot() using Functional approach
- multi-plot using subplot()
- figure() using OO API Methods
- add_axes(), set_xlabel(), set_ylabel(), set_title() Methods
- Customization – figure size, improving DPI, plot appearance, markers, control over axis appearance and special plot types
- Seaborn
- Distribution Plots using distplot(), jointplot(), pairplot(), rugplot(), kdeplot()
- Categorical Plots using barplot(), countplot(), boxplot(), violinplot(), stripplot(), swarmplot(), factorplot()
- Matrix Plots using heatmap(), clustermap()
- Grid Plots using PairGrid(), FacetGrid()
- Regression Plots using lmplot()
- Styles and Colors customization.
- Plotly and Cufflinks
- Interactive Plotting using Plotly and Cufflinks
- Pandas Built-in
- Histogram, Area Plot, Bar Plot, Scatter Plot, Box-plot, Hex-plot, Kde-plot, Density Plot
- Choropleth Maps
- Interactive World Map and US Map using Plotly and Cufflinks Module
Module 4: GIT
- Distributed Version Control System
- How GIT internally manages version control on changesets.
- Creating Repository
- Basic commands such as git status, git add, git rm, git branch, git checkout, git log, git cat-file, git pull, git push, git commit
- Managing Configuration – System Level, User Level, Repository level
Module 5: Jupyter Notebook
- Introduction, Basic Commands, Keyboard Shortcut and Magic Functions
Module 6: Linear Algebra and Calculus
- Vector and Matrix, basic operations
- Trigonometry
- Derivatives
Module 7: SQL
- MySQL Server and Client Installation
- SQL Queries
- CRUD Operations
- Types of tables (fact and dimension)
Module 8: Big Data
- What is big data?
- What is distributed computing?
- What is parallel processing?
- Why do data scientists require big data?
Module 9: Machine Learning Introduction
- What is Machine Learning?
- Machine Learning Process Flow-Diagram
- Different Categories of Machine Learning – Supervised, Unsupervised and Reinforcement
- Scikit-Learn Overview
- Scikit-Learn cheat-sheet
Module 10: Regression
- Linear Regression
- Robust Regression (RANSAC Algorithm)
- Exploratory Data Analysis (EDA)
- Correlation Analysis and Feature Selection
- Performance Evaluation – Residual Analysis, Mean Square Error (MSE), Coefficient of Determination (R^2), Mean Absolute Error (MAE), Root Mean Square Error (RMSE)
- Polynomial Regression
- Regularized Regression – Ridge, Lasso and Elastic Net Regression
- Bias-Variance Trade-Off
- Cross Validation – Hold Out and K-Fold Cross Validation
- Data Pre-Processing – Standardization, Min-Max, Normalization and Binarization
- Gradient Descent
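To give a feel for the linear regression, MSE and gradient descent topics in this module, here is a minimal pure-Python sketch. The data points, learning rate and epoch count are assumptions chosen for illustration, and the function names `fit_line` and `mse` are hypothetical; in class, scikit-learn would typically be used instead.

```python
def fit_line(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every point with the current w and b.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = 2 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2 / n * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mse(xs, ys, w, b):
    """Mean squared error of the line y = w * x + b on the data."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(round(w, 4), round(b, 4), mse(xs, ys, w, b))
```

If the learning rate is set too high the updates overshoot and diverge, which is one reason the data pre-processing topics (standardization, min-max scaling) matter before running gradient descent.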
Module 11: Classification – Logistic Regression
- Sigmoid function
- Logistic Regression learning using Stochastic Gradient Descent (SGD)
- SGDClassifier
- Measuring accuracy using Cross-Validation, Stratified k-fold
- Confusion Matrix – True Positive (TP), False Positive (FP), False Negative (FN), True Negative (TN)
- Precision, Recall, F1 Score, Precision/Recall Trade-Off
- Receiver Operating Characteristics (ROC) Curve.
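The confusion matrix, precision, recall and F1 score topics above can be sketched in plain Python as follows. The label lists are made up for illustration (1 = positive, 0 = negative) and the function name `confusion_counts` is hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    return tp, fp, fn, tn


# Made-up true labels and classifier predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)

precision = tp / (tp + fp)   # of the predicted positives, how many were right
recall = tp / (tp + fn)      # of the actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / len(y_true)

print(tp, fp, fn, tn)
print(precision, recall, round(f1, 4), accuracy)
```

In this example precision (0.6) is lower than recall (0.75), a small instance of the precision/recall trade-off listed above.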
Module 12: Classification – k-Nearest Neighbors (KNN)
- Classification and Regression
- Application, Advantages and Disadvantages
- Distance Metric – Euclidean, Manhattan, Chebyshev, Minkowski
- Measuring accuracy using Cross-Validation, Stratified k-fold, Confusion Matrix, Precision, Recall, F1-score.
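The distance metrics named in this module, together with a very small KNN classifier, can be sketched in plain Python. The training points and labels are made up, and the helper names (`minkowski`, `knn_predict` and so on) are hypothetical; this is an illustration of the idea rather than the scikit-learn implementation.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length points."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def euclidean(a, b):
    return minkowski(a, b, 2)   # Minkowski with p = 2


def manhattan(a, b):
    return minkowski(a, b, 1)   # Minkowski with p = 1


def chebyshev(a, b):
    # Largest coordinate difference (the limit of Minkowski as p grows).
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(training, query, k=3, distance=euclidean):
    """Majority vote among the k training points nearest to the query."""
    nearest = sorted(training, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up 2-D points with class labels.
training = [((1, 1), "A"), ((1, 2), "A"), ((2, 1), "A"),
            ((5, 5), "B"), ((6, 5), "B")]

print(euclidean((0, 0), (3, 4)), manhattan((0, 0), (3, 4)), chebyshev((0, 0), (3, 4)))
print(knn_predict(training, (1.5, 1.5)), knn_predict(training, (5.5, 5.0)))
```

Passing a different `distance` function to `knn_predict` shows how the choice of metric can change which neighbours are considered nearest.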
Module 13: Classification – SVM (Support Vector Machine)
- Classification and Regression
- Separating line, Margin and Support Vectors
- Linear SVC Classification
- Polynomial Kernel – Kernel Trick
- Gaussian Radial Basis Function (rbf)
- Grid Search to tune hyper-parameters.
- Support Vector Regression.
Module 14: Classification – Decision Trees
- CART (Classification and Regression Tree)
- Advantages and Disadvantages and its applications.
- Decision Tree Learning algorithms – ID3, C4.5, C5.0 and CART.
- Gini Impurity, Entropy and Information Gain
- Decision Tree Regression
- Visualizing a Decision Tree using graphviz module.
- Regularization by tuning hyper-parameters with GridSearchCV.
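Gini impurity, entropy and information gain, as listed in this module, can be computed in a few lines of plain Python. The labels and the candidate split below are made up for illustration, and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy of the class distribution, in bits."""
    n = len(labels)
    return -sum((count / n) * log2(count / n)
                for count in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Made-up hiring decisions and one candidate split of them.
parent = ["hire"] * 4 + ["reject"] * 4
left = ["hire"] * 3 + ["reject"]
right = ["hire"] + ["reject"] * 3

print(gini(parent), entropy(parent))
print(round(information_gain(parent, [left, right]), 4))
```

A decision tree learner evaluates many candidate splits like this one and keeps the split with the highest gain (or the lowest weighted impurity).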
Module 15: Classification – Ensemble Methods
- Bootstrap Aggregating or Bagging
- Random Forest algorithm
- Extremely Randomized (Extra-Trees) Ensemble
- Boosting – AdaBoost (Adaptive Boosting), Gradient Boosting Machine (GBM), XGBoost (Extreme Gradient Boosting)
Module 16: Unsupervised Learning – Clustering
- Connectivity-based Clustering using Hierarchical Clustering.
- Ward’s Agglomerative Hierarchical Clustering
- K-Means Clustering
- Elbow Method and Silhouette Analysis
Module 17: Unsupervised Learning – Dimensionality Reduction
- Linear Principal Component Analysis (PCA) reduction.
- Kernel PCA
- Linear Discriminant Analysis (LDA) on Supervised Data.
Module 18: Model Deployment On AWS Cloud
- What is cloud computing?
- What is AWS?
- How to store data in AWS S3?
- Create deep learning instance on EC2.
- Amazon SageMaker to train, tune, build and deploy models to production.
Module 19: Tableau
- What is Tableau? Its applications
- Installing Tableau Public
- Tableau applications and use
- Tableau tool introduction
- Tableau UI: dimensions and measures
- Connecting to data
- Filter and its types
- Groups
- Set
- Hierarchy
- Graphs
- Table calculation
- LOD Expression
- Data Blending
- How we are different from others: Our teachers cover each topic with real-time examples. They take 8 real-time projects and more than 72 assignments covering almost every topic. Our trainers come from industry, with 15 years of experience in DS. They work as Data Science, Machine Learning and AI consultants, with 10+ years in real-time ML and AI implementation and migration.
This is completely practical-oriented training, meaning you will be able to code everything you learn. We have students who become confident in coding within 1 week of joining the training; that is the success of our teaching method. At Yess InfoTech, we always hold prerequisite sessions as well, and we start from the basic installation of the IDEs and other required software. Our way of teaching gives students the confidence that they have been up-skilled to a different level. Our students have also secured strong positions and salary ranges in many well-known organizations.
- 5 DS domain-based projects with real-time data (with one trainer – two projects)
- 9 mock interviews (3 per month)
- Unlimited Assignments
- 28 Real Time Scenarios and Major topics
- Basic Python
- Machine Learning with Python
- Installation
- Data Visualization in R
- 19 Modules on Basics
60 Hours Online Sessions
12 Hours of assignments
10 hours for one project and 50 hours for 2 projects (candidates should prepare with mentor support; the 50 hours mentioned is the total time spent on projects by each trainer)
Unlimited Interview Questions
Administration and manual installation of Python, along with other domain-based projects, will be done on a regular basis apart from our normal batch schedule.
We do take projects
- Training by a real-time trainer with 15+ years of experience
- A pool of 60+ real-time practical sessions on Data Science
- Scenarios and assignments to make sure you can compete with current industry standards
- World-class training methods
- Training until the candidate is satisfied
- Certification and placement support for 4 years, until you are certified and placed
- All training at a reasonable cost
- 10000+ satisfied candidates
- 5000+ placement records
- Corporate and online training at a reasonable cost
- Complete end-to-end project with each course
- World-class lab facility with i3/i5/i7 computers
- Wifi available in Lab
- Resume And Interview preparation with 100% Hands-on Practical sessions
- Doubt-clearing sessions any time for up to 1 year after the course
- Happy to help you any time after the course also
The trainer has 15 years of experience in Data Science, including 10 years in Machine Learning and AI. For 15 years he has worked extensively at a top-level software company. He holds several certifications in DS. He has also conducted corporate sessions and seminars both in India and abroad. Recently he was engaged by Yess InfoTech to deliver sessions and to act as a professional motivator, helping working professionals achieve their day-to-day targets.
All trainers at our organization currently work on these technologies in reputed organizations. The curriculum is not just theory or slides. All sessions are practical, and we ask our students to implement what they learn during the session itself. We provide notes for the same. We use simple, easy language so that the content is well absorbed by the candidates. The trainers always give assignments. Because the faculty are industry-experienced, we give real-time projects and practice. We also provide recorded sessions, but these are charged separately. We also provide result-oriented training.
- + Curriculum
-
Data Science, Deep Learning, & Machine Learning with Python & R Language With Live Machine Learning & Deep Learning Projects
- Project 1 Build your own image recognition model with TensorFlow
- Project 2 Predict fraud with data visualization & predictive modeling!
- Project 3 Spam Detection
- Project 4 Build your own Recommendation System
- Project 5 Build your own Python predictive modeling, regression analysis & machine learning Model
- Getting Started
- Course Introduction
- Course Material & Lab Setup
- Installation
- Python Basic – Part – 1
- Python Basic – Part – 2
- Advance Python – Part – 1
- Advance Python – Part – 2
● Statistics and Probability Refresher, and Python Practice
- Types of Data
- Mean, Median, Mode
- Using mean, median, and mode in Python
- Variation and Standard Deviation
- Probability Density Function; Probability Mass Function
- Common Data Distributions
- Percentiles and Moments
- A Crash Course in matplotlib
- Covariance and Correlation
- Conditional Probability
- Exercise Solution: Conditional Probability of Purchase by Age
- Bayes’ Theorem
● Predictive Models
- Linear Regression
- Polynomial Regression
- Multivariate Regression, and Predicting Car Prices
- Multi-Level Models
● Machine Learning with Python
- Supervised vs. Unsupervised Learning, and Train/Test
- Using Train/Test to Prevent Overfitting a Polynomial Regression
- Bayesian Methods: Concepts
- Implementing a Spam Classifier with Naive Bayes
- K-Means Clustering
- Clustering people based on income and age
- Measuring Entropy
- Install GraphViz32. Decision Trees: Concepts
- Decision Trees: Predicting Hiring Decisions
- Ensemble Learning
- Support Vector Machines (SVM) Overview
- Using SVM to cluster people using scikit-learn
● Recommender Systems
- User-Based Collaborative Filtering
- Item-Based Collaborative Filtering
- Finding Movie Similarities
- Improving the Results of Movie Similarities
- Making Movie Recommendations to People
- Improve the recommender’s results
● More Data Mining and Machine Learning Techniques
- K-Nearest-Neighbors: Concepts
- Using KNN to predict a rating for a movie
- Dimensionality Reduction; Principal Component Analysis
- PCA Example with the Iris data set
- Data Warehousing Overview: ETL and ELT
- Reinforcement Learning
● Dealing with Real-World Data
- Bias/Variance Tradeoff
- K-Fold Cross-Validation to avoid overfitting
- Data Cleaning and Normalization
- Cleaning web log data
- Normalizing numerical data
- Detecting outliers
● Experimental Design
- A/B Testing Concepts
- T-Tests and P-Values
- Hands-on With T-Tests
- Determining How Long to Run an Experiment
- A/B Test Gotchas
● Deep Learning and Neural Network
● Statistics and Data Science in R
● Introduction
- Introduction to R
- R and R studio Installation & Lab Setup
- Descriptive Statistics
● Descriptive Statistics
- 0Mean, Median, Mode
- Our first foray into R : Frequency Distributions
- Draw your first plot : A Histogram
- Computing Mean, Median, Mode in R
- What is IQR (Inter-quartile Range)?
- Box and Whisker Plots
- The Standard Deviation
- Computing IQR and Standard Deviation in R
● Inferential Statistics
- Drawing inferences from data
- Random Variables are ubiquitous
- The Normal Probability Distribution
- Sampling is like fishing
- Sample Statistics and Sampling Distributions
● Case studies in Inferential Statistics
● Diving into R
- Harnessing the power of R
- Assigning Variables
- Printing an output
- Numbers are of type numeric
- Characters and Dates
- Logicals
● Vectors
- Data Structures are the building blocks of R
- Creating a Vector
- The Mode of a Vector
- Vectors are Atomic
- Doing something with each element of a Vector
- Aggregating Vectors
- Operations between vectors of the same length
- Operations between vectors of different length
- Generating Sequences
- Using conditions with Vectors
- Find the lengths of multiple strings using Vectors
- Generate a complex sequence (using recycling)
- Vector Indexing (using numbers)
- Vector Indexing (using conditions)
- Vector Indexing (using names)
● Arrays
- Creating an Array
- Indexing an Array
- Operations between 2 Arrays
- Operations between an Array and a Vector
- Outer Products
● Matrices
- A Matrix is a 2-Dimensional Array
- Creating a Matrix
- Matrix Multiplication
- Merging Matrices
- Solving a set of linear equations
● Factors
- What is a factor?
- Find the distinct values in a dataset (using factors)
- Replace the levels of a factor
- Aggregate factors with table()
- Aggregate factors with tapply()
● Lists and Data Frames
- Introducing Lists
- Introducing Data Frames
- Reading Data from files
- Indexing a Data Frame
- Aggregating and Sorting a Data Frame
- Merging Data Frames
● Regression quantifies relationships between variables
- Linear Regression in Excel : Preparing the data.
- Linear Regression in Excel : Using LINEST()
● Linear Regression in R
- Linear Regression in R : Preparing the data
- Linear Regression in R : lm() and summary()
- Multiple Linear Regression
- Adding Categorical Variables to a linear mode
- Robust Regression in R : rlm()
- Parsing Regression Diagnostic Plots
○ Predictive Models
- Linear Regression
- Polynomial Regression
- Multivariate Regression, and Predicting Car Prices
- Multi-Level Models
○ Machine Learning with R
- Supervised vs. Unsupervised Learning, and Train/Test
- Using Train/Test to Prevent Overfitting a Polynomial Regression
- Bayesian Methods: Concepts
- Implementing a Spam Classifier with Naive Bayes
- K-Means Clustering
- Clustering people based on income and age
- Measuring Entropy
- Install GraphViz32. Decision Trees: Concepts
- Decision Trees: Predicting Hiring Decisions
- Ensemble Learning
- Support Vector Machines (SVM) Overview
- Using SVM to cluster people using scikit-learn
○ Recommender Systems
- User-Based Collaborative Filtering
- Item-Based Collaborative Filtering
- Finding Movie Similarities
- Improving the Results of Movie Similarities
- Making Movie Recommendations to People
- Improve the recommender’s results
○ More Data Mining and Machine Learning Techniques
- K-Nearest-Neighbors: Concepts
- Using KNN to predict a rating for a movie
- Dimensionality Reduction; Principal Component Analysis
- PCA Example with the Iris data set
- Data Warehousing Overview: ETL and ELT
- Reinforcement Learning
○ Dealing with Real-World Data
- Bias/Variance Tradeoff
- K-Fold Cross-Validation to avoid overfitting
- Data Cleaning and Normalization
- Cleaning web log data
- Normalizing numerical data
- Detecting outliers
○ Experimental Design
- A/B Testing Concepts
- T-Tests and P-Values
- Hands-on With T-Tests
- Determining How Long to Run an Experiment
- A/B Test Gotchas
● Data Visualization in R
- Data Visualization
- The plot() function in R
- Control color palettes with RColorbrewer
- Drawing bar plots
- Drawing a heatmap
- Drawing a Scatterplot Matrix
- Plot a line chart with ggplot
- + What is Next
-
We Will Be Updated Soon.
- + Introduction
-
Machine Learning Using Python
Basic Python
1 Introduction
1.1 What is Python..?
1.2 A Brief history of Python
1.3 Installing Python
1.4 How to execute Python program
- Using the Python Interpreter
- 1. Invoking the Interpreter
- 1.1. Argument Passing
- 1.2. Interactive Mode
- 2. The Interpreter and Its Environment
- 2.1. Source Code Encoding
- 1. Invoking the Interpreter
- An Informal Introduction to Python
- 1. Using Python as a Calculator
- 1.1. Numbers
- 1.2. Strings
- 1.3. Lists
- 2. First Steps Towards Programming
- 1. Using Python as a Calculator
- More Control Flow Tools
- 1. if Statements
- 2. for Statements
- 3. The range() Function
- 4. break and continue Statements, and else Clauses on Loops
- 5. pass Statements
- 6. Defining Functions
- 7. More on Defining Functions
- 7.1. Default Argument Values
- 7.2. Keyword Arguments
- 7.3. Arbitrary Argument Lists
- 7.4. Unpacking Argument Lists
- 7.5. Lambda Expressions
- 7.6. Documentation Strings
- 7.7. Function Annotations
- 8. Intermezzo: Coding Style
- Data Structures
- 1. More on Lists
- 1.1. Using Lists as Stacks
- 1.2. Using Lists as Queues
- 1.3. List Comprehensions
- 1.4. Nested List Comprehensions
- 2. The del statement
- 3. Tuples and Sequences
- 4. Sets
- 5. Dictionaries
- 6. Looping Techniques
- 7. More on Conditions
- 8. Comparing Sequences and Other Types
- 1. More on Lists
- Modules
- 1. More on Modules
- 1.1. Executing modules as scripts
- 1.2. The Module Search Path
- 1.3. “Compiled” Python files
- 2. Standard Modules
- 3. The dir() Function
- 4. Packages
- 4.1. Importing * From a Package
- 4.2. Intra-package References
- 4.3. Packages in Multiple Directories
- 1. More on Modules
- Input and Output
- 1. Fancier Output Formatting
- 1.1. Formatted String Literals
- 1.2. The String format() Method
- 1.3. Manual String Formatting
- 1.4. Old string formatting
- 2. Reading and Writing Files
- 2.1. Methods of File Objects
- 2.2. Saving structured data with json
- 1. Fancier Output Formatting
Data Science with Python
- Install Anaconda Distribution as per OS from https://www.anaconda.com/distribution/ (Python 3.7 version)
- Sign Up for account creation on https://www.hackerrank.com/
- Sign up for account creation on https://www.kaggle.com/
- Sign up for account creation on https://github.com/
- Git Bash Utility – https://git-scm.com/downloads
Module 1: Statistics and Probability
- Descriptive Statistics:
- Central tendency: Mean, Median, Mode
- Sample variance
- Standard deviation
- Random Variables: Discrete, Continuous
- Probability density functions
- Binomial distribution
- Expected Value, E(X)
- Poisson Process
- Law of large numbers
- Standard normal distribution and empirical rule
- Z-score
- Inferential Statistics:
- Central limit theorem
- Sampling distribution of the sample mean
- Standard error of the mean
- Mean and variance of Bernoulli distribution
- Margin of error 1
- Margin of error 2
- Confidence interval
- Hypothesis testing and p-value
- One-tailed and two tailed tests
- Z-statistics and T-statistics
- Type 1 error
- Squared error of regression line
- Co-efficient of determination
- Chi-square distribution
- Pearson’s chi square test (goodness of fit)
- Co-relation and casualty.
Module 2: Data Analysis using Python
- Numpy
- Numpy Vector and Matrix
- Functions – arange(), zeros(), ones(), linspace(), eye(),
- reshape(), random(), max(), min(),
- argmax(), argmin(), shape and dtype attribute
- Indexing and Selection
- Numpy Operations – Array with Array, Array with Scalars,
- Universal Array Functions
- Pandas
- Pandas Series
- Pandas Data-Frame
- Missing Data (Imputation)
- Group by Operations
- Merging, Joining and Concatenating Data-Frame.
- Pandas Operations
- Data Input and Output from wide variety of formats like csv, excel, db and html etc.
Module 3: Data Visualization using python Matplotlib, Seaborn, Pandas-in built, Plotly and Cufflinks
- Matplotlib
- plot() using Functional approach
- multi-plot using subplot()
- figure() using OO API Methods
- add_axes(), set_xlabel(), set_ylabel(), set_title() Methods
- Customization – figure size, impoving dpi, Plot appearance,
- Markers, Control over axis appearance and special Plot Types
- Seaborn
- Distribution Plots using distplot(), jointplot(), pairplot(), rugplot(), kdeplot()
- Categorical Plots using barplot(), countplot(), boxplot(), violinplot(), stripplot(), swarmplot(), factorplot()
- Matrix Plots using heatmap(), clustermap()
- Grid Plots using PairGrid(), FacetGrid()
- Regression Plots using lmplot()
- Styles and Colors customization.
- Plotly and Cufflinks
- Interactive Plotting using Plotly and Cufflinks
- Pandas Built-in
- Histogram, Area Plot, Bar Plot, Scatter Plot, Box-plot, Hex-plot, Kde-plot, Density Plot e. Choropleth Maps
- Interactive World Map and US Map using Plotly and Cufflinks Module
Module 4: GIT
- Distribution Version Control System
- How internally, GIT Manages Version Control on Changesets.
- Creating Repository
- Basic Commands like, git status, git add, git remove, git branch, git checkout, git log, git cat-file, git pull, git push, git commit
- Managing Configuration – System Level, User Level, Repository level
Module 5: Jupyter Notebook
- Introduction, Basic Commands, Keyboard Shortcuts and Magic Functions
Module 6: Linear Algebra and Calculus
- Vector and Matrix, basic operations
- Trigonometry
- Derivatives
Module 7: SQL
- MySQL Server and Client Installation
- SQL Queries
- CRUD Operations
- Types of tables (fact and dimension)
Module 8: Big Data
- What is big data?
- What is distributed computing?
- What is parallel processing?
- Why do data scientists need big data?
Module 9: Machine Learning Introduction
- What is Machine Learning?
- Machine Learning Process Flow-Diagram
- Different Categories of Machine Learning – Supervised, Unsupervised and Reinforcement
- Scikit-Learn Overview
- Scikit-Learn cheat-sheet
Module 10: Regression
- Linear Regression
- Robust Regression (RANSAC Algorithm)
- Exploratory Data Analysis (EDA)
- Correlation Analysis and Feature Selection
- Performance Evaluation – Residual Analysis, Mean Squared Error (MSE), Coefficient of Determination (R^2), Mean Absolute Error (MAE), Root Mean Squared Error (RMSE)
- Polynomial Regression
- Regularized Regression – Ridge, Lasso and Elastic Net Regression
- Bias-Variance Trade-Off
- Cross Validation – Hold Out and K-Fold Cross Validation
- Data Pre-Processing – Standardization, Min-Max Scaling, Normalization and Binarization
- Gradient Descent
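As a rough illustration of the performance-evaluation metrics listed in this module (MSE, RMSE, MAE and R^2), here is a minimal pure-Python sketch. The function name `regression_metrics` and the toy values are assumptions made for illustration only, not course material; in practice a library such as scikit-learn would normally be used.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for two equal-length sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mse = sum(e * e for e in errors) / n    # mean squared error
    mae = sum(abs(e) for e in errors) / n   # mean absolute error

    # R^2 = 1 - (residual sum of squares / total sum of squares)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot

    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


# Hypothetical toy values, chosen only to exercise the function
metrics = regression_metrics([3.0, 5.0, 7.0, 9.0], [2.5, 5.0, 7.5, 9.0])
print(metrics)
```

RMSE is in the same units as the target, which often makes it easier to interpret than MSE.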
Module 11: Classification – Logistic Regression
- Sigmoid function
- Logistic Regression learning using Stochastic Gradient Descent (SGD)
- SGDClassifier
- Measuring accuracy using Cross-Validation, Stratified k-fold
- Confusion Matrix – True Positive (TP), False Positive (FP), False Negative (FN), True Negative (TN)
- Precision, Recall, F1 Score, Precision/Recall Trade-Off
- Receiver Operating Characteristic (ROC) Curve.
Module 12: Classification – k-Nearest Neighbors (KNN)
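To make the confusion-matrix terms in this module concrete, the sketch below counts TP, FP, FN and TN by hand and derives precision, recall and F1 from them. The helper names and the label lists are hypothetical and exist only for this illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FP, FN, TN for a binary classification problem."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1
        elif predicted == positive:
            fp += 1
        elif actual == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall and F1 from the confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels, chosen only to exercise the functions
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, fp, fn, tn, precision, recall, f1)
```

Precision asks "of the predicted positives, how many were right?", while recall asks "of the actual positives, how many were found?"; raising one typically lowers the other, which is the trade-off named above.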
- Classification and Regression
- Application, Advantages and Disadvantages
- Distance Metric – Euclidean, Manhattan, Chebyshev, Minkowski
- Measuring accuracy using Cross-Validation, Stratified k-fold, Confusion Matrix, Precision, Recall, F1-score.
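The four distance metrics named in this module are closely related: Manhattan and Euclidean are the Minkowski distance with p = 1 and p = 2, and Chebyshev is its limit as p grows. A minimal pure-Python sketch follows; the function names and sample points are assumptions for illustration.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length points."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def manhattan(a, b):
    """Minkowski distance with p = 1 (sum of absolute differences)."""
    return minkowski(a, b, 1)


def euclidean(a, b):
    """Minkowski distance with p = 2 (straight-line distance)."""
    return minkowski(a, b, 2)


def chebyshev(a, b):
    """Largest absolute difference along any single coordinate."""
    return max(abs(x - y) for x, y in zip(a, b))


# Hypothetical sample points
point_a, point_b = (0, 0), (3, 4)
print(manhattan(point_a, point_b))
print(euclidean(point_a, point_b))
print(chebyshev(point_a, point_b))
```

Because KNN classifies by nearest neighbors, the choice of metric (and feature scaling) can change which neighbors are considered nearest.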
Module 13: Classification – SVM (Support Vector Machine)
- Classification and Regression
- Separating line, Margin and Support Vectors
- Linear SVC Classification
- Polynomial Kernel – Kernel Trick
- Gaussian Radial Basis Function (RBF) Kernel
- Grid Search to tune hyper-parameters.
- Support Vector Regression.
Module 14: Classification – Decision Trees
- CART (Classification and Regression Tree)
- Advantages and Disadvantages and its applications.
- Decision Tree Learning algorithms – ID3, C4.5, C5.0 and CART.
- Gini Impurity, Entropy and Information Gain
- Decision Tree Regression
- Visualizing a Decision Tree using graphviz module.
- Regularization by tuning hyper-parameters with GridSearchCV.
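The split criteria covered in this module (Gini impurity, entropy and information gain) can be sketched in a few lines of pure Python. The function names and the "hire"/"reject" labels below are hypothetical, used only to show the arithmetic.

```python
import math
from collections import Counter


def gini_impurity(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy (in bits) of the class distribution."""
    n = len(labels)
    return -sum(
        (count / n) * math.log2(count / n)
        for count in Counter(labels).values()
    )


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child nodes."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical labels: a perfectly mixed node split into two pure nodes
parent = ["hire", "hire", "reject", "reject"]
children = [["hire", "hire"], ["reject", "reject"]]

print(gini_impurity(parent))
print(entropy(parent))
print(information_gain(parent, children))
```

A tree learner evaluates candidate splits and prefers the one with the highest information gain (or the lowest weighted impurity).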
Module 15: Classification – Ensemble Methods
- Bootstrap Aggregating or Bagging
- Random Forest algorithm
- Extremely Randomized (Extra-Trees) Ensemble
- Boosting – AdaBoost (Adaptive Boosting), Gradient Boosting Machine (GBM), XGBoost (Extreme Gradient Boosting)
Module 16: Unsupervised Learning – Clustering
- Connectivity-based Clustering using Hierarchical Clustering.
- Ward’s Agglomerative Hierarchical Clustering
- K-Means Clustering
- Elbow Method and Silhouette Analysis
Module 17: Unsupervised Learning – Dimensionality Reduction
- Linear Principal Component Analysis (PCA) reduction.
- Kernel PCA
- Linear Discriminant Analysis (LDA) on Supervised Data.
Module 18: Model Deployment On AWS Cloud
- What is cloud computing?
- What is AWS?
- How to store data in AWS S3?
- Creating a deep learning instance on EC2.
- Amazon SageMaker to build, train, tune and deploy models to production.
Module 19: Tableau
- What is Tableau? Its applications
- Installing Tableau Public
- Tableau Application and use
- Tableau tool introduction
- Tableau UI-Dimensions and measures
- Connecting to data
- Filter and its types
- Groups
- Set
- Hierarchy
- Graphs
- Table calculation
- LOD Expression
- Data Blending
- Using the Python Interpreter
Why Yess InfoTech
How we are different from others: Our teachers cover each topic with real-time examples. They take 8 real-time projects and more than 72 assignments covering almost every topic. Our trainers come from industry, with 15 years of experience in DS, and work as Data Science, Machine Learning and AI consultants with 10+ years of real-time ML and AI implementation and migration experience.
This is completely practical-oriented training, meaning you will be able to code everything you learn. We have students who become confident in coding within 1 week of joining the training; that is the success of our teaching method. At Yess InfoTech we always hold prerequisite sessions, and we start from the basic installation of the IDEs and other required software. Our way of teaching gives students the confidence that they have up-skilled to a different level. Our students have also secured many great positions and salary ranges in many great organizations.
- 5 DS domain-based projects with real-time data (two projects per trainer)
- 9 mock interviews (3 per month)
- Unlimited Assignments
- 28 Real Time Scenarios and Major topics
- Basic Python
- Machine Learning with Python
- Installation
- Data Visualization in R
- 19 Modules on Basics
60 Hours Online Sessions
12 Hours of assignments
10 hours for one project and 50 hours for 2 projects (candidates should prepare with mentor support; the 50 hours mentioned is the total time spent on the projects by each trainer)
Unlimited Interview Questions
Administration and manual installation of Python, along with other domain-based projects, will be done on a regular basis apart from our normal batch schedule.
We do take projects
- Training By 15+ Years experienced Real Time Trainer
- A pool of 60+ real time Practical Sessions on Data Science
- Scenarios and Assignments to make sure you compete with current Industry standards
- World class training methods
- Training until the candidate is satisfied
- Certification and placement support for up to 4 years, until you are certified and placed
- All training at a reasonable cost
- 10000+ Satisfied candidates
- 5000+ Placement Records
- Corporate and online training at a reasonable cost
- Complete End-to-End Project with Each Course
- World-class lab facility equipped with i3/i5/i7 computers
- Wi-Fi available in the lab
- Resume And Interview preparation with 100% Hands-on Practical sessions
- Doubt-clearing sessions available for up to 1 year after the course
- Happy to help you any time after the course also
Trainer Profile
The trainer has 15 years of experience in Data Science, including 10 years in Machine Learning and AI, and has spent those 15 years working extensively at a top-level software company. He holds several certifications in DS and has conducted corporate sessions and seminars both in India and abroad. Recently he was engaged by Yess InfoTech to deliver sessions and to act as a professional motivator, helping working professionals achieve their day-to-day targets.
All trainers at our organization are currently working with these technologies at reputed organizations. The curriculum is not just theory or PPTs. All our sessions are practical, and we ask students to implement what they learn during the session itself. We provide notes for the same. We use simple, easy language so that the content is well absorbed by the candidates. The trainers always give assignments. Because the faculty are industry-experienced, we give real-time projects and practice. We also provide recorded sessions, which are charged separately, and our training is result-oriented.
Testimonial
It was great learning at Yess InfoTech. I attended the Python course under the guidance of a very good trainer. He started from the very basics and covered and shared everything he knew about Python. It’s not only about learning; he makes learning interesting and fun as well. It was a great experience. Worth the time and money too.