# Data Science with Python

### Course Insights

## About the course

Data science is the field of study that combines domain expertise, programming skills, and knowledge of math and statistics to extract meaningful insights from data. Data science practitioners apply machine learning algorithms to numbers, text, images, video, audio, and more to produce artificial intelligence (AI) systems that perform tasks which ordinarily require human intelligence. In turn, these systems generate insights that analysts and business users translate into tangible business value.

## Who can take up the course?

- Anyone with at least some programming experience
- Data analysts
- Software engineers

## Benefits

The course is practical and hands-on, building on theory material provided in advance.

The sessions are interactive and engaging.

All queries are answered, and guidance on certification is provided.

Data science is in high demand, and prospective job seekers have numerous opportunities. It is the fastest-growing job on LinkedIn and is predicted to create 11.5 million jobs by 2026, which makes data science a highly employable sector.

Data science is one of the most highly paid fields. According to Glassdoor, data scientists make an average of $116,100 per year, which makes it a lucrative career option.

Data science has helped various industries automate repetitive tasks. Companies use historical data to train machines to perform these tasks, simplifying arduous work that humans previously did by hand.

## Part 1-Python

- Why Python?
- What is Python?
- A Brief History of Python
- Python Versions
- Python IDEs: PyCharm, Jupyter Notebook, Spyder, etc.

- Variables & Data Types Introduction
- Data Types: Numbers, Strings, Lists
- Tuples, Dictionaries and Sets
- Practice, Questions and exercise

- Overview of Logical Condition and Loops
- The if…elif…else Statement
- Nested if…else Statement
- The while Loop
- break and continue Statement
- The for Loop
- Pass statement
- Practice, Questions and exercise

- Introduction of functions
- Function definition and return
- Function call and reuse
- Function parameters
- Function recipe and docstring
- Scope of variables
- Recursive functions
- Lambda Functions / Anonymous Functions
- Map, Filter & Reduce functions
- Practice, Questions and exercise
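
As a taste of the lambda, map, filter and reduce topics listed above, here is a minimal sketch. The sample list `numbers` is made up for illustration and is not part of the course material.

```python
from functools import reduce

# Hypothetical sample data, chosen only for illustration
numbers = [1, 2, 3, 4, 5, 6]

# map: apply a function to every element
squares = list(map(lambda x: x * x, numbers))

# filter: keep only the elements for which the predicate is true
evens = list(filter(lambda x: x % 2 == 0, numbers))

# reduce: fold the whole sequence into a single value (here, a sum)
total = reduce(lambda acc, x: acc + x, numbers, 0)

print(squares)
print(evens)
print(total)
```

In everyday Python, list comprehensions and the built-in `sum` often read more clearly than `map`, `filter` and `reduce`, but the functional forms are worth knowing.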

- Object Oriented Programming – Introduction
- Object Oriented Programming – Attributes and Class Keyword
- Object Oriented Programming – Class Object Attributes and Methods
- Object Oriented Programming – Inheritance and Polymorphism
- Object Oriented Programming – Special (Magic/Dunder) Methods
- Practice, Questions and exercise
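
The object-oriented topics above (inheritance, polymorphism and special methods) can be sketched roughly as follows. The `Shape`, `Rectangle` and `Circle` classes are hypothetical names invented for this example.

```python
import math


class Shape:
    """Base class; subclasses are expected to override area()."""

    def area(self):
        raise NotImplementedError

    def __repr__(self):
        # Special (dunder) method: controls how the object is displayed
        return f"{type(self).__name__}(area={self.area():.2f})"

    def __lt__(self, other):
        # Special method: lets sorted() compare shapes by area
        return self.area() < other.area()


class Rectangle(Shape):  # inheritance
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Circle(Shape):  # inheritance
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


# Polymorphism: the same area() call works on every kind of Shape
shapes = sorted([Rectangle(2, 3), Circle(1)])
print(shapes)
```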

- Module
- Importing Module
- Standard Module – sys
- Standard Module – os
- The dir Function
- Packages
- Practice, Questions and exercise

## Part 2-Data Analysis, Statistics, Data Visualization and EDA

- What is data analysis?
- Why python for data analysis?
- Essential Python Libraries Installation and setup
- IPython
- Jupyter Notebook
- Practice, Questions and exercise

- Introduction to NumPy
- NumPy Arrays, NumPy Data Types
- NumPy Array Indexing
- NumPy Mathematical Operations
- Indexing and slicing
- Manipulating array shapes
- Stacking arrays, Sorting arrays
- Creating array views and copies
- I/O with NumPy
- Practice, Questions and exercise

- Introduction to Pandas
- Data structure of pandas
- Pandas Series, Pandas DataFrames
- Data aggregation with Pandas
- DataFrames Concatenating and appending
- DataFrames Joining
- DataFrames Handling missing data
- Data Indexing and Selection
- Operating on data in pandas
- loc and iloc, map, apply, applymap
- groupby, string methods
- Querying data in pandas
- Dealing with dates
- Reading and Writing to CSV files with pandas
- Reading and Writing to Excel with pandas
- Reading and Writing to SQL with pandas
- Reading and Writing to HTML files with pandas
- Practice, Questions and exercise

- Descriptive Statistics
- Common charts used
- Skewness
- Inferential Statistics
- Variance, Standard Deviation
- Covariance, Coefficient, Correlation
- Probability
- Normal Distributions
- Central Limit Theorem
- Inferential statistics
- Confidence intervals
- Hypothesis Testing
- Practice, Questions and exercise
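
To make the statistics topics above a little more concrete, here is a small sketch using only Python's standard library. The `sample` values are invented, and the 95% confidence interval uses the normal critical value 1.96 as a simplifying assumption. For a sample this small, a t-distribution critical value would normally be more appropriate.

```python
import math
import statistics

# Hypothetical sample, chosen only for illustration
sample = [12.0, 15.0, 11.0, 14.0, 13.0, 16.0, 10.0, 13.0]

n = len(sample)
mean = statistics.mean(sample)
variance = statistics.variance(sample)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(sample)      # sample standard deviation

# Approximate 95% confidence interval for the mean.
# Assumption: normal critical value 1.96 instead of a t value.
standard_error = std_dev / math.sqrt(n)
margin = 1.96 * standard_error
ci = (mean - margin, mean + margin)

print(f"mean={mean}, variance={variance}, std_dev={std_dev}")
print(f"approx. 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```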

- Introduction of Matplotlib
- Basic matplotlib plots
- Line Plots, Bar Plots, Pie Plots
- Scatter Plots, Histogram Plots
- Saving plots to file
- Plotting functions in matplotlib
- Practice, Questions and exercise

- Introduction of Seaborn
- Distribution Plots
- Categorical Plots, Matrix Plots
- Bar Plots, Box Plots, Strip Plots
- Violin Plots, Clustermap Plots, Heatmap Plots
- KDE Plots, Regression Plots, Style and Color
- Practice, Questions and exercise

- Introduction to Plotly and Cufflinks
- Plotly and Cufflinks
- Introduction to Geographical Plotting
- Choropleth Maps – Part 1
- Choropleth Maps – Part 2
- Projects using Analysis and Visualisation
- Practice, Questions and exercise

- Univariate Analysis
- Segmented Analysis
- Bivariate Analysis
- Derived Columns
- Practice, Questions and exercise

## Part 3-Machine Learning

- What is Machine Learning?
- Overview of the scikit-learn package
- Types of ML
- Basic steps of ML
- ML algorithms
- Practice, Questions and exercise

- Dealing with missing data
- Identifying missing values
- Imputing missing values
- Drop samples with missing values
- Handling categorical data
- Nominal and Ordinal features
- Encoding class labels, One-hot encoding
- Split data into training and testing sets
- Feature scaling
- Practice, Questions and exercise

- Simple Regression
- Multiple Regression
- Predicting house prices with Regression
- Practice, Questions and exercise

- What is R-squared?
- What is Adjusted R-squared?
- Evaluating Regression Models Performance
- Interpreting Linear Regression Coefficients
- Practice, Questions and exercise
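
The R-squared and adjusted R-squared topics above come down to two short formulas, sketched here in plain Python. The function names and the `y_true` / `y_pred` values are hypothetical and only illustrate the calculation.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)             # total sum of squares
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_features):
    """Penalise R^2 for the number of predictors used."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)


# Hypothetical targets and predictions from a one-feature model
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
y_pred = [2.8, 5.1, 7.2, 8.9, 11.0]

r2 = r_squared(y_true, y_pred)
adj_r2 = adjusted_r_squared(r2, n_samples=len(y_true), n_features=1)
print(round(r2, 4), round(adj_r2, 4))
```

Adjusted R-squared is never larger than R-squared, and it only rises when an added feature improves the fit by more than chance would suggest.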

- K-Nearest Neighbors (KNN)
- Decision tree
- Random forest
- Support vector machines (SVM)
- Naive Bayes
- Logistic Regression
- Email Spam Filtering Project
- Practice, Questions and exercise

- Classification Models Performance
- Importance of Confusion matrix for predictions
- False Positives & False Negatives
- Measures of model evaluation – Sensitivity, specificity, precision, recall & F-score
- Practice, Questions and exercise
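
The evaluation measures listed above all derive from the four cells of a confusion matrix. The following is a minimal sketch in plain Python. The labels in `y_true` and `y_pred` are made up, and `confusion_counts` is a hypothetical helper rather than a library function.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary classification problem."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive and actual == positive:
            tp += 1  # true positive
        elif predicted == positive:
            fp += 1  # false positive
        elif actual == positive:
            fn += 1  # false negative
        else:
            tn += 1  # true negative
    return tp, fp, tn, fn


# Hypothetical labels: 1 = spam, 0 = not spam
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)

precision = tp / (tp + fp)
recall = tp / (tp + fn)        # also called sensitivity
specificity = tn / (tn + fp)
f1 = 2 * precision * recall / (precision + recall)

print(tp, fp, tn, fn)
print(precision, recall, specificity, f1)
```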

- Definition
- Types of clustering
- The k-means clustering algorithm
- Practice, Questions and exercise

- Unsupervised Learning: Introduction to Curse of Dimensionality
- What is dimensionality reduction?
- Technique used in PCA to reduce dimensions
- Applications of Principal Component Analysis (PCA)
- Practice, Questions and exercise

- Install nltk
- Tokenize words
- Tokenizing sentences
- Stop words with NLTK
- Stemming words with NLTK
- Twitter Sentiment analysis Project
- Practice, Questions and exercise

- Introduction to Neural networks
- What are Perceptrons & Types of Perceptrons?
- Workflow of a Neural network & analogy with biological neurons

- Introduction to Model Selection
- K-Fold Cross Validation Intuition
- Grid Search in Python Intuition
- Introduction to XGBoost
- XGBoost Intuition
- Practice, Questions and exercise
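
The intuition behind K-fold cross-validation, mentioned above, is that every sample is used for testing exactly once and for training in all the other folds. A minimal sketch of the index splitting is shown here. `k_fold_indices` is a hypothetical helper, and it does not shuffle the data, which is a simplification.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k consecutive folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=10, k=3))
for train, test in folds:
    print("train:", train, "test:", test)
```

In practice a model would be fitted on each `train` split and scored on the matching `test` split, with the scores averaged at the end.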

- Credit Default Prediction
- Medical treatment
- Car Price Prediction
