Data Science Digital Book

July 15, 2023

An important category of data is data gathered at equally spaced intervals of time.

July 15, 2023

The nonlinear pattern will not be captured by the mere existence of hidden layers.
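
One way to see why hidden layers alone are not enough: if no nonlinear activation is applied between layers, the stacked weight matrices multiply out to a single linear map. The sketch below is only an illustration under that assumption; the helper names `matmul` and `linear_network` and the weight values are hypothetical and not taken from the post.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def linear_network(x, weights):
    """Pass x through each layer with NO activation function (assumption)."""
    out = x
    for w in weights:
        out = matmul(out, w)
    return out


# Hypothetical weights: 2 inputs -> 2 hidden units -> 1 output.
W1 = [[1.0, 2.0], [0.0, -1.0]]
W2 = [[3.0], [1.0]]
x = [[2.0, 5.0]]

stacked = linear_network(x, [W1, W2])     # two "layers"
collapsed = matmul(x, matmul(W1, W2))     # one equivalent linear layer
```

Here `stacked` and `collapsed` hold the same value, so the hidden layer adds no expressive power until a nonlinear activation is placed between the layers.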

July 15, 2023

The goal of artificial intelligence is to simulate the human brain.

July 15, 2023

In order to simulate biological neural networks, artificial neural networks are utilised.

July 22, 2023

SVMs can be used for almost any learning task, including classification and numeric prediction.

January 14, 2023

Logistic regression predicts the probability of the outcome class. The algorithm finds the linear relationship between the independent variables and a link function of these probabilities.

July 15, 2023

The Ordinary Least Squares technique is used to find the best-fit line. The best-fit line is the line that minimizes the sum of squared deviations from the data points to the line.
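
As a minimal sketch of the least-squares idea for a single input variable, the slope and intercept that minimize the squared deviations can be computed in closed form. The function name `ols_fit` and the sample data are hypothetical, chosen only for illustration.

```python
def ols_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
slope, intercept = ols_fit(xs, ys)
```

With noisy data the fitted line no longer passes through every point, but it still has the smallest total squared deviation among all straight lines.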

July 15, 2023

Decision Trees are nonparametric hierarchical models that follow a divide-and-conquer strategy: a rule-based algorithm built on the principle of recursive partitioning.

July 15, 2023

Naive Bayes is a machine learning method based on probability, specifically Bayes' theorem.

July 15, 2023

KNN is based on calculating the distance between data points. The distance can be any of the distance measures, such as the Euclidean distance discussed in previous sections.
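
The distance-based idea can be sketched as a tiny classifier that votes among the k closest training points. This is a simplified illustration assuming Euclidean distance and majority voting; the function name `knn_predict` and the sample points are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training data: two well-separated groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels = ["a", "a", "a", "b", "b", "b"]

near_first_group = knn_predict(points, labels, (2, 2))
near_second_group = knn_predict(points, labels, (8.5, 8.5))
```

Swapping `math.dist` for another distance measure changes which neighbours count as nearest, which is why the choice of distance matters.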

July 15, 2023

The set of error functions below can be used to assess the model if the output variable 'Y' is continuous.
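
For illustration, three commonly used error functions for a continuous output are mean absolute error, mean squared error, and root mean squared error. Which functions the full post lists is not shown in this excerpt, so the selection here is an assumption, as are the sample values.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as Y."""
    return math.sqrt(mse(actual, predicted))


# Hypothetical actual and predicted values of Y.
actual = [3.0, 5.0, 7.0]
predicted = [2.0, 5.0, 9.0]

mae_value = mae(actual, predicted)
mse_value = mse(actual, predicted)
rmse_value = rmse(actual, predicted)
```

Squaring penalizes large errors more heavily than taking absolute values, so MSE and RMSE are more sensitive to outliers than MAE.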

July 15, 2023

Steps based on Training & Testing datasets - Get the historical/past data needed for analysis, which is the output of data cleansing.

January 13, 2023

Analyzing unstructured text data by converting it into structured data in key-value pair form.

July 15, 2023

A distinct sort of data, known as network data or graph data, necessitates a different kind of analysis.

July 15, 2023

'Users' are typically the rows in the data utilised for the analysis, and 'Items' will be the columns.

July 15, 2023

The same concept underlies Relationship Mining, Market Basket Analysis, and Affinity Analysis: how two entities are connected to one another and whether there is any dependence between them.

July 15, 2023

Extracting a smaller set of features from hundreds of input variables is known as Dimensionality Reduction.

July 15, 2023

Hierarchical clustering comes in two forms: the Agglomerative technique (bottom-up hierarchy of clusters) and the Divisive technique (top-down hierarchy of clusters).

July 15, 2023

Similar records are grouped together (high intra-class similarity), and dissimilar records are assigned to different groups (low inter-class similarity).

July 15, 2023

Standardize or normalize the variables before calculating the distance if the variables are on different scales or in different units.
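
A minimal sketch of why this matters, assuming z-score standardization (subtract the mean, divide by the standard deviation): two variables with very different units end up on a comparable scale, so neither dominates the distance. The function name `standardize` and the sample values are hypothetical.

```python
import statistics


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / sd for v in values]


# Hypothetical variables in very different units.
ages = [20, 30, 40]                  # years
incomes = [20000, 50000, 80000]      # currency units

z_ages = standardize(ages)
z_incomes = standardize(incomes)
```

Before standardizing, a distance computed on the raw values would be driven almost entirely by income; afterwards both variables contribute on the same footing.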

July 15, 2023

If the outcome variable 'Y' in the historical data is known, then supervised learning tasks are applied to the historical data. Supervised learning is also referred to as predictive modelling.

July 15, 2023

Feature Extraction and Feature Engineering are other names for attribute generation. Try to use domain expertise to create more insightful derived variables from the provided variables.

July 15, 2023

The goal of this stage is to locate any potential data mistakes, flaws, or problems.

July 15, 2023

Univariate Analysis - The analysis of a single variable at a time.

July 15, 2023

Other names for data cleaning include data preparation, data organisation, munging, and data wrangling.

July 15, 2023

Cross-Industry Standard Process for Data Mining (CRISP-DM). Articulate the business problem by understanding the client/customer requirements.

July 15, 2023

Definition of Artificial Intelligence, Data Science, Data Mining, Machine Learning, Deep Learning, Reinforcement Learning (RL)

