High Level Project Management – Data Science

  • December 10, 2020

Meet the Author : Mr. Bharani Kumar

Bharani Kumar Depuru is a well-known IT personality from Hyderabad. He is the Founder and Director of AiSPRY and 360DigiTMG. An alumnus of IIT and ISB with more than 17 years of experience, he has held prominent positions at organizations such as HSBC, ITC Infotech, Infosys, and Deloitte. He is an IT consultant specializing in Industrial Revolution 4.0 implementation, Data Analytics practice setup, Artificial Intelligence, Big Data Analytics, Industrial IoT, Business Intelligence, and Business Management. Bharani Kumar is also the chief trainer at 360DigiTMG, with more than ten years of experience, and has been making the IT transition journey easy for his students. 360DigiTMG is at the forefront of delivering quality education, thereby bridging the gap between academia and industry.

Data Collection

  • Primary Data Sources – Data collected first-hand for the problem at hand – Surveys / Experiments
    • Costly
    • Time-consuming / Low quality
    • Get the exact variable
  • Secondary Data Sources – Data which is collected beforehand
    • Quick access to data
    • Free of cost
    • May not contain the exact data of interest

Data Cleansing / Data Preparation / Exploratory Data Analysis / Feature Engineering

  • Data Cleansing / Data Preparation

    • Outlier Analysis / Treatment – 3R (Rectify, Retain, Remove)
    • Missingness of data – Imputation – Mean, Median, Mode, Regression, KNN
    • Normalization ((X - Min(X)) / Range(X)) / Standardization ((X - Mu) / Sigma) – Unitless and Scale Free
    • Discretization / Binning / Grouping
    • Transformation (log, exp, etc.)
      • Non-linear
      • Non-normal
      • Heteroscedasticity – unequal variance
      • Collinearity
    • Dummy variable creation – One hot encoding
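
The two rescaling formulas in the list above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a prescribed implementation: the function names and the sample values are hypothetical, and the use of the population standard deviation for sigma is an assumption.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale to the [0, 1] interval: (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        raise ValueError("range is zero; min-max scaling is undefined")
    return [(x - lo) / span for x in values]


def z_score_scale(values):
    """Rescale to mean 0 and standard deviation 1: (x - mu) / sigma."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("sigma is zero; z-score scaling is undefined")
    return [(x - mu) / sigma for x in values]


# Hypothetical sample data, for illustration only.
heights_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
min_max_scaled = min_max_scale(heights_cm)
z_scaled = z_score_scale(heights_cm)
```

Both outputs are unitless, so variables measured on different scales become comparable after either transformation.
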
  • Exploratory Data Analysis

    • First-moment business decision / Measures of central tendency
      • Mean, Median, Mode
    • Second-moment business decision / Measures of dispersion
      • Variance, Standard Deviation, Range
    • Third-moment business decision – Skewness
    • Fourth-moment business decision – Kurtosis
    • Graphical Representation
      • Univariate
        • Box Plot
          • Primary purpose – Identify outliers
          • Secondary purpose – Identify shape of distribution
        • Histogram
          • Primary purpose – Identify Shape of distribution
          • Secondary purpose – Identify outliers
        • Q-Q plot – Check whether the data are normally distributed
      • Bivariate
        • Scatter plot
          • Primary purposes
            • Direction-Positive, Negative, no correlation
            • Strength – Strong, moderate, weak – Subjective by eye; objective via the correlation coefficient r, which ranges from -1 to +1: |r| > 0.85 is strong, |r| < 0.4 is weak, values in between are moderate
            • Linear or Non-linear / Curvilinear
          • Secondary purposes
            • Clusters
            • Outliers
  • Feature Engineering / Feature Extraction – Apply domain knowledge to the given variables to derive more meaningful variables
  • Feature Selection – Decision Tree (Information Gain), Random Forest (Variable Importance plot), Hypothesis testing, Lasso regression (can shrink coefficients exactly to zero), Ridge regression (shrinks coefficients but does not set them to zero)

Data Mining (Cross-Sectional)

  • Supervised Learning / Machine Learning / Predictive Modelling (Y known)

    • Regression Analysis (Interpret the parameters)
      • Y = Continuous -> Linear Regression
      • Y = Discrete (2 categories) -> Logistic Regression
      • Y = Discrete (> 2 categories) -> Multinomial / Ordinal Regression
      • Y = Count -> Poisson / Negative Binomial Regression
      • Y = Count with excess zeros -> Zero-Inflated Poisson (ZIP) / Zero-Inflated Negative Binomial (ZINB) / Hurdle models
    • KNN
    • Black Box Techniques (parameters are not directly interpretable)
      • Neural Networks
      • SVM
    • Ensemble Techniques
      • Stacking
      • Bagging (Random Forest)
      • Boosting (typically with decision trees as base learners)
  • Unsupervised Learning (Y unknown)

    • Clustering / Segmentation – Reduce the rows
      • K-Means / Non-hierarchical – Number of clusters must be chosen upfront, e.g. with a scree plot / elbow curve
      • Hierarchical / Agglomerative – Dendrogram
      • DBSCAN
      • OPTICS
      • CLARA
      • K-medians / K-Medoids / K-modes
    • Dimension Reduction – Reduce the columns
      • PCA, Factor Analysis
      • SVD
    • Association Rules / Market Basket Analysis / Affinity Analysis
      • Support
      • Confidence
      • Lift Ratio > 1 => Antecedent and Consequent occur together more often than expected if they were independent (positive association)
    • Recommender Systems
    • Network Analytics
      • Degree
      • Closeness
      • Betweenness
      • Eigenvector
      • Page Rank
    • Text Mining & NLP
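      • BoW (Bag of Words)
      • TDM / DTM (Term-Document Matrix / Document-Term Matrix)
      • TF / TF-IDF (Term Frequency / Term Frequency-Inverse Document Frequency)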
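
The three association-rule measures listed above (support, confidence, lift) follow from simple counts over transactions. The sketch below is a minimal illustration using plain Python sets; the function name `rule_metrics` and the basket data are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    count_a = sum(1 for t in transactions if antecedent <= set(t))
    count_c = sum(1 for t in transactions if consequent <= set(t))
    count_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    if n == 0 or count_a == 0 or count_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    support = count_both / n              # P(A and C)
    confidence = count_both / count_a     # P(C | A)
    lift = confidence / (count_c / n)     # P(C | A) / P(C)
    return support, confidence, lift


# Hypothetical baskets, for illustration only.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"butter", "bread"},
]
support, confidence, lift = rule_metrics(baskets, {"bread"}, {"butter"})
```

Here bread appears in 4 of 5 baskets, butter in 3, and both together in 3, so the rule bread -> butter has support 0.6, confidence 0.75, and lift 1.25. A lift above 1 means the pair co-occurs more often than independence would predict.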

  • Forecasting / Time Series

    • Model-Based Approaches
      • Trend
        • Linear
        • Exponential
        • Quadratic
      • Seasonality
        • Additive
        • Multiplicative
    • Data-Based Approaches
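      • AR (Autoregressive)
      • MA (Moving Average)
      • ES (Exponential Smoothing)
        • SES (Simple Exponential Smoothing) – level only
        • Holt's – level and trend
        • Holt-Winters – level, trend, and seasonality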
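
As an illustration of the simplest data-based approach in the list above, simple exponential smoothing can be sketched as a weighted running level. The function name, the sample series, and the choice to initialize the level with the first observation are assumptions (one common convention among several), not something the outline specifies.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    level_t = alpha * y_t + (1 - alpha) * level_(t-1)
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # assumption: start from the first observation
    levels = [level]
    for y in series[1:]:
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels


# Hypothetical demand series, for illustration only.
demand = [10.0, 12.0, 14.0, 12.0]
levels = simple_exponential_smoothing(demand, alpha=0.5)
next_period_forecast = levels[-1]  # SES forecasts a flat line at the last level
```

A larger alpha weights recent observations more heavily. Holt's and Holt-Winters extend the same idea with additional smoothed components for trend and seasonality.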
