Workflow Element Store

Data Sources
  1. Pre-existing Data
  2. Structured Data (Tabular)
  3. Crowdsourcing
  4. Data Generation
  5. Web Scraping
  6. Data Logging
  7. APIs and Data Feeds
  8. Unstructured Data (Audio)
  9. Mobile Applications or IoT Applications
  10. Data Collaboration and Partnerships
  11. Surveys and Questionnaires
  12. Unstructured Data (Images / Videos)
  13. Public Datasets

Data Warehouse / Data Lake
  1. Oracle DB
  2. MySQL
  3. NoSQL DB
  4. Azure Data Warehouse
  5. Informatica
  6. S3
  7. RDBMS
  8. MS SQL Server
  9. Azure Blob Storage
  10. GCS
  11. PostgreSQL
  12. AWS Redshift
  13. GCP BigQuery

EDA, Data Preprocessing & Feature Engineering
  1. Textual Feature Extraction
  2. Interaction Features
  3. Feature Selection
  4. Time-Based Features
  5. Handling Noisy Data
  6. Dimensionality Reduction
  7. Data Scaling and Normalization
  8. Logarithmic Transform
  9. Handling Imbalanced Classes
  10. AutoEDA Libraries
  11. Domain-Specific Feature Engineering
  12. Encoding Categorical Variables
  13. Dealing with Outliers
  14. Handling Time-Series Data
  15. Polynomial Features
  16. Binning
  17. Handling Categorical Data
  18. Feature Extraction from Images
  19. Handling Missing Data
  20. Auto-Preprocessing Libraries
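
Three of the preprocessing elements listed above (Handling Missing Data, Data Scaling and Normalization, Encoding Categorical Variables) can be illustrated with a minimal sketch. It uses only the Python standard library; the function names and the sample values are hypothetical and chosen for illustration, and a real pipeline would more likely rely on a dedicated library.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Encode each label as a 0/1 vector with one column per distinct label."""
    categories = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in categories] for label in labels]
    return categories, rows


# Hypothetical sample data for illustration only.
ages = [20, None, 40, 60]
filled = impute_mean(ages)
scaled = min_max_scale(filled)
categories, encoded = one_hot(["red", "blue", "red"])
```

Here the missing age is filled with the mean of the three observed ages before scaling, so the order of the steps matters: scaling first would fail on the None entry.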

Model Selection
  1. Forecasting
  2. Blackbox Techniques
  3. Data Partitioning
  4. Supervised Learning - Multiclass Classification
  5. Supervised Learning - Regression
  6. Time Series Analysis
  7. Ensemble Techniques
  8. Supervised Learning - Binary Classification
  9. Train-Test Split
  10. Unsupervised Learning
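
Data Partitioning and Train-Test Split from the list above can be sketched as follows. This is a minimal standard-library illustration, not a prescribed implementation; the function name, its parameters, and the sample rows are assumptions. Setting shuffle=False gives a sequential split, which is the usual choice for time-ordered data.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0, shuffle=True):
    """Split rows into (train, test).

    With shuffle=False the last rows become the test set, which keeps
    time-ordered data in order and avoids training on the future.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    indices = list(range(len(rows)))
    if shuffle:
        # A seeded generator makes the split reproducible.
        random.Random(seed).shuffle(indices)
    n_test = max(1, round(len(rows) * test_fraction))
    cut = len(indices) - n_test
    train_idx, test_idx = indices[:cut], indices[cut:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


# Hypothetical sample data for illustration only.
rows = list(range(8))
train, test = train_test_split(rows, test_fraction=0.25, seed=42)
seq_train, seq_test = train_test_split(rows, test_fraction=0.25, shuffle=False)
```

Every row ends up in exactly one of the two partitions, so no example is used for both fitting and testing.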

Model Training & Hyperparameter Tuning
  1. Cross-Validation
  2. Learning Rate Scheduling
  3. Data Partitioning - Sequential
  4. Hyperparameter Tuning
  5. Ensemble Methods
  6. Early Stopping
  7. Regular Monitoring and Logging
  8. Regularization
  9. Weight Initialization
  10. Train-Test Split
  11. Gradient Clipping
  12. Batch Size Selection
  13. Transfer Learning
  14. Data Augmentation
  15. Batch Normalization
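
Two of the training elements listed above, Cross-Validation and Early Stopping, can be sketched without any ML library. The helper names, the patience value, and the loss values below are hypothetical; the sketch only shows the bookkeeping, not a training loop.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


def early_stopping_epoch(val_losses, patience=2):
    """Return the epoch index at which training would stop.

    Training stops once the validation loss has failed to improve on the
    best value seen so far for `patience` consecutive epochs.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses) - 1


# Hypothetical values for illustration only.
folds = list(k_fold_indices(5, k=2))
stop = early_stopping_epoch([0.9, 0.7, 0.6, 0.65, 0.64, 0.5], patience=2)
```

In the example losses, training stops before the final improvement to 0.5 is ever seen, which shows the trade-off a small patience value makes.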

Model Evaluation
  1. Regularization Techniques
  2. Model Interpretability
  3. External Validation
  4. Hyperparameter Tuning
  5. Train-Test Split
  6. Cross-Validation
  7. Data Partitioning
  8. Performance Visualization
  9. Evaluation Metrics
  10. Model Comparison
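
As an illustration of the Evaluation Metrics element above, the following minimal sketch computes accuracy, precision, recall, and F1 for a binary classifier from its confusion counts. The function name and the label vectors are hypothetical, and labels are assumed to be encoded as 0 and 1.

```python
def binary_metrics(y_true, y_pred):
    """Compute confusion counts and common metrics for 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Reporting precision and recall alongside accuracy matters most when classes are imbalanced, since accuracy alone can look high for a model that ignores the minority class.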

Model Deployment
  1. Serverless Computing
  2. Model Registry
  3. Model Drift
  4. Continuous Integration and Deployment (CI/CD)
  5. Streamlit
  6. Bias and Fairness Assessment
  7. Model Retraining and Updating
  8. Documentation and Reporting
  9. Security Considerations
  10. Cloud Deployment
  11. Error Analysis
  12. Feedback Collection
  13. Web APIs - Flask, FastAPI, etc.
  14. Edge Deployment
  15. Data Drift Monitoring
  16. Model Versioning
  17. Performance Metrics
  18. Containerization
  19. Model Monitoring and Maintenance
  20. Model Serialization
  21. A/B Testing
  22. Prediction Logging
  23. Concept Drift Detection
  24. Monitoring and Logging
  25. Alerting and Notification
  26. Documentation and API Documentation
  27. Model Health Monitoring
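
Model Serialization, Model Versioning, and Model Registry from the list above fit together roughly as in the sketch below. It is a toy in-memory registry built on the standard pickle and hashlib modules; the class name, its methods, and the "churn" model are hypothetical, and a dictionary stands in for a trained model. Pickle data should only be loaded from trusted sources, since unpickling can execute arbitrary code.

```python
import hashlib
import pickle


class InMemoryModelRegistry:
    """Toy registry: each model name maps to an ordered list of versions."""

    def __init__(self):
        self._versions = {}

    def register(self, name, model):
        """Serialize the model and store it as the next version.

        Returns (version_number, short_content_hash).
        """
        payload = pickle.dumps(model)
        digest = hashlib.sha256(payload).hexdigest()[:12]
        versions = self._versions.setdefault(name, [])
        versions.append((digest, payload))
        return len(versions), digest

    def load(self, name, version=None):
        """Deserialize a stored version; the latest one if version is None."""
        versions = self._versions[name]
        number = len(versions) if version is None else version
        if not 1 <= number <= len(versions):
            raise KeyError(f"{name} has no version {number}")
        _, payload = versions[number - 1]
        return pickle.loads(payload)


# Hypothetical models for illustration only.
registry = InMemoryModelRegistry()
v1, _ = registry.register("churn", {"weights": [0.1, 0.2], "bias": 0.0})
v2, _ = registry.register("churn", {"weights": [0.3, 0.1], "bias": 0.5})
latest = registry.load("churn")
first = registry.load("churn", version=1)
```

Keeping every version retrievable by number is what makes rollback and A/B Testing between two versions possible.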

End User Device
  1. End User Machine
  2. Mobile
ML Workflow Beginner - Architecture
  • Element belongs to model
  • Element does not belong to model
Feature Store (Online / Offline)

Data Sources

Data Warehouse / Data Lake

EDA, Data Preprocessing & Feature Engineering

Model Selection

Model Training & Hyperparameter Tuning

Model Evaluation

Model Deployment

End User Device

Model Registry