Workflow Element Store

Data Sources
  1. Pre-existing Data
  2. Unstructured Data (Images / Videos)
  3. APIs and Data Feeds
  4. Data Generation
  5. Web Scraping
  6. Structured Data (Tabular)
  7. Unstructured Data (Audio)
  8. Crowdsourcing
  9. Mobile Applications or IoT Applications
  10. Public Datasets
  11. Data Logging
  12. Data Collaboration and Partnerships
  13. Surveys and Questionnaires
Data Warehouse / Data Lake
  1. RDBMS
  2. Azure Data Warehouse
  3. Informatica
  4. MySQL
  5. Azure Blob Storage
  6. PostgreSQL
  7. Oracle DB
  8. GCS
  9. S3
  10. GCP BigQuery
  11. NoSQL DB
  12. AWS Redshift
  13. MS SQL Server
EDA, Data Preprocessing & Feature Engineering
  1. Polynomial Features
  2. Feature Extraction from Images
  3. Dealing with Outliers
  4. Data Scaling and Normalization
  5. Dimensionality Reduction
  6. Time-Based Features
  7. Handling Time-Series Data
  8. Handling Categorical Data
  9. Handling Imbalanced Classes
  10. Encoding Categorical Variables
  11. Textual Feature Extraction
  12. Handling Missing Data
  13. Binning
  14. Logarithmic Transform
  15. Handling Noisy Data
  16. Auto-Preprocessing Libraries
  17. AutoEDA Libraries
  18. Domain-Specific Feature Engineering
  19. Feature Selection
  20. Interaction Features
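Three of the preprocessing elements listed above (scaling, logarithmic transform, binning) can be sketched in a few lines. This is a minimal illustration using only the standard library, not a prescribed implementation; the function names and the `incomes` column are hypothetical, and in practice a preprocessing library would usually handle these steps.

```python
import math

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - lo) / (hi - lo) for v in values]

def log_transform(values):
    """Apply log(1 + x) to compress a right-skewed, non-negative feature."""
    return [math.log1p(v) for v in values]

def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width bins (0-indexed)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would fall just past the last bin, so clamp it.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

incomes = [20.0, 35.0, 50.0, 120.0]  # hypothetical feature column
scaled = min_max_scale(incomes)
logged = log_transform(incomes)
binned = equal_width_bins(incomes, n_bins=4)
```

Scaling and binning parameters (minimum, maximum, bin width) would normally be computed on the training partition only and then reused on the test partition, so that no information leaks from the held-out data.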
Model Selection
  1. Ensemble Techniques
  2. Black-Box Techniques
  3. Time Series Analysis
  4. Forecasting
  5. Train-Test Split
  6. Supervised Learning - Binary Classification
  7. Data Partitioning
  8. Supervised Learning - Regression
  9. Unsupervised Learning
  10. Supervised Learning - Multiclass Classification
Model Training & Hyperparameter Tuning
  1. Weight Initialization
  2. Cross-Validation
  3. Data Partitioning - Sequential
  4. Hyperparameter Tuning
  5. Regularization
  6. Train-Test Split
  7. Early Stopping
  8. Learning Rate Scheduling
  9. Regular Monitoring and Logging
  10. Batch Normalization
  11. Ensemble Methods
  12. Data Augmentation
  13. Gradient Clipping
  14. Batch Size Selection
  15. Transfer Learning
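As one example from the training elements above, early stopping can be sketched as a patience counter over validation losses. This is an illustrative sketch only; the function name, the `patience` parameter, and the loss values are assumptions, and deep learning frameworks ship their own callbacks for this.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve on the
    best value seen so far for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1  # never triggered: ran all epochs

# Hypothetical validation losses, one per epoch.
losses = [0.90, 0.70, 0.65, 0.66, 0.68, 0.64]
stop_epoch = early_stopping_epoch(losses, patience=2)
```

A larger `patience` tolerates short plateaus at the cost of extra epochs; with the losses above, `patience=3` would let training reach the final epoch.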
Model Evaluation
  1. Hyperparameter Tuning
  2. Train-Test Split
  3. Regularization Techniques
  4. Cross-Validation
  5. Model Interpretability
  6. Data Partitioning
  7. Model Comparison
  8. External Validation
  9. Evaluation Metrics
  10. Performance Visualization
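Two of the evaluation elements above, cross-validation and evaluation metrics, can be illustrated with a small standard-library sketch. The helper names and the toy labels are hypothetical; this shows the idea of k-fold partitioning and precision/recall rather than any particular library's API.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (positive class = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

folds = list(k_fold_indices(n_samples=5, k=2))

# Hypothetical ground truth and predictions.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 1, 0, 1, 0]
precision, recall = precision_recall(y_true, y_pred)
```

Contiguous folds are used here for simplicity; shuffled or stratified folds are the more common choice for non-sequential data, while sequential data usually calls for a time-ordered split.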
Model Deployment
  1. Alerting and Notification
  2. Documentation and API Documentation
  3. Cloud Deployment
  4. Performance Metrics
  5. Edge Deployment
  6. Feedback Collection
  7. Containerization
  8. Concept Drift Detection
  9. Model Drift
  10. Model Versioning
  11. Model Registry
  12. Web APIs - Flask, FastAPI, etc.
  13. Model Monitoring and Maintenance
  14. Error Analysis
  15. Data Drift Monitoring
  16. Streamlit
  17. Model Health Monitoring
  18. Monitoring and Logging
  19. Bias and Fairness Assessment
  20. Prediction Logging
  21. Documentation and Reporting
  22. Continuous Integration and Deployment (CI/CD)
  23. Model Serialization
  24. Model Retraining and Updating
  25. A/B Testing
  26. Serverless Computing
  27. Security Considerations
End User Device
  1. Mobile
  2. End User Machine
ML Workflow - Architecture
  • Element belongs to model
  • Element does not belong to model

Feature Store
(Online / Offline)

Data Sources

Data Warehouse / Data Lake

EDA, Data Preprocessing & Feature Engineering

Model Selection

Model Training & Hyperparameter Tuning

Model Evaluation

Model Deployment

End User Device

Model Registry
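One way to read the architecture above is as an ordered list of stages, each holding elements from the store, with a flag marking whether an element belongs to a given model (the legend's two states). The sketch below is a hypothetical data structure, not the tool's actual implementation: the class names, the `in_model` flag, and which elements are selected are all assumptions, and only three of the stages are shown.

```python
from dataclasses import dataclass, field

@dataclass
class Element:
    name: str
    in_model: bool = False  # legend: element belongs / does not belong to model

@dataclass
class Stage:
    name: str
    elements: list = field(default_factory=list)

    def selected(self):
        """Names of the elements in this stage that belong to the model."""
        return [e.name for e in self.elements if e.in_model]

# Stage and element names come from the lists above;
# which elements are flagged is a hypothetical choice for illustration.
workflow = [
    Stage("Data Sources", [
        Element("Public Datasets", in_model=True),
        Element("Crowdsourcing"),
    ]),
    Stage("Data Warehouse / Data Lake", [
        Element("PostgreSQL", in_model=True),
        Element("S3"),
    ]),
    Stage("Model Evaluation", [
        Element("Cross-Validation", in_model=True),
        Element("Evaluation Metrics", in_model=True),
        Element("External Validation"),
    ]),
]

def describe_model(stages):
    """Map each stage name to the elements that belong to the model."""
    return {stage.name: stage.selected() for stage in stages}

summary = describe_model(workflow)
```

Keeping the stages in a list preserves the left-to-right order of the diagram, while the per-element flag lets the same store describe many different models.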