Workflow Element Store

  1. Dimensionality Reduction
  2. Binning / Discretization
  3. Handling Missing Data
  4. Handling Time-Series Data
  5. Auto-Preprocessing libraries
  6. Annotation
  7. Handling Noisy Data
  8. Feature Selection
  9. Handling Imbalanced Classes
  10. Polynomial Features
  11. Data Scaling and Normalization
  12. Feature Extraction from Images
  13. Handling Categorical Data
  14. Augmentation
  15. Data Partitioning - Train, Validation, & Test
  16. Dealing with Outliers
  17. Data Transformations
  18. Time-Based Features
  19. Textual Feature Extraction
  20. AutoEDA libraries
  21. Interaction Features
  22. Domain-Specific Feature Engineering
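As an illustration of item 15 (Data Partitioning - Train, Validation, & Test), the following is a minimal sketch using only the Python standard library. The function name `partition`, the 70/15/15 default fractions, and the fixed seed are assumptions made for this example; the source does not prescribe a particular split or tool.

```python
import random


def partition(rows, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle a copy of `rows` and split it into train, validation, and test lists.

    The fractions and seed are illustrative defaults, not values from the source.
    """
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError("fractions must be non-negative and sum to less than 1")

    shuffled = list(rows)
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)

    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Example: split 100 record ids into the three partitions.
train, val, test = partition(range(100))
```

The three partitions are disjoint and together cover every input row, so the test set stays unseen during training and tuning.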
  1. AutoML
  2. Data Augmentation
  3. Reinforcement Learning
  4. Natural Language Processing
  5. Regularization
  6. Learning Rate Scheduling
  7. Forecasting Techniques
  8. Recommendation Engine
  9. Cross-Validation
  10. Binary Classification Techniques
  11. Batch Normalization
  12. Multiclass Classification Techniques
  13. Association Rules
  14. Network Analytics / Geospatial Analytics
  15. Hyperparameter Tuning
  16. Regularization Techniques
  17. Blackbox - Neural Network Models
  18. Transfer Learning
  19. Ensemble Techniques
  20. Batch Size Selection
  21. Performance Visualization
  22. Regression Analysis
  23. Regular Monitoring and Logging
  24. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  25. Early Stopping
  26. Weight Initialization
  27. External Validation
  28. Model Comparison
  29. Clustering
  30. Evaluation Metrics
  31. Model Interpretability
  32. Word Embeddings
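As an illustration of the Cross-Validation item in the list above, here is a minimal k-fold sketch in plain Python. The helper name `k_fold_indices` is hypothetical and the folds are contiguous and unshuffled for simplicity; in practice a library implementation would normally be used instead.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) pairs for k contiguous folds.

    Each sample lands in exactly one validation fold. When n_samples is not
    divisible by k, the first (n_samples % k) folds get one extra sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")

    indices = list(range(n_samples))
    base, extra = divmod(n_samples, k)

    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


# Example: 10 samples, 3 folds.
folds = list(k_fold_indices(10, 3))
```

A model would be fitted on each `train_idx` and scored on the matching `val_idx`, and the k scores averaged to estimate generalization performance.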
  1. Data Preprocessing pipeline models
  2. Apache Airflow
  3. Data Warehouse
  4. GitHub Actions
  5. Evidently.ai
  6. Databases
  7. Kafka Brokers
  8. Model Registry
  9. GitHub
  10. Code Repository
ML Workflow Advanced - Architecture
  • Element belongs to model
  • Element does not belong to model
Data Sources

Streaming Data

Batch Data

Cloud Storage

Labeled Data

Feature Engineering Pipeline

Experimentation

ML Model

Repository

CI/CD component

Continuous integration/Continuous delivery

Continuous deployment

Artifact Store

Feature Store System

Offline DB / Online DB

Orchestration Component

Artifact Store

CI/CD Component

Model Registry

Scheduler

Workflow orchestration component

Automation ML Workflow Pipeline

Monitoring Component

Model serving component

(Prediction on new batch or streaming data)