Workflow Element Store

  1. APIs and Data Feeds
  2. Flat files
  3. Surveys and Questionnaires
  4. Feedback Data
  5. Data Collaboration and Partnerships
  6. Mobile Applications or IoT Applications
  7. Databases - NoSQL
  8. Public Datasets
  9. Web Scraping
  10. Experiments (DoE)
  11. Databases - SQL
  1. AWS RDS
  2. Azure Stream Analytics
  3. AWS Glue
  4. PostgreSQL
  5. GCP Dataflow
  6. Apache Kafka
  7. MongoDB
  8. ETL/ELT pipeline
  9. GCP Data Fusion
  10. AWS Kinesis
  11. Oracle DB
  12. MS SQL Server
  13. GCP BigQuery
  14. Azure Data Factory (ADF)
  15. GCS
  16. AWS Redshift
  17. AWS S3
  18. MySQL
  19. Azure Blob Storage
  20. Azure Synapse
  21. RDBMS
  1. Handling Noisy Data
  2. Dimensionality Reduction
  3. AutoEDA libraries
  4. Annotation
  5. Feature Extraction from Images
  6. Interaction Features
  7. Dealing with Outliers
  8. Feature Selection
  9. Augmentation
  10. Data Transformations
  11. Auto-Preprocessing libraries
  12. Handling Missing Data
  13. Data Scaling and Normalization
  14. Time-Based Features
  15. Handling Time-Series Data
  16. Textual Feature Extraction
  17. Polynomial Features
  18. Binning / Discretization
  19. Handling Imbalanced Classes
  20. Data Partitioning - Train, Validation, & Test
  21. Handling Categorical Data
  22. Domain-Specific Feature Engineering
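
Two of the preprocessing elements above, data partitioning and data scaling, can be sketched with the standard library alone. This is a minimal illustration, not a prescribed implementation: the function names `partition` and `fit_min_max`, the 70/15/15 split, and the toy data are all assumptions made for the example. The point it shows is that scaling parameters are learned from the training split only and then reused on the other splits.

```python
import random

def partition(rows, train=0.7, validation=0.15, seed=0):
    """Shuffle rows and split them into train, validation, and test lists."""
    if not 0 < train < 1 or not 0 <= validation < 1 or train + validation >= 1:
        raise ValueError("fractions must leave a non-empty test share")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible splits
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

def fit_min_max(values):
    """Learn min-max scaling parameters from the given (training) values."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant columns
    return lambda x: (x - lo) / span

rows = list(range(100))                      # hypothetical single-feature data
train_rows, val_rows, test_rows = partition(rows)
scale = fit_min_max(train_rows)              # fitted on the training split only
scaled_train = [scale(x) for x in train_rows]
scaled_test = [scale(x) for x in test_rows]  # may fall slightly outside [0, 1]
```

Fitting the scaler on the training split alone keeps information from the validation and test splits out of the model inputs.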
  1. AutoML
  2. Binary Classification Techniques
  3. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  4. Regular Monitoring and Logging
  5. Evaluation Metrics
  6. Data Augmentation
  7. Weight Initialization
  8. Multiclass Classification Techniques
  9. Black-Box Neural Network Models
  10. Performance Visualization
  11. Reinforcement Learning
  12. Model Interpretability
  13. Cross-Validation
  14. Word Embeddings
  15. Regression Analysis
  16. Model Comparison
  17. Batch Size Selection
  18. Hyperparameter Tuning
  19. Clustering
  20. Regularization Techniques
  21. Network Analytics / Geospatial Analytics
  22. Batch Normalization
  23. Ensemble Techniques
  24. Learning Rate Scheduling
  25. Transfer Learning
  26. Forecasting Techniques
  27. Natural Language Processing
  28. Recommendation Engine
  29. Association Rules
  30. Early Stopping
  31. External Validation
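
Cross-validation and hyperparameter tuning from the list above fit together as a grid search: each candidate value is scored by k-fold cross-validation and the best one is kept. The sketch below is a standard-library illustration of that idea, not the scikit-learn GridSearchCV implementation. The one-feature ridge model, the helper names, and the toy data are assumptions chosen to keep the example short.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds over n samples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, val
        start += size

def fit_ridge(xs, ys, alpha):
    """One-feature ridge regression without intercept: w = sum(xy) / (sum(x^2) + alpha)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def cv_error(xs, ys, alpha, k=5):
    """Mean squared validation error, averaged over the k folds."""
    errors = []
    for train, val in k_fold_indices(len(xs), k):
        w = fit_ridge([xs[i] for i in train], [ys[i] for i in train], alpha)
        errors.append(sum((ys[i] - w * xs[i]) ** 2 for i in val) / len(val))
    return sum(errors) / len(errors)

def grid_search(xs, ys, alphas, k=5):
    """Return the candidate alpha with the lowest cross-validated error."""
    return min(alphas, key=lambda a: cv_error(xs, ys, a, k))

xs = [float(i) for i in range(1, 21)]
ys = [3.0 * x for x in xs]  # noiseless toy data, so no penalty should win
best_alpha = grid_search(xs, ys, alphas=[0.0, 1.0, 10.0, 100.0])
```

The folds here are contiguous and unshuffled. For ordered data such as time series, a forward-chaining split would usually be the safer choice.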
  1. Databases
  2. Data Preprocessing Pipeline Models
  3. Code Repository
  4. Data Warehouse
  5. Model Registry
  1. Bias and Fairness Assessment
  2. Cloud Deployment
  3. Feedback Collection
  4. Prediction Logging
  5. Data Drift Monitoring
  6. Flask
  7. Model Health Monitoring
  8. Alerting and Notification
  9. Model Serialization
  10. FastAPI
  11. Streamlit
  12. Containerization
  13. Edge Deployment
  14. Model Drift
  15. Serverless Computing
  16. Performance Metrics
  17. Model Versioning
  18. Concept Drift Detection
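
Data drift monitoring, one of the elements above, is often approached by comparing the distribution of a feature at training time with its distribution in production. One common statistic for this is the Population Stability Index (PSI). The sketch below is a minimal standard-library version under stated assumptions: equal-width buckets taken from the reference sample, a small floor for empty buckets, and hypothetical sample data. Any alerting threshold would need to be chosen per use case.

```python
import math

def population_stability_index(expected, actual, bins=10):
    """PSI between two numeric samples, bucketed on the expected sample's range."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant sample

    def shares(sample):
        counts = [0] * bins
        for x in sample:
            idx = int((x - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # A small floor keeps the logarithm defined for empty buckets.
        return [max(c / len(sample), 1e-6) for c in counts]

    e, a = shares(expected), shares(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

reference = [i / 100 for i in range(1000)]   # hypothetical training-time feature
stable_score = population_stability_index(reference, reference)
shifted_score = population_stability_index(reference, [x + 5 for x in reference])
```

An identical distribution scores zero, and the score grows as the production sample moves away from the reference, which makes it a natural input for the alerting and notification element.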
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline
  • Data Collection: Inference API, API Stream, Web crawler (Python, Selenium)
  • Data Ingestion
  • Data Landing Zone: Store Data from all the Sources
  • Data Cleaning / Preprocessing: Derived & Base features
  • Data Training & Modelling
Inference Pipeline
  • Input Data for Forecasting: Input Data
  • Cleaned & Processed Data
  • Inference: pickle, Joblib, Streamlit, Inference API
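
The hand-off between the two pipelines in the architecture can be sketched as follows: the training pipeline cleans the collected data, fits a model, and serializes it, and the inference pipeline loads that artifact to score new input. This is an illustrative standard-library sketch, not the architecture's actual code. The least-squares model, the function names, and the data are assumptions, and `pickle.dumps` stands in for writing a pickle or Joblib file to a store.

```python
import pickle
from statistics import mean

def clean(rows):
    """Shared cleaning step: drop records with a missing value."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def train(rows):
    """Fit y = slope * x + intercept by ordinary least squares."""
    xs, ys = zip(*rows)
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in rows)
             / sum((x - x_bar) ** 2 for x in xs))
    return {"slope": slope, "intercept": y_bar - slope * x_bar}

def predict(model, x):
    """Score one input value with the fitted line."""
    return model["slope"] * x + model["intercept"]

# Training pipeline: collect -> clean -> train -> serialize.
raw = [(1, 3.0), (2, 5.0), (None, 9.0), (3, 7.0), (4, 9.0)]  # hypothetical data
artifact = pickle.dumps(train(clean(raw)))  # bytes; a file or registry entry in practice

# Inference pipeline: load the serialized model and score new input data.
model = pickle.loads(artifact)
forecast = predict(model, 5)
```

Only artifacts from a trusted source should be unpickled, since loading a pickle can execute arbitrary code.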