Workflow Element Store

  1. Databases - SQL
  2. Data Collaboration and Partnerships
  3. APIs and Data Feeds
  4. Experiments (DoE)
  5. Feedback Data
  6. Databases - NoSQL
  7. Flat files
  8. Public Datasets
  9. Web Scraping
  10. Mobile Applications or IoT Applications
  11. Surveys and Questionnaires
  1. GCP Dataflow
  2. Apache Kafka
  3. PostgreSQL
  4. AWS Redshift
  5. GCP Data Fusion
  6. RDBMS
  7. AWS S3
  8. Azure Synapse
  9. ETL/ELT pipeline
  10. MySQL
  11. AWS RDS
  12. Azure Data Factory (ADF)
  13. AWS Kinesis
  14. Oracle DB
  15. GCS
  16. MS SQL Server
  17. MongoDB
  18. GCP BigQuery
  19. Azure Blob Storage
  20. AWS Glue
  21. Azure Stream Analytics
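
The "ETL/ELT pipeline" element in the list above can be illustrated with a minimal sketch that uses only the Python standard library. It assumes a flat-file source and an in-memory SQLite database standing in for the SQL stores listed; the CSV contents, the `readings` table, and the function names are hypothetical and chosen only for illustration.

```python
import csv
import io
import sqlite3

RAW_CSV = """sensor_id,reading,unit
a1,20.5,C
a2,,C
a3,68.0,F
"""  # hypothetical flat-file extract


def extract(text):
    """Extract: parse the flat file into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with missing readings, convert Fahrenheit to Celsius."""
    cleaned = []
    for row in rows:
        if not row["reading"]:
            continue  # skip missing values
        value = float(row["reading"])
        if row["unit"] == "F":
            value = (value - 32.0) * 5.0 / 9.0
        cleaned.append((row["sensor_id"], round(value, 2)))
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into a SQL table."""
    conn.execute("CREATE TABLE IF NOT EXISTS readings (sensor_id TEXT, celsius REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a real RDBMS
load(conn, transform(extract(RAW_CSV)))
stored = conn.execute(
    "SELECT sensor_id, celsius FROM readings ORDER BY sensor_id"
).fetchall()
```

In an ELT variant, the raw rows would be loaded first and the transformation would run inside the warehouse instead.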
  1. Dealing with Outliers
  2. Handling Missing Data
  3. Polynomial Features
  4. Textual Feature Extraction
  5. Feature Extraction from Images
  6. Auto-Preprocessing libraries
  7. Data Scaling and Normalization
  8. Interaction Features
  9. Domain-Specific Feature Engineering
  10. Time-Based Features
  11. Binning / Discretization
  12. Handling Categorical Data
  13. Augmentation
  14. Data Transformations
  15. AutoEDA libraries
  16. Handling Time-Series Data
  17. Feature Selection
  18. Annotation
  19. Handling Noisy Data
  20. Dimensionality Reduction
  21. Handling Imbalanced Classes
  22. Data Partitioning - Train, Validation, & Test
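
Two of the elements above, "Data Partitioning - Train, Validation, & Test" and "Data Scaling and Normalization", interact: scaling parameters are normally learned from the training split only and then reused on the other splits. The following is a minimal standard-library sketch of that idea, not a prescribed implementation; the 70/15/15 split, the seed, and the single-feature dataset are assumptions.

```python
import random


def train_val_test_split(rows, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle rows and split them into train, validation, and test lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def fit_min_max(values):
    """Learn min-max scaling parameters from the training values only."""
    return min(values), max(values)


def apply_min_max(values, lo, hi):
    """Scale values with previously fitted parameters; a constant column maps to 0.0."""
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


data = [float(x) for x in range(100)]  # hypothetical single-feature dataset
train, val, test = train_val_test_split(data)
lo, hi = fit_min_max(train)
train_scaled = apply_min_max(train, lo, hi)
val_scaled = apply_min_max(val, lo, hi)  # may fall slightly outside [0, 1]
test_scaled = apply_min_max(test, lo, hi)
```

Fitting the scaler on the training split alone keeps information from the validation and test splits out of the preprocessing step.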
  1. Evaluation Metrics
  2. Transfer Learning
  3. Data Augmentation
  4. Cross-Validation
  5. Clustering
  6. Multiclass Classification Techniques
  7. Regularization
  8. Regression Analysis
  9. Blackbox - Neural Network Models
  10. Regularization Techniques
  11. Word Embeddings
  12. Batch Size Selection
  13. Transfer Learning
  14. Weight Initialization
  15. Reinforcement Learning
  16. External Validation
  17. AutoML
  18. Ensemble Techniques
  19. Model Interpretability
  20. Model Comparison
  21. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  22. Learning Rate Scheduling
  23. Forecasting Techniques
  24. Recommendation Engine
  25. Batch Normalization
  26. Regular Monitoring and Logging
  27. Natural Language Processing
  28. Cross-Validation
  29. Binary Classification Techniques
  30. Association Rules
  31. Network Analytics / Geospatial Analytics
  32. Performance Visualization
  33. Hyperparameter Tuning
  34. Early Stopping
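
The "Cross-Validation" and "Evaluation Metrics" elements above can be sketched together in plain Python. This is an illustrative sketch under stated assumptions: the folds are contiguous and unshuffled, the "model" is a baseline that predicts the training-fold mean, and the target values are hypothetical. In practice a library such as scikit-learn would usually take the place of these helpers.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validate_mean_baseline(y, k=5):
    """Score a baseline that always predicts the mean of the training fold."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(y), k):
        train_mean = sum(y[i] for i in train_idx) / len(train_idx)
        preds = [train_mean] * len(val_idx)
        scores.append(mean_squared_error([y[i] for i in val_idx], preds))
    return scores


y = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0]  # hypothetical targets
fold_scores = cross_validate_mean_baseline(y, k=5)
cv_score = sum(fold_scores) / len(fold_scores)  # average validation error
```

For non-time-series data the rows would typically be shuffled before folding; for time series, order-preserving splits are the usual choice.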
  1. Data Preprocessing Pipeline Models
  2. Model Registry
  3. Code Repository
  4. Data Warehouse
  5. Databases
  1. Serverless Computing
  2. Bias and Fairness Assessment
  3. Model Versioning
  4. FastAPI
  5. Alerting and Notification
  6. Flask
  7. Feedback Collection
  8. Model Serialization
  9. Streamlit
  10. Data Drift Monitoring
  11. Containerization
  12. Performance Metrics
  13. Model Drift
  14. Concept Drift Detection
  15. Model Health Monitoring
  16. Prediction Logging
  17. Edge Deployment
  18. Cloud Deployment
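
For the "Data Drift Monitoring" element above, one commonly used check is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time with its distribution in production. The sketch below is a minimal standard-library version; the equal-width binning, the `1e-6` floor for empty bins, the sample data, and the 0.2 alert threshold (a common rule of thumb, not a value from this document) are all assumptions.

```python
import math


def population_stability_index(reference, current, n_bins=10):
    """Compare two samples of one numeric feature using equal-width bins."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # constant reference feature: everything lands in one bin

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values to edge bins
            counts[idx] += 1
        # A small floor avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_frac = bin_fractions(reference)
    cur_frac = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


reference = [i / 100 for i in range(100)]  # hypothetical training-time feature values
shifted = [v + 0.5 for v in reference]     # hypothetical production values with a shift

psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)
drift_alert = psi_shifted > 0.2  # assumed threshold
```

A check like this could feed the "Alerting and Notification" element when the index crosses the chosen threshold.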
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline
  • Data Collection: Inference API, API Stream, Web crawler (Python, Selenium)
  • Data Ingestion
  • Data Landing Zone: Store Data from all the Sources
  • Data Cleaning / Preprocessing: Derived & Base features
  • Data Training & Modelling
Inference Pipeline
  • Input Data for Forecasting: Input Data
  • Cleaned & Processed Data
  • Inference: Inference pickle, Inference Joblib, streamlit, Inference API
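
The hand-off between the two pipelines, where the training pipeline writes a serialized model and the inference pipeline loads it, can be sketched with the standard-library `pickle` module. This is an illustrative sketch only: the one-feature least-squares model, the training data, and the `model.pkl` file name are hypothetical. `joblib.dump` and `joblib.load` follow the same pattern and are often preferred for models holding large NumPy arrays.

```python
import pickle
import tempfile
from pathlib import Path


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns plain-dict parameters."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, xs):
    """Apply the fitted line to new inputs."""
    return [model["slope"] * x + model["intercept"] for x in xs]


# Training pipeline side: fit the model and serialize it.
model = fit_line([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0])  # hypothetical data, y = 2x + 1
artifact_path = Path(tempfile.mkdtemp()) / "model.pkl"  # hypothetical artifact name
with open(artifact_path, "wb") as f:
    pickle.dump(model, f)

# Inference pipeline side: load the artifact and predict on new input data.
with open(artifact_path, "rb") as f:
    loaded_model = pickle.load(f)
forecast = predict(loaded_model, [5.0, 6.0])
```

Pickle files can execute arbitrary code when loaded, so an inference service should only load artifacts from a trusted source such as its own model registry.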