Workflow Element Store

  1. Mobile Applications or IoT Applications
  2. Surveys and Questionnaires
  3. Web Scraping
  4. Experiments (DoE)
  5. Feedback Data
  6. APIs and Data Feeds
  7. Flat files
  8. Databases - NoSQL
  9. Public Datasets
  10. Data Collaboration and Partnerships
  11. Databases - SQL
  1. GCS
  2. AWS RDS
  3. Azure Synapse
  4. AWS Glue
  5. GCP Data Fusion
  6. MS SQL Server
  7. Azure Data Factory (ADF)
  8. Oracle DB
  9. Azure Stream Analytics
  10. AWS Kinesis
  11. ETL/ELT pipeline
  12. MongoDB
  13. Apache Kafka
  14. AWS Redshift
  15. AWS S3
  16. GCP Dataflow
  17. PostgreSQL
  18. Azure Blob Storage
  19. GCP BigQuery
  20. MySQL
  21. RDBMS
  1. Feature Selection
  2. Augmentation
  3. Data Scaling and Normalization
  4. Dealing with Outliers
  5. Domain-Specific Feature Engineering
  6. Handling Categorical Data
  7. AutoEDA libraries
  8. Handling Imbalanced Classes
  9. Handling Missing Data
  10. Handling Time-Series Data
  11. Binning / Discretization
  12. Handling Noisy Data
  13. Data Transformations
  14. Data Partitioning - Train, Validation, & Test
  15. Dimensionality Reduction
  16. Time-Based Features
  17. Feature Extraction from Images
  18. Textual Feature Extraction
  19. Polynomial Features
  20. Interaction Features
  21. Auto-Preprocessing libraries
  22. Annotation
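
Two of the preprocessing elements listed above, data partitioning and data scaling, can be sketched together. This is a minimal illustration using only the Python standard library; the function names `split_dataset` and `fit_min_max`, the 60/20/20 split, and the toy data are all assumptions, not part of the original workflow. The scaler is fitted on the training split only, so validation and test values do not leak into it.

```python
import random


def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle a copy of the rows and cut it into train, validation, and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains after train and validation
    return train, val, test


def fit_min_max(values):
    """Learn min and max from the given values and return a scaling function."""
    lo, hi = min(values), max(values)
    span = hi - lo

    def scale(v):
        # Constant columns have no range, so map them to 0.0
        return 0.0 if span == 0 else (v - lo) / span

    return scale


# Hypothetical toy data: ten numeric observations
data = [float(v) for v in range(10)]
train, val, test = split_dataset(data)

# Fit the scaler on the training split only, then reuse it everywhere
scale = fit_min_max(train)
scaled_train = [scale(v) for v in train]
scaled_val = [scale(v) for v in val]
scaled_test = [scale(v) for v in test]
```

Validation and test values that fall outside the training range will scale to values below 0 or above 1, which is expected when the scaler is fitted on the training split alone.
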
  1. Forecasting Techniques
  2. Batch Size Selection
  3. Word Embeddings
  4. Blackbox - Neural Network Models
  5. External Validation
  6. Natural Language Processing
  7. Recommendation Engine
  8. Weight Initialization
  9. Ensemble Techniques
  10. Model Comparison
  11. Cross-Validation
  12. Learning Rate Scheduling
  13. Binary Classification Techniques
  14. Performance Visualization
  15. Reinforcement Learning
  16. Regression Analysis
  17. Batch Normalization
  18. Regularization
  19. Clustering
  20. Network Analytics / Geospatial Analytics
  21. Transfer Learning
  22. Regular Monitoring and Logging
  23. Data Augmentation
  24. AutoML
  25. Multiclass Classification Techniques
  26. Evaluation Metrics
  27. Association Rules
  28. Hyperparameter Tuning
  29. Early Stopping
  30. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  31. Model Interpretability
  32. Regularization Techniques
  1. Data Preprocessing pipeline models
  2. Data warehouse
  3. Databases
  4. Code repository
  5. Model registry
  1. Data Drift Monitoring
  2. Alerting and Notification
  3. Model Health Monitoring
  4. Prediction Logging
  5. Feedback Collection
  6. Streamlit
  7. Model Serialization
  8. Concept Drift Detection
  9. Containerization
  10. Performance Metrics
  11. Model Versioning
  12. Cloud Deployment
  13. Model Drift
  14. Edge Deployment
  15. Serverless Computing
  16. FastAPI
  17. Bias and Fairness Assessment
  18. Flask
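
As a rough illustration of the data drift monitoring and alerting elements listed above, the sketch below compares a new batch of a numeric feature against the training data. It uses a simple standardized mean shift, which is a stand-in chosen for brevity rather than a rigorous statistical test; the function names, the 0.5 threshold, and the sample values are all hypothetical.

```python
from statistics import mean, stdev


def mean_shift_score(reference, current):
    """Absolute difference in means, in units of the reference standard deviation."""
    ref_std = stdev(reference)
    if ref_std == 0:
        raise ValueError("reference data has zero variance")
    return abs(mean(current) - mean(reference)) / ref_std


def has_drifted(reference, current, threshold=0.5):
    """Flag drift when the mean moves by more than `threshold` reference deviations."""
    return mean_shift_score(reference, current) > threshold


# Hypothetical feature values seen at training time and in two production batches
training_values = [10.0, 11.0, 9.0, 10.5, 9.5]
stable_batch = [10.2, 9.8, 10.1, 9.9]
shifted_batch = [13.0, 12.5, 13.5, 12.8]

for name, batch in [("stable", stable_batch), ("shifted", shifted_batch)]:
    if has_drifted(training_values, batch):
        # In a real deployment this is where an alert or notification would be sent
        print(f"{name} batch: drift detected")
    else:
        print(f"{name} batch: no drift detected")
```

A mean shift only catches changes in location. Changes in spread or shape need a distribution-level comparison, so this check should be read as a starting point only.
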
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline
  • Data Collection
  • Inference API
  • API Stream
  • Web crawler (Python, Selenium)
  • Data Ingestion
  • Data Landing Zone: Store Data from all the Sources
  • Data Cleaning / Preprocessing
  • Derived & Base features
  • Data Training & Modelling
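
The training pipeline stages in the diagram can be sketched as a chain of small functions, one per stage. This is only an illustration under stated assumptions: the sample records, the function names, the `LinearTrend` class, and the choice of a least-squares trend line as the model are all hypothetical and stand in for whatever sources and models the real pipeline uses.

```python
from statistics import mean

# Data Collection: hypothetical raw records from two of the sources in the diagram
RAW_SOURCES = {
    "api_stream": [{"day": 1, "sales": "10"}, {"day": 2, "sales": "12"}],
    "web_crawler": [{"day": 3, "sales": None}, {"day": 4, "sales": "16"}],
}


def ingest(sources):
    """Data Ingestion: copy records from every source into one landing zone."""
    landing_zone = []
    for name, records in sources.items():
        for record in records:
            landing_zone.append({**record, "source": name})
    return landing_zone


def clean(landing_zone):
    """Data Cleaning / Preprocessing: drop missing values and cast types."""
    return [
        {"day": int(r["day"]), "sales": float(r["sales"])}
        for r in landing_zone
        if r["sales"] is not None
    ]


def add_features(rows):
    """Derived & Base features: keep the base columns and add one derived column."""
    return [{**r, "is_even_day": r["day"] % 2 == 0} for r in rows]


class LinearTrend:
    """Tiny model: sales = slope * day + intercept."""

    def __init__(self, slope, intercept):
        self.slope = slope
        self.intercept = intercept

    def predict(self, day):
        return self.slope * day + self.intercept


def train(rows):
    """Data Training & Modelling: ordinary least-squares fit of sales against day."""
    xs = [r["day"] for r in rows]
    ys = [r["sales"] for r in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return LinearTrend(slope, y_bar - slope * x_bar)


model = train(add_features(clean(ingest(RAW_SOURCES))))
print(model.predict(5))
```

Keeping each stage as a separate function mirrors the boxes in the diagram and lets any stage be swapped out without touching the others.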

Inference Pipeline
  • Input Data for Forecasting
  • Input Data
  • Cleaned & Processed Data
  • Inference
  • Inference pickle
  • Inference Joblib
  • Streamlit
  • Inference API
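
The inference pipeline above relies on a serialized model being loaded back before prediction. A minimal sketch of that round trip with the standard-library `pickle` module follows; the model here is a plain dictionary of hypothetical parameters, and the helper names (`save_model`, `load_model`, `preprocess`, `infer`) are assumptions made for illustration.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical trained parameters standing in for the output of the training pipeline
trained_model = {"slope": 2.0, "intercept": 8.0}


def save_model(model, path):
    """Serialize the model to disk with pickle."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load a pickled model; only do this with files from a trusted source."""
    with open(path, "rb") as f:
        return pickle.load(f)


def preprocess(raw_inputs):
    """Cleaned & Processed Data: drop missing inputs and cast the rest to float."""
    return [float(v) for v in raw_inputs if v is not None]


def infer(model, days):
    """Inference: apply the loaded parameters to each cleaned input."""
    return [model["slope"] * d + model["intercept"] for d in days]


with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"
    save_model(trained_model, model_path)
    restored = load_model(model_path)

# Input Data for Forecasting: hypothetical raw values, one of them missing
forecasts = infer(restored, preprocess(["5", None, "6"]))
print(forecasts)
```

`joblib.dump` and `joblib.load` follow the same save-then-load shape and are often preferred for models that hold large NumPy arrays. A Streamlit page or an inference API would call something like `load_model` once at startup and `infer` per request.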