Workflow Element Store

  1. Web Scraping
  2. Databases - NoSQL
  3. Mobile Applications or IoT Applications
  4. Experiments (DoE)
  5. Data Collaboration and Partnerships
  6. Databases - SQL
  7. Flat files
  8. APIs and Data Feeds
  9. Feedback Data
  10. Surveys and Questionnaires
  11. Public Datasets
  1. AWS Redshift
  2. Oracle DB
  3. AWS Glue
  4. Azure Blob Storage
  5. MySQL
  6. MongoDB
  7. Azure ADF
  8. AWS RDS
  9. GCP Dataflow
  10. MS SQL Server
  11. Azure Stream Analytics
  12. PostgreSQL
  13. GCS
  14. AWS S3
  15. GCP BigQuery
  16. Apache Kafka
  17. ETL/ELT pipeline
  18. AWS Kinesis
  19. RDBMS
  20. GCP Data Fusion
  21. Azure Synapse
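As a rough illustration of how a flat file might be landed in a relational store, the sketch below loads CSV text into an in-memory SQLite table. SQLite only stands in for the RDBMS options listed above; the table name `landing_readings`, the column names, and the sample rows are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file content; a real pipeline would read this from disk or object storage.
RAW_CSV = "id,sensor,reading\n1,temp,21.5\n2,temp,22.1\n3,humidity,40.0\n"


def ingest_csv(conn, csv_text, table="landing_readings"):
    """Parse CSV text and insert every row into a landing table. Returns the row count."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(id INTEGER PRIMARY KEY, sensor TEXT, reading REAL)"
    )
    # Named placeholders are filled from each row dictionary.
    conn.executemany(
        f"INSERT INTO {table} (id, sensor, reading) VALUES (:id, :sensor, :reading)",
        rows,
    )
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")
n_loaded = ingest_csv(conn, RAW_CSV)

# Downstream stages can now query the landed data with plain SQL.
avg_temp = conn.execute(
    "SELECT AVG(reading) FROM landing_readings WHERE sensor = 'temp'"
).fetchone()[0]
```

The same load-then-query shape would apply to a managed database, with the connection object swapped for that system's driver.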
  1. Textual Feature Extraction
  2. Polynomial Features
  3. Domain-Specific Feature Engineering
  4. Augmentation
  5. Handling Imbalanced Classes
  6. Dealing with Outliers
  7. Feature Selection
  8. Interaction Features
  9. Data Partitioning - Train, Validation, & Test
  10. Binning / Discretization
  11. Handling Missing Data
  12. Handling Time-Series Data
  13. Handling Categorical Data
  14. Data Scaling and Normalization
  15. Auto-Preprocessing libraries
  16. Feature Extraction from Images
  17. Data Transformations
  18. Time-Based Features
  19. Handling Noisy Data
  20. Annotation
  21. Dimensionality Reduction
  22. AutoEDA libraries
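Two of the preprocessing elements above, data partitioning and data scaling, can be sketched together in plain Python. This is a minimal sketch under assumed defaults (a 60/20/20 split and min-max scaling); the function names are hypothetical and not taken from any library.

```python
import random


def split_train_val_test(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows reproducibly and split them into train, validation, and test lists."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def fit_min_max(values):
    """Learn min-max scaling from the given values and return a scaling function."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant features
    return lambda x: (x - lo) / span


data = [float(i) for i in range(10)]
train, val, test = split_train_val_test(data)

# The scaler is fitted on the training split only, then reused for the other
# splits, so no information from validation or test data leaks into training.
scale = fit_min_max(train)
train_scaled = [scale(x) for x in train]
val_scaled = [scale(x) for x in val]
test_scaled = [scale(x) for x in test]
```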
  1. External Validation
  2. Blackbox - Neural Network Models
  3. Regression Analysis
  4. Reinforcement Learning
  5. Weight Initialization
  6. Clustering
  7. Network Analytics / Geospatial Analytics
  8. Learning Rate Scheduling
  9. Batch Normalization
  10. Recommendation Engine
  11. Transfer Learning
  12. Data Augmentation
  13. Regularization Techniques
  14. Association Rules
  15. Evaluation Metrics
  16. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  17. Regular Monitoring and Logging
  18. Multiclass Classification Techniques
  19. Model Comparison
  20. Early Stopping
  21. Binary Classification Techniques
  22. Natural Language Processing
  23. Model Interpretability
  24. Ensemble Techniques
  25. Hyperparameter Tuning
  26. Forecasting Techniques
  27. AutoML
  28. Word Embeddings
  29. Performance Visualization
  30. Cross-Validation
  31. Batch Size Selection
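Cross-validation and grid-style hyperparameter tuning from the list above can be combined as in the sketch below. It is written in plain Python rather than with scikit-learn, and it assumes a one-feature ridge regression without intercept purely for illustration; the helper names and the toy data are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, val_idx
        start += size


def fit_ridge_1d(xs, ys, alpha):
    """Closed-form slope for y ~ w * x with an L2 penalty alpha (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def cv_mse(xs, ys, alpha, k=5):
    """Mean validation MSE across k folds for a given alpha."""
    fold_errors = []
    for train_idx, val_idx in k_fold_indices(len(xs), k):
        w = fit_ridge_1d([xs[i] for i in train_idx], [ys[i] for i in train_idx], alpha)
        mse = sum((ys[i] - w * xs[i]) ** 2 for i in val_idx) / len(val_idx)
        fold_errors.append(mse)
    return sum(fold_errors) / k


# Toy, noise-free data: y = 2x.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x for x in xs]

# Grid search: score every candidate alpha by cross-validated error, keep the best.
grid = [0.0, 1.0, 10.0]
scores = {alpha: cv_mse(xs, ys, alpha) for alpha in grid}
best_alpha = min(scores, key=scores.get)
```

Because the toy data has no noise, the unpenalised candidate fits it exactly; on real data the preferred alpha would depend on the noise level.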
  1. Data warehouse
  2. Data Preprocessing pipeline models
  3. Model registry
  4. Databases
  5. Code repository
  1. Cloud Deployment
  2. Bias and Fairness Assessment
  3. Model Versioning
  4. Data Drift Monitoring
  5. Containerization
  6. Flask
  7. Edge Deployment
  8. Serverless Computing
  9. Streamlit
  10. Prediction Logging
  11. Model Drift
  12. Feedback Collection
  13. Concept Drift Detection
  14. Performance Metrics
  15. FastAPI
  16. Alerting and Notification
  17. Model Serialization
  18. Model Health Monitoring
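One common way to approach the data drift monitoring element above is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time with its distribution in production. The sketch below is one possible implementation, not a prescribed one; the bin count, the `eps` floor for empty bins, and the sample data are assumptions.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between a baseline sample and a new sample.

    Bins are equal-width over the baseline range; values outside that range
    are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = ((hi - lo) / n_bins) or 1.0  # guard against a constant baseline

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        # Floor each proportion at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


baseline = [i / 100 for i in range(100)]       # feature values seen at training time
shifted = [0.5 + i / 200 for i in range(100)]  # production values concentrated in the upper half

psi_same = psi(baseline, baseline)
psi_shifted = psi(baseline, shifted)
```

A monitoring job could compute this per feature on a schedule and raise an alert when the value exceeds a chosen threshold; the threshold itself is a project-specific decision.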
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline
Data Collection

Inference API

API Stream

Web crawler

Selenium

Data Ingestion

Data Landing Zone

Store Data from all the Sources

Data Cleaning / Preprocessing

Derived & Base features

Data Training & Modelling

Inference Pipeline
Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference

Inference pickle
Inference Joblib
Streamlit
Inference API
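The hand-off from the training pipeline to the inference pipeline through a serialized artifact can be sketched as follows. This sketch uses pickle from the standard library; joblib offers an analogous `joblib.dump` / `joblib.load` pair. The model here is a hypothetical stand-in (a least-squares line stored as a dictionary of parameters), and the file name `model.pkl` is an assumption.

```python
import pickle
import tempfile
from pathlib import Path


def train(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares; return the parameters."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def save_model(model, path):
    """Training pipeline side: write the fitted parameters to disk."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Inference pipeline side: read the artifact back.

    Only unpickle files from a trusted source, since pickle can execute code on load.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


def predict(model, x):
    """Apply the stored parameters to a new input."""
    return model["slope"] * x + model["intercept"]


with tempfile.TemporaryDirectory() as tmp_dir:
    artifact = Path(tmp_dir) / "model.pkl"
    save_model(train([1.0, 2.0, 3.0, 4.0], [3.0, 5.0, 7.0, 9.0]), artifact)
    restored = load_model(artifact)
    forecast = predict(restored, 5.0)
```

An inference API or a Streamlit front end would typically call `load_model` once at startup and `predict` per request.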