Workflow Element Store

  1. Feedback Data
  2. Mobile Applications or IoT Applications
  3. Data Collaboration and Partnerships
  4. Public Datasets
  5. Surveys and Questionnaires
  6. Databases - SQL
  7. Databases - NoSQL
  8. Web Scraping
  9. Flat files
  10. Experiments (DoE)
  11. APIs and Data Feeds
  1. GCP Data Fusion
  2. AWS Redshift
  3. Azure Blob Storage
  4. AWS RDS
  5. Oracle DB
  6. GCP Dataflow
  7. MongoDB
  8. PostgreSQL
  9. ETL/ELT pipeline
  10. Azure Data Factory (ADF)
  11. GCS
  12. GCP BigQuery
  13. Apache Kafka
  14. AWS S3
  15. Azure Stream Analytics
  16. MySQL
  17. Azure Synapse
  18. AWS Glue
  19. AWS Kinesis
  20. RDBMS
  21. MS SQL Server
  1. Domain-Specific Feature Engineering
  2. Interaction Features
  3. Feature Selection
  4. Auto-Preprocessing libraries
  5. Feature Extraction from Images
  6. Handling Categorical Data
  7. Handling Missing Data
  8. Time-Based Features
  9. Handling Imbalanced Classes
  10. Annotation
  11. Dealing with Outliers
  12. Data Transformations
  13. Polynomial Features
  14. Textual Feature Extraction
  15. Data Scaling and Normalization
  16. Data Partitioning - Train, Validation, & Test
  17. Dimensionality Reduction
  18. AutoEDA libraries
  19. Augmentation
  20. Handling Time-Series Data
  21. Binning / Discretization
  22. Handling Noisy Data
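
Three of the preprocessing elements listed above (Handling Missing Data, Data Scaling and Normalization, and Data Partitioning) can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not a prescribed implementation: the function names, the mean-imputation strategy, the min-max scaling choice, and the 60/20/20 split are all assumptions made for the example.

```python
import random
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle a copy of rows and cut it into train/validation/test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical single feature column with two missing readings.
raw = [4.0, None, 10.0, 7.0, None, 1.0, 8.0, 5.0, 2.0, 3.0]
clean = impute_mean(raw)
scaled = min_max_scale(clean)
train, val, test = train_val_test_split(scaled)
print(len(train), len(val), len(test))
```

For brevity the sketch imputes and scales before splitting. In a real workflow the fill value and the min/max would usually be computed on the training part only and then applied to the validation and test parts, so that no information leaks from the held-out data.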
  1. Data Augmentation
  2. Natural Language Processing
  3. Binary Classification Techniques
  4. Regression Analysis
  5. Network Analytics / Geospatial Analytics
  6. Batch Size Selection
  7. Recommendation Engine
  8. Regular Monitoring and Logging
  9. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  10. AutoML
  11. Early Stopping
  12. Word Embeddings
  13. External Validation
  14. Cross-Validation
  15. Learning Rate Scheduling
  16. Model Comparison
  17. Forecasting Techniques
  18. Weight Initialization
  19. Transfer Learning
  20. Hyperparameter Tuning
  21. Regularization
  22. Evaluation Metrics
  23. Clustering
  24. Reinforcement Learning
  25. Blackbox - Neural Network Models
  26. Batch Normalization
  27. Model Interpretability
  28. Performance Visualization
  29. Association Rules
  30. Regularization Techniques
  31. Ensemble Techniques
  32. Multiclass Classification Techniques
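
Cross-Validation and Evaluation Metrics from the list above can be illustrated together. The sketch below is one possible hand-rolled version using only the standard library: it assumes contiguous, unshuffled folds, and it uses a mean-predictor baseline and mean absolute error purely as placeholders for a real model and metric. The names `k_fold_indices` and `cross_val_mae` are hypothetical.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, val_idx
        start += size


def cross_val_mae(y, k=5):
    """Mean absolute error of a mean-predictor baseline, averaged over folds."""
    fold_errors = []
    for train_idx, val_idx in k_fold_indices(len(y), k):
        # "Fit" the placeholder model on the training folds only.
        prediction = mean([y[i] for i in train_idx])
        errors = [abs(y[i] - prediction) for i in val_idx]
        fold_errors.append(mean(errors))
    return mean(fold_errors)


# Hypothetical target values.
y = [3.0, 5.0, 4.0, 6.0, 5.0, 7.0, 6.0, 8.0, 7.0, 9.0]
folds = list(k_fold_indices(len(y), k=5))
score = cross_val_mae(y, k=5)
print(f"{len(folds)} folds, mean MAE = {score:.3f}")
```

Every sample lands in exactly one validation fold, which is the property that makes the averaged score an estimate of out-of-sample error. For time-ordered data (see Forecasting Techniques) a split that respects time order would normally be preferred over this scheme.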
  1. Databases
  2. Code repository
  3. Model registry
  4. Data warehouse
  5. Data Preprocessing pipeline models
  1. Model Drift
  2. Performance Metrics
  3. Data Drift Monitoring
  4. Model Versioning
  5. Concept Drift Detection
  6. Flask
  7. Edge Deployment
  8. FastAPI
  9. Serverless Computing
  10. Cloud Deployment
  11. Prediction Logging
  12. Feedback Collection
  13. Model Serialization
  14. Model Health Monitoring
  15. Bias and Fairness Assessment
  16. Containerization
  17. Alerting and Notification
  18. Streamlit
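
Data Drift Monitoring and Alerting from the list above could be approached in many ways. The sketch below shows one common option, the Population Stability Index (PSI), computed with the standard library only. The equal-width binning, the smoothing constant `eps`, and the 0.2 alert threshold (a frequently quoted rule of thumb) are assumptions for illustration, as are the function names.

```python
import math


def bin_fractions(values, edges, eps=1e-6):
    """Fraction of values per bin; values outside the edges go to the end bins."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        idx = 0
        while idx < len(counts) - 1 and v >= edges[idx + 1]:
            idx += 1
        counts[idx] += 1
    total = len(values)
    return [max(c / total, eps) for c in counts]  # eps avoids log(0)


def psi(reference, current, n_bins=5):
    """Population Stability Index of current against reference."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins
    edges = [lo + i * width for i in range(n_bins + 1)]
    ref_frac = bin_fractions(reference, edges)
    cur_frac = bin_fractions(current, edges)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_frac, cur_frac))


def drift_alert(score, threshold=0.2):
    """Return True when the PSI exceeds the (assumed) alert threshold."""
    return score > threshold


# Hypothetical feature values: training-time reference vs. shifted live data.
reference = [i / 10 for i in range(100)]
shifted = [v + 4.0 for v in reference]

stable_score = psi(reference, reference)
shifted_score = psi(reference, shifted)
print(drift_alert(stable_score), drift_alert(shifted_score))
```

Each PSI term is non-negative, so the score grows as the live distribution moves away from the reference. In a deployed system the boolean returned by `drift_alert` would feed whatever notification channel is in use.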
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline
  1. Data Collection: Inference API, API Stream, Web crawler (Python, Selenium)
  2. Data Ingestion
  3. Data Landing Zone: Store Data from all the Sources
  4. Data Cleaning / Preprocessing: Derived & Base features
  5. Data Training & Modelling

Inference Pipeline
  1. Input Data for Forecasting: Input Data
  2. Cleaned & Processed Data
  3. Inference: pickle, Joblib, Streamlit, Inference API
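
The hand-off between the two pipelines, where training writes a serialized artifact and inference loads it, can be sketched as follows. This is a toy illustration under stated assumptions: the "model" is a plain dict holding an average step between consecutive observations, the file name `model.pkl` is hypothetical, and `pickle` stands in for whichever serializer the architecture actually uses.

```python
import pickle
import tempfile
from pathlib import Path
from statistics import mean


def preprocess(values):
    """Shared cleaning step: drop missing readings."""
    return [v for v in values if v is not None]


def training_pipeline(raw_values, artifact_path):
    """Clean the data, fit the toy model, and serialize it to disk."""
    history = preprocess(raw_values)
    diffs = [b - a for a, b in zip(history, history[1:])]
    model = {"step": mean(diffs)}  # placeholder for a real trained model
    with open(artifact_path, "wb") as fh:
        pickle.dump(model, fh)


def inference_pipeline(input_values, artifact_path, horizon):
    """Clean the input, load the artifact, and forecast `horizon` steps."""
    recent = preprocess(input_values)
    with open(artifact_path, "rb") as fh:
        model = pickle.load(fh)
    last = recent[-1]
    return [last + model["step"] * h for h in range(1, horizon + 1)]


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "model.pkl"
    training_pipeline([10.0, None, 12.0, 14.0], artifact)
    forecast = inference_pipeline([20.0, None, 21.0], artifact, horizon=3)

print(forecast)
```

Both pipelines call the same `preprocess` function, which mirrors the diagram's point that inference input passes through the same cleaning as training data. The Joblib variant would swap the two `pickle` calls for `joblib.dump` and `joblib.load`. Pickle files should only be loaded from trusted sources, since unpickling can execute arbitrary code.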