Workflow Element Store

  1. Mobile Applications or IoT Applications
  2. Flat files
  3. Surveys and Questionnaires
  4. Feedback Data
  5. Experiments (DoE)
  6. Public Datasets
  7. Databases - SQL
  8. Web Scraping
  9. APIs and Data Feeds
  10. Databases - NoSQL
  11. Data Collaboration and Partnerships
  1. MySQL
  2. GCS
  3. RDBMS
  4. Apache Kafka
  5. Azure Synapse
  6. MongoDB
  7. GCP Data Fusion
  8. PostgreSQL
  9. GCP Dataflow
  10. AWS Glue
  11. Oracle DB
  12. Azure Stream Analytics
  13. AWS RDS
  14. Azure Blob Storage
  15. AWS Redshift
  16. ETL/ELT pipeline
  17. AWS S3
  18. AWS Kinesis
  19. GCP BigQuery
  20. MS SQL Server
  21. Azure Data Factory (ADF)
  1. Feature Extraction from Images
  2. Dealing with Outliers
  3. Handling Noisy Data
  4. Auto-Preprocessing libraries
  5. Data Scaling and Normalization
  6. AutoEDA libraries
  7. Textual Feature Extraction
  8. Interaction Features
  9. Feature Selection
  10. Polynomial Features
  11. Data Partitioning - Train, Validation, & Test
  12. Binning / Discretization
  13. Annotation
  14. Handling Categorical Data
  15. Handling Missing Data
  16. Dimensionality Reduction
  17. Data Transformations
  18. Augmentation
  19. Domain-Specific Feature Engineering
  20. Handling Imbalanced Classes
  21. Time-Based Features
  22. Handling Time-Series Data
  1. Model Interpretability
  2. Hyperparameter Tuning
  3. Regular Monitoring and Logging
  4. Regularization Techniques
  5. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  6. Clustering
  7. Black-box - Neural Network Models
  8. Regularization
  9. Early Stopping
  10. Network Analytics / Geospatial Analytics
  11. Association Rules
  12. Weight Initialization
  13. Cross-Validation
  14. Regression Analysis
  15. Learning Rate Scheduling
  16. Word Embeddings
  17. Evaluation Metrics
  18. Batch Size Selection
  19. External Validation
  20. Ensemble Techniques
  21. Transfer Learning
  22. Reinforcement Learning
  23. Performance Visualization
  24. Recommendation Engine
  25. Natural Language Processing
  26. Multiclass Classification Techniques
  27. Batch Normalization
  28. Forecasting Techniques
  29. AutoML
  30. Data Augmentation
  31. Binary Classification Techniques
  32. Model Comparison
  1. Data Preprocessing Pipeline Models
  2. Data Warehouse
  3. Databases
  4. Model Registry
  5. Code Repository
  1. Performance Metrics
  2. Prediction Logging
  3. Feedback Collection
  4. FastAPI
  5. Bias and Fairness Assessment
  6. Data Drift Monitoring
  7. Serverless Computing
  8. Model Versioning
  9. Alerting and Notification
  10. Model Serialization
  11. Concept Drift Detection
  12. Streamlit
  13. Containerization
  14. Edge Deployment
  15. Model Drift
  16. Model Health Monitoring
  17. Cloud Deployment
  18. Flask
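
Data Drift Monitoring from the list above can be illustrated by comparing a feature's distribution at training time with its distribution in production. The sketch below is one possible approach, not a method this document prescribes: it assumes the Population Stability Index (PSI) as the drift score, equal-width bins taken from the reference sample, and a hypothetical alert threshold of 0.2. The function name, the threshold, and the sample values are all illustrative.

```python
import math

def population_stability_index(reference, current, bins=10, eps=1e-6):
    """Return the PSI between a reference sample and a current sample."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid a zero width when all values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range fall into the first or last bin.
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or a division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref, cur = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))

DRIFT_THRESHOLD = 0.2  # hypothetical alert level, tune per use case

# Illustrative samples: production values are shifted towards the upper range.
training_feature = [i / 100 for i in range(100)]
production_feature = [0.5 + i / 200 for i in range(100)]

score = population_stability_index(training_feature, production_feature)
drift_detected = score > DRIFT_THRESHOLD
```

A score of zero means the binned distributions are identical; larger scores mean a larger shift. An alerting element would typically fire when `drift_detected` is true.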
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline
  1. Data Collection: Inference API, API Stream, Web crawler (Python, Selenium)
  2. Data Ingestion
  3. Data Landing Zone: Store Data from all the Sources
  4. Data Cleaning / Preprocessing: Derived & Base features
  5. Data Training & Modelling
Inference Pipeline
  1. Input Data for Forecasting
  2. Cleaned & Processed Data
  3. Inference: Inference pickle, Inference Joblib, Streamlit, Inference API
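
The inference pipeline can be sketched in the same way: load a serialized model, apply the same cleaning as in training, then predict. The example below assumes the model is a plain dictionary of parameters saved with Python's standard `pickle` module. The file name `model.pkl`, the function names, and the parameter values are hypothetical. Joblib offers a similar dump/load workflow but is a third-party package and is not used here.

```python
import pickle
import tempfile
from pathlib import Path

def save_model(params, path):
    """Model Serialization: write the trained parameters to disk."""
    with open(path, "wb") as f:
        pickle.dump(params, f)

def load_model(path):
    """Load the parameters back. Only unpickle files from a trusted source."""
    with open(path, "rb") as f:
        return pickle.load(f)

def preprocess(raw_rows):
    """Cleaned & Processed Data: drop rows without x, convert the rest to float."""
    return [float(r["x"]) for r in raw_rows if r.get("x") is not None]

def predict(params, xs):
    """Inference: apply the fitted line to each cleaned input value."""
    return [params["slope"] * x + params["intercept"] for x in xs]

def run_inference(model_path, raw_rows):
    """Input data -> cleaned data -> predictions."""
    params = load_model(model_path)
    return predict(params, preprocess(raw_rows))

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "model.pkl"  # hypothetical file name
    save_model({"slope": 3.0, "intercept": 1.0}, model_path)
    predictions = run_inference(model_path, [{"x": 2}, {"x": None}, {"x": 5}])
```

A Streamlit page or an inference API endpoint (FastAPI or Flask from the element store) would call `run_inference` with the request payload and return `predictions`.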