Workflow Element Store

  1. APIs and Data Feeds
  2. Databases - SQL
  3. Feedback Data
  4. Experiments (DoE)
  5. Mobile Applications or IoT Applications
  6. Flat files
  7. Data Collaboration and Partnerships
  8. Surveys and Questionnaires
  9. Public Datasets
  10. Web Scraping
  11. Databases - NoSQL
  1. MS SQL Server
  2. AWS RDS
  3. AWS Glue
  4. Azure Data Factory (ADF)
  5. Azure Synapse
  6. MySQL
  7. ETL/ELT pipeline
  8. GCS
  9. Oracle DB
  10. MongoDB
  11. Apache Kafka
  12. RDBMS
  13. GCP Dataflow
  14. GCP BigQuery
  15. AWS Kinesis
  16. AWS Redshift
  17. PostgreSQL
  18. Azure Blob Storage
  19. GCP Data Fusion
  20. AWS S3
  21. Azure Stream Analytics
  1. Handling Categorical Data
  2. Dimensionality Reduction
  3. Domain-Specific Feature Engineering
  4. Auto-Preprocessing libraries
  5. Data Partitioning - Train, Validation, & Test
  6. Feature Extraction from Images
  7. Feature Selection
  8. Binning / Discretization
  9. Augmentation
  10. Textual Feature Extraction
  11. AutoEDA libraries
  12. Handling Imbalanced Classes
  13. Time-Based Features
  14. Handling Missing Data
  15. Annotation
  16. Interaction Features
  17. Data Transformations
  18. Data Scaling and Normalization
  19. Handling Time-Series Data
  20. Dealing with Outliers
  21. Handling Noisy Data
  22. Polynomial Features
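
Item 5 in the list above, Data Partitioning - Train, Validation, & Test, can be illustrated with a minimal standard-library sketch. The function name `partition`, the 70/15/15 split, and the fixed seed are assumptions chosen for illustration, not values taken from this workflow.

```python
import random


def partition(rows, train=0.7, validation=0.15, seed=42):
    """Shuffle rows and split them into train, validation, and test lists.

    The fractions and the seed are illustrative defaults (assumptions).
    Whatever is left after train and validation becomes the test set.
    """
    if not 0 < train < 1 or not 0 <= validation < 1 or train + validation >= 1:
        raise ValueError("train + validation must leave room for a test set")
    shuffled = list(rows)
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_val = int(len(shuffled) * validation)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_val],
        shuffled[n_train + n_val:],
    )


train_rows, val_rows, test_rows = partition(range(100))
```

A random shuffle like this is usually unsuitable for time-series data (item 19), where the split would normally respect chronological order instead.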
  1. Clustering
  2. Regularization
  3. AutoML
  4. Cross-Validation
  5. Word Embeddings
  6. Regularization Techniques
  7. Association Rules
  8. Regression Analysis
  9. Network Analytics / Geospatial Analytics
  10. Early Stopping
  11. Weight Initialization
  12. Batch Normalization
  13. External Validation
  14. Natural Language Processing
  15. Recommendation Engine
  16. Transfer Learning
  17. Hyperparameter Tuning
  18. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  19. Data Augmentation
  20. Binary Classification Techniques
  21. Regular Monitoring and Logging
  22. Forecasting Techniques
  23. Performance Visualization
  24. Blackbox - Neural Network Models
  25. Cross-Validation
  26. Evaluation Metrics
  27. Transfer Learning
  28. Reinforcement Learning
  29. Model Comparison
  30. Ensemble Techniques
  31. Multiclass Classification Techniques
  32. Model Interpretability
  33. Batch Size Selection
  34. Learning Rate Scheduling
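
Cross-Validation, listed above, splits the data into k folds and uses each fold once for validation while training on the rest. The sketch below only generates the index sets; `k_fold_indices` is a hypothetical helper name, and it uses contiguous, unshuffled folds as a simplifying assumption.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds.

    Folds differ in size by at most one sample. No shuffling is done,
    which is a simplification made for this sketch.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(10, 3))
```

Each `(train_idx, val_idx)` pair would then be used to fit a model on the training indices and score it on the validation indices, with the scores averaged across folds.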
  1. Model Registry
  2. Data Preprocessing Pipeline Models
  3. Code Repository
  4. Databases
  5. Data Warehouse
  1. Streamlit
  2. Containerization
  3. Alerting and Notification
  4. Concept Drift Detection
  5. FastAPI
  6. Model Versioning
  7. Prediction Logging
  8. Serverless Computing
  9. Model Serialization
  10. Performance Metrics
  11. Data Drift Monitoring
  12. Model Drift
  13. Bias and Fairness Assessment
  14. Edge Deployment
  15. Cloud Deployment
  16. Feedback Collection
  17. Flask
  18. Model Health Monitoring
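
One common way to approach Data Drift Monitoring (item 11 above) is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time with its distribution in production. This is a minimal sketch of that one technique, not necessarily the method this workflow uses; the function name `psi`, the bin count, and the `eps` floor for empty bins are all assumptions.

```python
import math


def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bins are equal-width over the range of `expected`; values of
    `actual` outside that range are clamped into the edge bins.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant data

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = list(range(100))
shifted = [v + 50 for v in reference]
no_drift = psi(reference, reference)
drift = psi(reference, shifted)
```

Identical samples give a PSI of zero, and the value grows as the two distributions diverge. An alerting threshold would need to be chosen per feature and is not specified here.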
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline
  1. Data Collection: Inference API, API Stream, Web crawler (Python, Selenium)
  2. Data Ingestion
  3. Data Landing Zone: Store Data from all the Sources
  4. Data Cleaning / Preprocessing: Derived & Base features
  5. Data Training & Modelling
Inference Pipeline
  1. Input Data for Forecasting: Input Data
  2. Cleaned & Processed Data
  3. Inference: Inference pickle, Inference Joblib, Streamlit, Inference API
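
The hand-off between the two pipelines, where the Training Pipeline serializes a model and the Inference Pipeline loads it, can be sketched with the standard-library `pickle` module. Everything here is an illustrative assumption: the toy least-squares model, the helper names `fit_line` and `predict`, the file name `model.pkl`, and the sample data.

```python
import pickle
import tempfile
from pathlib import Path


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept (toy stand-in model)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def predict(model, x):
    """Apply the fitted parameters to a new input."""
    return model["slope"] * x + model["intercept"]


with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "model.pkl"  # hypothetical artifact name

    # Training pipeline: fit the model and write the pickle artifact.
    artifact.write_bytes(pickle.dumps(fit_line([1, 2, 3, 4], [3, 5, 7, 9])))

    # Inference pipeline: load the artifact and forecast for new input.
    loaded = pickle.loads(artifact.read_bytes())
    forecast = predict(loaded, 5)
```

Pickle files should only be loaded from trusted sources, because unpickling can execute arbitrary code. In a deployed setup, the loading and prediction steps would typically sit behind the Streamlit app or the Inference API.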