Workflow Element Store

  1. Databases - SQL
  2. Flat files
  3. Public Datasets
  4. APIs and Data Feeds
  5. Data Collaboration and Partnerships
  6. Feedback Data
  7. Experiments (DoE)
  8. Web Scraping
  9. Mobile Applications or IoT Applications
  10. Surveys and Questionnaires
  11. Databases - NoSQL

  1. Azure Blob Storage
  2. PostgreSQL
  3. AWS Glue
  4. GCP Data Fusion
  5. Azure Synapse
  6. AWS RDS
  7. Azure Data Factory (ADF)
  8. GCP Dataflow
  9. MS SQL Server
  10. Apache Kafka
  11. ETL/ELT pipeline
  12. Oracle DB
  13. Google Cloud Storage (GCS)
  14. RDBMS
  15. Azure Stream Analytics
  16. AWS Redshift
  17. AWS S3
  18. AWS Kinesis
  19. MongoDB
  20. GCP BigQuery
  21. MySQL

  1. Handling Categorical Data
  2. Data Transformations
  3. Annotation
  4. Dimensionality Reduction
  5. Time-Based Features
  6. Feature Extraction from Images
  7. Handling Missing Data
  8. Feature Selection
  9. Interaction Features
  10. Augmentation
  11. Handling Imbalanced Classes
  12. Handling Time-Series Data
  13. Domain-Specific Feature Engineering
  14. AutoEDA libraries
  15. Polynomial Features
  16. Binning / Discretization
  17. Auto-Preprocessing libraries
  18. Data Partitioning - Train, Validation, & Test
  19. Dealing with Outliers
  20. Handling Noisy Data
  21. Textual Feature Extraction
  22. Data Scaling and Normalization
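
Several of the preprocessing elements listed above (handling missing data, scaling, handling categorical data, and train/validation/test partitioning) can be illustrated with a minimal, dependency-free sketch. The column names, sample values, and helper function names below are hypothetical and chosen only for illustration; a real project would more likely rely on a library such as scikit-learn or pandas.

```python
import random
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Encode labels as 0/1 indicator vectors, one slot per sorted category."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows reproducibly and partition into train/validation/test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# Hypothetical toy columns (assumed data, not from the source document).
ages = [20, None, 40, 60]
cities = ["Pune", "Delhi", "Pune", "Goa"]

ages_filled = impute_mean(ages)            # None -> mean of 20, 40, 60
ages_scaled = min_max_scale(ages_filled)   # values mapped into [0, 1]
city_vectors = one_hot(cities)             # category order: Delhi, Goa, Pune
rows = list(zip(ages_scaled, city_vectors))
train, val, test = train_val_test_split(rows, val_frac=0.25, test_frac=0.25)
```

For brevity the sketch imputes and scales before partitioning. In practice the imputation and scaling parameters are usually fitted on the training split only, so that no information leaks from the validation and test data.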

  1. Model Interpretability
  2. Data Augmentation
  3. Early Stopping
  4. Transfer Learning
  5. Forecasting Techniques
  6. Black Box - Neural Network Models
  7. Learning Rate Scheduling
  8. AutoML
  9. Evaluation Metrics
  10. Performance Visualization
  11. Hyperparameter Tuning
  12. Regularization
  13. Regular Monitoring and Logging
  14. Multiclass Classification Techniques
  15. Model Comparison
  16. Association Rules
  17. External Validation
  18. Weight Initialization
  19. Ensemble Techniques
  20. Network Analytics / Geospatial Analytics
  21. Clustering
  22. Recommendation Engine
  23. Binary Classification Techniques
  24. Regularization Techniques
  25. Cross-Validation
  26. Natural Language Processing
  27. Batch Normalization
  28. Batch Size Selection
  29. Regression Analysis
  30. Word Embeddings
  31. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  32. Reinforcement Learning
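
The cross-validation and hyperparameter search elements above can be sketched without any dependencies. This is not the scikit-learn `GridSearchCV` API; it is a minimal illustration of the same idea, under the assumption of a one-parameter ridge model `y = w * x`. All function names, the toy data, and the `alpha` grid are hypothetical.

```python
from itertools import product


def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples)
                     if i < start or i >= start + size]
        yield train_idx, val_idx
        start += size


def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge fit of y = w * x: w = sum(xy) / (sum(x^2) + alpha)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)


def mse(xs, ys, w):
    """Mean squared error of the predictions w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search_cv(xs, ys, param_grid, k=3):
    """Return (best_params, best_score) by mean validation MSE over k folds."""
    names = list(param_grid)
    best_params, best_score = None, float("inf")
    for combo in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, combo))
        scores = []
        for train_idx, val_idx in k_fold_indices(len(xs), k):
            w = fit_ridge_1d([xs[i] for i in train_idx],
                             [ys[i] for i in train_idx], **params)
            scores.append(mse([xs[i] for i in val_idx],
                              [ys[i] for i in val_idx], w))
        score = sum(scores) / len(scores)
        if score < best_score:  # lower MSE is better
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical noise-free data following y = 2x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
best_params, best_score = grid_search_cv(xs, ys, {"alpha": [0.0, 1.0, 10.0]}, k=3)
```

Because the toy data is noise-free, the unregularized setting wins here. On noisy data a non-zero `alpha` may score better, which is the reason for searching over it with held-out folds.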

  1. Databases
  2. Code Repository
  3. Data Preprocessing Pipeline Models
  4. Data Warehouse
  5. Model Registry

  1. Data Drift Monitoring
  2. Performance Metrics
  3. Cloud Deployment
  4. Model Health Monitoring
  5. Edge Deployment
  6. Model Serialization
  7. Feedback Collection
  8. Prediction Logging
  9. Model Drift
  10. Model Versioning
  11. Flask
  12. Streamlit
  13. Alerting and Notification
  14. Serverless Computing
  15. Containerization
  16. Concept Drift Detection
  17. FastAPI
  18. Bias and Fairness Assessment
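
Data drift monitoring and alerting, both listed above, are often implemented by comparing the distribution of a live feature against a reference sample. One possible metric is the Population Stability Index (PSI); the sketch below is a minimal version under assumed choices: equal-width bins taken from the reference range, a small `eps` for empty bins, and a 0.25 alert threshold, which is a commonly cited rule of thumb rather than anything stated in the source. The function names are hypothetical.

```python
import math


def psi(expected, actual, n_bins=5, eps=1e-6):
    """Population Stability Index between a reference and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant sample

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the reference range into the edge bins.
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        # eps avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), eps) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


def drift_alert(psi_value, threshold=0.25):
    """Return True when the PSI exceeds the (assumed) alert threshold."""
    return psi_value > threshold


# Hypothetical samples: the live data is the reference shifted by +50.
reference = [float(v) for v in range(100)]
live = [v + 50.0 for v in reference]

psi_same = psi(reference, reference)
psi_shifted = psi(reference, live)
```

A monitoring job could compute this per feature on a schedule and pass the result to `drift_alert` to decide whether to send a notification.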
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline
  • Data Collection: Inference API, API Stream, Web crawler (Python, Selenium)
  • Data Ingestion
  • Data Landing Zone: Store Data from all the Sources
  • Data Cleaning / Preprocessing: Derived & Base features
  • Data Training & Modelling
Inference Pipeline
  • Input Data for Forecasting: Input Data
  • Cleaned & Processed Data
  • Inference: pickle, Joblib
  • Streamlit
  • Inference API
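
The inference path (input data, cleaning, then a serialized model loaded for prediction) might look like the following minimal sketch. It uses the standard-library `pickle` module and a plain dictionary of hypothetical parameters in place of a real fitted model; the field name `x` and the helper names are assumptions.

```python
import pickle

# Hypothetical trained parameters; a real pipeline would serialize a fitted
# model object and typically write it to a file instead of keeping bytes.
trained_model = {"slope": 2.0, "intercept": 1.0}
serialized = pickle.dumps(trained_model)


def clean_input(raw_rows):
    """Keep rows with a usable numeric 'x' value and cast it to float."""
    cleaned = []
    for row in raw_rows:
        try:
            cleaned.append(float(row["x"]))
        except (KeyError, TypeError, ValueError):
            continue  # skip rows that are missing or malformed
    return cleaned


def predict(model_bytes, raw_rows):
    """Deserialize the model and score every cleaned input row."""
    model = pickle.loads(model_bytes)
    return [model["slope"] * x + model["intercept"]
            for x in clean_input(raw_rows)]


# Hypothetical request payload with two valid and two invalid rows.
predictions = predict(serialized, [{"x": "4"}, {"x": None}, {"y": 1}, {"x": 5}])
```

Pickle data should only be loaded from trusted sources, since unpickling can execute arbitrary code. Joblib offers a similar `dump`/`load` interface, and a function like `predict` is the kind of call a Streamlit page or an inference API endpoint would wrap.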