Workflow Element Store

  1. Mobile Applications or IoT Applications
  2. Data Collaboration and Partnerships
  3. Flat files
  4. Public Datasets
  5. Web Scraping
  6. Feedback Data
  7. Surveys and Questionnaires
  8. APIs and Data Feeds
  9. Databases - NoSQL
  10. Experiments (DoE)
  11. Databases - SQL
  1. Azure blob storage
  2. AWS RDS
  3. Oracle DB
  4. GCS
  5. Azure Synapse
  6. MySQL
  7. Azure Data Factory (ADF)
  8. Azure Stream Analytics
  9. RDBMS
  10. AWS S3
  11. AWS Glue
  12. Apache Kafka
  13. MS SQL Server
  14. GCP Data Fusion
  15. AWS Redshift
  16. GCP BigQuery
  17. MongoDB
  18. ETL/ELT pipeline
  19. GCP Dataflow
  20. AWS Kinesis
  21. PostgreSQL
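
The "Flat files", "Databases - SQL" and "ETL/ELT pipeline" elements above can be illustrated with a minimal extract-transform-load sketch. It uses only the Python standard library, with an in-memory SQLite database standing in for any of the SQL stores listed. The CSV content, the `readings` table, and the function names are hypothetical and chosen only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file content; the second row has a missing reading.
RAW_CSV = "sensor_id,reading\na1,20.5\na2,\na3,22.0\n"


def extract(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows with an empty reading and cast the rest to float."""
    return [(r["sensor_id"], float(r["reading"])) for r in rows if r["reading"]]


def load(conn, records):
    """Write the cleaned records into a SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings (sensor_id TEXT, reading REAL)"
    )
    conn.executemany("INSERT INTO readings VALUES (?, ?)", records)
    conn.commit()


# In-memory SQLite stands in for a real SQL store (assumption).
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
row_count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

In an ELT variant the raw rows would be loaded first and cleaned inside the warehouse; the three functions would simply be called in a different order.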
  1. Handling Categorical Data
  2. Data Transformations
  3. Feature Extraction from Images
  4. Handling Time-Series Data
  5. AutoEDA libraries
  6. Polynomial Features
  7. Annotation
  8. Dimensionality Reduction
  9. Feature Selection
  10. Augmentation
  11. Dealing with Outliers
  12. Textual Feature Extraction
  13. Domain-Specific Feature Engineering
  14. Handling Missing Data
  15. Binning / Discretization
  16. Handling Noisy Data
  17. Handling Imbalanced Classes
  18. Interaction Features
  19. Auto-Preprocessing libraries
  20. Time-Based Features
  21. Data Partitioning - Train, Validation, & Test
  22. Data Scaling and Normalization
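
Two of the preprocessing elements above, "Data Partitioning - Train, Validation, & Test" and "Data Scaling and Normalization", interact: scaling parameters should be learned from the training split only and then reused on the other splits. The following is a minimal standard-library sketch of that idea. The 70/15/15 split, the seed, and all function names are assumptions made for illustration.

```python
import random


def train_val_test_split(rows, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle rows and split them into train, validation, and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def fit_min_max(values):
    """Learn the scaling bounds from the training values only."""
    return min(values), max(values)


def apply_min_max(values, lo, hi):
    """Scale values with previously fitted bounds.

    Values outside the training range fall outside [0, 1].
    """
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


data = [float(x) for x in range(100)]  # toy feature column
train, val, test = train_val_test_split(data)
lo, hi = fit_min_max(train)            # fitted on train only, to avoid leakage
train_scaled = apply_min_max(train, lo, hi)
val_scaled = apply_min_max(val, lo, hi)
test_scaled = apply_min_max(test, lo, hi)
```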
  1. Forecasting Techniques
  2. Regular Monitoring and Logging
  3. Regression Analysis
  4. Network Analytics / Geospatial Analytics
  5. Batch Normalization
  6. Cross-Validation
  7. Recommendation Engine
  8. Binary Classification Techniques
  9. External Validation
  10. Transfer Learning
  11. Word Embeddings
  12. Hyperparameter Tuning
  13. Evaluation Metrics
  14. Data Augmentation
  15. Ensemble Techniques
  16. Model Comparison
  17. Natural Language Processing
  18. Blackbox - Neural Network Models
  19. Weight Initialization
  20. Association Rules
  21. Learning Rate Scheduling
  22. Multiclass Classification Techniques
  23. AutoML
  24. Reinforcement Learning
  25. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  26. Regularization Techniques
  27. Early Stopping
  28. Clustering
  29. Batch Size Selection
  30. Model Interpretability
  31. Performance Visualization
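
Several modelling elements above (Cross-Validation, Hyperparameter Tuning, Regularization, grid search) combine into one loop: for each candidate hyperparameter value, fit on k-1 folds, score on the held-out fold, and keep the value with the best mean score. The sketch below shows that loop in plain Python on a one-dimensional ridge regression. The toy model, the data, and the function names are assumptions for illustration; it is not the scikit-learn `GridSearchCV` implementation.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n))
        yield train_idx, val_idx
        start += size


def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge fit for y ~ w * x (no intercept)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)


def mse(xs, ys, w):
    """Mean squared error of the fitted slope w."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def grid_search_cv(xs, ys, lam_grid, k=5):
    """Return (best_lam, scores); scores maps lam to mean validation MSE."""
    scores = {}
    for lam in lam_grid:
        fold_errors = []
        for train_idx, val_idx in k_fold_indices(len(xs), k):
            w = fit_ridge_1d(
                [xs[i] for i in train_idx], [ys[i] for i in train_idx], lam
            )
            fold_errors.append(
                mse([xs[i] for i in val_idx], [ys[i] for i in val_idx], w)
            )
        scores[lam] = sum(fold_errors) / k
    best_lam = min(scores, key=scores.get)
    return best_lam, scores


xs = [float(i) for i in range(1, 21)]
ys = [2.0 * x for x in xs]  # noiseless toy data: y = 2x
best_lam, scores = grid_search_cv(xs, ys, lam_grid=[0.0, 1.0, 10.0])
```

On noiseless data the unregularized fit wins; with noisy data a non-zero penalty may score better. Real data should usually be shuffled before contiguous folds are cut.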
  1. Data Preprocessing pipeline models
  2. Databases
  3. Data warehouse
  4. Model registry
  5. Code repository
  1. Feedback Collection
  2. Model Versioning
  3. FastAPI
  4. Alerting and Notification
  5. Serverless Computing
  6. Edge Deployment
  7. Model Health Monitoring
  8. Containerization
  9. Streamlit
  10. Model Serialization
  11. Concept Drift Detection
  12. Prediction Logging
  13. Flask
  14. Performance Metrics
  15. Data Drift Monitoring
  16. Model Drift
  17. Bias and Fairness Assessment
  18. Cloud Deployment
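
For the "Data Drift Monitoring" and "Alerting and Notification" elements above, one common approach is to compare the distribution of a live feature against a reference sample using the Population Stability Index (PSI). The sketch below is a minimal standard-library version. The bin count, the `eps` floor, and the 0.2 alert threshold are assumptions (0.2 is a frequently quoted rule of thumb, not a value taken from this document).

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between a reference and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # degenerate reference: everything lands in one bin

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clip out-of-range values
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference = [i / 1000 for i in range(1000)]  # training-time feature values
shifted = [v + 0.5 for v in reference]       # hypothetical live values

drift_score = psi(reference, shifted)
drift_detected = drift_score > 0.2           # assumed alert threshold
```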
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline
Data Collection
Inference API
API Stream
Web crawler
Python
Selenium
Data Ingestion
Data Landing Zone
Store Data from all the Sources
Data Cleaning / Preprocessing
Derived & Base features
Data Training & Modelling
Inference Pipeline
Input Data for Forecasting
Input Data
Cleaned & Processed Data
Inference
Inference pickle
Inference Joblib
Streamlit
Inference API
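
The hand-off between the training pipeline and the inference pipeline ("Inference pickle") can be sketched as: serialize the fitted model once, then load it in the inference step and call it on new input. The example below uses only the standard library. The "model" is a plain dict holding a series mean, a hypothetical stand-in for a real fitted estimator, and the file name `model.pkl` is also an assumption.

```python
import os
import pickle
import tempfile


def train(series):
    """Hypothetical training step: the 'model' is just the series mean."""
    return {"kind": "mean_forecaster", "mean": sum(series) / len(series)}


def save_model(model, path):
    """Serialize the fitted model to disk (end of the training pipeline)."""
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    """Load the model for inference.

    Only unpickle files from a trusted source: pickle can run code on load.
    """
    with open(path, "rb") as f:
        return pickle.load(f)


def predict(model, horizon):
    """Forecast `horizon` steps ahead with the loaded model."""
    return [model["mean"]] * horizon


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    save_model(train([10.0, 12.0, 14.0]), path)
    restored = load_model(path)
    forecast = predict(restored, horizon=3)
```

The same `predict` call could sit behind a Streamlit page or an inference API endpoint; joblib offers a similar dump/load pair that is often preferred for models holding large arrays.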