Workflow Element Store

  1. Mobile Applications or IoT Applications
  2. Experiments (DoE)
  3. Surveys and Questionnaires
  4. Databases - SQL
  5. Flat files
  6. Data Collaboration and Partnerships
  7. Databases - NoSQL
  8. Public Datasets
  9. APIs and Data Feeds
  10. Web Scraping
  11. Feedback Data
  1. AWS Glue
  2. AWS Kinesis
  3. AWS Redshift
  4. GCP Dataflow
  5. MS SQL Server
  6. PostgreSQL
  7. Oracle DB
  8. MySQL
  9. ETL/ELT pipeline
  10. Azure Data Factory (ADF)
  11. AWS RDS
  12. Azure Stream Analytics
  13. RDBMS
  14. Azure Synapse
  15. GCP BigQuery
  16. GCP Data Fusion
  17. AWS S3
  18. Azure Blob Storage
  19. GCS
  20. MongoDB
  21. Apache Kafka
  1. AutoEDA libraries
  2. Handling Imbalanced Classes
  3. Domain-Specific Feature Engineering
  4. Feature Selection
  5. Dimensionality Reduction
  6. Interaction Features
  7. Dealing with Outliers
  8. Binning / Discretization
  9. Feature Extraction from Images
  10. Auto-Preprocessing libraries
  11. Time-Based Features
  12. Textual Feature Extraction
  13. Polynomial Features
  14. Annotation
  15. Handling Categorical Data
  16. Augmentation
  17. Data Transformations
  18. Handling Time-Series Data
  19. Data Partitioning - Train, Validation, & Test
  20. Handling Noisy Data
  21. Handling Missing Data
  22. Data Scaling and Normalization
  1. Model Interpretability
  2. Word Embeddings
  3. Early Stopping
  4. Recommendation Engine
  5. Evaluation Metrics
  6. Reinforcement Learning
  7. Learning Rate Scheduling
  8. Ensemble Techniques
  9. Black-box - Neural Network Models
  10. Network Analytics / Geospatial Analytics
  11. Multiclass Classification Techniques
  12. Regular Monitoring and Logging
  13. Regularization Techniques
  14. Batch Normalization
  15. AutoML
  16. Clustering
  17. External Validation
  18. Cross-Validation
  19. Forecasting Techniques
  20. Natural Language Processing
  21. Weight Initialization
  22. Model Comparison
  23. Binary Classification Techniques
  24. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  25. Transfer Learning
  26. Association Rules
  27. Batch Size Selection
  28. Performance Visualization
  29. Regression Analysis
  30. Hyperparameter Tuning
  31. Data Augmentation
  1. Data Preprocessing pipeline models
  2. Model registry
  3. Databases
  4. Data warehouse
  5. Code repository
  1. Edge Deployment
  2. Alerting and Notification
  3. Model Serialization
  4. Serverless Computing
  5. Performance Metrics
  6. Data Drift Monitoring
  7. Model Versioning
  8. Flask
  9. Streamlit
  10. Feedback Collection
  11. Cloud Deployment
  12. FastAPI
  13. Concept Drift Detection
  14. Model Health Monitoring
  15. Prediction Logging
  16. Model Drift
  17. Bias and Fairness Assessment
  18. Containerization
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline
  • Data Collection: Inference API, API Stream, Web crawler (tools shown: Python, Selenium)
  • Data Ingestion: Data Landing Zone (store data from all the sources)
  • Data Cleaning / Preprocessing: Derived & Base features
  • Data Training & Modelling
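The training pipeline stages can be illustrated with a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the function names (`collect`, `ingest`, `preprocess`, `train`), the toy records, the in-memory list standing in for the data landing zone, and the one-variable least-squares fit standing in for the modelling step. It is not the architecture's actual implementation.

```python
from statistics import mean


def collect():
    """Hypothetical stand-in for API streams / web crawlers: returns raw records."""
    return [
        {"x": "1", "y": "2.0"},
        {"x": "2", "y": "4.1"},
        {"x": None, "y": "5.0"},  # incomplete record, dropped during cleaning
        {"x": "3", "y": "5.9"},
    ]


def ingest(records, landing_zone):
    """Store data from all sources in one landing zone (here, a plain list)."""
    landing_zone.extend(records)
    return landing_zone


def preprocess(landing_zone):
    """Drop incomplete rows, cast base features, and add one derived feature."""
    rows = []
    for record in landing_zone:
        if record["x"] is None or record["y"] is None:
            continue
        x = float(record["x"])
        # x_squared is a derived feature; the simple fit below uses only x.
        rows.append({"x": x, "x_squared": x * x, "y": float(record["y"])})
    return rows


def train(rows):
    """Fit y = slope * x + intercept by ordinary least squares."""
    xs = [row["x"] for row in rows]
    ys = [row["y"] for row in rows]
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return {"slope": slope, "intercept": y_bar - slope * x_bar}


def run_training_pipeline():
    """Chain the stages: collection -> ingestion -> preprocessing -> training."""
    landing_zone = ingest(collect(), [])
    return train(preprocess(landing_zone))


model = run_training_pipeline()
```

Keeping each stage as a separate function mirrors the diagram's boxes, so a stage can be swapped (for example, a real crawler in place of `collect`) without touching the others.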
Inference Pipeline
  • Input Data for Forecasting: Input Data
  • Cleaned & Processed Data
  • Inference: Inference pickle, Inference Joblib, streamlit, Inference API
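The inference pipeline can be sketched the same way: a serialized model is loaded, the incoming data is cleaned, and predictions are returned. This sketch uses the standard-library `pickle` module only. The model dictionary, its coefficients, and the helper names `clean_input` and `infer` are hypothetical placeholders; a Joblib artifact or a streamlit / API front end would wrap the same steps.

```python
import pickle

# Hypothetical trained artifact; in practice it would come from the training pipeline.
trained_model = {"slope": 2.0, "intercept": 0.5}
artifact = pickle.dumps(trained_model)  # bytes, as if read from a .pkl file


def clean_input(raw_inputs):
    """Assumed cleaning step: drop missing values and cast the rest to float."""
    return [float(value) for value in raw_inputs if value is not None]


def infer(artifact_bytes, raw_inputs):
    """Deserialize the model and predict for each cleaned input value."""
    model = pickle.loads(artifact_bytes)
    return [model["slope"] * x + model["intercept"] for x in clean_input(raw_inputs)]


predictions = infer(artifact, ["1", None, "3"])
```

Only unpickle artifacts from a trusted source, since loading a pickle can execute arbitrary code.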