Workflow Element Store

  1. Experiments (DoE)
  2. Databases - NoSQL
  3. Feedback Data
  4. Web Scraping
  5. Surveys and Questionnaires
  6. Flat files
  7. APIs and Data Feeds
  8. Data Collaboration and Partnerships
  9. Databases - SQL
  10. Mobile Applications or IoT Applications
  11. Public Datasets
  1. Azure blob storage
  2. GCP BigQuery
  3. AWS Glue
  4. RDBMS
  5. GCS
  6. MongoDB
  7. AWS Redshift
  8. ETL/ELT pipeline
  9. GCP Data Fusion
  10. Azure Stream Analytics
  11. AWS Kinesis
  12. GCP Dataflow
  13. Azure Synapse
  14. Oracle DB
  15. AWS S3
  16. PostgreSQL
  17. Apache Kafka
  18. AWS RDS
  19. Azure Data Factory (ADF)
  20. MySQL
  21. MS SQL Server
  1. Handling Missing Data
  2. Polynomial Features
  3. Auto-Preprocessing libraries
  4. Data Partitioning - Train, Validation, & Test
  5. Textual Feature Extraction
  6. Domain-Specific Feature Engineering
  7. Annotation
  8. Data Transformations
  9. Handling Imbalanced Classes
  10. Binning / Discretization
  11. Handling Categorical Data
  12. Data Scaling and Normalization
  13. Time-Based Features
  14. Interaction Features
  15. Dealing with Outliers
  16. Handling Time-Series Data
  17. Handling Noisy Data
  18. AutoEDA libraries
  19. Feature Selection
  20. Augmentation
  21. Feature Extraction from Images
  22. Dimensionality Reduction
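
Three of the preprocessing elements listed above (Handling Missing Data, Data Scaling and Normalization, and Data Partitioning) can be sketched together in plain Python. This is a minimal illustration under stated assumptions, not a prescribed implementation: the function names, the median imputation strategy, the min-max scaling choice, and the 60/20/20 split are all assumptions made for the example.

```python
import random
import statistics


def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle rows reproducibly, then cut them into train/validation/test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def fit_preprocessor(train_values):
    """Learn the imputation median and scaling range from training data only."""
    observed = [v for v in train_values if v is not None]
    return {
        "median": statistics.median(observed),
        "lo": min(observed),
        "hi": max(observed),
    }


def transform(values, params):
    """Fill missing entries with the median, then min-max scale."""
    span = params["hi"] - params["lo"]
    out = []
    for v in values:
        v = params["median"] if v is None else v
        # A constant training column has span 0; map everything to 0.0.
        out.append(0.0 if span == 0 else (v - params["lo"]) / span)
    return out


# Hypothetical single-feature column with two missing values.
raw = [4.0, None, 10.0, 6.0, None, 8.0, 2.0, 12.0, 14.0, 16.0]

train, val, test = train_val_test_split(raw)
params = fit_preprocessor(train)
train_scaled = transform(train, params)
val_scaled = transform(val, params)
test_scaled = transform(test, params)
```

The statistics are fitted on the training partition and then reused for validation and test, so no information from the held-out data leaks into preprocessing. Validation or test values can therefore fall slightly outside the [0, 1] range.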
  1. Regularization
  2. Word Embeddings
  3. Natural Language Processing
  4. Network Analytics / Geospatial Analytics
  5. Association Rules
  6. Evaluation Metrics
  7. External Validation
  8. Regular Monitoring and Logging
  9. Forecasting Techniques
  10. Hyperparameter Tuning
  11. Clustering
  12. Performance Visualization
  13. Learning Rate Scheduling
  14. AutoML
  15. Regression Analysis
  16. Cross-Validation
  17. Transfer Learning
  18. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  19. Batch Size Selection
  20. Model Comparison
  21. Ensemble Techniques
  22. Recommendation Engine
  23. Regularization Techniques
  24. Reinforcement Learning
  25. Binary Classification Techniques
  26. Multiclass Classification Techniques
  27. Blackbox - Neural Network Models
  28. Data Augmentation
  29. Batch Normalization
  30. Weight Initialization
  31. Model Interpretability
  32. Early Stopping
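
Several modelling elements from the list above (Regularization, Cross-Validation, Hyperparameter Tuning, and grid search) fit naturally into one small example. The sketch below is a pure-Python illustration, assuming a one-feature ridge regression without intercept and a simple interleaved k-fold scheme; it does not use scikit-learn's `GridSearchCV`, and every function name and the noise-free toy data are assumptions made for the example.

```python
def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge slope for y = w * x: w = sum(xy) / (sum(x^2) + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(w, xs, ys):
    """Mean squared error of the predictions w * x against y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_indices(n, k):
    """Yield (train_idx, val_idx); fold i validates on samples i, i+k, i+2k, ..."""
    for i in range(k):
        val_idx = list(range(i, n, k))
        held_out = set(val_idx)
        train_idx = [j for j in range(n) if j not in held_out]
        yield train_idx, val_idx


def cross_val_mse(xs, ys, lam, k=3):
    """Average validation MSE over k folds for one regularization strength."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(xs), k):
        w = fit_ridge_1d([xs[j] for j in train_idx],
                         [ys[j] for j in train_idx], lam)
        scores.append(mse(w, [xs[j] for j in val_idx],
                          [ys[j] for j in val_idx]))
    return sum(scores) / len(scores)


def grid_search(xs, ys, lambdas, k=3):
    """Evaluate every candidate lambda and return the one with the lowest CV error."""
    results = {lam: cross_val_mse(xs, ys, lam, k) for lam in lambdas}
    best = min(results, key=results.get)
    return best, results


# Toy data lying exactly on y = 2x, so no regularization is needed.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

best_lam, cv_results = grid_search(xs, ys, lambdas=[0.0, 1.0, 10.0])
```

Because the toy data is noise-free, the search selects the unregularized model. On noisy data a nonzero penalty may score better, which is the situation a grid search is meant to detect.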
  1. Data Warehouse
  2. Databases
  3. Model Registry
  4. Code Repository
  5. Data Preprocessing Pipeline Models
  1. Concept Drift Detection
  2. Serverless Computing
  3. Model Serialization
  4. Cloud Deployment
  5. FastAPI
  6. Flask
  7. Feedback Collection
  8. Model Drift
  9. Streamlit
  10. Bias and Fairness Assessment
  11. Prediction Logging
  12. Model Health Monitoring
  13. Alerting and Notification
  14. Containerization
  15. Edge Deployment
  16. Performance Metrics
  17. Data Drift Monitoring
  18. Model Versioning
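
Two of the deployment and monitoring elements above, Model Serialization and Data Drift Monitoring, can be illustrated with the standard library alone. This is a hedged sketch: the dictionary "model", the mean-shift drift score, and the threshold of 2.0 are assumptions chosen for illustration, not a reference to any particular monitoring tool.

```python
import pickle
import statistics


def serialize_model(model):
    """Turn a fitted model object into bytes that can be stored or shipped."""
    return pickle.dumps(model)


def load_model(blob):
    """Restore a model from bytes. Only unpickle data from a trusted source."""
    return pickle.loads(blob)


def drift_score(reference, live):
    """Shift of the live mean from the reference mean, in reference std units."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)
    return abs(statistics.mean(live) - ref_mean) / ref_std


def check_drift(reference, live, threshold=2.0):
    """Return the drift score and whether it crosses the alert threshold."""
    score = drift_score(reference, live)
    return {"score": score, "alert": score > threshold}


# Hypothetical fitted model, represented as a plain dictionary.
model = {"slope": 2.0, "version": "v1"}
blob = serialize_model(model)
restored = load_model(blob)

# Feature values seen at training time versus two live batches.
reference = [10.0, 11.0, 9.0, 10.5, 9.5, 10.0]
stable_batch = [10.2, 9.8, 10.1, 9.9]
shifted_batch = [15.0, 16.0, 14.5, 15.5]

stable_report = check_drift(reference, stable_batch)
shifted_report = check_drift(reference, shifted_batch)
```

The `alert` flag is the hook where Alerting and Notification would attach. A production setup would more likely compare whole distributions per feature than a single mean.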
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline

Data Collection

API Stream

Web crawler

Selenium

Data Ingestion

Data Landing Zone

Store Data from all the Sources

Data Cleaning / Preprocessing

Derived & Base features

Data Training & Modelling

Inference Pipeline

Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference
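
The two pipelines above share their cleaning and feature steps: the training pipeline lands data from all sources, cleans it, derives features, and fits a model, while the inference pipeline applies the same cleaning and features to new input before predicting. The sketch below shows that shape in plain Python. The record fields `x` and `y`, the derived feature `x_squared`, and the one-parameter least-squares model are hypothetical stand-ins chosen to keep the example short.

```python
def ingest(sources):
    """Data Landing Zone: store records from all sources in one list."""
    landed = []
    for name, records in sources.items():
        for rec in records:
            landed.append({**rec, "source": name})
    return landed


def clean(records, required):
    """Keep only records where every required field is present and not None."""
    return [r for r in records
            if all(r.get(key) is not None for key in required)]


def add_features(records):
    """Keep the base feature x and add one derived feature, x_squared."""
    return [{**r, "x_squared": r["x"] ** 2} for r in records]


def train(records):
    """Fit y = w * x_squared by least squares (stand-in for a real model)."""
    sfy = sum(r["x_squared"] * r["y"] for r in records)
    sff = sum(r["x_squared"] ** 2 for r in records)
    return {"w": sfy / sff}


def training_pipeline(sources):
    landed = ingest(sources)
    cleaned = clean(landed, required=("x", "y"))
    return train(add_features(cleaned))


def inference_pipeline(model, input_records):
    cleaned = clean(input_records, required=("x",))
    return [model["w"] * r["x_squared"] for r in add_features(cleaned)]


# Hypothetical records from two of the collection sources in the diagram.
sources = {
    "api_stream": [{"x": 1.0, "y": 3.0}, {"x": 2.0, "y": 12.0}],
    "web_crawler": [{"x": 3.0, "y": 27.0}, {"x": None, "y": 5.0}],
}

model = training_pipeline(sources)
predictions = inference_pipeline(model, [{"x": 4.0}, {"x": None}])
```

Reusing `clean` and `add_features` in both pipelines keeps inference input in the same form the model saw during training. Records with a missing `x` are dropped here; a real pipeline might impute them instead.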