Workflow Element Store

  1. Feedback Data
  2. APIs and Data Feeds
  3. Web Scraping
  4. Data Collaboration and Partnerships
  5. Surveys and Questionnaires
  6. Public Datasets
  7. Flat files
  8. Databases - NoSQL
  9. Experiments (DoE)
  10. Mobile Applications or IoT Applications
  11. Databases - SQL
  1. Azure Synapse
  2. AWS S3
  3. PostgreSQL
  4. MongoDB
  5. AWS Redshift
  6. Azure Data Factory (ADF)
  7. MS SQL Server
  8. GCP Dataflow
  9. Apache Kafka
  10. AWS Kinesis
  11. GCP BigQuery
  12. Azure Stream Analytics
  13. GCP Data Fusion
  14. AWS RDS
  15. MySQL
  16. Azure Blob Storage
  17. AWS Glue
  18. GCS
  19. Oracle DB
  20. ETL/ELT pipeline
  21. RDBMS
  1. Dealing with Outliers
  2. Handling Categorical Data
  3. Domain-Specific Feature Engineering
  4. Auto-Preprocessing libraries
  5. Handling Imbalanced Classes
  6. Augmentation
  7. Annotation
  8. Textual Feature Extraction
  9. Interaction Features
  10. Dimensionality Reduction
  11. Handling Missing Data
  12. Data Scaling and Normalization
  13. Data Transformations
  14. Handling Noisy Data
  15. Feature Extraction from Images
  16. Handling Time-Series Data
  17. Feature Selection
  18. AutoEDA libraries
  19. Binning / Discretization
  20. Polynomial Features
  21. Time-Based Features
  22. Data Partitioning - Train, Validation, & Test
  1. Cross-Validation
  2. Multiclass Classification Techniques
  3. Association Rules
  4. Natural Language Processing
  5. Binary Classification Techniques
  6. Regression Analysis
  7. AutoML
  8. Batch Size Selection
  9. Data Augmentation
  10. Batch Normalization
  11. Evaluation Metrics
  12. Network Analytics / Geospatial Analytics
  13. Transfer Learning
  14. Recommendation Engine
  15. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  16. External Validation
  17. Word Embeddings
  18. Regular Monitoring and Logging
  19. Performance Visualization
  20. Cross-Validation
  21. Early Stopping
  22. Blackbox - Neural Network Models
  23. Transfer Learning
  24. Regularization Techniques
  25. Model Comparison
  26. Clustering
  27. Reinforcement Learning
  28. Forecasting Techniques
  29. Weight Initialization
  30. Hyperparameter Tuning
  31. Learning Rate Scheduling
  32. Ensemble Techniques
  33. Model Interpretability
  34. Regularization
  1. Model Registry
  2. Code Repository
  3. Databases
  4. Data Warehouse
  5. Data Preprocessing pipeline models
  1. Model Versioning
  2. Model Drift
  3. Data Drift Monitoring
  4. Edge Deployment
  5. Streamlit
  6. Cloud Deployment
  7. Prediction Logging
  8. Bias and Fairness Assessment
  9. Model Serialization
  10. Feedback Collection
  11. Containerization
  12. Flask
  13. FastAPI
  14. Concept Drift Detection
  15. Alerting and Notification
  16. Performance Metrics
  17. Serverless Computing
  18. Model Health Monitoring
ML Workflow - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline

Data Collection

API Stream

Web crawler

Selenium

Data Ingestion

Data Landing Zone

Store Data from all the Sources

Data Cleaning / Preprocessing

Derived & Base features

Data Training & Modelling

Inference Pipeline

Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference
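
The two pipelines above can be illustrated with a minimal sketch. It is not taken from the original diagram: the function names (`clean`, `training_pipeline`, `inference_pipeline`), the `Model` class, the sample records, and the choice of a one-feature least-squares model are all assumptions made for illustration, and the "Derived & Base features" stage is omitted for brevity. The point it tries to show is that the inference pipeline reuses the same cleaning step as the training pipeline before calling the trained model.

```python
from dataclasses import dataclass
from statistics import mean


def clean(rows):
    """Data Cleaning / Preprocessing: drop records with a missing value."""
    return [r for r in rows if None not in r]


@dataclass
class Model:
    """Hypothetical one-feature linear model: y = slope * x + intercept."""
    slope: float
    intercept: float

    def predict(self, xs):
        return [self.slope * x + self.intercept for x in xs]


def training_pipeline(sources):
    # Data Ingestion: store records from all sources in one landing zone.
    landing_zone = [row for source in sources for row in source]
    # Data Cleaning / Preprocessing.
    cleaned = clean(landing_zone)
    xs = [x for x, _ in cleaned]
    ys = [y for _, y in cleaned]
    # Training & Modelling: ordinary least squares on a single feature.
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in cleaned)
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return Model(slope, y_bar - slope * x_bar)


def inference_pipeline(model, input_rows):
    # Input Data -> Cleaned & Processed Data (same step as training) -> Inference.
    cleaned = clean(input_rows)
    return model.predict([x for (x,) in cleaned])


# Made-up (x, y) records standing in for two collection sources.
api_stream = [(1.0, 3.0), (2.0, 5.0), (None, 7.0)]
web_crawler = [(3.0, 7.0), (4.0, 9.0)]

model = training_pipeline([api_stream, web_crawler])
predictions = inference_pipeline(model, [(5.0,), (None,)])
print(predictions)
```

In a real workflow each stage would typically be replaced by elements from the store above (for example, an ETL/ELT pipeline for ingestion and a serialized model loaded from a model registry for inference); the sketch only mirrors the order of the stages.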