Workflow Element Store

  1. Public Datasets
  2. Feedback Data
  3. Experiments (DoE)
  4. Web Scraping
  5. APIs and Data Feeds
  6. Flat files
  7. Surveys and Questionnaires
  8. Databases - SQL
  9. Data Collaboration and Partnerships
  10. Databases - NoSQL
  11. Mobile Applications or IoT Applications
  1. PostgreSQL
  2. AWS RDS
  3. GCP BigQuery
  4. AWS Redshift
  5. AWS Glue
  6. Oracle DB
  7. Azure Data Factory (ADF)
  8. RDBMS
  9. MS SQL Server
  10. AWS Kinesis
  11. Azure Synapse
  12. Azure Blob Storage
  13. Azure Stream Analytics
  14. Apache Kafka
  15. ETL/ELT pipeline
  16. AWS S3
  17. GCS
  18. MongoDB
  19. MySQL
  20. GCP Data Fusion
  21. GCP Dataflow
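
The ETL/ELT pipeline item above can be illustrated with a minimal sketch that moves a flat file into a SQL table. It uses only the Python standard library, with an in-memory SQLite database standing in for any of the listed SQL stores. The `orders` table, its columns, and the sample rows are hypothetical and chosen only for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file content; a real pipeline would read from disk or object storage.
RAW_CSV = """order_id,amount,country
1,19.99,de
2,,fr
3,5.00,DE
"""

def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: drop rows with a missing amount, cast types, upper-case country codes."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue
        cleaned.append((int(row["order_id"]), float(row["amount"]), row["country"].upper()))
    return cleaned

def load(conn, records):
    """Load: write the cleaned records into a SQL table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for a managed SQL database
load(conn, transform(extract(RAW_CSV)))
rows = conn.execute(
    "SELECT order_id, amount, country FROM orders ORDER BY order_id"
).fetchall()
print(rows)
```

In an ELT variant the raw rows would be loaded first and the cleaning would run as SQL inside the warehouse; the three functions stay the same, only their order changes.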
  1. Binning / Discretization
  2. Time-Based Features
  3. Annotation
  4. Handling Imbalanced Classes
  5. Dealing with Outliers
  6. Handling Time-Series Data
  7. Dimensionality Reduction
  8. Domain-Specific Feature Engineering
  9. AutoEDA libraries
  10. Polynomial Features
  11. Handling Noisy Data
  12. Textual Feature Extraction
  13. Data Scaling and Normalization
  14. Handling Missing Data
  15. Data Partitioning - Train, Validation, & Test
  16. Augmentation
  17. Feature Selection
  18. Auto-Preprocessing libraries
  19. Handling Categorical Data
  20. Data Transformations
  21. Interaction Features
  22. Feature Extraction from Images
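
Two of the preprocessing items above, data partitioning and data scaling, interact: scaling parameters are usually learned from the training part only and then reused on the validation and test parts. A minimal sketch in plain Python follows; the function names, split fractions, and toy data are assumptions made for illustration.

```python
import random

def train_val_test_split(rows, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle a copy of the rows and cut it into train, validation and test parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

def fit_min_max(values):
    """Learn the scaling parameters from the training data only."""
    return min(values), max(values)

def apply_min_max(values, lo, hi):
    """Scale relative to the training range; a constant column maps to 0.0.

    Training values land in [0, 1]; unseen values may fall outside that range.
    """
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]

data = [float(x) for x in range(10)]  # toy feature column
train, val, test = train_val_test_split(data)
lo, hi = fit_min_max(train)
train_scaled = apply_min_max(train, lo, hi)
val_scaled = apply_min_max(val, lo, hi)  # reuses the training parameters
```

Fitting the scaler on the full dataset before splitting would leak information from the validation and test parts into training, which is why the fit and apply steps are kept separate here.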
  1. Regression Analysis
  2. Weight Initialization
  3. Evaluation Metrics
  4. Reinforcement Learning
  5. Transfer Learning
  6. Network Analytics / Geospatial Analytics
  7. Batch Size Selection
  8. Word Embeddings
  9. Forecasting Techniques
  10. Association Rules
  11. Natural Language Processing
  12. Cross-Validation
  13. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  14. Batch Normalization
  15. Regularization Techniques
  16. Regular Monitoring and Logging
  17. Hyperparameter Tuning
  18. Data Augmentation
  19. Recommendation Engine
  20. Multiclass Classification Techniques
  21. Performance Visualization
  22. Blackbox - Neural Network Models
  23. Ensemble Techniques
  24. Model Comparison
  25. External Validation
  26. AutoML
  27. Clustering
  28. Binary Classification Techniques
  29. Model Interpretability
  30. Early Stopping
  31. Learning Rate Scheduling
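
Cross-validation, hyperparameter tuning, and regularization from the list above combine naturally in a grid search. The sketch below is a hand-rolled illustration in plain Python, not the scikit-learn `GridSearchCV` API: it scores each candidate L2 penalty with k-fold cross-validation on a one-feature ridge regression without intercept. The model, the grid, and the toy data are assumptions.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, val_idx
        start += size

def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge fit for y ~ w * x (no intercept); alpha is the L2 penalty."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

def mse(xs, ys, w):
    """Mean squared error of the predictions w * x."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def cross_val_score(xs, ys, alpha, k=3):
    """Mean validation MSE over k folds for one hyperparameter value."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(xs), k):
        w = fit_ridge_1d([xs[i] for i in train_idx], [ys[i] for i in train_idx], alpha)
        scores.append(mse([xs[i] for i in val_idx], [ys[i] for i in val_idx], w))
    return sum(scores) / len(scores)

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0 * x for x in xs]   # noise-free toy target, so no penalty is needed
grid = [0.0, 1.0, 10.0]      # candidate L2 penalties
results = {alpha: cross_val_score(xs, ys, alpha) for alpha in grid}
best_alpha = min(results, key=results.get)
print(best_alpha, results[best_alpha])
```

The folds here are contiguous and unshuffled to keep the sketch short; with ordered or grouped data a shuffled, stratified, or time-aware split is usually the safer choice.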
  1. Databases
  2. Code Repository
  3. Data Warehouse
  4. Data Preprocessing Pipeline Models
  5. Model Registry
  1. Model Drift
  2. Alerting and Notification
  3. Data Drift Monitoring
  4. Feedback Collection
  5. Model Versioning
  6. Performance Metrics
  7. Bias and Fairness Assessment
  8. Cloud Deployment
  9. Containerization
  10. Flask
  11. Model Health Monitoring
  12. Prediction Logging
  13. Streamlit
  14. FastAPI
  15. Model Serialization
  16. Serverless Computing
  17. Edge Deployment
  18. Concept Drift Detection
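
Data drift monitoring and alerting from the list above can be sketched with the Population Stability Index (PSI), one common drift score among several. The version below is a minimal plain-Python illustration; the equal-width binning, the bin count, and the 0.25 alert threshold (a widely quoted rule of thumb, not a standard) are assumptions.

```python
import math

def psi(reference, current, n_bins=4, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bin edges are equal-width over the reference range; eps avoids log(0).
    """
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant reference

    def proportions(sample):
        counts = [0] * n_bins
        for v in sample:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), n_bins - 1)] += 1  # clamp out-of-range values
        return [max(c / len(sample), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))

reference = [0.1 * i for i in range(100)]   # feature values seen at training time
shifted = [v + 5.0 for v in reference]      # hypothetical production values

ALERT_THRESHOLD = 0.25  # assumed rule-of-thumb cut-off
score = psi(reference, shifted)
if score > ALERT_THRESHOLD:
    print(f"data drift alert: PSI={score:.2f}")
```

PSI compares input distributions only, so it covers data drift; detecting concept drift additionally needs labelled feedback, for example by tracking the performance metrics over time.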
ML Workflow - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline

Data Collection

API Stream

Web crawler

Selenium

Data Ingestion

Data Landing Zone

Store Data from all the Sources

Data Cleaning / Preprocessing

Derived & Base features

Data Training & Modelling

Inference Pipeline

Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference
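
The two pipelines in the architecture above share their cleaning and feature steps: whatever the training pipeline learns has to be stored and replayed at inference time. The following is a minimal sketch of that hand-off in plain Python, assuming a one-feature linear model; the function names, the record layout, and the use of `pickle` as the serialization format are illustrative choices, not taken from the diagram.

```python
import pickle
from statistics import mean

def clean(rows):
    """Data cleaning shared by both pipelines: drop records with a missing input."""
    return [r for r in rows if r.get("x") is not None]

def training_pipeline(raw_rows):
    """Clean -> derive features -> fit; returns a serialized model artifact."""
    rows = [r for r in clean(raw_rows) if r.get("y") is not None]
    x_mean = mean(r["x"] for r in rows)       # preprocessing parameter learned in training
    xs = [r["x"] - x_mean for r in rows]      # derived feature: centred x
    ys = [r["y"] for r in rows]
    y_mean = mean(ys)
    slope = sum(x * (y - y_mean) for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return pickle.dumps({"x_mean": x_mean, "slope": slope, "intercept": y_mean})

def inference_pipeline(artifact, raw_rows):
    """Input data -> same cleaning and feature step -> predictions."""
    model = pickle.loads(artifact)  # only unpickle artifacts from a trusted store
    return [
        model["intercept"] + model["slope"] * (r["x"] - model["x_mean"])
        for r in clean(raw_rows)
    ]

history = [
    {"x": 1.0, "y": 3.0},
    {"x": 2.0, "y": 5.0},
    {"x": None, "y": 1.0},   # dropped by the cleaning step
    {"x": 3.0, "y": 7.0},
]
artifact = training_pipeline(history)
predictions = inference_pipeline(artifact, [{"x": 4.0}, {"x": None}])
print(predictions)
```

Keeping `clean` and the learned `x_mean` inside the artifact's contract is what prevents training and inference from drifting apart when one side is changed without the other.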