Workflow Element Store

  1. Experiments (DoE)
  2. Surveys and Questionnaires
  3. Web Scraping
  4. Mobile Applications or IoT Applications
  5. Feedback Data
  6. Flat files
  7. Databases - SQL
  8. APIs and Data Feeds
  9. Databases - NoSQL
  10. Data Collaboration and Partnerships
  11. Public Datasets
  1. PostgreSQL
  2. Azure Stream Analytics
  3. MS SQL Server
  4. Azure Synapse
  5. MongoDB
  6. Azure Blob Storage
  7. ETL/ELT pipeline
  8. GCP BigQuery
  9. AWS Redshift
  10. Azure Data Factory (ADF)
  11. AWS Kinesis
  12. Apache Kafka
  13. Oracle DB
  14. GCP Data Fusion
  15. MySQL
  16. AWS RDS
  17. AWS Glue
  18. GCP Dataflow
  19. RDBMS
  20. AWS S3
  21. GCP Cloud Storage (GCS)
  1. Data Scaling and Normalization
  2. Domain-Specific Feature Engineering
  3. Dealing with Outliers
  4. AutoEDA libraries
  5. Data Transformations
  6. Feature Selection
  7. Textual Feature Extraction
  8. Handling Imbalanced Classes
  9. Interaction Features
  10. Dimensionality Reduction
  11. Augmentation
  12. Binning / Discretization
  13. Handling Categorical Data
  14. Polynomial Features
  15. Handling Time-Series Data
  16. Feature Extraction from Images
  17. Annotation
  18. Handling Missing Data
  19. Data Partitioning - Train, Validation, & Test
  20. Time-Based Features
  21. Handling Noisy Data
  22. Auto-Preprocessing libraries
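
Three of the items above (Handling Missing Data, Data Scaling and Normalization, and Data Partitioning) can be sketched with the Python standard library alone. This is a minimal illustration, not a recommended implementation: the function names, the sample values, and the 70/15/15 split are assumptions made for the example, and a real project would more likely use a preprocessing library.

```python
import random
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def partition(rows, train=0.7, val=0.15, seed=0):
    """Shuffle rows and split them into train, validation, and test lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


# Hypothetical raw feature column with two missing entries.
raw = [4.0, None, 10.0, 6.0, None, 8.0, 2.0, 12.0, 5.0, 9.0]
clean = impute_mean(raw)
scaled = min_max_scale(clean)
train_set, val_set, test_set = partition(scaled)
```

For brevity the sketch scales before splitting. In practice the scaling bounds and the imputation mean are usually computed on the training split only, so that no information leaks from the validation and test data.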
  1. Learning Rate Scheduling
  2. Association Rules
  3. Word Embeddings
  4. AutoML
  5. Model Interpretability
  6. Multiclass Classification Techniques
  7. Regular Monitoring and Logging
  8. Regularization Techniques
  9. Recommendation Engine
  10. Natural Language Processing
  11. Black-box Neural Network Models
  12. Data Augmentation
  13. External Validation
  14. Clustering
  15. Performance Visualization
  16. Transfer Learning
  17. Regression Analysis
  18. Evaluation Metrics
  19. Binary Classification Techniques
  20. Ensemble Techniques
  21. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  22. Cross-Validation
  23. Early Stopping
  24. Weight Initialization
  25. Network Analytics / Geospatial Analytics
  26. Batch Normalization
  27. Hyperparameter Tuning
  28. Batch Size Selection
  29. Reinforcement Learning
  30. Forecasting Techniques
  31. Model Comparison
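
To make Cross-Validation and Evaluation Metrics from the list above concrete, here is a small sketch of k-fold cross-validation scored with mean absolute error. The baseline "model" simply predicts the training mean. The helper names and the toy target values are assumptions for illustration, and the folds are contiguous and unshuffled to keep the code short.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples)
                     if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validate_mean_baseline(y, k=3):
    """Score a predict-the-training-mean baseline on each fold."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(y), k):
        train_mean = sum(y[i] for i in train_idx) / len(train_idx)
        preds = [train_mean] * len(test_idx)
        scores.append(mean_absolute_error([y[i] for i in test_idx], preds))
    return scores


# Hypothetical target values.
y = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0]
fold_scores = cross_validate_mean_baseline(y, k=3)
folds = list(k_fold_indices(7, 3))
```

Comparing the per-fold scores of two candidate models, rather than a single train/test score, is one simple way to approach the Model Comparison item.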
  1. Code repository
  2. Data warehouse
  3. Databases
  4. Data Preprocessing pipeline models
  5. Model registry
  1. Serverless Computing
  2. Model Serialization
  3. Prediction Logging
  4. Model Versioning
  5. Data Drift Monitoring
  6. Model Drift
  7. Performance Metrics
  8. Streamlit
  9. Concept Drift Detection
  10. Model Health Monitoring
  11. Bias and Fairness Assessment
  12. FastAPI
  13. Feedback Collection
  14. Flask
  15. Edge Deployment
  16. Alerting and Notification
  17. Cloud Deployment
  18. Containerization
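
Data Drift Monitoring and Alerting from the list above can be illustrated with the Population Stability Index (PSI), one common drift score among several. The sketch below is an assumption-laden example: the bin count, the `eps` floor, the 0.2 alert threshold (a frequently quoted rule of thumb), and the synthetic data are all choices made here, not values taken from this document.

```python
import math


def psi(expected, actual, bins=5, eps=1e-6):
    """Population Stability Index between a reference and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant sample

    def proportions(sample):
        counts = [0] * bins
        for x in sample:
            # Clamp out-of-range live values into the first or last bin.
            idx = min(max(int((x - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(sample), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic reference (training-time) data and two live samples.
reference = [float(i % 10) for i in range(200)]
same = list(reference)
shifted = [x + 4.0 for x in reference]

ALERT_THRESHOLD = 0.2  # assumption: rule-of-thumb cut-off, tune per use case
for name, live in [("same", same), ("shifted", shifted)]:
    score = psi(reference, live)
    status = "ALERT" if score > ALERT_THRESHOLD else "ok"
    print(f"{name}: PSI={score:.3f} -> {status}")
```

In a deployed system the reference sample would typically come from the training data and the live sample from the prediction log, with the alert routed to a notification channel instead of `print`.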
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline

Data Collection

API Stream

Web crawler

Selenium

Data Ingestion

Data Landing Zone

Store Data from all the Sources

Data Cleaning / Preprocessing

Derived & Base features

Data Training & Modelling

Inference Pipeline

Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference
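
The two pipelines above can be sketched end to end in a few lines. This is only an illustration of the flow of stages: the function names, the record fields (`x`, `y`, `x_sq`), the sample data, and the toy least-squares model are all assumptions made for the example. The point it tries to show is that the inference pipeline reuses the same cleaning and feature step as the training pipeline.

```python
def collect(sources):
    """Data Collection + Ingestion: merge every source into one landing zone."""
    landing_zone = []
    for name, rows in sources.items():
        landing_zone.extend({**row, "source": name} for row in rows)
    return landing_zone


def preprocess(rows):
    """Cleaning + features: drop rows without x, add a derived feature."""
    cleaned = [r for r in rows if r.get("x") is not None]
    return [{**r, "x_sq": r["x"] ** 2} for r in cleaned]


def train(rows):
    """Fit y ~ w * x by least squares through the origin (toy model)."""
    num = sum(r["x"] * r["y"] for r in rows)
    den = sum(r["x_sq"] for r in rows)
    return {"w": num / den}


def infer(model, rows):
    """Inference: apply the trained weight to each processed row."""
    return [model["w"] * r["x"] for r in rows]


# Training pipeline: collection -> landing zone -> preprocessing -> training.
sources = {
    "api_stream": [{"x": 1.0, "y": 2.0}, {"x": None, "y": 5.0}],
    "web_crawler": [{"x": 2.0, "y": 4.0}, {"x": 3.0, "y": 6.0}],
}
model = train(preprocess(collect(sources)))

# Inference pipeline: input data -> same preprocessing -> inference.
predictions = infer(model, preprocess([{"x": 4.0}, {"x": None}]))
```

Sharing one `preprocess` function between both pipelines is a deliberate design choice in this sketch: it avoids the training and inference paths drifting apart in how they clean data and derive features.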