Workflow Element Store

  1. Data Collaboration and Partnerships
  2. Surveys and Questionnaires
  3. Feedback Data
  4. Web Scraping
  5. APIs and Data Feeds
  6. Experiments (DoE)
  7. Databases - NoSQL
  8. Mobile Applications or IoT Applications
  9. Public Datasets
  10. Databases - SQL
  11. Flat files
  1. Azure Synapse
  2. GCP Dataflow
  3. MongoDB
  4. AWS RDS
  5. ETL/ELT pipeline
  6. MS SQL Server
  7. Oracle DB
  8. GCP BigQuery
  9. PostgreSQL
  10. AWS S3
  11. AWS Redshift
  12. Azure Data Factory (ADF)
  13. GCS
  14. AWS Glue
  15. AWS Kinesis
  16. Azure Stream Analytics
  17. RDBMS
  18. Azure Blob Storage
  19. GCP Data Fusion
  20. MySQL
  21. Apache Kafka
  1. Polynomial Features
  2. Binning / Discretization
  3. Auto-Preprocessing libraries
  4. Dealing with Outliers
  5. Dimensionality Reduction
  6. Handling Categorical Data
  7. Interaction Features
  8. Feature Extraction from Images
  9. Handling Imbalanced Classes
  10. Annotation
  11. Handling Time-Series Data
  12. Textual Feature Extraction
  13. Data Scaling and Normalization
  14. Feature Selection
  15. Data Partitioning - Train, Validation, & Test
  16. AutoEDA libraries
  17. Time-Based Features
  18. Handling Missing Data
  19. Handling Noisy Data
  20. Data Transformations
  21. Augmentation
  22. Domain-Specific Feature Engineering
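As an illustration of the "Data Partitioning - Train, Validation, & Test" element above, here is a minimal sketch using only the Python standard library. The function name `train_val_test_split`, the 70/15/15 default proportions, and the fixed seed are assumptions made for the example, not values taken from the workflow itself.

```python
import random


def train_val_test_split(rows, val_frac=0.15, test_frac=0.15, seed=42):
    """Shuffle rows and split them into train, validation, and test lists.

    The fractions and seed are illustrative defaults (assumptions).
    """
    if val_frac < 0 or test_frac < 0 or val_frac + test_frac >= 1:
        raise ValueError("val_frac and test_frac must be >= 0 and sum to < 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


train_rows, val_rows, test_rows = train_val_test_split(range(100))
print(len(train_rows), len(val_rows), len(test_rows))
```

A random shuffle like this is generally unsuitable for time-series data, where the split is usually made chronologically so that the model is never evaluated on observations older than its training data.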
  1. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  2. Evaluation Metrics
  3. Blackbox - Neural Network Models
  4. Natural Language Processing
  5. Regularization Techniques
  6. Clustering
  7. Transfer Learning
  8. Regression Analysis
  9. AutoML
  10. Network Analytics / Geospatial Analytics
  11. Batch Normalization
  12. Association Rules
  13. Weight Initialization
  14. Multiclass Classification Techniques
  15. Binary Classification Techniques
  16. Learning Rate Scheduling
  17. Recommendation Engine
  18. Cross-Validation
  19. External Validation
  20. Performance Visualization
  21. Model Comparison
  22. Hyperparameter Tuning
  23. Forecasting Techniques
  24. Reinforcement Learning
  25. Early Stopping
  26. Data Augmentation
  27. Batch Size Selection
  28. Regular Monitoring and Logging
  29. Word Embeddings
  30. Model Interpretability
  31. Ensemble Techniques
  32. Regularization
  1. Model Registry
  2. Code Repository
  3. Data Warehouse
  4. Data Preprocessing Pipeline Models
  5. Databases
  1. Data Drift Monitoring
  2. Containerization
  3. Alerting and Notification
  4. Streamlit
  5. Feedback Collection
  6. Flask
  7. Serverless Computing
  8. Model Versioning
  9. FastAPI
  10. Prediction Logging
  11. Cloud Deployment
  12. Model Health Monitoring
  13. Model Serialization
  14. Performance Metrics
  15. Bias and Fairness Assessment
  16. Edge Deployment
  17. Model Drift
  18. Concept Drift Detection
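To make the "Data Drift Monitoring" element above concrete, the following sketch computes the Population Stability Index (PSI) between a reference sample (for example, training data) and a current sample (for example, recent production inputs) of one numeric feature. PSI is one common choice among several drift measures; the function name, the equal-width binning, and the 0.2 alert threshold (a widely quoted rule of thumb) are assumptions for illustration.

```python
import math


def population_stability_index(reference, current, bins=10, eps=1e-6):
    """Return the PSI of `current` relative to `reference` for one feature.

    Bins are equal-width over the reference range; `eps` avoids log(0)
    for empty bins. Both choices are illustrative assumptions.
    """
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference feature is constant")
    width = (hi - lo) / bins

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the reference range into the edge bins.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


reference_sample = [i / 100 for i in range(1000)]   # values in [0, 10)
shifted_sample = [x + 3 for x in reference_sample]  # same shape, shifted by 3

psi_same = population_stability_index(reference_sample, reference_sample)
psi_shifted = population_stability_index(reference_sample, shifted_sample)

DRIFT_THRESHOLD = 0.2  # assumed rule-of-thumb alert level
print("drift detected:", psi_shifted > DRIFT_THRESHOLD)
```

In a monitoring setup, a check like this would typically run on a schedule per feature, with a breach of the threshold feeding the "Alerting and Notification" element.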
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline

Data Collection

API Stream

Web crawler

Selenium

Data Ingestion

Data Landing Zone

Store Data from all the Sources

Data Cleaning / Preprocessing

Derived & Base features

Data Training & Modelling

Inference Pipeline

Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference
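
The two pipelines above can be sketched as a handful of plain Python functions. The point of the sketch is that the inference pipeline reuses the same cleaning and feature steps as the training pipeline, so both see identically processed data. Every name here (`ingest`, `clean`, `add_features`, `train`, `predict`), the record fields `x` and `y`, the derived feature, and the least-squares model are hypothetical stand-ins, not the components used in the original architecture.

```python
from statistics import mean


def ingest(sources):
    """Data ingestion: store records from all sources in one landing zone."""
    landing_zone = []
    for records in sources:
        landing_zone.extend(records)
    return landing_zone


def clean(records):
    """Data cleaning: drop records whose base feature is missing."""
    return [r for r in records if r.get("x") is not None]


def add_features(records):
    """Keep the base feature x and add a derived feature x_squared."""
    return [{**r, "x_squared": r["x"] ** 2} for r in records]


def train(records):
    """Fit y = intercept + slope * x_squared by ordinary least squares."""
    xs = [r["x_squared"] for r in records]
    ys = [r["y"] for r in records]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return {"intercept": y_bar - slope * x_bar, "slope": slope}


def predict(model, records):
    """Inference: apply the fitted model to processed records."""
    return [model["intercept"] + model["slope"] * r["x_squared"]
            for r in records]


# Training pipeline: collection -> ingestion -> cleaning -> features -> training
api_stream = [{"x": 1, "y": 3}, {"x": 2, "y": 9}, {"x": None, "y": 0}]
web_crawler = [{"x": 3, "y": 19}, {"x": 4, "y": 33}]
model = train(add_features(clean(ingest([api_stream, web_crawler]))))

# Inference pipeline: input data -> same cleaning and features -> inference
new_input = [{"x": 5}, {"x": None}]
predictions = predict(model, add_features(clean(new_input)))
print(predictions)
```

Keeping `clean` and `add_features` as shared functions is one simple way to avoid training/serving skew; in practice the fitted preprocessing steps would be stored alongside the model (see "Data Preprocessing pipeline models" in the element store).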