Workflow Element Store

Data Sources
  1. Data Logging
  2. Public Datasets
  3. Data Collaboration and Partnerships
  4. Data Generation
  5. Crowdsourcing
  6. Unstructured Data (Images / Videos)
  7. Pre-existing Data
  8. Mobile Applications or IoT Applications
  9. APIs and Data Feeds
  10. Unstructured Data (Audio)
  11. Web Scraping
  12. Structured Data (Tabular)
  13. Surveys and Questionnaires

Data Warehouse / Data Lake
  1. MySQL
  2. NoSQL DB
  3. RDBMS
  4. AWS Redshift
  5. Azure Data Warehouse
  6. Azure Blob Storage
  7. GCS
  8. Oracle DB
  9. MS SQL Server
  10. S3
  11. PostgreSQL
  12. GCP BigQuery
  13. Informatica

EDA, Data Preprocessing & Feature Engineering
  1. Handling Noisy Data
  2. Handling Missing Data
  3. Dimensionality Reduction
  4. Logarithmic Transform
  5. Auto-Preprocessing Libraries
  6. Handling Time-Series Data
  7. Encoding Categorical Variables
  8. Feature Extraction from Images
  9. Interaction Features
  10. Time-Based Features
  11. Textual Feature Extraction
  12. Domain-Specific Feature Engineering
  13. Polynomial Features
  14. Feature Selection
  15. Handling Imbalanced Classes
  16. Data Scaling and Normalization
  17. Handling Categorical Data
  18. Dealing with Outliers
  19. Data Scaling and Normalization
  20. AutoEDA Libraries
  21. Dimensionality Reduction
  22. Binning
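
Two of the elements listed above, Handling Missing Data and Data Scaling and Normalization, can be illustrated with a minimal sketch. Mean imputation and min-max scaling are assumed here as one common choice for each; the source does not name a specific method, and the function names and the `ages` column are hypothetical.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical column with one missing entry.
ages = [20, None, 40, 60]
filled = impute_mean(ages)      # missing entry becomes the mean of 20, 40, 60
scaled = min_max_scale(filled)  # smallest value maps to 0.0, largest to 1.0
```

In practice the imputation value and the scaling bounds would be computed on the training partition only and then reused on the test partition, so that no information leaks across the split.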

Model Selection
  1. Train-Test Split
  2. Blackbox Techniques
  3. Forecasting
  4. Data Partitioning
  5. Supervised Learning - Binary Classification
  6. Unsupervised Learning
  7. Supervised Learning - Regression
  8. Time Series Analysis
  9. Supervised Learning - Multiclass Classification
  10. Ensemble Techniques
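
The Train-Test Split element listed above can be sketched as a seeded shuffle followed by a cut. This is an illustrative, standard-library version; `split_rows` and its parameters are hypothetical names, not taken from the source.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and return (train, test)."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # a private RNG keeps the split repeatable
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


train, test = split_rows(range(10), test_fraction=0.2, seed=42)
```

A shuffled split like this assumes the rows are independent. For the Forecasting and Time Series Analysis elements, a sequential cut that keeps the most recent rows for testing is the usual alternative.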

Model Training & Hyperparameter Tuning
  1. Batch Normalization
  2. Data Augmentation
  3. Early Stopping
  4. Hyperparameter Tuning
  5. Regular Monitoring and Logging
  6. Gradient Clipping
  7. Weight Initialization
  8. Learning Rate Scheduling
  9. Cross-Validation
  10. Transfer Learning
  11. Data Partitioning - Sequential
  12. Ensemble Methods
  13. Train-Test Split
  14. Regularization
  15. Batch Size Selection
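
Of the training elements listed above, Early Stopping is easy to show in isolation. The sketch below assumes a patience-based rule on validation loss, which is one common formulation; the function name, the `patience` and `min_delta` parameters, and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=2, min_delta=0.0):
    """Return the epoch index at which training would stop, or None.

    Training stops once the validation loss has failed to improve on the
    best value seen so far by more than min_delta for `patience`
    consecutive epochs.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None


# Hypothetical validation losses, one per epoch.
losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.60]
stop_at = early_stopping_epoch(losses, patience=2)
```

With these values the rule fires before the late improvement at the last epoch is seen, which shows the trade-off in choosing a small patience.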

Model Evaluation
  1. Evaluation Metrics
  2. Model Comparison
  3. Model Interpretability
  4. Performance Visualization
  5. Cross-Validation
  6. Regularization Techniques
  7. Train-Test Split
  8. Data Partitioning
  9. Hyperparameter Tuning
  10. External Validation
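
The Evaluation Metrics element listed above does not say which metrics are meant. As an assumed example for the binary classification case, precision, recall, and F1 can be computed from the prediction counts as follows; the labels shown are made up for illustration.

```python
def binary_metrics(y_true, y_pred):
    """Compute precision, recall, and F1 for labels encoded as 0 and 1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Fall back to 0.0 when a denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```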

Model Deployment
  1. Monitoring and Logging
  2. Prediction Logging
  3. Cloud Deployment
  4. Model Versioning
  5. Model Drift
  6. Error Analysis
  7. Alerting and Notification
  8. Continuous Integration and Deployment (CI/CD)
  9. Model Health Monitoring
  10. Edge Deployment
  11. Model Retraining and Updating
  12. Data Drift Monitoring
  13. Model Serialization
  14. Documentation and Reporting
  15. Streamlit
  16. Concept Drift Detection
  17. Model Registry
  18. Serverless Computing
  19. Containerization
  20. Web APIs - Flask, FastAPI, etc.
  21. Model Monitoring and Maintenance
  22. A/B Testing
  23. Bias and Fairness Assessment
  24. Performance Metrics
  25. Feedback Collection
  26. Security Considerations
  27. Documentation and API Documentation
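
For the Data Drift Monitoring element listed above, one possible check is the Population Stability Index (PSI), which compares the binned distribution of a feature at training time with its distribution in production. PSI is an assumption here, since the source does not name a drift statistic, and the bin count, the epsilon floor, and the sample data are all illustrative.

```python
import math


def population_stability_index(expected, actual, n_bins=4):
    """Compare two samples of one numeric feature; larger means more shift."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins
    if width == 0:
        width = 1.0  # constant reference sample: everything lands in one bin

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp values outside the reference range
            counts[idx] += 1
        eps = 1e-6  # floor so that empty bins do not produce log(0)
        return [max(c / len(values), eps) for c in counts]

    e = proportions(expected)
    a = proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical feature values at training time and in production.
reference = list(range(100))
production = [v + 30 for v in reference]
psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, production)
```

Identical samples give a PSI of zero, and the shifted sample gives a positive value. The threshold at which an alert fires would have to be chosen per feature.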

End User Device
  1. Mobile
  2. End User Machine
ML Workflow - Architecture
  • Element belongs to model
  • Element does not belong to model

Feature Store
(Online / Offline)

Data Sources

Data Warehouse / Data Lake

EDA, Data Preprocessing & Feature Engineering

Model Selection

Model Training & Hyperparameter Tuning

Model Evaluation

Model Deployment

End User Device

Model Registry
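
The architecture above can be sketched as an ordered list of stages, each holding elements from the Workflow Element Store, with a flag per element for the legend (belongs to the model or not). The linear stage order, the `Stage` class, and the `chosen` set are assumptions made for illustration. How the Feature Store and Model Registry connect to the stages is not stated in the text, so they are listed only as shared components.

```python
from dataclasses import dataclass, field


@dataclass
class Stage:
    name: str
    elements: list = field(default_factory=list)


# Stage names follow the architecture labels; only a few elements per stage are shown.
PIPELINE = [
    Stage("Data Sources", ["Data Logging", "APIs and Data Feeds"]),
    Stage("Data Warehouse / Data Lake", ["PostgreSQL", "S3"]),
    Stage("EDA, Data Preprocessing & Feature Engineering",
          ["Handling Missing Data", "Binning"]),
    Stage("Model Selection", ["Forecasting", "Ensemble Techniques"]),
    Stage("Model Training & Hyperparameter Tuning",
          ["Early Stopping", "Cross-Validation"]),
    Stage("Model Evaluation", ["Evaluation Metrics", "External Validation"]),
    Stage("Model Deployment", ["Containerization", "A/B Testing"]),
    Stage("End User Device", ["Mobile", "End User Machine"]),
]

# Components drawn outside the stage sequence.
SHARED_COMPONENTS = ["Feature Store (Online / Offline)", "Model Registry"]


def mark_elements(pipeline, chosen):
    """Map each stage name to (element, belongs_to_model) pairs."""
    return {
        stage.name: [(element, element in chosen) for element in stage.elements]
        for stage in pipeline
    }


# Hypothetical selection of elements for one model.
chosen = {"Data Logging", "S3", "Binning", "Forecasting", "Mobile"}
marked = mark_elements(PIPELINE, chosen)
```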