Workflow Element Store

  1. Mobile Applications or IoT Applications
  2. Databases - NoSQL
  3. APIs and Data Feeds
  4. Feedback Data
  5. Experiments (DoE)
  6. Surveys and Questionnaires
  7. Web Scraping
  8. Databases - SQL
  9. Data Collaboration and Partnerships
  10. Flat files
  11. Public Datasets
  1. Azure Data Factory (ADF)
  2. MySQL
  3. MongoDB
  4. ETL/ELT pipeline
  5. GCP Dataflow
  6. GCS
  7. Apache Kafka
  8. Azure Synapse
  9. AWS Glue
  10. PostgreSQL
  11. MS SQL Server
  12. Azure Blob Storage
  13. GCP Data Fusion
  14. AWS RDS
  15. AWS Kinesis
  16. Oracle DB
  17. AWS Redshift
  18. Azure Stream Analytics
  19. AWS S3
  20. GCP BigQuery
  21. RDBMS
  1. Handling Noisy Data
  2. Polynomial Features
  3. Augmentation
  4. Handling Time-Series Data
  5. Data Scaling and Normalization
  6. Textual Feature Extraction
  7. Interaction Features
  8. Dealing with Outliers
  9. Feature Extraction from Images
  10. Handling Imbalanced Classes
  11. Time-Based Features
  12. Handling Categorical Data
  13. Auto-Preprocessing libraries
  14. Handling Missing Data
  15. Dimensionality Reduction
  16. Domain-Specific Feature Engineering
  17. Feature Selection
  18. Annotation
  19. Data Transformations
  20. Binning / Discretization
  21. Data Partitioning - Train, Validation, & Test
  22. AutoEDA libraries
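
Two of the preprocessing elements listed above, "Data Partitioning - Train, Validation, & Test" and "Data Scaling and Normalization", interact in a way that is easy to get wrong, so a minimal sketch may help. It uses only the Python standard library; the function names, the 70/15/15 split, and the toy data are illustrative assumptions, not part of the original workflow.

```python
import random
from statistics import mean, pstdev

def split_dataset(rows, train=0.7, val=0.15, seed=42):
    """Shuffle a copy of rows and cut it into train/validation/test partitions."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train)
    n_val = round(len(shuffled) * val)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

def fit_standard_scaler(values):
    """Learn mean and standard deviation from the training partition only."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 for a constant column
    return mu, sigma

def standardize(values, mu, sigma):
    """Apply previously learned statistics to any partition."""
    return [(v - mu) / sigma for v in values]

# Hypothetical single-feature dataset.
data = [float(i) for i in range(100)]
train_rows, val_rows, test_rows = split_dataset(data)

# Statistics come from the training rows and are reused unchanged elsewhere.
mu, sigma = fit_standard_scaler(train_rows)
train_scaled = standardize(train_rows, mu, sigma)
val_scaled = standardize(val_rows, mu, sigma)
test_scaled = standardize(test_rows, mu, sigma)
```

Fitting the scaler on the training partition alone keeps information from the validation and test partitions out of the training step.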
  1. Model Interpretability
  2. Word Embeddings
  3. Transfer Learning
  4. Blackbox - Neural Network Models
  5. Evaluation Metrics
  6. Performance Visualization
  7. Batch Normalization
  8. Association Rules
  9. Cross-Validation
  10. Weight Initialization
  11. Natural Language Processing
  12. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  13. Clustering
  14. Cross-Validation
  15. Forecasting Techniques
  16. Multiclass Classification Techniques
  17. Model Comparison
  18. Reinforcement Learning
  19. Ensemble Techniques
  20. External Validation
  21. Regularization Techniques
  22. AutoML
  23. Batch Size Selection
  24. Hyperparameter Tuning
  25. Regularization
  26. Regular Monitoring and Logging
  27. Binary Classification Techniques
  28. Learning Rate Scheduling
  29. Early Stopping
  30. Recommendation Engine
  31. Regression Analysis
  32. Network Analytics / Geospatial Analytics
  33. Transfer Learning
  34. Data Augmentation
  1. Data Preprocessing Pipeline Models
  2. Code Repository
  3. Model Registry
  4. Databases
  5. Data Warehouse
  1. Bias and Fairness Assessment
  2. Model Drift
  3. Edge Deployment
  4. Performance Metrics
  5. Containerization
  6. Serverless Computing
  7. Model Serialization
  8. Streamlit
  9. FastAPI
  10. Cloud Deployment
  11. Feedback Collection
  12. Data Drift Monitoring
  13. Model Versioning
  14. Concept Drift Detection
  15. Prediction Logging
  16. Model Health Monitoring
  17. Flask
  18. Alerting and Notification
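
As an illustration of the "Data Drift Monitoring" and "Alerting and Notification" elements above, the following sketch computes a Population Stability Index (PSI) between a reference sample and a recent sample of one numeric feature. PSI is one common choice and is swapped in here for illustration; the source does not name a drift metric. The function name, the bin count, the sample data, and the 0.2 alert threshold (a widely quoted rule of thumb) are all assumptions.

```python
import math

def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between two samples, using equal-width bins taken from `expected`."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # guard against a constant reference sample

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)  # clamp out-of-range values
            counts[idx] += 1
        # eps keeps empty bins from producing log(0) or division by zero
        return [max(c / len(values), eps) for c in counts]

    p = proportions(expected)
    q = proportions(actual)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))

# Hypothetical training-time reference sample and a shifted production sample.
reference = [i / 100 for i in range(1000)]
shifted = [v + 3.0 for v in reference]

ALERT_THRESHOLD = 0.2  # assumed rule-of-thumb cut-off
psi = population_stability_index(reference, shifted)
drift_detected = psi > ALERT_THRESHOLD
```

In a deployed setup the `drift_detected` flag would feed whatever alerting channel is in use; that wiring is left out here.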

ML Workflow - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline

Data Collection

API Stream

Web crawler

Selenium

Data Ingestion

Data Landing Zone

Store Data from all the Sources

Data Cleaning / Preprocessing

Derived & Base features

Data Training & Modelling

Inference Pipeline

Input Data for Forecasting

Input Data

Cleaned & Processed Data

Inference
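
The two pipelines in the diagram can be sketched as plain functions that share the same cleaning and feature steps. This is a minimal, hypothetical illustration: the record fields (`units`, `price`, `target`), the derived `spend` feature, and the one-variable least-squares model are assumptions chosen to keep the example small, not details taken from the architecture.

```python
def ingest(*sources):
    """Data Ingestion: store records from all sources in one landing zone."""
    landing_zone = []
    for source in sources:
        landing_zone.extend(source)
    return landing_zone

def clean(records, require_target):
    """Data Cleaning / Preprocessing: drop records with missing fields."""
    needed = ["units", "price"] + (["target"] if require_target else [])
    return [r for r in records if all(r.get(k) is not None for k in needed)]

def build_features(records):
    """Base features (units, price) plus one derived feature (spend)."""
    return [{"units": r["units"], "price": r["price"],
             "spend": r["units"] * r["price"]} for r in records]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def training_pipeline(*sources):
    rows = clean(ingest(*sources), require_target=True)
    feats = build_features(rows)
    return fit_line([f["spend"] for f in feats], [r["target"] for r in rows])

def inference_pipeline(model, input_rows):
    """Reuse the same cleaning and feature steps, then predict."""
    slope, intercept = model
    rows = clean(input_rows, require_target=False)
    return [slope * f["spend"] + intercept for f in build_features(rows)]

# Hypothetical records standing in for the API stream and web crawler sources.
api_stream = [{"units": 1, "price": 2.0, "target": 5.0},
              {"units": 2, "price": 2.0, "target": 9.0}]
web_crawler = [{"units": 3, "price": 2.0, "target": 13.0},
               {"units": None, "price": 2.0, "target": 1.0}]  # dropped by clean()

model = training_pipeline(api_stream, web_crawler)
predictions = inference_pipeline(model, [{"units": 5, "price": 2.0}])
```

Routing inference input through the same `clean` and `build_features` functions as training is the point of the sketch: it keeps the features seen at prediction time consistent with those the model was fitted on.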