Workflow Element Store

  1. Web Scraping
  2. Mobile Applications or IoT Applications
  3. Databases - SQL
  4. Experiments (DoE)
  5. Flat files
  6. Surveys and Questionnaires
  7. Data Collaboration and Partnerships
  8. Feedback Data
  9. Public Datasets
  10. APIs and Data Feeds
  11. Databases - NoSQL
  1. MySQL
  2. AWS Kinesis
  3. MongoDB
  4. ETL/ELT pipeline
  5. RDBMS
  6. AWS RDS
  7. Azure Synapse
  8. AWS Glue
  9. PostgreSQL
  10. GCS
  11. GCP Data Fusion
  12. GCP BigQuery
  13. AWS Redshift
  14. AWS S3
  15. Azure Blob Storage
  16. MS SQL Server
  17. Azure Stream Analytics
  18. Apache Kafka
  19. GCP Dataflow
  20. Azure Data Factory (ADF)
  21. Oracle DB
  1. Interaction Features
  2. Handling Time-Series Data
  3. Domain-Specific Feature Engineering
  4. Feature Selection
  5. Time-Based Features
  6. AutoEDA libraries
  7. Dimensionality Reduction
  8. Handling Categorical Data
  9. Data Partitioning - Train, Validation, & Test
  10. Textual Feature Extraction
  11. Augmentation
  12. Handling Imbalanced Classes
  13. Handling Missing Data
  14. Polynomial Features
  15. Auto-Preprocessing libraries
  16. Feature Extraction from Images
  17. Dealing with Outliers
  18. Annotation
  19. Handling Noisy Data
  20. Data Transformations
  21. Binning / Discretization
  22. Data Scaling and Normalization
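
Several of the preprocessing elements above (data partitioning, handling missing data, scaling) interact: the statistics used for imputation and scaling are usually computed on the training split only, so that validation and test data do not leak into them. The following is a minimal sketch using only the Python standard library. The function names, the 60/20/20 split, and the choice of mean imputation with standardization are assumptions made for illustration, not choices stated in the source.

```python
import random
from statistics import mean, pstdev

def partition(rows, train=0.6, val=0.2, seed=0):
    """Shuffle rows and split them into train, validation, and test lists."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = round(len(rows) * train)
    n_val = round(len(rows) * val)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

def fit_preprocessor(train_values):
    """Compute imputation and scaling statistics from the training split only."""
    observed = [v for v in train_values if v is not None]
    mu = mean(observed)
    sigma = pstdev(observed) or 1.0  # guard against a constant feature
    return mu, sigma

def transform(values, mu, sigma):
    """Impute missing values with the training mean, then standardize."""
    return [((mu if v is None else v) - mu) / sigma for v in values]

# Hypothetical single-feature dataset with one missing value.
values = [float(i) for i in range(1, 21)]
values[3] = None

train_set, val_set, test_set = partition(values)
mu, sigma = fit_preprocessor(train_set)   # fitted on train only
train_scaled = transform(train_set, mu, sigma)
val_scaled = transform(val_set, mu, sigma)    # reuses the train statistics
test_scaled = transform(test_set, mu, sigma)  # reuses the train statistics
```

The same fit-on-train, apply-everywhere pattern carries over to binning, categorical encoding, and dimensionality reduction.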
  1. Blackbox - Neural Network Models
  2. Learning Rate Scheduling
  3. External Validation
  4. Data Augmentation
  5. Ensemble Techniques
  6. Transfer Learning
  7. Multiclass Classification Techniques
  8. Regularization Techniques
  9. Transfer Learning
  10. Network Analytics / Geospatial Analytics
  11. Association Rules
  12. Model Comparison
  13. Natural Language Processing
  14. Regularization
  15. Hyperparameter Tuning
  16. Recommendation Engine
  17. Batch Normalization
  18. Cross-Validation
  19. Word Embeddings
  20. Reinforcement Learning
  21. Binary Classification Techniques
  22. Cross-Validation
  23. Weight Initialization
  24. Regular Monitoring and Logging
  25. Evaluation Metrics
  26. Early Stopping
  27. Performance Visualization
  28. Clustering
  29. Model Interpretability
  30. Forecasting Techniques
  31. GridSearchCV, RandomizedSearchCV, BayesSearchCV
  32. Regression Analysis
  33. Batch Size Selection
  34. AutoML
  1. Data Warehouse
  2. Model Registry
  3. Code Repository
  4. Databases
  5. Data Preprocessing pipeline models
  1. Data Drift Monitoring
  2. Prediction Logging
  3. Flask
  4. Containerization
  5. Model Versioning
  6. FastAPI
  7. Performance Metrics
  8. Model Health Monitoring
  9. Alerting and Notification
  10. Model Drift
  11. Concept Drift Detection
  12. Bias and Fairness Assessment
  13. Edge Deployment
  14. Feedback Collection
  15. Model Serialization
  16. Serverless Computing
  17. Cloud Deployment
  18. Streamlit
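
Data drift monitoring, listed among the elements above, is often implemented by comparing the distribution of a feature at training time with its distribution in production. The following is a minimal sketch using the Population Stability Index (PSI), which is one common choice and not something the source prescribes. The function name `psi`, the bin count, and the `eps` floor for empty bins are assumptions made for illustration.

```python
import math

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference and a current sample.

    Bin edges are derived from the reference sample; values outside that
    range are clipped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # avoid a zero width for constant data

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Hypothetical feature values: training-time reference vs. shifted production data.
reference = [i / 100 for i in range(100)]
shifted = [x + 0.5 for x in reference]

score_same = psi(reference, reference)
score_shifted = psi(reference, shifted)
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the threshold that should trigger alerting depends on the feature and the application.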
ML Workflow Intermediate - Architecture
  • Element belongs to model
  • Element does not belong to model
Training Pipeline
  • Data Collection
      • API Stream
      • Web crawler
      • Selenium
  • Data Ingestion
      • Data Landing Zone
      • Store Data from all the Sources
  • Data Cleaning / Preprocessing
      • Derived & Base features
  • Data Training & Modelling
Inference Pipeline
  • Input Data for Forecasting
      • Input Data
      • Cleaned & Processed Data
  • Inference
      • Streamlit
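
The training and inference pipelines in the architecture above can be sketched as a chain of small functions, one per stage. This is a minimal standard-library illustration under stated assumptions: every function name, the toy records, and the least-squares line used as the "model" are hypothetical and stand in for whatever collectors, stores, and models a real system would use.

```python
from dataclasses import dataclass
from statistics import mean

def collect():
    """Data Collection: stand-in for an API stream or a web crawler."""
    return [
        {"x": 1.0, "y": 2.1},
        {"x": 2.0, "y": 3.9},
        {"x": None, "y": 5.0},  # incomplete record from a source
        {"x": 4.0, "y": 8.2},
    ]

def ingest(records, landing_zone):
    """Data Ingestion: store raw data from all sources in a landing zone."""
    landing_zone.extend(records)
    return landing_zone

def clean(records):
    """Data Cleaning / Preprocessing: drop records with missing values."""
    return [r for r in records if r["x"] is not None and r["y"] is not None]

@dataclass
class LinearModel:
    slope: float
    intercept: float

    def predict(self, x):
        return self.slope * x + self.intercept

def train(records):
    """Data Training & Modelling: ordinary least-squares fit of y on x."""
    xs = [r["x"] for r in records]
    ys = [r["y"] for r in records]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return LinearModel(slope, y_bar - slope * x_bar)

def run_training_pipeline():
    landing_zone = []
    raw = ingest(collect(), landing_zone)
    return train(clean(raw))

def run_inference_pipeline(model, inputs):
    """Inference: apply the same cleaning rule to new inputs, then predict."""
    cleaned = [x for x in inputs if x is not None]
    return [model.predict(x) for x in cleaned]

model = run_training_pipeline()
forecasts = run_inference_pipeline(model, [3.0, None])
```

Keeping the cleaning step shared between the two pipelines is the point of the sketch: inference inputs should pass through the same preprocessing as the training data before they reach the model.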