ML Project Steps

1. Requirements & Scope of Work

1.1. Steps: (1) define the project statement and objectives; (2) define project milestones, technical stack, and deliverables.

1.2. Tooling: Jira Software, Confluence, Word, Excel, etc.

2. Data Collection

2.1. Steps: (1) data discovery and collection (internal and external sources); (2) access control; (3) compliance.

2.2. Tooling: web scraping libraries (Beautiful Soup, Scrapy); API interaction libraries; database interaction libraries; data extraction tools; data storage and management tools (cloud storage: S3; data warehouses: Redshift, Snowflake, BigQuery; data lakes: HDFS).

3. Data Preparation

3.1. Steps: (1) cleaning (remove invalid data, handle outliers); (2) transformation; (3) labelling.

3.2. Tooling: Pandas, NumPy, Apache Spark
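
The cleaning step above (remove invalid data, handle outliers) can be sketched without any third-party library. This is a minimal illustration, not a prescribed method: it assumes "invalid" means a missing value (`None`) and that outliers are defined by the common 1.5 × IQR rule. The `clean_values` function and the `readings` data are hypothetical. In practice the same logic is usually written with pandas.

```python
import statistics


def clean_values(values):
    """Drop missing entries, then drop outliers using the 1.5 * IQR rule."""
    # Step 1: remove invalid data (assumption: "invalid" means None).
    valid = [v for v in values if v is not None]

    # Step 2: compute the quartiles; quantiles(n=4) returns [Q1, Q2, Q3].
    q1, _, q3 = statistics.quantiles(valid, n=4)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    # Step 3: keep only the values that fall inside the fences.
    return [v for v in valid if lower <= v <= upper]


# Hypothetical sensor readings with one missing value and one obvious outlier.
readings = [10, 12, 11, None, 13, 12, 11, 10, 500]
cleaned = clean_values(readings)
print(cleaned)
```

Dropping outliers is only one option; clipping them to the fence values or flagging them for review may suit some datasets better.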

4. EDA (Exploratory Data Analysis)

4.1. Steps: (1) data visualization; (2) identify patterns and trends; (3) univariate, bivariate, and multivariate analysis.

4.2. Tooling: pandas, NumPy, Matplotlib, seaborn, Apache Spark, etc.
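
As a rough illustration of univariate and bivariate analysis, the sketch below computes summary statistics for one variable and the Pearson correlation between two. The helper names (`summarize`, `pearson`) and the sample data are made up for this example. A real project would more likely use pandas for the statistics and Matplotlib or seaborn for plots.

```python
import math
import statistics


def summarize(values):
    """Univariate summary: central tendency and spread of one variable."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def pearson(xs, ys):
    """Bivariate analysis: Pearson correlation coefficient of two variables."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical data: hours studied and a perfectly linear outcome.
hours = [1, 2, 3, 4, 5]
doubled = [2, 4, 6, 8, 10]

summary = summarize(hours)
r = pearson(hours, doubled)
print(summary)
print(r)
```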

5. Feature Engineering

5.1. Feature creation, selection, and scaling, covering both raw and derived features.

5.2. Tooling: NumPy, scikit-learn, pandas, etc.
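
To make "raw and derived features" and "scaling" concrete, here is a small sketch under stated assumptions: the raw features (height, weight) and the derived feature (body mass index) are invented for illustration, and z-score standardization stands in for whichever scaling method a project actually needs. The `standardize` helper is hypothetical.

```python
import statistics


def standardize(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


# Raw features (hypothetical measurements).
heights_m = [1.60, 1.75, 1.82, 1.68]
weights_kg = [55.0, 72.0, 90.0, 63.0]

# Derived feature: body mass index, computed from the two raw features.
bmi = [w / h ** 2 for w, h in zip(weights_kg, heights_m)]

# Scaling: standardize the derived feature so it is comparable to others.
bmi_scaled = standardize(bmi)
print(bmi_scaled)
```

In a real pipeline the mean and standard deviation should be computed on the training set only and then reused for the test set, to avoid leaking information.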

6. Model Selection and Training

6.1. Steps: (1) algorithm selection (regression, classification, clustering, etc.); (2) train-test split.

6.2. Tooling: Jupyter; statistical ML (scikit-learn, XGBoost); deep learning (PyTorch, TensorFlow); etc.
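
The train-test split mentioned above amounts to shuffling the data and holding part of it back. The sketch below shows the idea with the standard library only. `split_train_test` is a hypothetical helper, and the 80/20 ratio and the seed are arbitrary choices for the example. scikit-learn provides `train_test_split` for the same purpose.

```python
import random


def split_train_test(rows, test_ratio=0.2, seed=42):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    # The first n_test rows form the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]


samples = list(range(10))
train, test = split_train_test(samples, test_ratio=0.2)
print(len(train), len(test))
```

The held-out test rows should not be used again until the final evaluation, otherwise the reported performance is likely to be optimistic.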

7. Model Evaluation

7.1. Steps: (1) model evaluation metrics (accuracy, precision, recall, and F1 score); (2) k-fold cross-validation.

7.2. Tooling: Jupyter, scikit-learn, XGBoost, PyTorch, TensorFlow, etc.
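
The metrics and the k-fold procedure listed above can be written out by hand to show what they compute. This is a sketch for binary labels (1 = positive, 0 = negative). The function names and the example labels are hypothetical, and the folds are contiguous and unshuffled for simplicity. scikit-learn offers equivalent, more complete utilities.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        validation = list(range(start, stop))
        training = [i for i in range(n_samples) if i < start or i >= stop]
        yield training, validation
        start = stop


# Hypothetical labels and predictions.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 1, 0, 0]
metrics = classification_metrics(y_true, y_pred)
folds = list(k_fold_indices(10, 3))
print(metrics)
print([len(validation) for _, validation in folds])
```

Each sample lands in exactly one validation fold, so averaging a metric over the folds uses every sample for validation once.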

8. Model Fine-Tuning

8.1. Steps: (1) hyperparameter tuning; (2) transfer learning; (3) grid search with cross-validation (GridSearchCV).

8.2. Tooling: scikit-learn, XGBoost, TensorFlow, PyTorch
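
Grid search, as listed above, simply tries every combination of candidate hyperparameter values and keeps the best-scoring one. The sketch below shows only that enumeration. `grid_search` is a hypothetical helper, and `toy_cv_score` is a made-up stand-in for what would normally be the mean cross-validated score of a model trained with the given parameters. scikit-learn's `GridSearchCV` combines both parts.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every parameter combination and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_cv_score(params):
    # Assumption: a stand-in for a cross-validated score, peaking at
    # learning_rate=0.1 and max_depth=3. A real score_fn would train and
    # validate a model on each fold and return the average metric.
    return -((params["learning_rate"] - 0.1) ** 2) - (params["max_depth"] - 3) ** 2


grid = {"learning_rate": [0.01, 0.1, 1.0], "max_depth": [2, 3, 4]}
best_params, best_score = grid_search(grid, toy_cv_score)
print(best_params, best_score)
```

The number of evaluations is the product of the list lengths (nine here), which is why grid search becomes expensive as more hyperparameters are added.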

9. Model Deployment

9.1. Steps: (1) deploy the model to the target environment; (2) expose it through an API or web application; (3) integrate with other systems.

9.2. Tooling: Amazon SageMaker, Azure ML, Docker, FastAPI, etc.

10. Monitoring & Feedback

10.1. Steps: (1) track model performance; (2) collect user feedback; (3) iterate and improve.

10.2. Tooling: Amazon SageMaker, MLflow, Azure ML, etc.
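
One simple way to track model performance after deployment is a rolling accuracy over the most recent predictions whose true outcome is known. The sketch below is an illustration only: the `RollingAccuracyMonitor` class, the window size, and the alert threshold are all assumptions, and a production setup would more likely rely on a platform such as those listed above.

```python
from collections import deque


class RollingAccuracyMonitor:
    """Track accuracy over the most recent labelled predictions."""

    def __init__(self, window=100, threshold=0.8):
        # deque(maxlen=...) silently discards the oldest outcome when full.
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold

    def record(self, prediction, actual):
        """Store whether a prediction matched the observed outcome."""
        self.outcomes.append(prediction == actual)

    @property
    def accuracy(self):
        """Accuracy within the window, or None if nothing is recorded yet."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_attention(self):
        """True when the rolling accuracy has dropped below the threshold."""
        acc = self.accuracy
        return acc is not None and acc < self.threshold


# Hypothetical stream of (prediction, actual) pairs.
monitor = RollingAccuracyMonitor(window=4, threshold=0.75)
for prediction, actual in [(1, 1), (0, 0), (1, 0), (0, 1), (1, 0)]:
    monitor.record(prediction, actual)

print(monitor.accuracy, monitor.needs_attention())
```

A drop below the threshold is a prompt to investigate (data drift, a broken upstream feature, changed user behaviour) and feeds the "iterate and improve" step.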