Data Science Roadmap
by Gabriel Caetano

1. Tools
1.1. Python
1.1.1. Pandas
1.1.2. Polars
1.1.3. NumPy
1.1.4. Matplotlib
1.1.5. SQLAlchemy
1.1.6. Requests
1.1.7. Seaborn
1.1.8. scikit-learn
1.1.9. Docker
1.1.10. XGBoost
1.1.11. Optuna
1.1.12. MLflow
1.1.13. Weights & Biases (W&B)
1.1.14. Decorators
1.1.15. TensorFlow
1.1.16. Keras
1.1.17. PyTorch
1.1.18. LightFM
1.1.19. Surprise
1.1.20. NetworkX
1.1.21. NLTK
1.1.22. TextBlob
1.1.23. spaCy
1.1.24. Gensim
1.1.25. Mlxtend
1.1.26. OpenCV
1.1.27. Prophet
1.2. R
1.2.1. dplyr
1.2.2. tidyr
1.2.3. ggplot2
1.2.4. forecast
1.3. Power BI
1.4. Gephi
1.5. BERT
1.6. SQL
1.7. AWS EC2/S3
1.8. SSH
1.9. Git
1.10. Spark/Hadoop/Hive
2. Applications
2.1. Forecasting
2.2. Data Mining
2.3. Anomaly detection
2.4. Pattern detection
2.4.1. Equipment failure detection
2.4.2. Consumption pattern identification
2.5. Personalized recommendation system
2.6. Social network analysis
2.7. Sentiment classification
2.8. Natural language processing
2.9. Association rule discovery
2.10. Object identification in images
3. Skills
3.1. Statistics and Probability
3.2. Machine Learning
3.2.1. Classification
3.2.2. Regression
3.2.3. Clustering
3.2.4. Pattern identification
3.2.5. Association identification
3.3. Storytelling
3.4. Big Data Tools
3.5. Cloud Computing
3.6. Machine Learning Hyperparameter Optimization
3.7. Experiment Tracking
3.8. Version Control
3.9. Data Access
3.10. Problem-solving
3.11. Communication
3.12. Collaboration
3.13. Ethical and Responsible AI
3.14. Curiosity and Continuous Learning
3.15. Exploratory Data Analysis
3.16. Software Engineering Principles
3.16.1. Clean coding
3.16.2. Testing