Awesome Data Engineering
by Lilia Lipatova
1. Julia
2. Logit
3. ML libs
3.1. High-level
3.1.1. Scikit-learn
3.1.2. Keras
3.1.3. Tensorforce
3.2. Low-level
3.2.1. TensorFlow
3.2.2. Theano
3.2.3. Caffe2
3.2.4. Torch
3.2.5. CNTK
4. Jupyter Notebook
4.1. Boosting / Ensembles
5. OS / Shell / Environment
5.1. Linux
5.2. Bash
6. Programming Languages
6.1. Python
6.2. R
6.3. Scala
7. Machine Learning Methods
7.1. Neural Networks
7.1.1. Convolutional
7.1.2. Recurrent, LSTM
7.2. Support Vector Machine (SVM)
7.3. Decision Trees
7.4. Reinforcement Learning
8. Machine Learning Areas
8.1. NLP
8.2. Picture
8.2.1. Style Transfer
8.2.2. Object Detection and Classification
8.2.2.1. Optical Character Recognition (OCR)
8.2.2.2. ImageNet / VGG16 / VGG19
8.3. Sound
8.3.1. Text-to-Speech
8.4. Speech recognition
9. Math
10. Druid
11. OLAP-specific
12. Working With Data
12.1. MapReduce Systems
12.1.1. Hadoop
12.1.2. YT
12.2. RDBMS-like
12.2.1. Google BigQuery
12.2.2. Amazon Redshift
12.2.3. Yandex ClickHouse
12.2.4. CockroachDB
12.3. PostgreSQL-based
12.3.1. Greenplum
12.3.2. Citus
12.4. NoSQL
12.4.1. Elasticsearch
12.5. MongoDB
12.6. BI / Querying / Reports
12.6.1. Kibana
12.6.1.1. Tableau
12.6.2. Metabase
12.6.2.1. Superset
12.6.3. Plotly
12.6.4. Redash
12.7. ETL
12.7.1. Splunk
12.7.2. Talend
12.7.3. Singer.io
13. Theory
13.1. Stats
13.2. Algorithms
14. DevOps
14.1. Continuous Integration
14.1.1. GitLab CI
14.1.2. Travis CI
14.1.3. Drone
14.1.4. TeamCity
14.1.5. Jenkins
14.1.6. Buildbot
14.2. Amazon Web Services
14.3. Google Cloud Platform
14.4. Docker
14.5. Kubernetes
15. Minimal "must-have" example (may vary)
15.1. Bash
15.2. Scikit-learn
15.2.1. Jupyter Notebook