1. Frameworks
1.1. PySpark
1.2. Dash
1.3. Flask / (blueprints)
1.4. uWSGI
1.5. nginx
1.6. Plotly
1.7. SciPy
1.8. NumPy
1.9. Ipywidgets :(
1.10. Spring
1.11. scikit-learn
1.12. PyMC3
1.13. docx
1.14. boto3
1.15. pandas
1.16. Jupyter
1.17. MLflow
1.18. Conda
1.19. Grafana
1.20. pytest
1.21. unittest
1.22. JUnit
1.23. cron
1.24. TensorFlow (?)
1.25. PyTorch (?)
2. Programming Languages
2.1. Python
2.1.1. PEP8
2.2. Java
2.3. Scala
2.4. Shell scripting
2.5. SQL
2.6. General Development Skills
2.6.1. Packaging
2.6.1.1. Gradle
2.6.1.2. setup.py
2.6.1.3. sbt
2.6.1.4. pipenv / poetry / pip
2.6.1.5. npm
2.6.2. OOP
2.6.3. FP
2.6.4. tests
2.6.5. design patterns
2.6.6. CI/CD
2.6.7. Version control system (Git)
2.6.8. User Interface Design
2.6.9. SemVer
2.7. JavaScript
2.8. HTML / CSS
2.9. RegEx
2.10. JMP / SAS JMP
3. Data Sources
3.1. PI
3.1.1. PI Web
3.1.2. SQL directly to PI
3.1.3. PI Clients / APIs
3.2. SAP
3.2.1. AAO
3.2.2. SAP HANA
3.2.3. SAP ENLIST
3.3. LIMS
3.4. Vibration data (PCH cloud / IOT)
3.4.1. The computer that collects IoT data before it goes to InfluxDB. Is it even our responsibility?
4. Data Analysis
4.1. Classical statistics
4.1.1. Statistical tests (ANOVA-like)
4.1.2. Correlation
4.1.3. PCA
4.2. Time Series Analysis
4.2.1. Spectral analysis
4.2.1.1. FFT
4.2.1.2. Wavelet analysis
4.2.2. Signal processing
4.2.2.1. Filtering
4.2.2.2. Interpolation
4.2.2.3. Feature Extraction
4.2.3. Deep learning approaches
4.2.3.1. Scalogram / spectrogram -> CNN
4.2.3.2. GADF / GASF -> CNN
4.2.3.3. RNN
4.2.3.4. LSTM
4.3. Machine learning
4.3.1. Models
4.3.1.1. Linear
4.3.1.2. Tree-based
4.3.1.3. Deep learning
4.3.1.4. Autoencoders
4.3.1.5. Outlier detection
4.3.2. Preprocessing
4.3.2.1. Outlier removal
4.3.2.2. Feature Selection
4.3.2.3. Feature Extraction
4.4. Data Visualization
4.4.1. Boxplots
4.4.2. Scatter plot
4.4.3. Time series
4.4.4. Heatmaps
5. Storage
5.1. SQL Server
5.2. S3
5.3. InfluxDB
5.4. DynamoDB
5.5. MySQL
5.6. SQLite
5.7. PostgreSQL
6. Data Engineering
6.1. Data Governance
6.2. DataOps
6.3. Pipelines
6.4. Stream Processing
6.5. Data warehouse layout
6.6. Snowflake schema, Star schema, Normalization
6.7. Big Data
7. AWS Services
7.1. EC2
7.2. S3
7.2.1. Encryption
7.3. Security
7.3.1. networking
7.3.2. policies
7.4. ECS / Fargate
7.5. IoT
7.6. Confluent
7.7. CloudWatch
7.8. Route53
7.9. Kinesis
7.10. SQS
8. Software
8.1. NiFi
8.2. Jenkins
8.3. Git
8.4. ELK
8.4.1. Beats
8.4.2. Kibana Dashboards
8.5. HAProxy
8.6. Kafka
8.7. Nexus
9. Domain
9.1. Fermentation
9.2. Recovery
9.3. Spray drying
9.4. Formulation
9.4.1. Granulation
9.4.2. Liquid/SCO
9.5. Predictive Maintenance
9.6. Data preprocessing
9.6.1. Normalization by 40s size
9.6.2. Filtering undesired batch groups
9.6.3. Batch-tracing
9.6.4. How to combine metadata tables
9.6.5. All the hardcoded metadata .csv files in Switchboard
9.7. Business case evaluation/finance
10. DigiPro
10.1. WebApps
10.1.1. Recovery-data-leads
10.1.2. Performance Review
10.1.3. Offspec
10.1.4. Ferm Productivity
10.2. Notebooks
10.2.1. Yggdrasil-X training notebooks
10.2.2. Yggdrasil X
10.2.3. Yggdrasil
10.3. Switchboard
10.4. Analytics Factory
10.5. Decanter Doctor
10.6. Fermentor Doctor
11. Orchestration
11.1. Terraform
11.2. Packer
11.3. Configuration
11.3.1. Docker
11.3.2. Ansible
12. Networking
12.1. IP protocol
12.2. routing
12.3. DNS
12.4. firewalls
12.5. REST
12.6. encryption (SSL/TLS)
12.7. JWT
13. Novozymes Enterprise knowledge
13.1. Organizational Structure / How the rest of the company works
13.2. PDA way of working / Agile
13.2.1. Jira
13.2.2. Confluence
13.2.3. Service Desk
13.3. MS Office package
13.4. Access mandates & who is supposed to see what?
13.5. MY AWESOME WINDOWS WORK STATION!!!
14. OS
14.1. Linux
14.1.1. Ubuntu
14.1.2. CentOS
14.1.3. Resource monitoring
14.1.4. OS tuning