Statistics for Data Science


1. Statistical Modelling and Inference (GSE)

1.1. Topic: Fundamentals of regression

1.2. Topic: Variable selection & penalized likelihood

1.3. Topic: Bayesian regression

1.4. Topic: Bayesian computation

1.5. Topic: Probabilistic supervised and unsupervised learning

1.6. Topic: Graphical models

1.7. Topic: Gaussian processes for regression and classification

1.8. Topic: Intro to multilevel and hierarchical models

2. Basic Probability and Mathematical Statistics (ebook: All of Statistics: A Concise Course in Statistical Inference)

2.1. Bootstrapping

2.2. Curve Estimation

2.3. Graphical Models

2.4. Statistical Inference

2.5. The first part of the text is concerned with probability theory, the formal language of uncertainty, which is the basis of statistical inference. The basic problem we study in probability is: given a data-generating process, what are the properties of the outcomes?

2.6. The second part of the book is about statistical inference and its close cousins, data mining and machine learning. The basic problem of statistical inference is the inverse of probability: given the outcomes, what can we say about the process that generated the data?

2.6.1. Prediction

2.6.2. Classification

2.6.3. Clustering

2.6.4. Estimation

2.6.5. Data analysis, machine learning, and data mining are various names given to the practice of statistical inference, depending on the context.
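
The two directions described in 2.5 and 2.6 can be illustrated with a minimal sketch. It assumes a toy data-generating process (a biased coin whose bias `true_p = 0.3` is chosen purely for illustration); the function names are hypothetical and not taken from the book.

```python
import random

def simulate_outcomes(p, n, seed=0):
    """Probability direction: given the process (a coin with bias p), generate outcomes."""
    rng = random.Random(seed)
    return [1 if rng.random() < p else 0 for _ in range(n)]

def estimate_p(outcomes):
    """Inference direction: given only the outcomes, estimate the process parameter."""
    return sum(outcomes) / len(outcomes)

true_p = 0.3  # assumed value, for illustration only
outcomes = simulate_outcomes(true_p, n=10_000)
p_hat = estimate_p(outcomes)
print(f"true p = {true_p}, estimated p = {p_hat:.3f}")
```

The estimate is the sample proportion, which is the maximum likelihood estimate for this simple model; it approaches `true_p` as the number of outcomes grows.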

3. The foundations of probability theory: random variables and distributions

4. Application: estimating a project's duration with Monte Carlo simulation
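
A minimal sketch of this kind of simulation, under stated assumptions: the three tasks and their durations are hypothetical, the tasks run strictly one after another, and each duration follows a triangular distribution defined by optimistic, most likely, and pessimistic estimates. Other distributions or task dependencies would need a different model.

```python
import random
import statistics

# Hypothetical sequential tasks: (optimistic, most likely, pessimistic) durations in days.
TASKS = {
    "design": (2, 4, 8),
    "build": (3, 5, 10),
    "test": (1, 2, 4),
}

def simulate_project(tasks, n_runs=20_000, seed=42):
    """Draw one duration per task per run and sum them into a total project duration."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_runs):
        # random.triangular takes (low, high, mode), in that order.
        total = sum(rng.triangular(low, high, mode)
                    for low, mode, high in tasks.values())
        totals.append(total)
    return totals

totals = simulate_project(TASKS)
mean_days = statistics.mean(totals)
# quantiles(n=10) returns the nine decile cut points; the last is the 90th percentile.
p90_days = statistics.quantiles(totals, n=10)[-1]
print(f"mean: {mean_days:.1f} days, 90th percentile: {p90_days:.1f} days")
```

Reporting a high percentile alongside the mean gives a duration the project is unlikely to exceed, which is usually more useful for planning than a single point estimate.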

5. Using linear regression to understand and predict how the market prices real estate
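
As a rough illustration, the sketch below fits a one-variable least-squares line to a small synthetic dataset. The areas and prices are made up for the example and the single-feature model is an assumption; a realistic pricing model would use many more features and observations.

```python
# Synthetic toy data (not real market data): floor area in m^2, price in thousands.
areas = [50, 70, 90, 110, 130]
prices = [205, 255, 320, 385, 435]

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(areas, prices)
predicted = intercept + slope * 100  # predicted price for a 100 m^2 property
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, prediction = {predicted:.1f}")
```

The slope supports the "understand" goal (the estimated price change per additional square metre), while the fitted line supports the "predict" goal.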

6. Using logistic regression to understand and predict consumer behavior
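
A minimal sketch of the idea, assuming a single hypothetical feature (number of site visits) and a binary outcome (whether the customer bought). The data are synthetic, and plain gradient descent on the log-loss is used here only for transparency; in practice a statistics library would normally do the fitting.

```python
import math

# Synthetic toy data (not real customers): site visits and whether a purchase followed.
visits = [1, 2, 3, 4, 5, 6, 7, 8]
bought = [0, 0, 0, 1, 0, 1, 1, 1]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, n_iter=5000):
    """Fit P(y=1 | x) = sigmoid(w * x + b) by gradient descent on the mean log-loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # gradient of the log-loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

w, b = fit_logistic(visits, bought)
p_low = sigmoid(w * 1 + b)   # estimated purchase probability after 1 visit
p_high = sigmoid(w * 8 + b)  # estimated purchase probability after 8 visits
print(f"w = {w:.2f}, b = {b:.2f}, P(buy | 1 visit) = {p_low:.2f}, P(buy | 8 visits) = {p_high:.2f}")
```

A positive `w` means the estimated odds of a purchase rise with each additional visit, which is the interpretive side of the model; the fitted probabilities are the predictive side.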

7. Statistical learning refers to a set of tools for modeling and understanding complex datasets

7.1. The field encompasses many methods such as the lasso and sparse regression, classification and regression trees, and boosting and support vector machines.