Multivariate Geostats

1. Getting ready

1.1. Data

1.1.1. Peru's "districts"

1.2. PreProcessing

1.2.1. for clustering

1.2.1.1. rescaling

1.2.1.1.1. control outliers
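One way to rescale while keeping outliers under control is a robust scaler: center on the median, divide by the interquartile range, and clip what remains. The sketch below is only an illustration; the function name `robust_rescale`, the clip threshold, and the sample values are assumptions, not part of the course material.

```python
import numpy as np

def robust_rescale(x, clip=3.0):
    """Center on the median, scale by the IQR, then clip extreme values."""
    x = np.asarray(x, dtype=float)
    median = np.median(x)
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    if iqr == 0:
        # A constant (or near-constant) variable carries no spread to rescale.
        return np.zeros_like(x)
    z = (x - median) / iqr
    return np.clip(z, -clip, clip)

# Hypothetical indicator with one extreme district.
values = [10, 12, 11, 13, 12, 500]
scaled = robust_rescale(values)
```

Because the median and IQR barely move when one value is extreme, the outlier ends up capped at the clip threshold instead of compressing every other district toward zero.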

1.2.2. for both clustering and regression

1.2.2.1. same direction (monotonicity) of variable measurement

1.2.2.1.1. reversing may be needed

1.2.2.2. meaningful variable names

1.2.2.2.1. recoding may be needed
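The two preprocessing steps above can be sketched as follows. The variable codes, the new names, and the values are hypothetical; the only point is that, after reversing, every variable reads "higher = better" and carries a name that explains itself.

```python
def reverse(values):
    """Flip a variable so that high becomes low, keeping its original range."""
    lo, hi = min(values), max(values)
    return [hi + lo - v for v in values]

# Hypothetical district indicators with uninformative codes.
raw = {
    "v1": [0.9, 0.5, 0.7],      # share of households with water (higher = better)
    "v2": [10.0, 40.0, 25.0],   # poverty rate in % (higher = worse)
}

# Recode to meaningful names and align the direction of measurement.
clean = {
    "water_access": raw["v1"],
    "non_poverty": reverse(raw["v2"]),   # now higher = better
}
```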

1.3. The Seed

1.3.1. keep the seed set, so you will see the same results I show
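A minimal sketch of why the seed matters, assuming NumPy is the source of randomness. The value of `SEED` is hypothetical; use whatever value the course material sets.

```python
import numpy as np

SEED = 12345  # hypothetical value, not the one used in the course

# Two generators started from the same seed produce the same draws.
rng_a = np.random.default_rng(SEED)
rng_b = np.random.default_rng(SEED)

draw_a = rng_a.normal(size=5)
draw_b = rng_b.normal(size=5)
same = np.array_equal(draw_a, draw_b)
```

Algorithms that start from random initial positions, such as KMeans, return the same clusters on every run only when the seed is fixed.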

2. Clustering

2.1. Conventional

2.1.1. Grouping

2.1.1.1. WHAT

2.1.1.1.1. Multiple cases (rows)

2.1.1.2. LOOKING FOR

2.1.1.2.1. Homogeneity

2.1.1.2.2. Heterogeneity

2.1.2. several techniques

2.1.2.1. clusters all cases in the data

2.1.2.1.1. for example: **KMEANS**
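To make the idea concrete, here is a bare-bones version of the standard KMeans iteration (Lloyd's algorithm) written with NumPy. It is a teaching sketch under simplifying assumptions (random initial centers, Euclidean distance, tiny hypothetical data), not the implementation used in the course.

```python
import numpy as np

def kmeans(X, k, seed=0, n_iter=100):
    """Assign every row of X to one of k clusters."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Start from k distinct rows chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every row to every center, then nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its members (keep it if empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Two well-separated hypothetical groups of cases.
X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centers = kmeans(X, k=2)
```

Every case receives a label, which is what "clusters all cases in the data" means in practice.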

2.1.2.2. clusters what "makes sense"

2.1.2.2.1. may leave cases isolated

2.1.3. statistically coherent

2.1.4. supportive of policy

2.1.4.1. profiling the clusters

2.1.4.1.1. generally straightforward

2.2. Spatial Clustering / **REGIONALIZATION**

2.2.1. Grouping

2.2.1.1. WHAT

2.2.1.2. LOOKING FOR

2.2.1.3. BUT....

2.2.1.3.1. forces contiguity/proximity
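The contiguity requirement can be checked directly: a region is valid only if all its members are connected through neighbors that belong to the same region. The helper below is a sketch; `is_contiguous` and the chain of five districts are hypothetical names and data chosen for illustration.

```python
from collections import deque

def is_contiguous(members, neighbors):
    """True if all members are mutually reachable through member neighbors."""
    members = set(members)
    if not members:
        return True
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    # Breadth-first search that never leaves the candidate region.
    while queue:
        unit = queue.popleft()
        for nb in neighbors[unit]:
            if nb in members and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == members

# Hypothetical chain of five districts: A - B - C - D - E
neighbors = {
    "A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["D"],
}
ok = is_contiguous(["A", "B"], neighbors)    # adjacent districts
bad = is_contiguous(["A", "C"], neighbors)   # B separates them
```

A conventional algorithm could happily group A and C if their attributes are similar; a regionalization algorithm rejects that grouping.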

2.2.2. statistically coherent, but...

2.2.2.1. geographical coherence is more important

2.2.3. For all techniques (in this course)

2.2.3.1. input

2.2.3.1.1. K: The target number of contiguous regions.

2.2.3.1.2. W (spatial weights matrix)

2.2.3.1.3. data (**D**)

2.2.3.1.4. dissimilarity/similarity matrix
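The four inputs listed above can be assembled by hand on a toy example. This sketch assumes a regular 2 x 2 grid of spatial units, rook contiguity (shared edges), and Euclidean distance as the dissimilarity; the function names and the data are hypothetical, and the course itself may build these objects with a dedicated library.

```python
import numpy as np

def rook_weights(n_rows, n_cols):
    """Binary contiguity matrix for a grid: 1 if two cells share an edge."""
    n = n_rows * n_cols
    W = np.zeros((n, n))
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            if c + 1 < n_cols:            # neighbor to the right
                W[i, i + 1] = W[i + 1, i] = 1
            if r + 1 < n_rows:            # neighbor below
                W[i, i + n_cols] = W[i + n_cols, i] = 1
    return W

def dissimilarity(X):
    """Pairwise Euclidean distances between the rows of X."""
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

K = 2                                   # target number of regions
W = rook_weights(2, 2)                  # spatial weights matrix
D = np.array([[1.0, 2.0], [1.5, 1.8],   # data: one row per spatial unit
              [8.0, 9.0], [9.0, 8.5]])
dist = dissimilarity(D)                 # dissimilarity matrix
```

W says who can be grouped with whom; the dissimilarity matrix says who should be grouped with whom.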

2.2.3.2. output

2.2.3.2.1. cluster

2.2.3.3. profiling

2.2.3.3.1. how to interpret the cluster label

2.2.4. regionalization algorithm (in this course)

2.2.4.1. Spatial KMeans (**RegionKMeansHeuristic**)

2.2.4.1.1. process

2.2.4.2. Automatic Zoning Procedure (**AZP**)

2.2.4.2.1. process

2.2.4.3. Spatial 'K'luster Analysis by Tree Edge Removal (**SKATER**)

2.2.4.3.1. process

2.3. Non-Spatial vs Spatial trade-off

2.3.1. supportive of policy

2.3.1.1. planning

2.3.1.2. allocation

2.3.1.3. intervention

2.3.2. Compactness

2.3.2.1. The isoperimetric quotient (IPQ)

2.3.2.1.1. from 0 to 1 (1 is best)

2.3.2.1.2. It penalizes long, wiggly, or highly elongated shapes
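The isoperimetric quotient compares a shape's area A with the area of a circle that has the same perimeter P, which gives IPQ = 4πA / P². A small sketch for simple polygons, using hypothetical shapes and function names:

```python
import math

def polygon_area_perimeter(coords):
    """Area (shoelace formula) and perimeter of a simple polygon."""
    area2 = 0.0
    perimeter = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]   # wrap around to close the ring
        area2 += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(area2) / 2.0, perimeter

def ipq(coords):
    """Isoperimetric quotient: 1 for a circle, near 0 for thin shapes."""
    area, perimeter = polygon_area_perimeter(coords)
    return 4.0 * math.pi * area / perimeter ** 2

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
strip = [(0, 0), (10, 0), (10, 0.1), (0, 0.1)]   # same area, very elongated

ipq_square = ipq(square)
ipq_strip = ipq(strip)
```

Both shapes have area 1, but the strip needs a much longer boundary to enclose it, so its IPQ is far lower.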

2.3.2.2. Convex hull ratio (CHR)

2.3.2.2.1. from 0 to 1 (1 is best)

2.3.2.2.2. It penalizes "pitted" or "re-entrant" shapes where the boundary folds inward
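The convex hull ratio divides the area of a shape by the area of its convex hull, so any inward fold lowers the score. The sketch below uses a plain-Python hull (Andrew's monotone chain) to stay dependency-free; the function names and the L-shaped example are hypothetical.

```python
def shoelace_area(coords):
    """Area of a simple polygon given as an ordered list of vertices."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def convex_hull(points):
    """Convex hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def convex_hull_ratio(coords):
    """Area of the shape divided by the area of its convex hull."""
    return shoelace_area(coords) / shoelace_area(convex_hull(coords))

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
l_shape = [(0, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]  # re-entrant corner

chr_square = convex_hull_ratio(square)
chr_l = convex_hull_ratio(l_shape)
```

The square fills its hull completely. The L-shape has area 3 inside a hull of area 3.5, so the notch costs it part of the score.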

2.3.3. Homogeneity

2.3.3.1. Silhouette (SIL) Score

2.3.3.1.1. It measures how similar a data point is to its own cluster compared to other clusters.
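For each case the silhouette is (b - a) / max(a, b), where a is the mean distance to the other members of its own cluster and b is the mean distance to the members of the nearest other cluster. Scores range from -1 to 1, and higher means tighter, better-separated clusters. A NumPy sketch with hypothetical data and a hypothetical function name:

```python
import numpy as np

def silhouette_mean(X, labels):
    """Average silhouette over all cases, using Euclidean distance."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # a singleton cluster scores 0 by convention
        # Mean distance to own cluster, excluding the case itself.
        a = dist[i, same].sum() / (n_same - 1)
        # Mean distance to the closest other cluster.
        b = min(dist[i, labels == other].mean()
                for other in np.unique(labels) if other != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()

X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
good = silhouette_mean(X, [0, 0, 0, 1, 1, 1])   # labels follow the groups
bad = silhouette_mean(X, [0, 1, 0, 1, 0, 1])    # labels ignore the groups
```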

2.3.4. Heterogeneity

2.3.4.1. Calinski-Harabasz (CH) Score

2.3.4.1.1. measures the quality of clusters by comparing the between-cluster dispersion (how separated the clusters are) to the within-cluster dispersion (how tight the clusters are)
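Written out, the score is CH = [B / (k - 1)] / [W / (n - k)], where B is the between-cluster sum of squares, W the within-cluster sum of squares, k the number of clusters and n the number of cases. A NumPy sketch on hypothetical data:

```python
import numpy as np

def calinski_harabasz(X, labels):
    """Ratio of between-cluster to within-cluster dispersion."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    n = len(X)
    groups = np.unique(labels)
    k = len(groups)
    overall = X.mean(axis=0)
    between = 0.0
    within = 0.0
    for g in groups:
        members = X[labels == g]
        center = members.mean(axis=0)
        # Separation: cluster center versus the overall center.
        between += len(members) * ((center - overall) ** 2).sum()
        # Tightness: members versus their own center.
        within += ((members - center) ** 2).sum()
    return (between / (k - 1)) / (within / (n - k))

X = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
ch_good = calinski_harabasz(X, [0, 0, 0, 1, 1, 1])
ch_bad = calinski_harabasz(X, [0, 1, 0, 1, 0, 1])
```

There is no fixed upper bound; the score is useful for comparing alternative solutions on the same data, where higher is better.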

2.3.5. further steps

2.3.5.1. profiling

3. Regression

3.1. Conventional OLS

3.1.1. predict/explain

3.1.1.1. average behavior of a variable (the dependent variable)

3.1.1.2. from the behavior of other variables (independent variables)

3.1.1.2.1. predictors (covariates)

3.1.1.2.2. control variables

3.1.2. returns

3.1.2.1. an equation

3.1.2.2. residuals

3.1.2.3. results for interpretation

3.1.2.3.1. Model Fit (comparability)

3.1.2.3.2. Regression Table

3.1.2.4. diagnostics

3.1.2.4.1. No multicollinearity

3.1.2.4.2. Normality of residuals

3.1.2.4.3. Homoscedasticity
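What OLS returns can be reproduced with a few lines of NumPy: the coefficients of the equation, the residuals, and a model-fit measure. This is a sketch on made-up numbers; a statistical package would add the full regression table and the diagnostic tests.

```python
import numpy as np

def ols(y, X):
    """Least-squares fit with an intercept; returns beta, residuals, R2."""
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(y), -1)
    design = np.column_stack([np.ones(len(y)), X])   # add the intercept
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ beta
    residuals = y - fitted
    r2 = 1.0 - (residuals ** 2).sum() / ((y - y.mean()) ** 2).sum()
    return beta, residuals, r2

# Hypothetical data: one predictor, five observations.
x = [1, 2, 3, 4, 5]
y = [5.1, 7.9, 11.0, 14.2, 16.8]

beta, residuals, r2 = ols(y, x)   # beta[0] = intercept, beta[1] = slope
```

The residuals are the raw material for the diagnostics listed above, and later for checking whether location matters.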

3.1.3. key assumptions

3.1.3.1. independence of errors

3.1.3.2. independence of observations

3.2. Spatial regression

3.2.1. if location matters, conventional regression violates two assumptions

3.2.1.1. first step: **Durbin Joint Test**

3.2.1.1.1. is OLS (conventional regression) statistically flawed?

3.2.2. IMAGINE

3.2.2.1. FOCAL

3.2.2.1.1. ARIZONA

3.2.2.2. Y

3.2.2.2.1. crime

3.2.2.3. X

3.2.2.3.1. unemployment
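The spatial models that follow are all built from spatial lags: for the focal unit, the average of a variable among its neighbors. The sketch below computes them for Arizona using a row-standardized weights matrix. The neighbor list is restricted to five states for brevity, and the crime and unemployment figures are invented for illustration.

```python
import numpy as np

states = ["Arizona", "California", "Nevada", "Utah", "New Mexico"]

# Shared borders within this subset of states only.
neighbors = {
    "Arizona": ["California", "Nevada", "Utah", "New Mexico"],
    "California": ["Arizona", "Nevada"],
    "Nevada": ["Arizona", "California", "Utah"],
    "Utah": ["Arizona", "Nevada"],
    "New Mexico": ["Arizona"],
}

index = {name: i for i, name in enumerate(states)}
W = np.zeros((len(states), len(states)))
for name, nbs in neighbors.items():
    for nb in nbs:
        W[index[name], index[nb]] = 1.0
W = W / W.sum(axis=1, keepdims=True)   # row-standardize: rows sum to 1

crime = np.array([5.0, 4.0, 6.0, 2.0, 8.0])          # Y, hypothetical
unemployment = np.array([4.0, 5.0, 6.0, 3.0, 6.0])   # X, hypothetical

lag_crime = W @ crime                 # WY: mean crime among neighbors
lag_unemployment = W @ unemployment   # WX: mean unemployment among neighbors

arizona_wy = lag_crime[index["Arizona"]]
arizona_wx = lag_unemployment[index["Arizona"]]
```

Each spatial model asks whether one of these neighbor averages, or the neighbors' unobserved shocks, helps explain crime in the focal state.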

3.2.3. THEN: simple models (one term at a time)

3.2.3.1. SAR REGRESSION

3.2.3.1.1. pay attention

3.2.3.1.2. what may affect the dependent variable?

3.2.3.2. SLX REGRESSION

3.2.3.2.1. pay attention

3.2.3.2.2. what may affect the dependent variable?

3.2.3.3. SEM REGRESSION

3.2.3.3.1. pay attention

3.2.3.3.2. what may affect the dependent variable?
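In their usual textbook notation, the three simple models each add one spatial term to the OLS equation. Here y is the dependent variable (crime), X the predictors (unemployment), W the spatial weights matrix, and ε an independent error; the symbols ρ, θ and λ follow common convention and are assumed here rather than taken from the course.

```latex
\begin{aligned}
\text{SAR:}\quad & y = \rho W y + X\beta + \varepsilon \\
\text{SLX:}\quad & y = X\beta + W X\theta + \varepsilon \\
\text{SEM:}\quad & y = X\beta + u, \qquad u = \lambda W u + \varepsilon
\end{aligned}
```

Read with the Arizona example: in SAR, crime in neighboring states enters the equation for Arizona's crime; in SLX, unemployment in neighboring states does; in SEM, only the unobserved shocks of the neighbors spill over.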

3.2.4. or COMBINATIONS

3.2.4.1. SDM REGRESSION

3.2.4.1.1. pay attention

3.2.4.1.2. what may affect the dependent variable?

3.2.4.2. SAC REGRESSION

3.2.4.2.1. pay attention

3.2.4.2.2. what may affect the dependent variable?

3.2.4.3. SLX-ERROR

3.2.4.3.1. pay attention

3.2.4.3.2. what may affect the dependent variable?
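The combined models pair two spatial terms at a time. In the usual notation (y the dependent variable, X the predictors, W the spatial weights matrix, ε an independent error; ρ, θ and λ are conventional symbols assumed here):

```latex
\begin{aligned}
\text{SDM:}\quad & y = \rho W y + X\beta + W X\theta + \varepsilon \\
\text{SAC:}\quad & y = \rho W y + X\beta + u, \qquad u = \lambda W u + \varepsilon \\
\text{SLX-Error:}\quad & y = X\beta + W X\theta + u, \qquad u = \lambda W u + \varepsilon
\end{aligned}
```

SDM combines the lagged dependent variable with lagged predictors, SAC combines the lagged dependent variable with a spatial error, and SLX-Error combines lagged predictors with a spatial error.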

3.2.5. WHAT ABOUT?

3.2.5.1. Total COMBO

3.2.5.1.1. technically

3.2.5.1.2. operationally

3.2.5.1.3. from policy perspective

3.2.5.2. Interpreting effects of variables in the model

3.2.5.2.1. ρ is not significant

3.2.5.2.2. ρ is significant
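A common reading of this split, stated here as an assumption about the intended point: when ρ is not significant, the coefficients can be read much as in OLS; when ρ is significant, a change in X feeds back through the neighbors, so β alone is no longer the marginal effect. For a model with a lagged dependent variable, the usual summary separates direct, indirect (spillover) and total effects using the matrix (I - ρW)⁻¹β. The sketch below uses a hypothetical ring of four units and made-up values for ρ and β.

```python
import numpy as np

def sar_effects(rho, beta, W):
    """Average direct, indirect and total effects of one predictor."""
    n = W.shape[0]
    # Each entry (i, j): effect on unit i of a unit change in x at unit j.
    S = np.linalg.inv(np.eye(n) - rho * W) * beta
    direct = np.trace(S) / n    # own-unit effects, averaged
    total = S.sum() / n         # all effects received, averaged over units
    indirect = total - direct   # spillovers from other units
    return direct, indirect, total

# Hypothetical ring of four units, row-standardized (two neighbors each).
W = np.array([
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
])

direct, indirect, total = sar_effects(rho=0.5, beta=2.0, W=W)
no_rho = sar_effects(rho=0.0, beta=2.0, W=W)
```

With a row-standardized W the total effect equals β / (1 - ρ), so here a coefficient of 2 becomes a total effect of 4. With ρ = 0 the direct effect is just β and the spillover vanishes.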