Session 2

1. Data

1.1. Types

1.1.1. Numeric

1.1.1.1. Real

1.1.1.2. Integer

1.1.2. Categorical

1.1.2.1. dichotomous

1.1.2.2. polytomous

1.1.2.2.1. nominal

1.1.2.2.2. ordinal

1.1.3. Logical

1.1.4. String

1.1.5. Date
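
As a rough illustration of the types listed above, the sketch below stores one hypothetical survey record with a field for each type, using only Python built-ins. The field names, values, and the `ordinal_rank` helper are invented for this example; the point is only that ordinal categories need an explicit order, while nominal ones do not.

```python
from datetime import date

# Hypothetical survey record: one field per data type in the outline.
record = {
    "income": 1520.75,                 # numeric, real
    "children": 2,                     # numeric, integer
    "employed": "yes",                 # categorical, dichotomous (yes/no)
    "region": "north",                 # categorical, polytomous, nominal
    "education": "secondary",          # categorical, polytomous, ordinal
    "consented": True,                 # logical
    "comment": "No complaints",        # string
    "interviewed": date(2023, 5, 14),  # date
}

# Ordinal levels carry an order, so it has to be written down explicitly.
EDUCATION_LEVELS = ["primary", "secondary", "tertiary"]


def ordinal_rank(value, levels):
    """Return the position of a value within its ordered levels."""
    return levels.index(value)


# Name of the Python type used to store each field.
python_types = {name: type(value).__name__ for name, value in record.items()}
```

Note that the categorical fields are stored as plain strings here, so the distinction between string and categorical data lives in how the values are interpreted, not in the storage type.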

1.2. Structures

1.2.1. Table

1.2.1.1. the DataFrame

1.2.1.1.1. Raw data

1.2.1.1.2. Frequency Tables

1.2.1.1.3. Contingency Tables

1.2.1.2. Insight from

1.2.1.2.1. Univariate Behavior

1.2.1.2.2. Association

1.2.1.2.3. Complex Association
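
The three table forms above (raw data, frequency table, contingency table) can be sketched with the standard library alone. The rows and the helper names `frequency_table` and `contingency_table` are made up for illustration; a real workflow would more likely use a DataFrame library.

```python
from collections import Counter

# Hypothetical raw data: one row per case, one key per variable.
rows = [
    {"sex": "F", "smoker": "no"},
    {"sex": "M", "smoker": "yes"},
    {"sex": "F", "smoker": "no"},
    {"sex": "M", "smoker": "no"},
    {"sex": "F", "smoker": "yes"},
]


def frequency_table(rows, column):
    """Count how often each value of one variable appears (univariate)."""
    return Counter(row[column] for row in rows)


def contingency_table(rows, row_var, col_var):
    """Count how often each pair of values appears (association)."""
    return Counter((row[row_var], row[col_var]) for row in rows)


freq = frequency_table(rows, "sex")
cross = contingency_table(rows, "sex", "smoker")
```

The frequency table describes univariate behavior, and the contingency table is the starting point for looking at association between two categorical variables.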

1.2.2. Map

1.2.2.1. spatial layers

1.2.2.2. Insight from

1.2.2.2.1. Distance

1.2.2.2.2. Location

1.2.2.2.3. neighborhood
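
One way to obtain the distance insight from a map is the great-circle distance between two points given as latitude and longitude. The sketch below uses the haversine formula and assumes a spherical Earth with a mean radius of about 6371 km; the function name `haversine_km` is chosen for this example, and dedicated spatial libraries offer more precise methods.

```python
import math

EARTH_RADIUS_KM = 6371.0  # approximate mean radius; assumes a spherical Earth


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

For example, `haversine_km(0, 0, 0, 90)` covers a quarter of the equator under this spherical assumption.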

1.2.3. Graph

1.2.3.1. networks

1.2.3.2. Insight from

1.2.3.2.1. Roles

1.2.3.2.2. Communities
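
The two graph insights above can be approximated with very simple measures. In the sketch below, the number of connections (degree) stands in for a node's role, and connected components stand in for communities. Both are deliberate simplifications chosen for this example, and the edge list is invented; real network analysis typically uses richer centrality and community-detection methods.

```python
from collections import defaultdict

# Hypothetical undirected network given as a list of edges.
edges = [
    ("ana", "ben"), ("ana", "carl"), ("ben", "carl"),
    ("carl", "dina"), ("eva", "fred"),
]


def build_adjacency(edges):
    """Map each node to the set of its neighbors."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def degrees(adjacency):
    """Number of neighbors per node: a crude proxy for a node's role."""
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


def connected_components(adjacency):
    """Groups of mutually reachable nodes: a crude proxy for communities."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


adjacency = build_adjacency(edges)
node_degrees = degrees(adjacency)
groups = connected_components(adjacency)
```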

1.2.4. Text

1.2.4.1. Text Analytics

1.2.4.2. Insight from

1.2.4.2.1. Similarity

1.2.4.2.3. Topic Salience

1.2.4.2.4. Meaning
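
For the similarity insight, one simple measure is the Jaccard index: the share of distinct words two texts have in common. The sketch below is only one of many possible measures, and the `tokenize` helper (lowercase, ASCII letters only) is an assumption made for this example.

```python
import re


def tokenize(text):
    """Split text into lowercase words made of the letters a-z."""
    return re.findall(r"[a-z]+", text.lower())


def jaccard_similarity(text_a, text_b):
    """Shared distinct words divided by all distinct words in both texts."""
    words_a = set(tokenize(text_a))
    words_b = set(tokenize(text_b))
    if not words_a and not words_b:
        return 1.0  # two empty texts are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)
```

With this definition, `jaccard_similarity("The cat sat", "the cat ran")` shares two of four distinct words. The measure ignores word order and meaning, so it speaks only to surface similarity.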

2. Plot Components

2.1. Title

2.1.1. Main: "The title is the answer"

2.1.2. Sub: just for the specifics (place, time, etc.)

2.2. Axes

2.2.1. Includes Zero?

2.2.1.1. Always in bars

2.2.1.2. May be optional in lines

2.2.2. Units of measurement

2.2.3. annotations might make them irrelevant

2.2.3.1. improves the data-ink ratio

2.3. Caption

2.3.1. Source: cite!

2.3.2. Some technical notes

2.3.2.1. for example:

2.4. Annotations

2.4.1. not for all values

2.4.2. a tool for focus

2.4.3. can be

2.4.3.1. text

2.4.3.2. line

2.4.3.3. rectangle

2.4.3.4. picture

2.5. Background

2.5.1. Grid

2.5.1.1. complexity

2.5.2. Margins

2.5.2.1. flexibility

2.6. Legend

2.6.1. helps

2.6.2. not much help
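
The plot components above can be put together in a short sketch. It assumes matplotlib is installed and uses made-up values; the function name `build_plot`, the labels, and the wording of the title and caption are all invented for illustration. It shows a main title that states the answer, a subtitle with the specifics, a bar axis that starts at zero with its unit, a single annotation used for focus, no grid, and a caption that cites the source.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt


def build_plot():
    # Made-up values, for illustration only.
    labels = ["North", "South", "East"]
    values = [12, 30, 18]
    positions = list(range(len(labels)))

    fig, ax = plt.subplots(figsize=(6, 4))
    # Margins: leave room for the title above and the caption below.
    fig.subplots_adjust(top=0.82, bottom=0.2)

    ax.bar(positions, values, color="lightgray")
    ax.set_xticks(positions)
    ax.set_xticklabels(labels)

    # Title: the main title is the answer; the subtitle gives the specifics.
    fig.suptitle("The South leads in cases", fontsize=14)
    ax.set_title("Hypothetical country, 2023", fontsize=10)

    # Axes: bars always include zero, and the unit of measurement is stated.
    ax.set_ylim(0, max(values) * 1.2)
    ax.set_ylabel("Cases (thousands)")

    # Annotation: not for all values, only the one that deserves focus.
    top = values.index(max(values))
    ax.annotate(str(values[top]),
                xy=(positions[top], values[top]),
                xytext=(positions[top], values[top] + 2),
                ha="center")

    # Background: no grid, to keep complexity low.
    ax.grid(False)

    # Caption: cite the source.
    fig.text(0.01, 0.02, "Source: made-up data for illustration",
             ha="left", fontsize=8)
    return fig, ax
```

Calling `fig, ax = build_plot()` followed by `fig.savefig("plot.png")` would write the figure to a file.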

3. Encoding

3.1. color

3.1.1. quantities

3.1.2. categories

3.1.3. levels

3.2. shape

3.2.1. encoding: category

3.3. position

3.3.1. variables relationship

3.3.2. a spatial location

3.8.1. density

3.10. others

3.10.1. thickness

3.10.2. transparency

3.10.3. density

5. Purpose and Bias

5.1. Reduce complexity

5.1.1. what is left out?

5.1.1.1. financial reasons

5.1.1.2. political pressure

5.1.1.3. limited knowledge

5.1.1.4. institutional inertia

5.2. Represent data

5.2.1. How much data?

5.2.1.1. all cases (rows) but incomplete data (columns)

5.2.1.2. complete data of ...

5.2.1.2.1. incomplete cases

5.2.1.2.2. wrong cases

5.3. Benefit from the ease of use of visualization software

5.3.1. can we be good at DataViz software and produce bad visuals?

5.3.2. simple is beautiful, especially with humans

5.3.2.1. The less ink/alterations/variety the better

5.3.2.1.1. people will add information on their own

5.3.2.2. people know what correlation is...

5.3.2.2.1. but they always see causality

5.3.2.3. massive data

5.3.2.3.1. creates normality