1. Format Spatial Data
1.1. determine projection
1.1.1. assign right projection
1.1.2. reproject if needed
1.2. set geometry (if not set)
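A minimal sketch of step 1 in Python, assuming the third-party geopandas library is installed. The point names, coordinates, and the two EPSG codes are illustrative assumptions, not values from this outline. Geometry is built first here because geopandas stores the projection (CRS) on the geometry column.

```python
import geopandas as gpd
import pandas as pd

# Hypothetical table with plain longitude/latitude columns.
df = pd.DataFrame(
    {"name": ["A", "B"], "lon": [-77.03, -76.95], "lat": [-12.05, -12.10]}
)

# Set the geometry (it was not set: df is a plain DataFrame).
gdf = gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df["lon"], df["lat"]))

# Determine the projection: a freshly built GeoDataFrame has none.
had_crs = gdf.crs is not None

# Assign the right projection. set_crs only labels the coordinates;
# it does not move them. EPSG:4326 (lon/lat in degrees) is assumed here.
gdf = gdf.set_crs("EPSG:4326")

# Reproject if needed. to_crs transforms the coordinates.
# EPSG:32718 (UTM zone 18S, metres) is an assumed target.
projected = gdf.to_crs("EPSG:32718")
print(projected.crs)
```

Assigning and reprojecting are different operations: assigning the wrong CRS mislabels the data, and reprojecting afterwards will not repair it.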
2. Format Graph Data
2.1. Input
2.1.1. matrix (n by n)
2.1.2. edgelist (m x (2+j): one row per edge, 2 endpoint columns plus j attribute columns)
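The two input shapes carry the same information, so one can be built from the other. A small sketch in plain Python, assuming an undirected graph and a single attribute column (a weight); the node names and weights are made up for illustration.

```python
# Edge list: one row per edge, 2 endpoint columns plus 1 attribute column.
edges = [("a", "b", 1.0), ("b", "c", 2.5), ("a", "c", 0.5)]

# Collect the nodes and give each one a row/column position.
nodes = sorted({node for u, v, _ in edges for node in (u, v)})
index = {node: position for position, node in enumerate(nodes)}

# Adjacency matrix: n by n, where n is the number of nodes.
n = len(nodes)
matrix = [[0.0] * n for _ in range(n)]
for u, v, weight in edges:
    matrix[index[u]][index[v]] = weight
    matrix[index[v]][index[u]] = weight  # mirror the entry: undirected graph assumed

for node, row in zip(nodes, matrix):
    print(node, row)
```

For a directed graph, drop the mirrored assignment so the matrix is no longer forced to be symmetric.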
3. You are **assuming** the data is clean; verify it
3.1. Inspect the DATA TYPES
3.1.1. R: **str()**
3.1.2. Python: **.info()**
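A quick sketch of the Python check, assuming pandas is installed. The column names and values are invented for illustration. A column that should be numeric but is reported as text is the usual sign that the data is not clean.

```python
import io

import pandas as pd

# Hypothetical table: "income" looks numeric but was read as text.
df = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "income": ["1200", "1,500", "n/a"],
        "date": ["2024-01-05", "2024-02-10", "2024-03-15"],
    }
)

# .info() prints to the screen by default; a buffer captures the report.
buffer = io.StringIO()
df.info(buf=buffer)
print(buffer.getvalue())

# The same information as a Series, handy for programmatic checks.
print(df.dtypes)
```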
4. Format numeric data
4.1. R: **as.numeric()**
4.1.1. If NAs created, STOP, Explore and CLEAN
4.2. Python: **pd.to_numeric()**
4.2.1. always use errors="raise"
4.2.1.1. If an error is raised, STOP, Explore and CLEAN
4.3. When the numeric values are clean...
4.3.1. the conversion returns numeric data with no new NAs
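The stop, explore, clean cycle for the Python branch could look like the sketch below, assuming pandas is installed. The sample values and the cleaning rules (strip thousands separators, treat "n/a" as missing) are assumptions for illustration; real data needs its own rules.

```python
import pandas as pd

raw = pd.Series(["1200", "1,500", "n/a"])

# STOP: with errors="raise", a value that cannot be parsed raises an error.
try:
    pd.to_numeric(raw, errors="raise")
    failed = False
except (ValueError, TypeError):
    failed = True

# EXPLORE: errors="coerce" is used only to locate the offending values.
bad = raw[pd.to_numeric(raw, errors="coerce").isna()]
print(bad.tolist())

# CLEAN: apply explicit rules, then convert again with errors="raise".
cleaned = raw.str.replace(",", "", regex=False)
cleaned = cleaned.mask(cleaned == "n/a")  # "n/a" becomes a real missing value
numeric = pd.to_numeric(cleaned, errors="raise")
print(numeric)
```

Using "coerce" for the final conversion would hide the problem: every unparseable value would silently become missing.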
5. Format dates
5.1. avoid date inference (give the format explicitly)
5.2. Be aware of the date/time symbols
5.2.1. Year
5.2.1.1. %y (24)
5.2.1.2. %Y (2024)
5.2.2. Month
5.2.2.1. %m (01-12)
5.2.2.2. %b (Jan, Dec)
5.2.2.3. %B (January, December)
5.2.3. Day
5.2.3.1. %d (01-31)
5.2.3.2. %a (Mon, Tue)
5.2.3.3. %A (Monday, Tuesday)
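A short sketch of why an explicit format matters, using only the Python standard library. The date string is an invented example: the same text gives two different dates depending on which symbols are supplied, which is the risk of letting a parser infer the format.

```python
from datetime import datetime

text = "05/03/24"

# Day first: %d/%m/%y reads the string as 5 March 2024.
day_first = datetime.strptime(text, "%d/%m/%y")

# Month first: %m/%d/%y reads the same string as 3 May 2024.
month_first = datetime.strptime(text, "%m/%d/%y")

print(day_first.strftime("%Y-%m-%d"))
print(month_first.strftime("%Y-%m-%d"))

# The name symbols (%b, %B, %a, %A) depend on the active locale.
print(day_first.strftime("%A, %d %B %Y"))
```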
6. Format categorical data
6.1. Nominal
6.1.1. dichotomous
6.1.1.1. Boolean column
6.1.1.2. Textual
6.1.2. polytomous
6.1.2.1. Textual
6.2. Ordinal
6.2.1. duplicate columns
6.2.2. Homogenize range of ordinal levels
6.2.2.1. same min
6.2.2.2. same max
6.3. When converting either type
6.3.1. levels as integer values
6.3.2. levels as labels
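A sketch of the ordinal case in Python, assuming pandas is installed. The level names are invented, and the rescale helper is a hypothetical function showing one possible reading of "same min, same max" (a linear rescale), not a method stated in this outline.

```python
import pandas as pd

answers = pd.Series(["low", "high", "medium", "low"])

# Ordinal: state the levels and their order explicitly.
levels = ["low", "medium", "high"]
ordinal = pd.Categorical(answers, categories=levels, ordered=True)

# Levels as integer values: the position of each label in `levels`.
codes = ordinal.codes.tolist()

# Levels as labels: the original text, now with a defined order.
labels = list(ordinal)
print(codes, labels)


def rescale(value, old_min, old_max, new_min=0.0, new_max=1.0):
    """Linearly map a value from [old_min, old_max] to [new_min, new_max]."""
    return new_min + (value - old_min) * (new_max - new_min) / (old_max - old_min)


# A 1-5 scale and a 0-10 scale brought to the same min and max.
print(rescale(3, 1, 5), rescale(5, 0, 10))
```

For nominal data the same constructor applies with ordered=False, in which case the integer codes are only identifiers and carry no ranking.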
7. Format all the text
7.1. decide
7.1.1. capitalization
7.1.2. normalization
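One way to apply both decisions consistently, using only the Python standard library. The function name normalize_text is hypothetical, and the choices made in it (NFKC Unicode form, collapsed whitespace, case folding) are assumptions; the point is to decide once and apply the same rule to every text column.

```python
import unicodedata


def normalize_text(value):
    """Apply one fixed normalization and capitalization rule to a string."""
    # Normalization: unify equivalent Unicode spellings of the same character.
    value = unicodedata.normalize("NFKC", value)
    # Normalization: trim the ends and collapse runs of whitespace.
    value = " ".join(value.split())
    # Capitalization: casefold() is a stricter lower() meant for comparisons.
    return value.casefold()


# "u" followed by a combining accent and the precomposed "ú" now match.
print(normalize_text("  PERU\u0301 ") == normalize_text("Perú"))
print(normalize_text("New   York"))
```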