Section 1 Data Science

1. Sprint 1 Data Preprocess & EDA

1.1. Numpy

1.1.1. ndarray

1.1.2. Handling shape

1.1.2.1. vector (x,)

1.1.2.2. matrix (x, y, ...)

1.1.2.3. reshape

1.1.3. Indexing & Slicing

1.1.4. Creation function & Operation function

1.1.4.1. arange

1.1.4.2. ones & zeros & empty

1.1.4.3. eye & identity & diagonal

1.1.4.4. sum & mean & std & sqrt

1.1.4.5. vstack & hstack & concatenate

1.1.4.6. tolist

1.1.5. Comparisons

1.1.5.1. all (bool) any (bool)

1.1.5.2. np.logical_and(x > 0, x < 3) np.logical_not(x) np.logical_or(x, y)

1.1.5.3. np.where(a>10) & isnan & isfinite(bool)

1.1.5.4. argmax & argmin

1.1.6. Array operation

1.1.6.1. + , - , * , /

1.1.6.2. dot & transpose (T)

1.1.6.3. broadcasting

1.1.7. Boolean index & Fancy index

1.1.7.1. condition

1.1.7.2. A = np.array([.....]); x = A < 10; x.astype(int)

1.1.7.3. x[x>4]; a[b, c] (b is the row index, c is the column index); a[b]
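
The boolean and fancy indexing items above can be sketched roughly as follows. The array values are hypothetical sample data, not part of the original notes, and `astype(int)` is used because the `np.int` alias was removed in NumPy 1.24.

```python
import numpy as np

# Hypothetical sample data for illustration only.
A = np.array([3, 12, 7, 25, 9])

# Boolean index: a comparison yields a mask with the same shape as A.
mask = A < 10
as_int = mask.astype(int)   # True/False become 1/0
small = A[mask]             # keeps only the elements where mask is True

# Fancy index: integer arrays select rows and columns.
a = np.arange(12).reshape(3, 4)
b = np.array([0, 2])        # row indices
c = np.array([1, 3])        # column indices
pairs = a[b, c]             # elements at (0, 1) and (2, 3)
rows = a[b]                 # whole rows 0 and 2
```

Note that `a[b, c]` pairs the indices element by element, so it returns two values here rather than a 2x2 block.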

1.2. Pandas

1.2.1. Series & DataFrame

1.2.2. Wide & Tidy

1.2.3. Read data

1.2.4. General Function

1.2.4.1. melt & pivot & pivot table & crosstab

1.2.4.2. merge & concat

1.2.5. Missing data

1.2.5.1. isna & isnull & notna & notnull

1.2.5.2. dropna & fillna & ffill & replace (& to_numeric)

1.2.6. Attributes

1.2.6.1. index & columns & dtypes

1.2.6.2. info & shape & T

1.2.7. Conversion

1.2.7.1. to_list & astype & to_dict

1.2.8. Indexing / Reindexing / Selection / Label manipulation

1.2.8.1. df [ ] & df [' ']

1.2.8.2. df.loc & df.iloc

1.2.8.3. head & tail & sample

1.2.8.4. align & drop & isin & filter & replace & rename

1.2.8.5. set_index & reset_index

1.2.8.6. sort_values & sort_index

1.2.9. Function application

1.2.9.1. groupby & append (removed in pandas 2.0; use concat) & aggregate & apply (& map)

1.2.10. Computation

1.2.10.1. sum & mean & describe
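
As a minimal sketch of the wide vs. tidy idea and of `melt`, `pivot`, and `groupby` from the list above: the table, its column names, and its values below are hypothetical and only serve as an illustration.

```python
import pandas as pd

# Hypothetical wide table: one row per city, one column per year.
wide = pd.DataFrame({
    "city": ["Seoul", "Busan"],
    "2020": [10, 4],
    "2021": [12, 5],
})

# Wide -> tidy: one row per (city, year) observation.
tidy = wide.melt(id_vars="city", var_name="year", value_name="sales")

# Tidy -> wide again, with cities as the index and years as columns.
back = tidy.pivot(index="city", columns="year", values="sales")

# groupby + aggregation works naturally on the tidy form.
totals = tidy.groupby("city")["sales"].sum()
```

The tidy form is usually the easier one to group, filter, and plot, while the wide form is easier to read as a table.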

1.3. Visualization Library

1.3.1. Matplotlib

1.3.1.1. line

1.3.1.2. scatter

1.3.1.3. bar

1.3.1.4. histogram

1.3.1.5. boxplot

1.3.1.6. scatter matrix

1.3.2. Seaborn

1.3.3. Plotly

1.3.4. Wordcloud

1.3.5. bar_chart_race

1.4. Process

1.4.1. Read Data

1.4.2. Gener

2. Sprint 0 Python

2.1. Basic

2.1.1. Library & Package & Module & Class & Method

2.1.2. Data type

2.1.3. List & Dictionary & Tuple

2.1.4. If & While & For

2.1.5. Function

2.1.5.1. argument

2.1.5.2. parameter

2.1.6. Try & Except & Finally

2.2. Pythonic Code

2.2.1. List comprehension

2.2.2. Split & Join

2.2.3. Enumerate & Zip

2.2.4. Lambda & Map & Reduce

2.2.5. Asterisk

2.2.5.1. *args

2.2.5.2. **kwargs

2.2.6. Collections (module)

2.2.6.1. deque( )

2.2.6.2. OrderedDict( )

2.2.6.3. defaultdict( )

2.2.6.4. Counter( )
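
The four `collections` containers listed above could be exercised like this. It is a small sketch with made-up sample values, using only the standard library.

```python
from collections import Counter, OrderedDict, defaultdict, deque

# deque: fast appends and pops at both ends.
queue = deque([1, 2, 3])
queue.appendleft(0)
queue.append(4)
first = queue.popleft()

# OrderedDict: remembers insertion order and can reorder keys.
od = OrderedDict(a=1, b=2, c=3)
od.move_to_end("a")

# defaultdict: a missing key is created from the factory (here, list).
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)

# Counter: counts hashable items.
counts = Counter("mississippi")
top = counts.most_common(2)
```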

3. Sprint 2 Statistics

3.1. Descriptive

3.1.1. Central tendency

3.1.1.1. Mean

3.1.1.2. Median

3.1.1.3. Mode

3.1.2. Dispersion

3.1.2.1. Maximum

3.1.2.2. Minimum

3.1.2.3. Range

3.1.2.4. Quartile deviation

3.1.2.5. Variance

3.1.2.6. Standard deviation

3.1.2.7. Standard error

3.1.2.8. Degrees of freedom

3.1.3. Distribution

3.1.3.1. Kurtosis

3.1.3.2. Skewness
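
Most of the descriptive measures above can be computed with Python's standard `statistics` module. The sketch below assumes a small hypothetical sample and the sample (n - 1) versions of variance and standard deviation; kurtosis and skewness are left out because the standard library has no function for them.

```python
import math
import statistics as st

# Hypothetical sample for illustration only.
data = [2, 4, 4, 5, 7, 8]

# Central tendency
mean = st.mean(data)
median = st.median(data)
mode = st.mode(data)

# Dispersion
value_range = max(data) - min(data)
q1, q2, q3 = st.quantiles(data, n=4)     # quartiles, default "exclusive" method
quartile_deviation = (q3 - q1) / 2       # half of the interquartile range
dof = len(data) - 1                      # degrees of freedom for the sample variance
variance = st.variance(data)             # sum of squared deviations / (n - 1)
std = st.stdev(data)
standard_error = std / math.sqrt(len(data))  # standard error of the mean
```

Other quantile conventions (for example `method="inclusive"`) give slightly different quartiles on small samples.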

3.2. Inferential Statistics

3.2.1. Estimation

3.2.1.1. Population

3.2.1.2. Parameter

3.2.1.3. Sample

3.2.1.4. Statistic

3.2.1.5. Estimator

3.2.2. Hypothesis Test

3.2.2.1. Z-test

3.2.2.2. T-test

3.2.2.2.1. one-sided & two-sided

3.2.2.2.2. one-sample & two-sample

3.2.2.3. F-test

3.2.2.3.1. ANOVA

3.2.2.4. Chi-square test

3.2.2.4.1. one-way & two-way

3.2.2.5. Concept

3.2.2.5.1. p-value

3.2.2.5.2. confidence level

3.2.2.5.3. H0 vs H1

3.2.2.5.4. sampling

3.2.2.5.5. CDF vs PDF

3.2.3. Paradigms

3.2.3.1. Frequentist

3.2.3.2. Bayesian

3.2.4. Theorem

3.2.4.1. Central limit theorem

3.2.4.2. Law of large numbers

3.3. Datatype

3.3.1. By type of data

3.3.1.1. Quantitative (Numeric)

3.3.1.1.1. Continuous

3.3.1.1.2. Discrete

3.3.1.2. Qualitative (Categorical)

3.3.1.2.1. Ordinal

3.3.1.2.2. Nominal

3.3.2. By scale of measurement

3.3.2.1. Quantitative (Numeric)

3.3.2.1.1. Ratio scale

3.3.2.1.2. Interval scale

3.3.2.2. Qualitative (Categorical)

3.3.2.2.1. Ordinal scale

3.3.2.2.2. Nominal scale

4. Sprint 3 Linear Algebra

4.1. Vector & Matrix

4.1.1. Real Coordinate space

4.1.2. Unit Vector

4.1.3. Multiplying matrices & vectors

4.1.4. Determinant

4.1.5. Inverse

4.1.6. Dot Product

4.1.7. Cross Product

4.1.8. Basis

4.1.9. Rank

4.1.10. Gauss-Jordan Elimination

4.1.11. Eigenvalues & Eigenvectors

4.2. Covariance

4.2.1. Covariance

4.2.2. Correlation Coefficient

4.2.2.1. Pearson Correlation Coefficient

4.2.2.2. Spearman Correlation Coefficient

4.3. Dimensionality Reduction

4.3.1. PCA (Principal Component Analysis)

4.3.1.1. Scree Plot

4.3.1.2. Visualization

4.3.2. t-SNE

4.4. * Clustering with dimensionality reduction

4.4.1. Process

4.4.1.1. Define the number of clusters

4.4.1.2. Fit (Scale | Normalize | Standardize)

4.4.1.2.1. Fit

4.4.1.2.2. Transform

4.4.1.3. Check labels & Predict

4.4.2. K-Means

4.4.2.1. Elbow Method

4.4.2.2. Silhouette Method

4.4.3. Hierarchical

4.4.3.1. Agglomerative Clustering
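
The clustering process above (define the number of clusters, fit and transform, check labels and predict) might look like this with scikit-learn. This is only a sketch: the synthetic two-blob data, the choice of `StandardScaler`, two PCA components, and `n_clusters = 2` are all assumptions made for illustration, not choices stated in the original notes.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Hypothetical data: two well-separated blobs in 5 dimensions.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(loc=0.0, scale=1.0, size=(50, 5)),
    rng.normal(loc=8.0, scale=1.0, size=(50, 5)),
])

# 1. Define the number of clusters (assumed known here; the elbow or
#    silhouette method would normally guide this choice).
n_clusters = 2

# 2. Fit, then transform: standardize, then reduce to 2 components.
scaler = StandardScaler().fit(X)
X_scaled = scaler.transform(X)
pca = PCA(n_components=2).fit(X_scaled)
X_reduced = pca.transform(X_scaled)

# 3. Fit K-Means, check the labels, and predict for a new point.
kmeans = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_reduced)
score = silhouette_score(X_reduced, labels)

# A new point must pass through the already fitted scaler and PCA.
x_new = np.full((1, 5), 8.0)
new_label = kmeans.predict(pca.transform(scaler.transform(x_new)))[0]
```

Keeping the fitted `scaler` and `pca` objects is what separates "fit" from "transform": new data is only transformed, never refitted.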