1. Training/test data
1.1. Labeled dataset
1.2. Labeled normal dataset
1.3. Labeled abnormal dataset (rare)
1.4. Unlabeled dataset (assumed to contain very few anomalies)
2. Outputs
2.1. Score
2.1.1. List sorted by abnormality
2.2. Label
2.2.1. Tells whether an instance is normal or abnormal
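
The two output types in section 2 can be illustrated with a minimal sketch. It assumes that higher scores mean "more abnormal" and that a label is derived from a score by a fixed cutoff. The function names and the threshold value are hypothetical and chosen only for illustration.

```python
def rank_by_score(scores):
    """Return instance indices sorted from most to least abnormal (2.1.1)."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def label_by_threshold(scores, threshold):
    """Turn scores into normal/abnormal labels (2.2.1).

    Assumption: a score strictly above the threshold counts as abnormal.
    """
    return ["abnormal" if s > threshold else "normal" for s in scores]


# Made-up anomaly scores for four instances.
scores = [0.1, 0.9, 0.3, 0.2]
ranking = rank_by_score(scores)
labels = label_by_threshold(scores, threshold=0.5)
```

A ranked list leaves the cutoff to the analyst, whereas a label fixes it up front.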
3. Applications
3.1. Intrusion Detection
3.1.1. Host based
3.1.1.1. multi-point system traces may be available
3.1.1.2. sequential / collective anomalies
3.1.1.3. limited alphabet
3.1.1.4. point anomaly detection is not applicable
3.1.2. Network based
3.1.2.1. Network data
3.1.2.2. collective anomalies
3.1.2.3. challenges
3.1.2.3.1. Nature of anomalies keeps changing over time
3.1.2.3.2. adapting intruders
3.2. Fraud Detection
3.2.1. Actual users doing unauthorized things
3.2.2. Credit card fraud
3.2.2.1. labeled data available
3.2.2.2. point anomalies
3.2.2.3. detection
3.2.2.3.1. by-owner
3.2.2.3.2. by-operation
3.2.3. Mobile phone fraud
3.2.4. Insurance claim fraud
3.2.5. insider trading
3.3. Sensor networks
3.3.1. sensor faults
3.3.2. event (e.g. intrusion) detection
3.4. Image processing
3.4.1. Contextually different points or regions
3.5. industrial damage detection
3.5.1. defects in mechanical components
3.5.2. defects in physical structures
3.6. Medical/public health
3.6.1. abnormal patient conditions
3.6.2. instrument errors
3.6.3. recording errors
4. Categorizing Anomalies
4.1. Point Anomalies
4.1.1. Techniques
4.1.1.1. Classification Based
4.1.1.1.1. learning
4.1.1.1.2. testing
4.1.1.1.3. methods
4.1.1.1.4. concerns
4.1.1.2. Nearest Neighbor based
4.1.1.2.1. Distance to Kth nearest neighbor
4.1.1.2.2. relative density of each data instance
4.1.1.2.3. advantages
4.1.1.2.4. disadvantages
4.1.1.3. Clustering Based
4.1.1.3.1. Points that do not belong to any cluster
4.1.1.3.2. distance to the closest cluster center
4.1.1.3.3. anomalies form their own clusters, but these are smaller and less dense
4.1.1.4. Statistical Techniques
4.1.1.4.1. parametric techniques
4.1.1.4.2. non-parametric techniques
4.1.1.5. Information theoretic
4.1.1.5.1. Kolmogorov complexity
4.1.1.5.2. Maximizing the complexity of normal instances
4.1.1.5.3. multidimensional data
4.1.1.6. Spectral
4.1.1.6.1. Principal Component Analysis
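
As an illustration of the nearest-neighbor technique in 4.1.1.2.1, the sketch below scores each point by the distance to its k-th nearest neighbor and then sorts the points by that score. It is a brute-force version under stated assumptions: Euclidean distance, a small in-memory dataset, and a hypothetical function name `kth_nn_distance_scores`. The sample points are made up.

```python
import math


def kth_nn_distance_scores(points, k=2):
    """Score each point by the Euclidean distance to its k-th nearest neighbor.

    Assumption: len(points) > k. A larger score suggests a more isolated point.
    """
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores


# Four points in a tight group plus one far away (made-up data).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = kth_nn_distance_scores(points, k=2)

# Indices sorted from most to least abnormal.
ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
```

The pairwise comparison costs O(n^2) distance computations, so this sketch is only practical for small datasets.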
4.2. Contextual Anomalies
4.2.1. data
4.2.1.1. contextual attributes
4.2.1.1.1. Spatial
4.2.1.1.2. Graph
4.2.1.1.3. Sequential
4.2.1.1.4. Profile
4.2.1.2. behavioral attributes
4.2.2. techniques
4.2.2.1. reducing to point anomaly detection
4.2.2.1.1. identify context
4.2.2.1.2. compute anomaly score
4.2.2.2. model the structure in data
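
The reduction in 4.2.2.1 (identify the context, then compute an anomaly score within it) might be sketched as follows. The assumptions are labeled: the contextual attribute is a single categorical value, the behavioral attribute is a single number, and the score is the absolute deviation from the context mean in units of the context standard deviation. The season/temperature data and the function name are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def contextual_scores(records):
    """records: list of (context, value) pairs.

    Returns one score per record: |value - context mean| / context std dev.
    Assumption: a context with zero spread yields a score of 0.0.
    """
    # Step 1 (4.2.2.1.1): group behavioral values by contextual attribute.
    by_context = defaultdict(list)
    for context, value in records:
        by_context[context].append(value)
    stats = {c: (mean(v), pstdev(v)) for c, v in by_context.items()}

    # Step 2 (4.2.2.1.2): score each value against its own context only.
    scores = []
    for context, value in records:
        mu, sigma = stats[context]
        scores.append(abs(value - mu) / sigma if sigma > 0 else 0.0)
    return scores


# Made-up temperature readings: 20.0 is ordinary in summer, unusual in winter.
records = [
    ("winter", 1.0), ("winter", 2.0), ("winter", 0.0),
    ("winter", 1.5), ("winter", 20.0),
    ("summer", 20.0), ("summer", 22.0), ("summer", 21.0),
    ("summer", 19.0), ("summer", 20.5),
]
scores = contextual_scores(records)
most_abnormal = max(range(len(scores)), key=lambda i: scores[i])
```

The same behavioral value (20.0) receives a higher score in the winter context than in the summer context, which is the point of treating context separately.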
4.3. Collective anomalies
4.3.1. Not discussed in detail.