Practical Image and Video Processing Using MATLAB, by Oge Marques (read clockwise)

Mind map of Practical Image and Video Processing Using MATLAB (Oge Marques)


1. Chapter 1 Introduction and Overview

1.1. Image

1.1.1. Visual representation

1.1.1.1. 2D

1.1.1.1.1. A projection of a 3D real-world item

1.1.2. Digital

1.1.2.1. Finite number of points (pixels)

1.1.2.1.1. Monochrome

1.1.2.1.2. Color

1.1.2.1.3. Alternative representations

1.2. Image processing

1.2.1. Modifying digital images with computers

1.2.1.1. Multidisciplinary

1.2.1.1.1. Mathematics

1.2.1.1.2. Physics

1.2.1.1.3. Computer science

1.2.1.1.4. Computer engineering

1.2.1.1.5. Optical engineering

1.2.1.1.6. Electrical engineering

1.2.1.1.7. Pattern recognition

1.2.1.1.8. Machine learning

1.2.1.1.9. Artificial intelligence

1.2.1.1.10. Human vision research

1.2.2. Scope

1.2.2.1. Low level

1.2.2.1.1. Primitive operations

1.2.2.2. Mid level

1.2.2.2.1. Extract attributes

1.2.2.3. High level

1.2.2.3.1. Analysis

1.2.2.3.2. Interpretation

1.2.3. Examples

1.2.3.1. Sharpening: enhance edges and fine details

1.2.3.1.1. Example

1.2.3.2. Noise removal

1.2.3.2.1. Example

1.2.3.3. Deblurring

1.2.3.3.1. Example

1.2.3.4. Edge extraction

1.2.3.4.1. Example

1.2.3.4.2. Preprocessing step to separate objects from one another

1.2.3.5. Binarization

1.2.3.5.1. Example

1.2.3.5.2. To simplify and speed up interpretation

1.2.3.6. Blurring

1.2.3.6.1. Example

1.2.3.6.2. To minimize the importance of textures and fine details

1.2.3.7. Contrast enhancement

1.2.3.7.1. Example

1.2.3.7.2. To help humans and other computer tasks, e.g. edge extraction

1.2.3.8. Object segmentation and labeling

1.2.3.8.1. Example

1.2.3.8.2. Prerequisite for most object recognition and classification work

1.3. Components of a digital image processing system

1.3.1. Hardware

1.3.1.1. Acquisition devices

1.3.1.1.1. Capture

1.3.1.2. Processing equipment

1.3.1.2.1. Modify and analyze

1.3.1.3. Display device

1.3.1.3.1. Show

1.3.1.4. Hardcopy device

1.3.1.4.1. Show

1.3.1.5. Storage device

1.3.1.5.1. Preserve

1.3.2. Software

1.4. Machine Vision Systems

1.4.1. Overview (without AI/ML)

1.4.2. Acquisition

1.4.2.1. Input: real-world object

1.4.2.2. Output: digital image

1.4.3. Preprocessing

1.4.3.1. Input: digital image

1.4.3.2. Output: corrected/improved digital image

1.4.4. Without AI/ML

1.4.4.1. Segmentation

1.4.4.1.1. Input: digital image

1.4.4.1.2. Output: partitioned image

1.4.4.2. Feature extraction

1.4.4.2.1. Input: preprocessed, segmented image

1.4.4.2.2. Output: encoded image contents (feature vector)

1.4.4.3. Classification

1.4.4.3.1. Input: feature vectors

1.4.4.3.2. Output: human-useful label

1.4.5. With AI/ML (trained model)

1.4.5.1. Classification

1.4.5.1.1. Input: preprocessed image

1.4.5.1.2. Output: human-useful label

1.5. Human Vision System (HVS) vs. Machine Vision System (MVS)

1.5.1. Storage

1.5.1.1. HVS: large database of images accumulated over a lifetime, mapped to high-level semantics

1.5.1.2. MVS: large storage possible, but not much context (semantics)

1.5.2. Speed

1.5.2.1. HVS: very high speed

1.5.2.2. MVS: fast, but not as fast as HVS (not real time)

1.5.3. Working conditions

1.5.3.1. HVS: can cope with poor lighting conditions, confusing perspectives, etc.

1.5.3.2. MVS: needs good lighting conditions, confused by extraneous objects, angles, etc.

2. Chapter 2 Image Processing Basics

2.1. Digital Image Representation

2.1.1. Basic formats

2.1.1.1. raster (bitmap)

2.1.1.1.1. 2D matrix of numbers

2.1.1.1.2. f(x,y) = intensity or gray level at pixel (x,y)

2.1.1.1.3. Pros/cons

2.1.1.1.4. The book covers this format

2.1.1.2. vector

2.1.1.2.1. Drawing commands

2.1.1.2.2. Pros/cons

2.1.2. Binary (1-bit)

2.1.2.1. 1 bit per pixel

2.1.2.1.1. 0=black, 1=white (usually)

2.1.2.2. Small size

2.1.2.3. Suitable for simple graphics, line art

2.1.3. Gray-level (monochrome)

2.1.3.1. 8 bits per pixel

2.1.3.1.1. 256 levels of gray

2.1.3.2. Relatively compact, subjectively good quality

2.1.4. Color

2.1.4.1. 24-bit RGB (red, green, blue)

2.1.4.1.1. 3 x 2D arrays (one per color)

2.1.4.2. 32-bit RGB + alpha

2.1.4.2.1. Additional 8 bits for alpha (transparency) for each pixel

2.1.4.2.2. Used for image editing effects

2.1.4.3. Indexed color

2.1.4.3.1. For compatibility with older hardware

2.1.4.3.2. Pointers to a color palette (usually 256 colors)
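The indexed-color idea above can be sketched in a few lines. The book works in MATLAB; the Python snippet below is only a language-neutral illustration, and the 4-entry palette is a made-up miniature of the usual 256-color palette.

```python
# Hypothetical 4-color palette (real indexed images typically use 256 RGB entries).
palette = [
    (0, 0, 0),        # index 0: black
    (255, 0, 0),      # index 1: red
    (0, 255, 0),      # index 2: green
    (255, 255, 255),  # index 3: white
]

# A tiny 2x3 indexed image: each pixel stores a palette index, not a color.
indexed_image = [
    [0, 1, 2],
    [3, 3, 1],
]

# Decoding expands each index into its full RGB triple via a palette lookup.
decoded = [[palette[i] for i in row] for row in indexed_image]
```

Storing one index byte per pixel instead of three color bytes is what made this format attractive on older hardware.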

2.2. Compression

2.2.1. Why

2.2.1.1. Raw image representations are large

2.2.2. Lossy

2.2.2.1. Tolerable degree of deterioration

2.2.2.2. General purpose photographic images

2.2.3. Lossless

2.2.3.1. Full quality

2.2.3.2. Line art, drawings, facsimiles, space images, medical images

2.2.4. Compression rate

2.2.4.1. bpp -> bits per pixel
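The bpp figure above is just compressed size in bits divided by pixel count. A minimal sketch (in Python for illustration; the example numbers are made up):

```python
def bits_per_pixel(compressed_bytes, width, height):
    """Compression rate in bpp: total compressed bits divided by pixel count."""
    return compressed_bytes * 8 / (width * height)

# Example: a 640x480 image compressed to 76,800 bytes -> 2.0 bpp,
# versus 24 bpp for the raw 24-bit RGB original (a 12:1 compression ratio).
bpp = bits_per_pixel(76800, 640, 480)
```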

2.3. File formats

2.3.1. General

2.3.1.1. File header

2.3.1.1.1. Image height, width, bands, bpp, format signature, ...

2.3.1.2. Pixel data (often compressed)

2.3.2. Examples

2.3.2.1. BIN, PPM, JPEG, GIF, PNG, TIFF

2.4. Terminology

2.4.1. Topology

2.4.1.1. Fundamental image properties

2.4.1.1.1. Usually done on binary images

2.4.1.1.2. Number of occurrences of an object

2.4.1.1.3. Number of separate (not connected) regions

2.4.1.1.4. Number of holes in an object

2.4.1.1.5. ...other examples

2.4.2. Neighborhood

2.4.2.1. Pixels surrounding a pixel

2.4.2.2. 4-neighborhood

2.4.2.3. 8-neighborhood

2.4.2.4. diagonal-neighborhood

2.4.3. Adjacency

2.4.3.1. In relation to two pixels (p and q)

2.4.3.2. 4-adjacent: p and q are 4-neighbors of each other

2.4.3.3. 8-adjacent: 8-neighbors of one another

2.4.3.4. mixed- or m-adjacency: eliminates the ambiguities of 8-adjacency

2.4.4. Path

2.4.4.1. In relation to two pixels (p and q)

2.4.4.2. 4-path

2.4.4.2.1. sequence of pixels from p to q where each pixel is 4-adjacent to its predecessor

2.4.4.3. 8-path

2.4.4.3.1. same, but now each pixel is 8-adjacent to its predecessor

2.4.4.4. Example (image attached in the original mind map)

2.4.4.4.1. Example image

2.4.5. Connectivity

2.4.5.1. 4-connected

2.4.5.1.1. there is a 4-path between p and q

2.4.5.2. 8-connected

2.4.5.2.1. there is an 8-path between p and q

2.4.6. Components

2.4.6.1. Set of pixels connected to each other

2.4.6.2. 4-component

2.4.6.2.1. 4-connected

2.4.6.3. 8-component

2.4.6.3.1. 8-connected
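The neighborhood, path, connectivity, and component definitions above combine into connected-component labeling. The book uses MATLAB (where `bwlabel` does this); the sketch below is a hypothetical Python flood-fill version for illustration only.

```python
from collections import deque

def count_components(img, connectivity=4):
    """Count connected foreground (1) regions in a binary image.
    connectivity=4 uses the 4-neighborhood; 8 adds the diagonal neighbors."""
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connectivity == 8:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] == 1 and labels[y][x] == 0:
                current += 1                  # start a new component
                queue = deque([(y, x)])
                labels[y][x] = current
                while queue:                  # flood-fill all connected pixels
                    cy, cx = queue.popleft()
                    for dy, dx in offsets:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and labels[ny][nx] == 0):
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return current

# Two diagonal pixels: two separate 4-components, but a single 8-component.
img = [[1, 0],
       [0, 1]]
```

The diagonal example shows why the connectivity choice matters: the same image has a different number of components under 4- and 8-connectivity.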

2.4.7. Distance

2.4.7.1. In relation to coordinates of two pixels p and q

2.4.7.2. Euclidean

2.4.7.2.1. sqrt( (x1-x0)^2 + (y1-y0)^2 )

2.4.7.3. Manhattan (city block)

2.4.7.3.1. |x1-x0| + |y1-y0|

2.4.7.4. Chessboard

2.4.7.4.1. max(|x1-x0|, |y1-y0|)

2.4.7.5. Example (image attached in the original mind map)

2.4.7.5.1. Example picture
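The three distance measures above, written out as code (a Python sketch for illustration; the book itself works in MATLAB):

```python
import math

def euclidean(p, q):
    """Straight-line distance between pixel coordinates p and q."""
    return math.sqrt((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def manhattan(p, q):
    """City-block (D4) distance: horizontal plus vertical steps."""
    return abs(q[0] - p[0]) + abs(q[1] - p[1])

def chessboard(p, q):
    """Chessboard (D8) distance: king's moves on a chessboard."""
    return max(abs(q[0] - p[0]), abs(q[1] - p[1]))

# For p = (0, 0) and q = (3, 4): Euclidean 5.0, Manhattan 7, Chessboard 4.
```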

2.5. Operations

2.5.1. In the spatial domain

2.5.1.1. Arithmetic/logical operations on the original pixel value

2.5.1.2. Global (point) operations

2.5.1.2.1. Entire image is treated uniformly: same function applied to all pixels

2.5.1.2.2. New pixel value is a function of old pixel value

2.5.1.2.3. Example: contrast adjustment
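A point operation like the contrast adjustment mentioned above applies one function to every pixel independently. A minimal sketch (Python for illustration; the gain/bias values are arbitrary examples):

```python
def adjust_contrast(pixel, gain=1.5, bias=0):
    """Point operation: the new value depends only on the old value at that
    pixel. The result is clipped to the valid 8-bit range [0, 255]."""
    return min(255, max(0, round(pixel * gain + bias)))

# The same function is applied uniformly to all pixels.
row = [40, 100, 200]
stretched = [adjust_contrast(p) for p in row]
```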

2.5.1.3. Neighborhood-oriented (local, area) operations

2.5.1.3.1. Pixel by pixel, typically using a convolution operation

2.5.1.3.2. New pixel value is a function of its value + neighbors

2.5.1.3.3. Example: spatial-domain filters (blur, enhance, find edges, remove noise)
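A neighborhood operation can be sketched as a 3x3 convolution (Python for illustration; a real implementation, such as MATLAB's `imfilter`, also handles image borders, which are simply left unchanged here):

```python
def convolve3x3(img, kernel):
    """Neighborhood operation: each output pixel is a weighted sum of the
    pixel and its 8 neighbors (borders are skipped for simplicity)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]  # copy; border pixels keep original values
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    acc += kernel[dy + 1][dx + 1] * img[y + dy][x + dx]
            out[y][x] = acc
    return out

# A 3x3 averaging kernel blurs: each pixel becomes the mean of its neighborhood.
mean_kernel = [[1 / 9] * 3 for _ in range(3)]
img = [[9, 9, 9],
       [9, 0, 9],
       [9, 9, 9]]
blurred = convolve3x3(img, mean_kernel)
```

Swapping the kernel changes the effect: an averaging kernel blurs, while difference-of-neighbors kernels sharpen or find edges.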

2.5.1.4. Combining multiple images

2.5.1.4.1. Two or more images are combined arithmetically or logically

2.5.1.4.2. Example: subtract one image from another to detect differences
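The subtraction example above, as a sketch (Python for illustration; the two tiny "frames" are made-up data):

```python
def image_difference(a, b):
    """Absolute pixel-wise difference; nonzero values mark changed regions."""
    return [[abs(pa - pb) for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]

frame1 = [[10, 10], [10, 10]]
frame2 = [[10, 10], [10, 200]]  # one pixel changed (e.g. a moving object)
diff = image_difference(frame1, frame2)
```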

2.5.2. In a transform domain

2.5.2.1. Transform = convert a set of values to another set of values, creating a new representation for the same information

2.5.2.2. Image undergoes a mathematical transformation (Fourier transform, discrete cosine transformation)

2.5.2.2.1. From spatial domain (original image) to transform domain (new representation)

2.5.2.3. Examples: frequency-domain filtering

3. Chapter 5 Image Sensing and Acquisition

3.1. Requirements

3.1.1. Illumination (energy) source

3.1.1.1. Electromagnetic radiation

3.1.2. Imaging sensor

3.1.2.1. Converts optical information into electrical equivalent

3.2. Types of images

3.2.1. Reflection images

3.2.1.1. Radiation reflected from the surface of objects

3.2.2. Emission images

3.2.2.1. Objects are self-luminous

3.2.2.2. Visible or invisible radiation

3.2.3. Absorption images

3.2.3.1. Radiation passes through an object

3.2.3.2. Provides information about internal structure (e.g. X-ray)

3.3. Light and color perception

3.3.1. Radiance

3.3.1.1. Physical power

3.3.1.2. Expressed as spectral power distribution (SPD)

3.3.2. Human perception of light

3.3.2.1. Brightness

3.3.2.1.1. "An area appears to emit more or less light"

3.3.2.1.2. Luminous intensity

3.3.2.1.3. Perceptual (cannot be measured)

3.3.2.2. Hue

3.3.2.2.1. "An area appears similar to one of the perceived colors (red, green, blue, or a combination)"

3.3.2.2.2. Dominant wavelength of the SPD

3.3.2.3. Saturation

3.3.2.3.1. "The colorfulness of an area judged in proportion to its brightness"

3.3.2.3.2. A description of the whiteness of the light source

3.3.2.3.3. The more the SPD is concentrated at one wavelength, the more saturated is the associated color

3.3.2.3.4. Adding white light (all wavelengths) causes color desaturation

3.3.2.3.5. Perceptual (cannot be measured)

3.3.2.4. Visualization of the concepts

3.3.2.4.1. Visualization 1

3.3.2.4.2. Visualization 2

3.4. Image acquisition

3.4.1. Sensors

3.4.1.1. Convert electromagnetic energy to electrical signals

3.4.1.2. Types

3.4.1.2.1. CCD (Charge-coupled devices)

3.4.1.2.2. CMOS (complementary metal oxide semiconductor)

3.4.1.3. Characteristics

3.4.1.3.1. Nominal resolution

3.4.1.3.2. Field of view

3.4.2. Camera optics

3.4.2.1. Characteristics

3.4.2.1.1. Magnification power

3.4.2.1.2. Light gathering capacity

3.4.2.2. Aberrations

3.4.2.2.1. Pincushion distortion

3.4.2.2.2. Barrel distortion

3.5. Digitization

3.5.1. From analog to digital representation

3.5.1.1. Results in a pixel array

3.5.1.1.1. Monochrome: intensity

3.5.1.1.2. Color: color values

3.5.2. Processes

3.5.2.1. Sampling

3.5.2.1.1. Time or space

3.5.2.1.2. Usually done before quantization

3.5.2.1.3. Measure the value of a 2D function (height/width) at discrete intervals

3.5.2.1.4. Rate: number of samples across height and width

3.5.2.1.5. Pattern: physical arrangement of the samples

3.5.2.1.6. Nyquist criterion

3.5.2.1.7. Illustration

3.5.2.2. Quantization

3.5.2.2.1. Amplitude

3.5.2.2.2. Replaces a continuous function with a discrete set of quantization levels

3.5.2.2.3. Illustration
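Uniform quantization, described above, can be sketched directly (Python for illustration; the level counts are example values):

```python
def quantize(value, levels, vmax=255):
    """Uniform quantization: map a continuous value in [0, vmax] to the
    nearest of `levels` evenly spaced reconstruction levels."""
    step = vmax / (levels - 1)
    return round(round(value / step) * step)

# Quantizing to 4 levels (2 bits/pixel) maps [0, 255] onto {0, 85, 170, 255}:
# every input value snaps to the nearest of those four gray levels.
```

Fewer levels mean fewer bits per pixel but coarser gray steps, which is exactly the trade-off between gray-level resolution and storage.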

3.5.3. Resolution

3.5.3.1. Spatial

3.5.3.1.1. Density of pixels in an image

3.5.3.2. Gray-level

3.5.3.2.1. Smallest change in intensity level that the HVS can discern

4. Chapter 6 Arithmetic and Logic Operations

4.1. Arithmetic operations

4.1.1. Addition

4.1.1.1. Blend images

4.1.2. Subtraction

4.1.2.1. Detect differences between images

4.1.3. Multiplication and division

4.1.3.1. Brightness adjustment
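The arithmetic operations above can be sketched with addition used for blending and multiplication for brightness (Python for illustration; real 8-bit arithmetic must clip results to [0, 255], as `clip8` does here):

```python
def clip8(v):
    """Clamp an arithmetic result to the valid 8-bit range."""
    return min(255, max(0, round(v)))

def blend(a, b, alpha=0.5):
    """Weighted addition of two images (alpha-blend)."""
    return [[clip8(alpha * pa + (1 - alpha) * pb) for pa, pb in zip(ra, rb)]
            for ra, rb in zip(a, b)]

def scale_brightness(img, factor):
    """Multiplication by a constant brightens (>1) or darkens (<1)."""
    return [[clip8(p * factor) for p in row] for row in img]

a = [[100, 200]]
b = [[50, 100]]
```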

4.2. Logic operations

4.2.1. Picture
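Logic operations act bitwise on pixel values; combined with a binary mask they select (AND), combine (OR), or toggle (XOR) image regions. A minimal AND-masking sketch (Python for illustration; the image and mask values are made up):

```python
def apply_and(img, mask):
    """Bitwise AND of image and mask: mask value 255 keeps a pixel intact,
    mask value 0 forces it to black."""
    return [[p & m for p, m in zip(ri, rm)] for ri, rm in zip(img, mask)]

img = [[120, 200], [30, 255]]
mask = [[255, 0], [255, 0]]  # keep the left column, black out the right
masked = apply_and(img, mask)
```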

5. Appendix A Human Visual Perception (HVS: human vision system)

6. Chapter 7 Geometric Operations

6.1. What they are

6.1.1. Modify the geometry of an image by repositioning pixels

6.1.2. Modify the spatial relationship between groups of pixels

6.2. Components

6.2.1. Mapping functions

6.2.1.1. Specify new coordinates in the output image for each pixel in the input image

6.2.1.2. Spatial transformation equations: f(x,y) --> g(x',y')

6.2.1.2.1. Often separate functions for x and y: x' = Tx(x,y), y' = Ty(x,y)

6.2.1.3. Affine transformations

6.2.1.3.1. Linear combinations of x and y

6.2.1.3.2. Preserves line parallelism, but not angles and distances

6.2.1.4. Examples

6.2.1.4.1. Translation

6.2.1.4.2. Scaling

6.2.1.4.3. Rotation

6.2.1.4.4. Shearing
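Each of the affine examples above is a linear combination of x and y plus a translation, so all four fit one 2x3 matrix form. A sketch (Python for illustration; the specific matrices are example values):

```python
import math

def affine(point, matrix):
    """Apply a 2x3 affine matrix [[a, b, tx], [c, d, ty]] to (x, y):
    x' = a*x + b*y + tx,  y' = c*x + d*y + ty."""
    x, y = point
    (a, b, tx), (c, d, ty) = matrix
    return (a * x + b * y + tx, c * x + d * y + ty)

translate = [[1, 0, 5], [0, 1, -2]]   # shift by (+5, -2)
scale = [[2, 0, 0], [0, 3, 0]]        # scale x by 2, y by 3
theta = math.pi / 2                   # 90-degree rotation about the origin
rotate = [[math.cos(theta), -math.sin(theta), 0],
          [math.sin(theta),  math.cos(theta), 0]]
```

Because the map is linear plus a shift, parallel lines stay parallel, but angles and distances generally change, as the scaling matrix shows.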

6.2.2. Interpolation methods

6.2.2.1. Compute new values for each pixel

6.2.2.2. Approaches

6.2.2.2.1. Forward mapping (source to target)

6.2.2.2.2. Backward mapping (target to source)
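Backward mapping, listed above, can be sketched with nearest-neighbor interpolation (Python for illustration; MATLAB's `imresize` offers this and higher-quality interpolation methods):

```python
def resize_nearest(img, new_h, new_w):
    """Backward mapping: for each target pixel, locate the source pixel it
    came from and copy its value (nearest-neighbor interpolation). Iterating
    over the target guarantees every output pixel gets a value, avoiding the
    holes that forward (source-to-target) mapping can leave."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(new_h):
        row = []
        for x in range(new_w):
            sy = min(h - 1, int(y * h / new_h))  # map target y back to source
            sx = min(w - 1, int(x * w / new_w))  # map target x back to source
            row.append(img[sy][sx])
        out.append(row)
    return out

img = [[1, 2],
       [3, 4]]
zoomed = resize_nearest(img, 4, 4)  # 2x zoom: each source pixel becomes 2x2
```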

6.3. Examples

6.3.1. Zooming

6.3.2. Shrinking

6.3.3. Resizing

6.3.4. Translation

6.3.5. Rotation

6.3.6. Cropping

6.3.7. Flipping

6.3.8. Warping

6.3.9. Non-linear

6.3.9.1. Twirling

6.3.9.2. Rippling

6.3.9.3. Morphing

6.3.9.4. Seam carving

6.3.9.5. Image registration