Data and Data Representation


1. Bits, Bytes, and Words

1.1. Bits

1.1.1. The basic unit of information in computing and telecommunications

1.1.2. In computing, a bit is defined as a variable or computed quantity that can have only two possible values

1.1.3. These two values are often interpreted as binary digits and are usually denoted by 0 and 1

1.2. Bytes

1.2.1. A unit of digital information in computing and telecommunications that most commonly consists of eight bits

1.2.2. Historically, a byte was the number of bits used to encode a single character of text; for this reason it is the basic addressable element in many computer architectures.

1.3. Words

1.3.1. In computing, a word is the natural unit of data used by a particular computer design

1.3.2. A word is simply a fixed-size group of bits handled together by the system

1.3.3. The number of bits in a word (the word size or word length) is an important characteristic of computer architecture.
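
As a rough illustration, Python can probe the native word size if we assume the pointer width matches the machine word (a common convention, though not guaranteed on every platform):

    import struct

    # "P" is the native pointer format; its size in bytes times 8
    # approximates the machine's word size in bits.
    word_bits = struct.calcsize("P") * 8
    print(word_bits)  # typically 64 on modern desktop CPUs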

1.4. Number Bases

1.4.1. Radix

1.4.1.1. When referring to binary, octal, decimal, or hexadecimal numbers, a single lowercase letter is often appended to the end of a number to identify its base (e.g., 1011b, 27o, 95d, 4Fh)

1.4.2. Base

1.4.2.1. The number of different symbols available in a numbering system. For a given number, the larger the base, the more distinct symbols the system uses but the fewer digits are needed (see the sketch after this list)

1.4.2.1.1. Base 2 (Binary): 0, 1

1.4.2.1.2. Base 8 (Octal): 0, 1, 2, 3, 4, 5, 6, 7

1.4.2.1.3. Base 10 (Decimal): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9

1.4.2.1.4. Base 16 (Hexadecimal): 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F
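
A minimal Python sketch of this trade-off, using the built-in base conversions:

    n = 181

    print(bin(n))  # 0b10110101 -> base 2 needs 8 digits
    print(oct(n))  # 0o265      -> base 8 needs 3 digits
    print(n)       # 181        -> base 10 needs 3 digits
    print(hex(n))  # 0xb5       -> base 16 needs only 2 digits

    # int() converts a digit string back from any base:
    print(int("10110101", 2))  # 181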

1.5. Binary System

1.5.1. Early computer designs were decimal

1.5.2. e.g., the Harvard Mark I and ENIAC

1.5.3. John von Neumann proposed binary data processing (1945)

1.5.4. Simplified computer design

1.5.5. Used for both instructions and data

1.5.6. Natural relationship between on/off switches and calculation using Boolean logic

1.5.7. A computer stores both instructions and data as individual electronic charges.

1.5.8. Representing these entities with numbers requires a system geared to the concepts of on/off or true/false

1.5.9. Binary is a base 2 numbering system

1.5.10. each digit is either a 0 (off) or a 1 (on)

1.5.11. Computers store all instructions and data as sequences of binary digits

1.5.12. e.g. 01000001 01000010 01000011 = “ABC”
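
One way to reproduce this mapping in Python (assuming the text is ASCII-encodable):

    text = "ABC"
    bits = " ".join(f"{byte:08b}" for byte in text.encode("ascii"))
    print(bits)  # 01000001 01000010 01000011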

1.6. Octal System

1.6.1. Also known as the base 8 numbering system

1.6.2. There are only eight different digits available (0, 1, 2, 3, 4, 5, 6, 7)

1.7. Decimal System

1.7.1. Decimal is a base 10 numbering system

1.7.2. We use a system based on decimal digits to represent numbers

1.7.3. Each digit in the number is multiplied by 10 raised to a power corresponding to that digit position.
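
For example: 4732 = (4 × 10^3) + (7 × 10^2) + (3 × 10^1) + (2 × 10^0) = 4000 + 700 + 30 + 2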

1.8. Hexadecimal System

1.8.1. Hexadecimal is a base 16 numbering system

1.8.2. Used not only to represent integers

1.8.3. Also used to represent sequences of binary digits compactly
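
Because each hexadecimal digit stands for exactly four bits, hex is a compact notation for bit patterns; a small Python sketch:

    value = 0b10100101
    print(hex(value))     # 0xa5 (1010 -> a, 0101 -> 5)
    print(f"{0xA5:08b}")  # 10100101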

2. Data, Information and Processing

2.1. Data Formats

2.1.1. Computers

2.1.1.1. Process and store all forms of data in binary format

2.1.2. Human communication

2.1.2.1. Includes language, images and sounds

2.1.3. Data formats :

2.1.3.1. Specifications for converting data into computer-usable form

2.1.3.2. Define the different ways human data may be represented, stored and processed by a computer

2.2. Information processing cycle

2.2.1. Information processing cycle is the series of input, process, output, and storage activities.

2.2.2. Computers process data into information.

2.2.2.1. Data is a collection of unprocessed items, which can include text, numbers, images, audio, and video.

2.2.2.2. Information conveys meaning and is useful to people.

2.3. Data, Information, Knowledge

2.3.1. Data: unprocessed facts and figures

2.3.1.1. “The price of crude oil is $80 per barrel.”

2.3.2. Information: data that has been interpreted

2.3.2.1. “The price of crude oil has risen from $70 to $80 per barrel”

2.3.3. Knowledge: information, experience and insight

2.3.3.1. “When crude oil prices go up by $10 per barrel, it’s likely that gas prices will rise by 14¢ per gallon”

2.4. Processing – Data Coding

2.4.1. Data is encoded by assigning a bit pattern to each character, digit, or multimedia object.

2.4.2. Many standards exist for encoding:

2.4.2.1. Character encoding like ASCII

2.4.2.2. Image encodings like JPEG

2.4.2.3. Video encodings like MPEG-4

2.5. Processing - Data Storage and Compression

2.5.1. Reduce the size of data to save space or transmission time

2.5.2. Categories of data compression:

2.5.2.1. Lossless - Compressed data can be restored exactly to its original form (see the sketch after this list)

2.5.2.2. Lossy - Compression involves a certain amount of data degradation; most common in multimedia applications
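
As a minimal sketch of lossless compression, Python's zlib module round-trips repetitive data exactly (zlib is just one of many lossless codecs):

    import zlib

    data = b"AAAA BBBB AAAA BBBB " * 50  # highly repetitive input
    packed = zlib.compress(data)

    print(len(data), "->", len(packed))     # 1000 -> far fewer bytes
    assert zlib.decompress(packed) == data  # lossless: exact original restored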

2.6. Processing – Data Integrity

2.6.1. Security or protection of data

2.6.2. Involves controlling access to files, e.g., through Access Control Lists (ACLs)

2.6.3. Protect files from being read, written to, or executed (see the sketch after this list)

2.6.3.1. Password protection

2.6.3.2. Keyboard locking
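
A minimal POSIX-style sketch in Python, using simple permission bits rather than full ACLs ("secrets.txt" is a hypothetical example file):

    import os
    import stat

    # Hypothetical file; create it so the sketch is runnable.
    open("secrets.txt", "a").close()

    # Owner may read and write; group and others get no access.
    # (Basic permission bits only; full ACLs need OS-specific tools.)
    os.chmod("secrets.txt", stat.S_IRUSR | stat.S_IWUSR)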

2.6.4. Data Integrity = Quality of Data

3. ASCII Codes, Unicode

3.1. The Alphanumeric Representation

3.1.1. Data entered as characters, digits, and punctuation is known as alphanumeric data.

3.1.2. Three alphanumeric codes are in common use:

3.1.2.1. ASCII (American Standard Code for Information Interchange)

3.1.2.2. Unicode

3.1.2.3. EBCDIC (Extended Binary Coded Decimal Interchange Code).

3.2. ASCII

3.2.1. It is an acronym for the American Standard Code for Information Interchange.

3.2.2. It is a standard seven-bit code, first published in 1963 by the American Standards Association (which later became ANSI) and finalized in 1968 as ANSI Standard X3.4.

3.2.3. The purpose of ASCII was to provide a standard code for various symbols (both visible characters and invisible control codes)

3.2.4. In the ASCII character set, each binary value between 0 and 127 represents a specific character.

3.2.5. Most computers extend the ASCII character set to use the full range of 256 characters available in a byte. The upper 128 characters handle special things such as accented characters from other languages.

3.2.6. In general, ASCII works by assigning standard numeric values to letters, numbers, punctuation marks and other characters such as control codes.
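
These standard numeric assignments are easy to inspect in Python:

    print(ord("A"))   # 65 -> 0b1000001, fits in 7 bits
    print(chr(66))    # 'B'
    print(ord("\n"))  # 10 -> the "line feed" control code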

3.3. Keyboard Input

3.3.1. Key (“scan”) codes are converted to ASCII

3.3.2. ASCII code sent to host computer

3.3.3. Received by the host as a “stream” of data

3.3.4. Stored in a buffer before being processed

3.4. Unicode

3.4.1. A worldwide character-encoding standard

3.4.2. Its main objective is to provide a single character set capable of supporting all characters from all scripts, as well as commonly used symbols, for computer processing worldwide

3.4.3. Originally a 16-bit standard; modern Unicode code points are stored using variable-width encodings such as UTF-8 and UTF-16

3.4.4. It is a superset of ASCII

3.5. Usage Of Unicode

3.5.1. Encoding text such as passwords

3.5.2. Encoding characters displayed in web pages

3.5.3. Encoding characters used in email

3.5.4. Modifying characters used in documents

3.6. ASCII vs Unicode

3.6.1. Both are character codes

3.6.2. The first 128 code positions of Unicode mean the same as in ASCII

3.6.3. ASCII defines 128 characters, which map to the numbers 0–127. Unicode defines fewer than 2^21 characters, which, similarly, map to numbers 0–2^21 (though not all numbers are currently assigned, and some are reserved).

3.6.4. Unicode is a superset of ASCII, and the numbers 0–127 have the same meaning in ASCII as they have in Unicode. For example, the number 65 means "Latin capital 'A'".
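
A short Python sketch of the overlap and the extension beyond ASCII:

    # The first 128 code points coincide with ASCII:
    print(ord("A"))             # 65 in both ASCII and Unicode
    # Beyond ASCII, Unicode covers other scripts and symbols:
    print(ord("é"))             # 233
    print("é".encode("utf-8"))  # b'\xc3\xa9' (two bytes in UTF-8)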

3.7. EBCDIC

3.7.1. Extended Binary Coded Decimal Interchange Code developed by IBM

3.7.1.1. Restricted mainly to IBM or IBM-compatible mainframes

3.7.1.2. Conversion software to/from ASCII available

3.7.1.3. Common in archival data

3.7.1.4. Character codes differ from ASCII
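
The difference in character codes is visible in Python, which ships a codec for one common EBCDIC code page (cp037, EBCDIC US/Canada):

    text = "ABC"
    print(text.encode("ascii"))  # b'ABC' (0x41 0x42 0x43)
    print(text.encode("cp037"))  # b'\xc1\xc2\xc3' -> EBCDIC codes differ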