
1. Filter
1.1. reading
1.1.1. All (SELECT)
1.1.1.1. SELECT DISTINCT salary/12 AS monthly_wage FROM ...;
1.1.1.1.1. SELECT result FROM table WHERE selection
1.1.2. Parts (WHERE)
1.1.2.1. SELECT ... FROM ... WHERE column = result AND/OR/NOT column2 >< result2;
1.1.2.1.1. Searching in text
1.1.2.2. SELECT ... FROM t WHERE id IN (Subquery)
1.1.2.2.1. SELECT * FROM t WHERE AND id NOT In ( SELECT FROM WHERE )
2. Join Operations
2.1. return combined relations (FROM) to a new table or view
2.1.1. Inner
2.1.1.1. SELECT ... FROM course AS C , prereq AS P WHERE C.course_id = P.course_id;
2.1.1.1.1. Ignores cases where ids dont match!
2.1.2. Outer
2.1.2.1. Left (Null on right side)
2.1.2.1.1. SELECT ... FROM course AS C LEFT OUTER JOIN, prereq AS P ON C.course_id = P.course_id;
2.1.2.2. Right (Null on left side)
2.1.2.2.1. SELECT ... FROM course AS C RIGHT OUTER JOIN, prereq AS P ON C.course_id = P.course_id;
2.1.2.3. Full (Null)
2.1.2.3.1. SELECT ... FROM course AS C FULL OUTER JOIN, prereq AS P ON C.course_id = P.course_id;
3. Null Values
3.1. Find
3.1.1. SELECT... FROM ... WHERE x IS NULL;
3.1.1.1. Comparison with null returns unknown
3.1.1.1.1. Three valued logic or : positive perspective and : evaluates strictly not unknown: unknown
4. Aggregate Functions
4.1. Basics
4.1.1. SELECT AVG(salary) SELECT MAX(salary) SELECT SUM(salary) SELECT COUNT(DISTINCT id) FROM ...;
4.2. Group by
4.2.1. SELECT x, AVG(salary) AS avg_salary FROM ... GROUP BY x; ! Vars outside the agg function must be in the GROUP BY
4.3. AGG + WHERE SELECTION
4.3.1. WHERE doesnt work bc it is executed before the agg
4.3.1.1. SELECT x, AVG(salary) AS S FROM ... GROUP BY x HAVING S > 42000;
5. Nested Subqueries
5.1. From WHERE clause
5.1.1. For Set Membership For Set Comparison For Set Cardinality
5.1.1.1. Set Membership
5.1.1.1.1. Same as Basic > reading > WHERE > same table twice
5.2. From FROM clause
5.2.1. Temporary relation with WITH
5.2.1.1. WITH x AS ( SELECT MAX(x) FROM department ) SELECT deparment.name FROM department WITH department.budget = max;
5.3. From SELECT Clause
5.3.1. Returns single Values
5.3.1.1. SELECT dep_name, ( SELECT COUNT(*) FROM instructor WHERE xyz ) AS num_instructor FROM dept;
6. Set Operations
6.1. Combines independent Queries based on UNION (or), INTERSECT (and), EXCEPT (but not)
7. SQL Part 2
7.1. Data Definition Language (defines and modifies the structure of database objects)
7.1.1. Datatypes
7.1.1.1. Dates
7.1.1.1.1. Arithmetics
7.1.1.2. User defined types (dollars) CREATE TABLE ... ( dep_name VARCHAR(20), building VARCHAR, budget Dollars);
7.1.1.3. Domains (like type but with NOT NULL constraint) so that we dont have to check every entry as it is created
7.1.1.3.1. CREATE DOMAIN name CHAR(20) NOT NULL
7.1.1.4. Normal Datatypes
7.1.1.5. Large Object types
7.1.2. CREATE (Like constructor)
7.1.2.1. Vars
7.1.2.1.1. CREATE TABLE x ( id CHAR(5), salary NUMERIC(8,2));
7.1.2.2. Keys
7.1.2.2.1. Primary
7.1.2.2.2. Foreign
7.1.3. Modify : DROP and ALTER
7.1.3.1. DROP TABLE r (doesnt work if it still has references)
7.1.3.2. ALTER TABLE x ADD y (varchar(100)); domain
7.1.3.3. ALTER TABLE t DROP CONSTRAINT "xyz"
7.1.3.4. ALTER TABLE t MODIFY COLUMN "factor" DECIMAL (4,2) DEFAULT NULL; Modifies size factor from 9.99 to 10
7.1.3.4.1. ALTER TABLE CustomerOrder ADD COLUMN order_type ENUM('to go', 'eating in') NOT NULL DEFAULT 'eating in';
7.1.4. TRUNCATE drops and recreates table vs DELETE
7.1.4.1. CREATE VIEW geo AS ( SELECT * FROM t WHERE dep = "GEO"); GRANT SELECTION ON geo to geo_role
7.1.4.1.1. DELETE removes rows one by one and can be filtered with a WHERE clause. If where is omitted, i delete everything but keep the table structure
7.2. Data Manipulation Language (retrieve, insert, update and delete data in tables)
7.2.1. INSERT
7.2.1.1. INEXPLIZITE COLUMN NAMES INSERT INTO t VALUES (101, "y", 20); Values must match the numbers of columns in x assumes all columns are specified in the correct order
7.2.1.1.1. EXPLICIT COLUMN SELECTION INSERT INTO t (ID, name, age) VALUES (101, "y", 20) (FROM other table); More flexibility and reduces errors
7.2.2. UPDATE
7.2.2.1. UPDATE t SET salary = 8000 WHERE id = xyz;
7.2.2.1.1. Use CASE to avoid problems with linar updating
7.2.3. DELETE
7.2.3.1. DELETE FROM t WHERE dep = "finance"
7.2.3.1.1. DELETE FROM t WHERE salary < ( SELECT AVG(salary) FROM instructor );
7.3. Database Logic
7.3.1. Views Acts like a table but is not physically stored. Simplifies complex queries by hiding underlying logic. Provides security by restricting access to certain columns or rows. Can be used to present aggregated or calculated data.
7.3.1.1. Create
7.3.1.1.1. CREATE VIEW viewname AS (hier kommt nichts!) SELECT id, name, dep FROM instructor WHERE ;
7.3.1.2. Update
7.3.1.2.1. Possible if: From clause only one db relation Select clause contains only attribute names of the relation and has no agg
7.3.2. Prevent Data Errors
7.3.2.1. Declare NOT NULL
7.3.2.1.1. name VARCHAR(20) NOT NULL
7.3.2.2. Declare UNIQUE
7.3.2.3. CONSTRAINT ... CHECK
7.3.2.3.1. CREATE TABLE t ( ... CONSTRAINT "semester_fall" CHECK (semester IN ("Fall", "Spring");
7.3.3. Integrity
7.3.3.1. Cascading updates/deletions
7.3.3.1.1. If you delete X and Y has relation to it, Y also deleted CREATE TABLE t ( ...) FOREIGN KEY (dep) REFERENCES x ON DELETE CASCADE ON UPDATE CASCADE);
7.3.3.2. primary key constraints
7.3.3.2.1. Unique and non-null
7.3.4. Authorization
7.3.4.1. GRANT SELECT/INSERT/UPDATE/DELETE/ALL ON table or view TO user/PUBLIC
7.3.4.1.1. If you dont want to wite each user in the list use ROLES
7.3.4.2. REVOKE SELECT/INSERT/UPDATE/DELETE/ALL ON relation or view FROM user list
8. Entity Relationship Models -> From modelling to creating database
8.1. Design Process
8.1.1. Requirements Engeneering
8.1.1.1. What data and how much? Which types of entities and what relations? What attributes do they have?
8.1.1.1.1. Final Phase
8.2. Cardinality Constraints
8.2.1. in binary relationset:
8.2.1.1. 1:1 -> "at most 1" or 1...1 1:n (1+) n:1 n:m not all instances of set need to be mapped! So a 1:m can include 1 instructor taking no students
8.2.1.1.1. FInd cardinality by asking the other way around
8.3. ER Diagramm
8.3.1. cardinalities
8.3.1.1. - ->
8.3.1.2. complex
8.3.1.2.1. 1....* 0....*
8.3.2. Roles
8.3.2.1. see relationship sets.
8.3.3. Participation
8.3.3.1. total =
8.3.3.1.1. each entitiy in entitiy set in at least one relationship in relationship set
8.3.3.2. Partial -
8.3.3.2.1. not every instructor needs to have a student : some entitiys may not participate in relationship
8.3.4. Displaying attributes
8.3.4.1. composed attribute. Intendation
8.3.4.2. multivalued {}
8.3.4.3. derived ()
8.3.5. Higher Arity relationship sets
8.3.5.1. only one sigle arrow pointing to 3rd entitiy (interpretation issues)
8.3.6. Specialisaton: Inherit attributes from Superset
8.3.6.1. overlapping: both employee and student Disjoint: either or bot not both
8.3.6.1.1. Partial
8.3.6.1.2. Total
8.4. Reduction to Relation Schemas
8.4.1. Goal: transform ER sets uniformly into relation schemas How? for each ER set there is a unique relation. This relation is assigned a name of the corrisponding ER set
8.4.1.1. creates another table named after the relationset
8.4.1.1.1. M:M Relationship sets
8.4.1.1.2. 1:M
8.4.1.1.3. O:O
8.4.1.2. representing attributes
8.4.1.2.1. composite attributes
8.4.1.2.2. Multivalued
8.4.1.2.3. Derived
8.4.1.2.4. Weak entities
8.4.1.2.5. Higher arity relationships
8.4.1.2.6. specialisation
8.5. Design Decisions
8.5.1. ziehe multivalues als eigene relation raus (instru_phone)
8.5.1.1. Placement of primary key in 1:1 relationships
8.6. Comparison UML
9. Normalization asks: What is a good ER design? If we insert data, do we run into inconsistencies or anomalities? Implementing NF leads to: Consistency + Redundancy - Tables + complexity +
9.1. 1 NF (PK)
9.1.1. Concept: Atomicity (units are not divisible)
9.1.1.1. Def: Relational Schema is in NF if domains of all attributes are atomic no repeating groups or MultiValuedAttributes
9.1.1.1.1. SOlution: Create new relation with reference to all components of primary key of old relation
9.2. 2 NF (PK)
9.2.1. Concept: Functional Dependencies if we know... we can determine ....
9.2.1.1. Def: all non-key attributes are functionally dependent on the entire primary key. or depends on a proper subset of the candidate key( violation: AB -> C but B->C). -> PArtial dependency Means: I need really both columns of candidate Key to identify A If i have only one candidate key then it is already satisfied
9.2.1.1.1. Violation: one column is dependent on a minimal component of the composed PK
9.3. 3NF (PK)
9.3.1. Concept: Further Depenencies. Meaning: There is a second functional dependency (Transitivität) A-> B B-> C from right to left: Non key, Non Key, Primary Key M {A} M+{A,B,C}
9.3.1.1. Def: no attribute is transitivly depenendent on the PK A->B-> C
9.3.1.1.1. Solution: New relation with B as the PK
9.4. Boyce-Codd NF (PK, CK)
9.4.1. Concept: Depenencys of Key (which i didnt get) maybe: the determinant (left side of -> is not a CK)
9.4.1.1. Def: There is no functional dependency between attributes that belong to different candidate keys
9.4.1.1.1. Example: If AB->a, then a -> B is a violation
9.5. 4 NF (PK, CK)
9.5.1. Concept: Multi-valued attributes. For one PK, there are multiple phone numbers (independent MVA)
9.5.1.1. Def: R in NF if R does not contain more than one Multivalued attribute
9.5.1.1.1. Solution: Seperate R for each multi-valued attribute. Identify PK
9.6. Project join NF (PK, CK)
9.6.1. Concept: Addtional constraints (DM only in 2nd semester). Dependent MVA
9.6.1.1. Problem: We cannot insert tuple with unknown instructor Null due to PK null constraint
9.6.1.1.1. Def: in NF if R cannot be decomposed and rejoined based on keys without removing or adding info
9.7. Domain Key NF
9.7.1. Def : is in NF if no insertion or deletion of anomalies possible in DB. All constraints are encoded in DB
9.7.2. Concept: Domain Constraint like students can only work 30h /week
9.7.2.1. Solution: New R with specifiic constraint. PK is the subgroup
9.8. Other suff
9.8.1. Incidential Denormalization
9.8.1.1. not normalizing is better for performance
10. Indexing and Hashing
10.1. Complexity Theory
10.1.1. Time and Memory Complexity
10.1.1.1. Its about relative Numbers and Scaling (not absolute numbers, this is hardware (N*t) becomes N)
10.1.1.1.1. Rules: Ignore Constants Highes Complexity dominates (limit argument) 0(1) is constant complexity that doesnt change
10.2. Indexing
10.2.1. From linary search (O(N) to binary search (O(N*log2N)
10.2.1.1. Index Files (without Index we dont have to go trough all N, its a search key and this speeds up lookup)
10.2.1.1.1. Search in index is O(log2N) and the following link is O(1)
10.2.1.2. Index files and joins
10.2.1.3. Ordered indices (good for ranges)
10.2.1.3.1. B+ Tree
10.2.1.4. Hash indices (good for unique values)
10.2.1.4.1. Indexing vs Hashing
10.2.2. Bitmap Indeces
10.2.2.1. Good for few attributes (unlike Has and index)
10.2.2.1.1. Bit Operations
10.2.3. Other types
10.2.3.1. Trees
11. Database Architecture How is data stored?
11.1. Physical Storage and Storage Hierarchy
11.2. File Organization
11.3. Storage Access and Buffer Manager
11.4. DBS Architecture
11.4.1. Centralized
11.4.2. Client Server system
11.4.3. SQL Server
11.4.4. Parallel Database Systems
11.4.4.1. Speedup and Scaleup