Database Design

Database design Goals: The prime goal of designing a database is:

To have zero redundancy in the system
Loss-less join
Dependency preservation
Overcome all the shortcomings of conventional file system

According to E.F. Codd, “All the records of the table must be unique”.

Keys of a relation: There are various types of keys in a relation which are:

Candidate Key: The minimal set of attributes that can determine a tuple uniquely. There can be more than 1 candidate key of a relation and its proper subset can’t determine tuple uniquely and it can’t be NULL.
Super Key: The set of attributes that can determine a tuple uniquely. A candidate key is always a super key but vice versa is not true.
Primary Key and Alternate Key: Among various candidate keys, one key is taken as the primary key and others are alternate keys.
Foreign Key: Foreign Key is a set of attributes in a table that is used to refer to the primary key or alternative key of the same or another table.

Functional Dependency: It is a constraint that specifies the association/ relationship between a set of attributes. In functional dependency, one set can accurately determine the value of another set. It is represented as A->B, where set A can determine the values of set B correctly. The A is known as the Determinant, and B is known as the Dependent.

Functional dependencies are further categorized into two types:

Trival Functional Dependency: In functional dependency, if B is a subset of A, then such dependency is known as trivial functional dependency.
Non-Trivial Functional Dependency: In functional dependency, if B is not a subset of A, then such dependency is known as non-trivial functional dependency.

Armstrong’s Axioms: It is a statement that is always considered true and used as a starting point for further arguments. Armstrong axiom is used to generate a closure set in a relational database.

Armstrong Axiom

Attribute Closure(X⁺): All attributes of the set are functionally determined by X.

Prime Attribute: An attribute that is part of one candidate key.
Non-prime Attribute: An attribute that is not a part of any candidate key.

Example: If the relation R(ABCD) {A->B, B->C, C->D}, then the attribute closure of

A will be (A⁺)={ABCD} [A can determine B, B can determine C, C can determine D]
B will be (B⁺)={BCD} [B can determine C, C can determine D]
C will be (C⁺)={CD} [C can determine D]
D will be (D⁺)={D} [D can determine itself]

Note: With the help of Attribute closure, we can easily determine the Superkey [The set of attributes whose closure contains all attributes of a relation] of a relation, So in the above example A is the superkey of the given relation. There can be more than one superkey in a relationship.

Example: If the relation R(ABCDE) {A->BC, CD->E, B->D, E->A}, then the attribute closure will be

A⁺= {ABCDE}
B⁺= {BD}
C⁺= {C}
D⁺= {D}
E⁺= {ABCDE}

Equivalence sets of Functional Dependency: If two sets of a functional dependency are equivalent, i.e. if A⁺= B⁺. Every FD in A can be inferred from B, and every FD in B can be inferred from A, then A and B are functionally equivalent.

Minimal set of Functional Dependency: A set of FD will be minimal if it satisfies the following conditions:

F logically implies all dependencies in F⁺
F⁺ logically implies all dependencies in F.
No functional dependency in F⁺contains an extraneous attribute.
Each left side of a functional dependency in F+ is unique. That is, there are no two dependencies A->B and C->D in such that A->C.

Normalization: Normalization is used to eliminate the following anomalies:

Insertion anomaly
Deletion anomaly
Updation Anomaly
Join anomaly

Normalization was introduced to achieve integrity in the database and make the database more maintainable.

1. First Normal Form: A relation is in first normal form if it does not contain any multi-valued or composite attribute. If the data is in 1NF then it will have high redundancy.

2. Second Normal Form: A relation is in the second normal form if it is in the first normal form and if it does not contain any partial dependency.

Partial Dependency: A dependency is called partial dependency if any proper subset of candidate key determines non-prime (which are not part of candidate key) attribute.
Let R be the relational schema and X, Y, A is the set of attributes. Suppose X is any candidate key, Y is a proper subset of candidate key, and A is a Non-prime attribute.

Partial Dependency

Y->A will be partial dependency iff, Y is a proper subset of candidate key, and A is a non-prime attribute.

Full Functional Dependency: If A and B are an attribute set of a relation, B is fully functional dependent on A, if B is functionally dependent on A but not on any proper subset of A.

3. Third Normal Form: A relation is in the third normal form if it is in the second normal form and it does not contain any transitive dependency. For a relation to be in Third Normal Form, either LHS of FD should be super key or RHS should be the prime attribute.

4. Boyce-Codd Normal Form: A relation is in Boyce-Codd Normal Form if the LHS of every FD is super key. The relationship between Normal Forms can be represented as 1NF, 2NF, 3NF or BCNF.

Design Goal	1NF	2NF	3NF	BCNF
Zero Redundancy	High redundancy	Less than 1NF	Less than 2NF	No redundancy
Loss-less decomposition	Always	Always	Always	Always
Dependency preservation	Always	Always	Always	Sometimes Not possible

Properties of Decomposition:

Loss-less Join Decomposition: There should not be the generation of any new tuple because of the decomposition.

Dependency Preserving Decomposition: There should not be the loss of any tuple because of the decomposition. Let R be a relation with Functional dependency F. After decomposition R is decomposed into R₁, R₂, R₃……R_n with FD set F₁, F₂, F₃……F_n respectively.

If F₁, F₂, F₃…..F_n ≣ F, then the decomposition is dependency preserving otherwise not.

Last Minute Notes – DBMS

Database Management System is an organized collection of interrelated data that helps in accessing data quickly, along with efficient insertion, and deletion of data into the DBMS. DBMS organizes data in the form of tables, schemas, records, etc.

DBMS over File System

The file system has numerous issues, which were resolved with the help of DBMS, the issues with the file system are:

Data Redundancy: Same data can be stored at multiple places.
Data Inconsistency: If multiple copies of the same data have different content in each copy. Like, the phone number of students is different in academic and accounts files.
Data access: In a file system, accessing data was difficult and insecure as well. Accessing data concurrently was not possible.
No Backup and Recovery: There is no backup and recovery in the file system that can lead to data loss.

Database Design

Last Minute Notes – DBMS

Categories

Contact US

Database Design

Last Minute Notes – DBMS

Similar Reads

Categories

Contact US