Coursera_Managing Big Data with MySQL_Note 1

Reference Course: Managing Big Data with MySQL in Coursera
Remember: when you complete this course you will have marketable skills with big data analysis in SQL! The benefits you will be well worth your time and effort!

Entity-Relationship Diagrams(ER diagrams)

Terminology:

  • Entity Instance:
    A single occurrence of an entity type. In a table it would look like a row

  • Unique Attribute (normally used as unique key)
    An attribute with a unique value in each entity instance. In a table it would look like the column that allows you to link tables together.

    There could be many unique attributes in an entity of a ER diagram

Simple Notation:

Rectangle Box : Entity
  1. Represent category of data your database will keep track of.
  2. Each box is a category.
  3. Entity will probably become a table when the database is made
  4. Every entity must have at least one attribute that serves as a unique key
Oval Box: Attribute
  1. Represent aspects of each category or entity that will be recorded.
  2. Attribute will likely become the columns in the table that will be built around that entity.
  3. According to the rules of set theory, each attribute must be unique for that entity

Every entity or catrgory has to be related to at least one other entity in the database.

Diamond and lines: Relationship
  1. The word in the diamond spells out the nature of the relationship. For example: STUDENT attend COLLEGE
Cardinality Constrains

|: One
<-: Many

Cardinality Constrains Example

Complex Notation:

A attribute is in parenthese: ( e.g (ClassroomID) ) Composite Attribute
  1. Composite attributea are measurements or descriptions that have important meaning that can be created by combing other attributes being tracked and saved.
  2. To save space, compostite attributes are usually not made into their own columns in a database.
  3. Composite attribute works only when all the sub-attributes linked to the composite are joined together. Even misses one sub-attribute that this compsite won’t work.
Double Rectangle: Weak Entity
Attribute is underlined with a dash line: Partial Key
  1. Normally, this attribute in the database on its own will not identify an entity(not a unique key)
  2. Nontheless, a partial key can become unique in it’s own weak entiry if, and only if, it is connect to the unique key of the entity it is connected to with a double diamond.

Weak Entity and partial Key

Double Oval: Mulitivalued Attributes

Multivalued attributes are depicted by a double line around the oval of an attributes.

“Multivalued” attributes are used when an entity can have multiple values for the same attribute.

Example: Student achieves Completion contains Class

Student achieves Completion contains Class

Explanation:

In general, we can see that students are taking a bunch of classes, but studnet names can’t be connected directed with class levels or ID in this data, despite the fact that both students and class have a unique attribute.

Apparently, the only way they can be connected is if we combine them via the semester the class was taken. What that means is that student can take the same class multiple times.

If we want to analyze data about studnets and class, we would have to combine not two tables, but three tables to get the data we want. Further, since Completion is a week entity, when we actually write the queries, we would have to make sure specified both the partial key and the unqiue keys when we try to joun the tables.

Relational Schemas

Critical Components of a Relational Schema:

  • Tables
  • Primary keys
  • Foreign Keys

Techinical Terms:

Techinical Terms

Techinical Terms

(PK) : Primary key:

  • Only 1 column per table can be a primary key. (unless multiple columns are needed to identify each row, for example a mapping table of a many to many relationship).
  • Primary keys cannot have null values.

The critical difference between the ER diagrams and a relational schemaa, though is that although we were supposed to underline all unique attributes in the ER diagrams, only one unique column in a table of a relational schema or database can be labelled as a primary key.

The only exception is if we need multiple columns to make each row in a table unique.

The reason we only have one primary key, is that the database software uses the primary key in special ways to ensure data integrity. To maintain this integrity, primary keys are configured so that they cannot contain any missing values.

(U) : Unique column

To help us with these links it is useful to mark unique columns in our schema with a U next to their colum titles so that we know which ones can be use to link tables in our queries.

(FK) : Foreign Key

Foreign keys are used to link tables to other tables. They complish this by having values that refer to unique IDs and other tables.

  1. We indicate foreign keys in relational schemas by drawing arrows from foreign key in one table to the primary key it refers to in another table.
  2. The arrow goes from the table that has a foreign key to the table where the key comes from (so the arrow points to the primary key)
  3. Foreign keys and unique keys do not have to have the same name. But we should at least make the names seem similar as a good database designer.

Weak Entity represented in a Relational Schema

  • Example:

Each apartment is uniquely identified by a combination of the apartment number in the apartment table with the building ID in the building table.

We can discern this from the fact that both apartment ID and building ID are underlined in the apartment table, and that in addition builidng ID in this table is a foreign key to the building table.

Weak Entity represented in a Relational Schema

One to Many Relationships

Whenever we have a 1:Many relationship, the relation mapped from the entiry on the Many side of the relationship shoule get a foreign key that corresponds to the primary key of the relation on the 1 side of the relationship.

This is because we would have to have duplicate values of the unique indentifier of the entity on the 1 side in the table that represents the entity on the Many side

One to Many Relationships

A customer could have many phone numbers, but normally each phone number only corresponds one customer(except situations like a couple, family or somehting like that…)

Many to Many Relationships

  • Example

It has a table called enrolled in that only has foreign keys as attributes.

This type of mapping table has to be implemented when the entities in the corresponding ER diagram have what is called a many to many relationship.

Many to Many Relationships

Under these circumstances there is no way to follow the rule that primary keys mush not be duplicated and still included a foreign key in the same table as the primary key.

Therefore, a mapping table has nothing but one composite primary keys in it. Whenever we see a maping table like this in a database, it’s a clue of the columns in the tables that links up can have multiple instances of one another.

Question:

-[ ] 1. For relational databases to work most efficiently, which of the following requirements of set theory should be followed? Check all that apply. (Note that you will need to select more than one correct response to answer this question correctly.)

Funny Time

Never ask a database administrator to help you move the furniture. They always seem to be dropping tables.