database normalization question down to 3nf i think

Hello, i am building a small database of restaurants

RestID (1)
RestName (Sals Pizza)
RestPhone# (555.555.5555)
RestStreet (55 Main Street)
RestTown (Boston)
RestZip (01246)
RestState (MA)
RestType (italion, chinese etc)
RestHours ( 9am to 5pm)
RestRating (5 Stars)
RestMenu (Sals_Menu.jpg)

I think i am in 3rd Normal Form but i am still a bit confused if i went a step too far in Normalization.

Here are my tables

Table 1 Restaurant
RestID | RestName | RestPhone | RestStreet | RestRating | RestMenu | RestHours|

Table 2 Restaurant Type
TypeID | RestType |

Table 3 RelationTable (Restaurant to Restaurant Type)
RelationID | RestID | TypedID |

Table 4 Restaurant Town
TownID | RestTown | RestZip | RestState |

Table 5 RelationTable (Restaurant to Restaurant Town)
RelationID | RestID | TownID |

This is as far as i am, i am a bit worried that the two relation tables are creating an extra level of seperation i dont really need but without them i return to one big table with a lot of redundancies.

Can anyone offer some assistance. i am not looking for answers here, just a nod yes your moving in the right direction or no your an idiot you dont need Table "N". A little reasoning behind why or why not would also be grateful. Thank you for your time
[1426 byte] By [storeyed] at [2007-11-18 2:03:47]
# 1 Re: database normalization question down to 3nf i think
You are right you did make a few mistakes here. The point of normalizing is to create a table of common long items such as say City, State

Charlotte NC
New York New York

and producing a method to create a ruseable value that makes data storage more compact and logical.

So a citystate table would make

1 Charlotte NC
2 New York New York

And in my main table I just have a direct reference to the CityState where for Charlotte the value would be 1.

With this

Table 3 RelationTable (Restaurant to Restaurant Type)
RelationID | RestID | TypedID |

you actually are going a little over board. It would was normalized enough for this with Table 2 Resturant Type then just add a TypeID field to Table 1 to make direct reference. As is means extra storage needs and more work to join the elements.

Now I also suggest because resturant hours are a condition of opens at and closes at that you break RestHours in table 1 into two RestOpen | RestClose, more logical that way.

Ratings I will assume is the old count rating so no need to try to normalize there but depending on the tpye of DB you are on you may want to put a constraint on the table to prevent outside ranges (note: some DBs don't have this functionality).

Now with

Table 4 Restaurant Town
TownID | RestTown | RestZip | RestState |

as there can be Mutliple Zips for the same Town, State combination it is considered by most best normalized in the main table so move RestZip back to table 1, more or less the reason is mapping a number to a number is considered poor management of data and unless you can reuse often it has no bennifits.

Now with that last comment in mind it may or may not make sense to actually have Table 4, but lets say for sake it is. Then for the same reason you don't need Table 3 you don't need Table 5

Instead put TownStateID in the Table 1 to minimalize join requirements (also the Reason I say change to TownStateID is it is more logical).

Now normalizing further you would consider posible Table 4 moving State to it's own table and referencing in Table 4 the Identity value for state. But because state abbreviations are standard you actually would be wasting space and creating an unneccessary join. The reason is i you use state abbrevitions you always have 2 characters which will take up 2 bits. Even if you created another table with a Numbered column to map state abbreviation to you actually are not saving but at te most 1 bit per record and you would have to have at least 150 records to break even plus the join requirements make record numbers over that more costly on the query engine and it is not absolutely needed (that choice thou is your to make).

Other than that if you would have a high number of repeating values for Resturant name there is not need to normalize further. Again there would be no space savings and would not make logical sense otherwise.

Hopefully I was helpfull. Also, if you have further questions on structure, datatypes, indexing it would be helpfull to include the type of Database you are using (Access, SQL, Oracle, MySQL, Sybase, Informix, etc).
antares686 at 2007-11-9 13:33:03 >
# 2 Re: database normalization question down to 3nf i think
wow thanks i feared i was getting a bit to outta of hand with the referance table and the two ids

so your sugesting make city and state one value as in

CityState
CityStateIndex | CityStateName
1 Boston, MA

is there an advantage over this approach then with
a citystate table with 3 values
example: 1 | Boston | MA

i supose in essence it doesnt really matter either way
a differance of a text field of say 16 and a text field of 2
VS one text field of 18 chars.

i have a lot to chew with this thanks for the assistance i really appreciate it

i am planning on using mySQL with php here
ive been a software engineer working in visual c++ for 2 years now and have gotten a bit out of touch with the web and databases so this is my first test example

again thanks a lot, i warn you i might be coming back with more questions also you mentioned

"as there can be Mutliple Zips for the same Town, State combination it is considered by most best normalized in the main table so move RestZip back to table 1,"

is there a web site or referance or even book that would offer some insight into these standards. the only book i have is my college database book by Date. which while is very informative, becoming more like a turkey dinner then an interesting read at times.
storeyed at 2007-11-9 13:34:01 >
# 3 Re: database normalization question down to 3nf i think
Sorry, I did confuse you there. I would have the 3 columns, just for proper logical data seperation.

Not up on MySQL yet and just haven't found the time to play with since my backgrounds is SQL Server and recently I stumbled onto some awesome new understanding of the servers methods of storage and overall workings. I plan to pick MySQL apart in the future thou.

As for books, cannot really suggest anything in that realm as I have picked up most everything I know from examining everything I write and do and the interaction I have with the forums at www.sqlservercentral.com (a SQL Server site). Don't know of any good user forums for MySQL.

:cool:
antares686 at 2007-11-9 13:34:59 >
# 4 Re: database normalization question down to 3nf i think
Ok sorry for not replying sooner

I have what i think is the correct normalization of my Restaurant DB. I have reworked it into 3 tables now.

The issue that has given me issues and i am still a bit worried about is Restaurant type. The issue is if a Restaurant served say Italion and Chinese.

Note: this will be with mySQL and PHP

Table 1 : Restaurant

RestID|RName|RStreet|CityStateID|RZip|RPhone#|ROpen|RClose|RMenu|RDelivery?|RRating|

Table 2: City and State
RCityState|City|State|

Table 3: Restaurant Type
TypeID|RestID|RestType|

The questions i have are

A) is Table 1 too big. Its not like i am building a monster Dbase of all Restaurants in America but it seems rather large. In your experiance would you reduce this table just for size purposes.

B) The Restaurant Type. Again if a Restaurant has 2 types ie. Chinese and Italion is this the way to handle it. The RestID allows for a connection between the two tables.

C) Joins the area with the most overhead (i think) joining any or all of these tables keeps it small and doesnt look like too much of an issue. I guess thats the positive of the size of Table 1.

and finally
D) Has this been reduced all the way down to its core. Normalization is a bit of a "Too little or Too much is a bad thing" and so any further reduction would just be useless.

Thanks again
Ed
storeyed at 2007-11-9 13:36:04 >
# 5 Re: database normalization question down to 3nf i think
Ok sorry for not replying sooner

I have what i think is the correct normalization of my Restaurant DB. I have reworked it into 3 tables now.

The issue that has given me issues and i am still a bit worried about is Restaurant type. The issue is if a Restaurant served say Italion and Chinese.

Note: this will be with mySQL and PHP

Table 1 : Restaurant

RestID|RName|RStreet|CityStateID|RZip|RPhone#|ROpe
n|RClose|RMenu|RDelivery?|RRating|

Table 2: City and State
RCityState|City|State|

Table 3: Restaurant Type
TypeID|RestID|RestType|

The questions i have are

A) is Table 1 too big. Its not like i am building a monster Dbase of all Restaurants in America but it seems rather large. In your experiance would you reduce this table just for size purposes.

Table should be fine in size as long as the mySQL row length limitations are not passed.
However is there any specific data that would cause large sums of columsn to be repeated.
For example does RMenu have a single menu item and thus multiple rows for the Resturant.
If so then do the following Add a menu table similar to so (sorry I will use my normal conventions here)

tblMenu
MenuID | MenuItem

Then a relationship table for Resturant to Menu

tblRestToMenu
RestID | MenuID | ItemCost (Note: if item cost is same at all resturants then consider having in tblMenu table, if not this is here more logical. That is if you are even storing)

B) The Restaurant Type. Again if a Restaurant has 2 types ie. Chinese and Italion is this the way to handle it. The RestID allows for a connection between the two tables.

There are a few ways to handle. Now that we have backed up and I can see things better then here is what I suggest.

Table 3: Restaurant Type
TypeID|RestID|RestType|

becomes

Table 3: Restaurant Type
TypeID|RestType

and add a table

tblRestToType
RestID | TypeID

An additional note, although this would be properly normalized by most there is another option. It involves bit masking and is good for small list.

Concept is this: Say I have 3 types of resturants

Italian, Chinese, Mexican

Then I may opt for bit masking

tblType
TypeID | Type Bit value
1 Italian 001
2 Chinese 010
4 Mexican 100

Ok now say the resturant is Chinese and Italian I use the bit mask to set the significant bits like so

001 + 010 = 011

So I store in the TypeID field in resturant the value 3 (bit value is 11)

This can be a little more complicated but quite powerful, especially by removing an extra table.

However, unless you are extremely familiar with querying this type I suggest stick with the extra table.

Later then come back and play with the concept and see what it can do for you when you are comfortable with it.

C) Joins the area with the most overhead (i think) joining any or all of these tables keeps it small and doesnt look like too much of an issue. I guess thats the positive of the size of Table 1.

Joins are where your overhead will come in. However good indexes and planning on a good DB server can make it almost unnoticeable (mySQL is amoung those). But if you notice bad delays in results it may make sense to denomarlize things a bit for performance considerations.

D) Has this been reduced all the way down to its core. Normalization is a bit of a "Too little or Too much is a bad thing" and so any further reduction would just be useless.

Maybe, maybe not. Make your choice of changes, step back and see if there would be a situation where a large number of the same data would repeat.
For Example if the RestName in the Resturant table would have a large number of duplicate values some would suggest moving to another table

tblRestName
RestNameID | RestName

and replace RName in resturant table with RestNameID, but this is more of a storage concern (which normalization is all about). After you normalize and build you may decide against this for performance reason (that is denormalization).

Hope this helps.
antares686 at 2007-11-9 13:37:02 >