Large Databases

Hey everyone,

I'm currently designing an online database and backend using PHP/MySQL, where a client-side program uploads a large amount of data for each user. I'm rather new to MySQL programming and am wondering about the best way to go about this. If some of you play Team Fortress 2 you'll understand; otherwise, just assume each class is a different player type.

Basically I have 44 fields for every class, plus an ID and name for each, which comes to 46. With 10 classes that brings us to 460 fields, plus possibly a few more. So I ask you this: would the best way to store this data be one table that is 460+ fields wide, or should I store the data in 10 separate tables of 46 fields each? Note that when the page loads, all (or at least most) of the data will have to be retrieved.

Is it unusual for MySQL to have tables with potentially 100+ rows of 460+ fields each?

If anyone has a better way to go about this, I'm willing to listen!

Thanks,

Smark

12 Replies

Hmmm… 460 fields per record is a bit extreme. Read up on database normalisation. It's the concept of breaking a record into bits of linked information.

http://en.wikipedia.org/wiki/Database_normalization

Thus, have a table containing your list of users, but link that to a class table. The class table has the 10 classes of info (i.e. 10 records for each userID, each record with a different ID for each class).

If you want to extract information, you use a JOIN to link the data sets together.
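A minimal sketch of what deckert describes, using Python's built-in sqlite3 as a stand-in for MySQL (the table and column names here are made up for illustration, not taken from Smark's schema):

```python
import sqlite3

# In-memory database stands in for MySQL; schema names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE class_stats (
    user_id INTEGER NOT NULL REFERENCES users(id),
    class   TEXT    NOT NULL,
    kills   INTEGER NOT NULL DEFAULT 0,
    deaths  INTEGER NOT NULL DEFAULT 0
);
""")
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Smark')")
conn.executemany(
    "INSERT INTO class_stats (user_id, class, kills, deaths) VALUES (?, ?, ?, ?)",
    [(1, "Scout", 34, 32), (1, "Medic", 13, 64)],
)

# A JOIN links the user row back to its class rows.
rows = conn.execute("""
    SELECT u.name, c.class, c.kills, c.deaths
    FROM users u
    JOIN class_stats c ON c.user_id = u.id
    WHERE u.name = 'Smark'
    ORDER BY c.class
""").fetchall()
print(rows)
```

The point is that each user gets one row per class rather than one 460-field-wide row, and the JOIN reassembles whatever slice of the data the page needs.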

–deckert

This is the type of thing I was looking for, I'm going to look into it… Anyone else have any other opinions or suggestions?

Thanks,

Smark

Are the 44 fields related to each other?

Are the ten sets of 46 fields the same for each class?

Can a user have more than one character?

@Smark:

Basically I have 44 fields for every class

Unlikely. You probably have 44 attributes for each class; an attribute isn't a field. I don't know Team Fortress, so I'll just use a pseudo-RPG for examples.

So a simple table could be:

classname,attributename,value

With this you'd have multiple rows for each class

Fighter,Strength,10

Fighter,Speed,20

Fighter,Dexterity,15

and so on. To get all the fighter attributes you'd do "select attributename,value from details where classname='Fighter'".

Now this is where you'd start to normalise data; you'd have one table listing classname and associating it with a number and maybe other info. So

1, Fighter, "This character has two handed swords as a feat"

2, Rogue, "Strong but clumsy"

3, Elf, "Tolkien is spinning in his grave"

You'd have another table listing attribute name and a number

1, Strength

2, Speed

3, Dexterity

Now your main table would be

1,1,10

1,2,20

1,3,15

2,1,25

2,2,5

2,3,10

3,1,5

3,2,30

3,3,40

Doesn't mean anything to humans, but this is a database, and with joins against the lookup tables we've defined 3 classes and 3 attributes for each class.

Your main table would be 440 rows long (your 10 classes with 44 attributes each).

This is a very small database :-)
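The three-table layout described above can be sketched like this (illustrative only; sqlite3 stands in for MySQL, and the table names are my own):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE classes    (id INTEGER PRIMARY KEY, name TEXT, notes TEXT);
CREATE TABLE attributes (id INTEGER PRIMARY KEY, name TEXT);
-- The main table holds only numeric triples: class_id, attribute_id, value.
CREATE TABLE details    (class_id INTEGER, attribute_id INTEGER, value INTEGER);
""")
conn.executemany("INSERT INTO classes VALUES (?, ?, ?)", [
    (1, "Fighter", "This character has two handed swords as a feat"),
    (2, "Rogue",   "Strong but clumsy"),
])
conn.executemany("INSERT INTO attributes VALUES (?, ?)", [
    (1, "Strength"), (2, "Speed"), (3, "Dexterity"),
])
conn.executemany("INSERT INTO details VALUES (?, ?, ?)", [
    (1, 1, 10), (1, 2, 20), (1, 3, 15),
    (2, 1, 25), (2, 2, 5),  (2, 3, 10),
])

# Joining the two lookup tables turns the numeric triples back into readable rows.
rows = conn.execute("""
    SELECT c.name, a.name, d.value
    FROM details d
    JOIN classes    c ON c.id = d.class_id
    JOIN attributes a ON a.id = d.attribute_id
    WHERE c.name = 'Fighter'
    ORDER BY a.id
""").fetchall()
print(rows)
```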

Long is usually a worse performer than wide. Long is better for logs. If there really are a fixed 44 attributes, 44 fields would probably be better.

If you want performance, then don't use a database at all. Since we're talking about tiny datasets, store them in plain text files and load them into memory. I'd do it in perl with hashes ($character{Thief}{Agility}=30;) :-) :-)

But seriously, the problem with creating a gazillion fields is one of scalability, especially if there's a large number of attributes that could be selected from. What if character A's attributes are different to character B's? Maybe there's 20 in common, but the other 20-odd are different; now the table has to be 60+ fields wide, a lot of which are empty. It's just bad database design. Conversely the design I described is scalable, extensible, normalized, maintainable and all other good stuff :-)

(Actually, I am semi-serious about the perl hash; this is too small and too simple to use a database for; the overhead a database introduces is amazing)
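For the curious, the same in-memory-hash idea translates directly to Python (a sketch; the file name and values are made up, mirroring sweh's Perl example):

```python
import json

# Nested dict plays the role of the Perl hash: character -> attribute -> value.
character = {"Thief": {"Agility": 30, "Strength": 8}}

# Persist to a plain text file and load it back; no database involved.
with open("characters.json", "w") as f:
    json.dump(character, f)

with open("characters.json") as f:
    loaded = json.load(f)

print(loaded["Thief"]["Agility"])
```

As the follow-up reply notes, this only holds up while a single process owns the file.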

@sweh:

If you want performance, then don't use a database at all. Since we're talking about tiny datasets, store them in plain text files and load them into memory.

Which is fine until you have two programs or instances accessing and modifying the data. At that point you either convert to a real DB or come up with your own half-assed locking and sharing scheme, which will make you crazy until you give up and convert to a real DB. And the performance thing is mostly irrelevant: the dataset, as described, is so small[1] that it will live in memory regardless of the implementation method. Let the OS and DB worry about caching and disk write-out; after all, a few people have worked on this problem. The footprint and setup of something like SQLite isn't all that big if you're loading Perl anyway.

Steve

[1] Assuming that the 40+ fields aren't all multi-megabyte BLOBS, that is.

Thanks for the help everyone… After posting on a few other forums and talking with some people, it looks like my post misled everyone…

I ended up making a table something like this:

Name     ID  Class     Kills  Deaths
------------------------------------
Smark     1  Scout        34      32
Smark     1  Medic        13      64
Smark     1  Heavy        76      12
Smark     1  Spy          24       9
Smark     1  Engineer     67      12
Smark     1  Demoman      12      53
Smark     1  Pyro         64      23
Smark     1  Soldier      12      12
Smark     1  Total       302     217
Reject    2  Scout       324     214
Reject    2  Medic       352     242
Reject    2  Heavy       894     252
Reject    2  Spy         124      25
Reject    2  Engineer     96      47
Reject    2  Demoman     121     256
Reject    2  Pyro        134     132
Reject    2  Soldier     324     231
Reject    2  Total      2369    1399

The table goes on and on with 10 entries per player, and it extends further to the right with more fields; I just showed kills and deaths to give you an idea of how it works…

@sweh:

If you want performance, then don't use a database at all.

Or learn how to use one? I do databases all day long, so I'll have to take some issue with your statement. :)

@sweh:

Since we're talking about tiny datasets, store them in plain text files and load them into memory.

If he's not worried about multiple processes accessing it, that should be ok.

@sweh:

But seriously, the problem with creating a gazillion fields is one of scalability, especially if there's a large number of attributes that could be selected from. What if character A's attributes are different to character B's?

Which was my original question. If A's attributes are different from B's, it would probably be a good idea to have a separate TABLE for each one, with a one-to-one relationship to the main TABLE. Then a VIEW could join them all together, which is especially helpful for the common fields.
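A sketch of that one-table-per-class layout with a joining VIEW (illustrative names throughout; sqlite3 stands in for MySQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE players (id INTEGER PRIMARY KEY, name TEXT);

-- One table per class, one-to-one with players on player_id.
-- Each class table can carry class-specific fields (e.g. heals for Medic).
CREATE TABLE scout_stats (player_id INTEGER PRIMARY KEY REFERENCES players(id),
                          kills INTEGER, deaths INTEGER);
CREATE TABLE medic_stats (player_id INTEGER PRIMARY KEY REFERENCES players(id),
                          kills INTEGER, deaths INTEGER, heals INTEGER);

-- A VIEW joins the class tables back together over the common fields.
CREATE VIEW player_summary AS
    SELECT p.name, s.kills AS scout_kills, m.kills AS medic_kills
    FROM players p
    LEFT JOIN scout_stats s ON s.player_id = p.id
    LEFT JOIN medic_stats m ON m.player_id = p.id;
""")
conn.execute("INSERT INTO players VALUES (1, 'Smark')")
conn.execute("INSERT INTO scout_stats VALUES (1, 34, 32)")
conn.execute("INSERT INTO medic_stats VALUES (1, 13, 64, 500)")

rows = conn.execute("SELECT * FROM player_summary").fetchall()
print(rows)
```

The LEFT JOINs mean a player who has never played a class simply gets NULLs for that class's columns instead of forcing empty fields into one wide table.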

@sweh:

Conversely the design I described is scalable, extensible, normalized, maintainable and all other good stuff :-)

scalable - Not really. In practicality, such a TABLE would probably kill performance especially if any sort of aggregation was required.

extensible - If, and only if, only one value was required per entry.

normalized - Absolutely not. In this TABLE all values must be of the same data type.

maintainable - Let me guess, you're a coder who has never touched a database before, and figures as long as the statement can be written, the database is maintainable?

@sweh:

(Actually, I am semi-serious about the perl hash; this is too small and too simple to use a database for; the overhead a database introduces is amazing)

A database introduces overhead? I think you mean the connection adds it to the code. A DB connection adds no more to code than any library's API does.

Sweh, I don't mean to knock you here. It just sounds as if you don't usually touch databases too often, so even if the logic is correct, as far as databases are concerned, a long TABLE is definitely not the way to go.

Now, the OP wants to use MySQL, so I think all he wants is quick data access anyway and not a real database. But that is another story.

"Sweh, I don't mean to knock you here. It just sounds as if you don't usually touch databases too often, so even if the logic is correct, as far as databases are concerned, a long TABLE is definitely not the way to go."

Yeah well. I've only got 12 years of commercial experience, mostly with Oracle (I used to DBA Oracle 7.1.6 and Oracle 8.1.3), recently with postgres. Complicated SQL still causes my brain to leak out of my ears, though, so I tend to do most logic in perl (which helps with portability as well). My current application consists of 50-odd tables, the longest of which is a couple of million rows long by 30 fields wide.

Your objections… Scalable? I was talking about scalability in terms of the number of attributes; a 460-field-wide table is bad. Extensible? You can have multiple rows per attribute (not required in this problem), and you can add more attributes without needing to modify the schema. Normalized? Absolutely YES it is ("Name" and "Class" should both be lookups into secondary tables; I'm not sure you know what a normalized dataset actually is). Maintainable? Schema changes are bad when it comes to maintainability, and my design didn't require any.

But, at heart, I'm a unix geek and for data this small I'd still do it with text files :-) (locking is not an issue; it's a trivial task; done this too many times; saving is already atomic and the solution automatically provides read consistency)

But, shrug, to each their own. There's more than one way to skin a cat.

I disagree with you, but to each their own. I guess we'll have to leave it at that.

@Smark:

Thanks for the help everyone… After posting on a few other forums and talking with some people, it looks like my post misled everyone…

I ended up making a table something like this:

Name     ID  Class     Kills  Deaths
------------------------------------
Smark     1  Scout        34      32
Smark     1  Medic        13      64
Smark     1  Heavy        76      12
Smark     1  Spy          24       9
Smark     1  Engineer     67      12
Smark     1  Demoman      12      53
Smark     1  Pyro         64      23
Smark     1  Soldier      12      12
Smark     1  Total       302     217
Reject    2  Scout       324     214
Reject    2  Medic       352     242
Reject    2  Heavy       894     252
Reject    2  Spy         124      25
Reject    2  Engineer     96      47
Reject    2  Demoman     121     256
Reject    2  Pyro        134     132
Reject    2  Soldier     324     231
Reject    2  Total      2369    1399

The table goes on and on with 10 entries per player, and it extends further to the right with more fields; I just showed kills and deaths to give you an idea of how it works…

I think you could still do with some work on that design: there is repetition, and repetition has repercussions down the line. For example, if a player changes their name you have to update every row that contains their name.

I would start with something like this

One table for player info:

id

name

Another table to store kills info:

id

owner_id

class

kills

deaths

owner_id is a foreign key to id in the player table.
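In concrete terms that might look like the following (a sketch, with sqlite3 standing in for MySQL; note the "Total" row from the earlier table doesn't need to be stored at all, since SUM can compute it on demand):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE kills (
    id       INTEGER PRIMARY KEY,
    owner_id INTEGER NOT NULL REFERENCES player(id),
    class    TEXT    NOT NULL,
    kills    INTEGER NOT NULL,
    deaths   INTEGER NOT NULL
);
""")
conn.execute("INSERT INTO player VALUES (1, 'Smark')")
conn.executemany(
    "INSERT INTO kills (owner_id, class, kills, deaths) VALUES (?, ?, ?, ?)",
    [(1, "Scout", 34, 32), (1, "Medic", 13, 64), (1, "Heavy", 76, 12)],
)

# The per-player total is derived, not stored: aggregate it when needed.
total = conn.execute("""
    SELECT p.name, SUM(k.kills), SUM(k.deaths)
    FROM player p JOIN kills k ON k.owner_id = p.id
    GROUP BY p.id
""").fetchone()
print(total)
```

The player's name lives in exactly one row, so renaming a player is a single UPDATE.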
