Difference between revisions of "Database Table Overview"

From Dreamwidth Notes
Jump to: navigation, search
m (Added categories)
(no need to link to the cat Database Tables pulls in all the tables.)
 
(24 intermediate revisions by 5 users not shown)
Line 3: Line 3:
 
==Introduction==
 
==Introduction==
  
All the information that is entered into Dreamwidth (like with most web apps), gets stored in a database.  So understanding the table structure can help you understand how Dreamwidth is storing things.  That's the theory, anyway.  It also helps in debugging, extending Dreamwidth and so on.  So it's good to know, even if you're not planning on altering the tables or the interfacing code directly.
+
All the information that gets entered into Dreamwidth is, (like with most web apps) stored in a database.  It's the database that stores your journal, remembers all the comments and keeps track of all users, while the rest of Dreamwidth (the code and web based part) occupies itself with fetching things from the database to generate web pages for you to view or taking information which you've entered and getting it ready to store into the database.  Using techno-speak, the database proves the persistent data storage for the web/perl front-end.  If you want to understand what the code is doing (by understanding what data is being sent or received from the database when the code is generating web pages or storing data), or if you want to extend any part of Dreamwidth which needs some way of storing data, then you need to know about the database structure.
  
This section provides an overview on the tables, what they do and how they map to what you seeEnough information so that you get up and running, anywayThat's the plan.   
+
Pretty much all databases store their information in tables, and Dreamwidth's database is no exception.  To understand Dreamwidth's database structure, you need to understand all the tables that Dreamwidth has and also (quite importantly) how they link together. 
 +
 
 +
Dreamwidth has a lot of tables.  The full list of tables can be viewed through [[Database Tables]] and all the relationships between the tables can be viewed through [[Database Relationships]].  Because these lists are complete and exhaustive but otherwise unhelpful, this page attempts to describe all the database tables that Dreamwidth has as well as their relationships by linking them to front end concepts you will already know from using the site.  The idea is to make sure that you can quickly understand what tables correspond to the bit of code (or the DW feature) you are interested in, and whether any table is important to what you are doing.
 +
 
 +
===Understanding This Page===
 +
 
 +
This page should be usable by a people who don't know much about databases (or can't quite remember what they did when they last used one)However, depending on your skill levels, you may need to read up on some things first.  If you are a complete newbie, you should read all of these links first. 
 +
 
 +
Dreamwidth uses a slightly amped up version of the popular MySQL database.  The code talks to the database using SQL queries.  If you have used MySQL before, you should be ok with Dreamwidth.  MySQL also tends to be very good about implementing SQL, so the odds are that if you've used another database before, most (but not all) of the SQL you already know will be transferable.  There will probably be problems with small things, and these are best resolved by consulting the MySQL manual for any command you are trying to use that is giving you grief. 
 +
 
 +
If you are completely new to databases and SQL (or you want to know how to connect using a dreamhack account), please go see [[Exploring the database]], which has a lovely tutorial for guiding you through connecting to your Dreamwidth implementation and viewing some of the information in the databaseThere is nothing like entering something into a web-page and then seeing the changes reflected in the database tables to teach you what's happening and also to help debug your code, if needed. 
 +
 
 +
Database tables should also be defined, for complete newbies.  If you're thinking of the kind of tables you drew in school to represent data, then you are not far wrongTables in databases have some additional properties to worry about.  Each column has a type (character string, integer, text, date etc.) which it can be, and columns also need to know their defaults, or whether they are allowed to have entries with no data.  Finally, columns can have extra special features like being a key, or auto incrementing. 
 +
 
 +
This guide sometimes references the concept of primary and foreign keys.  A primary key in a database table is a column (although sometimes it is more than one column) which uniquely describes that entry in the database.  Primary keys are particularly useful in relational databases (of which MySQL is one) because it allows entries in other tables (or even other entries in the same table) to refer to one unique entry.  Also, the database tends to be happier if it can identify entries uniquely as well.  A column in a table B which references a primary key in table A is called a foreign key.  The presence of primary/foreign key pairs implies a relationship between entries in the two separate tables, which is why they are important.  If you understand programming, you should view the foreign key as a pointer or reference to a particular entry in a database, and the primary key as being equivalent to the location in memory.
 +
 
 +
This overview also talks a lot about the relationships between tables.  This overview uses terms that are common in relational database design, and should be familiar if you're already familiar with relational databases.  As these terms are not completely standard, you might need to check out [[Database Relationships Explained]] to make sure you understand the terminology being used.  If you are a newbie, you should definitely check out [[Database Relationships Explained]], as it explains the different types of relational database used in Dreamwidth (and commonly on the internet as well).
  
 
==Tables by Concept==
 
==Tables by Concept==
 +
 +
This section attempts to map the concepts you know (for example, from having used the site) with the database tables that represent them, to make finding things easy.
  
 
===Users===
 
===Users===
  
[[Database Table: user]]
+
Details about a user are held in the table [[Database Table: user]] and this forms the centre of how user information is stored about users.
  
The details describing a user are, shockingly, held in [[Database Table: user]].   
+
There are many other tables which help in storing information about users, many (but not all) of them prefixed by the phrase user.  We divide these other tables into three groups for ease of description:
  
There are also many other tables starting with user. Some of them map a user onto something else.  For example, user_schools maps a user (via a userid) to a school in the school database table (via a schoolid), with some extra information about the join (year started and year ended). Some of them map extra information onto user (e.g. userbio, which just maps a bio field onto a userid).
+
* One to one and one to many tables, which basically hold extra information about each user.
 +
* Link tables which link users to other concepts (like schools in the schools table, for example).
 +
* User Property Lists (see [[#Property Lists]] above).
  
Some join with each other and user and come in the form user*prop and user*proplistAll these do is define properties for something, (in the user*proplist table) and the user*prop table defines a value for that property for each user.
+
Also, there is a table dealing with linking users to users (in a graph structure) called wt_edgesThis is described further under the friends concept.
  
userprop and userproplist contain the common properties for a user.  For example, userproplist defines the property twitter and tells you that this is the twitter user account name.  If you enter a twitter account when setting up a user, this goes into userprop.
+
====Extra Information====
 +
 
 +
The tables holding extra information on a user are:
 +
 
 +
* [[Database Table: birthdays]]:  One to one mapping of the user's next birthday.  Original birthday information is held in the user table.
 +
* [[Database Table: userbio]]:  One to one mapping of a user biography onto a user.
 +
* [[Database Table: userlog]]: One to many mapping of user events (e.g. account create) onto a user, showing the time the event happened.  This table logs user activity.
 +
 
 +
====Link Tables to Other Concepts====
 +
 
 +
These tables link the concept of a user with other concepts.
 +
 
 +
* [[Database Table: user_schools]]:  Maps a user onto a school (in the schools table) with the year of starting and the year of finishing.
 +
 
 +
====User Proplists====
 +
 
 +
[[Database Table: userproplist]] and [[Database Table: userproplite2]] form a proplist pair for defining user properties (such as the name of a twitter account).
  
 
===Posts===
 
===Posts===
  
I have absolutely no clue where posts are heldNor do I know where to look. I wish I knew.  [[User:Aphenine|Aphenine]] 22:36, 28 July 2009 (UTC)
+
Thanks to the history of how Dreamwidth developed, the name for a journal (or journal posts) is in fact a log inside the databases (see [[#BradLog]] for further details)Therefore, the tables starting with log hold the journal posts, and any other tables with log in the refer to journals, unless they refer to the process of logging data that DW generates (e.g. the table userlog)As you can see, there's a little room for confusion.
 +
 
 +
Posts are split across two tables: [[Database Table: log2|log2]] and [[Database Table: logtext2|logtext2]].  The former defines the post and all its details (e.g. security settings), while the second one actually has the text that makes up a journal post.
 +
 
 +
There is also a proplist for individual journal entries in the pair of tables logproplist and logprop.  These allow the setting of properties for individual journal entries.
  
 
===Comments===
 
===Comments===
  
Comments are held in two tables, [[Database Table: talk2]] and [[Database Table: talktext2]].  The actual text is held in talktext2, and if you SELECT * FROM talktext2; in MySQL, you get to see all comments ever made in your DW server. Exciting or what?
+
Thanks to the history of how Dreamwidth developed, the name for comments is in fact talk inside the databases (and also in the name of the comment bml pages, see [[#BradLog]] for further details).
  
talk2 holds the structure of the comment, along with who posted it.  It has a field called parenttalkid, and any table with a column heading with the words parent and id in them implies that this table actually represents a tree.  This is how, in the database, is represented the lovely threading you get with comments as they recursively indent down with each new reply until you can barely read.  There are also two fields with node in them, which must play some role in how the tree is constructed, but I don't know how.  Yet.
+
Comments are held in two tables, [[Database Table: talk2]] and [[Database Table: talktext2]].  The actual text is held in talktext2, and talk2 holds the structure of the comment, along with who posted it.   
  
 
===Tags===
 
===Tags===
  
Tags are confusing because there appear to be two different systems.   
+
Thanks to the history of how Dreamwidth developed, inside the databases tags are called keywords (see [[#BradLog]] for further details).
 +
 
 +
The table keyword itself is legacy and does not appear to be used anymoreTags are instead held in userkeywords and usertags. 
 +
 
 +
The table usertags appears to be a tree but holds no tag names, while userkeywords holds tag names but no structure.
 +
 
 +
Tags are mapped to individual posts in the [[Database Table: logtags|logtags]] table, and also in the [[Database Table: logtagsrecent|logtagsrecent]].
 +
 
 +
==Appendix==
 +
 
 +
===BradLog===
  
In the first, the tags are provided by the userkeywords table, which maps a user to a keyword and a keyword IDThis allows every user to maintain their own list of tags.
+
Livejournal, the predecessor to Dreamwidth, dates from an era where the language used to describe blogging hadn't yet settled down to its current standardised formAlthough most of the code used in LJ has been updated over time to follow the new terminology, the database (especially), is rife with the old terminology (with just enough of the new to throw you off), making it very confusing.
  
In the second, tags are provided by two tables, the keywords table and the usertags tableThe keywords table is simple, mapping a keyword ID to a keywordThe mapping to a user is then done through the usertags table, which maps a user to a single keywordExcitingly, there is also a parent keyword ID field, which implies that tags can now be sorted in trees, which is awesome.
+
LJ grew out of a program written by founder Brad, called BradLogIn this, each entry was a '''log''' entry in a web log (blog)Users could '''talk''' about each log entry (comments) and each log entry could be described by '''keywords''' (tags).   
  
[[Category: Development]]
+
[[Category:Development]]
[[Category: Database Tables]]
+
[[Category:Database Tables]]

Latest revision as of 08:30, 5 February 2010

Note: This page has been started by a complete newbie, who is trying to make sense of the database table structure as she goes and is writing the information down. As a result, the details may be inaccurate and so on. Hopefully, it's still better than nothing to the equally clueless! ;-) Aphenine 22:36, 28 July 2009 (UTC)

Introduction

All the information that gets entered into Dreamwidth is, (like with most web apps) stored in a database. It's the database that stores your journal, remembers all the comments and keeps track of all users, while the rest of Dreamwidth (the code and web based part) occupies itself with fetching things from the database to generate web pages for you to view or taking information which you've entered and getting it ready to store into the database. Using techno-speak, the database proves the persistent data storage for the web/perl front-end. If you want to understand what the code is doing (by understanding what data is being sent or received from the database when the code is generating web pages or storing data), or if you want to extend any part of Dreamwidth which needs some way of storing data, then you need to know about the database structure.

Pretty much all databases store their information in tables, and Dreamwidth's database is no exception. To understand Dreamwidth's database structure, you need to understand all the tables that Dreamwidth has and also (quite importantly) how they link together.

Dreamwidth has a lot of tables. The full list of tables can be viewed through Database Tables and all the relationships between the tables can be viewed through Database Relationships. Because these lists are complete and exhaustive but otherwise unhelpful, this page attempts to describe all the database tables that Dreamwidth has as well as their relationships by linking them to front end concepts you will already know from using the site. The idea is to make sure that you can quickly understand what tables correspond to the bit of code (or the DW feature) you are interested in, and whether any table is important to what you are doing.

Understanding This Page

This page should be usable by a people who don't know much about databases (or can't quite remember what they did when they last used one). However, depending on your skill levels, you may need to read up on some things first. If you are a complete newbie, you should read all of these links first.

Dreamwidth uses a slightly amped up version of the popular MySQL database. The code talks to the database using SQL queries. If you have used MySQL before, you should be ok with Dreamwidth. MySQL also tends to be very good about implementing SQL, so the odds are that if you've used another database before, most (but not all) of the SQL you already know will be transferable. There will probably be problems with small things, and these are best resolved by consulting the MySQL manual for any command you are trying to use that is giving you grief.

If you are completely new to databases and SQL (or you want to know how to connect using a dreamhack account), please go see Exploring the database, which has a lovely tutorial for guiding you through connecting to your Dreamwidth implementation and viewing some of the information in the database. There is nothing like entering something into a web-page and then seeing the changes reflected in the database tables to teach you what's happening and also to help debug your code, if needed.

Database tables should also be defined, for complete newbies. If you're thinking of the kind of tables you drew in school to represent data, then you are not far wrong. Tables in databases have some additional properties to worry about. Each column has a type (character string, integer, text, date etc.) which it can be, and columns also need to know their defaults, or whether they are allowed to have entries with no data. Finally, columns can have extra special features like being a key, or auto incrementing.

This guide sometimes references the concept of primary and foreign keys. A primary key in a database table is a column (although sometimes it is more than one column) which uniquely describes that entry in the database. Primary keys are particularly useful in relational databases (of which MySQL is one) because it allows entries in other tables (or even other entries in the same table) to refer to one unique entry. Also, the database tends to be happier if it can identify entries uniquely as well. A column in a table B which references a primary key in table A is called a foreign key. The presence of primary/foreign key pairs implies a relationship between entries in the two separate tables, which is why they are important. If you understand programming, you should view the foreign key as a pointer or reference to a particular entry in a database, and the primary key as being equivalent to the location in memory.

This overview also talks a lot about the relationships between tables. This overview uses terms that are common in relational database design, and should be familiar if you're already familiar with relational databases. As these terms are not completely standard, you might need to check out Database Relationships Explained to make sure you understand the terminology being used. If you are a newbie, you should definitely check out Database Relationships Explained, as it explains the different types of relational database used in Dreamwidth (and commonly on the internet as well).

Tables by Concept

This section attempts to map the concepts you know (for example, from having used the site) with the database tables that represent them, to make finding things easy.

Users

Details about a user are held in the table Database Table: user and this forms the centre of how user information is stored about users.

There are many other tables which help in storing information about users, many (but not all) of them prefixed by the phrase user. We divide these other tables into three groups for ease of description:

  • One to one and one to many tables, which basically hold extra information about each user.
  • Link tables which link users to other concepts (like schools in the schools table, for example).
  • User Property Lists (see #Property Lists above).

Also, there is a table dealing with linking users to users (in a graph structure) called wt_edges. This is described further under the friends concept.

Extra Information

The tables holding extra information on a user are:

  • Database Table: birthdays: One to one mapping of the user's next birthday. Original birthday information is held in the user table.
  • Database Table: userbio: One to one mapping of a user biography onto a user.
  • Database Table: userlog: One to many mapping of user events (e.g. account create) onto a user, showing the time the event happened. This table logs user activity.

Link Tables to Other Concepts

These tables link the concept of a user with other concepts.

User Proplists

Database Table: userproplist and Database Table: userproplite2 form a proplist pair for defining user properties (such as the name of a twitter account).

Posts

Thanks to the history of how Dreamwidth developed, the name for a journal (or journal posts) is in fact a log inside the databases (see #BradLog for further details). Therefore, the tables starting with log hold the journal posts, and any other tables with log in the refer to journals, unless they refer to the process of logging data that DW generates (e.g. the table userlog). As you can see, there's a little room for confusion.

Posts are split across two tables: log2 and logtext2. The former defines the post and all its details (e.g. security settings), while the second one actually has the text that makes up a journal post.

There is also a proplist for individual journal entries in the pair of tables logproplist and logprop. These allow the setting of properties for individual journal entries.

Comments

Thanks to the history of how Dreamwidth developed, the name for comments is in fact talk inside the databases (and also in the name of the comment bml pages, see #BradLog for further details).

Comments are held in two tables, Database Table: talk2 and Database Table: talktext2. The actual text is held in talktext2, and talk2 holds the structure of the comment, along with who posted it.

Tags

Thanks to the history of how Dreamwidth developed, inside the databases tags are called keywords (see #BradLog for further details).

The table keyword itself is legacy and does not appear to be used anymore. Tags are instead held in userkeywords and usertags.

The table usertags appears to be a tree but holds no tag names, while userkeywords holds tag names but no structure.

Tags are mapped to individual posts in the logtags table, and also in the logtagsrecent.

Appendix

BradLog

Livejournal, the predecessor to Dreamwidth, dates from an era where the language used to describe blogging hadn't yet settled down to its current standardised form. Although most of the code used in LJ has been updated over time to follow the new terminology, the database (especially), is rife with the old terminology (with just enough of the new to throw you off), making it very confusing.

LJ grew out of a program written by founder Brad, called BradLog. In this, each entry was a log entry in a web log (blog). Users could talk about each log entry (comments) and each log entry could be described by keywords (tags).