Database Table Overview

From Dreamwidth Notes
Revision as of 08:30, 5 February 2010 by Exor674 (Talk | contribs)

(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)
Jump to: navigation, search

Note: This page has been started by a complete newbie, who is trying to make sense of the database table structure as she goes and is writing the information down. As a result, the details may be inaccurate and so on. Hopefully, it's still better than nothing to the equally clueless! ;-) Aphenine 22:36, 28 July 2009 (UTC)

Introduction

All the information that gets entered into Dreamwidth is, (like with most web apps) stored in a database. It's the database that stores your journal, remembers all the comments and keeps track of all users, while the rest of Dreamwidth (the code and web based part) occupies itself with fetching things from the database to generate web pages for you to view or taking information which you've entered and getting it ready to store into the database. Using techno-speak, the database proves the persistent data storage for the web/perl front-end. If you want to understand what the code is doing (by understanding what data is being sent or received from the database when the code is generating web pages or storing data), or if you want to extend any part of Dreamwidth which needs some way of storing data, then you need to know about the database structure.

Pretty much all databases store their information in tables, and Dreamwidth's database is no exception. To understand Dreamwidth's database structure, you need to understand all the tables that Dreamwidth has and also (quite importantly) how they link together.

Dreamwidth has a lot of tables. The full list of tables can be viewed through Database Tables and all the relationships between the tables can be viewed through Database Relationships. Because these lists are complete and exhaustive but otherwise unhelpful, this page attempts to describe all the database tables that Dreamwidth has as well as their relationships by linking them to front end concepts you will already know from using the site. The idea is to make sure that you can quickly understand what tables correspond to the bit of code (or the DW feature) you are interested in, and whether any table is important to what you are doing.

Understanding This Page

This page should be usable by a people who don't know much about databases (or can't quite remember what they did when they last used one). However, depending on your skill levels, you may need to read up on some things first. If you are a complete newbie, you should read all of these links first.

Dreamwidth uses a slightly amped up version of the popular MySQL database. The code talks to the database using SQL queries. If you have used MySQL before, you should be ok with Dreamwidth. MySQL also tends to be very good about implementing SQL, so the odds are that if you've used another database before, most (but not all) of the SQL you already know will be transferable. There will probably be problems with small things, and these are best resolved by consulting the MySQL manual for any command you are trying to use that is giving you grief.

If you are completely new to databases and SQL (or you want to know how to connect using a dreamhack account), please go see Exploring the database, which has a lovely tutorial for guiding you through connecting to your Dreamwidth implementation and viewing some of the information in the database. There is nothing like entering something into a web-page and then seeing the changes reflected in the database tables to teach you what's happening and also to help debug your code, if needed.

Database tables should also be defined, for complete newbies. If you're thinking of the kind of tables you drew in school to represent data, then you are not far wrong. Tables in databases have some additional properties to worry about. Each column has a type (character string, integer, text, date etc.) which it can be, and columns also need to know their defaults, or whether they are allowed to have entries with no data. Finally, columns can have extra special features like being a key, or auto incrementing.

This guide sometimes references the concept of primary and foreign keys. A primary key in a database table is a column (although sometimes it is more than one column) which uniquely describes that entry in the database. Primary keys are particularly useful in relational databases (of which MySQL is one) because it allows entries in other tables (or even other entries in the same table) to refer to one unique entry. Also, the database tends to be happier if it can identify entries uniquely as well. A column in a table B which references a primary key in table A is called a foreign key. The presence of primary/foreign key pairs implies a relationship between entries in the two separate tables, which is why they are important. If you understand programming, you should view the foreign key as a pointer or reference to a particular entry in a database, and the primary key as being equivalent to the location in memory.

This overview also talks a lot about the relationships between tables. This overview uses terms that are common in relational database design, and should be familiar if you're already familiar with relational databases. As these terms are not completely standard, you might need to check out Database Relationships Explained to make sure you understand the terminology being used. If you are a newbie, you should definitely check out Database Relationships Explained, as it explains the different types of relational database used in Dreamwidth (and commonly on the internet as well).

Tables by Concept

This section attempts to map the concepts you know (for example, from having used the site) with the database tables that represent them, to make finding things easy.

Users

Details about a user are held in the table Database Table: user and this forms the centre of how user information is stored about users.

There are many other tables which help in storing information about users, many (but not all) of them prefixed by the phrase user. We divide these other tables into three groups for ease of description:

  • One to one and one to many tables, which basically hold extra information about each user.
  • Link tables which link users to other concepts (like schools in the schools table, for example).
  • User Property Lists (see #Property Lists above).

Also, there is a table dealing with linking users to users (in a graph structure) called wt_edges. This is described further under the friends concept.

Extra Information

The tables holding extra information on a user are:

  • Database Table: birthdays: One to one mapping of the user's next birthday. Original birthday information is held in the user table.
  • Database Table: userbio: One to one mapping of a user biography onto a user.
  • Database Table: userlog: One to many mapping of user events (e.g. account create) onto a user, showing the time the event happened. This table logs user activity.

Link Tables to Other Concepts

These tables link the concept of a user with other concepts.

User Proplists

Database Table: userproplist and Database Table: userproplite2 form a proplist pair for defining user properties (such as the name of a twitter account).

Posts

Thanks to the history of how Dreamwidth developed, the name for a journal (or journal posts) is in fact a log inside the databases (see #BradLog for further details). Therefore, the tables starting with log hold the journal posts, and any other tables with log in the refer to journals, unless they refer to the process of logging data that DW generates (e.g. the table userlog). As you can see, there's a little room for confusion.

Posts are split across two tables: log2 and logtext2. The former defines the post and all its details (e.g. security settings), while the second one actually has the text that makes up a journal post.

There is also a proplist for individual journal entries in the pair of tables logproplist and logprop. These allow the setting of properties for individual journal entries.

Comments

Thanks to the history of how Dreamwidth developed, the name for comments is in fact talk inside the databases (and also in the name of the comment bml pages, see #BradLog for further details).

Comments are held in two tables, Database Table: talk2 and Database Table: talktext2. The actual text is held in talktext2, and talk2 holds the structure of the comment, along with who posted it.

Tags

Thanks to the history of how Dreamwidth developed, inside the databases tags are called keywords (see #BradLog for further details).

The table keyword itself is legacy and does not appear to be used anymore. Tags are instead held in userkeywords and usertags.

The table usertags appears to be a tree but holds no tag names, while userkeywords holds tag names but no structure.

Tags are mapped to individual posts in the logtags table, and also in the logtagsrecent.

Appendix

BradLog

Livejournal, the predecessor to Dreamwidth, dates from an era where the language used to describe blogging hadn't yet settled down to its current standardised form. Although most of the code used in LJ has been updated over time to follow the new terminology, the database (especially), is rife with the old terminology (with just enough of the new to throw you off), making it very confusing.

LJ grew out of a program written by founder Brad, called BradLog. In this, each entry was a log entry in a web log (blog). Users could talk about each log entry (comments) and each log entry could be described by keywords (tags).