Difference between revisions of "Schema.doc (MediaWiki)"

from HTYP, the free directory anyone can edit if they can prove to me that they're not a spambot
Jump to navigation Jump to search
m (belated post-import cleanup (partial))
m (→‎Contents: more cleanup)
 
Line 1: Line 1:
 
{{mediawiki dev doc|Schema.doc}}
 
{{mediawiki dev doc|Schema.doc}}
==Contents== The most up-to-date schema for the tables in the database
+
==Contents==
should always be "tables.sql" in the maintenance directory, which is
+
The most up-to-date schema for the tables in the database should always be "tables.sql" in the maintenance directory, which is called from the installation script. Here are a few highlights that may be out of date:
called from the installation script. Here are a few highlights that may
+
===user===
be out of date: ===user===
 
 
Wikipedia users
 
Wikipedia users
 
*'''user_id'''
 
*'''user_id'''
 
*:integer, primary key, autoincrement
 
*:integer, primary key, autoincrement
 
*'''user_name'''
 
*'''user_name'''
*:Usernames must be unique, must not be in the form of an IP address. _Shouldn't_ allow slashes or case conflicts. Spaces are allowed, and
+
*:Usernames must be unique, must not be in the form of an IP address. _Shouldn't_ allow slashes or case conflicts. Spaces are allowed, and are _not_ converted to underscores like titles. (Conflicts?)
are _not_ converted to underscores like titles. (Conflicts?)
 
 
*'''user_password'''
 
*'''user_password'''
 
*:Hash of current password.
 
*:Hash of current password.
Line 15: Line 13:
 
*:Generated for mail-a-new-password feature
 
*:Generated for mail-a-new-password feature
 
*'''user_email'''
 
*'''user_email'''
*:Note -- email should be restricted, not public info. Same with
+
*:Note -- email should be restricted, not public info. Same with passwords. ;)
passwords. ;)
 
 
*'''user_options'''
 
*'''user_options'''
 
*:Newline-separated list of name=value pairs.
 
*:Newline-separated list of name=value pairs.
 
*'''user_token'''
 
*'''user_token'''
*:A pseudorandomly generated value that is stored in a cookie when the
+
*:A pseudorandomly generated value that is stored in a cookie when the "remember password" feature is used (previously, a hash of the password was used, but this was vulnerable to cookie-stealing attacks)
"remember password" feature is used (previously, a hash of the password
 
was used, but this was vulnerable to cookie-stealing attacks)
 
 
===cur===
 
===cur===
 
Wikipedia "current" articles
 
Wikipedia "current" articles
Line 28: Line 23:
 
*:integer, primary key, autoincrement
 
*:integer, primary key, autoincrement
 
*'''cur_namespace'''
 
*'''cur_namespace'''
*:integer index into list of namespaces. See the Namespace class for
+
*:integer index into list of namespaces. See the Namespace class for more details.
more details.
 
 
*'''cur_title'''
 
*'''cur_title'''
*:Title of article (in dbkey form--see Title), without namespace. The
+
*:Title of article (in dbkey form--see Title), without namespace. The combination of namespace,title should be unique in this table.
combination of namespace,title should be unique in this table.
 
 
*'''cur_text'''
 
*'''cur_text'''
 
*:Wikitext of the article.
 
*:Wikitext of the article.
Line 166: Line 159:
 
*:Creation (or refresh) date in standard YMDHMS form. IP blocks expire automatically.
 
*:Creation (or refresh) date in standard YMDHMS form. IP blocks expire automatically.
 
*'''ipb_auto'''
 
*'''ipb_auto'''
*:Indicates that the IP address was banned because a banned user
+
*:Indicates that the IP address was banned because a banned user accessed a page through it. If this is 1, ipb_address will be hidden.
accessed a page through it. If this is 1, ipb_address will be hidden.
 
 
===site_stats===
 
===site_stats===
 
Site-wide statistics
 
Site-wide statistics
 
*'''ss_row_id'''
 
*'''ss_row_id'''
*:Token for where clauses. There's only one row in this table. At some
+
*:Token for where clauses. There's only one row in this table. At some point we might want to use a date here so we can get stats-by-date.
point we might want to use a date here so we can get stats-by-date.
 
 
*'''ss_total_views'''
 
*'''ss_total_views'''
 
*:Number of total views of all pages.
 
*:Number of total views of all pages.
Line 180: Line 171:
 
*:Number of "countable" articles.
 
*:Number of "countable" articles.
 
===hitcounter===
 
===hitcounter===
Stores an ID for every time any article is visited; depending on
+
Stores an ID for every time any article is visited; depending on $wgHitcounterUpdateFreq, it is periodically cleared and the cur_counter column in the cur table updated for the all articles that have been visited.
$wgHitcounterUpdateFreq, it is periodically cleared and the cur_counter
 
column in the cur table updated for the all articles that have been
 
visited.
 
 
*'''hc_id'''
 
*'''hc_id'''
 
*:The ID of an article, representing one hit
 
*:The ID of an article, representing one hit
Line 193: Line 181:
 
*'''wl_namespace'''
 
*'''wl_namespace'''
 
*:Namespace -> cur_namespace
 
*:Namespace -> cur_namespace
*:Note that these should only include even-numbered namespaces for
+
*:Note that these should only include even-numbered namespaces for regular pages; associated talk pages (odd numbered namespaces) are folded in.
regular pages; associated talk pages (odd numbered namespaces) are
 
folded in.
 
 
*'''wl_title'''
 
*'''wl_title'''
 
*:Page title -> cur_title
 
*:Page title -> cur_title
*:Note also that the linked page may not exist in page or talk
+
*:Note also that the linked page may not exist in page or talk namespace, or at all.
namespace, or at all.
 
 
===searchindex===
 
===searchindex===
 
Used for MySQL fulltext searching
 
Used for MySQL fulltext searching
Line 215: Line 200:
 
*:The URL of the wiki, with "$1" as a placeholder for an article name
 
*:The URL of the wiki, with "$1" as a placeholder for an article name
 
*'''iw_local'''
 
*'''iw_local'''
*:A boolean value indicating whether the wiki is in this project (used,
+
*:A boolean value indicating whether the wiki is in this project (used, for example, to detect redirect loops)
for example, to detect redirect loops)
+
 
 
==Edit Log==
 
==Edit Log==
 
*'''2005-06-13''' Transcribed from docs for MediaWiki version 1.4.5
 
*'''2005-06-13''' Transcribed from docs for MediaWiki version 1.4.5

Latest revision as of 21:09, 7 December 2005

Template:mediawiki dev doc

Contents

The most up-to-date schema for the tables in the database should always be "tables.sql" in the maintenance directory, which is called from the installation script. Here are a few highlights that may be out of date:

user

Wikipedia users

  • user_id
    integer, primary key, autoincrement
  • user_name
    Usernames must be unique, must not be in the form of an IP address. _Shouldn't_ allow slashes or case conflicts. Spaces are allowed, and are _not_ converted to underscores like titles. (Conflicts?)
  • user_password
    Hash of current password.
  • user_newpassword
    Generated for mail-a-new-password feature
  • user_email
    Note -- email should be restricted, not public info. Same with passwords. ;)
  • user_options
    Newline-separated list of name=value pairs.
  • user_token
    A pseudorandomly generated value that is stored in a cookie when the "remember password" feature is used (previously, a hash of the password was used, but this was vulnerable to cookie-stealing attacks)

cur

Wikipedia "current" articles

  • cur_id
    integer, primary key, autoincrement
  • cur_namespace
    integer index into list of namespaces. See the Namespace class for more details.
  • cur_title
    Title of article (in dbkey form--see Title), without namespace. The combination of namespace,title should be unique in this table.
  • cur_text
    Wikitext of the article.
  • cur_comment
    The summary of the last change.
  • cur_user
    User id who made the last change, or 0 if unknown.
  • cur_user_text
    Name of the user above, or IP address.
  • cur_timestamp
    Time of the last change.
  • cur_minor_edit
    Flag: 0 or 1 is last change was a "minor" edit.
  • cur_restrictions
    Who may or may not edit the article.
  • cur_counter
    Number of times this page has been viewed.
  • cur_ind_title
    Text version of title for fulltext searches.
  • cur_ind_text
    Plaintext version of text for fulltext searches.
  • cur_is_redirect
    1 indicates the article is a redirect.
  • cur_minor_edit
    1 indicates this was a minor edit.
  • cur_is_new
    1 indicates this is the first revision of a new entry.
  • cur_random
    Random value between 0 and 1, used for Special:Randompage

old

Historical versions articles. Most fields correspond to the same fields in "cur"

  • old_id
  • old_namespace
  • old_title
  • old_text
  • old_comment
  • old_user
  • old_user_text
  • old_timestamp
  • old_minor_edit
  • old_flags
    This last is currently unused.

archive

Temporary storage of deleted articles which may be restored. Fields correspond to those of "cur" and "old"

  • ar_namespace
  • ar_title
  • ar_text
  • ar_comment
  • ar_user
  • ar_user_text
  • ar_timestamp
  • ar_minor_edit
  • ar_flags
    This last is currently unused.

links

Internal links to existing articles

  • l_from
    ID of source article. (currently title, may be changed)
  • l_to
    ID of target article.

brokenlinks

Internal links to non-existent articles

  • bl_from
    ID of source link.
  • bl_to
    Title of target link.

imagelinks

Internal links to images via [[Image:filename]] syntax

  • il_from
    Title of target article.
  • il_to
    Filename of target image.

categorylinks

Track category inclusions

  • cl_from
    corresponds to cur_id of the linking page
  • cl_to
    corresponds to cur_title of the category page
  • cl_sortkey
    the title of the linking page, or an optional override
  • cl_timestampe
    when the link was last added

linkscc

Stores (possibly gzipped) serialized objects with cache arrays to reduce database load slurping up from links and brokenlinks.

  • lcc_pageid
    The ID of the linking page
  • lcc_cacheobj
    A serialized LinkCache object

image

Uploaded images and other files

  • img_name
    Filename.
  • img_size
    File size in bytes.
  • img_description
    Description field given during upload.
  • img_user
    User ID who uploaded the file.
  • img_user_text
    User name who uploaded the file.
  • img_timestamp
    Timestamp when upload took place.

oldimage

Old versions of images stored for potential revert

  • oi_name
    Original filename.
  • oi_archive_name
    Filename of stored old revision; timestamp and exclaimation point prepended to oi_name
  • oi_size
    File size in bytes.
  • oi_description
    Description field given during upload.
  • oi_user
    User ID who uploaded the file.
  • oi_user_text
    User name who uploaded the file.
  • oi_timestamp
    Timestamp when upload took place.

ipblocks

IP addresses and users blocked from editing

  • ipb_id
    Primary key, introduced for privacy.
  • ipb_address
    Blocked IP address in dotted-quad form or user name.
  • ipb_user
    Blocked user ID or 0 for IP blocks.
  • ipb_by
    User ID who made the block.
  • ipb_reason
    Text comment made by blocker.
  • ipb_timestamp
    Creation (or refresh) date in standard YMDHMS form. IP blocks expire automatically.
  • ipb_auto
    Indicates that the IP address was banned because a banned user accessed a page through it. If this is 1, ipb_address will be hidden.

site_stats

Site-wide statistics

  • ss_row_id
    Token for where clauses. There's only one row in this table. At some point we might want to use a date here so we can get stats-by-date.
  • ss_total_views
    Number of total views of all pages.
  • ss_total_edits
    Number of total page edits.
  • ss_good_articles
    Number of "countable" articles.

hitcounter

Stores an ID for every time any article is visited; depending on $wgHitcounterUpdateFreq, it is periodically cleared and the cur_counter column in the cur table updated for the all articles that have been visited.

  • hc_id
    The ID of an article, representing one hit

recentchanges

Will document further when working

watchlist

  • wl_user
    Foreign key -> user_id
  • wl_namespace
    Namespace -> cur_namespace
    Note that these should only include even-numbered namespaces for regular pages; associated talk pages (odd numbered namespaces) are folded in.
  • wl_title
    Page title -> cur_title
    Note also that the linked page may not exist in page or talk namespace, or at all.

searchindex

Used for MySQL fulltext searching

  • si_page
    The ID of an article
  • si_title
    The title of an article, indexed for searching
  • si_text
    The text of an article, indexed for searching

interwiki

Recognized interwiki link prefixes

  • iw_prefix
    The interwiki prefix, (e.g. "Meatball", or the language prefix "de")
  • iw_url
    The URL of the wiki, with "$1" as a placeholder for an article name
  • iw_local
    A boolean value indicating whether the wiki is in this project (used, for example, to detect redirect loops)

Edit Log

  • 2005-06-13 Transcribed from docs for MediaWiki version 1.4.5