Bot Policy

From Dreamwidth Notes
Revision as of 07:07, 21 February 2009 by Xb95

This page outlines a proposal for the Dreamwidth "Bot Policy": the rules that third party applications and tools must follow to avoid being blocked from accessing the Dreamwidth servers.

This document is a proposal and subject to change.

General Purpose Guidelines

A few generally useful things to keep in mind:

  • Good clients send a proper User-Agent (or other info string) that includes a contact email address.
  • Be kind to the system: rate limit your requests and don't abuse the servers.
  • DW staff and volunteers are more than happy to work with you to do neat projects, fix problems in the servers, or tell you if something is good or not. Just ask!
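The first two guidelines can be sketched in a few lines of Python. This is only an illustration, not an official client: the User-Agent layout, the names `build_user_agent`, `MinIntervalLimiter`, and `make_request`, and the choice of a fixed minimum interval between requests are all assumptions. This page does not prescribe a format or a specific rate.

```python
import time
import urllib.request


def build_user_agent(tool_name, version, contact_email):
    """Return a User-Agent string naming the tool and a contact address.

    The exact layout is an assumption; the policy only asks that a
    contact email address be included.
    """
    return f"{tool_name}/{version} (contact: {contact_email})"


class MinIntervalLimiter:
    """Space calls to wait() at least `min_interval` seconds apart."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the limiter can be tested
        self._sleep = sleep
        self._last = None        # time of the previous call, if any

    def wait(self):
        """Sleep if the previous call was too recent; return seconds slept."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
            if delay > 0:
                self._sleep(delay)
        self._last = self._clock()
        return delay


def make_request(url, user_agent):
    """Build (but do not send) a request carrying the identifying header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})
```

A tool would call `limiter.wait()` before each request and pass the result of `make_request(...)` to `urllib.request.urlopen`. The interval itself is something to choose conservatively or to ask DW staff about.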

Client Applications

A client application is defined as a tool that runs under the control of an end user. Examples include the Semagic client, jbackup, ljMigrate, and similar tools that someone runs for personal use.

These tools are generally unrestricted. We may, from time to time, have to restrict some of the API methods they use (such as syncitems, getevents) if there is too much load on the site. Generally speaking, though, we will not do this unless it's absolutely necessary to protect the good functioning of the site.

Third Party Sites/Utilities

Sites and utilities in this category are more restricted, as each one has the potential to do great harm on its own. We are here to serve our users (which includes the client applications they run); serving third party sites that scrape data is not our primary function.

Generally speaking, if your site is going to be very small and will hit DW only a little (this term is purposefully vague), you can go right ahead. If it becomes a problem, we will contact you and let you know that your access is putting a strain on us.

On the other hand, if you intend to run a service (or your service gets popular) and you are making dozens or hundreds of requests, then you should contact us and let us know what you are doing so we can make sure we have the proper resources to support your site.

We ask that all third party sites access Dreamwidth APIs on a special domain: b.dreamwidth.org. If you use this domain for your XML-RPC endpoints and web access, you will have full ability to access the site, but we will be able to separate bot traffic from user traffic. If the site becomes overloaded, we can turn off this domain to mitigate third party traffic.

(Note: this domain doesn't work yet. This page is just a proposal; the domain will be implemented at some point if the proposal is well received.)
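For a third party tool, switching to the bot domain could be as small as rewriting the host part of each URL before a request is made. The Python sketch below assumes that only the main site hosts (`dreamwidth.org` and `www.dreamwidth.org`) should be redirected and that all other URLs are left alone. The helper name `to_bot_url` and the `MAIN_HOSTS` set are hypothetical, and the proposal does not yet say how other hostnames would be handled.

```python
from urllib.parse import urlsplit, urlunsplit

BOT_HOST = "b.dreamwidth.org"  # proposed domain from this page; not live yet

# Assumption: only these hosts are rewritten; anything else passes through.
MAIN_HOSTS = {"dreamwidth.org", "www.dreamwidth.org"}


def to_bot_url(url, bot_host=BOT_HOST):
    """Return `url` pointed at the bot domain if it targets a main site host.

    Scheme, port, path, query, and fragment are preserved. URLs for any
    other host are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.hostname not in MAIN_HOSTS:
        return url
    netloc = bot_host if parts.port is None else f"{bot_host}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
```

Keeping the rewrite in one helper means a tool can turn it on once the domain exists, and turn it off again by passing its original host as `bot_host`, without touching the rest of its request code.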

IP Rate Limiting

From time to time you may find your IP address temporarily banned if we determine that your site is hitting the servers too hard. In this case, please contact us and let us know which IP has been banned so we can determine how best to serve your needs.