Production Notes

From Dreamwidth Notes
Revision as of 21:55, 18 May 2009 by Xb95 (Talk | contribs)


This document is meant to be read by people with sysadmin experience. I'll go back at some point and clean it up, break it down into sections, etc. But for now I'm just trying to dump as much information as possible so that Matthew and Robby have some state on how things are.

Links

Nagios

The Nagios setup runs on dfw-admin01 in /etc/nagios3. Most of the configuration files are in /etc/nagios3/conf.d, as you can imagine. Poke around if you want to change it; it's pretty straightforward.

If you do change things, you probably want to commit them to the operations repository.

   make your changes... etc etc
   
   $ sync-back-nagios
   $ cd /root/dw-ops/nagios/conf.d
   $ hg status
   
   if everything looks good, then:
   
   $ commit -a mark -m "Some commit message."

Replace mark with matthew or alierak as appropriate.

Cacti

Most of the graphs are more or less useful. I spend a lot of time looking at dfw-lb01, which shows all of the incoming site traffic. In particular: eth0 is always the "Internet" interface, on all slices; eth1 is the "Internal/Private" interface; and lo is the loopback interface.

The only time lo is really interesting is on the dfw-lb01/dfw-lb02 machines. There, Pound handles SSL and then connects to Perlbal on localhost:80 (see the SSL configuration), so traffic on lo is the measure of how much SSL traffic we're doing.

Traffic Flow

This summarizes the flow of traffic. There are a lot more sections that talk far more in depth about various things, but here you go...

  • Site external IP is on dfw-lb01 (or dfw-lb02), which runs Perlbal.
  • User connects to Perlbal. If it's a static request, it serves it locally. If it's dynamic, it hands off to a webserver.
  • Perlbal connects to dfw-webXX and proxies the request.
  • Webservers connect to lots of things: databases, memcache, mogilefsd, gearmand, etc.
  • Response is returned.
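The static-versus-dynamic decision in the list above can be sketched roughly as follows. This is only an illustration, not the real Perlbal configuration: the path prefixes, the backend names, and the round-robin choice are all assumptions made up for the example.

```python
from itertools import cycle

# Hypothetical values for illustration only; the real Perlbal
# configuration is not reproduced in these notes.
STATIC_PREFIXES = ("/img/", "/stc/")     # assumed static path prefixes
WEB_POOL = cycle(["web-a", "web-b"])     # placeholder webserver names


def route(path):
    """Decide how a load balancer would handle one request path.

    Returns ("static", None) when the file is served locally, or
    ("proxy", backend) when the request is handed to a webserver.
    """
    if path.startswith(STATIC_PREFIXES):
        return ("static", None)
    # Dynamic request: pick the next backend in simple rotation.
    return ("proxy", next(WEB_POOL))
```

Calling route("/img/logo.png") takes the static branch, while any other path is proxied to the next placeholder backend in the rotation.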

That's the basic flow of things and what connects to what. There's a separate flow that happens when the user requests a userpic (or any other MogileFS resource, but for now it's just userpics).

  • User -> Perlbal, "GET /userpic/XXXX/YYY"
  • Perlbal -> Webserver, "GET /userpic/XXXX/YYY"
  • Webserver replies: X-REPROXY-URL: http://dfw-mog01/dev1/0/00/000/234.fid
  • Perlbal -> dfw-mog01, "GET /dev1/..."
  • Mogile storage node replies with image
  • Perlbal combines the headers from the webserver's original reply with the image body from the Mogile storage node, and returns that to the user.
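The reproxy steps above can be modelled with a few in-memory stand-ins. This is a sketch of the idea only, not Perlbal or MogileFS code: the function names, the Content-Type header, and the fake storage dictionary are all assumptions for illustration. The only value taken from the list is the example X-REPROXY-URL.

```python
# In-memory stand-in for a Mogile storage node (hypothetical content).
STORAGE = {"http://dfw-mog01/dev1/0/00/000/234.fid": b"<image bytes>"}


def webserver_reply(path):
    """Webserver answers with headers only, pointing at the file's location."""
    headers = {
        "Content-Type": "image/jpeg",  # assumed header for illustration
        "X-REPROXY-URL": "http://dfw-mog01/dev1/0/00/000/234.fid",
    }
    return headers, b""


def storage_fetch(url):
    """Stand-in for the GET that the load balancer sends to the storage node."""
    return STORAGE[url]


def perlbal_handle(path):
    """Combine the webserver's headers with the body from the storage node."""
    headers, body = webserver_reply(path)
    # The reproxy header is an internal instruction, so it is removed
    # before anything is returned to the user.
    url = headers.pop("X-REPROXY-URL", None)
    if url is not None:
        body = storage_fetch(url)
    return headers, body
```

The point of the design is that the webserver never touches the image bytes: it only answers "where is it", and the load balancer does the heavy transfer.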

SSL is different again:

  • User -> Pound.
  • Pound handles the SSL handshake and decryption/encryption.
  • Pound connects to localhost:80 (Perlbal).
  • From here the process is the same as for a non-SSL request.

Perlbal

MogileFS

Gearman

Memcached

TheSchwartz

Workers

Incoming Mail

Outgoing Mail

Databases

Webservers

SSL