Brad Fitzpatrick (bradfitz) wrote in news,

Status Update

It's been a while since I posted a status update, and I'm sure at least a few of you are dying for one (I know I go and reload all the project news pages that I watch), so here's what's up....

For the last week, and probably for another week or so, I'm purposely not adding anything new or doing anything "fun". The focus currently is making things faster, more robust, and safer, plus fixing bugs, documenting, and polishing things up.

In detail, what we're doing...

  • System Administration -- working with dormando
    • Setting up more servers -- in addition to Kyle and Stan processing web requests, Wendy is now in the mix too, and Bebe is going online Monday (at which time Kyle will be pulled out to be upgraded). I'm doing this, since the servers are here and Dormando's on the east coast.
    • Improving Load Balancing -- Dormando is plowing through mod_backhand code and we're both working to find out how to improve our load balancing. We're at the point now where we know how to work around its bugs. Apache processes are locking up sometimes, so we're gracefully restarting the master server every so often until we find out why.
    • Log rotation and analysis --- we currently rotate web logs (about 300 meg of logs per day) and compress them, but automatic backups of them to my home machine and analysis aren't done yet. I have scripts to transfer them to my own machine, so I suppose this is as easy as putting in a cronjob to pull them. Guess I'm lazy.
    • Backups --- database backups have lately become infeasible, since the files are so big: 1) locking table by table as we copy files kills the site for too long, even in idle hours, and 2) copying a 2.5 gig file around kills both the RAID card and the operating system's caches. The only solution, I've decided, is to get a db server doing replication again, not for selecting, but just for backups. We used to do this, but then stopped for some reason as we had to use that server for something. Anyway --- we need backups, so this is top priority.
    • NAT and Firewalls --- new web servers we're adding to the cluster don't have external NICs, so all access to the outside world (DNS and text messaging) is done via NAT. Wendy is doing NAT through Stan now, which involved setting up a firewall at the same time, so we're also going to get pretty restrictive with our firewall rules on all the machines. Dormando is doing nearly all the work on this, and I'm just testing things on my local dev server before putting it live. FTP through the firewall is our last issue.
  • Documentation
    • Database --- finish the schema documentation
    • Code --- document the code more, including call graphs of everything so we can use graphviz and make a pretty overall picture of the site.
    • Installation Instructions --- make it easier for people downloading the LJ source code to install it on their own servers. This involves both better documentation and cleaning the code up to have less LJ-specific things in it.
  • Polish
    • LJ::TextMessage --- foobar is organizing a group of people to add more providers to LJ::TextMessage
    • Bug fixin' --- got several good bug reports from people lately and I'm working on fixing those.
    • No duplicates --- make it impossible to submit a duplicate of anything on the site by accidentally double-clicking a button.
    • Minor feature enhancements --- a few things have been bugging me lately, like your logged-in status not being recognized on the support request page, and adding a memory not showing you your current keywords. I'm going to fix things like that, even though it's new code... it's not major new code. It's stuff that should've been done a long time ago.
  • Etc...
    And a ton of other things like this.
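
To make the "No duplicates" item above a bit more concrete, here is a minimal sketch of one common approach: put a one-time token in each form and accept it only once. This is an assumption for illustration, not how the LJ code does (or will do) it; the names DuplicateGuard, issue, and consume are made up, and the in-memory set stands in for whatever shared storage a multi-server site would actually need.

```python
import secrets


class DuplicateGuard:
    """Hypothetical helper: hands out one-time form tokens, accepts each once."""

    def __init__(self):
        # Tokens that have been issued but not yet submitted.
        # Assumption: a real cluster would keep these in shared storage.
        self._pending = set()

    def issue(self):
        """Create a token to embed in a hidden form field."""
        token = secrets.token_hex(16)
        self._pending.add(token)
        return token

    def consume(self, token):
        """Return True the first time a token is submitted, False after that."""
        if token in self._pending:
            self._pending.remove(token)
            return True
        return False


guard = DuplicateGuard()
token = guard.issue()
first = guard.consume(token)   # first click: accepted
second = guard.consume(token)  # accidental double click: rejected
```

The form handler would only act on the submission when consume returns True, so the second click of a double click does nothing.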
After I'm done with all this, or find people to do it for me, I'll move on to fun stuff. Notably, finishing the new style system, S2, and the notification/subscription system so you can subscribe to almost anything and choose your method of notification (web inbox, email, text message).

My two main goals lately are reliability and lowering the barrier to entry for new contributors. I can't just add new things all the time, because if nobody can help out with the old new things, they'll die of neglect.

This is longer than it should have been. :-)

