DNS30 – Spammers

October 3rd, 2012 Comments off

So you may have seen my previous blog post about wrangling Route53.  I got a comment that was marked as spam by my blog configuration, and here’s the text from it.  I’ve seen it all over the place on Route53 blog posts:

Hello, Thanks for sharing your article with us.

Route 53 is designed to automatically scale to handle very large query volumes without any intervention from user.We have developed a UI tool for this interface – DNS30 Professional Edition.We also have online interface for this application.

Now, at first, this would seem like a moderately useful kind of post to get.  But it turns out that this is all over the place, exactly the same.  It doesn’t add anything to the conversation, and is just a solicitation for services.  Hell with that!  These clowns are spammers, engaging in comment spam, and that’s really lame.  They’ve pretty much guaranteed that I’ll badmouth them and avoid them in all contexts.  Good job, DNS30!

Categories: Ramblings Tags:

DevOps: Managing internal and external DNS with Amazon Route53

September 16th, 2012 2 comments

So one fun project I’ve been working on at work is developing the integration to make managing our internal (server-facing) and external (customer-facing) DNS a simpler process.  This meant tying together a few different things: BIND, Amazon Route53, DHCP, and various tools pulled from various places to glue them together.

So my system consists of basically four elements: Route 53, for serving internet-facing DNS requests; internal DNS servers that get zone data from Route 53; route53d servers for pushing updates to Route 53 via its API; and Route 53’s web UI.

I use Route 53 as the single point of truth.  Updates go there via the API or web UI.  Then the data is pulled out of there and published to my internal DNS servers.

Due to the stateless nature of how these servers work, this system is very scalable out of the box.  To handle more load, I just spin up more instances of the internal DNS servers.  Using a load balancer in front of them, I could provide plenty of service to basically an arbitrary number of machines.

A couple cool programs made this project very possible:  dnscurl.py, route53d, and route53tobind.  These tools made it easy (as in, just writing integration code and deployment code) to tie together all these resources.

One goal I had in this process was to make the whole system scalable.  I didn’t ever want to have to log into the boxes when they’re in production, or indeed, even when deploying them.  So I developed deployment scripts and kickstart files that specify the whole structure of the internal-facing DNS servers and the route53d servers.  Basically, I just kickstart a box, and it’s done.  The kickstart script’s postinstall section pulls over a script and runs it; that script configures everything and sets it to start on boot.  This could (and should) be done with Chef or some other configuration management suite.

I periodically poll Route 53 for new zone data, and when I get it, push it to the internal servers.  Then I call “rndc reload” to incorporate the updated zones into the running server.
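The publish half of that loop can be sketched roughly like this.  This is a minimal sketch, not my actual script: the zone directory path is an assumption, and the zone text is assumed to come from whatever pulls it out of Route 53 (route53tobind or similar).

```python
# Minimal sketch of the publish step: write the zone text pulled from
# Route 53 into the local zone directory, then ask BIND to reload it.
# The /var/named path and file naming are illustrative assumptions.
import subprocess

def publish_zone(zone_name, zone_text, zone_dir="/var/named"):
    """Write one zone file and reload it in the running BIND server."""
    zone_path = "%s/%s.zone" % (zone_dir, zone_name)
    with open(zone_path, "w") as f:
        f.write(zone_text)
    # Incorporate the updated zone without restarting BIND.
    subprocess.call(["rndc", "reload", zone_name])
    return zone_path
```

Run from cron on a schedule, something shaped like this keeps the internal servers in step with whatever the Route 53 API or web UI last accepted.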

And presto, consistent internal and external DNS with scalability and availability.

Tools used:

  • dnscurl.py
  • route53d
  • route53tobind

Categories: DevOps, Linux, Operations Tags:

DevOps: Using Fabric to fix the Leap Second bug

July 2nd, 2012 Comments off

So today I had a fun problem:  How am I to correct the leap second bug on 700 systems?  Gosh, that’s a thing.  So I went looking in my systems administrator toolbox and found Fabric.  Turns out that you can automate things in pretty awesome ways with it.

Fabric is a Python tool that lets you specify a runlist of things to do, and go execute them across a pile of hosts.  For me, this meant:

$ cat fabfile.py
from fabric.api import run

def fix_date():
    run('date; date `date +"%m%d%H%M%C%y.%S"`; date')

$ fab --skip-bad-hosts -t 3 -H \
      `perl -e 'chomp(@l=<>); print join ",", @l;' < hosts.txt` \
      fix_date

And bam, 30 minutes later, the bug was fixed on all 700 systems.  (Re-setting the clock, even to its current value, clears the wedged kernel timer state that the leap second left behind.)

Puerco Pibil, so good

February 27th, 2012 1 comment

(Better photo later; this was a quick iPhone shot.)

This weekend I got a hankering to make one of my favorite dishes, Puerco Pibil.  It’s a delicious slow-roasted pork dish from the Yucatan Peninsula.  It features tangy flavors and moderate heat from the habanero pepper.  My recipe is pretty much the one that Robert Rodriguez outlines in his entertaining 10 Minute Cooking School video.  This weekend, I made a big batch, about 15 lbs of it.  I did this because the meat vendor I use, Cash’n’Carry, sells pork butt in ~15 lb bags.  $1.48/lb was a sweet deal.

I get all my spices from World Spice, located behind the Pike Place Market.  They’re inexpensive and delicious, and can be shipped.

My recipe, as made this weekend:

  • 15 lbs pork butt, cubed into ~2″ cubes (roughly, precision not important)
  • Banana leaves (asian grocery! cheap!)
  • 1/4 cup minced, de-veined, de-seeded habanero peppers (~16 habaneros)
  • 1/2 cup coarsely chopped garlic
  • 1 cup orange juice, freshly squeezed
  • 2 cups lemon juice, freshly squeezed
  • 1.5 cups white vinegar
  • 3/8 cups delicious reposado tequila
  • 8 oz annatto seeds, ground finely (measured as whole annatto)
  • 1 oz cumin
  • 1.5 oz black peppercorns
  • 24 allspice berries
  • 1.5 tsp cloves

Set aside the banana leaves and cubed pork.

Take the rest of the ingredients.  Grind all the spices into dust, and combine with the garlic, habanero, juices, and vinegar.  This is your annatto paste, and is what the pork will marinate in.  In a large container with a lid (I use a 6 qt food bucket), combine the pork with the annatto paste, mixing thoroughly.  Put this into the refrigerator for at least four hours.  I try to marinate it overnight.

On cooking day, pre-heat the oven to 325F.  Line a large pan with banana leaves, leaving enough overhanging banana leaf to allow you to fold it over.  Fill the cavity with pork mix, making sure to put all the delicious marinade into the pan.  Fold the banana leaves over this, and cover with more banana leaves, tightly packing it.  Then cover with aluminum foil.  Or not, but I do.

Bake at 325F for four hours.  You will know it’s nearly done when your house smells delicious.

Enjoy on rice, or soft corn tortillas, or as tamale fill.  There are tons of ways to enjoy this versatile dish.

Categories: Food and Drink, Things I made Tags:

Building the Air Kraken Trike

February 24th, 2012 Comments off

(Line art concept sketch by Molly Friedrich.)

*** More pictures are here ***

Last year, around this time, I started building the Air Kraken Trike.  It’s a project I was inspired to create from the scenes of great big things like it at Burning Man.  This article is an overview of my process, challenges, costs, and thoughts.   I’ll describe how I came to the design, and various elements of building it.  I’ll provide a broad-strokes cost overview of what it took to make it, to take it to the Burn, and to store it.

Overview

So what’s an Air Kraken?  The Air Kraken is a fun idea from the steampunk community, tied to an event in March called Air Kraken Day, where folks take refuge in bars and carry umbrellas to keep the sky kraken from descending to feast on them.  It’s pretty hilarious, and I liked the name, so I went with it.  What else is an Air Kraken?  It’s also a ten-foot-long, six-foot-wide metal tricycle, made out of new and used materials, that has been to several events.

It’s made of steel, of drainage culvert, of bus parts, of old bicycles.  I’ve put more than a little blood, sweat, and tears into the construction and application of it.  It’s been a lot of fun, and I’ve learned a lot. There have been frustrations and mis-steps.  But it’s been overall pretty awesome.

It has been a very time-intensive project.  I put a lot of nights and weekends into it.  For about four months, it was effectively my second job.  I’d get home from work around 6ish, eat food, and go out to the shop to work on it.  I’d throw 8-10 hours a day on the weekends into it.  I probably put 600-700 hours into it; counting the time I wasn’t actively working on it and was “merely” thinking about it, definitely a lot more than that.

Read more…

That was Burning Man

February 4th, 2012 1 comment

I feel very fortunate that I got to experience the Burning Man Festival before everything changed.

Wow. What a big statement to make, huh?  To claim that the festival has changed dramatically.  That things might be over.  That what had been isn’t what will be.  Very impertinent of me.  Very bold.  But it feels like it rings true to me.  I started going to the burn very recently, in 2008.  It was something I’d heard about for a long time, and always wanted to do.  I finally got things in gear, and went.  And it was world changing.

I’ve gone again every year since then, and if I can, I’ll go this year, too.  There have been tons of cool experiences.  Loads of keen, interesting people that I’ve met, and some great friendships that have grown out of the experience.  I’ve let go at the temple of things that held me back, of things that weighed on my heart.  I feel like the trip really meant a lot.

Like a ton of other burners I know, I lost the ticket lottery.  There’s lots of confusion about what’s going on.  Is it scalpers?  Are folks sitting on a bunch of tickets and not saying anything?  Did the community really grow to a hundred thousand in a year?

I think no one really knows what’s going on.  There’s a lot of fear and pain around it.  Fear of scalpers having taken a large portion of tickets to resell.  Pain at not knowing whether your theme camp will be able to go.  And doubt around what will happen.  The BMOrg is doing a lot to try to figure out how they went wrong, and what to do about the situation.  This is evolving, and what comes out of it, I don’t know.

But whatever happens, it won’t be the same.  Burning Man has grown very large.  It’s struggling against its own bonds of success.  As Halcyon said, this festival has perhaps grown too large to be contained by a playa.  So what does that mean?  What does it mean to have outgrown yourself?  To have gotten to a point where continued growth seems unsustainable and unlikely to be effective.  Where do we go from here?

I think one place that we can go is looking to our regional events as an outlet for all those things we wish for so hard in the experience of the dust.  I think that perhaps by looking again to our local communities, we can make local what we find remotely.  That we can bring the spirit and quality of the experience out at the dirt rave here.

Maybe it’s time that things split up.  That we have more events, spread out over more areas.  Freezing Man. Drowning Man. Soak.  Critical Massive.  Flipside.  All kinds of events, each with their own special character that leads us in new directions towards new dreams.

I don’t know where I’ll be this August.  I certainly hope on the playa, but it may not be in the cards. I know I’ll be at my regional.

Good luck.

Categories: Burning Man Tags:

Web Operations Performance – The Handout

November 11th, 2011 Comments off

So one of the issues that I deal with a lot is tuning web applications to not suck.  This is done by a few things: by monitoring, by profiling, by caching, caching (CACHING!), and by tuning.  The process for making a web application more awesome basically boils down to this list of steps:

  1. Monitor your application performance (HTTP threads, CPU, memory, thread response time, etc.)
  2. Profile your code
  3. Fix slow requests/implement caching
  4. Tune your web-server
  5. Goto 2.

Monitoring

Monitoring the response time of your application is useful and awesome for making positive changes to your environment.  This means paying attention to your application response time, CPU, memory, network traffic, disk IO, disk capacity, etc.  All those metrics that say whether your application is healthy or not.  There are a few different tools available for this that all work pretty well; here’s an incomplete list:

  • Cacti – http://www.cacti.net
  • Munin – http://www.munin-monitoring.org
  • Cricket – http://cricket.sourceforge.net

They all work well and solve slightly different problems.  Pick the one you like most.  I’m a fan of Cacti.

Profiling

Profiling means being able to see how long each call that an application makes takes to execute.  It’s invaluable for getting a feel for what parts, what components, of your application perform badly or perform well.
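As a concrete example of that per-call breakdown, here’s a generic sketch using Python’s standard-library profiler (the handler function is a made-up stand-in, not code from any app discussed here):

```python
# Profile a function and print the most expensive calls first, sorted
# by cumulative time, so the slow components stand out.
import cProfile
import io
import pstats

def handle_request():
    """Stand-in for a request handler worth profiling."""
    return sum(i * i for i in range(100000))

profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The same idea applies in any language: attach a profiler, exercise the code, and sort the output by where the time actually went rather than where you guessed it went.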

Caching

Whenever an application fetches data from a resource, that’s an opportunity to improve performance.  Every time something is fetched, there’s the ability to take that result set, and keep it.  Caching the results of database calls, of external API lookups, of intermediate steps, all these things leave lots of room for improving performance. Memcached is the de facto standard for a caching engine.

Cache early, cache often!
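The pattern in play here is cache-aside: check the cache first, and only on a miss do the expensive work and store the result.  A minimal sketch, with a plain dict standing in for a memcached client (a real client such as python-memcache has the same get/set shape, plus an expiry argument):

```python
# Cache-aside: consult the cache before the database, populate on miss.
# `cache` is a dict here; swap in a memcached client for real use.
cache = {}
query_log = []

def run_query(sql):
    """Stand-in for the expensive database call."""
    query_log.append(sql)
    return [("result for", sql)]

def cached_query(sql):
    key = "sql:%s" % sql
    result = cache.get(key)
    if result is None:
        result = run_query(sql)
        cache[key] = result  # a real client would also pass an expiry
    return result

first = cached_query("SELECT title FROM posts")
second = cached_query("SELECT title FROM posts")
# The second call never touched run_query; it was served from the cache.
```

The same wrapper works for external API lookups and intermediate computation steps, not just database calls.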

(Apache) Tuning

A well configured web-server is crucial to a happy environment.  This means not running into swap, not wasting time with connections that are dead, and other such things that waste time.  In short, don’t look up what you don’t need, don’t save what you don’t need, and be efficient.  Here are some basic things that apply to Apache:

   KeepAlive          Off (Or On, see below, it depends on workload)
   Timeout            5
   HostnameLookups    Off
   MaxClients         (RAMinMB * 0.8) / (AverageThreadSizeInMB)
   MinSpareServers    (0.25 * MaxClients)
   MaxSpareServers    (0.75 * MaxClients)

About these parameters:

  • KeepAlive – this controls whether, after one request from a client to the web-server completes, that thread will remain connected to the client for subsequent requests.  In high-scale applications, this can lead to contention for available resources.  Some workloads, however, benefit from keeping this on.  If you are serving lots of different content types on a page to a client, leaving this on can be a good thing.  Test it out, YMMV.
  • Timeout — how long, in seconds, before we assume that the client has gone away and won’t be requesting further data.  The default is 300; 5 is aggressive.
  • HostnameLookups — this is for logging, and if it is on, each client will cause a DNS request to be made.  This slows down the request.
  • MaxClients — the total number of threads that the server will allow to run.  Each thread consumes memory.  This model assumes that 20% overhead for other system tasks is appropriate and will keep us out of swap.  On machines above 16GB of ram, use 0.9 instead of 0.8.
  • MinSpareServers — the fewest threads that Apache will leave running.  Setting this too low will result in load spikes when traffic increases.
  • MaxSpareServers — the most spare, unutilized threads that Apache will leave running.  Setting this too low will result in lots of process thrashing as threads are spawned and then terminated.  The tradeoff is resident RAM.
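Worked through for a concrete box, the sizing arithmetic above looks like this (the 16 GB of RAM and 60 MB average thread size are made-up example numbers, not a recommendation):

```python
# Apply the handout's sizing formulas to one example server:
# a 16 GB box whose Apache threads average 60 MB resident.
ram_mb = 16 * 1024        # total RAM, in MB
avg_thread_mb = 60        # average Apache thread size, in MB

max_clients = int((ram_mb * 0.8) / avg_thread_mb)  # 80% of RAM for Apache
min_spare_servers = int(0.25 * max_clients)
max_spare_servers = int(0.75 * max_clients)

print(max_clients, min_spare_servers, max_spare_servers)
```

On boxes with more than 16GB of RAM, the 0.8 becomes 0.9 per the note above, since the fixed system overhead shrinks as a fraction of total memory.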

There are a lot of other things that can be done as well, so don’t take this as a complete set…

These are my handout-style tips on performance tuning.  There are whole volumes of books dedicated to this topic.

-Gabriel

Categories: Linux, MySQL, Operations, Ramblings Tags:

Of Tuning WordPress on Cherokee

November 2nd, 2011 Comments off

So at work, I had a blog.  Not my blog, of course.  One I support.  So this blog was thrown together quickly to facilitate business goals.  Like you do.  And that’s great.  We met the deadline.  We got the product functional.  But performance kinda sucked.  A little backstory.  Here’s how it got set up to start.

We love virtualization here.  Everyone should.  It’s a fantastic way to take advantage of hardware that you’ve got lying around idle.  Idle hardware is lame.  So we run KVM.  This means we get to manage machines more effectively, and can provision things faster.  That’s awesome.

So this blog.  It’s a cluster of boxes, made up of a pair of webservers and a pair of DB backends.  The DB backends are a master and a backup.  This is a small project, so these got set up with just 4GB of RAM.  So that’s fine.  The web servers are each a VM with 1 vcpu, 4GB of RAM, and some disk space.

So we set that all up, and it did fine.  But not great.  Here’s what it did.

Incidentally, if you’ve got data you’ve collected for performance with Pylot, and need to CSV the averages to make a pretty graph, here’s my way of extracting the averages from the results HTML to feed into gnuplot.

for file in load-dir/*/results.html ; do
    egrep 'agents|avg' "$file" | head -3 | perl -e \
        'while(<>){ s/<\/?t[dr]>//g; s/(\d+\.?\d*)/\t$1/g; chomp; @p=split /\t/,$_; push @r,$p[1]; } printf "%s\n", join ",",@r;'
done

So what we can see then from these graphs is that a pair of single core boxes, tuned with the default cherokee+php installation don’t do all that great.  They can handle 1-2 simultaneous requests, but past that, the response time gets pretty bad. That’s where the project got left for a while, until the other day I got the request to make it handle 500 requests per second.  “Wow, shit.” I thought.  So I dove in to see what I could do to improve performance on the blog, and it turned out that there was a lot I could do.

  1. Single-core boxes don’t have a lot of performance available to them.
  2. Cherokee uses PHP in FastCGI mode, and does a terrible job of tuning it out of the box.  It defaults to a single PHP thread.
  3. WordPress is very hungry for CPU time.  It chews up a lot of resources doing its DB queries and wrangling them.

To address these points, I did the following.

The VMs themselves were restarted with more CPU cores — four per box.  That let me dig in and discover that Cherokee wasn’t tuned well.  With a single core, I saw 80% CPU utilization on PHP, with high system wait time.  That sucked.  But after I bumped up the core count, it still looked bad: still only one CPU at 80%.  WTF.  So I turned to Cherokee, and I tuned PHP so that it would invoke 16 children, enough to handle the request volume.  This helped a lot, but there was still room to do better.  So I added APC, the Alternative PHP Cache, to the configuration.  This helped out a lot.  I then looked at WordPress-specific caching engines, and settled on W3 Total Cache.  This brought performance into the range of fulfilling the customer’s request.  I felt great about it.

I used Pylot to graph performance at various points through this project, so I could tell whether my tuning was making things better or worse.

Here’s the performance data showing the improvements that caching at various layers added to the party:

So it turns out that these are some great ways to make sure your blog performs:

  • Make sure that there’s enough hardware supporting it
  • Cache in depth and breadth, early and often
  • Tune your PHP/WebServer to handle the project
  • Employ performance testing to measure your real improvements

With some hard work and performance tuning, you can turn an abysmal 7 requests/sec peak performing blog into one that can sustain 350 requests per second, and do it in a tiny fraction of the time.

-Gabriel

Categories: Linux, MySQL, Operations, Ramblings Tags:

Pro Tip: For recruiters…

September 21st, 2011 1 comment

Texting me early in the morning without identifying yourself is a fantastic way to piss me off. If it was your goal to antagonize me against your position and company, you did a great job. Also: When I text you back, “do not txt me,” it does NOT mean that you should then call me five minutes later!

Advice to recruiters who haven’t yet pissed me off: Email. It’s great. If I’m interested in the position you have, I’ll let you know! Don’t cold-call me. Don’t cold txt me.

Categories: Ramblings Tags: , , ,

It’s that time of year…

August 23rd, 2011 Comments off

(Air Kraken Trike, shooting fire.)

So it’s time for the annual pilgrimage to Burning Man.  Like so many other folks, I’m going.  Unlike a lot of people, I haven’t had the ticket scramble that the sell-out led to — I bought mine back in January.  Anyway, this post isn’t so much about tickets as it is about the experience, and how I feel about the upcoming trip.

I’m super excited about it.

This year, I’m bringing my own big art project to the Burn.  The Air Kraken has been a ton of work this year, and I’m so happy that I’ve gotten it done, and that it’ll be on playa.  The project has gone really well thanks to help from my friends by way of Kickstarter contributions, INW’s Art Grant, and construction assistance.  I’m super appreciative of all their wonderful help!

Check out more about the Air Kraken on my project page.

So what else is going on for the Burn?  I’m planning on taking a ton of photographs this year, and that means bringing a lot of storage… I’m bringing 28GB of CF cards, and 6GB of SD cards for two cameras.  I’m bringing a cool selection of lenses, too:  Nikkor 50mm f/1.8 D, Sigma 21-35mm Alpha f/3.5-5.6, Nikkor 70-210mm f/4-5.6 D, and on my D80 the 18-200mm alphabet soup lens as backup.  There should be lots of cool stuff to see.

If you see me on playa, I demand you ask for a portrait. ;-)    See you in the desert…

Categories: adventure, Burning Man, Ramblings Tags: