Disclaimer: This is only meant to be a list of experiences and solutions. Ultimately, high availability depends a lot on the particular setup of your application, servers, architecture, etc… If you have had different experiences than those outlined here with these tools, or you feel that we are missing something, please comment or send me an email and I will update it.
I’ve noticed a lot of noise lately on how to successfully deploy large Puppet installations. Incidentally, the kind of places that need this always rely on other tools for inventory and provisioning at the very least. In this collaborative guide, I will try to explore as many paths as possible for a successful cloud deployment where we can put Puppet at the core of our configuration management. Some of these problems are central to configuration management itself, so if you are managing this in a data center by other means (Chef, by hand (no!), Ansible, Salt…), I bet you can get some takeaways from here. I assume the reader is more or less familiar with the basics of Puppet (server-client, modules and classes…). Other tools will be explained in detail.
Puppet has not been around for so long that a single individual could have had experiences in dozens of deployments, so if you feel something is wrong or you want to add anything, you can contact me (my email is on the left bar) and I will fix it.
First off, there is a basic architectural consideration when puppetizing your servers. Once upon a time, when Puppet had just come out, everyone assumed we were meant to have a master-client setup. Fundamentally, this means the master stores the configurations in `/etc/puppet/environments/`, and it figures out how to compile a catalog containing the configuration each client needs. Clients request this compiled catalog and apply it. Obviously this solution is taken straight out of the fabulous world of cotton candy and lollipops. It’s great because:
- Everyone else is doing this! We can use everyone else’s manifests to set up our own puppet masters.
- Scalable? Well, we can put a bunch of puppet masters behind a balancer. Apache mod_proxy or HAProxy are proven solutions for this. We will do this when our lonely Puppet master starts to blow up and drop requests because there’s just too much traffic or load.
- We can use LVS/keepalived to set up permanent connections between the puppet clients in a data center and their closest master.
- If we ever get to Google’s scale, we can provide DNS tiers that redirect clients to their closest master (by round-trip time, or geographically).
Everything here sounds like bliss, doesn’t it? We do have a plan to overcome the issues of running a single Puppet master instance. We definitely do.
Here’s the thing: scaling a single instance to tens or hundreds will fix your pains. But it will only fix those related to having a single instance.
It’s quite complicated to plan in advance for the issues that having many puppet masters will bring. However, we can use this post as a way to document our experiences and hopefully save some headaches for the folks deploying these installations now.
There are some metrics that can more or less help to figure out when you need new puppet masters.
- catalog compilation time (one catalog per thread, the shorter the better, as the thread will be busy compiling and not taking in new catalogs)
- last call to master from each node (will let you know when there are some waves of requests)
- new signups (puppetca)
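As a starting point for the first metric, here is a minimal Python sketch that pulls catalog compilation times out of a master log. It assumes the master logs lines of the form "Compiled catalog for NODE in environment ENV in N seconds"; the sample lines, hostnames and function names are made up for illustration, so adjust the pattern to whatever your logs actually contain.

```python
import re
from statistics import mean

# Assumed log format; check it against your own master logs.
COMPILED = re.compile(
    r"Compiled catalog for (?P<node>\S+) in environment (?P<env>\S+) "
    r"in (?P<secs>\d+(?:\.\d+)?) seconds"
)


def compile_times(lines):
    """Return {node: [compile time in seconds, ...]} for matching lines."""
    times = {}
    for line in lines:
        match = COMPILED.search(line)
        if match:
            times.setdefault(match["node"], []).append(float(match["secs"]))
    return times


def summary(times):
    """Count, mean and max compile time over all nodes."""
    all_times = [t for node_times in times.values() for t in node_times]
    if not all_times:
        return {"catalogs": 0, "mean": 0.0, "max": 0.0}
    return {"catalogs": len(all_times), "mean": mean(all_times), "max": max(all_times)}


# Hypothetical sample lines, only to show the shape of the input.
SAMPLE = [
    "puppet-master[123]: Compiled catalog for web01.example.com in environment production in 1.5 seconds",
    "puppet-master[123]: Compiled catalog for db01.example.com in environment production in 4.5 seconds",
    "puppet-master[123]: Some unrelated line",
    "puppet-master[123]: Compiled catalog for web01.example.com in environment production in 3.0 seconds",
]

if __name__ == "__main__":
    print(summary(compile_times(SAMPLE)))
```

A rising mean or max here is a hint that the compile threads are saturating and it is time to think about another master.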
Rate monotonic scheduling
There is space for a quick project that measures the catalog compilation time, and the time of the last call from each node to the master. It may be possible to prevent, or at least mitigate, the effects of a wave of puppet client requests to the master by using these two data points. You can plan this using MCollective, possibly scaling your puppet master infrastructure up and down depending on the time of the day…
Funnily enough, real time operating systems have a good answer to this.
Rate monotonic scheduling allows you to plan your client request waves and avoid DDoSing your puppet masters. Given a number of nodes (puppet clients), and a maximum capacity for all puppet masters to handle x number of requests at the same time (say 500), using the RMS formulas you can come up with several periods, so you can plan a massive puppet run for each of them.
Explaining how you can schedule puppet runs using harmonic periods is a little out of scope for this post, but you can contact me privately if the RMS scheduling guides online don’t make sense to you.
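To make the capacity figure concrete, here is a small Python sketch. It is not RMS itself, just a plain even split of one run interval into waves so that no wave exceeds what the masters can take at once. The function name, the node names and the 2000/500/60 numbers are assumptions for illustration only.

```python
import math
from collections import Counter


def stagger(nodes, capacity, period_min):
    """Assign each node a start offset (in minutes) within one run interval.

    Nodes are spread round-robin over the minimum number of waves such that
    no wave holds more than `capacity` nodes.
    """
    if not nodes:
        return {}
    waves = math.ceil(len(nodes) / capacity)
    if waves > period_min:
        raise ValueError("not enough minutes in the period for that many waves")
    step = period_min // waves  # minutes between consecutive waves
    return {node: (i % waves) * step for i, node in enumerate(nodes)}


if __name__ == "__main__":
    # Hypothetical fleet: 2000 nodes, masters handle 500 at once, 60 min interval.
    fleet = [f"node{i}.example.com" for i in range(2000)]
    plan = stagger(fleet, capacity=500, period_min=60)
    print(Counter(plan.values()))
```

With those numbers the plan uses four waves, 15 minutes apart, of 500 nodes each. Since every node shares the same interval, the waves never drift into each other.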
An example of the problem that RMS fixes:
Puppet interval 1: 25 min
Puppet interval 2: 60 min
Node set 1: 00:00 -> 00:25 -> 00:50 -> 01:15
Node set 2: 00:15 ……………….. 01:15
At 01:15 both node sets hit the masters at the same time, so you’d better have your pager on.
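The collision above is easy to check with a few lines of Python. The sketch below simply enumerates run times for each node set, given as (interval, first run offset) in minutes, and reports the minutes where more than one set runs. The 30 minute interval in the second call is my own pick of a harmonic period (60 is a multiple of 30), not something taken from the example.

```python
from collections import Counter


def collisions(schedules, horizon_min):
    """Minutes in [0, horizon_min) at which two or more node sets run.

    `schedules` is a list of (period_min, offset_min) pairs.
    """
    runs = Counter()
    for period, offset in schedules:
        for minute in range(offset, horizon_min, period):
            runs[minute] += 1
    return sorted(minute for minute, count in runs.items() if count > 1)


def hhmm(minute):
    """Format minutes since 00:00 as HH:MM."""
    return f"{minute // 60:02d}:{minute % 60:02d}"


if __name__ == "__main__":
    day = 24 * 60
    # The example from the text: 25 min from 00:00, 60 min from 00:15.
    print([hhmm(m) for m in collisions([(25, 0), (60, 15)], day)])
    # Harmonic periods with the same offsets: 30 min and 60 min.
    print([hhmm(m) for m in collisions([(30, 0), (60, 15)], day)])
```

With the 25/60 pair the two sets meet at 01:15 and then every five hours. With the harmonic 30/60 pair and the same offsets they never meet.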
PuppetCA is another feature of Puppet that you might struggle to scale out. Even though you can share the CA files across all masters (using NFS, NetApp, or any other mechanism), it’s probably not a great idea. It’s a hack that will make your PuppetCA highly available.
Another hack is to autosign every new host that requests it, having a separate CA per puppet master. Anyway, no configuration will be applied to the host if the hostname is unknown, right? 😉
Puppetlabs’ take on this is to use a central CA. It’s not HA, but in case of failure, master/node connections keep working since CA certificates and the CRL are cached. However, if the central CA fails, signing up new machines will not be possible until the CA master is manually restored.
As for how nodes find a master, Puppetlabs has sorted this out for us in 3+ versions of Puppet. The options are:
- Point a node to any working master using generic DNS, ‘puppet.redhat.com’
- For multisite deployments, make all nodes within a site point to a local DNS for puppet masters, ‘ranana.puppet.redhat.com’… ugly and requires work.
- SRV records! Nodes will pick up closest working master.
- Algorithm prefers masters in your network.
- DNS setup as many tiers as needed (global -> region -> data center -> puppet masters)
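To illustrate how SRV records let a node prefer one master and fall back to others, here is a Python sketch of the selection order described in RFC 2782: lowest priority value first, then a weighted random pick within each priority. This is an illustration of the RFC rule, not Puppet’s actual code, and it does no DNS lookup. The hostnames and weights are hypothetical.

```python
import random
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int  # lower value is tried first
    weight: int    # relative share among records of equal priority
    port: int
    target: str


def order_masters(records, rng=None):
    """Return records in the order a client would try them (RFC 2782 style)."""
    rng = rng or random.Random()
    ordered = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        while group:
            weights = [r.weight for r in group]
            if sum(weights) == 0:
                pick = rng.choice(group)  # all weights zero: uniform pick
            else:
                pick = rng.choices(group, weights=weights, k=1)[0]
            ordered.append(pick)
            group.remove(pick)
    return ordered


# Hypothetical records: two masters in the local site, one remote fallback.
RECORDS = [
    SrvRecord(0, 50, 8140, "master1.site-a.example.com"),
    SrvRecord(0, 50, 8140, "master2.site-a.example.com"),
    SrvRecord(10, 100, 8140, "master1.site-b.example.com"),
]

if __name__ == "__main__":
    for record in order_masters(RECORDS):
        print(record.priority, record.target)
```

The local masters share the load between them, and the remote one is only tried when both local ones are down. More tiers are just more priority values.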
PuppetDB essentially contains the real data from the nodes in your installation, be it facts, reports, or catalogs. It can be useful when you want to know where catalog X has been applied, etc…
The DB differs from (part of) Foreman’s DB in that Foreman stores the data you expect to find on the nodes, and then tells the Puppet master to make the nodes look like what you expect, instead of only consuming the data.
As far as I could tell from a presentation by Deepak Giridharagopal (thanks for the puppetdb_foreman mention!), Puppetlabs has some tools in the oven to replicate data from one PuppetDB daemon to another, so mirroring will allow other DBs to take over from the master in case of failure. Other strategies are explained in that presentation. This is the most blurry component when it comes to scaling in my (very little) experience with it, so any contributions in this area will be greatly appreciated.
Masterless – Distributed
Let’s try to list pros and cons of each approach over here
Masterless, pros:
- Computation is massively parallelized
- Easy to work with when number of modules is small
- No SPOF (using autosign)
- Distribute Puppet modules via RPM to nodes using Pulp
Masterless, cons:
- Hard to monitor and spot failures
- Large puppet module code bases will be stored on each node?
- Forces you to resort to Mco/Capistrano for management
Distributed, pros:
- Only choice when module repositories are big and being written to very often
- Failure easier to manage because problems will be at known locations (instead of all nodes)
Distributed, cons:
- Keeping Puppet masters modules in sync is quite hard
- Git + rsync? NFS? GlusterFS? NetApp? Any success story?
Foreman can and should be one of the central pieces if you want to save time managing your infrastructure. If you have several devops guys, developers, or other people who want to automatically get a provisioned virtual machine, you will need it. Thankfully, scaling it is considerably easier than scaling Puppet.
First off, you don’t want your Foreman UI or API to break down because your Puppet report broke Foreman. This is easy to solve using a service-oriented architecture, which in Foreman would look like:
- ENC: Critical service, better not cached, to avoid flipping changes back and forth.
- Reports: YAML processing will be slow with very large reports
- UI/API: User facing services, will get the least load
This has the great benefit of allowing failure to happen in one service at a time. It is more or less easy to set up a balancer (HAProxy, Apache mod_proxy_balancer). Passenger will also allow you to run your Foreman (or Puppet) multithreaded. I recommend at least two backends per service.
The architecture needed for smart-proxies (DHCP, DNS, etc…) depends very much on where the services are located. Usually, you will want at least one (preferably two) smart-proxy for each of your services, in each of your data centers.
Foreman is able to scale across sites for most capabilities without new instances: provisioning, inventory, compute resources, etc… do not need to scale ‘geographically’.
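The split can be pictured as a routing table in front of Foreman. The Python sketch below is only an illustration of the idea (in practice this lives in your balancer’s configuration). The URL prefixes are my assumption of where ENC and report traffic lands and should be checked against your Foreman version. The backend hostnames are hypothetical.

```python
from itertools import cycle

# Two backends per service, as recommended above. Hostnames are made up.
POOLS = {
    "enc": ["enc1.example.com", "enc2.example.com"],
    "reports": ["reports1.example.com", "reports2.example.com"],
    "ui_api": ["ui1.example.com", "ui2.example.com"],
}

# Assumed URL prefixes; verify them for your Foreman version.
ROUTES = [
    ("/node/", "enc"),
    ("/api/reports", "reports"),
    ("/reports/create", "reports"),
]


def service_for(path):
    """Pick the service pool for a request path; default to UI/API."""
    for prefix, service in ROUTES:
        if path.startswith(prefix):
            return service
    return "ui_api"


class Balancer:
    """Round-robin over the backends of whichever pool a path maps to."""

    def __init__(self, pools):
        self._cycles = {name: cycle(backends) for name, backends in pools.items()}

    def backend_for(self, path):
        return next(self._cycles[service_for(path)])


if __name__ == "__main__":
    balancer = Balancer(POOLS)
    for path in ["/node/web01.example.com", "/api/reports", "/hosts", "/api/reports"]:
        print(path, "->", balancer.backend_for(path))
```

A flood of huge reports then only ties up the reports pool, while ENC lookups and the UI keep answering from their own backends.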
Thanks to Red Hat, CERN, and everyone else who has contributed to this post in some way.
Please contact (me @ daniellobato dot me), or comment below if you want me to update any part of this blogpost. I’ll be very happy to get some feedback.