Updated Jul 7, 2012, 12:24 PM
===================================================
Background
I was hired a year ago as a Principal Linux Administrator for a large education company. My focus was to be automation: from bare-metal and virtual-host provisioning to lifecycle node management and application management. It was a foregone conclusion that Puppet would be front and center for the automated node and application management. Around the same time an effort was underway to deploy Ldap for account and authentication services. Having architected the iPlanet Ldap offering at Genuity, I was assigned the Ldap project after a recent organizational change. The inherited Ldap platform consisted of the RedHat Directory Server package, a supplier and several consumers. My preference would have been a pair of multi-masters with a directory proxy for load balancing and high availability, using consumers for scalability. During this period I designed our puppet infrastructure: a pair of puppetmasters, one acting as the certificate authority, along with haproxy for client request load balancing and HA.
We have a very diverse application environment, so a typical puppet cookie-cutter design would not work. We have nodes in several datacenters, running various distributions, with application dependencies on OS distro, package versions, and hardware configuration. There was no way we could use just a default node definition, and I wasn't satisfied with the nodes.pp solution, which requires creating node definitions in a flat file. The thought of training dozens of administrators and developers in the puppet language seemed inefficient. I preferred a collection of modules and classes that were not dependent on a local flat file and would instead gather their parameters from an external data source. This keeps the modules fairly static; there is no need to keep checking out some particular version of module parameters when changes or additions are needed.
Having completed our Ldap deployment, I looked at Puppet's ENC (external node classifier). The key item I noticed was that an ENC for ldap was already available. I don't believe that SQL is a good fit for defining node configuration: the configuration of a node can be limitless, let alone that of an application, and it would require some hefty SQL schema changes over time. We also needed HA for whatever data source we used. Our Ldap deployment was fairly robust, and the simplicity of adding attributes in Ldap makes it very appealing. Couple that with Ldap's excellent APIs and search capabilities, and it was a no-brainer.
We used CentOS 5.8 to host the puppetmasters and proxy. We configured apache2 with puppet-passenger to aid scalability; the scalability options in the puppetmaster alone are not quite at the level I'd like to see, and the use of certificates and a certificate authority complicates the setup. Since we were using an Ldap ENC, scalability of the puppetmasters was mandatory. We were aware that the Ldap lookups could impose some delays, so the use of a load-balanced cluster was paramount. We decided to create a dedicated puppet ldap data source: a pair of directory server multi-masters with a proxy. I liked this approach, and saw a parallel in what iPlanet did with their application server, putting the metadata into their own iPlanet Ldap server and bundling the two.
So now we have a structure that allows the use of off-the-shelf Ldap browsers to define our node and application configurations for Puppet. We are able to minimize changes to puppet parameter files, and we've simplified administrative tasks. The next step was to structure our Ldap DIT (Directory Information Tree), and I expect this will evolve. We've implemented some custom Facter facts that perform Ldap queries to simplify puppet class definitions. So we've made node declarations a fairly cookie-cutter process, enabling junior-level admins to define nodes without knowledge of Puppet, and we've minimized the number of fingers going in and creating puppet classes.
Structure
Puppet is an excellent tool for configuration management of systems. With the use of passenger, puppet servers can be scaled to handle large collections of systems. When the number of systems you are managing goes beyond a hundred, the use of node definition files becomes burdensome. It's much easier to put all the node definitions into Ldap and use an ENC (external node classifier) for puppet to access it. With Ldap HA and scalability features implemented, you now have a fairly robust storage service for all the nodes you manage with puppet, the key advantage being the power of ldapsearch.
I went beyond the general Ldap ENC usage by defining additional Ldap attributes and objectclasses tailored for the management of a particular resource. I then defined a unique DIT to represent the infrastructure and applications (aka platform), the environment (i.e. prd=production, ppe=pre-prod, dev=dev), and the machine role (i.e. proxy, webserver, database, etc.). My initial target for all this was the creation and management of webhosting cluster groups. Each group consists of a proxy, webservers, and optionally a database. I created custom Facter facts to complete some of the attributes needed, and constructed the puppet modules to query Ldap for packages, versions, and configuration data. I tried to put as many of the necessary configuration attributes into Facter as possible. This makes the creation of the puppet modules much easier, but pays a penalty in the execution time of Facter. It's a tradeoff: I could have done the ldap queries in a template or script, but I prefer to keep the module classes as simple and generic as possible.
Here's what my DIT (Directory Information Tree) looks like:
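In outline, reconstructed from the search bases used by the facts later in this article (the host, platform, role, environment, and distro values shown are examples):

```
dc=tsand,dc=org
    ou=Hosts
        cn=t1                          # node entries (puppetClient)
    ou=Platforms
        ou=tkc                         # platform
            ou=webserver               # role
                ou=prd                 # environment
                    ou=Packages
                        ou=CentOS      # distro
                            cn=...     # package entries
                    ou=Services
                        cn=...         # service entries
```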
I used the schema definition for puppet Ldap ENC, and extended the node definition to the following:
# t1, Hosts, tsand.org
dn: cn=t1,ou=Hosts,dc=tsand,dc=org
objectClass: top
objectClass: puppetClient
objectClass: device
objectClass: ipHost
cn: t1
environment: prd
ipHostNumber: 192.168.1.81
datacenter: dchome
platform: tkc
platformVersion: 1.0
cluster: tkc
puppetClass: base
puppetClass: gluster
puppetClass: ldap_client
puppetClass: puppetclient
role: webserver
Additional Ldap attributes were added:
datacenter: IA5string, single value, used to define the datacenter the machine resides in.
platform: IA5string, single value, used to define the application (or platform) that this machine is used for.
platformVersion: IA5string, single value, a version value that could be used to specify the version of application/platform.
cluster: IA5string, single value, used to identify the cluster group the machine belongs in.
role: IA5string, single value, used to specify the role the machine is used for.
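For illustration only, such attributes could be declared with directory-server schema definitions along these lines (the OIDs and the auxiliary objectclass name below are hypothetical placeholders, not our actual schema):

```
attributetype ( 1.3.6.1.4.1.99999.1.1 NAME 'datacenter'
    DESC 'datacenter the machine resides in'
    EQUALITY caseIgnoreIA5Match
    SYNTAX 1.3.6.1.4.1.1466.115.121.1.26 SINGLE-VALUE )

objectclass ( 1.3.6.1.4.1.99999.2.1 NAME 'siteHost'
    DESC 'site-specific host attributes' SUP top AUXILIARY
    MAY ( datacenter $ platform $ platformVersion $ cluster $ role ) )
```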
The above attributes were made available to Puppet by way of custom Facter facts. The Facter facts are loaded via the puppet class base.
A helper fact, ldapsuffix, was added to simplify the creation of ldapsearch base strings:
# /etc/puppet/modules/base/lib/facter/ldapsuffix.rb
# vi:set nu ai ap aw smd showmatch tabstop=4 shiftwidth=4:
Facter.add("ldapsuffix") do
    setcode do
        domain = Facter.value('domain').split('.')
        list = []
        domain.each do |x|
            list << "dc=" + x
        end
        list.join(',')
    end
end
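For example, on a node whose domain fact is tsand.org, the transformation above produces "dc=tsand,dc=org". The same logic, shown standalone outside Facter:

```ruby
# Standalone illustration of the ldapsuffix logic (no Facter required):
domain = 'tsand.org'.split('.')
list = []
domain.each { |x| list << "dc=" + x }
suffix = list.join(',')
# suffix => "dc=tsand,dc=org"
```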
I also added the environment variable as a Facter fact. The fact queries the ldap node definition rather than reading the environment variable. This was necessary when running the puppet agent standalone, since the settings in puppet.conf would otherwise apply instead of the settings returned from the ENC.
# /etc/puppet/modules/base/lib/facter/environment.rb
# vi:set nu ai ap aw smd showmatch tabstop=4 shiftwidth=4:
require 'ldap'
Facter.add("environment") do
    setcode do
        host = Facter.value('hostname')
        suffix = Facter.value('ldapsuffix')
        base = "ou=Hosts,#{suffix}"
        server = Puppet[:ldapserver]
        port = LDAP::LDAP_PORT
        scope = LDAP::LDAP_SCOPE_SUBTREE
        filter = "(&(objectclass=puppetclient)(cn=#{host}))"
        attrs = ['environment']
        data = ""    # initialize outside the block so the value survives it
        conn = LDAP::Conn.new(server,port)
        begin
            conn.search(base,scope,filter,attrs) { |entry|
                data = entry.vals('environment')
            }
        rescue ::LDAP::ResultError => e
            raise Puppet::ParseError, "ldapquery(): LDAP ResultError - #{e.message}"
        end
        data
    end
end
# /etc/puppet/modules/base/lib/facter/datacenter.rb
# vi:set nu ai ap aw smd showmatch tabstop=4 shiftwidth=4:
require 'ldap'
Facter.add("datacenter") do
    setcode do
        host = Facter.value('hostname')
        suffix = Facter.value('ldapsuffix')
        base = "ou=Hosts,#{suffix}"
        server = Puppet[:ldapserver]
        port = LDAP::LDAP_PORT
        scope = LDAP::LDAP_SCOPE_SUBTREE
        filter = "(&(objectclass=puppetclient)(cn=#{host}))"
        attrs = ['datacenter']
        data = ""
        conn = LDAP::Conn.new(server,port)
        begin
            conn.search(base,scope,filter,attrs) { |entry|
                data = entry.vals('datacenter')
            }
        rescue ::LDAP::ResultError => e
            raise Puppet::ParseError, "ldapquery(): LDAP ResultError - #{e.message}"
        end
        data
    end
end
# /etc/puppet/modules/base/lib/facter/platform.rb
# vi:set nu ai ap aw smd showmatch tabstop=4 shiftwidth=4:
require 'ldap'
Facter.add("platform") do
    setcode do
        host = Facter.value('hostname')
        suffix = Facter.value('ldapsuffix')
        base = "ou=Hosts,#{suffix}"
        server = Puppet[:ldapserver]
        port = LDAP::LDAP_PORT
        scope = LDAP::LDAP_SCOPE_SUBTREE
        filter = "(&(objectclass=puppetclient)(cn=#{host}))"
        attrs = ['platform']
        data = ""
        conn = LDAP::Conn.new(server,port)
        begin
            conn.search(base,scope,filter,attrs) { |entry|
                data = entry.vals('platform')
            }
        rescue ::LDAP::ResultError => e
            raise Puppet::ParseError, "ldapquery(): LDAP ResultError - #{e.message}"
        end
        data
    end
end
# /etc/puppet/modules/base/lib/facter/platformversion.rb
# vi:set nu ai ap aw smd showmatch tabstop=4 shiftwidth=4:
require 'ldap'
Facter.add("platformversion") do
    setcode do
        host = Facter.value('hostname')
        suffix = Facter.value('ldapsuffix')
        base = "ou=Hosts,#{suffix}"
        server = Puppet[:ldapserver]
        port = LDAP::LDAP_PORT
        scope = LDAP::LDAP_SCOPE_SUBTREE
        filter = "(&(objectclass=puppetclient)(cn=#{host}))"
        attrs = ['platformversion']
        data = ""
        conn = LDAP::Conn.new(server,port)
        begin
            conn.search(base,scope,filter,attrs) { |entry|
                data = entry.vals('platformversion')
            }
        rescue ::LDAP::ResultError => e
            raise Puppet::ParseError, "ldapquery(): LDAP ResultError - #{e.message}"
        end
        data
    end
end
# /etc/puppet/modules/base/lib/facter/cluster.rb
# vi:set nu ai ap aw smd showmatch tabstop=4 shiftwidth=4:
require 'ldap'
Facter.add("cluster") do
    setcode do
        host = Facter.value('hostname')
        suffix = Facter.value('ldapsuffix')
        base = "ou=Hosts,#{suffix}"
        server = Puppet[:ldapserver]
        port = LDAP::LDAP_PORT
        scope = LDAP::LDAP_SCOPE_SUBTREE
        filter = "(&(objectClass=puppetClient)(cn=#{host}))"
        attrs = ['cluster']
        data = ""
        conn = LDAP::Conn.new(server,port)
        begin
            conn.search(base,scope,filter,attrs) { |entry|
                data = entry.vals('cluster')
            }
        rescue ::LDAP::ResultError => e
            raise Puppet::ParseError, "ldapquery(): LDAP ResultError - #{e.message}"
        end
        data
    end
end
# /etc/puppet/modules/base/lib/facter/role.rb
# vi:set nu ai ap aw smd showmatch tabstop=4 shiftwidth=4:
require 'ldap'
Facter.add("role") do
    setcode do
        host = Facter.value('hostname')
        suffix = Facter.value('ldapsuffix')
        base = "ou=Hosts,#{suffix}"
        server = Puppet[:ldapserver]
        port = LDAP::LDAP_PORT
        scope = LDAP::LDAP_SCOPE_SUBTREE
        filter = "(&(objectclass=puppetclient)(cn=#{host}))"
        attrs = ['role']
        data = ""    # initialize outside the block so the value survives it
        conn = LDAP::Conn.new(server,port)
        begin
            conn.search(base,scope,filter,attrs) { |entry|
                data = entry.vals('role')
            }
        rescue ::LDAP::ResultError => e
            raise Puppet::ParseError, "ldapquery(): LDAP ResultError - #{e.message}"
        end
        data
    end
end
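The six single-attribute facts above differ only in the attribute name, so they could be generated in a loop from one file. This is just a sketch of that alternative (the node_search_params helper is made up, not part of the deployed code), with the requires guarded so the string-building logic stands on its own:

```ruby
# Sketch: generate the six node-attribute facts from one loop.
# Guarded requires: the parameter construction below is pure string work
# and does not need the ldap/facter gems to be present.
begin
  require 'ldap'
  require 'facter'
rescue LoadError
end

# Hypothetical helper: the search parameters shared by every fact.
def node_search_params(host, suffix, attr)
  { base:   "ou=Hosts,#{suffix}",
    filter: "(&(objectclass=puppetclient)(cn=#{host}))",
    attrs:  [attr] }
end

if defined?(Facter) && defined?(LDAP)
  %w[environment datacenter platform platformversion cluster role].each do |attr|
    Facter.add(attr) do
      setcode do
        p = node_search_params(Facter.value('hostname'),
                               Facter.value('ldapsuffix'), attr)
        data = ''
        conn = LDAP::Conn.new(Puppet[:ldapserver], LDAP::LDAP_PORT)
        begin
          conn.search(p[:base], LDAP::LDAP_SCOPE_SUBTREE, p[:filter], p[:attrs]) do |entry|
            data = entry.vals(attr)
          end
        rescue ::LDAP::ResultError => e
          raise Puppet::ParseError, "ldapquery(): LDAP ResultError - #{e.message}"
        end
        data
      end
    end
  end
end
```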
When making changes in Ldap, sometimes it's desirable to have nodes perform a puppet agent run. The following script will issue a "puppet kick" to selected nodes.
Usage:
poke --platform platformname --env environment --> will execute puppet on all nodes that match platform & environment
poke --platform platformname --env environment --role role --> will execute puppet on all nodes that match platform, environment & role
Example:
poke -p tkc -e prd
poke -p tkc -e prd -r webserver
Here's the code:
#! /bin/bash
# vi:set nu ai ap aw smd showmatch tabstop=4 shiftwidth=4:
# our programname
pname=$(basename "$0")
# usage function
function usage() {
    echo "${pname}: options"
    echo "${pname} [-p platform|--platform platform][-r role|--role role][-e env|--env env]"
    echo ""
    exit 1
}
# error message
function Die() {
    echo "$*"
    exit 1
}
# initialize our variables
platform="none"
env="none"
# process command line arguments
out=$(getopt -o p:r:e:h --long platform:,role:,env:,help -n "$pname" -- "$@") || usage
eval set -- "$out"
while true ; do
    case "$1" in
        -p|--platform) platform=$2; shift 2 ;;
        -r|--role) role=$2; shift 2 ;;
        -e|--env) env=$2; shift 2 ;;
        -h|--help) usage ;;
        --) shift; break ;;
        *) usage ;;
    esac
done
# confirm that at least platform & environment are declared
if [ "$platform" == "none" ]
then
    Die "ERROR: must declare platform"
fi
if [ "$env" == "none" ]
then
    Die "ERROR: must declare environment"
fi
# status display
echo ">Platform: $platform"
echo ">Environment: $env"
[ -n "$role" ] && echo ">Role: $role"
# query ldap to get our hosts list
hostlist=$(ldapsearch -x -b "ou=hosts,dc=tsand,dc=org" "(&(platform=$platform)${role:+(role=$role)}(environment=$env))" "cn" | sed -n -e 's/^cn: \(.*\)$/\1.tsand.org/gp')
# and ask puppet to run on those nodes
puppet kick $hostlist
exit 0
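The sed stage of the pipeline above turns the "cn:" lines of the ldapsearch output into fully-qualified hostnames. A quick illustration with simulated ldapsearch output:

```shell
# Feed fake "cn:" lines through the same sed expression; the "dn:" line
# does not match the anchor and is dropped.
printf 'dn: cn=t1,ou=Hosts,dc=tsand,dc=org\ncn: t1\ncn: t2\n' \
    | sed -n -e 's/^cn: \(.*\)$/\1.tsand.org/gp'
# prints:
# t1.tsand.org
# t2.tsand.org
```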
Systems are classified by:
platform: a label used to distinguish an application.
environment: a label used to identify the use of a machine (i.e. production, ppe, dev, etc.)
role: a fixed string used to identify the purpose of, and the packages necessary for, a given machine (i.e. proxy, webserver, database, shared-storage, etc.)
All systems, by way of their ldap nodes entry, will include the "base" class.
class base {
    include infrastructure    # the base OS packages, configured based on distro & environment
    include $role             # the packages necessary for the role of the machine
}
For example, a machine configured with role "proxy", would perform
...
include proxy
...
This in-turn would process:
class proxy {
    include proxy_package
    include proxy_config
}
# /etc/puppet/modules/proxy/manifests/init.pp
# vi:set nu ai ap aw smd showmatch tabstop=4 shiftwidth=4:
class proxy_package {
    define proxy_package::add() {
        $tmp = split($name,':')
        $pkgname = regsubst($tmp[0],'\n|\r','')
        $ensure = regsubst($tmp[1],'\n|\r','')
        $enable = regsubst($tmp[2],'\n|\r','')
        $srvensure = regsubst($tmp[3],'\n|\r','')
        package { "${pkgname}":
            ensure => $ensure,
            alias  => proxy,
        }
        service { "${pkgname}":
            enable  => $enable,
            ensure  => $srvensure,
            alias   => proxy,
            require => Package['proxy'],
        }
    }
    $pkgs = split($packages,',')
    proxy_package::add { $pkgs: }
}
The definition of Facter fact "packages" is as follows:
# /etc/puppet/modules/base/lib/facter/packages.rb
# vi:set nu ai ap aw smd showmatch tabstop=4 shiftwidth=4:
require 'ldap'
Facter.add("packages") do
    setcode do
        platform = Facter.value('platform')
        role = Facter.value('role')
        env = Facter.value('environment')
        suffix = Facter.value('ldapsuffix')
        distro = Facter.value('lsbdistid')
        base1 = "ou=#{distro},ou=Packages,ou=#{env},ou=#{role},ou=#{platform},ou=Platforms,#{suffix}"
        base2 = "ou=Services,ou=#{env},ou=#{role},ou=#{platform},ou=Platforms,#{suffix}"
        server = Puppet[:ldapserver]
        port = LDAP::LDAP_PORT
        scope = LDAP::LDAP_SCOPE_SUBTREE
        filter = "(cn=*)"
        attrs1 = ['cn','ensure']
        attrs2 = ['enable','ensure']
        list = []
        conn1 = LDAP::Conn.new(server,port)
        conn2 = LDAP::Conn.new(server,port)
        begin
            conn1.search(base1,scope,filter,attrs1) { |entry1|
                d1 = entry1.vals('cn').to_s
                d2 = entry1.vals('ensure').to_s
                begin
                    conn2.search(base2,scope,filter,attrs2) { |entry2|
                        d3 = entry2.vals('enable').to_s
                        d4 = entry2.vals('ensure').to_s
                        if d1 != "" && d2 != "" && d3 != "" && d4 != ""
                            list << d1 + ":" + d2 + ":" + d3 + ":" + d4
                        end
                    }
                rescue ::LDAP::ResultError
                    # no Services branch for this platform/role/env; skip
                end
            }
        rescue ::LDAP::ResultError
            # no Packages branch for this distro; return an empty list
        end
        list.join(",")
    end
end
The "packages" fact queries both the packages branch and the services branch in ldap, and returns data in the following format:
packagename:install-state|version:enable:state
Where:
packagename: the distro-specific name of the package
install-state|version: one of the strings "installed", "present", or "absent", or a specific version
enable: true or false (whether to enable the RC script for the service related to this package)
state: running or stopped (allows control of the related service via ldap)
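As an illustration (the package name and version here are made-up values), one entry of the fact and the fields that proxy_package::add splits out of it:

```ruby
# One hypothetical entry from the comma-separated "packages" fact value,
# split on ':' the same way proxy_package::add does:
entry = 'httpd:2.2.3:true:running'
pkgname, pkg_ensure, svc_enable, svc_ensure = entry.split(':')
# pkgname    => "httpd"    (package name)
# pkg_ensure => "2.2.3"    (install-state or version)
# svc_enable => "true"     (enable the service RC script)
# svc_ensure => "running"  (desired service state)
```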
This structure allows what packages are installed, their versions, and their running state to all be controlled from ldap, accommodating distro and environment differences.
Facter was extended to accommodate distro and version differences by eliminating the flat file of parameters and moving those "parameters" into ldap.
This allows the use of conventional ldap editors & browsers to make changes specific to a platform, environment, distro, version, etc.