Example (42) Infrastructure Design Guidelines
Author: Alessandro Franceschi ( Lab42 )
- [[Infrastructure_Design_Guidelines#Introduction|Introduction]]
- [[Infrastructure_Design_Guidelines#Preliminarynotes|Preliminary notes]]
- [[Infrastructure_Design_Guidelines#Verysimpleinfrastructure|Very simple infrastructure]]
- [[Infrastructure_Design_Guidelines#Infrastructurewithroles|Infrastructure with roles]]
- [[Infrastructure_Design_Guidelines#Infrastructurewithdifferentrolesandzones|Infrastructure with different roles and zones]]
Introduction
Designing a Puppet infrastructure is a matter of knowledge, method,
circumstance and, to some extent, imagination.
First of all you must know Puppet’s logic and its main language
features. Then you should define a general method to manage what
the configurations of your hosts have in common and where they
differ; this depends mostly on your own infrastructure and needs.
Finally you can add a bit of creativity to handle special
situations and one-off cases.
As usual in the Unix world, there are different ways to achieve the
desired result, and no single solution or recommendation fits every
case. Still, we try to define here different scenarios and the
relevant “good practices”, well aware that there might be totally
different and equally good practices for handling the same cases.
The guidelines defined here are being applied to the
Example42 Puppet Infrastructure (a
sample infrastructure that can be freely used as a starting point for
customization) by Lab42. Regards and credits
to Francesco Crippa of Byte-Code for
the initial architectural approach.
Preliminary notes
- By “role” we mean here the function of a host. Defining a role
makes sense when at least 2 nodes share it.
It can be an arbitrary string, such as “webserver”, and should be shared by all the hosts that run exactly the same services, where configurations generally tend to be similar and differ only in details such as local hostname, IP address and the like.
For example, a battery of frontend web servers can share the same role (e.g. role: “webserver”); they can be balanced by a pair of load balancers in HA (role: “loadbalancer”), use a backend database cluster (role: “database”), be monitored by one or more monitoring hosts (“monitor”), send syslog messages to one or more syslog servers (“syslog”) and so on.
It’s worth underlining that if you use the concept of role it’s better to use it always, even for roles that apply to a single host.
- A “zone” can generally be seen as a separate network. For each zone you can define variables for the parameters that change from zone to zone: for example the network IP/subnet and the default gateway, but also the dns/ntp/syslog/whatever server that all nodes in the same zone share. A zone can also identify development / testing / staging / production environments, possibly divided into sub-zones if any of them spans different networks.
- The general logic is that every node (host) inherits from a more general node (more precisely a (sub)zone, which could in turn inherit from a “wider” zone) and includes a single role (more precisely, a class defining the role).
- The examples here follow a module-based logic, as defined in Module Organisation.
The practices used here have been applied successfully in different
companies, ranging from a few nodes to, in the largest case, about
200 nodes sharing different roles (more than 20) and different zones
(about 10). They should apply seamlessly to larger installations,
where the number of nodes could reach several hundred, sharing
dozens of roles and zones.
We will not address here the planning of a distributed and
redundant puppetmaster infrastructure, the delegation of editing
permissions to different groups, or how to cope with
testing/production Puppet configurations (but we will cover the case
of an infrastructure with development/testing/production nodes).
We’ll start from simple cases and then move on to more complex
scenarios.
Very simple infrastructure
If you have few nodes to manage, all sharing the same network and with no need to define roles, the logic is simple and can be reduced to defining nodes like this:
node basenode {
  $my_puppet_server = "10.42.0.10"
  $my_local_network = "10.42.0.0/24"
  $my_syslog_server = "10.42.0.11"
  $my_ntp_server = "10.42.0.12"
}
node 'www.example42.com' inherits basenode {
  include general
  include httpd::php
  include mysql::server
}
Note that in basenode you can define variables used in the templates of your classes; these variables can be overridden at the host node level to manage exceptions. For example:
node 'ntp.example42.com' inherits basenode {
  $my_ntp_server = "0.pool.ntp.org"
  include general
}
Note that it is important to declare variables BEFORE including the classes that use them.
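The ordering issue can be sketched with the ntp node. The fragment below uses the same legacy style as the other examples here (node inheritance, with variables resolved through the enclosing node scope), which newer Puppet releases no longer support. The second node name is hypothetical, and the comments describe the expected behaviour rather than a verified run.

```puppet
# Order that works: the override is set before the classes are evaluated
node 'ntp.example42.com' inherits basenode {
  $my_ntp_server = "0.pool.ntp.org"
  include general   # classes are expected to see "0.pool.ntp.org"
}

# Order to avoid (hypothetical node): "include" evaluates the classes at once,
# so they are expected to still see the value inherited from basenode
node 'ntp2.example42.com' inherits basenode {
  include general   # classes are expected to see the basenode value
  $my_ntp_server = "0.pool.ntp.org"
}
```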
It’s a good practice to define a class that provides general configurations applied to every node. This class should just include all the common classes. Something like:
class general {
  include yum
  include hosts
  include puppet
  include iptables
  include sysctl
  include nrpe
  include ntp
  include syslog
}
In a simple environment you may prefer sourcing static files to
using templates, since their content is not likely to change within
your infrastructure.
A syslog class, for example, can be:
class syslog {
  package {
    "syslogd":
      ensure => present,
      name   => $operatingsystem ? {
        default => "sysklogd",
      },
  }
  file {
    "syslog.conf":
      owner   => "root",
      group   => "root",
      mode    => "640",
      require => Package["syslogd"],
      path    => $operatingsystem ? {
        default => "/etc/syslog.conf",
      },
      ## If you want to use a template:
      content => template("syslog/syslog.conf.erb"),
      ## If you want to source a static file:
      ## source => "puppet://$server/syslog/syslog.conf",
  }
  service {
    "syslog":
      enable    => "true",
      ensure    => "running",
      hasstatus => "true",
      require   => File["syslog.conf"],
      subscribe => File["syslog.conf"],
      name      => $operatingsystem ? {
        default => "syslog",
      },
  }
}
In this case you can define the content of your syslog.conf either in the template MODULEDIR/syslog/templates/syslog.conf.erb or in the static file MODULEDIR/syslog/files/syslog.conf; of course the two options are mutually exclusive.
Infrastructure with roles
If you have various nodes with a similar function, it’s worth considering the use of roles shared by different nodes (note that the concept of role is not intrinsic to Puppet, but just an arbitrary way to summarize functions). Something like:
node 'www1.example42.com' inherits basenode {
  include role_webserver # (the role_ prefix is arbitrary and not strictly necessary)
}
node 'www2.example42.com' inherits basenode {
  include role_webserver
}
node 'www3.example42.com' inherits basenode {
  include role_webserver
}
node 'lb1.example42.com' inherits basenode {
  include role_loadbalancer
}
node 'lb2.example42.com' inherits basenode {
  include role_loadbalancer
}
You then define roles in normal classes, with something like:
class role_webserver {
  $my_role = "webserver"
  include general
  include httpd::php
}
class role_loadbalancer {
  $my_role = "loadbalancer"
  include general
  include lvs
}
Note the definition of the $my_role variable at the beginning
of the class.
It’s recommended to define such a variable because it can be useful
in different situations, where you must define totally different
configurations according to the role of the host.
For example iptables rules can be crafted to be the same for all
the nodes of the same role:
class iptables {
  service {
    "iptables":
      name       => $operatingsystem ? {
        default => "iptables",
      },
      ensure     => running,
      enable     => true,
      hasrestart => false,
      # Reload the ruleset from the managed file instead of using the init script
      restart    => $operatingsystem ? {
        default => "iptables-restore < /etc/sysconfig/iptables",
      },
      hasstatus  => true,
      subscribe  => File["iptables"],
  }
  file {
    "iptables":
      mode => 600, owner => root, group => root,
      ensure => present,
      path   => $operatingsystem ? {
        default => "/etc/sysconfig/iptables",
      },
      # The first source that exists on the server is used
      source => [ "puppet://$server/iptables/iptables-$my_role" , "puppet://$server/iptables/iptables" ],
  }
}
Here you can define the rules for webservers in
MODULEDIR/iptables/files/iptables-webserver, the rules for
loadbalancers in MODULEDIR/iptables/files/iptables-loadbalancer
and a default ruleset, applied if no role-specific file has been
defined, in MODULEDIR/iptables/files/iptables.
You can easily manage host-based exceptions by changing the source
definition to something like:
source => [ "puppet://$server/iptables/iptables-$hostname" , "puppet://$server/iptables/iptables-$my_role" , "puppet://$server/iptables/iptables" ],
and then, where necessary, creating a file like MODULEDIR/iptables/files/iptables-lb1 to apply specific settings for the host lb1.
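Put together, the file resource with the three-level fallback might look like the sketch below. It assumes the module layout and the legacy `puppet://$server/...` source paths used in this document, and the comments give example file names only. Puppet tries the sources in order and uses the first one that exists on the server.

```puppet
file { "iptables":
  ensure => present,
  path   => "/etc/sysconfig/iptables",
  owner  => root,
  group  => root,
  mode   => 600,
  # Most specific first: host, then role, then the default ruleset
  source => [
    "puppet://$server/iptables/iptables-$hostname",  # e.g. files/iptables-lb1
    "puppet://$server/iptables/iptables-$my_role",   # e.g. files/iptables-loadbalancer
    "puppet://$server/iptables/iptables"             # default for everything else
  ],
}
```

With this ordering, a host without a file of its own falls back to its role's ruleset, and a role without a file falls back to the default.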
Another way to use a variable like $my_role is directly in templates. You can replace the source line above with:
content => template("iptables/iptables.erb"),
and create a MODULEDIR/iptables/templates/iptables.erb with something like:
*filter
:INPUT DROP [0:0]
:FORWARD DROP [0:0]
:OUTPUT DROP [0:0]
-A INPUT -i lo -j ACCEPT
-A INPUT -p icmp -j ACCEPT
-A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
# SSH allowed only from management console
-A INPUT -s 10.42.0.200 -j ACCEPT
# Role specific settings
<% if my_role=="webserver" %>
-A INPUT -p tcp --dport 80 -j ACCEPT
-A INPUT -p tcp --dport 443 -j ACCEPT
<% end %>
<% if my_role=="dbserver" %>
-A INPUT -s 10.42.0.0/24 -p tcp --dport 3306 -j ACCEPT
<% end %>
-A INPUT -m pkttype --pkt-type UNICAST -j LOG --log-prefix "[INPUT DROP] : "
-A FORWARD -j LOG --log-prefix "[FORWARD DROP] : "
-A OUTPUT -m state --state NEW,RELATED,ESTABLISHED -j ACCEPT
-A OUTPUT -m pkttype --pkt-type UNICAST -j LOG --log-prefix "[OUTPUT DROP] : "
COMMIT
Infrastructure with different roles and zones
More complex scenarios can involve several nodes (scaling up to
hundreds) that use different roles and are placed in different
networks with different functions (e.g.
development/testing/production…).
In these cases it’s recommended to work with node inheritance,
managing the relevant variables at different levels according to
your needs. For example:
node basenode {
  $my_puppet_server = "10.42.0.10"
  $my_syslog_server = "10.42.0.11"
  $my_ntp_server = "10.42.0.12"
}
node devel inherits basenode {
  $my_local_network = "192.168.0.0/24"
  $my_syslog_server = "192.168.0.11"
  $my_zone = "devel"
}
node test inherits basenode {
  $my_local_network = "10.42.1.0/24"
  $my_syslog_server = "10.42.1.11"
  $my_zone = "test"
}
node prod inherits basenode {
  $my_local_network = "10.42.0.0/24"
  $my_zone = "prod"
}
node 'www1.example42.com' inherits prod {
  include role_webserver
}
node 'www1.example42.devel' inherits devel {
  include role_webserver
}
Such an approach leaves you free to define per-zone settings while
keeping the possibility of overriding them at more specific
levels.
The inheritance tree can have more intermediate nodes, according to
your own infrastructure, but to avoid headaches and excessive
complexity it is important that each host has a single, linear
inheritance chain (i.e. node inherits subzone inherits zone inherits
basenode).
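A linear chain with one intermediate sub-zone could be sketched as follows, in the same legacy node-inheritance style as the examples above. The `prod_dmz` node, the `$my_subzone` variable and the 10.42.2.0/24 network are hypothetical names chosen for illustration.

```puppet
node basenode {
  $my_puppet_server = "10.42.0.10"
  $my_ntp_server = "10.42.0.12"
}

# Zone: settings shared by all production hosts
node prod inherits basenode {
  $my_local_network = "10.42.0.0/24"
  $my_zone = "prod"
}

# Sub-zone (hypothetical): a separate network inside production
node prod_dmz inherits prod {
  $my_local_network = "10.42.2.0/24"  # overrides the zone-level value
  $my_subzone = "dmz"
}

# Host: exactly one parent, so the chain is host -> subzone -> zone -> basenode
node 'www1.example42.com' inherits prod_dmz {
  include role_webserver
}
```

Each level only states what differs from its parent, so a value can be traced by walking up a single path.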
Note also that zones (like roles, these are not a Puppet internal
concept) can be related to IP networks but also to functional
levels (prod/test/devel…) or geographical locations
(headquarters, branch office…). The use of a $my_zone
variable has the same advantages as the $my_role variable: it can
be used in many different places to manage differences between
zones. Another example:
class general {
  include yum
  include hosts
  include puppet
  include iptables
  include sysctl
  include nrpe
  include ntp
  include syslog
  case $my_zone {
    prod: { include hardening }
    test: { include hardening }
    default: { }
  }
}
So, for each node, you have 2 main characterizations:
- The zone (network or whatever) where it sits (inherited from a
higher-level node)
- The role (function) it has (included as a class)
These should be enough to cover many different scenarios of varying
complexity, satisfying both the need for high-level
standardization and the need for host-level characterization.
In a typical development / testing / production infrastructure you
will have nodes sharing the same role (so you are sure that setups
and configurations are coherent) and belonging to different zones,
where you can define different settings and variables.
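As a sketch of how the two characterizations can meet, a role class can stay identical across zones and still branch on $my_zone for the few settings that differ. This is a variation on the role_webserver class shown earlier, not a replacement for it. The `httpd::debug` class is hypothetical, and the fragment relies on the same legacy variable scoping as the rest of the examples.

```puppet
class role_webserver {
  $my_role = "webserver"
  include general
  include httpd::php

  # Zone-specific addition (hypothetical class): only outside production
  case $my_zone {
    devel, test: { include httpd::debug }
    default: { }
  }
}
```

The node definitions do not change: www1.example42.com and www1.example42.devel both include role_webserver, and the zone they inherit decides which branch applies.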