Infrastructure Design Guidelines

Version 2 (Anonymous, 03/13/2010 08:01 pm)

# Example (42) Infrastructure Design Guidelines

Author: Alessandro Franceschi (Lab42)

- [[Infrastructure\_Design\_Guidelines#Introduction|Introduction]]
- [[Infrastructure\_Design\_Guidelines#Preliminarynotes|Preliminary notes]]
- [[Infrastructure\_Design\_Guidelines#Verysimpleinfrastructure|Very simple infrastructure]]
- [[Infrastructure\_Design\_Guidelines#Infrastructurewithroles|Infrastructure with roles]]
- [[Infrastructure\_Design\_Guidelines#Infrastructurewithdifferentrolesandzones|Infrastructure with different roles and zones]]

## Introduction

Designing a Puppet infrastructure is a matter of knowledge, method,
contingency and a bit of imagination.  
First of all you must know Puppet's logic and its main language
features. Then you should define a general method to manage what the
configurations of your hosts have in common and where they differ;
this mostly depends on your own infrastructure and needs. Finally,
you can add a bit of creativity to handle special situations and
one-off cases.  
As usual in the Unix world, there are different ways to achieve the
desired result, and there is no single solution or recommendation
that fits every case. Still, we try to define here different
scenarios and the relevant "good practices", well aware that there
may be totally different and equally good practices to handle the
same cases.  
The guidelines defined here are being applied to the
[Example42 Puppet Infrastructure](http://www.example42.com) (a
sample infrastructure that can be freely used as a starting point
for customization) by [Lab42](http://www.lab42.it). Regards and
credits to Francesco Crippa of [Byte-Code](http://www.byte-code.com)
for the initial architectural approach.

## Preliminary notes

- Here by "**role**" we mean the function of a host. Defining a
role makes sense when at least 2 nodes have the same role.  
A role can be an arbitrary string, such as "webserver", and should
be shared by all the hosts that run exactly the same services,
where configurations generally tend to be similar and differ only
in details such as the local hostname, the IP address and the like.  
For example, a battery of frontend web servers can share the same
role (e.g. role: "webserver"); they can be balanced by a pair of
load balancers in HA (e.g. role: "loadbalancer"), use a backend
database cluster (e.g. role: "database"), be monitored by one or
more monitoring hosts ("monitor"), send syslog messages to one or
more syslog servers ("syslog") and so on.  
It's worth underlining that if you use the concept of role it's
better to always use roles, even when a role is used by a single
host only.  
- A "**zone**" can generally be seen as a separate network. For
each zone you can define variables for the parameters that change
from zone to zone: for example the network IP/subnet and the
default gateway, but also the dns/ntp/syslog/whatever server that
all nodes in the same zone share. A zone can also identify
development / testing / staging / production environments,
possibly divided into sub-zones if any of them spans different
networks.  
- The general logic is that every node (host) inherits from a more
general node (more precisely a (sub)zone, which could in turn
inherit from a "wider" zone) and includes a single role (more
precisely a class defining the role).  
- The examples here follow a module-based logic, as defined
in [[Module Organisation]].

The practices used here have been applied successfully in different
companies, ranging from a few nodes to, in the largest case, about
200 nodes sharing different roles (more than 20) and different zones
(about 10). They should apply seamlessly to larger installations,
with several hundred nodes sharing dozens of roles and zones.  
We won't address here the issues of planning a distributed and
redundant puppetmaster infrastructure, the delegation of editing
permissions to different groups, or how to cope with
testing/production Puppet configurations (but we will cover the case
of an infrastructure with development/testing/production nodes).  
We'll start from simple cases and then move on to more complex
scenarios.

## Very simple infrastructure

If you have few nodes to manage, all sharing the same network and
with no need to define roles, the logic is simple and can be
reduced to defining nodes along these lines:

    node basenode {
            $my_puppet_server = "10.42.0.10"
            $my_local_network = "10.42.0.0/24"

            $my_syslog_server = "10.42.0.11"
            $my_ntp_server = "10.42.0.12"
    }

    node 'www.example42.com' inherits basenode {
            include general
            include httpd::php
            include mysql::server
    }

Note that in basenode you can define variables used in the
templates of your classes; these variables can be overridden at the
host node level to manage exceptions. For example:

    node 'ntp.example42.com' inherits basenode {
            $my_ntp_server = "0.pool.ntp.org"

            include general
    }

Note that it is important to declare variables BEFORE including the
classes that use them.
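
To illustrate the ordering rule, here is a minimal sketch based on the ntp example above. The host names `ntp2.example42.com` and `ntp3.example42.com` are hypothetical, and the sketch assumes, as in the Puppet versions these guidelines target, that a class is evaluated at the point where it is included:

    # Correct order: the variable is set before the classes that use it
    node 'ntp2.example42.com' inherits basenode {
            $my_ntp_server = "0.pool.ntp.org"
            include general
    }

    # Problematic order: "general" (and the ntp class it includes) is
    # evaluated first, so its templates would most likely still see the
    # value inherited from basenode ("10.42.0.12")
    node 'ntp3.example42.com' inherits basenode {
            include general
            $my_ntp_server = "0.pool.ntp.org"
    }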

It's good practice to define a class that provides the general
configurations applied to every node. This class should just
include all the common classes. Something like:

    class general {
            include yum
            include hosts
            include puppet
            include iptables
            include sysctl
            include nrpe
            include ntp
            include syslog
    }

In a simple environment you may prefer sourcing static files over
using templates, since their content is not likely to change within
your infrastructure.  
A syslog class, for example, can be:

    class syslog {
            package {
                "syslogd":
                    ensure  => present,
                    name    => $operatingsystem ? {
                            default => "sysklogd",
                            },
            }

            file {
                "syslog.conf":
                    owner   => "root",
                    group   => "root",
                    mode    => "640",
                    require => Package["syslogd"],
                    path    => $operatingsystem ? {
                            default => "/etc/syslog.conf",
                            },
                    ## If you want to use a template:
                    content => template("syslog/syslog.conf.erb"),

                    ## If you want to source a static file:
                    ## source => "puppet://$server/syslog/syslog.conf",
            }

            service {
                "syslog":
                    enable    => true,
                    ensure    => running,
                    hasstatus => true,
                    require   => File["syslog.conf"],
                    subscribe => File["syslog.conf"],
                    name      => $operatingsystem ? {
                            default => "syslog",
                            },
            }
    }

In this case you can define the content of your syslog.conf either
in the template **MODULEDIR/syslog/templates/syslog.conf.erb** or
in the static file **MODULEDIR/syslog/files/syslog.conf**; of
course the two options are mutually exclusive.

## Infrastructure with roles

If you have various nodes with a similar function, it's worth
considering the use of roles shared by different nodes (note that
the concept of role is not intrinsic to Puppet but just an arbitrary
way to summarize functions). Something like:

    node 'www1.example42.com' inherits basenode {
            include role_webserver # (the role_ prefix is arbitrary and not strictly necessary)
    }
    node 'www2.example42.com' inherits basenode {
            include role_webserver
    }
    node 'www3.example42.com' inherits basenode {
            include role_webserver
    }
    node 'lb1.example42.com' inherits basenode {
            include role_loadbalancer
    }
    node 'lb2.example42.com' inherits basenode {
            include role_loadbalancer
    }

You then define roles in normal classes, with something like:

    class role_webserver {
            $my_role = "webserver"
            include general
            include httpd::php
    }
    class role_loadbalancer {
            $my_role = "loadbalancer"
            include general
            include lvs
    }
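
Following the earlier advice to use roles consistently, a role used by a single host can be defined in exactly the same way. In this sketch the `role_monitor` class, the `nagios` module and the host name are hypothetical:

    # A role with only one member still gets its own class
    class role_monitor {
            $my_role = "monitor"
            include general
            include nagios
    }

    node 'monitor.example42.com' inherits basenode {
            include role_monitor
    }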

Note the definition of the **$my\_role** variable at the beginning
of each class.  
It's recommended to define such a variable because it can be useful
in different situations where you must define totally different
configurations according to the role of the host.  
For example, iptables rules can be crafted to be the same for all
the nodes with the same role:

    class iptables {
            service {
                "iptables":
                    name => $operatingsystem ? {
                            default => "iptables",
                            },
                    ensure => running,
                    enable => true,
                    hasrestart => false,
                    restart => $operatingsystem ? {
                            default => "iptables-restore < /etc/sysconfig/iptables",
                            },
                    hasstatus => true,
                    subscribe => File["iptables"],
            }

            file {
                "iptables":
                    mode => 600, owner => root, group => root,
                    ensure => present,
                    path => $operatingsystem ? {
                            default => "/etc/sysconfig/iptables",
                            },
                    source => [ "puppet://$server/iptables/iptables-$my_role" , "puppet://$server/iptables/iptables" ],
            }
    }

Here you can define the rules for webservers in
**MODULEDIR/iptables/files/iptables-webserver**, the rules for
loadbalancers in **MODULEDIR/iptables/files/iptables-loadbalancer**
and a default ruleset, applied if no role-specific file has been
defined, in **MODULEDIR/iptables/files/iptables**.  
You can easily manage host-based exceptions by changing the source
definition to something like:

    source => [ "puppet://$server/iptables/iptables-$hostname" , "puppet://$server/iptables/iptables-$my_role" , "puppet://$server/iptables/iptables" ],

and then, where necessary, creating a file like
**MODULEDIR/iptables/files/iptables-lb1** to apply specific
settings to the host lb1.
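
Put together, the `file` resource of the iptables class with the host-level fallback would look roughly like this. It is a sketch that only merges the snippets above; Puppet uses the first entry in the `source` list that exists on the server:

    file {
        "iptables":
            mode => 600, owner => root, group => root,
            ensure => present,
            path => $operatingsystem ? {
                    default => "/etc/sysconfig/iptables",
                    },
            # Tried in order: host-specific, role-specific, default
            source => [
                "puppet://$server/iptables/iptables-$hostname",
                "puppet://$server/iptables/iptables-$my_role",
                "puppet://$server/iptables/iptables"
            ],
    }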

Another way to use a variable like $my\_role is directly in
templates. You can replace the `source` parameter shown above with:

    content => template("iptables/iptables.erb"),

and create a **MODULEDIR/iptables/templates/iptables.erb** with
something like:

    *filter
    :INPUT DROP [0:0]
    :FORWARD DROP [0:0]
    :OUTPUT DROP [0:0]
    -A INPUT -i lo -j ACCEPT
    -A INPUT -p icmp -j ACCEPT
    -A INPUT -m state --state RELATED,ESTABLISHED -j ACCEPT
    # SSH allowed only from management console
    -A INPUT -s 10.42.0.200 -p tcp --dport 22 -j ACCEPT

    # Role specific settings
    <% if my_role=="webserver" %>
    -A INPUT -p tcp --dport 80 -j ACCEPT
    -A INPUT -p tcp --dport 443 -j ACCEPT
    <% end %>

    <% if my_role=="dbserver" %>
    -A INPUT -s 10.42.0.0/24 -p tcp --dport 3306 -j ACCEPT
    <% end %>

    -A INPUT -m pkttype --pkt-type UNICAST -j LOG --log-prefix "[INPUT DROP] : "
    -A FORWARD -j LOG --log-prefix "[FORWARD DROP] : "
    -A OUTPUT -m state --state NEW,RELATED,ESTABLISHED -j ACCEPT
    -A OUTPUT -m pkttype --pkt-type UNICAST -j LOG --log-prefix "[OUTPUT DROP] : "
    COMMIT

## Infrastructure with different roles and zones

More complex scenarios can involve several nodes (scaling up to
hundreds) using different roles and placed in different networks
with different functions (e.g. development/testing/production...).  
In these cases it's recommended to work on node inheritance,
managing the relevant variables at different levels according to
your needs. For example:

    node basenode {
            $my_puppet_server = "10.42.0.10"
            $my_syslog_server = "10.42.0.11"
            $my_ntp_server = "10.42.0.12"
    }

    node devel inherits basenode {
            $my_local_network = "192.168.0.0/24"
            $my_syslog_server = "192.168.0.11"
            $my_zone = "devel"
    }

    node test inherits basenode {
            $my_local_network = "10.42.1.0/24"
            $my_syslog_server = "10.42.1.11"
            $my_zone = "test"
    }

    node prod inherits basenode {
            $my_local_network = "10.42.0.0/24"
            $my_zone = "prod"
    }

    node 'www1.example42.com' inherits prod {
            include role_webserver
    }

    node 'www1.example42.devel' inherits devel {
            include role_webserver
    }

Such an approach leaves you free to define per-zone settings while
keeping the possibility to override them at more specific levels.  
The inheritance tree can have more intermediate nodes, according to
your own infrastructure, but to avoid headaches and excess
complexity it's important to have a single, linear inheritance tree
for each host (i.e. node inherits subzone inherits zone inherits
basenode).  
Note also that zones (like roles, these are not a Puppet internal
concept) can be related to IP networks but also to functional
levels (prod/test/devel...) or geographical locations
(headquarters, branch office...). The use of a **$my\_zone**
variable has the same advantages as the $my\_role variable: it can
be used in many different places to manage differences between
zones. Another example:

    class general {
            include yum
            include hosts
            include puppet
            include iptables
            include sysctl
            include nrpe
            include ntp
            include syslog

            case $my_zone  {
                prod: { include hardening }
                test: { include hardening }
                default:  {  }
            }
    }
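
As a sketch of a deeper but still linear inheritance chain (node inherits subzone inherits zone inherits basenode), the `prod` zone defined earlier could be split into sub-zones. The sub-zone name `dmz`, its addresses and the host names below are hypothetical:

    # Sub-zone: inherits everything from prod ($my_zone stays "prod")
    # and overrides only what differs in this network
    node dmz inherits prod {
            $my_local_network = "10.42.2.0/24"
            $my_syslog_server = "10.42.2.11"
    }

    node 'www4.example42.com' inherits dmz {
            include role_webserver
    }

    # Host-level override of an inherited setting, declared before
    # the role (and the classes it includes) is evaluated
    node 'www5.example42.com' inherits dmz {
            $my_ntp_server = "0.pool.ntp.org"
            include role_webserver
    }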

So, for each node, you have 2 main characterizations:

- The zone (network or whatever) where it lives (inherited from a
higher level node)
- The role (function) it has (included as a class)

These should be enough to cover many different scenarios of varying
complexity, satisfying both the need for high-level standardization
and the need for host-level characterization.  
In a typical development / testing / production infrastructure you
will have nodes sharing the same role (so you are sure that setups
and configurations are coherent) but belonging to different zones,
where you can define different settings and variables.
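
One possible way to combine the two characterizations, offered here as an assumption rather than something taken from the Example42 modules, is to add a zone-aware entry to the `source` list used earlier for iptables. The `iptables-ROLE-ZONE` file naming is hypothetical:

    # Tried in order: role+zone, role only, default
    source => [
        "puppet://$server/iptables/iptables-${my_role}-${my_zone}",
        "puppet://$server/iptables/iptables-$my_role",
        "puppet://$server/iptables/iptables"
    ],

With this list, a file such as **MODULEDIR/iptables/files/iptables-webserver-devel** would apply only to webservers in the devel zone, while all other webservers keep using the role-level file.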