############
# Automata #
############
=== Design ===
(lib/kadeploy3/server/automata.rb)
Task:
- Runs a specific treatment
- Tracks OK/KO nodes during the treatment using two dedicated lists
- Can report OK/KO nodes to its manager during the treatment
TaskManager:
- Manages (runs) Tasks, taking into account the timeouts and retries of each step
- Implements a list of steps
- Dispatches nodes to tasks depending on the defined step list and on the current context (nodes in failure, ...)
- Can be controlled by interacting with its queue
TaskedTaskManager:
- Acts as a Task (runnable) but can spawn sub-tasks like a TaskManager
=== Kadeploy ===
In Kadeploy:
- Micro-steps are Tasks
- Macro-steps are TaskedTaskManagers
- Workflows are TaskManagers
So Workflows spawn Macro-steps, which in turn spawn Micro-steps, depending on the predefined step list (and on the configuration).
=== Simplified algorithm ===
--- Task ---
* run()
    # ... treatment ...
    if fail_during_treatment
      @nodes_ko.add(failed_nodes)
    end
    # ... or, to improve performance, report the failure to the manager right away ...
    if fail_during_treatment
      @manager.queue.push(failed_nodes,KO)
    end
    # ... treatment ...
    if success
      @nodes_ok.add(@nodes - failed_nodes)
      return true
    else
      @nodes_ko.add(@nodes)
      return false
    end
--- TaskManager ---
* start()
    until @nodes_processing.empty?
      # Pop the next element from the queue
      step,nodes,status = @queue.pop
      # Choose whether to continue depending on the current step status
      # and on the predefined step list
      if next_step?(step,status)
        nextstep = get_next_step(step,status)
        run_task(Task.new(nextstep))
      else
        case status
        when OK then @nodes_ok.add(nodes)
        when KO then @nodes_ko.add(nodes)
        when BRK then @nodes_brk.add(nodes)
        end
        @nodes_processing.delete(nodes)
      end
    end
         -------------------------------------------------------------------
* @queue | T1 (s3,node-[1-3],OK) | T1 (s3,node-4,KO) | T2 (s1,node-10,BRK) |
         -------------------------------------------------------------------
                              ^            ^                    ^
* run_task(T1):               |            |                    |
  breakpoint?(T1)             |            |                    |
  T1.run()                    |            |                    |
  wait(T1)/check timeout      |            |                    |
  @queue.push(T1.ok_nodes,OK)--            |                    |
  @queue.push(T1.ko_nodes,KO)---------------                    |
* run_task(T2):                                                 |
  breakpoint?(T2)                                               |
  @queue.push(T2.nodes,BRK) -------------------------------------
--- TaskedTaskManager ---
(inherits from Task and TaskManager)
* run() # Has to be reimplemented (Task)
    start() # Inherited from TaskManager
    return true
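The queue-driven loop above can be illustrated with a small runnable sketch. It is not the code of automata.rb: SketchTask, SketchTaskManager, the step names and the node names are made up for the example, tasks run synchronously, and timeouts, retries and breakpoints are left out.

```ruby
# Hypothetical, simplified stand-ins for Task and TaskManager.
class SketchTask
  attr_reader :nodes_ok, :nodes_ko

  # treatment: callable that receives the nodes and returns the failed ones
  def initialize(nodes, treatment)
    @nodes = nodes
    @treatment = treatment
    @nodes_ok = []
    @nodes_ko = []
  end

  def run
    failed = @treatment.call(@nodes)
    @nodes_ko = @nodes & failed
    @nodes_ok = @nodes - failed
    @nodes_ko.empty?
  end
end

class SketchTaskManager
  attr_reader :nodes_ok, :nodes_ko

  # steps: ordered list of [name, treatment] pairs
  def initialize(steps, nodes)
    @steps = steps
    @queue = Queue.new
    @nodes_processing = nodes.dup
    @nodes_ok = []
    @nodes_ko = []
    @queue.push([nil, nodes, :OK]) # seed element: no step done yet
  end

  def start
    until @nodes_processing.empty?
      step, nodes, status = @queue.pop
      next if nodes.empty?
      # Index of the step that follows the one that was just reported
      idx = step.nil? ? 0 : @steps.index { |step_name, _| step_name == step } + 1
      if status == :OK && idx < @steps.size
        name, treatment = @steps[idx]
        task = SketchTask.new(nodes, treatment)
        task.run
        @queue.push([name, task.nodes_ok, :OK])
        @queue.push([name, task.nodes_ko, :KO])
      else
        # No next step: the nodes leave the automaton with their final status
        (status == :OK ? @nodes_ok : @nodes_ko).concat(nodes)
        @nodes_processing -= nodes
      end
    end
    self
  end
end

steps = [
  [:s1, ->(nodes) { [] }],
  [:s2, ->(nodes) { nodes.select { |n| n == 'node-4' } }],
  [:s3, ->(nodes) { [] }]
]
manager = SketchTaskManager.new(steps, %w[node-1 node-2 node-3 node-4]).start
```

In this run node-4 fails during s2 and is recorded as KO without going through s3, while the three other nodes go through every step and end up in the OK list.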
#########
# Cache #
#########
The cache system deletes files according to a priority + LRU ordering. First, all files (that are not in use) with the lowest priority are deleted in order of their last access date (LRU); then the files with the next higher priority, and so on until enough space has been freed in the cache.
In Kadeploy, anonymous environment files have a low priority and recorded environment files have a high one (they stay longer in the cache).
For every cached file, two files are created in the filesystem: a _UID_.file file containing the data and a _UID_.meta file containing metadata, so that the service can reload the cache after a restart. However, this feature has been temporarily disabled, so at the moment the _UID_.meta file is only useful for admins who want more information about a cached file.
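The priority + LRU ordering described above can be sketched as follows. This is only an illustration: CachedEntry, eviction_candidates and the sample values are assumptions made for the example, not names taken from cache.rb.

```ruby
# Hypothetical record type; a real cached file carries more attributes.
CachedEntry = Struct.new(:name, :priority, :atime, :size, :refs)

# Returns the entries to delete, in deletion order, so that at least
# `needed` units of space are freed (or every unused entry if that is
# not enough).
def eviction_candidates(entries, needed)
  victims = []
  freed = 0
  sorted = entries
           .reject { |e| e.refs > 0 }             # files in use are never deleted
           .sort_by { |e| [e.priority, e.atime] } # lowest priority first, then LRU
  sorted.each do |e|
    break if freed >= needed
    victims << e
    freed += e.size
  end
  victims
end

entries = [
  CachedEntry.new('recorded-env', 2, 100, 50, 0),
  CachedEntry.new('anon-old',     1, 10,  30, 0),
  CachedEntry.new('anon-new',     1, 20,  30, 0),
  CachedEntry.new('anon-busy',    1, 5,   30, 1)
]
small_cleanup = eviction_candidates(entries, 50).map(&:name)
large_cleanup = eviction_candidates(entries, 100).map(&:name)
```

With these sample values, freeing 50 units only removes the two unused low-priority files, oldest first; the high-priority file is only reached when more space is requested.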
=== Design ===
(lib/kadeploy3/server/cache.rb, lib/kadeploy3/server/grabfile.rb)
--- CacheFile ---
@refs: the number of operations currently using the object
* used?()
check if the file is currently in use (@refs > 0), try_lock
* acquire()
increments @refs
* release()
decrements @refs
* idx()
generate the file's UID based on a CacheIndex object
* save()
save the file: update the .meta file and change the UID depending on the attributes that were modified
* remove()
delete the file from the filesystem
--- Cache ---
Kadeploy-independent implementation of the cache
* load()
initialize the cache, reload (valid) files that are present in the cache directory [has been temporarily disabled]
* cache(*file_info,&block)
ensure that a file is present in the cache. If the file is already in the cache, update its last access time, increment @refs and lock it. If the file is not in the cache, &block is called with the path to a temporary file in which the data should be saved/downloaded; the file is locked in that case too.
* remove(*file_info)
take the lock on the file and remove it from the cache if there are no references (@refs) left
* release(file)
decrement @refs on a specific file
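The cache()/release()/remove() contract described above can be illustrated with a minimal reference-counting sketch. SketchCache is hypothetical: it uses one global Mutex instead of per-file locks, a plain string key instead of *file_info, and it writes directly to the final path rather than to a temporary file.

```ruby
require 'tmpdir'

class SketchCache
  Entry = Struct.new(:path, :refs, :atime)

  def initialize(dir)
    @dir = dir
    @entries = {}
    @lock = Mutex.new
  end

  # Ensure `key` is cached. The block is only called on a cache miss and
  # receives the path the data has to be written to.
  def cache(key)
    @lock.synchronize do
      entry = @entries[key]
      if entry.nil?
        path = File.join(@dir, "#{key}.file")
        yield path
        entry = @entries[key] = Entry.new(path, 0, nil)
      end
      entry.atime = Time.now # refresh the last access time
      entry.refs += 1        # one more operation uses the file
      entry
    end
  end

  def release(key)
    @lock.synchronize do
      entry = @entries[key]
      entry.refs -= 1 if entry && entry.refs > 0
    end
  end

  # Returns true if the file was removed, false if it is unknown or in use.
  def remove(key)
    @lock.synchronize do
      entry = @entries[key]
      return false if entry.nil? || entry.refs > 0
      File.delete(entry.path) if File.exist?(entry.path)
      @entries.delete(key)
      true
    end
  end
end

dir = Dir.mktmpdir
cache = SketchCache.new(dir)
misses = 0
2.times { cache.cache('env') { |path| misses += 1; File.write(path, 'data') } }
removed_while_used = cache.remove('env') # refused: two references are held
2.times { cache.release('env') }
removed_when_free = cache.remove('env')
```

The second call to cache() is a hit, so the block runs only once, and the file can only be removed after both users have released it.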
--- GrabFile ---
Link between Kadeploy and the Cache library
@cache: the Cache instance
* grab(file)
cache the specified file (downloading it if necessary) into @cache using Cache.cache()
* GrabFile.grab_user_files(context)
cache all user-specific files depending on the current context (environment files for deployments, user-specific PXE files for reboots, ...)
############
# REST API #
############
=== Design ===
(lib/kadeploy3/common/httpd.rb, lib/kadeploy3/common/http.rb)
Based on Ruby's WEBrick
Treatment done via a custom HTTPServlet::AbstractServlet (HTTPdHandler)
HTTP client facilities: HTTP::Client
--- HTTPd::Server ---
launch a WEBrick server with some specific initialization parameters, setup custom content handling methods (bindings)
--- RequestHandler ---
setup the HTTP response depending on the HTTP request's content and headers (Accept, Accept-Encoding)
--- HTTPdHandler ---
* treatment()
launch a Thread to take care of the treatment (call to the handle() method) of the request, stops the Thread on client disconnection
* do_METHOD()
launch the treatment and set the HTTP response depending on the object or exception the treatment returned (Content-Type, Content-Length, Content-Encoding, ...)
* handle() [abstract]
the treatment associated with a specific HTTP request depending on the handler type
--- HTTPdHandler children ---
* ContentHandler: return some static content
* ProcedureHandler: calls a specific procedure
* MethodHandler: calls a specific method
* MethodFilterHandler: dynamically calls a method depending on the request's parameters (for example, "GET /deployment/1234/logs/cluster1" can call get_deployment_logs(1234,cluster1))
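To make the MethodFilterHandler example concrete, here is a rough sketch of one way such a dispatch could work. The naming convention (path elements alternate between resource names and arguments), the dispatch function and the SketchAPI class are assumptions made for the illustration; the real handler may build the method name differently.

```ruby
# Hypothetical object exposing the methods that can be called.
class SketchAPI
  def get_deployment_logs(id, cluster)
    "logs of deployment #{id} on #{cluster}"
  end
end

# Assumed convention: /name1/arg1/name2/arg2 with HTTP method VERB
# calls verb_name1_name2(arg1, arg2).
def dispatch(api, http_method, path)
  segments = path.split('/').reject(&:empty?)
  names = segments.each_slice(2).map(&:first)
  args = segments.each_slice(2).map { |pair| pair[1] }.compact
  method_name = "#{http_method.downcase}_#{names.join('_')}"
  unless api.respond_to?(method_name)
    raise NoMethodError, "no handler for #{http_method} #{path}"
  end
  api.public_send(method_name, *args)
end

result = dispatch(SketchAPI.new, 'GET', '/deployment/1234/logs/cluster1')
```

Note that in this sketch the arguments are passed as strings taken from the path; any type conversion is left to the called method.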
=== Documentation ===
(doc/api/apidoc, doc/api/*.api)
Ruby-based DSL to describe resources, generated via "rake apidoc"
##################
# Authentication #
##################
=== Design ===
(lib/kadeploy3/server/authentication.rb, lib/kadeploy3/server/server.rb)
--- Server ---
* authenticate!(http_request)
this method is called for each request received by the server. It determines whether the user is correctly identified by one of the available (enabled) methods. If the user cannot be authenticated by at least one method, an error is returned. Authentication objects are used to check whether the user is authenticated.
--- Authentication ---
* check_host?()
checks if the client (http_request) is found in the whitelist (if there is a whitelist)
* auth!(source_socket,*) [abstract]
authenticate the user using a specific method
--- ACLAuthentication ---
Access Control List based authentication, only trust people authenticated from whitelisted machines
--- CertificateAuthentication ---
X.509 certificate authentication: the client provides a certificate that is signed with a trusted private key; the CN field must contain the username
--- IdentAuthentication ---
Ident protocol based authentication (RFC 1413): the server contacts the Ident service on the client's machine to check that the username matches; a list of whitelisted Ident services can be provided
--- HTTPBasicAuthentication ---
HTTP Basic authentication (RFC 1945): standard authentication method based on the WWW-Authenticate and Authorization HTTP headers
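The "at least one method must succeed" behaviour of authenticate! can be sketched as below. Everything here is hypothetical (AuthError, the Sketch* classes, the request represented as a plain Hash, the host names and credentials); it only illustrates trying each enabled method in turn and failing when none of them accepts the user.

```ruby
class AuthError < StandardError; end

# Trusts the announced username when the client host is whitelisted.
class SketchACLAuth
  def initialize(whitelist)
    @whitelist = whitelist
  end

  def auth!(request)
    raise AuthError, 'host not whitelisted' unless @whitelist.include?(request[:host])
    request[:user]
  end
end

# Checks "Authorization: Basic <base64(user:password)>" credentials.
class SketchBasicAuth
  def initialize(users)
    @users = users
  end

  def auth!(request)
    scheme, payload = request[:authorization].to_s.split(' ', 2)
    raise AuthError, 'no Basic credentials' unless scheme == 'Basic' && payload
    user, password = payload.unpack1('m').split(':', 2)
    raise AuthError, 'bad credentials' unless password && @users[user] == password
    user
  end
end

# Returns the authenticated username, or raises if every method fails.
def authenticate!(request, methods)
  errors = []
  methods.each do |method|
    begin
      return method.auth!(request)
    rescue AuthError => e
      errors << e.message
    end
  end
  raise AuthError, "authentication failed: #{errors.join(', ')}"
end

methods = [
  SketchACLAuth.new(['frontend.domain.tld']),
  SketchBasicAuth.new('alice' => 'secret')
]
request = {
  host: 'laptop.domain.tld',
  user: 'bob',
  authorization: 'Basic ' + ['alice:secret'].pack('m0')
}
user = authenticate!(request, methods)
```

Here the ACL method rejects the request because the host is not whitelisted, and the Basic method then accepts it.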
###########
# Netboot #
###########
=== Design ===
(lib/kadeploy3/server/netboot.rb)
Design pattern: Factory
--- NetBoot.Factory ---
factory that generates NetBoot instances
--- NetBoot::PXE ---
PXE netboot methods, generate boot profiles
* @binary: the boot method's binary (pxelinux.0, grubpxe.0, ...)
* @export: object used to generate export paths depending on the export server and kind (http://server/kernel, tftp://server/kernel, ...)
* @repository_dir: the directory PXE files are saved/exported in (/var/lib/tftp, ...)
* @profiles_dir: the directory the profiles are stored in, depending on the boot method (pxelinux.cfg, ...)
* @profiles_kind: the method used to generate profile's filenames (ip, ip_hex, hostname, ...)
* @chain: the next PXE method to chain over this method
* labelize(header,kind,profile) [abstract]
generate the profile with its headers and default values
* boot_local(env,diskname,dev_id,part_id) [abstract]
generate a profile to perform a local boot on the partition part_id of the device dev_id
* boot_network(kernel,initrd,params) [abstract]
generate a profile to boot on a kernel that is downloaded over the network (using the method described in @export)
* boot_chain(pxebin) [abstract]
chain another PXE boot method
* NetBoot::PXE::Export
Used to describe file exports in PXE profiles
* NetBoot::PXE::PXELinux
* NetBoot::PXE::GPXELinux
* NetBoot::PXE::IPXE
* NetBoot::PXE::GrubPXE
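As an illustration of the @profiles_kind attribute listed above, the sketch below shows how a profile filename could be derived for each kind. The profile_filename helper and the sample host are made up for the example; the ip_hex form follows the PXELINUX convention of naming a per-host configuration file after the IPv4 address written as 8 uppercase hexadecimal digits.

```ruby
require 'ipaddr'

# Hypothetical helper: returns the name of the profile file of a node.
def profile_filename(kind, hostname, ip)
  case kind
  when :ip       then ip
  when :ip_hex   then format('%08X', IPAddr.new(ip).to_i)
  when :hostname then hostname
  else raise ArgumentError, "unknown profile kind: #{kind}"
  end
end

hex_name = profile_filename(:ip_hex, 'node-1.domain.tld', '192.0.2.91')
ip_name = profile_filename(:ip, 'node-1.domain.tld', '192.0.2.91')
```

For 192.0.2.91 the ip_hex kind gives C000025B, which is the kind of name PXELINUX looks for under pxelinux.cfg.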
=== Chaining PXE boot methods ===
DHCP
  # config
  filename: pxelinux.0
  next_server: tftp.domain.tld
-> PXELinux
  # profile
  KERNEL grubpxe.0
  APPEND keeppxe
-> GrubPXE
  # profile
  set root=(hd0,0)
  linux /boot/vmlinuz
  initrd /boot/initrd
#######################
# Configuration files #
#######################
=== Design ===
(lib/kadeploy3/server/config.rb, lib/kadeploy3/common/configparser.rb)
There is one class for each configuration file; each class uses a Configuration::Parser object to validate the YAML structures (types, nested structures, allowed values).
The configuration objects are then used by the server to know how to configure the service.
Configuration files can be reloaded without shutting the service down; however, some fields cannot be reloaded without a restart (the values stored in the @static hash).
--- Configuration ---
The base module
* parse_custom_operations()
parse and verify a custom operations file
--- ConfigFile ---
* file() [abstract]
check that the configuration file exists and is readable
* duplicate()
duplicates the configuration object (useful for the hot reload)
--- Config ---
Class that aggregates the treatment of the different file-specific configuration objects
@common: a CommonConfig object
@clusters: a ClusterSpecificConfig object
@caches: a hash containing pointers to the PXE and environment caches (if they are enabled in the configuration files)
@static: a hash that contains static configuration fields (that cannot be modified without shutting down the service)
* initialize()
init the object plus a *Config object of each kind (CommonConfig, ClustersConfig, ...) then call the load() method on them
* sanity_check()
ensure that each mandatory configuration file is present
* load_caches()
load (create) Cache objects in the @caches hash depending on the configuration
--- CommonConfig ---
* load()
loads the config file in the current object using the parser
--- ClustersConfig ---
* load()
loads the config file in the current object using the parser
--- ClusterSpecificConfig ---
* load()
loads the config file in the current object using the parser
--- CommandsConfig ---
* load()
loads the config file in the current object using the parser
--- Configuration::Parser ---
The class that helps to validate Ruby Hash structures (a kind of DTD)
* parse(fieldname, mandatory, type, &block)
checks that fieldname is present (if mandatory) and has the right type, then calls &block with main_hash[...][fieldname] as parameter. Several calls to parse() can be nested to validate nested structures
* value(fieldname, type, defaultvalue, expectedvalues)
checks and returns the value of fieldname (main_hash[...][fieldname]). A default value can be set; it is returned if the field is empty. This method calls a type-specific check method "check_#{type.downcase}"
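The nested parse()/value() validation described above can be sketched with a small stand-alone class. SketchParser and the field names are invented for the illustration; the type check is reduced to is_a? instead of the type-specific check_* methods, and errors are plain ArgumentError exceptions.

```ruby
class SketchParser
  def initialize(hash)
    @stack = [hash] # the last element is the hash currently being validated
  end

  # Validates a nested structure, then yields so that the fields it
  # contains can be validated by nested parse()/value() calls.
  def parse(fieldname, mandatory = false, type = Hash)
    current = @stack.last[fieldname]
    if current.nil?
      raise ArgumentError, "missing field '#{fieldname}'" if mandatory
      current = type.new # optional and absent: validate an empty structure
    end
    raise ArgumentError, "'#{fieldname}' should be a #{type}" unless current.is_a?(type)
    @stack.push(current)
    begin
      yield current
    ensure
      @stack.pop
    end
  end

  # Checks and returns the value of a field of the current structure.
  def value(fieldname, type, default = nil, expected = nil)
    val = @stack.last[fieldname]
    val = default if val.nil?
    raise ArgumentError, "missing field '#{fieldname}'" if val.nil?
    raise ArgumentError, "'#{fieldname}' should be a #{type}" unless val.is_a?(type)
    if expected && !expected.include?(val)
      raise ArgumentError, "'#{fieldname}' should be one of #{expected.join(', ')}"
    end
    val
  end
end

# Made-up configuration, as it would look once loaded from YAML
conf = { 'server' => { 'port' => 8080, 'log' => { 'level' => 'debug' } } }
parser = SketchParser.new(conf)
port = retries = level = nil
parser.parse('server', true) do
  port = parser.value('port', Integer)
  retries = parser.value('retries', Integer, 3) # absent: default value
  parser.parse('log') do
    level = parser.value('level', String, 'info', %w[debug info error])
  end
end
```

Keeping a stack of the hashes being validated is what lets nested parse() calls resolve main_hash[...][fieldname] without passing the full path around.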