Class: DaemonController

Inherits:
Object
Defined in:
lib/daemon_controller.rb,
lib/daemon_controller/lock_file.rb

Overview

daemon_controller, library for robust daemon management

Copyright © 2008 Phusion

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Defined Under Namespace

Classes: AlreadyStarted, ConnectError, Error, LockFile, StartError, StartTimeout, StopError, StopTimeout, TimeoutError

Constant Summary

ALLOWED_CONNECT_EXCEPTIONS =
[Errno::ECONNREFUSED, Errno::ENETUNREACH,
Errno::ETIMEDOUT, Errno::ECONNRESET, Errno::EINVAL]

Constructor Details

#initialize(options) ⇒ DaemonController

Create a new DaemonController object.

Mandatory options

:identifier

A human-readable, unique name for this daemon, e.g. “Sphinx search server”. This identifier will be used in some error messages. On some platforms, it will also be used for concurrency control: on such platforms, no two DaemonController objects will operate on the same identifier at the same time.

:start_command

The command to start the daemon. This must be a String, e.g. “mongrel_rails start -e production”.

:ping_command

The ping command is used to check whether the daemon can be connected to. It is also used to ensure that #start only returns when the daemon can be connected to.

The value may be a command string. This command must exit with an exit code of 0 if the daemon can be successfully connected to, or exit with a non-0 exit code on failure.

The value may also be a Proc, which returns an expression that evaluates to true (indicating that the daemon can be connected to) or false (failure). If the Proc raises Errno::ECONNREFUSED, Errno::ENETUNREACH, Errno::ETIMEDOUT or Errno::ECONNRESET, then that also means that the daemon cannot be connected to.

NOTE: if the ping command returns an object which responds to #close, then that method will be called on the return value. This makes it possible to specify a ping command such as lambda { TCPSocket.new('localhost', 1234) }, without having to worry about closing it afterwards. Any exceptions raised by #close are ignored.
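The Proc rules above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: ping_ok? and CONNECT_ERRORS are hypothetical names, and the throwaway TCPServer only stands in for a daemon.

```ruby
require 'socket'

# Hypothetical names: neither CONNECT_ERRORS nor ping_ok? is part of
# DaemonController. The list mirrors the exceptions named in the text above.
CONNECT_ERRORS = [Errno::ECONNREFUSED, Errno::ENETUNREACH,
                  Errno::ETIMEDOUT, Errno::ECONNRESET].freeze

# Runs a ping Proc and reduces its outcome to true or false.
def ping_ok?(ping_proc)
  result = ping_proc.call
  if result.respond_to?(:close)
    begin
      result.close
    rescue StandardError
      # Exceptions raised by #close are ignored.
    end
  end
  result ? true : false
rescue *CONNECT_ERRORS
  false
end

# A throwaway local server plays the role of the daemon.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
ping = lambda { TCPSocket.new('127.0.0.1', port) }

up = ping_ok?(ping)    # the server is listening
server.close
down = ping_ok?(ping)  # the port is closed again
```

Because the helper closes whatever the Proc returns, the lambda can hand back the raw socket without leaking it.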

:pid_file

The PID file that the daemon will write to. Used to check whether the daemon is running.

:log_file

The log file that the daemon will write to. It will be consulted to see whether the daemon has printed any error messages during startup.

Optional options

:stop_command

A command to stop the daemon with, e.g. “/etc/rc.d/nginx stop”. If no stop command is given (i.e. nil), then DaemonController will stop the daemon by killing the PID written in the PID file.

The default value is nil.

:before_start

This may be a Proc. It will be called just before running the start command. The before_start proc is not subject to the start timeout.

:start_timeout

The maximum amount of time, in seconds, that #start may take to start the daemon. Since #start also waits until the daemon can be connected to, that wait time is counted as well. If the daemon does not start in time, then #start will raise an exception.

The default value is 15.

:stop_timeout

The maximum amount of time, in seconds, that #stop may take to stop the daemon. Since #stop also waits until the daemon is no longer running, that wait time is counted as well. If the daemon does not stop in time, then #stop will raise an exception.

The default value is 15.

:log_file_activity_timeout

Once a daemon has gone into the background, it becomes difficult to know for certain whether it is still initializing or whether it has failed and exited, until it has written its PID file. It’s 99.9% probable that the daemon has terminated with an error if its start timeout has expired, but not many system administrators want to wait 15 seconds (the default start timeout) to be notified of whether the daemon has terminated with an error.

An alternative way to check whether the daemon has terminated with an error is to check whether its log file has been recently updated. If, after the daemon has started, the log file hasn’t been updated for the number of seconds given by the :log_file_activity_timeout option, then the daemon is assumed to have terminated with an error.

The default value is 7.
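One way to picture the log file check is a comparison between the file's modification time and the current time. The sketch below assumes exactly that; log_file_stale? is a hypothetical helper, and whether the library compares with > or >= is not stated in this document.

```ruby
require 'tempfile'

# Hypothetical helper, not part of DaemonController: reports whether the
# file at +path+ has gone without updates for more than +timeout+ seconds.
def log_file_stale?(path, timeout, now = Time.now)
  now - File.mtime(path) > timeout
end

log = Tempfile.new('daemon-log')

# A freshly created log file counts as active.
fresh = log_file_stale?(log.path, 7)

# Backdate the file by a minute to simulate a daemon that went silent.
a_minute_ago = Time.now - 60
File.utime(a_minute_ago, a_minute_ago, log.path)
stale = log_file_stale?(log.path, 7)

log.close!
```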



# File 'lib/daemon_controller.rb', line 135

def initialize(options)
	[:identifier, :start_command, :ping_command, :pid_file, :log_file].each do |option|
		if !options.has_key?(option)
			raise ArgumentError, "The ':#{option}' option is mandatory."
		end
	end
	@identifier = options[:identifier]
	@start_command = options[:start_command]
	@stop_command = options[:stop_command]
	@ping_command = options[:ping_command]
	@ping_interval = options[:ping_interval] || 0.1
	@pid_file = options[:pid_file]
	@log_file = options[:log_file]
	@before_start = options[:before_start]
	@start_timeout = options[:start_timeout] || 15
	@stop_timeout = options[:stop_timeout] || 15
	@log_file_activity_timeout = options[:log_file_activity_timeout] || 7
	@lock_file = determine_lock_file(@identifier, @pid_file)
end
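As an illustration of the options handling above, the following sketch builds an options hash and applies the same mandatory-key check and defaults outside the class. Every value in the hash (daemon name, command, port and paths) is made up for the example.

```ruby
require 'socket'

# Hypothetical values throughout: the daemon name, command, port and
# paths are invented for this example.
options = {
  :identifier    => 'Example search daemon',
  :start_command => '/usr/local/bin/exampled --daemonize',
  :ping_command  => lambda { TCPSocket.new('localhost', 1234) },
  :pid_file      => '/var/run/exampled.pid',
  :log_file      => '/var/log/exampled.log',
  :start_timeout => 30 # optional; overrides the default of 15
}

# The same check the constructor performs: all five keys must be present.
MANDATORY_OPTIONS = [:identifier, :start_command, :ping_command,
                     :pid_file, :log_file].freeze
missing = MANDATORY_OPTIONS.reject { |name| options.has_key?(name) }

# Defaults are applied with ||, as in the constructor.
start_timeout = options[:start_timeout] || 15
stop_timeout  = options[:stop_timeout] || 15
```

Since the defaults are applied with ||, passing nil or false for a timeout has the same effect as leaving the option out.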

Instance Method Details

#connect ⇒ Object

Connect to the daemon by running the given block, which contains the connection logic. If the daemon isn’t already running, then it will be started.

The block must return nil, or raise one of the exceptions in ALLOWED_CONNECT_EXCEPTIONS (Errno::ECONNREFUSED, Errno::ENETUNREACH, Errno::ETIMEDOUT, Errno::ECONNRESET or Errno::EINVAL), to indicate that the daemon cannot be connected to. It must return non-nil if the daemon can be connected to. Upon successful connection, the return value of the block will be returned by #connect.

Note that the block may be called multiple times.

Raises:

  • StartError - an attempt to start the daemon was made, but the start command failed with an error.

  • StartTimeout - an attempt to start the daemon was made, but the daemon did not start in time, or it failed after it has gone into the background.

  • ConnectError - the daemon wasn’t already running, but we couldn’t connect to the daemon even after starting it.



# File 'lib/daemon_controller.rb', line 187

def connect
	connection = nil
	@lock_file.shared_lock do
		begin
			connection = yield
		rescue *ALLOWED_CONNECT_EXCEPTIONS
			connection = nil
		end
	end
	if connection.nil?
		@lock_file.exclusive_lock do
			if !daemon_is_running?
				start_without_locking
			end
			begin
				connection = yield
			rescue *ALLOWED_CONNECT_EXCEPTIONS
				connection = nil
			end
			if connection.nil?
				# Daemon is running but we couldn't connect to it. Possible
				# reasons:
				# - The daemon froze.
				# - Bizarre security restrictions.
				# - There's a bug in the yielded code.
				raise ConnectError, "Cannot connect to the daemon"
			else
				return connection
			end
		end
	else
		return connection
	end
end
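The block contract of #connect can be tried out without a real daemon. The sketch below is a simplified stand-in that leaves out the lock file and the running check: connect_with_retry is a hypothetical function, the start_daemon lambda replaces the start command, and a local TCPServer plays the daemon.

```ruby
require 'socket'

# Copied from the constant summary so the sketch runs on its own.
ALLOWED_CONNECT_EXCEPTIONS = [Errno::ECONNREFUSED, Errno::ENETUNREACH,
                              Errno::ETIMEDOUT, Errno::ECONNRESET,
                              Errno::EINVAL].freeze

# Hypothetical, lock-free stand-in for #connect: try the block, start the
# daemon if that fails, then try the block once more.
def connect_with_retry(start_daemon)
  attempt = lambda do
    begin
      yield
    rescue *ALLOWED_CONNECT_EXCEPTIONS
      nil
    end
  end

  connection = attempt.call
  return connection unless connection.nil?

  start_daemon.call
  connection = attempt.call
  raise 'Cannot connect to the daemon' if connection.nil?
  connection
end

# Reserve a free local port, then release it so the first attempt fails.
probe = TCPServer.new('127.0.0.1', 0)
port = probe.addr[1]
probe.close

server = nil
calls = 0
start_daemon = lambda { server = TCPServer.new('127.0.0.1', port) }
connection = connect_with_retry(start_daemon) do
  calls += 1
  TCPSocket.new('127.0.0.1', port)
end

connected = connection.is_a?(TCPSocket)
connection.close
server.close
```

The block runs twice here, once before and once after the stand-in daemon is started, which is why the connection logic should be safe to call repeatedly.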

#pid ⇒ Object

Returns the daemon’s PID, as reported by its PID file. Returns the PID as an integer, or nil if there is no valid PID in the PID file.

This method doesn’t check whether the daemon’s actually running. Use #running? if you want to check whether it’s actually running.

Raises SystemCallError or IOError if something went wrong during reading of the PID file.



# File 'lib/daemon_controller.rb', line 250

def pid
	@lock_file.shared_lock do
		return read_pid_file
	end
end
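The private read_pid_file method is not shown on this page. As a rough idea of what "integer or nil" could mean in practice, the sketch below treats a file that contains only digits as valid. The read_pid helper and that validity rule are assumptions, and read errors (including a missing file) simply propagate as SystemCallError here.

```ruby
require 'tempfile'

# Hypothetical helper; the library's read_pid_file may behave differently.
# Assumes a valid PID file contains nothing but digits and whitespace.
def read_pid(path)
  text = File.read(path).strip
  text =~ /\A\d+\z/ ? text.to_i : nil
end

pid_file = Tempfile.new('example-pid')

File.write(pid_file.path, "4242\n")
valid = read_pid(pid_file.path)

File.write(pid_file.path, "not a pid\n")
invalid = read_pid(pid_file.path)

pid_file.close!
```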

#running? ⇒ Boolean

Checks whether the daemon is still running. This is done by reading the PID file and then checking whether there is a process with that PID.

Raises SystemCallError or IOError if something went wrong during reading of the PID file.

Returns:

  • (Boolean)


# File 'lib/daemon_controller.rb', line 262

def running?
	@lock_file.shared_lock do
		return daemon_is_running?
	end
end
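Checking whether there is a process with a given PID is commonly done by sending signal 0, which performs the existence and permission checks without delivering a signal. The sketch below shows that technique. process_alive? is a hypothetical helper, and this page does not show whether daemon_is_running? works the same way.

```ruby
require 'rbconfig'

# Hypothetical helper, not part of DaemonController.
def process_alive?(pid)
  Process.kill(0, pid) # signal 0: probe only, nothing is delivered
  true
rescue Errno::ESRCH
  false                # no such process
rescue Errno::EPERM
  true                 # process exists but belongs to another user
end

own = process_alive?(Process.pid)

# Spawn a short-lived child and reap it, so its PID no longer exists.
child = Process.spawn(RbConfig.ruby, '-e', 'exit')
Process.wait(child)
reaped = process_alive?(child)
```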

#start ⇒ Object

Start the daemon and wait until it can be pinged.

Raises:

  • AlreadyStarted - the daemon is already running.

  • StartError - the start command failed.

  • StartTimeout - the daemon did not start in time. This could also mean that the daemon failed after it has gone into the background.



# File 'lib/daemon_controller.rb', line 162

def start
	@lock_file.exclusive_lock do
		start_without_locking
	end
end

#stop ⇒ Object

Stop the daemon and wait until it has exited.

Raises:

  • StopError - the stop command failed.

  • StopTimeout - the daemon didn’t stop in time.



# File 'lib/daemon_controller.rb', line 227

def stop
	@lock_file.exclusive_lock do
		begin
			Timeout.timeout(@stop_timeout) do
				kill_daemon
				wait_until do
					!daemon_is_running?
				end
			end
		rescue Timeout::Error
			raise StopTimeout, "Daemon '#{@identifier}' did not exit in time"
		end
	end
end
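The private wait_until helper is not shown on this page. The sketch below assumes it polls its block at a fixed interval, and shows how wrapping such a loop in Timeout.timeout yields the stop timeout behaviour. Both wait_until as written here and stopped_in_time? are hypothetical.

```ruby
require 'timeout'

# Assumed shape of a polling helper: sleep between checks until the
# block returns a truthy value.
def wait_until(interval = 0.01)
  sleep(interval) until yield
end

# Hypothetical wrapper: true if the condition held within +timeout+
# seconds, false if the timeout expired first.
def stopped_in_time?(timeout)
  Timeout.timeout(timeout) do
    wait_until { yield }
  end
  true
rescue Timeout::Error
  false
end

# The condition becomes true on the third poll.
polls = 0
quick = stopped_in_time?(1) { (polls += 1) >= 3 }

# The condition never becomes true, so the timeout fires.
slow = stopped_in_time?(0.1) { false }
```

In #stop the timeout covers kill_daemon as well as the wait, so a slow stop command counts against :stop_timeout too.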