Class: Chef::Expander::ClusterSupervisor

Inherits:
Object
Includes:
Daemonizable, Loggable
Defined in:
lib/chef/expander/cluster_supervisor.rb

Overview

ClusterSupervisor

Manages a cluster of chef-expander processes. Usually this class will be instantiated from the chef-expander-cluster executable.

ClusterSupervisor works by forking the desired number of processes, then running VNodeSupervisor.start_cluster_worker within the forked process. ClusterSupervisor keeps track of the process ids of its children, and will periodically attempt to reap them in a non-blocking call. If they are reaped, ClusterSupervisor knows they died and need to be respawned.
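The fork-and-reap cycle described above can be illustrated with a minimal, standalone sketch. It is not the chef-expander source: `spawn_child` and `reap_dead` are hypothetical helper names, and the sketch assumes a platform where `fork` is available. It relies only on `Process.waitpid2` with `Process::WNOHANG`, which returns nil while the child is still running.

```ruby
# Minimal sketch of fork-and-reap supervision (illustrative only).
# spawn_child and reap_dead are hypothetical names, not part of chef-expander.

# Fork a child that runs the given block; returns the child's PID.
def spawn_child(&work)
  fork(&work)
end

# Poll each PID without blocking. Returns { pid => Process::Status } for the
# children that have exited; children that are still running are skipped.
def reap_dead(pids)
  dead = {}
  pids.each do |pid|
    reaped_pid, status = Process.waitpid2(pid, Process::WNOHANG)
    dead[reaped_pid] = status if reaped_pid
  end
  dead
end

short_lived = spawn_child { exit!(3) }  # exits at once with status 3
long_lived  = spawn_child { sleep 30 }  # stands in for a healthy worker

# Poll until something has died, giving up after five seconds.
dead = {}
deadline = Time.now + 5
while dead.empty? && Time.now < deadline
  sleep 0.05
  dead = reap_dead([short_lived, long_lived])
end

# Clean up the surviving child so the sketch leaves nothing behind.
Process.kill(:TERM, long_lived)
Process.wait(long_lived)
```

Only the child that exited shows up in `dead`. A supervisor would use that set to decide which workers to respawn, as `maintain_workers` does with `workers_to_replace`.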

The child processes are responsible for checking on the master process and dying if the master has died (VNodeSupervisor does this when started with start_cluster_worker).
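One common way for a child to notice that its master has died is to compare `Process.ppid` against the PID of the process that forked it, since an orphaned process is re-parented by the operating system. The sketch below shows that technique in isolation. It is an assumption about how such a check could look, not a copy of what VNodeSupervisor does, and `parent_alive?` is a hypothetical helper name.

```ruby
# Minimal sketch of a worker that exits once its master is gone
# (illustrative only; parent_alive? is a hypothetical helper).

# True while the process that forked us is still our parent.
def parent_alive?(original_ppid)
  Process.ppid == original_ppid
end

reader, writer = IO.pipe

master = fork do
  reader.close
  master_pid = Process.pid        # captured before forking the worker

  fork do                         # the "worker"
    sleep 0.05 while parent_alive?(master_pid)
    writer.puts "master gone"     # report back, then die
    writer.close
    exit!(0)
  end

  sleep 0.2
  exit!(0)                        # master dies without waiting for the worker
end

writer.close
Process.wait(master)
message = reader.gets             # blocks until the worker reports
reader.close
```

The master PID is captured before the inner `fork` so the worker never compares against a parent that has already changed.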

TODO:

  • This implementation currently assumes there is only one cluster, so it will claim all of the vnodes. It may be advantageous to allow multiple clusters.

  • There is no heartbeat implementation at this time, so a zombified child process will not be killed automatically. This behavior is left to the meatcloud for now.

Constant Summary

Constants included from Loggable

Loggable::LOGGER

Instance Method Summary

Methods included from Daemonizable

#configure_process, #daemonize, #ensure_exclusive, #release_locks, #set_user_and_group

Methods included from Loggable

#log

Constructor Details

#initialize ⇒ ClusterSupervisor

Returns a new instance of ClusterSupervisor.



# File 'lib/chef/expander/cluster_supervisor.rb', line 54

def initialize
  @workers = {}
  @running = true
  @kill    = :TERM
end

Instance Method Details

#maintain_workers ⇒ Object



# File 'lib/chef/expander/cluster_supervisor.rb', line 102

def maintain_workers
  while @running
    sleep 1
    workers_to_replace = {}
    @workers.each do |process_id, worker_params|
      if result = Process.waitpid2(process_id, Process::WNOHANG)
        log.error { "worker #{worker_params[:index]} (PID: #{process_id}) died with status #{result[1].exitstatus || '(no status)'}"}
        workers_to_replace[process_id] = worker_params
      end
    end
    workers_to_replace.each do |dead_pid, worker_params|
      @workers.delete(dead_pid)
      start_worker(worker_params[:index])
    end
  end

  @workers.each do |pid, worker_params|
    log.info { "Stopping worker #{worker_params[:index]} (PID: #{pid})"}
    Process.kill(@kill, pid)
  end
  @workers.each do |pid, worker_params|
    Process.waitpid2(pid)
  end

end

#start ⇒ Object



# File 'lib/chef/expander/cluster_supervisor.rb', line 60

def start
  trap(:INT)  { stop(:INT) }
  trap(:TERM) { stop(:TERM)}
  Expander.init_config(ARGV)

  log.info("Chef Expander #{Expander.version} starting cluster with #{Expander.config.node_count} nodes")
  configure_process
  start_workers
  maintain_workers
  release_locks
rescue Configuration::InvalidConfiguration => e
  log.fatal {"Configuration Error: " + e.message}
  exit(2)
rescue Exception => e
  raise if SystemExit === e

  log.fatal {e}
  exit(1)
end

#start_worker(index) ⇒ Object



# File 'lib/chef/expander/cluster_supervisor.rb', line 86

def start_worker(index)
  log.info { "Starting cluster worker #{index}" }
  worker_params = {:index => index}
  child_pid = fork do
    Expander.config.index = index
    VNodeSupervisor.start_cluster_worker
  end
  @workers[child_pid] = worker_params
end

#start_workers ⇒ Object



# File 'lib/chef/expander/cluster_supervisor.rb', line 80

def start_workers
  Expander.config.node_count.times do |i|
    start_worker(i + 1)
  end
end

#stop(signal) ⇒ Object



# File 'lib/chef/expander/cluster_supervisor.rb', line 96

def stop(signal)
  log.info { "Stopping cluster on signal (#{signal})" }
  @running = false
  @kill    = signal
end