Module: NewRelic::Agent::AgentHelpers::StartWorkerThread

Included in:
NewRelic::Agent::Agent
Defined in:
lib/new_relic/agent/agent_helpers/start_worker_thread.rb

Constant Summary

LOG_ONCE_KEYS_RESET_PERIOD = 60.0
TRANSACTION_EVENT_DATA = 'transaction_event_data'.freeze
CUSTOM_EVENT_DATA = 'custom_event_data'.freeze
ERROR_EVENT_DATA = 'error_event_data'.freeze
SPAN_EVENT_DATA = 'span_event_data'.freeze
LOG_EVENT_DATA = 'log_event_data'.freeze

Instance Method Details

#catch_errors ⇒ Object

A wrapper method that handles all the errors that can occur in the connection and worker thread system. This guarantees that no exception is raised out of the background thread.



# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 108

def catch_errors
  yield
rescue NewRelic::Agent::ForceRestartException => e
  handle_force_restart(e)
  retry
rescue NewRelic::Agent::ForceDisconnectException => e
  handle_force_disconnect(e)
rescue => e
  handle_other_error(e)
end
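
The control flow above can be sketched in isolation. This is a minimal illustration, not the agent's code: `RestartRequested`, `DisconnectRequested`, and `run_guarded` are hypothetical stand-ins for the agent's exception classes and for `catch_errors`. It shows that `retry` inside a method-level `rescue` re-runs the method body (and therefore the yielded block), while the other branches end the work quietly. Note that a bare `rescue => e` only catches `StandardError` and its subclasses.

```ruby
# Hypothetical stand-ins for the agent's exception classes.
class RestartRequested < StandardError; end
class DisconnectRequested < StandardError; end

# Same shape as catch_errors: a restart request re-runs the block,
# anything else is recorded and swallowed so nothing reaches the caller.
def run_guarded(log)
  yield
rescue RestartRequested => e
  log << "restart: #{e.message}"
  retry
rescue DisconnectRequested => e
  log << "disconnect: #{e.message}"
rescue => e
  log << "error: #{e.message}"
end

log = []
attempts = 0
run_guarded(log) do
  attempts += 1
  # First pass asks for a restart, second pass asks for a disconnect.
  raise RestartRequested, 'server asked for restart' if attempts == 1
  raise DisconnectRequested, 'server asked for disconnect'
end
```

After this runs, the block has executed twice: once before the `retry` and once after it.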

#create_and_run_event_loop ⇒ Object



# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 62

def create_and_run_event_loop
  @event_loop = create_event_loop
  data_harvest = :"#{Agent.config[:data_report_period]}_second_harvest"
  @event_loop.on(data_harvest) do
    transmit_data
  end
  establish_interval_transmissions
  establish_fire_everies(data_harvest)

  @event_loop.run
end
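
To illustrate the register-then-fire pattern used above, here is a small sketch. `TinyEventLoop` and its `fire` method are hypothetical and are not the agent's `EventLoop` class; only the `on(event) { ... }` registration and the `:"<period>_second_harvest"` symbol naming are taken from the listing. The report period of 60 is an illustrative value.

```ruby
# Hypothetical miniature event loop; NOT the agent's EventLoop class.
class TinyEventLoop
  def initialize
    @handlers = Hash.new { |hash, key| hash[key] = [] }
  end

  # Register a block to run whenever +event+ is fired.
  def on(event, &block)
    @handlers[event] << block
  end

  # Run every handler registered for +event+, in registration order.
  def fire(event)
    @handlers[event].each(&:call)
  end
end

report_period = 60 # illustrative value for :data_report_period
data_harvest = :"#{report_period}_second_harvest"

transmitted = []
event_loop = TinyEventLoop.new
event_loop.on(data_harvest) { transmitted << :harvest }

event_loop.fire(data_harvest)     # runs the registered handler
event_loop.fire(:unrelated_event) # no handlers registered, nothing runs
```

Keying handlers by a symbol derived from the period means that anything configured with the same period shares one event name.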

#create_event_loop ⇒ Object



# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 33

def create_event_loop
  EventLoop.new
end

#deferred_work!(connection_options) ⇒ Object

This method runs in a new thread so that harvesting and sending data happen in the background during normal operation of the agent.

Takes connection options that determine how we should connect to the server, and loops endlessly. Typically we never return from this method unless we’re shutting down the agent.



# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 127

def deferred_work!(connection_options)
  catch_errors do
    NewRelic::Agent.disable_all_tracing do
      connect(connection_options)
      if connected?
        create_and_run_event_loop
        # never reaches here unless there is a problem or
        # the agent is exiting
      else
        ::NewRelic::Agent.logger.debug('No connection.  Worker thread ending.')
      end
    end
  end
end

#handle_force_disconnect(error) ⇒ Object

When a disconnect is requested, stop the current thread, which is the worker thread that gathers data and talks to the server.



# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 88

def handle_force_disconnect(error)
  ::NewRelic::Agent.logger.warn('Agent received a ForceDisconnectException from the server, disconnecting. ' \
    "(#{error.message})")
  disconnect
end

#handle_force_restart(error) ⇒ Object

Handles the case where the server tells us to restart. It drops the buffered data, forces the service to restart, resets the connection state to pending, and sleeps for 30 seconds before reconnecting.



# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 77

def handle_force_restart(error)
  ::NewRelic::Agent.logger.debug(error.message)
  drop_buffered_data
  @service&.force_restart
  @connect_state = :pending
  sleep(30)
end

#handle_other_error(error) ⇒ Object

Handles an unknown error in the worker thread by logging it and disconnecting the agent, since we are now in an unknown state.



# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 97

def handle_other_error(error)
  ::NewRelic::Agent.logger.error('Unhandled error in worker thread, disconnecting.')
  # These errors are fatal (that is, they will prevent the agent from
  # reporting entirely), so we really want backtraces when they happen
  ::NewRelic::Agent.logger.log_exception(:error, error)
  disconnect
end

#interval_for(event_type) ⇒ Object

Certain event types may sometimes need to be reported on the same interval as metrics, so we check the configuration assigned in EventHarvestConfig to determine the interval on which to report them.



# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 57

def interval_for(event_type)
  interval = Agent.config[:"event_report_period.#{event_type}"]
  :"#{interval}_second_harvest"
end
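
The lookup above can be tried without the agent. In this sketch, `CONFIG` is a hypothetical hash standing in for `Agent.config`, `harvest_event_for` is a hypothetical copy of the method body, and the interval values are illustrative rather than the agent's defaults. It shows how an event type whose period matches `:data_report_period` ends up with the same event name as the main data harvest.

```ruby
# Hypothetical config hash standing in for Agent.config; values are illustrative.
CONFIG = {
  :data_report_period => 60,
  :'event_report_period.span_event_data' => 5,
  :'event_report_period.log_event_data' => 60
}.freeze

# Same two steps as interval_for: look up the period, build the event name.
def harvest_event_for(event_type, config = CONFIG)
  interval = config[:"event_report_period.#{event_type}"]
  :"#{interval}_second_harvest"
end

span_event = harvest_event_for('span_event_data')
log_event = harvest_event_for('log_event_data')
default_event = :"#{CONFIG[:data_report_period]}_second_harvest"

# With these values, log events share the default harvest event name,
# while span events get their own, faster one.
same_as_default = (log_event == default_event)
```

In this sketch a missing config key would interpolate `nil` as an empty string and produce `:_second_harvest`, so the lookup relies on every event type having a configured period.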

#start_worker_thread(connection_options = {}) ⇒ Object

Try to launch the worker thread and connect to the server.

See #connect for a description of connection_options.



# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 20

def start_worker_thread(connection_options = {})
  if disable = NewRelic::Agent.config[:disable_harvest_thread]
    NewRelic::Agent.logger.info('Not starting Ruby Agent worker thread because :disable_harvest_thread is ' \
      "#{disable}")
    return
  end

  ::NewRelic::Agent.logger.debug('Creating Ruby Agent worker thread.')
  @worker_thread = Threading::AgentThread.create('Worker Loop') do
    deferred_work!(connection_options)
  end
end

#stop_event_loop ⇒ Object

If the @worker_thread encounters an error while attempting to connect to the collector, the connect attempts enter an exponential backoff retry loop. To avoid potential race conditions between shutting down and attempting to reconnect, we join the @worker_thread with a timeout (3 seconds in the code below). This gives pending data a chance to be sent to the server without waiting indefinitely for a reconnect to succeed. This typically arises in cron-scheduled rake tasks when there are also network stability or latency issues.



# File 'lib/new_relic/agent/agent_helpers/start_worker_thread.rb', line 44

def stop_event_loop
  @event_loop&.stop
  # Wait for the event loop thread to finish.
  if @worker_thread
    unless @worker_thread.join(3)
      ::NewRelic::Agent.logger.debug('Event loop thread did not stop within 3 seconds')
    end
  end
end
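
The `unless @worker_thread.join(3)` check relies on the return value of Ruby's `Thread#join` when it is given a time limit: the thread itself if it finished in time, or `nil` if the limit expired. A minimal sketch, using two hypothetical threads and shorter limits than the agent's 3 seconds:

```ruby
# A long-running thread, standing in for a worker stuck in a reconnect loop.
slow_worker = Thread.new { sleep(5) }
# A thread that finishes almost immediately.
fast_worker = Thread.new { :done }

# Thread#join(limit) returns the thread if it finished, or nil on timeout.
slow_result = slow_worker.join(0.1)
fast_result = fast_worker.join(1)

timed_out = slow_result.nil?

# Clean up the illustrative thread so the script can exit promptly.
slow_worker.kill
slow_worker.join
```

A timed-out `join` does not stop the thread; it only stops the caller from waiting, which is why the method above just logs a debug message and moves on.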