Class: Arachni::RPC::Server::Framework

Inherits:
Framework
Includes:
Distributor, Utilities
Defined in:
lib/arachni/rpc/server/framework.rb,
lib/arachni/rpc/server/distributor.rb

Overview

Wraps the framework of the local instance and the frameworks of all its slaves (when in High Performance Grid mode) into a single, easy-to-handle package.

Disregard all:

  • ‘block’ parameters; they exist for internal processing reasons and cannot be accessed via the API

  • inherited methods and attributes

Author:

Defined Under Namespace

Modules: Distributor

Constant Summary

Constants included from Distributor

Distributor::MAX_CONCURRENCY, Distributor::MIN_PAGES_PER_INSTANCE

Constants inherited from Framework

Framework::REVISION

Instance Attribute Summary

Attributes inherited from Framework

#http, #modules, #opts, #page_queue_total_size, #plugins, #reports, #session, #sitemap, #spider, #url_queue_total_size

Instance Method Summary

Methods included from Distributor

#build_elem_list, #connect_to_dispatcher, #connect_to_instance, #dispatcher, #distribute_elements, #each_slave, #iterator_for, #map_slaves, #max_eta, #merge_stats, #pick_dispatchers, #prefered_dispatchers, #slave_iterator, #spawn, #split_urls

Methods included from Utilities

#cookie_encode, #cookies_from_document, #cookies_from_file, #cookies_from_response, #exception_jail, #exclude_path?, #extract_domain, #form_decode, #form_encode, #form_parse_request_body, #forms_from_document, #forms_from_response, #get_path, #hash_keys_to_str, #html_decode, #html_encode, #include_path?, #links_from_document, #links_from_response, #normalize_url, #page_from_response, #page_from_url, #parse_query, #parse_set_cookie, #parse_url_vars, #path_in_domain?, #path_too_deep?, #remove_constants, #seed, #skip_path?, #to_absolute, #uri_decode, #uri_encode, #uri_parse, #uri_parser, #url_sanitize

Methods inherited from Framework

#audit_store, #lsmod, #lsrep, #on_run_mods, #paused?, #push_to_page_queue, #push_to_url_queue, reset, #reset, #reset_spider, #revision, #running?, #stats, #status, #version

Methods included from Mixins::Observable

#method_missing

Methods included from UI::Output

#debug?, #debug_off, #debug_on, #disable_only_positives, #flush_buffer, #mute, #muted?, old_reset_output_options, #only_positives, #only_positives?, #print_bad, #print_debug, #print_debug_backtrace, #print_debug_pp, #print_error, #print_error_backtrace, #print_info, #print_line, #print_ok, #print_status, #print_verbose, #reroute_to_file, #reroute_to_file?, reset_output_options, #set_buffer_cap, #uncap_buffer, #unmute, #verbose, #verbose?

Constructor Details

#initialize(opts) ⇒ Framework

Returns a new instance of Framework.



# File 'lib/arachni/rpc/server/framework.rb', line 58

def initialize( opts )
    super( opts )

    # already inherited but let's make it explicit
    @opts = opts

    @modules = Module::Manager.new( self )
    @plugins = Plugin::Manager.new( self )

    # holds all running instances
    @instances = []

    # when in HPG mode we need to create our own sitemap which will be a
    # composite of all sitemaps of all instances
    #
    # this var will hold the combined URLs and will be returned by our
    # auditstore_sitemap() override.
    @override_sitemap = Set.new

    # if we're a slave this var will hold the URL of our master
    @master_url = ''

    # some methods need to be accessible over RPC for instance management,
    # restricting elements, adding more pages etc.
    #
    # however, when in HPG mode, the master should not be tampered with,
    # so we generate a local token (which is not known to API clients)
    # to be used server side by self to facilitate access control
    @local_token = gen_token
end

Dynamic Method Handling

This class handles dynamic methods through the method_missing method in the class Arachni::Mixins::Observable

Instance Method Details

#busy?(include_slaves = true, &block) ⇒ Boolean

Returns true if the system is scanning, false if #run hasn’t been called yet or if the scan has finished.

Parameters:

  • include_slaves (Bool) (defaults to: true)

    whether to take slave status into account; if so, false is returned only once the slaves are done too.

  • block (Proc)

    block to which to pass the result

Returns:

  • (Boolean)


# File 'lib/arachni/rpc/server/framework.rb', line 99

def busy?( include_slaves = true, &block )
    busyness = [ extended_running? ]

    if @instances.empty? || !include_slaves
        block.call( busyness[0] )
        return
    end

    foreach = proc do |instance, iter|
        instance.framework.busy? { |res| iter.return( res ) }
    end
    after = proc do |res|
        busyness << res
        busyness.flatten!
        block.call( busyness.include?( true ) )
    end

    map_slaves( foreach, after )
end
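
The aggregation performed by #busy? can be illustrated in isolation. The following is a minimal sketch, not Arachni code: FakeSlave and the synchronous map_slaves stand-in are hypothetical and only mimic the callback style of the listing above (the real Distributor#map_slaves is not shown on this page and is assumed to be asynchronous).

```ruby
# Hypothetical stand-in for a slave instance; not part of Arachni.
FakeSlave = Struct.new(:running) do
  # Mimics the callback style: the result is handed to a block.
  def busy?
    yield running
  end
end

# Simplified, synchronous stand-in for Distributor#map_slaves (assumption).
# Calls +foreach+ for every slave, collects what each one reports and
# passes the collected results to +after+.
def map_slaves(slaves, foreach, after)
  results = slaves.map do |slave|
    result = nil
    foreach.call(slave, ->(res) { result = res })
    result
  end
  after.call(results)
end

# Passes true to the block if the local instance or any slave is busy.
def busy?(local_running, slaves, include_slaves = true, &block)
  busyness = [local_running]

  if slaves.empty? || !include_slaves
    block.call(busyness[0])
    return
  end

  foreach = proc { |slave, done| slave.busy? { |res| done.call(res) } }
  after   = proc { |res| block.call((busyness + res).include?(true)) }

  map_slaves(slaves, foreach, after)
end

slaves = [FakeSlave.new(false), FakeSlave.new(true)]
busy?(false, slaves)        { |state| puts "busy (with slaves): #{state}" }
busy?(false, slaves, false) { |state| puts "busy (local only): #{state}" }
```

With include_slaves left at its default, a single busy slave is enough for the block to receive true, even when the local instance has finished.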

#clean_up(&block) ⇒ Object

If the scan needs to be aborted abruptly this method takes care of any unfinished business (like running plug-ins).

Should be called before grabbing the #auditstore, especially when running in HPG mode as it will take care of merging the plug-in results of all instances.

Parameters:

  • block (Proc)

    block to be called once the cleanup has finished



# File 'lib/arachni/rpc/server/framework.rb', line 287

def clean_up( &block )
    super( true )

    if @instances.empty?
        block.call( true ) if block_given?
        return
    end

    foreach = proc do |instance, iter|
        instance.framework.clean_up {
            instance.plugins.results do |res|
                iter.return( !res.rpc_exception? ? res : nil )
            end
        }
    end
    after = proc { |results| @plugins.merge_results( results.compact ); block.call( true ) }
    map_slaves( foreach, after )
end

#high_performance? ⇒ Bool

Returns true if running in HPG (High Performance Grid) mode and instance is the master, false otherwise.

Returns:

  • (Bool)

    true if running in HPG (High Performance Grid) mode and instance is the master, false otherwise.



# File 'lib/arachni/rpc/server/framework.rb', line 136

def high_performance?
    @opts.grid_mode == 'high_performance'
end

#issues ⇒ Array<Arachni::Issue>

Returns all discovered issues albeit without any variations.

Returns:

  • (Array<Arachni::Issue>)



# File 'lib/arachni/rpc/server/framework.rb', line 517

def issues
    auditstore.issues.deep_clone.map do |issue|
        issue.variations.clear
        issue
    end
end
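
The effect of cloning before clearing the variations can be shown with a small stand-alone sketch. Everything here is an assumption made for illustration: the Issue struct is a hypothetical stand-in for Arachni::Issue, and deep_clone is approximated with a Marshal round trip, which may differ from how Arachni implements it.

```ruby
# Hypothetical stand-in for Arachni::Issue.
Issue = Struct.new(:name, :variations)

# Approximates a deep clone with a Marshal round trip (assumption).
def deep_clone(obj)
  Marshal.load(Marshal.dump(obj))
end

# Returns copies of the issues with their variations removed,
# leaving the stored originals untouched.
def issues_without_variations(issues)
  deep_clone(issues).map do |issue|
    issue.variations.clear
    issue
  end
end

stored = [Issue.new('XSS', ['variation 1', 'variation 2'])]
light  = issues_without_variations(stored)

puts light.first.variations.size   # variations stripped from the copy
puts stored.first.variations.size  # originals keep their variations
```

Clearing the variations on a shallow copy would also empty the stored issues, which is why the copy has to be deep.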

#issues_as_hash ⇒ Array<Hash>

Returns #issues as an array of hashes.

Returns:

  • (Array<Hash>)

See Also:

  • #issues



# File 'lib/arachni/rpc/server/framework.rb', line 529

def issues_as_hash
    issues.map { |i| i.to_h }
end

#lsplug ⇒ Array<Hash>

Returns information about all available plug-ins.

Returns:

  • (Array<Hash>)

    information about all available plug-ins



# File 'lib/arachni/rpc/server/framework.rb', line 122

def lsplug
    super.map do |plugin|
        plugin[:options] = [plugin[:options]].flatten.compact.map do |opt|
            opt.to_h.merge( 'type' => opt.type )
        end
        plugin
    end
end
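
The option normalization in #lsplug copes with a plug-in that declares no options, a single option or a list of options. The sketch below reproduces only that idiom; the Option struct and its to_h output are hypothetical stand-ins for Arachni's option classes.

```ruby
# Hypothetical stand-in for an Arachni plug-in option.
Option = Struct.new(:name, :type) do
  def to_h
    { 'name' => name }
  end
end

# Turns plugin[:options] into an array of hashes that also carry the type,
# whether it was nil, a single option or an array of options.
def normalize_options(plugin)
  plugin[:options] = [plugin[:options]].flatten.compact.map do |opt|
    opt.to_h.merge('type' => opt.type)
  end
  plugin
end

none   = normalize_options(name: 'a', options: nil)
single = normalize_options(name: 'b', options: Option.new('path', 'string'))
many   = normalize_options(
  name: 'c',
  options: [Option.new('path', 'string'), Option.new('port', 'integer')]
)

puts none[:options].inspect
puts single[:options].inspect
puts many[:options].size
```

Wrapping the value in an array before flattening and compacting is what lets all three shapes go through the same code path.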

#output(&block) ⇒ Array<Hash>

Merged output of all running instances.

This is probably going to be wildly out of sync and lack a lot of messages.

It’s here to give the notion of scan progress to the end-user rather than provide an accurate depiction of the actual progress.

The returned object will be in the form of:

[ { <type> => <message> } ]

like:

[
    { status: 'Initiating'},
    {   info: 'Some informational msg...'},
]

Possible message types are:

  • status – Status messages, usually to denote progress.

  • info – Informational messages, like notices.

  • ok – Denotes a successful operation or a positive result.

  • verbose – Verbose messages providing extra information.

  • bad – Opposite of :ok; an operation didn’t go as expected and something has failed, but it is recoverable.

  • error – An error has occurred, this is not good.

  • line – Generic message, no type.
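
A client that receives messages in the form described above may want to group them by type. The following is a minimal sketch with made-up sample data; group_by_type is a hypothetical helper, not part of the API, and it accepts both String and Symbol keys because the key type seen by a client depends on how the data was transported.

```ruby
# Sample data in the documented [ { <type> => <message> } ] form.
messages = [
  { status: 'Initiating' },
  { info:   'Some informational msg...' },
  { error:  'Something went wrong.' },
  { 'info' => 'Another notice.' }
]

# Groups message texts by their type, normalizing the type to a Symbol.
def group_by_type(messages)
  messages.each_with_object(Hash.new { |h, k| h[k] = [] }) do |msg, grouped|
    msg.each { |type, text| grouped[type.to_sym] << text }
  end
end

grouped = group_by_type(messages)
puts grouped[:info].inspect
puts grouped[:error].inspect
```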

Parameters:

  • block (Proc)

    block to which to pass the result

Returns:

  • (Array<Hash>)



# File 'lib/arachni/rpc/server/framework.rb', line 359

def output( &block )
    buffer = flush_buffer

    if @instances.empty?
        block.call( buffer )
        return
    end

    foreach = proc do |instance, iter|
        instance.service.output { |out| iter.return( out ) }
    end
    after = proc { |out| block.call( (buffer | out).flatten ) }
    map_slaves( foreach, after )
end

#pause ⇒ Object Also known as: pause!

Pauses the running scan on a best effort basis.



# File 'lib/arachni/rpc/server/framework.rb', line 309

def pause
    super
    each_slave{ |instance, iter| instance.framework.pause{ iter.next } }
    true
end

#progress_data(opts = {}, &block) ⇒ Object Also known as: progress

Returns aggregated progress data, which limits the number of calls required to get an accurate depiction of a scan’s progress. The data includes:

  • output messages

  • discovered issues

  • overall statistics

  • overall scan status

  • statistics of all instances individually

Parameters:

  • opts (Hash) (defaults to: {})

    contains info about what data to return:

    • :messages – include output messages

    • :slaves – include slave data

    • :issues – include issue summaries

    Uses an implicit include for the above (i.e. nil will be considered true).

    • :as_hash – if set to true will convert issues to hashes before returning

  • block (Proc)

    block to which to pass the result
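
The implicit include rule for :messages, :slaves and :issues can be sketched as follows. include_option? is a hypothetical helper written for this illustration; it mirrors the nil check described above rather than quoting the library.

```ruby
# Resolves an implicitly included option:
# nil (or a missing key) counts as true, anything else is used as given.
def include_option?(opts, key)
  opts[key].nil? ? true : opts[key]
end

opts = { messages: false, as_hash: true }

puts include_option?(opts, :messages) # explicitly disabled
puts include_option?(opts, :slaves)   # missing, so included
puts include_option?(opts, :issues)   # missing, so included
```

A plain `opts[key] || true` would not work here, since it would turn an explicit false back into true.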



# File 'lib/arachni/rpc/server/framework.rb', line 393

def progress_data( opts= {}, &block )

    include_messages = opts[:messages].nil? ? true : opts[:messages]
    include_slaves   = opts[:slaves].nil? ? true : opts[:slaves]
    include_issues   = opts[:issues].nil? ? true : opts[:issues]

    as_hash = opts[:as_hash] ? true : opts[:as_hash]

    data = {
        'stats'  => {},
        'status' => status,
        'busy'   => extended_running?
    }

    data['messages']  = flush_buffer if include_messages

    if include_issues
        data['issues'] = as_hash ? issues_as_hash : issues
    end

    data['instances'] = {} if include_slaves

    stats = []
    stat_hash = {}
    stats( true, true ).each { |k, v| stat_hash[k.to_s] = v }

    if @opts.datastore[:dispatcher_url] && include_slaves
        data['instances'][self_url] = stat_hash.dup
        data['instances'][self_url]['url'] = self_url
        data['instances'][self_url]['status'] = status
    end

    stats << stat_hash

    if @instances.empty? || !include_slaves
        data['stats'] = merge_stats( stats )
        data['instances'] = data['instances'].values if include_slaves
        block.call( data )
        return
    end

    foreach = proc do |instance, iter|
        instance.framework.progress_data( opts ) do |tmp|
            if !tmp.rpc_exception?
                tmp['url'] = instance.url
                iter.return( tmp )
            else
                iter.return( nil )
            end
        end
    end

    after = proc do |slave_data|
        slave_data.compact!
        slave_data.each do |slave|
            data['messages']  |= slave['messages'] if include_messages
            data['issues']    |= slave['issues'] if include_issues

            if include_slaves
                url = slave['url']
                data['instances'][url]           = slave['stats']
                data['instances'][url]['url']    = url
                data['instances'][url]['status'] = slave['status']
            end

            stats << slave['stats']
        end

        if include_slaves
            sorted_data_instances = {}
            data['instances'].keys.sort.each do |url|
                sorted_data_instances[url] = data['instances'][url]
            end
            data['instances'] = sorted_data_instances.values
        end

        data['stats'] = merge_stats( stats )
        block.call( data )
    end

    map_slaves( foreach, after )
end
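
The merge step at the end of #progress_data can be sketched on its own. This is a simplified illustration with made-up data: merge_progress is a hypothetical helper, the include_* switches are left out, and a nil entry stands for a slave whose RPC call failed.

```ruby
# Merges per-slave progress hashes into the master's data (illustrative only).
def merge_progress(data, slave_data)
  slave_data.compact.each do |slave|
    # Array union keeps each message and issue only once.
    data['messages'] |= slave['messages']
    data['issues']   |= slave['issues']

    url = slave['url']
    data['instances'][url] =
      slave['stats'].merge('url' => url, 'status' => slave['status'])
  end

  # Return the per-instance statistics as an array ordered by URL.
  data['instances'] =
    data['instances'].sort_by { |url, _stats| url }.map { |_url, stats| stats }
  data
end

data = {
  'messages'  => [{ 'status' => 'Master up' }],
  'issues'    => [],
  'instances' => {
    'localhost:7331' => { 'url' => 'localhost:7331', 'status' => 'auditing' }
  }
}

slaves = [
  { 'url' => 'localhost:1111', 'status' => 'crawling',
    'stats' => { 'requests' => 10 },
    'messages' => [{ 'info' => 'Slave up' }], 'issues' => [] },
  nil # a slave that could not be reached
]

merged = merge_progress(data, slaves)
puts merged['instances'].map { |i| i['url'] }.inspect
puts merged['messages'].size
```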

#register_issues(issues, token = nil) ⇒ Bool

Registers an array holding Issue objects with the local instance.

Primarily used by slaves to register issues they find on the spot.

Parameters:

  • issues (Array<Arachni::Issue>)
  • token (String) (defaults to: nil)

    privileged token; prevents this method from being called by 3rd parties when this instance is a master. If this instance is not a master, the token needn’t be provided.

Returns:

  • (Bool)

    true on success, false on invalid token



# File 'lib/arachni/rpc/server/framework.rb', line 591

def register_issues( issues, token = nil )
    return false if high_performance? && !valid_token?( token )
    @modules.class.register_results( issues )
    true
end
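
The token guard shared by #register_issues, #restrict_to_elements and #update_page_queue can be sketched with a stripped-down model. GuardedInstance is hypothetical; in particular, valid_token? and the way the token is generated are assumptions, since neither gen_token nor valid_token? is shown on this page.

```ruby
require 'securerandom'

# Hypothetical, stripped-down model of the token check; not Arachni's class.
class GuardedInstance
  attr_reader :issues

  def initialize(master)
    @master      = master
    @local_token = SecureRandom.hex(16) # assumption: any unguessable value
    @issues      = []
  end

  # Public entry point: a master only accepts calls carrying its own token.
  def register_issues(issues, token = nil)
    return false if @master && !valid_token?(token)
    @issues.concat(issues)
    true
  end

  # Server-side code can always pass the guard using the local token.
  def register_locally(issues)
    register_issues(issues, @local_token)
  end

  private

  def valid_token?(token)
    token == @local_token
  end
end

master = GuardedInstance.new(true)
puts master.register_issues(['XSS'])          # rejected, no token
puts master.register_locally(['XSS'])         # accepted, local token

slave = GuardedInstance.new(false)
puts slave.register_issues(['SQL injection']) # accepted, not a master
```

Because the token never leaves the process, API clients cannot supply it, while the instance itself can still use the same public methods internally.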

#report ⇒ Hash Also known as: audit_store_as_hash, auditstore_as_hash

Returns the results of the audit as a hash.

Returns:

  • (Hash)


# File 'lib/arachni/rpc/server/framework.rb', line 482

def report
    audit_store.to_h
end

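#report_as(name) ⇒ Object

Returns the audit results as a String, formatted by the report component called name. Fails with Exceptions::ComponentNotFound if no such report is available and with TypeError if the report cannot write its output to a file.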



# File 'lib/arachni/rpc/server/framework.rb', line 488

def report_as( name )
    fail Exceptions::ComponentNotFound, "Report '#{name}' could not be found." if !reports.available.include?( name.to_s )
    fail TypeError, "Report '#{name}' cannot format the audit results as a String." if !reports[name].has_outfile?

    outfile = "/tmp/arachn_report_as.#{name}"
    reports.run_one( name, auditstore, 'outfile' => outfile )

    str = IO.read( outfile )
    File.delete( outfile )
    str
end
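
The write, read and delete sequence above relies on a fixed path under /tmp. As an alternative, the sketch below swaps in Ruby's Tempfile, which is not what the listing does; run_report is a hypothetical stand-in for a report component that can only write to a file.

```ruby
require 'tempfile'

# Hypothetical stand-in for a report component that writes to an outfile.
def run_report(results, outfile)
  File.write(outfile, results.map { |k, v| "#{k}: #{v}" }.join("\n"))
end

# Runs the report against a temporary file and returns its contents.
# Tempfile.create removes the file once the block has finished.
def report_as_string(results)
  Tempfile.create('report_as') do |file|
    file.close # the report writes to the path itself
    run_report(results, file.path)
    File.read(file.path)
  end
end

puts report_as_string('issues' => 2, 'pages' => 14)
```

A unique temporary file avoids clashes between concurrent calls that ask for the same report name, and the file is cleaned up even if the report raises.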

#restrict_to_elements(elements, token = nil) ⇒ Bool

Restricts the scope of the audit to individual elements.

Parameters:

  • elements (Array<String>)

    list of element IDs (as created by Element::Capabilities::Auditable#scope_audit_id)

  • token (String) (defaults to: nil)

    privileged token; prevents this method from being called by 3rd parties when this instance is a master. If this instance is not a master, the token needn’t be provided.

Returns:

  • (Bool)

    true on success, false on invalid token



# File 'lib/arachni/rpc/server/framework.rb', line 553

def restrict_to_elements( elements, token = nil )
    return false if high_performance? && !valid_token?( token )
    Element::Capabilities::Auditable.restrict_to_elements( elements )
    true
end

#resume ⇒ Object Also known as: resume!

Resumes a paused scan right away.



# File 'lib/arachni/rpc/server/framework.rb', line 319

def resume
    super
    each_slave { |instance, iter| instance.framework.resume{ iter.next } }
    true
end

#run ⇒ Bool

Starts the audit.

Returns:

  • (Bool)

    false if already running, true otherwise



# File 'lib/arachni/rpc/server/framework.rb', line 145

def run
    # return if we're already running
    return false if extended_running?

    @extended_running = true

    #
    # if we're in HPG mode do fancy stuff like distributing and balancing workload
    # as well as starting slave instances and dealing with some lower level
    # operations of the local instance like running plug-ins etc...
    #
    # otherwise just run the local instance, nothing special...
    #
    if high_performance?

        ::Thread.new {

            #
            # We're in HPG (High Performance Grid) mode,
            # things are going to get weird...
            #

            # we'll need to analyze the pages prior to assigning
            # them to each instance at the element level so as to gain
            # more granular control over the assigned workload
            #
            # put simply, we'll need to perform some magic in order
            # to prevent different instances from auditing the same elements
            # and wasting bandwidth
            #
            # for example: search forms, logout links and the like will
            # most likely exist on most pages of the site and since each
            # instance is assigned a set of URLs/pages to audit they will end up
            # with common elements so we have to prevent instances from
            # performing identical checks.
            #
            # interesting note: should previously unseen elements dynamically
            # appear during the audit they will override these restrictions
            # and each instance will audit them at will.
            #

            element_ids_per_page = {}

            # prepare the local instance (runs plugins and starts the timer)
            prepare

            # we need to take our cues from the local framework as some
            # plug-ins may need the system to wait for them to finish
            # before moving on.
            sleep( 0.2 ) while paused?

            @status = :crawling
            # start the crawl and extract all paths
            spider.run do |page|
                @override_sitemap << page.url
                element_ids_per_page[page.url] = build_elem_list( page )
            end

            @status = :distributing
            # the plug-ins may have updated the page queue
            # so we need to distribute these pages as well
            page_a = []
            while !@page_queue.empty? && page = @page_queue.pop
                page_a << page
                @override_sitemap << page.url
                element_ids_per_page[page.url] = build_elem_list( page )
            end

            # get the Dispatchers with unique Pipe IDs
            # in order to take advantage of line aggregation
            prefered_dispatchers do |pref_dispatchers|

                # split the URLs of the pages in equal chunks
                chunks    = split_urls( element_ids_per_page.keys, pref_dispatchers.size + 1 )
                chunk_cnt = chunks.size

                if chunk_cnt > 0
                    # split the page array into chunks that will be distributed
                    # across the instances
                    page_chunks = page_a.chunk( chunk_cnt )

                    # assign us our fair share of plug-in discovered pages
                    update_page_queue( page_chunks.pop, @local_token )

                    # remove duplicate elements across the (per instance) chunks
                    # while spreading them out evenly
                    elements = distribute_elements( chunks, element_ids_per_page )

                    # restrict the local instance to its assigned elements
                    restrict_to_elements( elements.pop, @local_token )

                    # set the URLs to be audited by the local instance
                    @opts.restrict_paths = chunks.pop

                    chunks.each do |chunk|
                        # spawn a remote instance, assign a chunk of URLs
                        # and elements to it and run it
                        spawn( pref_dispatchers.pop,
                               urls:     chunk,
                               elements: elements.pop,
                               pages:    page_chunks.pop
                        ) { |inst| @instances << inst }
                    end
                end

                # start the local instance
                Thread.new {
                    # ap 'AUDITING'
                    audit

                    # ap 'OLD CLEAN UP'
                    old_clean_up

                    # ap 'DONE'
                    @extended_running = false
                    @status = :done
                }
            end
        }
    else
        # start the local instance
        Thread.new {
            # ap 'AUDITING'
            super
            # ap 'DONE'
            @extended_running = false
        }
    end

    true
end
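
The idea behind the workload distribution (split the URLs into chunks, then make sure no element is audited by more than one instance) can be sketched as follows. This is only an illustration of the principle: split_evenly and distribute_unique are hypothetical helpers, the real Distributor#split_urls and Distributor#distribute_elements are not shown on this page and may work differently, and unlike the real method this sketch does not rebalance elements between chunks.

```ruby
# Splits +items+ into at most +count+ chunks of near-equal size.
def split_evenly(items, count)
  return [] if items.empty?
  per_chunk = (items.size / count.to_f).ceil
  items.each_slice(per_chunk).to_a
end

# Assigns every element ID to the first chunk whose pages contain it,
# so that no two chunks end up auditing the same element.
def distribute_unique(chunks, element_ids_per_page)
  seen = {}
  chunks.map do |urls|
    urls.flat_map { |url| element_ids_per_page.fetch(url, []) }
        .uniq
        .reject { |id| seen[id] }
        .each { |id| seen[id] = true }
  end
end

element_ids_per_page = {
  'http://example.com/'  => ['form:search', 'link:logout'],
  'http://example.com/a' => ['form:search', 'form:login'],
  'http://example.com/b' => ['link:logout', 'form:comment']
}

# One chunk for the local instance plus one for a single slave.
chunks   = split_evenly(element_ids_per_page.keys, 2)
elements = distribute_unique(chunks, element_ids_per_page)

chunks.zip(elements).each do |urls, ids|
  puts "#{urls.inspect} audits #{ids.inspect}"
end
```

The search form and the logout link appear on several pages but are assigned to one chunk only, which is the duplication the comments in #run set out to avoid.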

#serialized_auditstore ⇒ String

Returns YAML representation of #auditstore.

Returns:

  • (String)



# File 'lib/arachni/rpc/server/framework.rb', line 503

def serialized_auditstore
    audit_store.to_yaml
end

#serialized_report ⇒ String

Returns YAML representation of #report.

Returns:

  • (String)



# File 'lib/arachni/rpc/server/framework.rb', line 510

def serialized_report
    audit_store.to_h.to_yaml
end
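
Since #serialized_report returns the report hash as YAML, a client can turn it back into a Hash with Ruby's standard YAML library. The sketch below uses a made-up hash in place of a real report; the keys and values are sample data, not Arachni's report layout.

```ruby
require 'yaml'

# Made-up sample standing in for the hash returned by #report.
report = {
  'version' => '0.0',
  'issues'  => [{ 'name' => 'XSS', 'url' => 'http://example.com/' }]
}

# What the server side does: serialize the hash to YAML.
yaml = report.to_yaml

# What a client can do with the received String.
restored = YAML.safe_load(yaml)

puts yaml
puts restored == report
```

YAML.safe_load is sufficient here because the hash form contains only plain strings, arrays and hashes; the YAML produced by #serialized_auditstore may contain custom objects and would need a different loading strategy.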

#set_master(url, token) ⇒ Bool

Sets the URL and authentication token required to connect to the instance’s master.

Parameters:

  • url (String)

    master’s URL in ‘hostname:port’ form

  • token (String)

    master’s authentication token

Returns:

  • (Bool)

    true on success, false if the current instance is the master of the HPG (in which case this method is not applicable)



# File 'lib/arachni/rpc/server/framework.rb', line 606

def set_master( url, token )
    return false if high_performance?

    @master_url = url
    @master = connect_to_instance( 'url' => url, 'token' => token )

    @modules.do_not_store
    @modules.on_register_results { |r| report_issues_to_master( r ) }
    true
end

#update_page_queue(pages, token = nil) ⇒ Bool

Updates the page queue with the provided pages.

Parameters:

  • pages (Array<Arachni::Page>)

    list of pages

  • token (String) (defaults to: nil)

    privileged token; prevents this method from being called by 3rd parties when this instance is a master. If this instance is not a master, the token needn’t be provided.

Returns:

  • (Bool)

    true on success, false on invalid token



# File 'lib/arachni/rpc/server/framework.rb', line 571

def update_page_queue( pages, token = nil )
    return false if high_performance? && !valid_token?( token )
    pages.each { |page| push_to_page_queue( page )}
    true
end