Class: Arachni::RPC::Server::Framework

Inherits:
Framework
Includes:
Module::Utilities
Defined in:
lib/arachni/rpc/server/framework.rb

Overview

Wraps the framework of the local instance and, when in High Performance Grid (HPG) mode, the frameworks of all remote slaves into a single, easy-to-handle interface.

@author: Tasos “Zapotek” Laskos

<[email protected]>

@version: 0.1

Constant Summary

MAX_CONCURRENCY = 20
MIN_PAGES_PER_INSTANCE = 30

Constants inherited from Framework

Framework::REVISION

Instance Attribute Summary

Attributes inherited from Framework

#auditmap, #page_queue_size, #page_queue_total_size, #reports, #sitemap, #spider, #url_queue_size, #url_queue_total_size

Instance Method Summary

Methods included from Module::Utilities

#exception_jail, #get_path, #hash_keys_to_str, #normalize_url, #read_file, #seed, #uri_decode, #uri_encode, #uri_parse, #uri_parser, #url_sanitize

Methods inherited from Framework

#audit, #audit_page_queue, #audit_queue, #audit_store, #audit_store_sitemap, #http, #lsmod, #lsrep, #paused?, #plugin_store, #prepare, #push_to_page_queue, #push_to_url_queue, #revision, #running?, #stats, #version

Methods included from Mixins::Observable

#method_missing

Methods included from UI::Output

#buffer, #debug!, #debug?, #flush_buffer, #mute!, #muted?, #only_positives!, #only_positives?, #print_bad, #print_debug, #print_debug_backtrace, #print_debug_pp, #print_error, #print_error_backtrace, #print_info, #print_line, #print_ok, #print_status, #print_verbose, #reroute_to_file, #reroute_to_file?, #uncap_buffer!, #unmute!, #verbose!, #verbose?

Constructor Details

#initialize(opts) ⇒ Framework

Returns a new instance of Framework.



# File 'lib/arachni/rpc/server/framework.rb', line 52

def initialize( opts )
    super( opts )

    @modules = Arachni::RPC::Server::Module::Manager.new( opts )
    @plugins = Arachni::RPC::Server::Plugin::Manager.new( self )

    # holds all running instances
    @instances = []

    @crawling_done = nil
    @override_sitemap = []

    # if we're a slave this var will hold the URL of our master
    @master_url = ''

    @local_token = gen_token
end

Dynamic Method Handling

This class handles dynamic methods through the method_missing method in the class Arachni::Mixins::Observable

Instance Attribute Details

#instances ⇒ Object (readonly)

Returns the value of attribute instances.



# File 'lib/arachni/rpc/server/framework.rb', line 47

def instances
  @instances
end

#modules ⇒ Object (readonly)

Returns the value of attribute modules.



# File 'lib/arachni/rpc/server/framework.rb', line 47

def modules
  @modules
end

#opts ⇒ Object (readonly)

Returns the value of attribute opts.



# File 'lib/arachni/rpc/server/framework.rb', line 47

def opts
  @opts
end

#plugins ⇒ Object (readonly)

Returns the value of attribute plugins.



# File 'lib/arachni/rpc/server/framework.rb', line 47

def plugins
  @plugins
end

Instance Method Details

#busy?(include_slaves = true, &block) ⇒ Boolean

Returns true if the system is scanning, false if #run hasn’t been called yet or if the scan has finished.

Parameters:

  • include_slaves (Bool) (defaults to: true)

    whether to take slave status into account too; if so, the result is false only once the slaves are done as well.

  • &block (Proc)

    block to which to pass the result

Returns:

  • (Boolean)


# File 'lib/arachni/rpc/server/framework.rb', line 89

def busy?( include_slaves = true, &block )

    busyness = [ extended_running? ]

    if @instances.empty? || !include_slaves
        block.call( busyness[0] ) if block_given?
        return
    end

    ::EM::Iterator.new( @instances, @instances.size ).map( proc {
        |instance, iter|
        connect_to_instance( instance ).framework.busy? {
            |res|
            iter.return( res )
        }
    }, proc {
        |res|
        busyness << res
        busyness.flatten!
        block.call( !busyness.reject{ |is_busy| !is_busy }.empty? )
    })
end
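
The final callback in the listing above reduces to a simple rule: the grid is busy if the local instance or any slave reports being busy. The following is a minimal sketch of that reduction in plain Ruby, without EventMachine or RPC. The name `grid_busy?` and its arguments are hypothetical and used for illustration only; they are not part of Arachni.

```ruby
# Hypothetical helper mirroring the last step of #busy?.
# local_busy   - Boolean state of the local instance
# slave_states - Array of Booleans (possibly nested), one entry per slave
def grid_busy?( local_busy, slave_states )
    busyness = [ local_busy, slave_states ].flatten

    # equivalent to !busyness.reject { |is_busy| !is_busy }.empty?
    busyness.any?
end

p grid_busy?( false, [false, true] )  # => true
p grid_busy?( false, [] )             # => false
```

Using `any?` expresses the same check as the reject/empty? chain, which may be easier to read.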

#clean_up!(&block) ⇒ Object

If the scan needs to be aborted abruptly, this method takes care of any unfinished business (like running plug-ins).

Should be called before grabbing the #auditstore, especially when running in HPG mode as it will take care of merging the plug-in results of all instances.

Parameters:

  • &block (Proc)

    block to be called once the cleanup has finished



# File 'lib/arachni/rpc/server/framework.rb', line 304

def clean_up!( &block )
    old_clean_up!( true )

    if @instances.empty?
        block.call if block_given?
        return
    end

    ::EM::Iterator.new( @instances, @instances.size ).map( proc {
        |instance, iter|
        instance_conn = connect_to_instance( instance )

        instance_conn.framework.clean_up! {
            instance_conn.framework.get_plugin_store {
                |res|
                iter.return( !res.rpc_exception? ?  res : nil )
            }
        }

    }, proc {
        |results|
        results.compact!
        results << get_plugin_store
        update_plugin_results!( results )
        block.call
    })
end

#get_plugin_store ⇒ Hash

Returns the results of the plug-ins.

Returns:

  • (Hash)

    plugin name => result



# File 'lib/arachni/rpc/server/framework.rb', line 75

def get_plugin_store
    @plugin_store
end

#high_performance? ⇒ Bool

Returns true if running in HPG (High Performance Grid) mode and we’re the master, false otherwise.

Returns:

  • (Bool)


# File 'lib/arachni/rpc/server/framework.rb', line 140

def high_performance?
    @opts.grid_mode == 'high_performance'
end

#issues ⇒ Array<Arachni::Issue>

Returns an array containing summaries of all discovered issues (i.e. no variations).

Returns:

  • (Array<Arachni::Issue>)


# File 'lib/arachni/rpc/server/framework.rb', line 575

def issues
    audit_store.issues.map {
        |issue|
        tmp_issue = issue.deep_clone
        tmp_issue.variations = []
        tmp_issue
    }
end
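
The listing above copies each issue before clearing its variations, so the stored issues keep theirs. A minimal sketch of the same idea follows, using plain hashes as stand-ins for `Arachni::Issue` objects and a Marshal round-trip as a generic deep copy. Both are assumptions made for illustration; the original relies on `#deep_clone`, and `summarize` is a hypothetical name.

```ruby
# Returns deep copies of the given issues with their variations emptied.
# Issues are modelled as plain hashes here (assumption for illustration).
def summarize( issues )
    issues.map do |issue|
        # generic deep copy, standing in for #deep_clone
        copy = Marshal.load( Marshal.dump( issue ) )
        copy['variations'] = []
        copy
    end
end

stored  = [ { 'name' => 'XSS', 'variations' => ['v1', 'v2'] } ]
summary = summarize( stored )

p summary.first['variations']  # => []
p stored.first['variations']   # => ["v1", "v2"]
```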

#issues_as_hash ⇒ Array<Hash>

Returns the return value of #issues as an array of hashes.

Returns:

  • (Array<Hash>)

See Also:

  • #issues



# File 'lib/arachni/rpc/server/framework.rb', line 591

def issues_as_hash
    issues.map { |i| i.to_h }
end

#lsplug ⇒ Array<Hash>

Returns an array containing information about all available plug-ins.

Returns:

  • (Array<Hash>)



# File 'lib/arachni/rpc/server/framework.rb', line 117

def lsplug
    plug_info = []

    super.each {
        |plugin|

        plugin[:options] = [plugin[:options]].flatten.compact.map {
            |opt|
            opt.to_h.merge( 'type' => opt.type )
        }

        plug_info << plugin
    }

    return plug_info
end
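
The `[plugin[:options]].flatten.compact` idiom in the listing above normalises whatever a plug-in declares (nothing, a single option, or an array of options) into a flat array before mapping over it. A minimal sketch of just that step follows; `normalize_options` is a hypothetical name used for illustration.

```ruby
# Wraps the value in an array, flattens nested arrays and drops nils, so
# that callers can always iterate over the result.
def normalize_options( options )
    [ options ].flatten.compact
end

p normalize_options( nil )              # => []
p normalize_options( 'a' )              # => ["a"]
p normalize_options( ['a', nil, 'b'] )  # => ["a", "b"]
```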

#master ⇒ String

Returns the master’s URL.

Returns:

  • (String)



# File 'lib/arachni/rpc/server/framework.rb', line 361

def master
    @master_url
end

#output(&block) ⇒ Object

Returns the merged output of all running instances.

This will probably be wildly out of sync and lack a lot of messages.

It’s here to give the notion of scan progress to the end-user rather than provide an accurate depiction of the actual progress.

Parameters:

  • &block (Proc)

    block to which to pass the result



# File 'lib/arachni/rpc/server/framework.rb', line 375

def output( &block )

    buffer = flush_buffer

    if @instances.empty?
        block.call( buffer )
        return
    end

    ::EM::Iterator.new( @instances, MAX_CONCURRENCY ).map( proc {
        |instance, iter|
        connect_to_instance( instance ).service.output {
            |out|
            iter.return( out )
        }
    }, proc {
        |out|
        block.call( (buffer | out).flatten )
    })
end
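
The merge in the final callback above combines the local buffer with one output array per slave. A minimal sketch of that expression follows, assuming (for illustration only) that each message is a small hash; the actual message format is not shown on this page, and `merge_output` is a hypothetical name.

```ruby
# local         - Array of messages from the local instance
# slave_outputs - Array of Arrays, one message array per slave
def merge_output( local, slave_outputs )
    # Array#| is a set union of the top-level elements; #flatten then
    # unwraps the per-slave arrays into a single list of messages
    ( local | slave_outputs ).flatten
end

local  = [ { 'status' => 'Auditing page 1' } ]
slaves = [
    [ { 'status' => 'Auditing page 2' } ],
    [ { 'info'   => 'Crawl finished' } ]
]

p merge_output( local, slaves ).size  # => 3
```

Note that the union happens before flattening, so it only removes duplicate top-level elements rather than duplicate messages across instances.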

#pause! ⇒ Object

Pauses the running scan on a best effort basis.



# File 'lib/arachni/rpc/server/framework.rb', line 335

def pause!
    super
    ::EM::Iterator.new( @instances, @instances.size ).each {
        |instance, iter|
        connect_to_instance( instance ).framework.pause!{ iter.next }
    }
    return true
end

#progress_data(opts = {}, &block) ⇒ Object

Returns aggregated progress data, limiting the number of calls required to get an accurate depiction of a scan’s progress. The data includes:

  • output messages
  • discovered issues
  • overall statistics
  • overall scan status
  • statistics of all instances individually

Parameters:

  • opts (Hash) (defaults to: {})

    contains info about what data to return:

    • :messages – include output messages

    • :slaves – include slave data

    • :issues – include issue summaries

    Uses an implicit include for the above (i.e. nil will be considered true).

    • :as_hash – if set to true will convert issues to hashes before returning

  • &block (Proc)

    block to which to pass the result



# File 'lib/arachni/rpc/server/framework.rb', line 439

def progress_data( opts= {}, &block )

    if opts[:messages].nil?
        include_messages = true
    else
        include_messages = opts[:messages]
    end

    if opts[:slaves].nil?
        include_slaves = true
    else
        include_slaves = opts[:slaves]
    end

    if opts[:issues].nil?
        include_issues = true
    else
        include_issues = opts[:issues]
    end

    if opts[:as_hash]
        as_hash = true
    else
        as_hash = opts[:as_hash]
    end

    data = {
        'stats'     => {},
        'status'    => status,
        'busy'      => extended_running?
    }

    data['messages']  = flush_buffer if include_messages

    if include_issues
        data['issues'] = as_hash ? issues_as_hash : issues
    end

    data['instances'] = {} if include_slaves

    stats = []
    stat_hash = {}
    stats( true, true ).each {
        |k, v|
        stat_hash[k.to_s] = v
    }

    if @opts.datastore[:dispatcher_url] && include_slaves
        data['instances'][self_url] = stat_hash.dup
        data['instances'][self_url]['url'] = self_url
        data['instances'][self_url]['status'] = status
    end

    stats << stat_hash

    if @instances.empty? || !include_slaves
        data['stats'] = merge_stats( stats )
        data['instances'] = data['instances'].values if include_slaves
        block.call( data )
        return
    end

    ::EM::Iterator.new( @instances, MAX_CONCURRENCY ).map( proc {
        |instance, iter|
        connect_to_instance( instance ).framework.progress_data( opts ) {
            |tmp|
            if !tmp.rpc_exception?
                tmp['url'] = instance['url']
                iter.return( tmp )
            else
                iter.return( nil )
            end
        }
    }, proc {
        |slave_data|

        slave_data.compact!
        slave_data.each {
            |slave|
            data['messages']  |= slave['messages'] if include_messages
            data['issues']    |= slave['issues'] if include_issues

            if include_slaves
                url = slave['url']
                data['instances'][url]           = slave['stats']
                data['instances'][url]['url']    = url
                data['instances'][url]['status'] = slave['status']
            end

            stats << slave['stats']
        }

        if include_slaves
            sorted_data_instances = {}
            data['instances'].keys.sort.each {
                |url|
                sorted_data_instances[url] = data['instances'][url]
            }
            data['instances'] = sorted_data_instances.values
        end

        data['stats'] = merge_stats( stats )
        block.call( data )
    })
end
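
The "implicit include" rule for `:messages`, `:slaves` and `:issues` means that a missing or nil option counts as true and only an explicit false (or other value) overrides it. A minimal sketch of that rule as a standalone helper follows; `implicit_include?` is a hypothetical name and not part of Arachni.

```ruby
# Returns true when the option is absent or nil, otherwise the value
# supplied by the caller.
def implicit_include?( opts, key )
    opts[key].nil? ? true : opts[key]
end

opts = { :messages => false, :issues => nil }

p implicit_include?( opts, :messages )  # => false
p implicit_include?( opts, :issues )    # => true
p implicit_include?( opts, :slaves )    # => true
```

`opts.fetch( key, true )` would not be a drop-in replacement, since it returns nil for a key that is present but set to nil.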

#register_issues(issues, token) ⇒ Bool

Registers an array holding Issue objects with the local instance.

Primarily used by slaves to register issues they find on the spot.

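Parameters:

  • issues (Array<Arachni::Issue>)

    issues to register

  • token (String)

    privileged token, prevents this method from being called by 3rd parties.
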
Returns:

  • (Bool)

    true on success, false on invalid token or if not in HPG mode



# File 'lib/arachni/rpc/server/framework.rb', line 668

def register_issues( issues, token )
    return false if high_performance? && !valid_token?( token )

    @modules.class.register_results( issues )
    return true
end

#report ⇒ Hash

Also known as: audit_store_as_hash, auditstore_as_hash

Returns the results of the audit as a hash.

Returns:

  • (Hash)


# File 'lib/arachni/rpc/server/framework.rb', line 550

def report
    audit_store.to_h
end

#restrict_to_elements!(elements, token = nil) ⇒ Bool

Restricts the scope of the audit to individual elements.

Parameters:

  • elements (Array)

    list of element IDs

  • token (String) (defaults to: nil)

    privileged token, prevents this method from being called by 3rd parties.

Returns:

  • (Bool)

    true on success, false on invalid token



# File 'lib/arachni/rpc/server/framework.rb', line 611

def restrict_to_elements!( elements, token = nil )
    return false if high_performance? && !valid_token?( token )

    ::Arachni::Element::Auditable.restrict_to_elements!( elements )
    return true
end
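
The privileged methods on this page (#restrict_to_elements!, #update_page_queue!, #register_issues) share the same guard: the token is only enforced when the instance is the grid master. A minimal sketch of that guard in isolation follows. Token generation and comparison are assumptions here, since `gen_token` and `valid_token?` are not shown on this page, and `guarded_update` is a hypothetical name.

```ruby
require 'securerandom'

# Appends pages to the queue unless the caller fails the token check.
# The check only applies in high performance (grid master) mode.
def guarded_update( queue, pages, token, local_token, high_performance )
    return false if high_performance && token != local_token

    queue.concat( pages )
    true
end

local_token = SecureRandom.hex( 16 )  # assumption: any unguessable string
queue       = []

p guarded_update( queue, ['/a'], nil, local_token, true )          # => false
p guarded_update( queue, ['/a'], local_token, local_token, true )  # => true
p guarded_update( queue, ['/b'], nil, local_token, false )         # => true
p queue                                                            # => ["/a", "/b"]
```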

#resume! ⇒ Object

Resumes a paused scan right away.



# File 'lib/arachni/rpc/server/framework.rb', line 347

def resume!
    super
    ::EM::Iterator.new( @instances, @instances.size ).each {
        |instance, iter|
        connect_to_instance( instance ).framework.resume!{ iter.next }
    }
    return true
end

#run ⇒ Bool

Starts the audit.

Returns:

  • (Bool)

    false if already running, true otherwise



# File 'lib/arachni/rpc/server/framework.rb', line 149

def run
    # return if we're already running
    return false if extended_running?

    # EventMachine.add_periodic_timer( 1 ) do
        # print "Arachni::RPC::Client::Handler objects: "
        # puts ObjectSpace.each_object( Arachni::RPC::Client::Handler ) {}
#
        # print "Arachni::RPC::Server::Proxy objects: "
        # puts ObjectSpace.each_object( Arachni::RPC::Server::Proxy ) {}
#
        # puts "Active connections: #{::EM.connection_count}"
        # puts '--------------------------------------------'
    # end

    @extended_running = true
    ::EM.defer {

        #
        # if we're in HPG mode do fancy stuff like distributing and balancing workload
        # as well as starting slave instances and deal with some lower level
        # operations of the local instance like running plug-ins etc...
        #
        # otherwise just run the local instance, nothing special...
        #
        if high_performance?

            #
            # We're in HPG (High Performance Grid) mode,
            # things are going to get weird...
            #

            # we'll need to analyze the pages prior to assigning
            # them to each instance at the element level so as to gain
            # more granular control over the assigned workload
            #
            # put simply, we'll need to perform some magic in order
            # to prevent different instances from auditing the same elements
            # and wasting bandwidth
            #
            # for example: search forms, logout links and the like will
            # most likely exist on most pages of the site and since each
            # instance is assigned a set of URLs/pages to audit they will end up
            # with common elements so we have to prevent instances from
            # performing identical checks.
            #
            # interesting note: should previously unseen elements dynamically
            # appear during the audit they will override these restrictions
            # and each instance will audit them at will.
            #
            pages = ::Arachni::Database::Hash.new

            # prepare the local instance (runs plugins and starts the timer)
            prepare

            # we need to take our cues from the local framework as some
            # plug-ins may need the system to wait for them to finish
            # before moving on.
            sleep( 0.2 ) while paused?

            # start the crawl and extract all paths
            Arachni::Spider.new( @opts ).run {
                |page|
                @override_sitemap << page.url
                pages[page.url] = page
            }
            @crawling_done = true

            # the plug-ins may have updated the framework page queue
            # so we need to distribute these pages as well
            page_a = []
            page_q = @page_queue
            while !page_q.empty? && page = page_q.pop
                page_a << page
                pages[page.url] = page
            end

            # get the Dispatchers with unique Pipe IDs
            # in order to take advantage of line aggregation
            prefered_dispatchers {
                |pref_dispatchers|

                # split the URLs of the pages in equal chunks
                chunks    = split_urls( pages.keys, pref_dispatchers )
                chunk_cnt = chunks.size

                if chunk_cnt > 0
                    # split the page array into chunks that will be distributed
                    # across the instances
                    page_chunks = page_a.chunk( chunk_cnt )

                    # assign us our fair share of plug-in discovered pages
                    update_page_queue!( page_chunks.pop, @local_token )

                    # remove duplicate elements across the (per instance) chunks
                    # while spreading them out evenly
                    elements = distribute_elements( chunks, pages )

                    # empty out the Hash and remove temporary files
                    pages.clear

                    # restrict the local instance to its assigned elements
                    restrict_to_elements!( elements.pop, @local_token )

                    # set the URLs to be audited by the local instance
                    @opts.restrict_paths = chunks.pop

                    chunks.each_with_index {
                        |chunk, i|

                        # spawn a remote instance, assign a chunk of URLs
                        # and elements to it and run it
                        spawn( chunk, page_chunks[i], elements[i], pref_dispatchers[i] ) {
                            |inst|
                            @instances << inst
                        }
                    }
                end

                # start the local instance
                Thread.new {
                    # ap 'AUDITING'
                    audit

                    # ap 'OLD CLEAN UP'
                    old_clean_up!

                    # ap 'DONE'
                    @extended_running = false
                }
            }
        else
            # start the local instance
            Thread.new {
                # ap 'AUDITING'
                super
                # ap 'DONE'
                @extended_running = false
            }
        end
    }

    return true
end
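
In the HPG branch above, the crawled URLs are split into chunks, the master keeps the last chunk for itself (`chunks.pop`) and each remaining chunk goes to a spawned slave. The sketch below illustrates that distribution pattern only. `split_urls` is not shown on this page, so `split_evenly` is a hypothetical stand-in built on `each_slice`, not Arachni's actual algorithm.

```ruby
# Hypothetical splitter: divides urls into at most instance_count chunks
# of roughly equal size.
def split_evenly( urls, instance_count )
    return [] if urls.empty? || instance_count < 1

    per_chunk = ( urls.size / instance_count.to_f ).ceil
    urls.each_slice( per_chunk ).to_a
end

urls   = (1..7).map { |i| "http://example.com/page#{i}" }
chunks = split_evenly( urls, 3 )

# as in #run: the local instance takes the last chunk, slaves get the rest
local_chunk  = chunks.pop
slave_chunks = chunks

p local_chunk.size            # => 1
p slave_chunks.map( &:size )  # => [3, 3]
```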

#serialized_auditstore ⇒ String

Returns YAML representation of #auditstore.

Returns:

  • (String)



# File 'lib/arachni/rpc/server/framework.rb', line 559

def serialized_auditstore
    audit_store.to_yaml
end

#serialized_report ⇒ String

Returns YAML representation of #report.

Returns:

  • (String)



# File 'lib/arachni/rpc/server/framework.rb', line 566

def serialized_report
    audit_store.to_h.to_yaml
end
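
Since the serialized report is a hash of plain values dumped to YAML, a client can turn it back into a hash with the standard YAML library. A minimal sketch of that round-trip follows; the sample hash is made up for illustration and is much smaller than a real report, and `serialize_report` is a hypothetical name.

```ruby
require 'yaml'

# Mirrors audit_store.to_h.to_yaml for an arbitrary hash.
def serialize_report( report_hash )
    report_hash.to_yaml
end

# stand-in for the real report (assumption for illustration)
report = { 'version' => '0.1', 'issues' => [ { 'name' => 'XSS' } ] }

serialized = serialize_report( report )
restored   = YAML.load( serialized )

p restored == report  # => true
```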

#set_master(url, token) ⇒ Bool

Sets the URL and authentication token required to connect to our master.

Parameters:

  • url (String)

    master’s URL in ‘hostname:port’ form

  • token (String)

    master’s authentication token

Returns:

  • (Bool)

    true on success, false if this is the master of the HPG (in which case this is not applicable).



# File 'lib/arachni/rpc/server/framework.rb', line 627

def set_master( url, token )
    return false if high_performance?

    @master_url = url
    @master = connect_to_instance( { 'url' => url, 'token' => token } )

    @modules.class.do_not_store!
    @modules.class.on_register_results {
        |results|
        report_issues_to_master( results )
    }

    return true
end

#status ⇒ String

Returns the status of the instance as a string.

Possible values are:

  • crawling
  • paused
  • done
  • busy

Returns:

  • (String)



# File 'lib/arachni/rpc/server/framework.rb', line 407

def status
    if( !@crawling_done && master.empty? && high_performance?) ||
        ( master.empty? && !high_performance? && stats[:current_page].empty? )
        return 'crawling'
    elsif paused?
        return 'paused'
    elsif !extended_running?
        return 'done'
    else
        return 'busy'
    end
end
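
The order of the checks in the listing above determines which status wins when several conditions hold at once, for example a paused scan that is still crawling reports 'crawling'. The sketch below restates the same decision chain as a pure function over a state hash, which makes that precedence easy to experiment with. The hash keys and the name `scan_status` are hypothetical and stand in for the instance state used by the original.

```ruby
# state keys (all hypothetical stand-ins):
#   :master_url, :high_performance, :crawling_done,
#   :current_page, :paused, :running
def scan_status( state )
    no_master = state[:master_url].empty?

    if no_master && state[:high_performance] && !state[:crawling_done]
        'crawling'
    elsif no_master && !state[:high_performance] && state[:current_page].empty?
        'crawling'
    elsif state[:paused]
        'paused'
    elsif !state[:running]
        'done'
    else
        'busy'
    end
end

master = { :master_url => '', :high_performance => true, :crawling_done => nil,
           :current_page => '', :paused => false, :running => true }

p scan_status( master )                                   # => "crawling"
p scan_status( master.merge( :crawling_done => true ) )   # => "busy"
```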

#update_page_queue!(pages, token = nil) ⇒ Bool

Updates the page queue with the provided pages.

Parameters:

  • pages (Array)

    list of pages

  • token (String) (defaults to: nil)

    privileged token, prevents this method from being called by 3rd parties.

Returns:

  • (Bool)

    true on success, false on invalid token



# File 'lib/arachni/rpc/server/framework.rb', line 651

def update_page_queue!( pages, token = nil )
    return false if high_performance? && !valid_token?( token )
    pages.each { |page| @page_queue << page }
    return true
end