Class: Arachni::HTTP

Inherits:
Object
Includes:
Mixins::Observable, Module::Output, Utilities, Singleton
Defined in:
lib/arachni/http.rb,
lib/arachni/http/cookie_jar.rb

Overview

Provides a system-wide, simple and high-performance HTTP interface.

Author:

Defined Under Namespace

Classes: CookieJar

Constant Summary

MAX_CONCURRENCY = 20

Default maximum concurrency for HTTP requests.

REDIRECT_LIMIT = 20

Default maximum redirect limit.

USER_AGENT = 'Arachni/v'

Default user agent; the current Arachni version is appended to it.

MAX_QUEUE_SIZE = 5000

Upper bound on the size of the request queue. If the queue grows beyond this, the queued requests are run to drain it.
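The queue cap described for MAX_QUEUE_SIZE can be pictured with a small stand-in. This is only a sketch: BoundedQueue, its tiny cap of 3 and the `>=` comparison are assumptions made for illustration, and the real queueing code is not shown on this page.

```ruby
# Hypothetical stand-in for the request queue; not Arachni's implementation.
class BoundedQueue
  # Kept tiny for demonstration; the documented constant is 5000.
  MAX_QUEUE_SIZE = 3

  attr_reader :flushes

  def initialize
    @pending = []
    @flushes = 0
  end

  # Queue a request and drain the queue once the cap is reached.
  def queue(request)
    @pending << request
    run if @pending.size >= MAX_QUEUE_SIZE
    self
  end

  # "Run" the pending requests; here that just empties the queue.
  def run
    @pending.clear
    @flushes += 1
  end

  def size
    @pending.size
  end
end

q = BoundedQueue.new
4.times { |i| q.queue("GET /page/#{i}") }
puts q.flushes # automatic flushes triggered so far
puts q.size    # requests still pending after the flush
```

Draining on a size threshold keeps memory bounded when a caller queues many requests before calling run explicitly.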

Instance Attribute Summary

Class Method Summary

Instance Method Summary

Methods included from Mixins::Observable

#method_missing

Methods included from Utilities

#cookie_encode, #cookies_from_document, #cookies_from_file, #cookies_from_response, #exception_jail, #exclude_path?, #extract_domain, #form_decode, #form_encode, #form_parse_request_body, #forms_from_document, #forms_from_response, #get_path, #hash_keys_to_str, #html_decode, #html_encode, #include_path?, #links_from_document, #links_from_response, #normalize_url, #page_from_response, #page_from_url, #parse_query, #parse_set_cookie, #parse_url_vars, #path_in_domain?, #path_too_deep?, #remove_constants, #seed, #skip_path?, #to_absolute, #uri_decode, #uri_encode, #uri_parse, #uri_parser, #url_sanitize

Methods included from Module::Output

#fancy_name, #print_bad, #print_debug, #print_error, #print_info, #print_line, #print_ok, #print_status, #print_verbose

Methods included from UI::Output

#debug?, #debug_off, #debug_on, #disable_only_positives, #flush_buffer, #mute, #muted?, old_reset_output_options, #only_positives, #only_positives?, #print_bad, #print_debug, #print_debug_backtrace, #print_debug_pp, #print_error, #print_error_backtrace, #print_info, #print_line, #print_ok, #print_status, #print_verbose, #reroute_to_file, #reroute_to_file?, reset_output_options, #set_buffer_cap, #uncap_buffer, #unmute, #verbose, #verbose?

Constructor Details

#initialize ⇒ HTTP

Returns a new instance of HTTP.



# File 'lib/arachni/http.rb', line 82

def initialize
    reset
end

Dynamic Method Handling

This class handles dynamic methods through the method_missing method of the class Arachni::Mixins::Observable.

Instance Attribute Details

#cookie_jar ⇒ CookieJar (readonly)

Returns:

  • (CookieJar)



# File 'lib/arachni/http.rb', line 62

def cookie_jar
  @cookie_jar
end

#curr_res_cnt ⇒ Integer (readonly)

Returns amount of responses received for the running requests (of the current burst).

Returns:

  • (Integer)

    amount of responses received for the running requests (of the current burst)



# File 'lib/arachni/http.rb', line 77

def curr_res_cnt
  @curr_res_cnt
end

#curr_res_time ⇒ Integer (readonly)

Returns sum of the response times of the running requests (of the current burst).

Returns:

  • (Integer)

    sum of the response times of the running requests (of the current burst)



# File 'lib/arachni/http.rb', line 74

def curr_res_time
  @curr_res_time
end

#headers ⇒ Hash (readonly)

Returns default headers for each request.

Returns:

  • (Hash)

    default headers for each request



# File 'lib/arachni/http.rb', line 59

def headers
  @headers
end

#request_count ⇒ Integer (readonly)

Returns amount of performed requests.

Returns:

  • (Integer)

    amount of performed requests



# File 'lib/arachni/http.rb', line 65

def request_count
  @request_count
end

#response_count ⇒ Integer (readonly)

Returns amount of received responses.

Returns:

  • (Integer)

    amount of received responses



# File 'lib/arachni/http.rb', line 68

def response_count
  @response_count
end

#time_out_count ⇒ Integer (readonly)

Returns amount of timed-out requests.

Returns:

  • (Integer)

    amount of timed-out requests



# File 'lib/arachni/http.rb', line 71

def time_out_count
  @time_out_count
end

#trainer ⇒ Arachni::Module::Trainer (readonly)



# File 'lib/arachni/http.rb', line 80

def trainer
  @trainer
end

#url ⇒ String (readonly)

Returns framework seed/target URL.

Returns:

  • (String)

    framework seed/target URL



# File 'lib/arachni/http.rb', line 56

def url
  @url
end

Class Method Details

.method_missing(sym, *args, &block) ⇒ Object



# File 'lib/arachni/http.rb', line 587

def self.method_missing( sym, *args, &block )
    instance.send( sym, *args, &block )
end
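Because the class includes Singleton and forwards unknown class-level calls to the instance, methods can be called directly on the class. The sketch below shows that delegation pattern in isolation; TinyHTTP and its get method are hypothetical stand-ins, not part of Arachni.

```ruby
require 'singleton'

# Hypothetical stand-in for Arachni::HTTP; only the delegation is shown.
class TinyHTTP
  include Singleton

  attr_reader :request_count

  def initialize
    @request_count = 0
  end

  def get(url)
    @request_count += 1
    "GET #{url}"
  end

  # Forward unknown class-level calls to the single instance.
  def self.method_missing(sym, *args, &block)
    instance.send(sym, *args, &block)
  end

  # Keep respond_to? consistent with the forwarding above.
  def self.respond_to_missing?(sym, include_private = false)
    instance.respond_to?(sym, include_private) || super
  end
end

puts TinyHTTP.get('http://example.com/') # same as TinyHTTP.instance.get(...)
puts TinyHTTP.request_count
```

Defining respond_to_missing? alongside method_missing is a general Ruby convention; whether the real class does so is not shown in the listing above.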

Instance Method Details

#abort ⇒ Object

Aborts the running requests on a best-effort basis.



# File 'lib/arachni/http.rb', line 205

def abort
    exception_jail { @hydra.abort }
end

#after_run(&block) ⇒ Arachni::HTTP

Registers a block to be called when the next hydra run finishes. The block is removed once it has been called.

Returns:

  • (Arachni::HTTP)



# File 'lib/arachni/http.rb', line 253

def after_run( &block )
    @after_run << block
    self
end

#after_run_persistent(&block) ⇒ Arachni::HTTP

Like #after_run, but the block is not removed after it has run.

Returns:

  • (Arachni::HTTP)



# File 'lib/arachni/http.rb', line 263

def after_run_persistent( &block )
    add_after_run_persistent( &block )
    self
end
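The difference between the two kinds of callback can be sketched without the rest of the class. HookRunner below is a hypothetical stand-in; it stores persistent blocks in a plain array, whereas the real class goes through add_after_run_persistent, whose implementation is not shown on this page.

```ruby
# Hypothetical stand-in showing one-shot versus persistent callbacks.
class HookRunner
  def initialize
    @after_run  = []
    @persistent = []
  end

  # One-shot: cleared after the next run.
  def after_run(&block)
    @after_run << block
    self
  end

  # Persistent: called after every run.
  def after_run_persistent(&block)
    @persistent << block
    self
  end

  def run
    @after_run.each(&:call)
    @after_run.clear
    @persistent.each(&:call)
    true
  end
end

log    = []
runner = HookRunner.new
runner.after_run { log << :once }.after_run_persistent { log << :always }

runner.run
runner.run
p log # :once appears a single time, :always once per run
```

Returning self from both registration methods is what allows the chained call.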

#average_res_time ⇒ Integer

Returns average response time for the running requests (i.e. the current burst).

Returns:

  • (Integer)

    average response time for the running requests (i.e. the current burst)



# File 'lib/arachni/http.rb', line 216

def average_res_time
    return 0 if @curr_res_cnt == 0
    @curr_res_time / @curr_res_cnt
end

#burst_runtime ⇒ Integer

Returns amount of time (in seconds) that the current burst has been running.

Returns:

  • (Integer)

    amount of time (in seconds) that the current burst has been running



# File 'lib/arachni/http.rb', line 210

def burst_runtime
    @burst_runtime.to_i > 0 ?
        @burst_runtime : Time.now - (@burst_runtime_start || Time.now)
end

#cookie(url = @url, opts = { }, &block) ⇒ Typhoeus::Request

Gets a URL with cookies and URL variables.

Parameters:

  • url (URI) (defaults to: @url)

    URL to GET

  • opts (Hash) (defaults to: { })

    request options

    • :params => cookies || {}

    • :train => force Arachni to analyze the HTML code || false

    • :async => make the request async? || true

    • :headers => HTTP request headers || {}

  • block (Block)

    callback to be passed the response

Returns:

  • (Typhoeus::Request)



# File 'lib/arachni/http.rb', line 430

def cookie( url = @url, opts = { }, &block )
    opts[:cookies] = (opts[:params] || {}).dup
    opts[:params]  = nil
    request( url, opts, &block )
end
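The listing above moves opts[:params] into opts[:cookies] before delegating to #request. A minimal sketch of that remapping follows; the request method here is a hypothetical stub that only echoes its arguments.

```ruby
# Hypothetical stub standing in for #request; it echoes what it was given.
def request(url, opts)
  { url: url, opts: opts }
end

# Same remapping as the listing above: params become cookies.
def cookie(url, opts = {})
  opts[:cookies] = (opts[:params] || {}).dup
  opts[:params]  = nil
  request(url, opts)
end

caller_opts = { params: { 'session' => 'abc' }, train: true }
result      = cookie('http://example.com/', caller_opts)

p result[:opts][:cookies] # the former :params
p caller_opts[:params]    # the caller's own hash was modified too
```

Note that the options hash is modified in place, so a caller that reuses the same hash for a later request will find :params set to nil.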

#cookies ⇒ Array<Arachni::Element::Cookie>

Returns all cookies in the jar.

Returns:

  • (Array<Arachni::Element::Cookie>)

    all cookies in the jar



# File 'lib/arachni/http.rb', line 244

def cookies
    @cookie_jar.cookies
end

#curr_res_per_second ⇒ Integer

Returns responses/second for the running requests (i.e. the current burst).

Returns:

  • (Integer)

    responses/second for the running requests (i.e. the current burst)



# File 'lib/arachni/http.rb', line 222

def curr_res_per_second
    if @curr_res_cnt > 0 && burst_runtime > 0
        return (@curr_res_cnt / burst_runtime).to_i
    end
    0
end
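The arithmetic behind #average_res_time and #curr_res_per_second can be checked in isolation. The free functions below are hypothetical and take the counters as arguments instead of reading instance variables. Whether the average is truncated depends on whether the accumulated time is an Integer or a Float, which this page does not show.

```ruby
# Hypothetical free functions mirroring the arithmetic of the listings above.
def average_res_time(total_time, count)
  return 0 if count == 0 # guard against division by zero
  total_time / count
end

def res_per_second(count, runtime)
  return (count / runtime).to_i if count > 0 && runtime > 0
  0
end

puts average_res_time(0, 0)   # no responses yet
puts average_res_time(9, 4)   # Integer operands truncate
puts average_res_time(9.0, 4) # Float operands keep the fraction
puts res_per_second(30, 4.0)  # always truncated by to_i
puts res_per_second(30, 0)    # burst has not started
```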

#custom_404?(res, &block) ⇒ Boolean

Checks whether or not the provided response is a custom 404 page.

Parameters:

  • res (Typhoeus::Response)

    the response to check

  • block (Block)

    to be passed true or false depending on the result

Returns:

  • (Boolean)


# File 'lib/arachni/http.rb', line 523

def custom_404?( res, &block )
    precision = 2

    @_404 ||= {}
    path  = get_path( res.effective_url )
    @_404[path] ||= []

    uri = uri_parse( res.effective_url )
    trv_back = File.dirname( uri.path )
    trv_back_url = uri.scheme + '://' +  uri.host + ':' + uri.port.to_s + trv_back
    trv_back_url += '/' if trv_back_url[-1] != '/'

    # 404 probes
    generators = [
        # get a random path with an extension
        proc{ path + random_string + '.' + random_string[0..precision] },

        # get a random path without an extension
        proc{ path + random_string },

        # move up a dir and get a random file
        proc{ trv_back_url + random_string },

        # move up a dir and get a random file with an extension
        proc{ trv_back_url + random_string + '.' + random_string[0..precision] },

        # get a random directory
        proc{ path + random_string + '/' }
    ]

    @_404_gathered ||= BloomFilter.new

    gathered = 0
    body = res.body

    if !@_404_gathered.include?( path )
        generators.each.with_index do |generator, i|
            @_404[path][i] ||= {}

            precision.times {
                get( generator.call, follow_location: true ) do |c_res|
                    gathered += 1

                    if gathered == generators.size * precision
                        @_404_gathered << path

                        # save the hash of the refined responses, no sense
                        # in wasting space
                        @_404[path].each { |c404| c404['rdiff'] = c404['rdiff'].hash }

                        block.call is_404?( path, body )
                    else
                        @_404[path][i]['body'] ||= c_res.body
                        @_404[path][i]['rdiff'] = @_404[path][i]['body'].rdiff( c_res.body )
                    end
                end
            }
        end
    else
        block.call is_404?( path, body )
    end
    nil
end
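The fingerprinting idea in the listing above can be illustrated on plain strings: request paths that cannot exist, keep only the part of the body that stays the same between probes, and compare later bodies against that stable part. Everything below is an assumption made for illustration. The rdiff helper only approximates String#rdiff on a whitespace-word level, and the comparison in custom_404? is a guess at what is_404? does, since neither is shown on this page.

```ruby
require 'set'

# Hypothetical approximation of String#rdiff: keep the words of +a+ that
# also occur in +b+, i.e. the part that did not change between the two.
def rdiff(a, b)
  other = b.split(' ').to_set
  a.split(' ').select { |word| other.include?(word) }.join(' ')
end

# Assumed comparison: a body is a custom 404 if refining it against the
# stored probe body yields the stored fingerprint.
def custom_404?(body, probe_body, fingerprint)
  rdiff(probe_body, body).hash == fingerprint
end

# Two probe responses for random paths that should not exist.
probe_a = 'Not found: /q8zk1 does not exist'
probe_b = 'Not found: /m2xv7 does not exist'

# Only the hash of the stable part is kept, as in the listing.
fingerprint = rdiff(probe_a, probe_b).hash

puts custom_404?('Not found: /login.bak does not exist', probe_a, fingerprint)
puts custom_404?('Welcome back, please log in', probe_a, fingerprint)
```

Comparing the stable part instead of the whole body is what lets the check tolerate 404 pages that echo the requested path.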

#get(url = @url, opts = { }, &block) ⇒ Typhoeus::Request

Gets a URL, passing the provided query parameters.

Parameters:

  • url (URI) (defaults to: @url)

    URL to GET

  • opts (Hash) (defaults to: { })

    request options

    • :params => request parameters || {}

    • :train => force Arachni to analyze the HTML code || false

    • :async => make the request async? || true

    • :headers => HTTP request headers || {}

    • :follow_location => follow redirects || false

  • block (Block)

    callback to be passed the response

Returns:

  • (Typhoeus::Request)



# File 'lib/arachni/http.rb', line 375

def get( url = @url, opts = { }, &block )
    request( url, opts, &block )
end

#header(url = @url, opts = { }, &block) ⇒ Typhoeus::Request

Gets a URL with optional URL variables and modified headers.

Parameters:

  • url (URI) (defaults to: @url)

    URL to GET

  • opts (Hash) (defaults to: { })

    request options

    • :params => headers || {}

    • :train => force Arachni to analyze the HTML code || false

    • :async => make the request async? || true

  • block (Block)

    callback to be passed the response

Returns:

  • (Typhoeus::Request)



# File 'lib/arachni/http.rb', line 449

def header( url = @url, opts = { }, &block )
    opts[:headers] = (opts[:params] || {}).dup
    opts[:params]  = nil
    request( url, opts, &block )
end

#max_concurrency ⇒ Integer

Returns current maximum concurrency of HTTP requests.

Returns:

  • (Integer)

    current maximum concurrency of HTTP requests



# File 'lib/arachni/http.rb', line 239

def max_concurrency
    @hydra.max_concurrency
end

#max_concurrency=(concurrency) ⇒ Object

Sets the maximum concurrency of HTTP requests.

Parameters:

  • concurrency (Integer)


# File 'lib/arachni/http.rb', line 234

def max_concurrency=( concurrency )
    @hydra.max_concurrency = concurrency
end

#on_new_cookies(&block) ⇒ Object

Parameters:

  • block (Block)

    to be passed the new cookies and the response that set them



# File 'lib/arachni/http.rb', line 513

def on_new_cookies( &block )
    add_on_new_cookies( &block )
end

#page=(page) ⇒ Object

Sets the current working page, passes it to the #trainer and updates the #cookie_jar using the page’s cookiejar.

Parameters:



# File 'lib/arachni/http.rb', line 176

def page=( page )
    trainer.page = page
    # update the cookies
    update_cookies( page.cookiejar ) if !page.cookiejar.empty?
    page
end

#parse_and_set_cookies(res) ⇒ Object

Extracts cookies from an HTTP response and updates the cookie jar.

It also executes callbacks added with “add_on_new_cookies( &block )”.

Parameters:



# File 'lib/arachni/http.rb', line 500

def parse_and_set_cookies( res )
    cookies = Cookie.from_response( res )
    update_cookies( cookies )

    # update framework cookies
    Options.cookies = cookies

    call_on_new_cookies( cookies, res )
end
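The flow of #parse_and_set_cookies together with #on_new_cookies can be sketched with plain hashes: cookies are extracted from the response, merged into the jar, and every registered block is called with the new cookies and the response. CookieNotifier, the Hash-based response and the simplified Set-Cookie parsing are all hypothetical; the real code uses Cookie.from_response and the Observable mixin.

```ruby
# Hypothetical stand-in; cookies are name => value pairs in a Hash.
class CookieNotifier
  attr_reader :jar

  def initialize
    @jar            = {}
    @on_new_cookies = []
  end

  # Register a block to receive new cookies and the response that set them.
  def on_new_cookies(&block)
    @on_new_cookies << block
  end

  # +res+ is a Hash with an optional 'Set-Cookie' array (an assumption).
  def parse_and_set_cookies(res)
    cookies = res.fetch('Set-Cookie', []).map do |header|
      header.split(';').first.split('=', 2) # drop attributes like Path
    end.to_h

    @jar.update(cookies)
    @on_new_cookies.each { |callback| callback.call(cookies, res) }
  end
end

http = CookieNotifier.new
http.on_new_cookies { |cookies, _res| puts "new cookies: #{cookies.keys.join(', ')}" }
http.parse_and_set_cookies('Set-Cookie' => ['sid=abc; Path=/', 'lang=en'])
p http.jar
```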

#post(url = @url, opts = { }, &block) ⇒ Typhoeus::Request

Posts a form to a URL with the provided query parameters.

Parameters:

  • url (URI) (defaults to: @url)

    URL to POST

  • opts (Hash) (defaults to: { })

    request options

    • :params => request parameters || {}

    • :train => force Arachni to analyze the HTML code || false

    • :async => make the request async? || true

    • :headers => HTTP request headers || {}

  • block (Block)

    callback to be passed the response

Returns:

  • (Typhoeus::Request)



# File 'lib/arachni/http.rb', line 393

def post( url = @url, opts = { }, &block )
    request( url, opts.merge( method: :post ), &block )
end
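For POST requests, #request form-encodes each parameter name and value and drops pairs with a nil name or value. The sketch below reproduces that step with URI.encode_www_form_component as a stand-in for form_encode; the exact encoding Arachni produces (for example of spaces) may differ.

```ruby
require 'uri'

# Hypothetical helper: encode names and values, skipping nil entries.
def encode_post_params(params)
  params.each_with_object({}) do |(name, value), encoded|
    next unless name && value
    encoded[URI.encode_www_form_component(name)] =
      URI.encode_www_form_component(value)
  end
end

p encode_post_params('full name' => 'A & B', 'skipped' => nil)
```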

#request(url = @url, opts = {}, &block) ⇒ Typhoeus::Request

Makes a generic request.

Parameters:

  • url (URI) (defaults to: @url)
  • opts (Hash) (defaults to: {})
  • block (Block)

    callback

Returns:

  • (Typhoeus::Request)



# File 'lib/arachni/http.rb', line 277

def request( url = @url, opts = {}, &block )
    fail 'URL cannot be empty.' if !url

    params    = opts[:params] || {}
    train     = opts[:train]
    timeout   = opts[:timeout]
    cookies   = opts[:cookies] || {}
    async     = opts[:async]
    async     = true if async.nil?
    headers   = opts[:headers] || {}

    update_cookies  = opts[:update_cookies]
    follow_location = opts[:follow_location] || false

    #
    # the exception jail function wraps the block passed to it
    # in exception handling and runs it
    #
    # how cool is Ruby? Seriously....
    #
    exception_jail( false ) {

        if !opts[:no_cookiejar]
            cookies = begin
                @cookie_jar.for_url( url ).inject({}) do |h, c|
                    h[c.name] = c.value
                    h
                end.merge( cookies )
            rescue => e
                print_error "Could not get cookies for URL '#{url}' from Cookiejar (#{e})."
                print_error_backtrace e
                cookies
            end
        end

        headers           = @headers.merge( headers )
        headers['Cookie'] ||= cookies.map { |k, v| "#{cookie_encode( k )}=#{cookie_encode( v )}" }.join( ';' )

        headers.delete( 'Cookie' ) if headers['Cookie'].empty?
        headers.each { |k, v| headers[k] = Header.encode( v ) if v }

        # There are cases where the url already has a query and we also have
        # some params to work with. Some webapp frameworks will break
        # or get confused...plus the url will not be RFC compliant.
        #
        # Thus we need to merge the provided params with the
        # params of the url query and remove the latter from the url.
        cparams = params.dup
        curl    = normalize_url( url ).dup

        if opts[:method] != :post
            begin
                parsed = uri_parse( curl )
                cparams = parse_url_vars( curl ).merge( cparams )
                curl.gsub!( "?#{parsed.query}", '' ) if parsed.query
            rescue
                return
            end
        else
            cparams = cparams.inject( {} ) do |h, (k, v)|
                h[form_encode( k )] = form_encode( v ) if v && k
                h
            end
        end

        opts = {
            headers: headers,
            params:  cparams.empty? ? nil : cparams,
            method:  opts[:method].nil? ? :get : opts[:method],
            body:    opts[:body]
        }.merge( @opts )

        opts[:follow_location] = follow_location if follow_location
        opts[:timeout]         = timeout if timeout

        req = Typhoeus::Request.new( curl, opts )
        req.train if train
        req.update_cookies if update_cookies
        queue( req, async, &block )
        req
    }
end
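Two steps of the listing above are easy to miss: cookies passed per request override the ones from the jar before the Cookie header is built, and for non-POST requests the query string of the URL is merged into the params (explicit params win) and then stripped from the URL. The helpers below are a hypothetical sketch of both steps using only the standard library; they skip the cookie_encode and normalize_url calls of the real method.

```ruby
require 'uri'

# Hypothetical helper: request cookies override jar cookies, then the
# pairs are joined into a single Cookie header value.
def cookie_header(jar_cookies, request_cookies)
  jar_cookies.merge(request_cookies).map { |k, v| "#{k}=#{v}" }.join(';')
end

# Hypothetical helper: fold the URL's own query into the params hash
# (explicit params win) and remove the query from the URL.
def split_query(url, params)
  uri       = URI.parse(url)
  merged    = URI.decode_www_form(uri.query.to_s).to_h.merge(params)
  uri.query = nil
  [uri.to_s, merged]
end

puts cookie_header({ 'sid' => '1', 'lang' => 'en' }, { 'lang' => 'el' })

url, params = split_query('http://example.com/search?q=old&page=2', { 'q' => 'new' })
puts url
p params
```

Merging the two parameter sources avoids sending a URL that carries one query string while the client appends another.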

#reset ⇒ Arachni::HTTP

Re-initializes the singleton.

Returns:

  • (Arachni::HTTP)



# File 'lib/arachni/http.rb', line 91

def reset
    opts = Options

    req_limit = opts.http_req_limit || 20

    hydra_opts = {
        max_concurrency: req_limit,
        method:          :auto
    }

    if opts.url
        parsed_url = uri_parse( opts.url )
        hydra_opts.merge!(
            username: parsed_url.user,
            password: parsed_url.password
        )
    end

    @url = opts.url.to_s
    @url = nil if @url.empty?

    @hydra      = Typhoeus::Hydra.new( hydra_opts )
    @hydra_sync = Typhoeus::Hydra.new( hydra_opts.merge( max_concurrency: 1 ) )

    @hydra.disable_memoization
    @hydra_sync.disable_memoization

    @trainer = Module::Trainer.new( opts )

    opts.user_agent ||= USER_AGENT + VERSION.to_s
    @headers = {
        'Accept' => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
        'User-Agent'    => opts.user_agent
    }
    @headers['From'] = opts.authed_by if opts.authed_by

    @headers.merge!( opts.custom_headers )

    @cookie_jar = CookieJar.new( opts.cookie_jar )
    update_cookies( opts.cookies ) if opts.cookies

    if opts.cookie_string
        cookies = opts.cookie_string.split( ';' ).map do |cookie_pair|
            k, v = *cookie_pair.split( '=', 2 )
            Cookie.new( opts.url.to_s, k.strip => Cookie.decode( v.strip ) )
        end.flatten.compact
        update_cookies( cookies )
    end

    proxy_opts = {}
    proxy_opts = {
        proxy:          "#{opts.proxy_host}:#{opts.proxy_port}",
        proxy_username: opts.proxy_username,
        proxy_password: opts.proxy_password,
        proxy_type:     opts.proxy_type
    } if opts.proxy_host

    opts.redirect_limit ||= REDIRECT_LIMIT
    @opts = {
        follow_location:               false,
        max_redirects:                 opts.redirect_limit,
        disable_ssl_peer_verification: true,
        timeout:                       opts.http_timeout || 50000
    }.merge( proxy_opts )

    @request_count  = 0
    @response_count = 0
    @time_out_count = 0

    @curr_res_time = 0
    @curr_res_cnt  = 0
    @burst_runtime = 0

    @queue_size = 0

    @after_run = []
    self
end
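The cookie_string handling in the listing above splits on ';' and then on the first '=' only, so values that themselves contain '=' survive intact. A minimal sketch with plain hashes follows; parse_cookie_string is a hypothetical helper, and unlike the listing it tolerates a pair without '=' by treating its value as empty.

```ruby
# Hypothetical helper mirroring the split logic of the listing above.
def parse_cookie_string(cookie_string)
  cookie_string.split(';').map do |pair|
    name, value = pair.split('=', 2) # limit 2 keeps '=' inside the value
    [name.strip, value.to_s.strip]   # to_s guards against a missing value
  end.to_h
end

p parse_cookie_string('sid=abc; token=a=b=c; flag')
```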

#run ⇒ Object

Runs all queued requests.



# File 'lib/arachni/http.rb', line 184

def run
    exception_jail {
        @burst_runtime = nil
        hydra_run

        @after_run.each { |block| block.call }
        @after_run.clear

        call_after_run_persistent

        @curr_res_time = 0
        @curr_res_cnt  = 0
        true
    }
rescue SystemExit
    raise
rescue
    nil
end

#sandbox(&block) ⇒ Object

Executes a block under a sandbox.

Cookies or new callbacks set as a result of the block won’t affect the HTTP singleton.

Parameters:

  • block (Block)

Returns:

  • (Object)

    return value of the block



# File 'lib/arachni/http.rb', line 465

def sandbox( &block )
    h = {}
    instance_variables.each do |iv|
        val = instance_variable_get( iv )
        h[iv] = val.deep_clone rescue val.dup rescue val
    end

    hooks = {}
    @__hooks.each { |k, v| hooks[k] = v.dup }

    ret = block.call( self )

    h.each { |iv, val| instance_variable_set( iv, val ) }
    @__hooks = hooks

    ret
end
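The snapshot-and-restore approach of #sandbox can be shown on a toy object. Sandboxed below is hypothetical; it uses a Marshal round trip as the deep copy in place of deep_clone and leaves out the @__hooks handling.

```ruby
# Hypothetical stand-in showing snapshot and restore of instance state.
class Sandboxed
  attr_reader :cookies

  def initialize
    @cookies = { 'sid' => '1' }
  end

  def sandbox
    # Deep-copy every instance variable before running the block.
    snapshot = instance_variables.each_with_object({}) do |iv, saved|
      saved[iv] = Marshal.load(Marshal.dump(instance_variable_get(iv)))
    end

    ret = yield self

    # Put the saved state back, discarding changes made by the block.
    snapshot.each { |iv, val| instance_variable_set(iv, val) }
    ret
  end
end

http   = Sandboxed.new
result = http.sandbox do |h|
  h.cookies['sid'] = 'tampered'
  :done
end

p result       # return value of the block
p http.cookies # state from before the block
```

As in the listing above, nothing is restored if the block raises, because the restore step is not wrapped in an ensure clause.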

#trace(url = @url, opts = { }, &block) ⇒ Typhoeus::Request

Sends an HTTP TRACE request to “url”.

Parameters:

  • url (URI) (defaults to: @url)

URL to send the TRACE request to

  • opts (Hash) (defaults to: { })

    request options

    • :params => request parameters || {}

    • :train => force Arachni to analyze the HTML code || false

    • :async => make the request async? || true

    • :headers => HTTP request headers || {}

  • block (Block)

    callback to be passed the response

Returns:

  • (Typhoeus::Request)



# File 'lib/arachni/http.rb', line 411

def trace( url = @url, opts = { }, &block )
    request( url, opts.merge( method: :trace ), &block )
end

#update_cookies(cookies) ⇒ Object Also known as: set_cookies

Updates the cookie jar with the passed cookies.

Parameters:



# File 'lib/arachni/http.rb', line 488

def update_cookies( cookies )
    @cookie_jar.update( cookies )
end