Method: Wgit::DSL#index_www

Defined in:
lib/wgit/dsl.rb

#index_www(max_sites: -1, max_data: 1_048_576_000) ⇒ Object

Indexes the World Wide Web using Wgit::Indexer#index_www underneath.

Parameters:

  • max_sites (Integer) (defaults to: -1)

    The number of separate, whole websites to crawl before the method exits. Defaults to -1, which means the crawl continues until it is stopped manually (e.g. with Ctrl+C).

  • max_data (Integer) (defaults to: 1_048_576_000)

    The maximum number of bytes that will be scraped from the web (the default of 1_048_576_000 bytes is roughly 1GB). Note that this value is only used to decide when to stop crawling; it is not a guarantee of the maximum amount of data that will be obtained.
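
The note on max_data can be illustrated with a small stand-alone sketch. This is not Wgit's implementation: simulate_index_www and its site_sizes argument are hypothetical names invented here, and the sketch assumes the byte limit is checked before each site is crawled rather than during it, and that a negative max_sites means "no site limit".

```ruby
# Hypothetical model of the two stopping conditions; NOT Wgit's actual code.
# site_sizes is an invented input: the bytes each pretend site yields.
def simulate_index_www(site_sizes, max_sites: -1, max_data: 1_048_576_000)
  crawled = 0
  total_bytes = 0

  site_sizes.each do |bytes|
    # Assumption: a negative max_sites disables the site limit.
    break if !max_sites.negative? && crawled >= max_sites
    # Assumption: the byte limit is only checked between sites.
    break if total_bytes >= max_data

    total_bytes += bytes
    crawled += 1
  end

  { sites: crawled, bytes: total_bytes }
end

# Four pretend sites of 400 bytes each against a 1_000 byte limit.
overshoot = simulate_index_www([400, 400, 400, 400], max_data: 1_000)
limited   = simulate_index_www([400, 400, 400, 400], max_sites: 2)
```

Under these assumptions, overshoot stops after the third site with 1_200 bytes collected, which is more than the 1_000 byte limit, while limited stops after two sites regardless of size.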



# File 'lib/wgit/dsl.rb', line 183

def index_www(max_sites: -1, max_data: 1_048_576_000)
  indexer = Wgit::Indexer.new(get_db, get_crawler)

  indexer.index_www(max_sites:, max_data:)
end