Method: Wgit::DSL#extract

Defined in:
lib/wgit/dsl.rb

#extract(var, xpath, opts = {}) {|value, source, type| ... } ⇒ Symbol

Defines an extractor using Wgit::Document.define_extractor underneath.

Parameters:

  • var (Symbol)

    The name of the variable to be initialised, which will contain the extracted content.

  • xpath (String, #call)

    The xpath used to find the element(s) of the webpage. Only used when initializing from HTML.

    Pass a callable object (proc etc.) if you want the xpath value to be derived on Document initialisation (instead of when the extractor is defined). The call method must return a valid xpath String.

  • opts (Hash) (defaults to: {})

    The options to define an extractor with. The options are only used when initializing from HTML, not the database.
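
The difference between passing a String and passing a callable for xpath can be sketched in plain Ruby. This is only an illustration of deferred evaluation, not Wgit's implementation: resolve_xpath is a hypothetical helper, and the element variable stands in for any state that might change between defining the extractor and initialising a Document.

```ruby
# Hypothetical helper, not part of Wgit: resolves an xpath argument that may
# be either a String or any object responding to #call.
def resolve_xpath(xpath)
  resolved = xpath.respond_to?(:call) ? xpath.call : xpath
  raise ArgumentError, 'xpath must resolve to a String' unless resolved.is_a?(String)

  resolved
end

element = 'h1' # state that may change before a document is initialised

static_xpath = "//#{element}"        # interpolated immediately
lazy_xpath   = -> { "//#{element}" } # interpolated only when called

element = 'h2'

puts resolve_xpath(static_xpath) # prints //h1
puts resolve_xpath(lazy_xpath)   # prints //h2
```

The String was fixed at definition time, whereas the lambda picks up the later value because it is only called when the xpath is needed.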

Options Hash (opts):

  • :singleton (Boolean)

    Determines whether the result(s) are returned in an Array. If singleton is true and multiple results are found, only the first result is used. Defaults to true.

  • :text_content_only (Boolean)

    If true, the text content of the Nokogiri result object is used; otherwise the Nokogiri object itself is returned. Defaults to true.
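
As a rough model of how the two options combine, the following plain-Ruby sketch may help. It is an assumption-laden illustration rather than Wgit's code: shape_result is a hypothetical helper, and a small Struct stands in for the Nokogiri objects that an xpath search would return.

```ruby
# Stand-in for a Nokogiri node; only the #content reader matters here.
node = Struct.new(:content)

# Hypothetical helper modelling the two options as documented above.
def shape_result(nodes, singleton: true, text_content_only: true)
  results = text_content_only ? nodes.map(&:content) : nodes
  singleton ? results.first : results
end

nodes = [node.new('First'), node.new('Second')]

p shape_result(nodes)                                   # "First"
p shape_result(nodes, singleton: false)                 # ["First", "Second"]
p shape_result(nodes, text_content_only: false).content # "First" (read from the node itself)
```

With both defaults left in place, the sketch yields a single String, which is the simplest shape to work with when a page is expected to contain only one matching element.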

Yields:

  • The block is executed when a Wgit::Document is initialized, regardless of the source. Use it (optionally) to process the result value.

Yield Parameters:

  • value (Object)

    The result value to be assigned to the new var.

  • source (Wgit::Document, Object)

    The source of the value.

  • type (Symbol)

    The source type, either :document or (DB) :object.

Yield Returns:

  • (Object)

    The return value of the block becomes the new var's value. Return the block's value param unchanged if you only want to inspect it.
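
The block contract described above can be sketched without Wgit. In this illustration, assign_extracted is a hypothetical stand-in for the point at which the extractor assigns its variable, and the Symbols passed as source are placeholders for a real Wgit::Document or database object.

```ruby
# Hypothetical stand-in, not part of Wgit: whatever the block returns is the
# value that would be stored in the new var.
def assign_extracted(value, source, type)
  block_given? ? yield(value, source, type) : value
end

# Transform: strip whitespace, but only for values taken from a document.
cleaned = assign_extracted('  Hello  ', :some_document, :document) do |value, _source, type|
  type == :document ? value.strip : value
end

# Inspect: log the value, then return it unchanged.
inspected = assign_extracted('Hello', :some_record, :object) do |value, _source, type|
  puts "#{type}: #{value.inspect}"
  value
end

p cleaned   # "Hello"
p inspected # "Hello"
```

Under this model, a block whose last expression is the puts call would store nil, since puts returns nil. That is why the inspecting block ends by returning value.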

Returns:

  • (Symbol)

    The given var Symbol if successful.

Raises:

  • (StandardError)

    If the var param isn't valid.



# File 'lib/wgit/dsl.rb', line 43

def extract(var, xpath, opts = {}, &block)
  Wgit::Document.define_extractor(var, xpath, opts, &block)
end