Class: Infoboxer::Tree::Template

Inherits:
Compound
Includes:
Linkable
Defined in:
lib/infoboxer/tree/template.rb

Overview

Represents a MediaWiki template.

A template is basically a thing with a name, some variables and their values. When a page is displayed in a browser, the wiki engine renders templates into something different; when extracting information with Infoboxer, however, you work with the original templates.

This takes some practice to master, but allows you to do very powerful things. There are many kinds of templates, from purely formatting-related ones (typically little more than small bells and whistles for page appearance, which should be rendered as text) to very information-heavy ones, like infoboxes, from which Infoboxer borrows its name!

To extract information from a template, you basically list its #variables and then use the #fetch method (and its variants, #fetch_hash and #fetch_date) to extract their values.

On variables naming

MediaWiki templates can contain named and unnamed variables. Example:

{{birth date and age|1953|2|19|df=y}}

This is a template with the name "birth date and age", three unnamed variables with values "1953", "2" and "19", and one named variable with name "df" and value "y".

For consistency, Infoboxer treats unnamed variables exactly the same way MediaWiki does: they are considered to have numeric names, which start from 1 and are stored as strings. So, for the template shown above, the following is correct:

template.fetch('1').text == '1953'
template.fetch('2').text == '2'
template.fetch('3').text == '19'
template.fetch('df').text == 'y'
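
The numbering rule can be tried out without Infoboxer itself. The following is a minimal sketch in plain Ruby, assuming a flat template call with no nested templates or links (which the real parser has to handle); `name_variables` is a hypothetical helper and not part of the library.

```ruby
# Hypothetical helper, not part of Infoboxer: assigns names to the
# pipe-separated arguments of a flat template call.
def name_variables(source)
  # Strip the surrounding braces and split off the template name.
  _name, *args = source.delete_prefix('{{').delete_suffix('}}').split('|')

  position = 0
  args.each_with_object({}) do |arg, vars|
    key, value = arg.split('=', 2)
    if value.nil?
      # No "=": an unnamed variable, named by its 1-based position,
      # stored as a String.
      position += 1
      vars[position.to_s] = key
    else
      vars[key] = value
    end
  end
end

vars = name_variables('{{birth date and age|1953|2|19|df=y}}')
vars['1']  # => "1953"
vars['df'] # => "y"
```

Note that the keys are the strings '1', '2' and '3', not integers, which is why the examples above call fetch('1') rather than fetch(1).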

Note also that named variables with simple text values are duplicated in the template node's Node#params, so the following is also correct:

template.params['df'] == 'y'
template.params.has_key?('1') == false

For more advanced topics, such as subclassing templates by name and converting them to inline text, please read the documentation of the Infoboxer::Templates module.

Direct Known Subclasses

Infoboxer::Templates::Base

Instance Attribute Summary

Attributes inherited from Compound

#children

Attributes inherited from Node

#params, #parent

Instance Method Summary

Methods included from Linkable

#url

Methods inherited from Compound

#index_of

Methods inherited from Node

#==, #children, coder, def_readers, #first?, #index, #inspect, #next_siblings, #prev_siblings, #siblings, #text_, #to_s

Methods included from Navigation::Wikipath

#wikipath

Methods included from Navigation::Sections::Node

#in_sections

Methods included from Navigation::Shortcuts::Node

#bold?, #categories, #external_links, #heading?, #headings, #images, #infobox, #infoboxes, #italic?, #lists, #paragraphs, #tables, #templates, #wikilinks

Methods included from Navigation::Lookup::Node

#_lookup, #_lookup_children, #_lookup_next_siblings, #_lookup_parents, #_lookup_prev_sibling, #_lookup_prev_siblings, #_lookup_siblings, #_matches?, #lookup, #lookup_children, #lookup_next_siblings, #lookup_parents, #lookup_prev_sibling, #lookup_prev_siblings, #lookup_siblings, #matches?, #parent?

Constructor Details

#initialize(name, variables = Nodes[]) ⇒ Template

Returns a new instance of Template.



# File 'lib/infoboxer/tree/template.rb', line 116

def initialize(name, variables = Nodes[])
  super(variables, **extract_params(variables))
  @name = name
end

Instance Attribute Details

#name ⇒ String (readonly)

Template name, which determines the structure of its contents.

See also #url, which you can follow to read the template's definition (and, in Wikipedia and many other projects, its documentation).

Returns:

  • (String)


# File 'lib/infoboxer/tree/template.rb', line 106

def name
  @name
end

Instance Method Details

#fetch(*patterns) ⇒ Nodes<Var>

Fetches template variable(s) by name(s) or patterns.

Usage:

argentina.infobox.fetch('leader_title_1')   # => one Var node
argentina.infobox.fetch('leader_title_1',
                        'leader_name_1')    # => two Var nodes
argentina.infobox.fetch(/leader_title_\d+/) # => several Var nodes

Returns:

  • (Nodes<Var>)



# File 'lib/infoboxer/tree/template.rb', line 170

def fetch(*patterns)
  Nodes[*patterns.map { |p| variables.find(name: p) }.flatten]
end
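
The difference between name and pattern lookups can be sketched in plain Ruby. This is a simplified stand-in, not the library's implementation: `Var` and `fetch_vars` are hypothetical, the variable values are made up for illustration, and matching with `===` is an assumption about how a String or Regexp pattern is compared with a variable name.

```ruby
# Hypothetical stand-in for a template variable.
Var = Struct.new(:name, :text)

# Hypothetical lookup: a String pattern matches one exact name,
# a Regexp pattern matches every name it fits.
def fetch_vars(variables, *patterns)
  patterns.flat_map do |pattern|
    variables.select { |var| pattern === var.name }
  end
end

# Made-up sample values.
infobox = [
  Var.new('leader_title_1', 'President'),
  Var.new('leader_name_1', 'Some Name'),
  Var.new('leader_title_2', 'Vice President'),
  Var.new('capital', 'Buenos Aires')
]

fetch_vars(infobox, 'leader_title_1').map(&:name)
# => ["leader_title_1"]
fetch_vars(infobox, /leader_title_\d+/).map(&:name)
# => ["leader_title_1", "leader_title_2"]
```

A name that matches nothing simply contributes no elements, so the result can be shorter than the list of patterns.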

#fetch_date(*patterns) ⇒ Date

Fetches a date from a list of variable names that contain the date components.

(Experimental, subject to change or enhancement.)

Explanation: if you have a template like

{{birth date and age|1953|2|19|df=y}}

...there is a short way to obtain a date from it:

template.fetch_date('1', '2', '3') # => Date.new(1953,2,19)

Returns:

  • (Date)


# File 'lib/infoboxer/tree/template.rb', line 195

def fetch_date(*patterns)
  components = fetch(*patterns)
  components.pop while components.last.nil? && !components.empty?

  if components.empty?
    nil
  else
    Date.new(*components.map { |v| v.to_s.to_i })
  end
end
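
The conversion from textual components to a Date can be sketched with the standard library alone. This is an illustration under stated assumptions, not the method above: `date_from_components` is a hypothetical helper that takes plain strings instead of Var nodes and drops trailing blank components, so that partial dates still produce a result.

```ruby
require 'date'

# Hypothetical helper: builds a Date from textual components, dropping
# trailing blanks so that "year only" or "year and month" still work.
def date_from_components(*components)
  parts = components.map { |c| c.to_s.strip }
  parts.pop while !parts.empty? && parts.last.empty?
  return nil if parts.empty?

  # Date.new fills in missing month and day with 1.
  Date.new(*parts.map(&:to_i))
end

date_from_components('1953', '2', '19') # same as Date.new(1953, 2, 19)
date_from_components('1953', '', '')    # same as Date.new(1953, 1, 1)
date_from_components                    # => nil
```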

#fetch_hash(*patterns) ⇒ Hash<String => Var>

Fetches a hash of {name => variable}, using the same patterns as #fetch.

Returns:

  • (Hash<String => Var>)


# File 'lib/infoboxer/tree/template.rb', line 177

def fetch_hash(*patterns)
  fetch(*patterns).map { |v| [v.name, v] }.to_h
end

#follow ⇒ MediaWiki::Page

Extracts the template source and returns it parsed (or nil, if the template is not found).

NB: Infoboxer does NO variable substitution or other template evaluation. Moreover, it will almost certainly NOT parse template definitions correctly. You should use this method ONLY for "transclusion" templates (parts of content that are included into other pages "as is").

For an example, look at this page's source: each subtable about some region is just a transclusion of a template. This can be processed like:

Infoboxer.wp.get('Tropical and subtropical coniferous forests').
  templates(name: /forests$/).
  follow.tables #.and_so_on

See also Linkable#follow for general notes on following links.

Returns:

  • (MediaWiki::Page)



# File 'lib/infoboxer/tree/template.rb', line 208

#link

Wikilink name of this template's source.



# File 'lib/infoboxer/tree/template.rb', line 234

def link
  # FIXME: super-naive for now, doesn't account for subpages and the like.
  "Template:#{name}"
end

#named_variables ⇒ Object



# File 'lib/infoboxer/tree/template.rb', line 154

def named_variables
  variables.select(&:named?)
end

#text ⇒ Object



# File 'lib/infoboxer/tree/template.rb', line 121

def text
  res = unnamed_variables.map(&:text).join('|')
  res.empty? ? '' : "{#{name}:#{res}}"
end
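
Since this method carries no description, a small sketch may help show what the code above produces. It is a plain-Ruby stand-in, not the library class: `Var` and `template_text` are hypothetical, and the rule that a variable counts as named unless its name is purely numeric is an assumption based on the class overview.

```ruby
# Hypothetical stand-in for a template variable.
Var = Struct.new(:name, :text) do
  # Assumption for this sketch: a variable is "named" unless its
  # name consists only of digits.
  def named?
    name !~ /\A\d+\z/
  end
end

# Mirrors the logic of #text: only unnamed variables are rendered,
# joined with "|" and prefixed with the template name.
def template_text(name, variables)
  res = variables.reject(&:named?).map(&:text).join('|')
  res.empty? ? '' : "{#{name}:#{res}}"
end

variables = [
  Var.new('1', '1953'), Var.new('2', '2'), Var.new('3', '19'), Var.new('df', 'y')
]
template_text('birth date and age', variables)
# => "{birth date and age:1953|2|19}"
template_text('some infobox', [Var.new('df', 'y')])
# => ""
```

In this sketch, a template with only named variables renders as an empty string, and named variables never appear in the text.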

#to_h ⇒ Hash{String => String}

Represents the entire template as a hash of String => String, where keys are variable names and values are the text representations of the variables' contents.

Returns:

  • (Hash{String => String})


# File 'lib/infoboxer/tree/template.rb', line 141

def to_h
  variables.map { |var| [var.name, var.text] }.to_h
end
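
As an illustration of the resulting shape, here is a plain-Ruby sketch using hypothetical stand-ins (`Var` and `template_to_h` are not library names). It assumes each variable exposes a name and a text, as in the code above.

```ruby
# Hypothetical stand-in for a template variable.
Var = Struct.new(:name, :text)

# Every variable, named or unnamed, contributes one name => text pair.
def template_to_h(variables)
  variables.map { |var| [var.name, var.text] }.to_h
end

variables = [
  Var.new('1', '1953'), Var.new('2', '2'), Var.new('3', '19'), Var.new('df', 'y')
]
template_to_h(variables)
# => { '1' => '1953', '2' => '2', '3' => '19', 'df' => 'y' }
```

Because the result is an ordinary Hash, a name that occurs more than once keeps only its last value.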

#to_tree(level = 0) ⇒ Object



# File 'lib/infoboxer/tree/template.rb', line 131

def to_tree(level = 0)
  '  ' * level + "<#{descr}>\n" +
    variables.map { |var| var.to_tree(level + 1) }.join
end

#unnamed_variables ⇒ Nodes<Var>

Returns the list of template variables with numeric names (which MediaWiki templates treat as "unnamed" variables; see the class overview for an explanation).

Returns:

  • (Nodes<Var>)



# File 'lib/infoboxer/tree/template.rb', line 150

def unnamed_variables
  variables.reject(&:named?)
end

#unwrap ⇒ Object



# File 'lib/infoboxer/tree/template.rb', line 126

def unwrap
  unnamed_variables.flat_map(&:children).unwrap
end