Class: Wikiwhat::Text

Inherits:
Results
Defined in:
lib/wikiwhat/parse.rb

Overview

Extracts portions of text from a Wiki article.

Instance Method Summary

Methods inherited from Results

#content_split, #pull_from_hash

Constructor Details

#initialize(api_return, prop = 'extract') ⇒ Text

Returns a new instance of Text.



# File 'lib/wikiwhat/parse.rb', line 46

def initialize(api_return, prop='extract')
  @request = self.pull_from_hash(api_return, prop)
  if @request.class == Array
    @request = self.pull_from_hash(@request[0], "*")
  end
end
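
The constructor's two-step unwrapping can be sketched in isolation. Here `find_key` is a hypothetical stand-in for the inherited `pull_from_hash`, whose implementation is not shown on this page, and both sample responses are made-up shapes loosely modeled on MediaWiki API output, not real data.

```ruby
# Hypothetical stand-in for Results#pull_from_hash (an assumption, not the
# library's code): depth-first search of nested Hashes/Arrays for a key.
def find_key(data, key)
  return data[key] if data.is_a?(Hash) && data.key?(key)

  children = case data
             when Hash then data.values
             when Array then data
             else []
             end
  children.each do |child|
    found = find_key(child, key)
    return found unless found.nil?
  end
  nil
end

# Mirrors the constructor: look up the property, and if the result is an
# Array (as with revisions), take the "*" entry of its first element.
def unwrap(api_return, prop = 'extract')
  request = find_key(api_return, prop)
  request = find_key(request[0], '*') if request.is_a?(Array)
  request
end

# Illustrative response shapes only.
EXTRACT_RESPONSE = {
  'query' => { 'pages' => { '1' => { 'extract' => '<p>Hello</p>' } } }
}.freeze
REVISION_RESPONSE = {
  'query' => { 'pages' => { '1' => { 'revisions' => [{ '*' => '{{Infobox}} text' }] } } }
}.freeze

puts unwrap(EXTRACT_RESPONSE)               # prints "<p>Hello</p>"
puts unwrap(REVISION_RESPONSE, 'revisions') # prints "{{Infobox}} text"
```

The Array check is what lets the same constructor serve both a plain extract (already a String) and revision content (an Array of Hashes).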

Instance Method Details

#find_header(header) ⇒ Object

Finds all paragraphs under a given heading.

header - the name of the header as a String.

Returns a String. Raises Wikiwhat::WikiwhatError if the header is not on the page.



# File 'lib/wikiwhat/parse.rb', line 87

def find_header(header)
  # Find the requested header
  start = @request.index(header)
  if start
    # Find next instance of the tag.
    end_first_tag = start + @request[start..-1].index("h2") + 3
    # Find the start of the next h2 tag, which marks the end of the section.
    start_next_tag = @request[end_first_tag..-1].index("h2") + end_first_tag - 2
    # Select substring of requested text.
    @request[end_first_tag..start_next_tag]
  else
    raise Wikiwhat::WikiwhatError.new("Sorry, that header isn't on this page.")
  end
end
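
The listing above locates the next "h2" after the header and assumes one always exists, so `index` returns nil (and the addition fails) when the requested header is the last section. A minimal standalone sketch of the same slicing idea with that case handled follows; `section_under` and `SAMPLE_HTML` are hypothetical names used for illustration and are not part of Wikiwhat.

```ruby
# Illustrative HTML, not real article data.
SAMPLE_HTML = '<h2>History</h2><p>Old.</p><h2>Geography</h2><p>Hills.</p>'.freeze

# Return the markup between the requested header's closing </h2> and the
# next <h2 opening tag, or the end of the text for the final section.
def section_under(html, header)
  start = html.index(header)
  raise ArgumentError, "header not found: #{header}" unless start

  # Skip past the closing </h2> of the requested header.
  close = html.index('</h2>', start)
  body_start = close ? close + '</h2>'.length : start + header.length

  # Stop just before the next <h2, or run to the end of the string.
  next_open = html.index('<h2', body_start)
  body_end = next_open ? next_open - 1 : -1
  html[body_start..body_end]
end

puts section_under(SAMPLE_HTML, 'History')   # prints "<p>Old.</p>"
puts section_under(SAMPLE_HTML, 'Geography') # prints "<p>Hills.</p>"
```

Searching for the literal `</h2>` and `<h2` rather than a bare "h2" avoids the fixed +3 and -2 offsets, at the cost of assuming the headings are h2 elements.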

#only_text(string) ⇒ Object

Removes HTML tags from a String

string - a String that contains HTML tags.

Returns the string without HTML tags.



# File 'lib/wikiwhat/parse.rb', line 107

def only_text(string)
  no_html_tags = string.gsub(/<\/?.*?>/,'')
end
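
To see what the pattern above does and does not strip, here is a small standalone sketch. `strip_tags` is a hypothetical wrapper around the same regular expression; the inputs are made up for illustration.

```ruby
# Same pattern as the listing above, wrapped in a standalone helper.
def strip_tags(string)
  string.gsub(/<\/?.*?>/, '')
end

puts strip_tags('<p>Hello <b>world</b></p>') # prints "Hello world"

# Caveat: any "<" ... ">" pair on one line is treated as a tag,
# so literal comparison operators are removed too.
puts strip_tags('2 < 3 and 4 > 1')           # prints "2  1"
```

Because `.` does not match a newline without the `m` flag, a tag that spans two lines is left in place by this pattern.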

#paragraph(quantity) ⇒ Object

Returns the requested number of paragraphs of a Wiki article.

quantity - the Number of paragraphs to return, starting from the top of the article. The description names the first paragraph as the default, but the signature shown here declares no default value.

Returns an Array of Strings.



# File 'lib/wikiwhat/parse.rb', line 59

def paragraph(quantity)
  # Break the article into individual paragraphs and store in an array.
  start = @request.split("</p>")

  # Re-add the closing paragraph HTML tags.
  start.each do |string|
    string << "</p>"
  end

  # Check to make sure the quantity being requested is not more paragraphs
  # than exist.
  #
  # Return the correct number of paragraphs assigned to new_arr
  if start.length < quantity
    quantity = start.length - 1
    new_arr = start[0..quantity]
  else
    quantity = quantity - 1
    new_arr = start[0..quantity]
  end
end
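
The split-and-restore approach above can be tried on a small string. This is a condensed sketch, not the library's method: `first_paragraphs` and `ARTICLE_HTML` are hypothetical names, and the default of 1 is an assumption taken from the description rather than from the signature.

```ruby
# Illustrative article body, not real data.
ARTICLE_HTML = '<p>One.</p><p>Two.</p><p>Three.</p>'.freeze

# Split on the closing tag, restore it on each piece, then take the first
# `quantity` paragraphs. Array#first caps at the array length, so asking
# for more paragraphs than exist returns all of them.
def first_paragraphs(html, quantity = 1)
  paragraphs = html.split('</p>').map { |chunk| "#{chunk}</p>" }
  paragraphs.first(quantity)
end

p first_paragraphs(ARTICLE_HTML, 2)
# prints ["<p>One.</p>", "<p>Two.</p>"]
p first_paragraphs(ARTICLE_HTML, 10).length
# prints 3
```

Using `Array#first` replaces the two branches that adjust `quantity` by hand, since it already handles a request larger than the array.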

#refs ⇒ Object

Find all references on a page.

Returns all references as an Array of Arrays.

TODO: Currently nested array, want to return as array of strings.



# File 'lib/wikiwhat/parse.rb', line 145

def refs
  @content = content_split(1, 2)

  #add all references to an array. still in wiki markup
  @content.scan(/<ref>(.*?)<\/ref>/)
end
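
The nesting mentioned in the TODO comes from `String#scan`, which returns one inner Array per match when the pattern contains a capture group. A minimal sketch of one way to get a flat Array of Strings follows; `flat_refs` and `SAMPLE_WIKITEXT` are hypothetical and not part of Wikiwhat.

```ruby
# Illustrative wiki markup, not real article data.
SAMPLE_WIKITEXT =
  'Fact one.<ref>Smith 2001</ref> Fact two.' \
  '<ref name="j">Jones 2005</ref><ref>Lee 2010</ref>'.freeze

# scan yields [["Smith 2001"], ["Lee 2010"]]; flatten removes the nesting.
def flat_refs(wikitext)
  wikitext.scan(/<ref>(.*?)<\/ref>/).flatten
end

p flat_refs(SAMPLE_WIKITEXT)
# prints ["Smith 2001", "Lee 2010"]
```

Note that the pattern only matches a bare `<ref>` opening tag, so the reference written as `<ref name="j">` is skipped in both the nested and the flattened result.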

#sidebar_image ⇒ Object

Finds the image from the sidebar, if one exists.

Returns the URL of the image as a String.



# File 'lib/wikiwhat/parse.rb', line 119

def sidebar_image
  # Check to see if a sidebar image exists
  if self.content_split(0)[/(image).*?(\.\w\w(g|G|f|F))/]
    # Grab the sidebar image title
    image_name = self.content_split(0)[/(image).*?(\.\w\w(g|G|f|F))/]
    # Remove the 'image = ' part of the string
    image_name = image_name.split("=")[1].strip
    # Call Wikipedia for image url
    get_url = Wikiwhat::Call.call_api(('File:'+ image_name),
      :prop => "imageinfo", :iiprop => true)
    # Pull url from hash
    img_name_2 = pull_from_hash(get_url, "pages")
    img_array = pull_from_hash(img_name_2, "imageinfo")
    img_array[0]["url"]
  else
    # If no sidebar image exists, raise error.
    raise Wikiwhat::WikiwhatError.new("Sorry, it looks like there is no sidebar image
      on this page.")
  end
end
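
The title-extraction step of this method can be exercised without calling the API. The sketch below reuses the pattern from the listing on made-up infobox markup; `sidebar_image_title` and `INFOBOX` are hypothetical names, and returning nil instead of raising is a simplification for illustration.

```ruby
# Same pattern as the listing above.
SIDEBAR_IMAGE_PATTERN = /(image).*?(\.\w\w(g|G|f|F))/

# Illustrative infobox markup, not real article data.
INFOBOX = "{{Infobox person\n| name = Example Person\n" \
          "| image = Example.jpg\n| caption = A caption\n}}".freeze

# Return the "File:..." title used for the follow-up imageinfo request,
# or nil when no image line matches.
def sidebar_image_title(wikitext)
  match = wikitext[SIDEBAR_IMAGE_PATTERN]
  return nil unless match

  # Drop the "image =" part and surrounding whitespace.
  name = match.split('=', 2)[1]
  return nil unless name

  "File:#{name.strip}"
end

puts sidebar_image_title(INFOBOX) # prints "File:Example.jpg"
```

The pattern expects a dot followed by two word characters and then g, G, f, or F, which covers extensions such as .jpg, .png, .svg, and .gif but not four-letter ones such as .jpeg.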