Class: TextRank::CharFilter::StripHtml
- Inherits: Nokogiri::XML::SAX::Document
  - Object
  - Nokogiri::XML::SAX::Document
  - TextRank::CharFilter::StripHtml
- Defined in: lib/text_rank/char_filter/strip_html.rb
Overview
Character filter to remove HTML tags and convert HTML entities to text.
Example
StripHtml.new.filter!("&quot;Optimism&quot;, said Cacambo, &quot;What is that?&quot;")
=> "\"Optimism\", said Cacambo, \"What is that?\""
StripHtml.new.filter!("<b>Alas! It is the <u>obstinacy</u> of maintaining that everything is best when it is worst.</b>")
=> "Alas! It is the obstinacy of maintaining that everything is best when it is worst."
Instance Method Summary
- #filter!(text) ⇒ String
  Perform the filter.
- #initialize ⇒ StripHtml (constructor)
  A new instance of StripHtml.
Constructor Details
#initialize ⇒ StripHtml
Returns a new instance of StripHtml.
# File 'lib/text_rank/char_filter/strip_html.rb', line 19
def initialize
  @text = StringIO.new
end
Instance Method Details
#filter!(text) ⇒ String
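Performs the filter: removes HTML tags from text, converts HTML entities to plain text, and returns the result as a String.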
# File 'lib/text_rank/char_filter/strip_html.rb', line 26
def filter!(text)
  @text.rewind
  Nokogiri::HTML::SAX::Parser.new(self).parse(text)
  @text.string
end
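The `@text.rewind` call in `filter!` moves the buffer position back to the start but does not clear the buffer. The sketch below uses only `StringIO` from the standard library to show what that means when one buffer is written to twice. The helpers `write_text` and `write_text_fresh` are hypothetical and not part of text_rank; how the SAX callbacks of `StripHtml` write into `@text` is not shown on this page, so read this as an illustration of `StringIO` semantics under the assumption that text is appended at the current position, not as a statement about the class's actual behavior.

```ruby
require 'stringio'

# Hypothetical helper (not part of text_rank): rewind, write, and read back,
# mirroring the rewind-then-read shape of filter!.
def write_text(buffer, text)
  buffer.rewind        # position returns to 0; existing content stays
  buffer << text       # overwrites starting at position 0
  buffer.string
end

# Hypothetical variant that also truncates, so each call starts empty.
def write_text_fresh(buffer, text)
  buffer.rewind
  buffer.truncate(0)   # discard whatever a previous call left behind
  buffer << text
  buffer.string
end

buffer = StringIO.new
first  = write_text(buffer, 'hello world').dup  # => "hello world"
second = write_text(buffer, 'hi').dup           # => "hillo world"

fresh = StringIO.new
write_text_fresh(fresh, 'hello world')
third = write_text_fresh(fresh, 'hi').dup       # => "hi"
```

If the same assumption holds for `StripHtml`, creating a new instance per input (as the examples in the Overview do with `StripHtml.new`) avoids any leftover content from a longer earlier input.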