Class: Deepsearch::Engine::Steps::DataAggregation::ParsedWebsite
- Inherits: Object
  - Object
  - Deepsearch::Engine::Steps::DataAggregation::ParsedWebsite
- Defined in: lib/deepsearch/engine/steps/data_aggregation/parsed_website.rb
Overview
Fetches content from a URL, parses it, and cleans it to extract meaningful text. It handles HTTP requests, content type detection, and removal of unwanted HTML elements.
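The cleaning step mentioned above can be illustrated with a minimal sketch. This page does not show the body of the class's fetching and parsing code, so the helper below is hypothetical: `clean_html`, `UNWANTED_BLOCKS`, and the choice of tags to drop are assumptions made for illustration, and the regex-based approach is a stdlib-only stand-in rather than the library's actual method. A real HTML parser would be more robust on malformed markup.

```ruby
# Hypothetical helper, not part of Deepsearch: reduces an HTML string to text.
# The list of elements to drop is an assumption chosen for this sketch.
UNWANTED_BLOCKS = %w[script style nav footer].freeze

def clean_html(html)
  text = html.dup
  UNWANTED_BLOCKS.each do |tag|
    # Remove the element together with everything inside it.
    text.gsub!(%r{<#{tag}\b[^>]*>.*?</#{tag}>}mi, " ")
  end
  text.gsub!(/<[^>]+>/, " ")  # drop the remaining tags but keep their text
  text.gsub(/\s+/, " ").strip # collapse runs of whitespace
end

html = "<html><head><style>p { color: red }</style></head>" \
       "<body><nav>Home</nav><p>Hello <b>world</b></p>" \
       "<script>alert(1)</script></body></html>"

puts clean_html(html)
```

Dropping whole `script` and `style` elements before stripping the remaining tags keeps their contents from leaking into the extracted text.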
Instance Attribute Summary
- #content ⇒ Object (readonly): Returns the value of attribute content.
- #error ⇒ Object (readonly): Returns the value of attribute error.
- #metadata ⇒ Object (readonly): Returns the value of attribute metadata.
- #success ⇒ Object (readonly): Returns the value of attribute success.
- #timestamp ⇒ Object (readonly): Returns the value of attribute timestamp.
- #url ⇒ Object (readonly): Returns the value of attribute url.
Instance Method Summary
- #initialize(url:) ⇒ ParsedWebsite (constructor): A new instance of ParsedWebsite.
- #size ⇒ Object
- #success? ⇒ Boolean
- #to_h ⇒ Object
Constructor Details
#initialize(url:) ⇒ ParsedWebsite
Returns a new instance of ParsedWebsite. The constructor sets content and error to nil and success to false, then calls fetch_content!.
# File 'lib/deepsearch/engine/steps/data_aggregation/parsed_website.rb', line 16
def initialize(url:)
  @url = url
  @content = nil
  @success = false
  @error = nil
  fetch_content!
end
Instance Attribute Details
#content ⇒ Object (readonly)
Returns the value of attribute content.
# File 'lib/deepsearch/engine/steps/data_aggregation/parsed_website.rb', line 14
def content
  @content
end
#error ⇒ Object (readonly)
Returns the value of attribute error.
# File 'lib/deepsearch/engine/steps/data_aggregation/parsed_website.rb', line 14
def error
  @error
end
#metadata ⇒ Object (readonly)
Returns the value of attribute metadata.
# File 'lib/deepsearch/engine/steps/data_aggregation/parsed_website.rb', line 14
def metadata
  @metadata
end
#success ⇒ Object (readonly)
Returns the value of attribute success.
# File 'lib/deepsearch/engine/steps/data_aggregation/parsed_website.rb', line 14
def success
  @success
end
#timestamp ⇒ Object (readonly)
Returns the value of attribute timestamp.
# File 'lib/deepsearch/engine/steps/data_aggregation/parsed_website.rb', line 14
def timestamp
  @timestamp
end
#url ⇒ Object (readonly)
Returns the value of attribute url.
# File 'lib/deepsearch/engine/steps/data_aggregation/parsed_website.rb', line 14
def url
  @url
end
Instance Method Details
#size ⇒ Object
Returns the length of content converted to a string, so a nil content yields 0.
# File 'lib/deepsearch/engine/steps/data_aggregation/parsed_website.rb', line 28
def size
  content.to_s.size
end
#success? ⇒ Boolean
# File 'lib/deepsearch/engine/steps/data_aggregation/parsed_website.rb', line 24
def success?
  @success
end
#to_h ⇒ Object
Returns a hash with the keys url, success, error, and content. The metadata and timestamp attributes are not included.
# File 'lib/deepsearch/engine/steps/data_aggregation/parsed_website.rb', line 32
def to_h
  {
    url: url,
    success: success?,
    error: error,
    content: content
  }
end
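To show how #success?, #size, and #to_h fit together from a caller's point of view, here is a sketch built on a stand-in class. `SketchParsedWebsite` is hypothetical: it copies the documented constructor and methods, but replaces the HTTP request with a caller-supplied block so the example runs offline. The rescue in `fetch_content!` is also an assumption, namely that failures are recorded in `error` rather than raised; this page does not show that method's body.

```ruby
# Stand-in mirroring the documented interface. Not the real class: the network
# fetch is replaced by a block passed to the constructor (an assumption).
class SketchParsedWebsite
  attr_reader :url, :content, :error

  def initialize(url:, &fetcher)
    @url = url
    @content = nil
    @success = false
    @error = nil
    fetch_content!(&fetcher)
  end

  def success?
    @success
  end

  def size
    content.to_s.size # nil.to_s is "", so a failed fetch has size 0
  end

  def to_h
    { url: url, success: success?, error: error, content: content }
  end

  private

  # Assumed behaviour: a failure is stored on the object instead of raised.
  def fetch_content!
    @content = yield(url)
    @success = true
  rescue StandardError => e
    @error = e.message
  end
end

ok = SketchParsedWebsite.new(url: "https://example.com") { |_url| "Hello world" }
failed = SketchParsedWebsite.new(url: "https://example.invalid") do |_url|
  raise "connection refused"
end

# Keep only the pages that were fetched, then serialize them.
usable = [ok, failed].select(&:success?)
p usable.map(&:to_h)
p failed.to_h
```

Because the constructor never raises in this sketch, callers can build a batch of results and filter on `success?` afterwards instead of wrapping each construction in its own rescue.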