Module: NewsScraper
- Extended by:
- NewsScraper
- Included in:
- NewsScraper
- Defined in:
- lib/news_scraper.rb,
lib/news_scraper/cli.rb,
lib/news_scraper/errors.rb,
lib/news_scraper/scraper.rb,
lib/news_scraper/trainer.rb,
lib/news_scraper/version.rb,
lib/news_scraper/uri_parser.rb,
lib/news_scraper/configuration.rb,
lib/news_scraper/extractors/article.rb,
lib/news_scraper/extractors_helpers.rb,
lib/news_scraper/trainer/url_trainer.rb,
lib/news_scraper/transformers/article.rb,
lib/news_scraper/trainer/preset_selector.rb,
lib/news_scraper/extractors/google_news_rss.rb,
lib/news_scraper/transformers/trainer_article.rb,
lib/news_scraper/transformers/nokogiri/functions.rb,
lib/news_scraper/transformers/helpers/highscore_parser.rb
Defined Under Namespace
Modules: CLI, Extractors, ExtractorsHelpers, Trainer, Transformers
Classes: Configuration, ResponseError, Scraper, URIParser
Constant Summary
- VERSION = "1.1.1".freeze
Instance Attribute Summary
- #configuration ⇒ Object
Returns the current Configuration, creating it on first access.
Instance Method Summary
- #configure {|configuration| ... } ⇒ Object
- #reset_configuration ⇒ Object
- #train(query:) ⇒ Object
NewsScraper::train is an interactive command-line prompt for training scrape patterns on a query.
Instance Attribute Details
#configuration ⇒ Object
Returns the current Configuration, creating it on first access.
# File 'lib/news_scraper.rb', line 47
def configuration
  @configuration ||= Configuration.new
end
Instance Method Details
#configure {|configuration| ... } ⇒ Object
# File 'lib/news_scraper.rb', line 55
def configure
  yield(configuration)
end
#reset_configuration ⇒ Object
# File 'lib/news_scraper.rb', line 51
def reset_configuration
  @configuration = Configuration.new
end
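The three methods above (a memoized reader, a block-yielding setter, and a reset) form a common Ruby configuration pattern. The following is a minimal, self-contained sketch of that pattern, not the gem's code: DemoScraper and DemoConfiguration are stand-ins, and the label setting is hypothetical because the attributes of NewsScraper::Configuration are not shown on this page.

```ruby
# Stand-in for NewsScraper::Configuration. The real class's attributes are
# not documented here, so `label` is a hypothetical setting.
DemoConfiguration = Struct.new(:label)

module DemoScraper
  # Mirrors "Extended by: NewsScraper": makes the methods callable on the module.
  extend self

  # Builds the configuration lazily on first access, then reuses it.
  def configuration
    @configuration ||= DemoConfiguration.new("default")
  end

  # Discards any customisation by installing a fresh configuration.
  def reset_configuration
    @configuration = DemoConfiguration.new("default")
  end

  # Yields the current configuration so callers can change it in a block.
  def configure
    yield(configuration)
  end
end

DemoScraper.configure { |config| config.label = "custom" }
puts DemoScraper.configuration.label

DemoScraper.reset_configuration
puts DemoScraper.configuration.label
```

Because the reader memoizes, every call between resets returns the same object, so changes made inside a configure block persist until reset_configuration is called.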
#train(query:) ⇒ Object
NewsScraper::train is an interactive command-line prompt that:
- Collates all articles for the given :query
- Greps for :data_types using :presets in the config set in the configuration
- Displays the results of each :preset grep for a given :data_type
- Prompts to select one of the :presets or define a pattern for that domain’s :data_type
  N.B.: The user may ignore all presets and configure the pattern manually in the YAML file
- Saves the selected :preset to config/article_scrape_patterns.yml
Params
- query: a keyword argument specifying the query to train on
# File 'lib/news_scraper.rb', line 42
def train(query:)
  Trainer.train(query: query)
end
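Given the signature above, the prompt would be started with a call such as NewsScraper.train(query: "some keyword"). The grep, display, and save steps can be illustrated with a rough, non-interactive sketch. Everything in it is an assumption made for illustration: the regex presets, the helper names grep_presets and selection_for, the domain example.com, and the YAML layout are hypothetical and do not reproduce the gem's actual preset format or file structure.

```ruby
require "yaml"

# Hypothetical presets: for each data type, named patterns to try.
# Plain regexes stand in for whatever pattern format the gem really uses.
PRESETS = {
  "title" => {
    "og" => /<meta property="og:title" content="([^"]*)"/,
    "title_tag" => /<title>([^<]*)<\/title>/
  }
}.freeze

# Runs every preset for every data type against the page and collects the
# first capture group, or nil when a pattern does not match.
def grep_presets(html, presets)
  presets.transform_values do |patterns|
    patterns.transform_values { |regex| html[regex, 1] }
  end
end

# Builds the entry that would be written out for a domain once a preset
# has been chosen for each data type.
def selection_for(domain, choices)
  { domain => choices }
end

html = "<html><head><title>Example headline</title></head></html>"
results = grep_presets(html, PRESETS)

# Display step: show what each preset found for each data type.
results.each do |data_type, by_preset|
  by_preset.each do |preset, value|
    puts "#{data_type} / #{preset}: #{value.inspect}"
  end
end

# The real prompt asks a person to choose. Here the first preset that
# matched is picked automatically to keep the sketch non-interactive.
choices = results.transform_values do |by_preset|
  by_preset.find { |_name, value| value }&.first
end

puts selection_for("example.com", choices).to_yaml
```

The sketch prints the result to standard output rather than writing config/article_scrape_patterns.yml, so it can be run without touching any files.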