Module: OutriderTools::Clean
Defined in: lib/outrider/tools.rb
Class Method Summary
- .file_types(sub = :all) ⇒ Object
- .process_words_to_array(words = "") ⇒ Object
  Takes a string of words, splits it on whitespace, and returns an array of the lowercased words.
- .tidy_urls(hrefs, page_uri, domain, files) ⇒ Object
- .word_array_to_string(strings) ⇒ Object
  Takes an array of strings, strips every character that is not a letter, digit, or whitespace, and concatenates the results into one string.
Class Method Details
.file_types(sub = :all) ⇒ Object
    # File 'lib/outrider/tools.rb', line 167

    def self.file_types sub = :all
      case sub
      when :all
        return %w[png jpeg jpg gif svg txt js css zip gz pdf]
      when :images
        return %w[png jpeg jpg gif svg]
      when :pdfs
        return %w[pdf]
      else
        return %w[png jpeg jpg gif svg txt js css zip gz pdf]
      end
    end
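The listing above maps a group symbol to a list of extensions and falls back to the full list for any unrecognised symbol. The following is a minimal standalone sketch of that lookup, not the library's own code: the module name `FileTypesSketch` and its constants are assumptions introduced so the example runs without the gem.

```ruby
# Hypothetical standalone module mirroring the lookup in the listing above.
module FileTypesSketch
  ALL_TYPES   = %w[png jpeg jpg gif svg txt js css zip gz pdf].freeze
  IMAGE_TYPES = %w[png jpeg jpg gif svg].freeze
  PDF_TYPES   = %w[pdf].freeze

  # Returns the extension list for a group; unknown groups get the full list.
  def self.file_types(sub = :all)
    case sub
    when :images then IMAGE_TYPES
    when :pdfs   then PDF_TYPES
    else ALL_TYPES
    end
  end
end

image_exts    = FileTypesSketch.file_types(:images)
fallback_exts = FileTypesSketch.file_types(:videos) # not a known group
```

Because the `else` branch returns the same list as `:all`, a misspelled group symbol silently yields every extension rather than raising an error.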
.process_words_to_array(words = "") ⇒ Object
Takes a string of words, splits it on whitespace, and returns an array of the lowercased words.
    # File 'lib/outrider/tools.rb', line 184

    def self.process_words_to_array words = ""
      clean_words = words.split.each do |word|
        word.downcase!
      end
    end
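As listed, the method splits on whitespace and lowercases each token; it does not filter anything out. The sketch below reproduces that behaviour in a standalone module and adds one possible way to drop "dud" tokens. Both the module name `WordsSketch` and the method `process_words_dropping_duds` are hypothetical, and the rule used for a dud (a token with no letter or digit) is an assumption, not something the source defines.

```ruby
# Hypothetical standalone module; not part of OutriderTools.
module WordsSketch
  # Same result as the listing: whitespace split, then lowercase.
  def self.process_words_to_array(words = "")
    words.split.map(&:downcase)
  end

  # Assumed variant: also drops tokens containing no letter or digit.
  def self.process_words_dropping_duds(words = "")
    process_words_to_array(words).select { |word| word.match?(/[a-z0-9]/) }
  end
end

plain    = WordsSketch.process_words_to_array("The  Quick\tBrown FOX")
filtered = WordsSketch.process_words_dropping_duds("Ruby -- is ... FUN")
```

Here `plain` is `["the", "quick", "brown", "fox"]` and `filtered` is `["ruby", "is", "fun"]`.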
.tidy_urls(hrefs, page_uri, domain, files) ⇒ Object
    # File 'lib/outrider/tools.rb', line 141

    def self.tidy_urls hrefs, page_uri, domain, files
      # Make these URIs, throwing out problem ones like mailto:
      uris = hrefs.map{ |href| URI.join( page_uri, href ) rescue nil }.compact

      # Pare it down to only those pages that are on the same site
      uris.select!{ |uri| uri.host == domain.host }

      # Throw out links to files (this could be more efficient with regex)
      uris.reject!{ |uri| files.any?{ |ext| uri.path.end_with?(".#{ext}") } }

      # Throw out duplicates
      uris.reject!{ |uri| ProjectData.exists?( url: uri.to_s) }

      # Remove #foo fragments so that sub-page links aren't differentiated
      uris.each{ |uri| uri.fragment = nil }

      return uris
    end
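The pipeline in the listing (resolve, keep same-host links, drop file links, drop known URLs, strip fragments) can be tried without a database. The sketch below is a variation under stated assumptions, not the library's implementation: `tidy_urls_sketch` and `SEEN_URLS` are hypothetical names, an in-memory `Set` stands in for `ProjectData.exists?`, and fragments are stripped before the duplicate check so that `docs#intro` and `docs#usage` collapse to one entry.

```ruby
require "uri"
require "set"

# Hypothetical stand-in for ProjectData: URLs that were already stored.
SEEN_URLS = Set.new(["http://example.com/about"])

def tidy_urls_sketch(hrefs, page_uri, domain, files, seen = SEEN_URLS)
  # Resolve each href against the page; drop hrefs that cannot be parsed.
  uris = hrefs.map do |href|
    begin
      URI.join(page_uri, href)
    rescue URI::Error, ArgumentError
      nil
    end
  end.compact

  # Keep only links on the same host (this also drops mailto: links,
  # whose host is nil).
  uris.select! { |uri| uri.host == domain.host }

  # Strip #fragments before comparing, so sub-page anchors match.
  uris.each { |uri| uri.fragment = nil }

  # Drop links to files by extension.
  uris.reject! { |uri| files.any? { |ext| uri.path.end_with?(".#{ext}") } }

  # Drop URLs already stored, then collapse repeats found on this page.
  uris.reject! { |uri| seen.include?(uri.to_s) }
  uris.uniq(&:to_s)
end

page   = URI("http://example.com/index.html")
domain = URI("http://example.com")
hrefs  = ["/about", "docs#intro", "docs#usage", "logo.png",
          "http://other.org/", "mailto:me@example.com", "bad link"]

links = tidy_urls_sketch(hrefs, page, domain, %w[png pdf]).map(&:to_s)
```

With these inputs only `http://example.com/docs` survives: `/about` is already stored, `logo.png` is a file, `other.org` is off-site, the `mailto:` link has no host, and `bad link` fails to parse.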
.word_array_to_string(strings) ⇒ Object
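Takes an array of strings, strips every character that is not a letter, digit, or whitespace, and concatenates the results into one string. No separator is inserted between elements.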
    # File 'lib/outrider/tools.rb', line 193

    def self.word_array_to_string strings
      the_string = ''
      strings.each do |string|
        the_string += string.gsub(/[^a-z0-9\s]/i, '')
      end
      return the_string
    end
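The listing removes every character outside `a-z`, `0-9`, and whitespace (case-insensitively) and appends each cleaned string directly to the result. A short standalone sketch of the same behaviour follows; the name `word_array_to_string_sketch` is hypothetical and the `map`/`join` form is a rewrite chosen for brevity, not the library's code.

```ruby
# Hypothetical standalone equivalent of the listing above.
def word_array_to_string_sketch(strings)
  # Strip punctuation and symbols from each element, then concatenate.
  strings.map { |string| string.gsub(/[^a-z0-9\s]/i, "") }.join
end

sentence = word_array_to_string_sketch(["Hello, ", "world!", " 42%"])
squashed = word_array_to_string_sketch(["foo", "bar"])
```

Here `sentence` is `"Hello world 42"` because the inputs carry their own spaces, while `squashed` is `"foobar"`: elements without surrounding whitespace run together, so callers that need word boundaries would have to add them before calling.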