Class: Sunflower::Page
- Inherits:
-
Object
- Object
- Sunflower::Page
- Defined in:
- lib/sunflower/core.rb,
lib/sunflower/commontasks.rb
Overview
Class representing a single wiki page. To load a page, use #new; to save it back, use #save.
Constant Summary
- INVALID_CHARS =
Characters which MediaWiki does not permit in a page title.
%w(# < > [ ] | { })
- INVALID_CHARS_REGEX =
Regex matching characters which MediaWiki does not permit in a page title.
Regexp.union *INVALID_CHARS
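As a rough illustration of how these constants behave, the sketch below re-declares them outside the library and matches a few titles against the regex. The helper valid_title? is hypothetical and not part of Sunflower.

```ruby
# Standalone copies of the two constants, re-declared here for illustration.
INVALID_CHARS = %w(# < > [ ] | { })
INVALID_CHARS_REGEX = Regexp.union(*INVALID_CHARS)

# Hypothetical helper (not part of Sunflower): true when the title
# contains none of the forbidden characters.
def valid_title?(title)
  (title =~ INVALID_CHARS_REGEX).nil?
end

puts valid_title?('Main Page')
puts valid_title?('Foo[bar]')
```

Regexp.union escapes each string, so characters such as [ and | are matched literally rather than as regex syntax.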
Instance Attribute Summary
-
#counter ⇒ Object
readonly
Value of given attribute, as returned by API call prop=info for this page.
-
#edittoken ⇒ Object
readonly
Value of given attribute, as returned by API call prop=info for this page.
-
#lastrevid ⇒ Object
readonly
Value of given attribute, as returned by API call prop=info for this page.
-
#length ⇒ Object
readonly
Value of given attribute, as returned by API call prop=info for this page.
-
#ns ⇒ Object
readonly
Value of given attribute, as returned by API call prop=info for this page.
-
#orig_text ⇒ Object
readonly
The text of the page, as of when it was loaded.
-
#pageid ⇒ Object
readonly
Value of given attribute, as returned by API call prop=info for this page.
-
#preloaded_attrs ⇒ Object
Whether this datum is already loaded.
-
#preloaded_text ⇒ Object
Whether this datum is already loaded.
-
#protection ⇒ Object
readonly
Value of given attribute, as returned by API call prop=info for this page.
-
#real_title ⇒ Object
readonly
Value of ‘title’ attribute, as returned by API call prop=info for this page.
-
#starttimestamp ⇒ Object
readonly
Value of given attribute, as returned by API call prop=info for this page.
-
#sunflower ⇒ Object
readonly
The Sunflower instance this page belongs to.
-
#text ⇒ Object
The current text of the page.
-
#title ⇒ Object
readonly
Page title, as passed to #initialize and cleaned by Sunflower#cleanup_title.
-
#touched ⇒ Object
readonly
Value of given attribute, as returned by API call prop=info for this page.
Class Method Summary
- .get(title, wiki = '') ⇒ Object
- .load(title, wiki = '') ⇒ Object
Instance Method Summary
-
#append(txt, newlines = 2) ⇒ Object
Appends newlines (2 by default) and then the given text.
-
#change_category(from, to) ⇒ Object
Replace the category from with category to in page wikitext.
-
#code_cleanup ⇒ Object
Simple, safe code cleanup. Use Sunflower.always_do_code_cleanup=true to do it automatically just before saving a page.
-
#code_cleanup_plwiki(str) ⇒ Object
plwiki-specific cleanup routines.
-
#dump ⇒ Object
Save the current text of this page to a file whose name is based on the page title, with every character other than ASCII letters, digits, and hyphens replaced by an underscore.
-
#dump_to(file) ⇒ Object
Save the current text of this page to file (which can be either a filename or an IO).
- #gsub(from, to) ⇒ Object
-
#initialize(title = '', url = '') ⇒ Page
constructor
Load the specified page.
-
#preload_attrs ⇒ Object
Load the metadata associated with this page.
-
#preload_text ⇒ Object
Load the text of this page.
-
#prepend(txt, newlines = 2) ⇒ Object
Prepends the given text followed by newlines (2 by default).
-
#remove_category(cat) ⇒ Object
Remove the category from page wikitext.
-
#replace(from, to, once = false) ⇒ Object
Replaces “from” with “to” in page text. “from” may be a regex.
-
#save(title = @title, summary = @sunflower.summary) ⇒ Object
(also: #put)
Save the modifications to this page, possibly under a different title.
- #sub(from, to) ⇒ Object
Constructor Details
#initialize(title = '', url = '') ⇒ Page
Load the specified page. The constructor itself only resolves the Sunflower instance and cleans the title: the text is lazy-loaded on first access, and the attributes and edit token are loaded when needed, or when you call #preload_attrs.
If you are using multiple Sunflowers, you have to specify which one this page belongs to using the second argument. You can pass either a Sunflower object, a wiki URL, or a shorthand id as specified in Sunflower.resolve_wikimedia_id.
    # File 'lib/sunflower/core.rb', line 478
    def initialize title='', url=''
      raise Sunflower::Error, 'title invalid: '+title if title =~ INVALID_CHARS_REGEX

      case url
      when Sunflower
        @sunflower = url
      when '', nil
        count = ObjectSpace.each_object(Sunflower){|o| @sunflower=o}
        raise Sunflower::Error, 'no Sunflowers present' if count==0
        raise Sunflower::Error, 'you must pass wiki name if using multiple Sunflowers at once' if count>1
      else
        url = (url.include?('.') ? url : Sunflower.resolve_wikimedia_id(url))
        ObjectSpace.each_object(Sunflower){|o| @sunflower=o if o.wikiURL==url}
        raise Sunflower::Error, "no Sunflower for #{url}" if !@sunflower
      end

      @title = @sunflower.cleanup_title title

      @preloaded_text = false
      @preloaded_attrs = false
    end
Instance Attribute Details
#counter ⇒ Object (readonly)
Value of given attribute, as returned by API call prop=info for this page. Lazy-loaded.
    # File 'lib/sunflower/core.rb', line 440
    def counter
      @counter
    end
#edittoken ⇒ Object (readonly)
Value of given attribute, as returned by API call prop=info for this page. Lazy-loaded.
    # File 'lib/sunflower/core.rb', line 440
    def edittoken
      @edittoken
    end
#lastrevid ⇒ Object (readonly)
Value of given attribute, as returned by API call prop=info for this page. Lazy-loaded.
    # File 'lib/sunflower/core.rb', line 440
    def lastrevid
      @lastrevid
    end
#length ⇒ Object (readonly)
Value of given attribute, as returned by API call prop=info for this page. Lazy-loaded.
    # File 'lib/sunflower/core.rb', line 440
    def length
      @length
    end
#ns ⇒ Object (readonly)
Value of given attribute, as returned by API call prop=info for this page. Lazy-loaded.
    # File 'lib/sunflower/core.rb', line 440
    def ns
      @ns
    end
#orig_text ⇒ Object (readonly)
The text of the page, as of when it was loaded. Lazy-loaded.
    # File 'lib/sunflower/core.rb', line 432
    def orig_text
      @orig_text
    end
#pageid ⇒ Object (readonly)
Value of given attribute, as returned by API call prop=info for this page. Lazy-loaded.
    # File 'lib/sunflower/core.rb', line 440
    def pageid
      @pageid
    end
#preloaded_attrs ⇒ Object
Whether this datum is already loaded. Can be set to true to suppress loading (used e.g. by Sunflower::List#pages_preloaded)
    # File 'lib/sunflower/core.rb', line 446
    def preloaded_attrs
      @preloaded_attrs
    end
#preloaded_text ⇒ Object
Whether this datum is already loaded. Can be set to true to suppress loading (used e.g. by Sunflower::List#pages_preloaded)
    # File 'lib/sunflower/core.rb', line 446
    def preloaded_text
      @preloaded_text
    end
#protection ⇒ Object (readonly)
Value of given attribute, as returned by API call prop=info for this page. Lazy-loaded.
    # File 'lib/sunflower/core.rb', line 440
    def protection
      @protection
    end
#real_title ⇒ Object (readonly)
Value of ‘title’ attribute, as returned by API call prop=info for this page. Lazy-loaded. See #title.
    # File 'lib/sunflower/core.rb', line 442
    def real_title
      @real_title
    end
#starttimestamp ⇒ Object (readonly)
Value of given attribute, as returned by API call prop=info for this page. Lazy-loaded.
    # File 'lib/sunflower/core.rb', line 440
    def starttimestamp
      @starttimestamp
    end
#sunflower ⇒ Object (readonly)
The Sunflower instance this page belongs to.
    # File 'lib/sunflower/core.rb', line 427
    def sunflower
      @sunflower
    end
#text ⇒ Object
The current text of the page. Lazy-loaded.
    # File 'lib/sunflower/core.rb', line 430
    def text
      @text
    end
#title ⇒ Object (readonly)
Page title, as passed to #initialize and cleaned by Sunflower#cleanup_title. Real page title as canonicalized by MediaWiki software can be accessed via #real_title (but it should always be the same).
    # File 'lib/sunflower/core.rb', line 437
    def title
      @title
    end
#touched ⇒ Object (readonly)
Value of given attribute, as returned by API call prop=info for this page. Lazy-loaded.
    # File 'lib/sunflower/core.rb', line 440
    def touched
      @touched
    end
Class Method Details
.get(title, wiki = '') ⇒ Object
    # File 'lib/sunflower/core.rb', line 572
    def self.get title, wiki=''
      self.new(title, wiki)
    end
.load(title, wiki = '') ⇒ Object
    # File 'lib/sunflower/core.rb', line 576
    def self.load title, wiki=''
      self.new(title, wiki)
    end
Instance Method Details
#append(txt, newlines = 2) ⇒ Object
Appends newlines (2 by default) and then the given text. Trailing whitespace of the existing page text is stripped first.
    # File 'lib/sunflower/commontasks.rb', line 18
    def append txt, newlines=2
      self.text = self.text.rstrip + ("\n"*newlines) + txt
    end
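The whitespace handling of #append, and of its mirror image #prepend, can be tried on plain strings. This is only a sketch: append_text and prepend_text are hypothetical helpers, whereas the real methods read and assign the page's text.

```ruby
# Standalone sketch of the string handling in #append and #prepend.
# Both helpers are hypothetical; the real methods work on page.text.
def append_text(text, txt, newlines = 2)
  text.rstrip + ("\n" * newlines) + txt
end

def prepend_text(text, txt, newlines = 2)
  txt + ("\n" * newlines) + text.lstrip
end

body = "Some article text.\n\n\n"
p append_text(body, '[[Category:Examples]]')
p prepend_text("\n\nSome article text.", '{{Stub}}', 1)
```

Because the existing text is stripped on the joining side, the number of newlines between old and new content is always exactly the newlines argument.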
#change_category(from, to) ⇒ Object
Replace the category from with category to in page wikitext.
Inputs can be either with the Category: prefix (or localised version) or without.
    # File 'lib/sunflower/commontasks.rb', line 103
    def change_category from, to
      cat_regex = self.sunflower.ns_regex_for 'Category'
      from = self.sunflower.cleanup_title(from).sub(/^#{cat_regex}:/, '')
      to   = self.sunflower.cleanup_title(to  ).sub(/^#{cat_regex}:/, '')

      self.text.gsub!(/\[\[ *#{cat_regex} *: *#{Regexp.escape from} *(\||\]\])/){
        rest = $1
        "[[#{self.sunflower.ns_local_for 'Category'}:#{to}#{rest}"
      }
    end
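To see what the substitution does to wikitext, here is a minimal standalone sketch. CAT_REGEX and CAT_LOCAL are assumed stand-ins for the values that ns_regex_for and ns_local_for would return on a real wiki, and change_category_in is a hypothetical helper that skips the title cleanup step.

```ruby
# Assumption: stand-in for sunflower.ns_regex_for('Category').
CAT_REGEX = /(?:Category|Kategoria)/i
# Assumption: stand-in for sunflower.ns_local_for('Category').
CAT_LOCAL = 'Category'

# Hypothetical helper: the gsub from #change_category, applied to a string.
def change_category_in(text, from, to)
  text.gsub(/\[\[ *#{CAT_REGEX} *: *#{Regexp.escape from} *(\||\]\])/) {
    # $1 is either "|" (a sort key follows) or the closing "]]".
    "[[#{CAT_LOCAL}:#{to}#{$1}"
  }
end

wikitext = "Text\n[[Category:Foo|sort]]\n[[category : Foo]]"
puts change_category_in(wikitext, 'Foo', 'Bar')
```

Capturing the terminator keeps any sort key intact, and requiring it right after the name keeps a longer name such as "Foo bar" from matching.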
#code_cleanup ⇒ Object
Simple, safe code cleanup. Use Sunflower.always_do_code_cleanup=true to do it automatically just before saving a page.
    # File 'lib/sunflower/commontasks.rb', line 76
    def code_cleanup
      str = self.text.gsub /\r\n/, "\n"

      str.gsub!(/\[\[([^\|\]]+)(\||\]\])/){
        name, rest = $1, $2
        "[[#{self.sunflower.cleanup_title name, true, true}#{rest}"
      }

      # headings
      str.gsub!(/(^|\n)(=+) *([^=\n]*[^ :=\n])[ :]*=/, '\1\2 \3 ='); # =a= > = a =, =a:= > = a =
      str.gsub!(/(^|\n)(=+[^=\n]+=+)[\n]{2,}/, "\\1\\2\n"); # one newline

      # spaced lists
      str.gsub!(/(\n[#*:;]+)([^ \t\n#*:;{])/, '\1 \2');

      if wikiid = self.sunflower.siteinfo['general']['wikiid']
        if self.respond_to? :"code_cleanup_#{wikiid}"
          str = self.send :"code_cleanup_#{wikiid}", str
        end
      end

      self.text = str
    end
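The heading and list rules can be tried in isolation. The sketch below copies only those regexes into a hypothetical helper, normalize_markup; it leaves out the link-title cleanup and the per-wiki hook, both of which need a live Sunflower instance.

```ruby
# Subset of #code_cleanup: line endings, headings and list spacing only.
# normalize_markup is a hypothetical helper name, not part of Sunflower.
def normalize_markup(text)
  str = text.gsub(/\r\n/, "\n")
  # "==Heading:==" becomes "== Heading =="
  str.gsub!(/(^|\n)(=+) *([^=\n]*[^ :=\n])[ :]*=/, '\1\2 \3 =')
  # collapse blank lines after a heading into a single newline
  str.gsub!(/(^|\n)(=+[^=\n]+=+)[\n]{2,}/, "\\1\\2\n")
  # "*item" becomes "* item"
  str.gsub!(/(\n[#*:;]+)([^ \t\n#*:;{])/, '\1 \2')
  str
end

p normalize_markup("==Heading:==\r\n\r\n\r\n*one\n#two")
```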
#code_cleanup_plwiki(str) ⇒ Object
plwiki-specific cleanup routines, based on Nux’s cleaner: pl.wikipedia.org/wiki/Wikipedysta:Nux/wp_sk.js
    # File 'lib/sunflower/commontasks.rb', line 30
    def code_cleanup_plwiki str
      str = str.dup

      str.gsub!(/\{\{\{(?:poprzednik|następca|pop|nast|lata|info|lang)\|(.+?)\}\}\}/i,'\1')
      str.gsub!(/(={1,5})\s*Przypisy\s*\1\s*<references\s?\/>/i){
        if $1=='=' || $1=='=='
          '{{Przypisy}}'
        else
          '{{Przypisy|stopień= '+$1+'}}'
        end
      }

      # merging link abbreviations
      str.gsub!(/m\.? ?\[\[n\.? ?p\.? ?m\.?\]\]/, 'm [[n.p.m.]]');

      # date corrections - unnecessary comma
      str.gsub!(/(\[\[[0-9]+ (stycznia|lutego|marca|kwietnia|maja|czerwca|lipca|sierpnia|września|października|listopada|grudnia)\]\]), (\[\[[0-9]{4}\]\])/i, '\1 \3');

      # linking to centuries
      str.gsub!(/\[\[([XVI]{1,5}) [wW]\.?\]\]/, '[[\1 wiek|\1 w.]]');
      str.gsub!(/\[\[([XVI]{1,5}) [wW]\.?\|/, '[[\1 wiek|');
      str.gsub!(/\[\[(III|II|IV|VIII|VII|VI|IX|XIII|XII|XI|XIV|XV|XVIII|XVII|XVI|XIX|XXI|XX)\]\]/, '[[\1 wiek|\1]]');
      str.gsub!(/\[\[(III|II|IV|VIII|VII|VI|IX|XIII|XII|XI|XIV|XV|XVIII|XVII|XVI|XIX|XXI|XX)\|/, '[[\1 wiek|');

      # expanding common links
      str.gsub!(/\[\[ang\.\]\]/, '[[język angielski|ang.]]');
      str.gsub!(/\[\[cz\.\]\]/, '[[język czeski|cz.]]');
      str.gsub!(/\[\[fr\.\]\]/, '[[język francuski|fr.]]');
      str.gsub!(/\[\[łac\.\]\]/, '[[łacina|łac.]]');
      str.gsub!(/\[\[niem\.\]\]/, '[[język niemiecki|niem.]]');
      str.gsub!(/\[\[pol\.\]\]/, '[[język polski|pol.]]');
      str.gsub!(/\[\[pl\.\]\]/, '[[język polski|pol.]]');
      str.gsub!(/\[\[ros\.\]\]/, '[[język rosyjski|ros.]]');
      str.gsub!(/\[\[(((G|g)iga|(M|m)ega|(K|k)ilo)herc|[GMk]Hz)\|/, '[[herc|');

      # heading unification
      str.gsub!(/[ \n\t]*\n'''? *(Zobacz|Patrz) (też|także):* *'''?[ \n\t]*/i, "\n\n== Zobacz też ==\n");
      str.gsub!(/[ \n\t]*\n(=+) *(Zobacz|Patrz) (też|także):* *=+[ \n\t]*/i, "\n\n\\1 Zobacz też \\1\n");
      str.gsub!(/[ \n\t]*\n'''? *((Zewnętrzn[ey] )?(Linki?|Łącza|Stron[ay]|Zobacz w (internecie|sieci))( zewn[eę]trzn[aey])?):* *'''?[ \n\t]*/i, "\n\n== Linki zewnętrzne ==\n");
      str.gsub!(/[ \n\t]*\n(=+) *((Zewnętrzn[ey] )?(Linki?|Łącza|Stron[ay]|Zobacz w (internecie|sieci))( zewn[eę]trzn[aey])?):* *=+[ \n\t]*/i, "\n\n\\1 Linki zewnętrzne \\1\n");

      return str
    end
#dump ⇒ Object
Save the current text of this page to a file whose name is based on the page title, with every character other than ASCII letters, digits, and hyphens replaced by an underscore, and with .txt appended.
    # File 'lib/sunflower/core.rb', line 543
    def dump
      self.dump_to @title.gsub(/[^a-zA-Z0-9\-]/,'_')+'.txt'
    end
#dump_to(file) ⇒ Object
Save the current text of this page to file (which can be either a filename or an IO).
    # File 'lib/sunflower/core.rb', line 534
    def dump_to file
      if file.respond_to? :write #probably file or IO
        file.write @text
      else #filename?
        File.open(file.to_s, 'w'){|f| f.write @text}
      end
    end
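The filename rule used by #dump and the duck-typed target handling of #dump_to can be sketched without a wiki connection. Both helpers below are hypothetical standalone versions that take the text as an argument instead of reading @text.

```ruby
require 'stringio'

# Hypothetical standalone version of the filename rule in #dump.
def dump_filename(title)
  title.gsub(/[^a-zA-Z0-9\-]/, '_') + '.txt'
end

# Hypothetical standalone version of #dump_to.
def dump_text_to(text, file)
  if file.respond_to?(:write)   # IO-like object
    file.write text
  else                          # anything else is treated as a filename
    File.open(file.to_s, 'w') { |f| f.write text }
  end
end

puts dump_filename('Foo: Bar (2010)')

io = StringIO.new
dump_text_to('page text', io)
puts io.string
```

Because the filename rule only keeps ASCII letters, digits, and hyphens, titles that differ only in other characters map to the same file.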
#gsub(from, to) ⇒ Object
    # File 'lib/sunflower/commontasks.rb', line 9
    def gsub from, to
      self.replace from, to
    end
#preload_attrs ⇒ Object
Load the metadata associated with this page. Semi-private.
    # File 'lib/sunflower/core.rb', line 522
    def preload_attrs
      r = @sunflower.API('action=query&prop=info&inprop=protection&intoken=edit&titles='+CGI.escape(@title))
      r = r['query']['pages'].values.first
      r.each{|key, value|
        key = 'real_title' if key == 'title'
        self.instance_variable_set('@'+key, value)
      }

      @preloaded_attrs = true
    end
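The way response keys become attributes can be illustrated with a stub. PageInfo is a hypothetical stand-in for Sunflower::Page, and SAMPLE_RESPONSE is a made-up, heavily trimmed hash in the shape the code above expects, not a real API reply.

```ruby
# Hypothetical stand-in for Sunflower::Page, holding only a few attributes.
class PageInfo
  attr_reader :pageid, :ns, :real_title, :lastrevid

  # Copies every key of the first page entry into an instance variable,
  # renaming 'title' to 'real_title' as #preload_attrs does.
  def load_from(response)
    info = response['query']['pages'].values.first
    info.each do |key, value|
      key = 'real_title' if key == 'title'
      instance_variable_set('@' + key, value)
    end
    self
  end
end

# Made-up sample data; real responses carry more keys.
SAMPLE_RESPONSE = {
  'query' => { 'pages' => { '42' => {
    'pageid' => 42, 'ns' => 0, 'title' => 'Main Page', 'lastrevid' => 1001
  } } }
}

page = PageInfo.new.load_from(SAMPLE_RESPONSE)
puts page.real_title
puts page.pageid
```

The rename keeps the API's canonical title from overwriting @title, which holds the title as passed to the constructor.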
#preload_text ⇒ Object
Load the text of this page. Semi-private.
    # File 'lib/sunflower/core.rb', line 501
    def preload_text
      if title == ''
        @text = ''
      else
        r = @sunflower.API('action=query&prop=revisions&rvprop=content&titles='+CGI.escape(@title))
        r = r['query']['pages'].values.first
        if r['missing']
          @text = ''
        elsif r['invalid']
          raise Sunflower::Error, 'title invalid: '+@title
        else
          @text = r['revisions'][0]['*']
        end
      end

      @orig_text = @text.dup

      @preloaded_text = true
    end
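The three outcomes of the response handling (missing page, invalid title, existing page) can be sketched as follows. text_from is a hypothetical helper, the page hashes are made-up samples, and ArgumentError stands in for Sunflower::Error.

```ruby
# Hypothetical helper mirroring the branches of #preload_text.
# ArgumentError stands in for Sunflower::Error in this sketch.
def text_from(page)
  if page['missing']
    ''                              # nonexistent page: empty text
  elsif page['invalid']
    raise ArgumentError, 'title invalid'
  else
    page['revisions'][0]['*']       # wikitext of the fetched revision
  end
end

# Made-up sample entries. The flag is an empty string here, which is
# still truthy in Ruby, so the key's presence is what counts.
missing_page  = { 'title' => 'No such page', 'missing' => '' }
existing_page = { 'title' => 'Sandbox', 'revisions' => [{ '*' => 'Hello' }] }

p text_from(missing_page)
p text_from(existing_page)
```

A missing page yields an empty string rather than an error, so a new page can be created by assigning text and saving.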
#prepend(txt, newlines = 2) ⇒ Object
Prepends the given text followed by newlines (2 by default). Leading whitespace of the existing page text is stripped first.
    # File 'lib/sunflower/commontasks.rb', line 24
    def prepend txt, newlines=2
      self.text = txt + ("\n"*newlines) + self.text.lstrip
    end
#remove_category(cat) ⇒ Object
Remove the category from page wikitext.
Input can be either with the Category: prefix (or localised version) or without.
    # File 'lib/sunflower/commontasks.rb', line 117
    def remove_category cat
      cat_regex = self.sunflower.ns_regex_for 'Category'
      cat = self.sunflower.cleanup_title(cat).sub(/^#{cat_regex}:/, '')

      self.text.gsub!(/\[\[ *#{cat_regex} *: *#{Regexp.escape cat} *(\|[^\]]*)?\]\](\r?\n)?/, '')
    end
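A small standalone sketch of the removal regex follows. CATEGORY_NS is an assumed stand-in for what ns_regex_for would return, and remove_category_from is a hypothetical helper that skips the title cleanup step.

```ruby
# Assumption: stand-in for sunflower.ns_regex_for('Category').
CATEGORY_NS = /(?:Category|Kategoria)/i

# Hypothetical helper: the gsub from #remove_category, applied to a string.
def remove_category_from(text, cat)
  text.gsub(/\[\[ *#{CATEGORY_NS} *: *#{Regexp.escape cat} *(\|[^\]]*)?\]\](\r?\n)?/, '')
end

wikitext = "A\n[[Category:Foo|key]]\n[[Category:Bar]]\n"
p remove_category_from(wikitext, 'Foo')
```

The optional trailing newline is consumed together with the link, so removing a category does not leave an empty line behind.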
#replace(from, to, once = false) ⇒ Object
Replaces “from” with “to” in page text. “from” may be a regex. If once is true, only the first occurrence is replaced.
    # File 'lib/sunflower/commontasks.rb', line 6
    def replace from, to, once=false
      self.text = self.text.send( (once ? 'sub' : 'gsub'), from, to )
    end
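The once flag simply selects between String#sub and String#gsub. A minimal sketch on a plain string, using a hypothetical helper named replace_in:

```ruby
# Hypothetical helper: same dispatch as #replace, on a plain string.
def replace_in(text, from, to, once = false)
  text.send((once ? 'sub' : 'gsub'), from, to)
end

p replace_in('a-a-a', 'a', 'b')         # every occurrence
p replace_in('a-a-a', 'a', 'b', true)   # first occurrence only
p replace_in('a-a-a', /a$/, 'z')        # a regex works as well
```

The #sub and #gsub methods of Page are thin wrappers that call #replace with once set to true and false respectively.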
#save(title = @title, summary = @sunflower.summary) ⇒ Object Also known as: put
Save the modifications to this page, possibly under a different title. Default summary is this page’s Sunflower’s summary (see Sunflower#summary=). Default title is the current title.
Will not perform API request if no changes were made.
Will call #code_cleanup if Sunflower#always_do_code_cleanup is set.
Returns the JSON result of API call or nil when API call was not made.
    # File 'lib/sunflower/core.rb', line 554
    def save title=@title, summary=@sunflower.summary
      preload_attrs unless @preloaded_attrs

      raise Sunflower::Error, 'title invalid: '+title if title =~ INVALID_CHARS_REGEX
      raise Sunflower::Error, 'empty or no summary!' if !summary or summary==''

      if @orig_text==@text && title==@title
        @sunflower.log('Page '+title+' not saved - no changes.')
        return nil
      end


      self.code_cleanup if @sunflower.always_do_code_cleanup && self.respond_to?('code_cleanup')

      return @sunflower.API("action=edit&bot=1&title=#{CGI.escape(title)}&text=#{CGI.escape(@text)}&summary=#{CGI.escape(summary)}&token=#{CGI.escape(@edittoken)}")
    end
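The order of the checks that #save runs before it would send an edit request can be sketched without network access. save_decision is a hypothetical helper, TITLE_BLACKLIST re-declares the invalid-character regex, and ArgumentError stands in for Sunflower::Error.

```ruby
# Re-declared here for illustration; mirrors INVALID_CHARS_REGEX.
TITLE_BLACKLIST = Regexp.union(*%w(# < > [ ] | { }))

# Hypothetical helper: returns :skip when #save would return nil without
# calling the API, :edit when it would send the edit request, and raises
# where #save would raise (ArgumentError replaces Sunflower::Error).
def save_decision(orig_text, text, old_title, new_title, summary)
  raise ArgumentError, 'title invalid: ' + new_title if new_title =~ TITLE_BLACKLIST
  raise ArgumentError, 'empty or no summary!' if !summary || summary == ''
  return :skip if orig_text == text && new_title == old_title
  :edit
end

p save_decision('old', 'old', 'Sandbox', 'Sandbox', 'test edit')
p save_decision('old', 'new', 'Sandbox', 'Sandbox', 'test edit')
p save_decision('old', 'old', 'Sandbox', 'Sandbox 2', 'copy page')
```

Saving unchanged text under a different title still counts as a change, which is how a page's content can be copied to a new title.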
#sub(from, to) ⇒ Object
    # File 'lib/sunflower/commontasks.rb', line 12
    def sub from, to
      self.replace from, to, true
    end