wikiscript - scripts for Wikipedia (get the wikitext for a page, etc.)

Usage

Read-only access to Wikipedia pages. Example - Get wikitext source (via en.wikipedia.org/w/index.php?action=raw&title=<title>):

page = Wikiscript::Page.get( '2022_FIFA_World_Cup' )  # same as Wikiscript.get
page.text

prints

The '''2022 FIFA World Cup''' is scheduled to be the 22nd edition of the [[FIFA World Cup]],
the quadrennial international men's [[association football]] championship contested by the
[[List of men's national association football teams|national teams]] of the member associations of [[FIFA]].
It is scheduled to take place in [[Qatar]] in 2022. This will be the first World Cup ever to be held
in the [[Arab world]] and the first in a Muslim-majority country...
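
For illustration, the raw-wikitext request mentioned above could be built with Ruby's standard library roughly as follows. This is a minimal sketch, not wikiscript's actual implementation: the helper names raw_wikitext_uri and fetch_wikitext are hypothetical, and the https scheme is an assumption (the text above only names the host and path).

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: build the action=raw URL for a page title.
# Assumes https and the en.wikipedia.org host named above.
def raw_wikitext_uri(title)
  URI::HTTPS.build(
    host:  'en.wikipedia.org',
    path:  '/w/index.php',
    query: URI.encode_www_form(action: 'raw', title: title)
  )
end

# Hypothetical helper: download the wikitext as a string (needs network access).
def fetch_wikitext(title)
  Net::HTTP.get(raw_wikitext_uri(title))
end

puts raw_wikitext_uri('2022_FIFA_World_Cup')
# text = fetch_wikitext('2022_FIFA_World_Cup')   # uncomment to download
```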

Or build your own page from scratch (no download):

page = Wikiscript::Page.new( "The '''2022 FIFA World Cup''' is scheduled to be the 22nd edition of the [[FIFA World Cup]],\nthe quadrennial international men's [[association football]] championship contested by the\n[[List of men's national association football teams|national teams]] of the member associations of [[FIFA]].\nIt is scheduled to take place in [[Qatar]] in 2022. This will be the first World Cup ever to be held\nin the [[Arab world]] and the first in a Muslim-majority country...\n", title: '2022_FIFA_World_Cup' )
page.text

prints

The '''2022 FIFA World Cup''' is scheduled to be the 22nd edition of the [[FIFA World Cup]],
the quadrennial international men's [[association football]] championship contested by the
[[List of men's national association football teams|national teams]] of the member associations of [[FIFA]].
It is scheduled to take place in [[Qatar]] in 2022. This will be the first World Cup ever to be held
in the [[Arab world]] and the first in a Muslim-majority country...

Tables

Parse wiki tables into an array. Example:

table = Wikiscript.parse_table( "{|\n|-\n! header1\n! header2\n! header3\n|-\n| row1cell1\n| row1cell2\n| row1cell3\n|-\n| row2cell1\n| row2cell2\n| row2cell3\n|}\n" )

# -or-

table = Wikiscript.parse_table( "{|\n! header1 !! header2 !! header3\n|-\n| row1cell1 || row1cell2 || row1cell3\n|-\n| row2cell1 || row2cell2 || row2cell3\n|}\n" )

# -or-

table = Wikiscript.parse_table( "{|\n|-\n!\nheader1\n!\nheader2\n!\nheader3\n|-\n|\nrow1cell1\n|\nrow1cell2\n|\nrow1cell3\n|-\n|\nrow2cell1\n|\nrow2cell2\n|\nrow2cell3\n|}\n" )

resulting in:

pp table
#=> [["header1",   "header2",   "header3"],
#    ["row1cell1", "row1cell2", "row1cell3"],
#    ["row2cell1", "row2cell2", "row2cell3"]]
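
To make the three equivalent notations above more concrete, here is a minimal, self-contained sketch of a line-based table reader. It is not the code behind Wikiscript.parse_table; the name parse_simple_table is hypothetical, and the sketch assumes a single, non-nested table without captions or cell attributes.

```ruby
# Hypothetical sketch: read a simple wiki table into an array of rows.
def parse_simple_table(text)
  rows = []
  row  = []
  text.each_line do |raw|
    line = raw.strip
    if line.empty? || line.start_with?('{|')
      next                                # blank line or table start
    elsif line.start_with?('|}')
      break                               # table end
    elsif line.start_with?('|-')
      rows << row unless row.empty?       # row separator
      row = []
    elsif line.start_with?('!', '|')
      # header cells are split on "!!", data cells on "||"
      sep   = line.start_with?('!') ? '!!' : '||'
      cells = line[1..-1].split(sep).map(&:strip)
      cells = [''] if cells.empty?        # marker alone; text follows on next line
      row.concat(cells)
    elsif !row.empty?
      # continuation line: append the text to the last open cell
      row[-1] = [row[-1], line].reject(&:empty?).join(' ')
    end
  end
  rows << row unless row.empty?
  rows
end

one_per_line = "{|\n|-\n! header1\n! header2\n|-\n| row1cell1\n| row1cell2\n|}\n"
same_line    = "{|\n! header1 !! header2\n|-\n| row1cell1 || row1cell2\n|}\n"
next_line    = "{|\n|-\n!\nheader1\n!\nheader2\n|-\n|\nrow1cell1\n|\nrow1cell2\n|}\n"

p parse_simple_table(one_per_line)
p parse_simple_table(same_line) == parse_simple_table(next_line)
```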

Note: parse_table will strip/remove (leading) style attributes (e.g. attribute="value" |) and (inline) bold and italic emphases (e.g. '') from the (cell) text. Example:

table = Wikiscript.parse_table( "{|\n|-\n! style=\"width:200px;\"|Club\n! style=\"width:150px;\"|City\n|-\n|[[Biu Chun Rangers]]||[[Sham Shui Po]]\n|-\n|bgcolor=#ffff44 |''[[Eastern Sports Club|Eastern]]''||[[Mong Kok]]\n|-\n|[[HKFC Soccer Section]]||[[Happy Valley, Hong Kong|Happy Valley]]\n|}\n" )

resulting in:

pp table
#=> [["Club",                            "City"],
#    ["[[Biu Chun Rangers]]",            "[[Sham Shui Po]]"],
#    ["[[Eastern Sports Club|Eastern]]", "[[Mong Kok]]"],
#    ["[[HKFC Soccer Section]]",         "[[Happy Valley, Hong Kong|Happy Valley]]"]]
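
The clean-up described in the note could be approximated per cell with two substitutions. The following is a hedged sketch only; clean_cell is a hypothetical helper and the regular expressions are assumptions, not the patterns wikiscript uses internally.

```ruby
# Hypothetical sketch: strip leading attributes and emphasis quotes from a cell.
def clean_cell(cell)
  text = cell.strip
  # Drop a leading attribute list: any text without brackets, up to the first
  # single "|". The pipe inside a link such as [[A|B]] is left alone because
  # the match may not cross a "[" or "]".
  text = text.sub(/\A[^\[\]|]*\|(?!\|)/, '')
  # Drop bold/italic markers, i.e. runs of two or more apostrophes.
  text = text.gsub(/'{2,}/, '')
  text.strip
end

puts clean_cell('style="width:200px;"|Club')
puts clean_cell("bgcolor=#ffff44 |''[[Eastern Sports Club|Eastern]]''")
puts clean_cell('[[Happy Valley, Hong Kong|Happy Valley]]')
```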

Links

Split links into two parts. Note: The alternate link title is optional. Example:

link, title = Wikiscript.parse_link( '[[La Florida, Chile|La Florida]]' )
link   #=> "La Florida, Chile"
title  #=> "La Florida"

link, title = Wikiscript.parse_link( '[[ La Florida, Chile]]' )
link   #=> "La Florida, Chile"
title  #=> nil

link, title = Wikiscript.parse_link( 'La Florida' )
link   #=> nil
title  #=> nil
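
The splitting behaviour shown above can be reproduced with a single regular expression. This is an illustrative sketch under stated assumptions: split_link and LINK_RE are hypothetical names, and the pattern is not taken from the wikiscript source.

```ruby
# Hypothetical pattern: [[target]] or [[target|title]], with optional
# whitespace around both parts.
LINK_RE = /\A\[\[\s*([^\[\]|]+?)\s*(?:\|\s*([^\[\]]+?)\s*)?\]\]\z/

# Returns [link, title]; title is nil when absent, both are nil for non-links.
def split_link(text)
  m = LINK_RE.match(text.strip)
  m ? [m[1], m[2]] : [nil, nil]
end

p split_link('[[La Florida, Chile|La Florida]]')
p split_link('[[ La Florida, Chile]]')
p split_link('La Florida')
```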

Document Element Structure

Get the document's element structure. Note: For now only section headings (h1, h2, h3, ...) and tables are supported. Example:

nodes = Wikiscript.parse( "=Heading 1=\n==Heading 2==\n===Heading 3===\n\n{|\n|-\n! header1\n! header2\n! header3\n|-\n| row1cell1\n| row1cell2\n| row1cell3\n|-\n| row2cell1\n| row2cell2\n| row2cell3\n|}\n" )

pp nodes
#=> [[:h1, "Heading 1"],
#    [:h2, "Heading 2"],
#    [:h3, "Heading 3"],
#    [:table, [["header1", "header2", "header3"],
#              ["row1cell1", "row1cell2", "row1cell3"],
#              ["row2cell1", "row2cell2", "row2cell3"]]]]
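
As a rough illustration of how heading lines map to the [:h1, ...], [:h2, ...] nodes shown above, a heading's level can be read from the number of equals signs that wrap it. The helper parse_heading below is hypothetical and only a sketch; it is not how Wikiscript.parse is implemented.

```ruby
# Hypothetical sketch: turn "==Title==" into [:h2, "Title"]; nil for other lines.
def parse_heading(line)
  # \1 requires the closing run of "=" to repeat the opening run
  m = /\A(={1,6})\s*(.+?)\s*\1\s*\z/.match(line.strip)
  m ? [:"h#{m[1].size}", m[2]] : nil
end

text  = "=Heading 1=\n==Heading 2==\n===Heading 3===\n\nsome text\n"
nodes = text.each_line.map { |line| parse_heading(line) }.compact
p nodes
```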

That's all for now. More functionality will get added over time.

Install

Just install the gem:

$ gem install wikiscript

License

The wikiscript scripts are dedicated to the public domain. Use them as you please with no restrictions whatsoever.