Class: Poliqarp::Client
- Inherits: Object
- Hierarchy: Object, Poliqarp::Client
- Defined in: lib/poliqarpr/client.rb
Overview
- Author: Aleksander Pohl ([email protected])
- License: MIT License
This class is the implementation of the Poliqarp server client.
Instance Attribute Summary
-
#config ⇒ Object
readonly
The configuration of the client.
-
#debug(msg = nil) ⇒ Object
Prints the debug msg to the logger if debugging is turned on.
-
#logger ⇒ Object
Logger used for debugging.
Class Method Summary
-
.const_missing(const) ⇒ Object
A hint about installation of default corpus gem.
Instance Method Summary
-
#close ⇒ Object
Closes the opened session.
-
#close_corpus ⇒ Object
Closes the opened corpus.
-
#context(query, index) ⇒ Object
Returns the long context of the excerpt which is identified by given (query, index) pair.
-
#count(query) ⇒ Object
Returns the number of results for given query.
-
#find(query, options = {}) ⇒ Object
(also: #query)
Send the query to the opened corpus.
-
#initialize(session_name = "RUBY", debug = false) ⇒ Client
constructor
Creates new poliqarp server client.
-
#metadata(query, index) ⇒ Object
Returns the metadata of the excerpt which is identified by given (query, index) pair.
-
#metadata_types ⇒ Object
TODO.
-
#new_session(port = 4567) ⇒ Object
Creates new session for the client with the name given in constructor.
-
#open_corpus(path, &handler) ⇒ Object
Asynchronously opens the corpus given as path.
-
#ping ⇒ Object
Server diagnostics – the result should be :pong.
-
#stats ⇒ Object
Returns corpus statistics: :segment_tokens, the number of segments in the corpus (two segments which look exactly the same are counted separately); :segment_types, the number of segment types in the corpus (two segments which look exactly the same are counted as one type); :lemmata, the number of lemmata (lexemes) types (all forms of an inflected word, e.g. ‘kot’, ‘kotu’, … are treated as one “word” – lemmata); :tags, the number of different grammar tags (each combination of atomic tags is treated as a different “tag”).
-
#tagset ⇒ Object
Returns the tag-set used in the corpus.
-
#version ⇒ Object
Returns server version.
Constructor Details
#initialize(session_name = "RUBY", debug = false) ⇒ Client
Creates new poliqarp server client.
Parameters:
-
session_name
the name of the client session. Defaults to “RUBY”.
-
debug
if set to true, all messages sent to and received from the server are printed to standard output. Defaults to false.
# File 'lib/poliqarpr/client.rb', line 24
def initialize(session_name="RUBY", debug=false)
  @session_name = session_name
  @debug = debug
  @logger = STDOUT
  @connector = Connector.new(self)
  @config = Config.new(self,5000)
  @answer_queue = Queue.new
  @waiting_mutex = Mutex.new
  @query_mutex = Mutex.new
  new_session
  config.left_context_size = 5
  config.right_context_size = 5
  config. = []
  config.lemmata = []
end
Instance Attribute Details
#config ⇒ Object (readonly)
The configuration of the client.
# File 'lib/poliqarpr/client.rb', line 16
def config
  @config
end
#debug(msg = nil) ⇒ Object
Prints the debug msg to the logger if debugging is turned on. Accepts either a regular message or a block that returns the message. The block form is provided for messages which aren’t cheap to build.
# File 'lib/poliqarpr/client.rb', line 75
def debug(msg=nil)
  if @debug
    if block_given?
      msg = yield
    end
    logger.puts msg
    logger.flush
  end
end
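The block form described above can be tried without a server. This is a minimal sketch: `DebugSketch` is a hypothetical stand-in class (not part of the gem) that mirrors the shape of `#debug`, and a `StringIO` replaces the default STDOUT logger so the output can be inspected.

```ruby
require 'stringio'

# Hypothetical stand-in for the client's debug logging; not the gem's class.
class DebugSketch
  attr_accessor :logger

  def initialize(debug = false, logger = STDOUT)
    @debug = debug
    @logger = logger
  end

  # Same shape as Client#debug: the block is evaluated only when debugging is on.
  def debug(msg = nil)
    if @debug
      msg = yield if block_given?
      logger.puts msg
      logger.flush
    end
  end
end

built = 0   # counts how many times the "expensive" message was built

quiet = DebugSketch.new(false, StringIO.new)
quiet.debug { built += 1; "expensive message" }   # block is never run

loud = DebugSketch.new(true, StringIO.new)
loud.debug("plain message")
loud.debug { built += 1; "expensive message" }    # block runs once
```

With debugging off, the block is skipped entirely, so the cost of building the message is paid only when it will actually be printed.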
#logger ⇒ Object
Logger used for debugging. STDOUT by default.
# File 'lib/poliqarpr/client.rb', line 13
def logger
  @logger
end
Class Method Details
.const_missing(const) ⇒ Object
A hint about installation of default corpus gem
# File 'lib/poliqarpr/client.rb', line 41
def self.const_missing(const)
  if const.to_s =~ /DEFAULT_CORPUS/
    raise "You need to install gem 'poliqarpr-corpus' to use the default corpus"
  end
  super
end
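The effect of this hook can be seen in isolation. In the sketch below, `CorpusHintSketch` is a hypothetical module standing in for the client class; only the `const_missing` body is taken from the listing above.

```ruby
# Hypothetical module standing in for Poliqarp::Client; only the hook is mirrored.
module CorpusHintSketch
  def self.const_missing(const)
    if const.to_s =~ /DEFAULT_CORPUS/
      raise "You need to install gem 'poliqarpr-corpus' to use the default corpus"
    end
    super
  end
end

# Referencing the missing DEFAULT_CORPUS constant produces the installation hint.
hint = begin
  CorpusHintSketch::DEFAULT_CORPUS
rescue RuntimeError => e
  e.message
end

# Any other missing constant falls through to Ruby's usual NameError.
other = begin
  CorpusHintSketch::SOMETHING_ELSE
rescue NameError => e
  e.class
end
```

Calling `super` for every other name keeps Ruby's normal behaviour for typos and genuinely undefined constants.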
Instance Method Details
#close ⇒ Object
Closes the opened session.
# File 'lib/poliqarpr/client.rb', line 62
def close
  talk "CLOSE-SESSION"
  @session = false
end
#close_corpus ⇒ Object
Closes the opened corpus.
# File 'lib/poliqarpr/client.rb', line 68
def close_corpus
  talk "CLOSE"
end
#context(query, index) ⇒ Object
Returns the long context of the excerpt which is identified by given (query, index) pair.
# File 'lib/poliqarpr/client.rb', line 198
def context(query,index)
  make_query(query)
  result = []
  talk "GET-CONTEXT #{index}"
  # 1st part
  result << read_word
  # 2nd part
  result << read_word
  # 3rd part
  result << read_word
  # 4th part
  result << read_word
  result
end
#count(query) ⇒ Object
Returns the number of results for given query.
# File 'lib/poliqarpr/client.rb', line 192
def count(query)
  count_results(make_query(query))
end
#find(query, options = {}) ⇒ Object Also known as: query
Send the query to the opened corpus.
Options:
-
index
the index of the (only one) result to be returned. The index is relative to the beginning of the query result. Normally you should first query the corpus without specifying the index, to see what results are returned. Then you can use the index and the same query to retrieve one result. The pair (query, index) is a kind of unique identifier of the excerpt.
-
page_size
the size of the page of results. If the page size is 0, then all results are returned on one page. It is ignored if the index option is present. Defaults to 0.
-
page_index
the index of the page of results (the first page has index 1, not 0). It is ignored if the index option is present. Defaults to 1.
# File 'lib/poliqarpr/client.rb', line 181
def find(query,options={})
  if options[:index]
    find_one(query, options[:index])
  else
    find_many(query, options)
  end
end
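How the three options interact can be illustrated with an in-memory stand-in. This is only a sketch of the documented semantics, not the gem's `find_one` or `find_many`: `RESULTS` and `find_sketch` are hypothetical, and treating `index` as zero-based is an assumption.

```ruby
# Hypothetical in-memory result list; the real client asks the server instead.
RESULTS = %w[r0 r1 r2 r3 r4].freeze

def find_sketch(options = {})
  if options[:index]
    # :index wins; the paging options are ignored (zero-based here by assumption).
    RESULTS[options[:index]]
  else
    page_size  = options[:page_size]  || 0
    page_index = options[:page_index] || 1
    return RESULTS.dup if page_size == 0   # 0 means "everything on one page"
    # Pages are numbered from 1, so page N is slice N - 1.
    RESULTS.each_slice(page_size).to_a[page_index - 1] || []
  end
end

all_results = find_sketch
second_page = find_sketch(page_size: 2, page_index: 2)
single      = find_sketch(index: 1, page_size: 2)
```

The same (query, index) pair that selects a single result here is what `#context` and `#metadata` take as their arguments.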
#metadata(query, index) ⇒ Object
Returns the metadata of the excerpt which is identified by given (query, index) pair.
# File 'lib/poliqarpr/client.rb', line 215
def metadata(query, index)
  make_query(query)
  result = {}
  answer = talk("METADATA #{index}")
  count = answer.split(" ")[1].to_i
  count.times do |index|
    type = read_word.gsub(/[^a-zA-Z]/,"").to_sym
    value = read_word[2..-1]
    unless value.nil?
      result[type] ||= []
      result[type] << value
    end
  end
  result
end
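The shape of the returned hash follows from the grouping step in the listing above: each metadata type maps to an array of values, and entries without a value are skipped. A minimal sketch of that step, using hypothetical (type, value) pairs in place of words read from the server:

```ruby
# Hypothetical (type, value) pairs, as if already read and cleaned up.
pairs = [
  [:title,  "Example text"],
  [:author, "First Author"],
  [:author, "Second Author"],
  [:date,   nil]               # a type without a value is skipped
]

result = {}
pairs.each do |type, value|
  next if value.nil?
  result[type] ||= []          # create the list on first use
  result[type] << value
end
```

Because values are always collected into arrays, a caller reads a single-valued field as `result[:title].first` rather than `result[:title]`.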
#metadata_types ⇒ Object
TODO
# File 'lib/poliqarpr/client.rb', line 141
def metadata_types
  raise "Not implemented"
end
#new_session(port = 4567) ⇒ Object
Creates a new session for the client with the name given in the constructor. If a session is already open, it is closed first.
Parameters:
-
port
- the port on which the poliqarpd server is accepting connections (defaults to 4567)
# File 'lib/poliqarpr/client.rb', line 53
def new_session(port=4567)
  close if @session
  @connector.open("localhost",port)
  talk("MAKE-SESSION #{@session_name}")
  resize_buffer(config.buffer_size)
  @session = true
end
#open_corpus(path, &handler) ⇒ Object
Asynchronously opens the corpus given as path. To open the default corpus, pass :default as the argument.
If you don’t want to wait until the call is finished, you have to provide a handler for the asynchronous answer.
# File 'lib/poliqarpr/client.rb', line 90
def open_corpus(path, &handler)
  if path == :default
    open_corpus(DEFAULT_CORPUS, &handler)
  else
    result = talk("OPEN #{path}", :async, &handler)
    if result == "OPENED"
      result
    else
      raise PoliqarpException.new(result)
    end
  end
end
#ping ⇒ Object
Server diagnostics – the result should be :pong
# File 'lib/poliqarpr/client.rb', line 104
def ping
  :pong if talk("PING") =~ /PONG/
end
#stats ⇒ Object
Returns corpus statistics:
-
:segment_tokens
the number of segments in the corpus (two segments which look exactly the same are counted separately)
-
:segment_types
the number of segment types in the corpus (two segments which look exactly the same are counted as one type)
-
:lemmata
the number of lemmata (lexemes) types (all forms of an inflected word, e.g. ‘kot’, ‘kotu’, … are treated as one “word” – lemmata)
-
:tags
the number of different grammar tags (each combination of atomic tags is treated as a different “tag”)
# File 'lib/poliqarpr/client.rb', line 123
def stats
  stats = {}
  talk("CORPUS-STATS").split.each_with_index do |value, index|
    case index
    when 1
      stats[:segment_tokens] = value.to_i
    when 2
      stats[:segment_types] = value.to_i
    when 3
      stats[:lemmata] = value.to_i
    when 4
      stats[:tags] = value.to_i
    end
  end
  stats
end
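The positional parsing in the listing above can be tried on a sample string. This is a sketch only: the `reply` text, including its leading token and all numbers, is made up for illustration and is not a documented server response; the lookup table is an equivalent rewrite of the `case` statement.

```ruby
# Hypothetical reply; the leading token and the numbers are invented.
reply = "OK 1000 400 250 60"

# Position in the reply => key in the resulting hash (position 0 is skipped).
keys = { 1 => :segment_tokens, 2 => :segment_types, 3 => :lemmata, 4 => :tags }

stats = {}
reply.split.each_with_index do |value, index|
  key = keys[index]
  stats[key] = value.to_i if key
end
```

Any fields beyond the fourth number are ignored, as in the original method.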
#tagset ⇒ Object
Returns the tag-set used in the corpus. It is divided into two groups:
-
:categories
lists the tags belonging to each grammatical category (each category has a list of its tags, e.g. gender: m1 m2 m3 f n means that there are 5 genders: masculine (1, 2, 3), feminine and neuter)
-
:classes
lists the grammatical categories used to describe each class (each class has a list of tags used to describe it, e.g. adj: degree gender case number means that adjectives are described in terms of degree, gender, case and number)
# File 'lib/poliqarpr/client.rb', line 154
def tagset
  answer = talk("GET-TAGSET")
  counters = answer.split
  result = {}
  [:categories, :classes].each_with_index do |type, type_index|
    result[type] = {}
    counters[type_index+1].to_i.times do |index|
      values = read_word.split
      result[type][values[0].to_sym] = values[1..-1].map{|v| v.to_sym}
    end
  end
  result
end
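The nested hash this method builds can be illustrated without a server. In the sketch below, `header` and `lines` are hypothetical stand-ins for the server answer and the subsequent `read_word` calls; the exact wire format is an assumption, and the sample entries reuse the gender and adj examples from the description.

```ruby
# Hypothetical server output: a header with two counters, then one line per entry.
header = "OK 1 1"
lines  = ["gender m1 m2 m3 f n", "adj degree gender case number"]

counters = header.split
queue = lines.dup            # stands in for successive read_word calls
result = {}
[:categories, :classes].each_with_index do |type, type_index|
  result[type] = {}
  counters[type_index + 1].to_i.times do
    values = queue.shift.split
    # First word is the entry name, the rest are its tags.
    result[type][values[0].to_sym] = values[1..-1].map(&:to_sym)
  end
end
```

Both groups use symbols throughout, so a lookup such as `result[:categories][:gender]` returns an array of tag symbols.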
#version ⇒ Object
Returns server version
# File 'lib/poliqarpr/client.rb', line 109
def version
  talk("VERSION")
end