Class: Poliqarp::Client

Inherits:
Object
Defined in:
lib/poliqarpr/client.rb

Overview

Author

Aleksander Pohl ([email protected])

License

MIT License

This class implements the Poliqarp server client.

Constructor Details

#initialize(session_name = "RUBY", debug = false) ⇒ Client

Creates a new Poliqarp server client.

Parameters:

  • session_name the name of the client session. Defaults to “RUBY”.

  • debug if set to true, all messages sent to and received from the server are printed to standard output. Defaults to false.



# File 'lib/poliqarpr/client.rb', line 24

def initialize(session_name="RUBY", debug=false)
  @session_name = session_name
  @debug = debug
  @logger = STDOUT
  @connector = Connector.new(self)
  @config = Config.new(self,5000)
  @answer_queue = Queue.new
  @waiting_mutex = Mutex.new
  @query_mutex = Mutex.new
  new_session
  config.left_context_size = 5
  config.right_context_size = 5
  config.tags = []
  config.lemmata = []
end

Instance Attribute Details

#config ⇒ Object (readonly)

The configuration of the client.



# File 'lib/poliqarpr/client.rb', line 16

def config
  @config
end

#debug(msg = nil) ⇒ Object

Prints the debug message to the logger if debugging is turned on. Accepts either a regular message or a block that returns the message. The second form is provided for messages which aren’t cheap to build.



# File 'lib/poliqarpr/client.rb', line 75

def debug(msg=nil)
  if @debug
    if block_given?
      msg = yield
    end
    logger.puts msg
    logger.flush
  end
end
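
The benefit of the block form can be illustrated with a self-contained sketch. SketchClient below is a hypothetical stand-in, not the gem's class; it only copies the debug logic shown above and logs to a StringIO so the effect is easy to inspect.

```ruby
require 'stringio'

# Hypothetical stand-in for the client; only the debug logic is reproduced.
class SketchClient
  attr_reader :logger

  def initialize(debug, logger = StringIO.new)
    @debug = debug
    @logger = logger
  end

  # Same logic as the debug method above.
  def debug(msg = nil)
    if @debug
      msg = yield if block_given?
      logger.puts msg
      logger.flush
    end
  end
end

built = 0

# Debugging off: the block is never called, so the message is never built.
quiet = SketchClient.new(false)
quiet.debug { built += 1; "expensive message" }

# Debugging on: both forms end up in the logger.
verbose = SketchClient.new(true)
verbose.debug("plain message")
verbose.debug { built += 1; "expensive message" }
```

With debugging turned off, the block is skipped entirely, which is the reason to prefer it for messages that require string interpolation over large data.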

#logger ⇒ Object

Logger used for debugging. STDOUT by default.



# File 'lib/poliqarpr/client.rb', line 13

def logger
  @logger
end

Class Method Details

.const_missing(const) ⇒ Object

Raises an error with a hint to install the default corpus gem when DEFAULT_CORPUS is referenced but not defined.



# File 'lib/poliqarpr/client.rb', line 41

def self.const_missing(const)
  if const.to_s =~ /DEFAULT_CORPUS/
    raise "You need to install gem 'poliqarpr-corpus' to use the default corpus"
  end
  super
end
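
The following self-contained sketch shows how this hook behaves. SketchCorpus and its default_path method are hypothetical names introduced only for illustration; the const_missing body is copied from the method above.

```ruby
# Hypothetical module used only for this sketch.
module SketchCorpus
  class Client
    # Same logic as the const_missing method above.
    def self.const_missing(const)
      if const.to_s =~ /DEFAULT_CORPUS/
        raise "You need to install gem 'poliqarpr-corpus' to use the default corpus"
      end
      super
    end

    def default_path
      DEFAULT_CORPUS # not defined here, so const_missing is invoked
    end
  end
end

hint = nil
other = nil

begin
  SketchCorpus::Client.new.default_path
rescue RuntimeError => e
  hint = e.message # the installation hint
end

begin
  SketchCorpus::Client::OTHER
rescue NameError => e
  other = e.class # any other missing constant falls through to super
end
```

Only constants whose name matches DEFAULT_CORPUS produce the hint; every other missing constant still raises the usual NameError.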

Instance Method Details

#close ⇒ Object

Closes the opened session.



# File 'lib/poliqarpr/client.rb', line 62

def close
  talk "CLOSE-SESSION"
  @session = false
end

#close_corpus ⇒ Object

Closes the opened corpus.



# File 'lib/poliqarpr/client.rb', line 68

def close_corpus
  talk "CLOSE"
end

#context(query, index) ⇒ Object

Returns the long context of the excerpt identified by the given (query, index) pair.



# File 'lib/poliqarpr/client.rb', line 198

def context(query,index)
  make_query(query)
  result = []
  talk "GET-CONTEXT #{index}"
  # 1st part
  result << read_word
  # 2nd part
  result << read_word
  # 3rd part
  result << read_word
  # 4th part
  result << read_word
  result
end

#count(query) ⇒ Object

Returns the number of results for the given query.



# File 'lib/poliqarpr/client.rb', line 192

def count(query)
  count_results(make_query(query))
end

#find(query, options = {}) ⇒ Object Also known as: query

Sends the query to the opened corpus.

Options:

  • index the index of the single result to be returned. The index is relative to the beginning of the query result. Normally you should first query the corpus without specifying the index to see which results are returned, then use the index with the same query to retrieve one result. The pair (query, index) serves as a unique identifier of the excerpt.

  • page_size the size of the page of results. If the page size is 0, all results are returned on one page. Ignored if the index option is present. Defaults to 0.

  • page_index the index of the page of results (the first page has index 1, not 0). Ignored if the index option is present. Defaults to 1.



# File 'lib/poliqarpr/client.rb', line 181

def find(query,options={})
  if options[:index]
    find_one(query, options[:index])
  else
    find_many(query, options)
  end
end
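
The dispatch on the index option can be tried out with a self-contained sketch. SketchFinder is a hypothetical stand-in: its find_one and find_many are stubs that only report which branch was taken, whereas the real methods talk to the server. The defaults applied in the stub are the documented ones; how the gem applies them internally is an assumption.

```ruby
# Hypothetical stand-in for the client; find_one and find_many are stubs.
class SketchFinder
  # Same dispatch as the find method above.
  def find(query, options = {})
    if options[:index]
      find_one(query, options[:index])
    else
      find_many(query, options)
    end
  end
  alias query find

  private

  def find_one(query, index)
    [:one, query, index]
  end

  def find_many(query, options)
    # Documented defaults: page_size 0 (one page with everything), page_index 1.
    page_size  = options[:page_size]  || 0
    page_index = options[:page_index] || 1
    [:many, query, page_size, page_index]
  end
end

finder = SketchFinder.new
all    = finder.find("kot")
page   = finder.find("kot", :page_size => 10, :page_index => 2)
single = finder.query("kot", :index => 3, :page_size => 10) # page_size is ignored
first  = finder.find("kot", :index => 0)
```

Because 0 is truthy in Ruby, passing :index => 0 also selects the single-result branch.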

#metadata(query, index) ⇒ Object

Returns the metadata of the excerpt identified by the given (query, index) pair.



# File 'lib/poliqarpr/client.rb', line 215

def metadata(query, index)
  make_query(query)
  result = {}
  answer = talk("METADATA #{index}")
  count = answer.split(" ")[1].to_i
  count.times do
    type = read_word.gsub(/[^a-zA-Z]/,"").to_sym
    value = read_word[2..-1]
    unless value.nil?
      result[type] ||= []
      result[type] << value
    end
  end
  result
end
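
How the result hash is assembled can be sketched without a server. The words array below is made up for illustration, since the wire format of poliqarpd is not described here; collect_metadata is a hypothetical helper that applies the same per-entry logic as the method above to an in-memory list instead of calling read_word.

```ruby
# Made-up (type, value) word pairs; not actual server output.
words = [
  "title:",  "T Pan Tadeusz",
  "author:", "T Adam Mickiewicz",
  "author:", "T Jan Kowalski",
  "year:",   "D"
]

# Hypothetical helper mirroring the loop body of the metadata method.
def collect_metadata(words)
  result = {}
  words.each_slice(2) do |raw_type, raw_value|
    type  = raw_type.gsub(/[^a-zA-Z]/, "").to_sym # keep letters only
    value = raw_value[2..-1]                      # drop the first two characters
    unless value.nil?
      result[type] ||= []
      result[type] << value
    end
  end
  result
end

metadata = collect_metadata(words)
```

Values of the same type are accumulated in an array, and an entry whose value word is shorter than two characters yields nil and is skipped.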

#metadata_types ⇒ Object

TODO



# File 'lib/poliqarpr/client.rb', line 141

def metadata_types
  raise "Not implemented"
end

#new_session(port = 4567) ⇒ Object

Creates a new session for the client with the name given in the constructor. If a session is already open, it is closed first.

Parameters:

  • port - the port on which the poliqarpd server is accepting connections (defaults to 4567)



# File 'lib/poliqarpr/client.rb', line 53

def new_session(port=4567)
  close if @session
  @connector.open("localhost",port)
  talk("MAKE-SESSION #{@session_name}")
  resize_buffer(config.buffer_size)
  @session = true
end

#open_corpus(path, &handler) ⇒ Object

Asynchronous. Opens the corpus given as path. To open the default corpus, pass :default as the argument.

If you don’t want to wait until the call is finished, you have to provide a handler for the asynchronous answer.



# File 'lib/poliqarpr/client.rb', line 90

def open_corpus(path, &handler)
  if path == :default
    open_corpus(DEFAULT_CORPUS, &handler)
  else
    result = talk("OPEN #{path}", :async, &handler)
    if result == "OPENED"
      result
    else
      raise PoliqarpException.new(result)
    end
  end
end

#ping ⇒ Object

Server diagnostics. The result should be :pong; nil is returned if the server's answer does not contain PONG.



# File 'lib/poliqarpr/client.rb', line 104

def ping
  :pong if talk("PING") =~ /PONG/
end

#stats ⇒ Object

Returns corpus statistics:

  • :segment_tokens the number of segments in the corpus (two segments which look exactly the same are counted separately)

  • :segment_types the number of segment types in the corpus (two segments which look exactly the same are counted as one type)

  • :lemmata the number of lemma (lexeme) types (all inflected forms of a word, e.g. ‘kot’, ‘kotu’, …, are treated as one lemma)

  • :tags the number of different grammar tags (each combination of atomic tags is treated as different “tag”)



# File 'lib/poliqarpr/client.rb', line 123

def stats
  stats = {}
  talk("CORPUS-STATS").split.each_with_index do |value, index|
    case index
    when 1
      stats[:segment_tokens] = value.to_i
    when 2
      stats[:segment_types] = value.to_i
    when 3
      stats[:lemmata] = value.to_i
    when 4
      stats[:tags] = value.to_i
    end
  end
  stats
end
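
The positional parsing can be tried in isolation. In the sketch below, the answer string is hypothetical (a status word followed by four counters), and parse_stats and STAT_KEYS are illustrative names; the mapping of positions 1 to 4 onto keys follows the case statement above.

```ruby
# Position in the split answer => key in the resulting hash.
STAT_KEYS = {
  1 => :segment_tokens,
  2 => :segment_types,
  3 => :lemmata,
  4 => :tags
}

# Hypothetical helper equivalent to the case statement in stats.
def parse_stats(answer)
  stats = {}
  answer.split.each_with_index do |value, index|
    key = STAT_KEYS[index]
    stats[key] = value.to_i if key # position 0 and positions above 4 are ignored
  end
  stats
end

# Made-up answer; not actual server output.
full    = parse_stats("OK 1000 250 120 40")
partial = parse_stats("OK 7")
```

As in the original method, a shorter answer simply yields a hash with fewer keys rather than an error.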

#tagset ⇒ Object

Returns the tagset used in the corpus. It is divided into two groups:

  • :categories lists the grammatical categories (each category has a list of its tags, e.g. gender: m1 m2 m3 f n means that there are 5 genders: masculine (1, 2, 3), feminine and neuter)

  • :classes lists the grammatical classes (each class has a list of the categories used to describe it, e.g. adj: degree gender case number means that adjectives are described in terms of degree, gender, case and number)



# File 'lib/poliqarpr/client.rb', line 154

def tagset
  answer = talk("GET-TAGSET")
  counters = answer.split
  result = {}
  [:categories, :classes].each_with_index do |type, type_index|
    result[type] = {}
    counters[type_index+1].to_i.times do |index|
      values = read_word.split
      result[type][values[0].to_sym] = values[1..-1].map{|v| v.to_sym}
    end
  end
  result
end
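
The shape of the returned hash is easiest to see on sample data. The header and lines below are invented for this sketch (the actual server output is not shown here), and parse_tagset is a hypothetical helper that runs the same loop as the method above over an in-memory list instead of read_word.

```ruby
# Made-up input: a header with two counters, then one line per entry.
header = "OK 2 1"
lines  = [
  "gender m1 m2 m3 f n",
  "number sg pl",
  "adj degree gender case number"
]

# Hypothetical helper mirroring the loop in tagset.
def parse_tagset(header, lines)
  counters = header.split
  pending  = lines.dup
  result   = {}
  [:categories, :classes].each_with_index do |type, type_index|
    result[type] = {}
    # The counter at position type_index + 1 says how many lines belong to this group.
    counters[type_index + 1].to_i.times do
      values = pending.shift.split
      result[type][values[0].to_sym] = values[1..-1].map { |v| v.to_sym }
    end
  end
  result
end

tagset = parse_tagset(header, lines)
```

Both groups map a symbol to an array of symbols, so the result can be queried with expressions such as tagset[:categories][:gender].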

#version ⇒ Object

Returns the server version.



# File 'lib/poliqarpr/client.rb', line 109

def version
  talk("VERSION")
end