KvgCharacterRecognition

The KvgCharacterRecognition module contains a CJK character recognition engine that uses pattern/template matching to recognize handwritten character patterns independently of stroke order and stroke number. A pattern has the format [stroke1, stroke2, ...], and a stroke is an array of points in the format [[x1, y1], [x2, y2], ...]. For templates, we use SVG data from the KanjiVG project.
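
As an illustration of the input format described above, here is a minimal sketch of a two-stroke pattern (a rough plus sign). The coordinates and the `valid_*?` helpers are hypothetical and not part of the gem; they only check the nesting of pattern, strokes and points.

```ruby
# Hypothetical input pattern: two strokes, each an array of [x, y] points.
horizontal = [[50, 150], [150, 150], [250, 150]]
vertical   = [[150, 50], [150, 150], [150, 250]]
pattern    = [horizontal, vertical]

# Hypothetical helpers (not provided by the gem) that verify the structure
# pattern -> strokes -> [x, y] points.
def valid_point?(point)
  point.is_a?(Array) && point.size == 2 && point.all? { |c| c.is_a?(Numeric) }
end

def valid_stroke?(stroke)
  stroke.is_a?(Array) && !stroke.empty? && stroke.all? { |pt| valid_point?(pt) }
end

def valid_pattern?(pattern)
  pattern.is_a?(Array) && !pattern.empty? && pattern.all? { |s| valid_stroke?(s) }
end

puts valid_pattern?(pattern)
```

A flat list of points such as [[1, 2], [3, 4]] is a single stroke, not a pattern, so it has to be wrapped in another array before it can serve as input.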

The engine takes 3 steps to perform the recognition of an input pattern.
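
To make these steps more concrete, here is a rough sketch of parts of preprocessing and of the heatmap feature. It is not the gem's implementation: all method names are hypothetical, normalization is a simple bounding-box scaling, smoothing is skipped, and the heatmap holds raw counts without smoothing. The default values are taken from the Configuration section.

```ruby
# Hypothetical sketch, not the gem's code.
SIZE = 109 # canvas size, same value as `size` in the default configuration

# Scale and shift all points so the bounding box fits a SIZE x SIZE canvas
# (aspect ratio is preserved).
def normalize(strokes, size = SIZE)
  points = strokes.flatten(1)
  xs = points.map(&:first)
  ys = points.map(&:last)
  min_x = xs.min
  min_y = ys.min
  extent = [xs.max - min_x, ys.max - min_y, 1e-9].max # avoid division by zero
  scale = size.to_f / extent
  strokes.map { |s| s.map { |x, y| [(x - min_x) * scale, (y - min_y) * scale] } }
end

# Insert points on each segment so neighbours are at most `distance` apart.
def interpolate(stroke, distance = 0.8)
  return stroke.dup if stroke.size < 2
  result = [stroke.first]
  stroke.each_cons(2) do |(x1, y1), (x2, y2)|
    steps = (Math.hypot(x2 - x1, y2 - y1) / distance).ceil
    (1..steps).each do |i|
      t = i.to_f / steps
      result << [x1 + (x2 - x1) * t, y1 + (y2 - y1) * t]
    end
  end
  result
end

# Keep every `interval`-th point and always keep the end point of the stroke.
def downsample(stroke, interval = 4)
  last = stroke.size - 1
  stroke.each_with_index
        .select { |_, i| (i % interval).zero? || i == last }
        .map(&:first)
end

# Count the data points that fall into each cell of a grid x grid raster.
def heatmap(strokes, grid = 20, size = SIZE)
  cell = size.to_f / grid
  map = Array.new(grid) { Array.new(grid, 0) }
  strokes.flatten(1).each do |x, y|
    col = (x / cell).floor.clamp(0, grid - 1)
    row = (y / cell).floor.clamp(0, grid - 1)
    map[row][col] += 1
  end
  map
end

raw      = [[[0, 0], [300, 0]], [[150, 0], [150, 300]]]
prepared = normalize(raw).map { |s| downsample(interpolate(s)) }
counts   = heatmap(prepared)
puts counts.flatten.sum
```

Interpolating before downsampling gives roughly evenly spaced points regardless of how fast the stroke was drawn, which is why the two steps appear together in this sketch.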

  1. Preprocessing: The preprocessing step consists of smoothing, normalizing, interpolating and downsampling the data points.
  2. Feature Extraction: A smoothed heatmap, significant points and directional feature densities are used as features. A heatmap divides the input pattern into a grid of small cells and stores the number of data points in each cell. Significant points are the start and end points of a stroke and points on a curve or edge. Directional feature densities are introduced in the paper "On-line Recognition of Freely Handwritten Japanese Character Using Directional Feature Density".
  3. Matching: We use the significant points to perform a coarse recognition of the input pattern, which filters out template patterns that are far from the input pattern. Next, a mixed distance score of directional feature density and smoothed heatmap is calculated.

Installation

Add this line to your application's Gemfile:

    gem 'kvg_character_recognition'

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install kvg_character_recognition

Usage

  1. Create a database (e.g. using sqlite3 data.db).

  2. Set up the characters table in the database and populate it with KanjiVG templates from the XML release:

    require 'kvg_character_recognition'

    KvgCharacterRecognition::Database.setup
    KvgCharacterRecognition::Database.populate_from_xml "kanjivg-20150615-2.xml"

  3. Recognition

Use an input field of size 300x300 for the best recognition accuracy. The input pattern in the example is the character 

Configuration

You can try out different parameters to adapt the extracted features to your input settings, e.g. a different sampling rate or canvas size. Don't forget to redo the whole database step (setup and population) after changing the configuration.

    # this is the default configuration
    config = {
      size: 109, # fixed canvas size of KanjiVG data
      downsample_interval: 4,
      interpolate_distance: 0.8,
      direction_grid: 15,
      smoothed_heatmap_grid: 20,
      significant_points_heatmap_grid: 3
    }

    # from hash
    KvgCharacterRecognition.configure(config)
    # from YAML file
    KvgCharacterRecognition.configure_with(path_to_yml)

    # configure database with a YAML file
    # TODO why is postgres slower than sqlite?
    KvgCharacterRecognition.configure_database(path_to_yml)
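
For the YAML variant, a configuration file could be generated from the default values with Ruby's standard yaml library, as sketched below. This assumes the file uses the same key names as the hash; the README does not state the expected file layout or whether keys are read as strings or symbols, so check the gem's source before relying on it. The file name kvg_config.yml is hypothetical.

```ruby
require 'yaml'
require 'tmpdir'

# Default values copied from the configuration hash above.
# Assumption: the YAML file uses the same key names, written as plain strings.
defaults = {
  'size' => 109,
  'downsample_interval' => 4,
  'interpolate_distance' => 0.8,
  'direction_grid' => 15,
  'smoothed_heatmap_grid' => 20,
  'significant_points_heatmap_grid' => 3
}

# Hypothetical file location; a temp dir keeps the sketch side-effect free.
path = File.join(Dir.tmpdir, 'kvg_config.yml')
File.write(path, defaults.to_yaml)

# Read the file back to confirm that it round-trips.
loaded = YAML.load_file(path)
puts loaded == defaults
```

The resulting path is what would be passed as path_to_yml in the calls above.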

Development

After checking out the repo, run bin/setup to install dependencies. Then, run rake spec to run the tests. You can also run bin/console for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run bundle exec rake install. To release a new version, update the version number in version.rb, and then run bundle exec rake release, which will create a git tag for the version, push git commits and tags, and push the .gem file to rubygems.org.

Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/kvg_character_recognition.

License

The gem is available as open source under the terms of the MIT License.