
This gem was last updated on the 05.04.2024 (dd.mm.yyyy notation), at 20:00:41 o'clock.

Environment variable to use a specific yaml file

You can use an environment variable to denote the default yaml file in use. This allows you to use your own yaml file format, rather than the yaml files that are distributed with this gem.

The name of that environment variable must be DICTIONARIES_FILE. It shall point to your yaml file that holds the key-value pairs.

For example, if your file is at /opt/czech.yml, then DICTIONARIES_FILE should point at that location.

In bash, this may be equivalent to:

export DICTIONARIES_FILE=/opt/czech.yml
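On the ruby side, reading such a variable could look roughly like the sketch below. This is only an illustration and not the gem's actual loading code: the method name load_dictionary_file and the fallback file name are made up for this example; only DICTIONARIES_FILE comes from the description above.

```ruby
require 'yaml'

# Hypothetical fallback; the yaml files distributed with the gem live elsewhere.
DEFAULT_DICTIONARY_FILE = 'english.yml'

# Returns the Hash of key-value pairs stored in the yaml file that
# DICTIONARIES_FILE points at, or in the fallback file if the variable is unset.
# The env argument only exists so that the method can be tried without
# touching the real environment.
def load_dictionary_file(env = ENV)
  path = env.fetch('DICTIONARIES_FILE', DEFAULT_DICTIONARY_FILE)
  raise ArgumentError, "No such file: #{path}" unless File.file?(path)
  # An empty yaml file yields nil (or false on older rubies), hence the || {}.
  YAML.safe_load(File.read(path)) || {}
end
```

With the export line from above in effect, calling load_dictionary_file without arguments would then return the content of /opt/czech.yml as a Hash.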

Note that in the long run, the dictionaries gem could be extended with these yaml files - or offer a way to download these files over the www. But before we do so, let's aim for at least 100 words in such a file before we consider distributing it or offering means to distribute said file.

Difficult english sentences

English is not the most difficult language in the world, but when it comes to proper pronunciation of words, english can be surprisingly difficult.

This subsection may keep a listing of sentences that, for one reason or another, can be somewhat difficult to read aloud without a mistake on the first try. It is just a fun-subsection, not meant to be taken too seriously; and it is quite subjective.

Without any further ado, here comes a listing of sentences that may be difficult to pronounce properly:

I would like to distribute something.

Obtaining all translations into german for a given english word

As of November 2020, the following API exists:

Dictionaries.return_array_of_translated_words_from_online_leo('cat')
Dictionaries.return_array_of_translated_words_from_online_leo('dog')
Dictionaries.return_array_of_translated_words_from_online_leo('human')

This will return an Array of german words. It does not work 100% perfectly as it is based on a regex; and using a regex to parse HTML is never a fully reliable approach. But if you just want to get the first entry, just call .first on it; in most cases this is the best, most likely translation available.

The regex has to find matches to entries such as the following one:

</repr><words><word>der Jazzfan</word></words>

Anyone who comes up with a more accurate regex is welcome to share it. :)
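As a starting point for such experiments, here is a minimal sketch of a regex that matches entries like the one shown above. The regex that the gem actually uses is not reproduced here; the names WORD_REGEX and extract_words are hypothetical and only exist for this example.

```ruby
# Matches the text between <word> and </word>. The match is non-greedy so
# that several entries on the same line are kept apart.
WORD_REGEX = %r{<word>(.*?)</word>}m

# Returns an Array of all <word> entries found in the given markup,
# without surrounding whitespace, empty entries or duplicates.
def extract_words(markup)
  markup.scan(WORD_REGEX).flatten.map(&:strip).reject(&:empty?).uniq
end

sample = '</repr><words><word>der Jazzfan</word></words>'
p extract_words(sample) # => ["der Jazzfan"]
```

Like any regex-based approach, this will break if the markup changes, for instance if the word tags gain attributes.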

This functionality was specifically necessary because I needed to use this in the ruby-gtk bindings for this project.

Dictionaries.return_unique_words_from_this_file

This method will return all words that are presently not registered in the english dictionary.

The idea here is for me to slowly add more english words into the yaml file. I won't add every english word that exists, but I will try to aim for a sizeable number in the long run, such as 5000 english words - already halfway there. \o/
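To illustrate what "not registered" means for the method described above, here is a rough sketch of the underlying idea. It does not show how Dictionaries.return_unique_words_from_this_file is implemented or which arguments it takes; the method name unknown_words and the tiny sample Hash are assumptions made for this example.

```ruby
# Returns the words in +text+ that are not keys of +dictionary+.
# Words are compared in lowercase and each word is reported only once.
def unknown_words(text, dictionary)
  words = text.downcase.scan(/[a-z']+/).uniq
  words.reject { |word| dictionary.key?(word) }
end

dictionary = { 'cat' => 'die Katze', 'dog' => 'der Hund' }
p unknown_words('The cat saw a dog. The dog ran.', dictionary)
# => ["the", "saw", "a", "ran"]
```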

GUI component

The GUI component of the dictionaries gem now defaults to GTK3. It is not a very advanced GUI, though.

First, install the gtk3 gem:

gem install gtk3

Then install the gtk_paradise project:

gem install gtk_paradise

Now you should be able to start the GUI component if the above has worked:

dictionaries --gui

See also the help options.

dictionaries --help

You may need to install some .h files if you use a specific Linux distribution; look at the relevant -dev packages for this. Or just compile from source. :)

The current version of ruby-gtk3 in October 2021 looks like this:

Yes, this isn't very pretty; I just wanted to focus a bit on the functionality. Tons of things are missing, such as switching to other .yml files from within the GUI itself. I just wanted to showcase a demo - the english-to-german conversion is working, though, so the GUI is functional, even if not super-pretty.

In February 2022 this was improved a little bit. It's still not extremely pretty, but you can see a few small improvements. In the long run I will add functionality to switch between different .yml files (thus, different dictionaries, such as english, italian and so forth) - but it is a hobby project. I won't have enough time to add all dictionaries into this project as-is. I will, however, add the possibility to load custom .yml files and other formats in the long run, so others can adapt the project to suit their needs.

sinatra

To start the sinatra interface of the dictionaries gem, do:

dictionaries --sinatra

You can then visit it on the localhost and it may look like this:

You can then input an english word into the form, or the URL area in your browser. If this word is registered in the .yml file then the following result can be seen:

This is really just very basic - I wanted to show the functionality. You may have to adapt the code if you have a more realistic use case, so consider the images above just as examples of how this could be used in a website.

Generating a .pdf file containing the translations

As of July 2022, you can generate a .pdf file with all the translated words. This functionality depends on the prawn gem right now, so make sure that this gem is installed before invoking the ruby code that generates the .pdf file.

The toplevel API for creating the .pdf file is as follows:

Dictionaries.generate_pdf_file

Statistical information

On the 24.01.2023, the Statistics submodule was added to module Dictionaries. The purpose of that submodule is to simply display some information about the project - in particular how many words are kept in each individual .yml file (the yaml file that contains all words in a given language).

For instance, on that day, the dictionaries gem contained these word-translations in total:

chinese           15 words.
danish             1 words.
dutch              1 words.
english         2489 words.
farsi              4 words.
finnish            2 words.
italian          191 words.
japanese           2 words.
norwegian          5 words.
russian            3 words.
spanish           23 words.
swedish            1 words.

Expect more to be added over the coming months and years. As can be seen, english is the primary focus for this project, in particular english-to-german and german-to-english.

Right now I am adding new entries manually for the most part, but at a later point in time I may simply parse an existing dictionary and then begin to add the missing entries more systematically. Stay tuned for more information in this regard in the future.

As of May 2023, it is possible to show how many words are available per language file.

Use the following commandline invocation for this:

dictionaries --stats
dictionaries --statistics # both variants work
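How such numbers could be gathered is sketched below. This is not the code of the Statistics submodule; it merely assumes that each language has its own .yml file holding a flat Hash of entries, and the method names word_counts and format_counts are hypothetical.

```ruby
require 'yaml'

# Maps each language (the file name without .yml) to the number of
# top-level entries in that yaml file, sorted by language name.
def word_counts(directory)
  Dir.glob(File.join(directory, '*.yml')).sort.to_h do |path|
    entries = YAML.safe_load(File.read(path)) || {}
    [File.basename(path, '.yml'), entries.size]
  end
end

# Formats the counts as one line per language, with the language
# left-aligned and the number right-aligned in fixed-width columns.
def format_counts(counts)
  counts.map { |language, n| format('%-12s%8d words.', language, n) }
end
```

Calling puts format_counts(word_counts('/path/to/yaml/files')) would then print one line per .yml file found in that directory; the path is just a placeholder.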

In May 2023 the statistics were as follows:

chinese           15 words.
danish             1 words.
dutch              1 words.
english         2607 words.
farsi              4 words.
finnish            2 words.
italian          206 words.
japanese           2 words.
norwegian          5 words.
polish             1 words.
russian            3 words.
spanish           31 words.
swedish            1 words.

I may add to the above listing every year or so, to show how the project grows - and, also, in the event that someone else may want to take over eventually.

Standalone .html example

As of August 2023, there is a small example file under test/, to show how the dictionary could be used on a webpage:

Note that the word-list, in JavaScript, is auto-generated from ruby, but not updated regularly. So this is more a proof of concept as-is. In the long run functionality will be added to the dictionaries gem to allow users to embed the javascript-hash into a webpage. Dictionaries for everyone! \o/
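Generating such a JavaScript word-list from ruby could be done along the following lines. This is a sketch based on the json library that ships with ruby, not the generator actually used for the example file; the method name to_javascript_hash and the variable name dictionary are assumptions.

```ruby
require 'json'

# Turns a ruby Hash of word => translation pairs into one line of
# JavaScript that defines an object with the same content.
def to_javascript_hash(dictionary, variable_name = 'dictionary')
  "const #{variable_name} = #{JSON.generate(dictionary)};"
end

puts to_javascript_hash({ 'cat' => 'die Katze', 'dog' => 'der Hund' })
# prints: const dictionary = {"cat":"die Katze","dog":"der Hund"};
```

The resulting line can be pasted into a script tag; JSON.generate takes care of quoting and escaping the strings.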

Why are there so few words registered in the dictionaries gem?

In August 2023, almost 3000 german-to-english words are registered in the dictionaries gem. This is not a whole lot of words.

The main reason for this is that I manually add new entries, so that takes time. I thus focus on my own use case.

However, as I also want to make the dictionaries gem more useful to other folks, I may begin to programmatically convert entries, such as those in the ding-dictionary, and integrate them into the dictionaries gem. But for now this has to come at a later time. This subsection was written in the event that people are confused why there are so few words registered in this project.

SpellChecker

A new class was added in November 2023. This class can be used to "spell check" against the distributed .yml file for english words (to german).

This class can be found here:

require 'dictionaries/spell_checker/spell_checker.rb'
Dictionaries::SpellChecker.new(ARGV)

It is currently unfinished. In the long run I plan to have this class support all entries in the distributed .yml file, but this will probably take a few years, considering how slow I am in regards to this project here. Nonetheless, others can then expand on this functionality, such as by subclassing class Dictionaries::SpellChecker eventually.

You can also add words that have to be ignored into a file called IGNORE_THESE_WORDS.md - which is treated as a YAML file by class Dictionaries::SpellChecker. It has to exist in the current working directory. If it does exist, then it will be loaded and each entry there will become a word that will be ignored by class Dictionaries::SpellChecker.
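The handling of such an ignore-file could look roughly like this. The sketch is not taken from class Dictionaries::SpellChecker; it assumes that the file contains a plain YAML list (one "- word" entry per line), and the method names words_to_ignore and without_ignored are hypothetical.

```ruby
require 'yaml'

IGNORE_FILE = 'IGNORE_THESE_WORDS.md'

# Returns the ignore-list as an Array of lowercase Strings, or an empty
# Array if the file does not exist in the given directory.
def words_to_ignore(directory = Dir.pwd)
  path = File.join(directory, IGNORE_FILE)
  return [] unless File.file?(path)
  Array(YAML.safe_load(File.read(path))).map { |word| word.to_s.downcase }
end

# Drops every word that appears on the ignore-list (case-insensitive).
def without_ignored(words, ignored)
  words.reject { |word| ignored.include?(word.downcase) }
end
```

A spell checker would then only report the words returned by without_ignored(words, words_to_ignore).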

Licence

Until the 17th of October 2019, this project was using the GPLv2 licence (no later clause).

However, I believe that the GPLv2 licence is not great for a project that focuses on existing words in real languages, aka dictionaries. Thus, I have decided to change the licence to the MIT licence on that day (17.11.2019, in dd.mm.yyyy notation).

So the gem is now MIT licenced. There may be dragons! \o/

For a description of that licence, see https://opensource.org/licenses/MIT.

Contact information and mandatory 2FA (no longer) coming up in 2022 / 2023

If your creative mind has ideas and specific suggestions to make this gem more useful in general, feel free to drop me an email at any time, via:

shevy@inbox.lt

Before that email I used an email account at Google gmail, but in 2021 I decided to slowly abandon gmail, for various reasons. In order to limit the explanation here, allow me to just briefly state that I do not feel as if I want to promote any Google service anymore when the user becomes the end product (such as via data collection by upstream services, including other proxy-services). My feeling is that this is a hugely flawed business model to begin with, and I no longer wish to support this in any way, even if only indirectly so, such as by using services of companies that try to promote this flawed model.

In regards to responding to emails: please keep in mind that responding may take some time, depending on the amount of work I may have at that moment. So it is not that emails are ignored; it is more that I have not (yet) found the time to read and reply. This means there may be a delay of days, weeks and in some instances also months. There is, unfortunately, not much I can do when I need to prioritise my time investment, but I try to consider all feedback as an opportunity to improve my projects nonetheless.

In 2022 rubygems.org decided to make 2FA mandatory for every gem owner eventually:

see https://blog.rubygems.org/2022/06/13/making-packages-more-secure.html

However, that has been reverted again, so I decided to shorten this paragraph. Mandatory 2FA may exclude users who do not have a smartphone device or other means to 'identify'. I do not feel it is a fair assumption by others to be made that non-identified people may not contribute code, which is why I reject it. Mandatory 2FA would mean an end to all my projects on rubygems.org, so let's hope it will never happen. (Keep in mind that I refer to mandatory 2FA; I have no qualms with people who use 2FA on their own, but this carrot-and-stick strategy by those who control the rubygems infrastructure is a very bad one to pursue.)