Unicode::Emoji

A small Ruby library which provides Unicode Emoji data and regexes.

Emoji version: 5.0

Supported Rubies: 2.4, 2.3, 2.2, 2.1

Gemfile

gem "unicode-emoji"

Usage

Properties

Allows you to access the codepoint data from Unicode's emoji-data.txt file:

require "unicode/emoji"

Unicode::Emoji.properties "☝" # => ["Emoji", "Emoji_Modifier_Base"]
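To illustrate what kind of data sits behind such a lookup, here is a minimal sketch that parses a few lines in the emoji-data.txt format (`codepoint-or-range ; property # comment`) into a per-codepoint property list. It is not the gem's implementation: `SAMPLE_EMOJI_DATA`, `parse_emoji_properties` and `sample_properties` are hypothetical names, and the data is an abbreviated excerpt with shortened comments.

```ruby
# Abbreviated, hand-copied excerpt in the emoji-data.txt line format
# (hypothetical sample for this sketch, not the full file).
SAMPLE_EMOJI_DATA = <<DATA
261D          ; Emoji                # index pointing up
1F3FB..1F3FF  ; Emoji                # skin tone modifiers
261D          ; Emoji_Modifier_Base  # index pointing up
1F3FB..1F3FF  ; Emoji_Modifier       # skin tone modifiers
DATA

# Builds a Hash mapping each codepoint (Integer) to its property names.
def parse_emoji_properties(data)
  index = Hash.new { |hash, codepoint| hash[codepoint] = [] }

  data.each_line do |line|
    entry = line.sub(/#.*/, "").strip # drop the trailing comment
    next if entry.empty?

    range, property = entry.split(";").map(&:strip)
    first, last = range.split("..").map { |hex| hex.to_i(16) }
    last ||= first # a single codepoint is a range of length one

    (first..last).each { |codepoint| index[codepoint] << property }
  end

  index
end

SAMPLE_INDEX = parse_emoji_properties(SAMPLE_EMOJI_DATA)

# Looks up the properties of the first codepoint of +char+ in the sample.
def sample_properties(char)
  SAMPLE_INDEX.fetch(char.ord, [])
end

p sample_properties("\u{261D}")  # ☝ => ["Emoji", "Emoji_Modifier_Base"]
p sample_properties("\u{1F3FB}") # 🏻 => ["Emoji", "Emoji_Modifier"]
```

A codepoint can appear on several lines of the file, once per property, which is why the lookup returns an array.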

Regex

Five Emoji regexes are included, compiled from various Unicode Emoji data.

require "unicode/emoji"

string = "String which contains all kinds of emoji:

- Singleton Emoji: 😴
- Textual singleton Emoji with Emoji variation: ▶️
- Emoji with skin tone modifier: 🛌🏽
- Region flag: 🇵🇹
- Sub-Region flag: 🏴󠁧󠁢󠁳󠁣󠁴󠁿
- Keycap sequence: 2️⃣
- Sequence using ZWJ (zero width joiner): 🤾🏽‍♀️

"

string.scan(Unicode::Emoji::REGEX) # => ["😴", "▶️", "🛌🏽", "🇵🇹", "🏴󠁧󠁢󠁳󠁣󠁴󠁿", "2️⃣", "🤾🏽‍♀️"]
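Several of the Emoji in the example string are sequences of more than one codepoint, some of them invisible. The following sketch uses only core Ruby (no gem calls) to list the codepoints of three of them; `codepoint_names` is a hypothetical helper introduced for this illustration.

```ruby
# Renders each codepoint of a string in U+XXXX notation.
def codepoint_names(string)
  string.codepoints.map { |cp| format("U+%04X", cp) }
end

keycap = "2\u{FE0F}\u{20E3}"                            # 2️⃣: digit, variation selector 16, combining keycap
flag   = "\u{1F1F5}\u{1F1F9}"                           # 🇵🇹: regional indicators P and T
zwj    = "\u{1F93E}\u{1F3FD}\u{200D}\u{2640}\u{FE0F}"   # 🤾🏽‍♀️: base, skin tone, ZWJ, female sign, variation selector 16

p codepoint_names(keycap) # => ["U+0032", "U+FE0F", "U+20E3"]
p codepoint_names(flag)   # => ["U+1F1F5", "U+1F1F9"]
p codepoint_names(zwj)    # => ["U+1F93E", "U+1F3FD", "U+200D", "U+2640", "U+FE0F"]
```

This is why matching Emoji codepoint by codepoint is not enough: a single visible Emoji may span up to several codepoints that have to be matched as one unit.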
| Regex | Description | Example Matches | Example Non-Matches |
|-------|-------------|-----------------|---------------------|
| Unicode::Emoji::REGEX | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kinds of valid Emoji sequences, but restricts ZWJ and TAG sequences to recommended sequences | 😴, ▶️, 🛌🏽, 🇵🇹, 2️⃣, 🏴󠁧󠁢󠁳󠁣󠁴󠁿, 🤾🏽‍♀️ | 😴︎, ▶, 🏻, 🇵🇵, 🏴󠁧󠁢󠁡󠁧󠁢󠁿, 🤠‍🤢 |
| Unicode::Emoji::REGEX_VALID | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji) and all kinds of valid Emoji sequences | 😴, ▶️, 🛌🏽, 🇵🇹, 2️⃣, 🏴󠁧󠁢󠁳󠁣󠁴󠁿, 🏴󠁧󠁢󠁡󠁧󠁢󠁿, 🤾🏽‍♀️, 🤠‍🤢 | 😴︎, ▶, 🏻, 🇵🇵 |
| Unicode::Emoji::REGEX_BASIC | Matches (non-textual) singleton Emoji (except for singleton components, like a skin tone modifier without base Emoji), but no sequences | 😴, ▶️ | 😴︎, ▶, 🏻, 🛌🏽, 🇵🇹, 🇵🇵, 2️⃣, 🏴󠁧󠁢󠁳󠁣󠁴󠁿, 🏴󠁧󠁢󠁡󠁧󠁢󠁿, 🤾🏽‍♀️, 🤠‍🤢 |
| Unicode::Emoji::REGEX_TEXT | Matches only textual singleton Emoji (except for singleton components, like digit 1) | 😴︎, ▶ | 😴, ▶️, 🏻, 🛌🏽, 🇵🇹, 🇵🇵, 2️⃣, 🏴󠁧󠁢󠁳󠁣󠁴󠁿, 🏴󠁧󠁢󠁡󠁧󠁢󠁿, 🤾🏽‍♀️, 🤠‍🤢 |
| Unicode::Emoji::REGEX_ANY | Matches any Emoji-related codepoint (but no variation selectors or tags) | 😴, ▶, 🏻, 🛌, 🏽, 🇵, 🇹, 2, 🏴, 🤾, ♀, 🤠, 🤢 | - |
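The distinctions in the table (textual vs. emoji presentation, singleton vs. sequence) come down to which codepoints follow the base character. The sketch below shows this with deliberately simplified, hand-written patterns in core Ruby. `KEYCAP`, `REGIONAL_PAIR` and `presentation` are hypothetical names for this illustration and are not how the gem's regexes are built.

```ruby
VS15 = "\u{FE0E}" # variation selector 15: request text presentation
VS16 = "\u{FE0F}" # variation selector 16: request emoji presentation

# Toy patterns for this sketch only.
# Keycap sequence: digit, "*" or "#", then VS16, then the combining keycap.
KEYCAP = /[0-9*#]\u{FE0F}\u{20E3}/
# Two regional indicator symbols. Unlike a data-driven regex, this cannot
# reject pairs that are not assigned to a region, such as 🇵🇵.
REGIONAL_PAIR = /[\u{1F1E6}-\u{1F1FF}]{2}/

# Reports which presentation a trailing variation selector requests.
def presentation(string)
  if string.end_with?(VS16)
    :emoji
  elsif string.end_with?(VS15)
    :text
  else
    :default
  end
end

p presentation("\u{25B6}\u{FE0F}")  # ▶️ => :emoji
p presentation("\u{1F634}\u{FE0E}") # 😴︎ => :text
p presentation("\u{25B6}")          # ▶  => :default

sample = "2\u{FE0F}\u{20E3} and \u{1F1F5}\u{1F1F9}"
p sample.scan(Regexp.union(KEYCAP, REGIONAL_PAIR)).length # => 2
```

The toy `REGIONAL_PAIR` pattern accepts any two regional indicators, which shows why validating flags requires the actual list of sequences rather than a character range alone.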

Also See

MIT