Ruby Emoji Regex πŸ’Ž

Gem Version Build Status

A pair of Ruby regular expressions for matching Unicode Emoji symbols.

Background

This is based upon the fantastic work from Mathias Bynens' emoji-regex Javascript package. emoji-regex is cleverly assembled based upon data from the Unicode Consortium.

The regular expressions provided herein are derived from that pacakge.

Installation

gem install emoji_regex

Usage

emoji_regex provides two regular expressions:

  • EmojiRegex::Regex matches emoji which present as emoji by default, and those which present as emoji when combined with U+FE0F VARIATION SELECTOR-16.

  • EmojiRegex::Text matches emoji which present as text by default (regardless of variation selector), as well as those which present as emoji by default.

Emoji vs Text Presentation

Emoji_Presentation is a property of emoji symbols, defined in Unicode Technical Report #51 which controls whether symbols are intended to be rendered as emoji by default.

Generally, for emoji which re-use Unicode code points which existed before Emoji itself was introduced to Unicode, Emoji_Presentation is false.

This means they should be displayed as monochrome text characters by default, and should be combined with U+FE0F VARIATION SELECTOR-16 to indicate emoji presentation is desired.

EmojiRegex::Regex follows this Unicode Consortium guidance, while EmojiRegex::Text matches anything that someone might possibly consider a Unicode emoji.

It's most likely that the regular expression you want is EmojiRegex::Regex! ☺️

Example

require 'emoji_regex'

text = <<TEXT
\u{231A}: ⌚ default emoji presentation character (Emoji_Presentation)
\u{2194}: ↔ default text presentation character
\u{2194}\u{FE0F}: ↔️ default text presentation character with Emoji variation selector
\u{1F469}: πŸ‘© emoji modifier base (Emoji_Modifier_Base)
\u{1F469}\u{1F3FF}: πŸ‘©πŸΏ emoji modifier base followed by a modifier
TEXT

puts 'EmojiRegex::Regex'
text.scan EmojiRegex::Regex do |emoji|
  puts "Matched sequence #{emoji} β€” code points: #{emoji.length}"
end

puts ''

puts 'EmojiRegex::Text'
text.scan EmojiRegex::Text do |emoji|
  puts "Matched sequence #{emoji} β€” code points: #{emoji.length}"
end

Console output:

EmojiRegex::Regex
Matched sequence ⌚ β€” code points: 1
Matched sequence ⌚ β€” code points: 1
Matched sequence ↔️ β€” code points: 2
Matched sequence ↔️ β€” code points: 2
Matched sequence πŸ‘© β€” code points: 1
Matched sequence πŸ‘© β€” code points: 1
Matched sequence πŸ‘©πŸΏ β€” code points: 2
Matched sequence πŸ‘©πŸΏ β€” code points: 2

EmojiRegex::Text
Matched sequence ⌚ β€” code points: 1
Matched sequence ⌚ β€” code points: 1
Matched sequence ↔ β€” code points: 1
Matched sequence ↔ β€” code points: 1
Matched sequence ↔️ β€” code points: 2
Matched sequence ↔️ β€” code points: 2
Matched sequence πŸ‘© β€” code points: 1
Matched sequence πŸ‘© β€” code points: 1
Matched sequence πŸ‘©πŸΏ β€” code points: 2
Matched sequence πŸ‘©πŸΏ β€” code points: 2

Development

Requirements

Initial setup

To install all the Ruby and Javascript dependencies, you can run:

bin/setup

To update the Ruby source files based on the emoji-regex library:

rake regenerate

Specs

A spec suite is provided, which can be run as:

rake spec

Creating a release

  1. Update the version in emoji_regex.gemspec
  2. rake release