Module: SnowflakeId::Generator

Defined in: lib/snowflake_id/generator.rb

Constant Summary

DEFAULT_REGEX = /timestamp_id\('(?<seq_prefix>\w+)'/
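As a rough illustration of what this pattern captures, the sketch below applies it to a made-up column default. The `default_function` strings are assumptions for the example; the exact text Postgres reports for a column default may differ.

```ruby
# Same pattern as DEFAULT_REGEX, held in a local so the snippet stands alone.
pattern = /timestamp_id\('(?<seq_prefix>\w+)'/

# Hypothetical column default that calls timestamp_id for the "statuses" table.
default_function = "timestamp_id('statuses'::text)"

data = pattern.match(default_function)

# The named capture becomes the prefix of the sequence name.
seq_name = data && "#{data[:seq_prefix]}_id_seq"

# A default that does not call timestamp_id produces no match at all.
no_match = pattern.match("nextval('accounts_id_seq'::regclass)")

puts seq_name
puts no_match.inspect
```

The named group `seq_prefix` is what `.ensure_id_sequences_exist_for` later turns into the `<prefix>_id_seq` sequence name.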

Class Method Summary

.at(timestamp, with_random: true) ⇒ Object
.define_timestamp_id ⇒ Object

Our ID will be composed of the following: 6 bytes (48 bits) of millisecond-level timestamp, followed by 2 bytes (16 bits) of sequence data.
.ensure_id_sequences_exist ⇒ Object
.ensure_id_sequences_exist_for(table_name) ⇒ Object
.to_time(id) ⇒ Object

Class Method Details

.at(timestamp, with_random: true) ⇒ `Object`

# File 'lib/snowflake_id/generator.rb', line 110

def at(timestamp, with_random: true)
  id  = timestamp.to_i * 1000
  id += rand(1000) if with_random
  id <<= 16
  id += rand(2**16) if with_random
  id
end
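To make the bit layout concrete, here is a minimal sketch of the same arithmetic with the random parts left out, which is what `with_random: false` amounts to. The helper name `snowflake_floor` is hypothetical and not part of the module.

```ruby
# Hypothetical standalone copy of the arithmetic in .at with with_random: false.
def snowflake_floor(timestamp)
  (timestamp.to_i * 1000) << 16
end

t  = Time.utc(2024, 1, 1)
id = snowflake_floor(t)

millis   = id >> 16      # upper bits: milliseconds since the Unix epoch
sequence = id & 0xFFFF   # lower 16 bits: zero because nothing random was added

puts millis
puts sequence
```

Because the low 16 bits are zero and no milliseconds are added, this value is the smallest ID that can carry the given second, which makes it usable as a lower bound when comparing IDs against a point in time.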

.define_timestamp_id ⇒ `Object`

Our ID will be composed of the following:

6 bytes (48 bits) of millisecond-level timestamp
2 bytes (16 bits) of sequence data

The ‘sequence data’ is intended to be unique within a given millisecond, yet obscure the ‘serial number’ of this row.

To do this, we hash the following data:

Table name (if provided, skipped if not)
Secret salt (should not be guessable)
Timestamp (again, millisecond-level granularity)

We then take the first two bytes of that value, and add the lowest two bytes of the table ID sequence number (`table_name`_id_seq). This means that even if we insert two rows in the same millisecond, they will have distinct ‘sequence data’ portions.

If this happens, and an attacker can see both such IDs, they can determine which of the two entries was inserted first, but not the total number of entries in the table (even mod 2**16).

The table name is included in the hash to ensure that different tables derive separate sequence bases so rows inserted in the same millisecond in different tables do not reveal the table ID sequence number for one another.

The secret salt is included in the hash to ensure that external users cannot derive the sequence base given the timestamp and table name, which would allow them to compute the table ID sequence number.
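The derivation described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: the method actually installs a database function, the hash algorithm is not named in this documentation (MD5 is assumed here), the concatenation order and the 16-bit wraparound of the sum are assumptions, and `sketch_timestamp_id` is a hypothetical helper.

```ruby
require "digest"

# Hypothetical helper, not part of SnowflakeId::Generator.
# Assumes MD5 and simple string concatenation for the hashed input.
def sketch_timestamp_id(table_name, salt, time_ms, seq_value)
  digest = Digest::MD5.digest("#{table_name}#{salt}#{time_ms}")

  # First two bytes of the hash, read as an unsigned 16-bit big-endian integer.
  base = digest.unpack1("n")

  # Add the lowest two bytes of the sequence number, wrapping at 16 bits
  # (the wraparound is an assumption of this sketch).
  tail = (base + (seq_value & 0xFFFF)) & 0xFFFF

  # 48 bits of millisecond timestamp followed by 16 bits of sequence data.
  (time_ms << 16) | tail
end

time_ms = 1_700_000_000_000
a = sketch_timestamp_id("statuses", "secret", time_ms, 41)
b = sketch_timestamp_id("statuses", "secret", time_ms, 42)

puts a
puts b
```

Two rows inserted in the same millisecond share the hashed base, so their low 16 bits differ only by the sequence increment. That reveals their relative order but not the sequence value itself, which matches the trade-off described above.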

# File 'lib/snowflake_id/generator.rb', line 67

def define_timestamp_id
  return if already_defined?

  connection.execute(sanitized_timestamp_id_sql)
end

.ensure_id_sequences_exist ⇒ `Object`

# File 'lib/snowflake_id/generator.rb', line 73

def ensure_id_sequences_exist
  # Find tables using timestamp IDs.
  connection.tables.each do |table|
    ensure_id_sequences_exist_for(table)
  end
end

.ensure_id_sequences_exist_for(table_name) ⇒ `Object`

# File 'lib/snowflake_id/generator.rb', line 80

def ensure_id_sequences_exist_for(table_name)
  # We're only concerned with "id" columns.
  id_col = connection.columns(table_name).find { |col| col.name == "id" }
  return unless id_col

  # And only those that are using timestamp_id.
  data = DEFAULT_REGEX.match(id_col.default_function)
  return unless data

  seq_name = "#{data[:seq_prefix]}_id_seq"

  # If we were on Postgres 9.5+, we could do CREATE SEQUENCE IF
  # NOT EXISTS, but we can't depend on that. Instead, catch the
  # possible exception and ignore it.
  # Note that seq_name isn't a column name, but it's a
  # relation, like a column, and follows the same quoting rules
  # in Postgres.
  connection.execute(<<~SQL)
    DO $$
      BEGIN
        CREATE SEQUENCE #{connection.quote_column_name(seq_name)};
      EXCEPTION WHEN duplicate_table THEN
        -- Do nothing, we have the sequence already.
      END
    $$ LANGUAGE plpgsql;
  SQL
rescue StandardError => e
  Rails.logger.warn "SnowflakeId: Could not ensure sequence for #{table_name}: #{e.message}"
end

.to_time(id) ⇒ `Object`



# File 'lib/snowflake_id/generator.rb', line 118

def to_time(id)
  Time.at((id >> 16) / 1000).utc
end
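A short sketch of the decoding step, using a hand-built ID so the effect of the shift and the integer division is visible. `to_time_sketch` is a hypothetical standalone copy of the method body, and the millisecond and sequence values are arbitrary.

```ruby
# Hypothetical standalone copy of the body of .to_time.
def to_time_sketch(id)
  Time.at((id >> 16) / 1000).utc
end

t = Time.utc(2024, 1, 1, 12, 30, 45)

# Build an ID by hand: 789 ms past the second, arbitrary low 16 bits.
id = ((t.to_i * 1000 + 789) << 16) | 0xBEEF

recovered = to_time_sketch(id)

puts recovered
```

The shift discards the 16 bits of sequence data, and the integer division by 1000 discards the milliseconds, so the recovered time is truncated to the whole second.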
