Class: FlexColumns::Contents::ColumnData
- Inherits: Object
  - Object
    - FlexColumns::Contents::ColumnData
- Defined in: lib/flex_columns/contents/column_data.rb
Overview
ColumnData is one of the core classes in flex_columns. An instance of ColumnData represents the data present in a single row for a single flex column; it stores that data, is used to set and retrieve that data, and can serialize itself to, and deserialize itself from, JSON (with headers and optional compression added for binary storage).
Clients do not interact with ColumnData itself; rather, they interact with an instance of a generated subclass of FlexColumnsContentsBase, and it delegates core methods to this object.
Instance Method Summary
- #[](field_name) ⇒ Object
  Returns the data for the given field_name.
- #[]=(field_name, new_value) ⇒ Object
  Sets the data for the given field_name to the given new_value.
- #deserialized? ⇒ Boolean
  Has this object been deserialized? If it’s been deserialized, then we need to do things like run validations on it, save it back to the database when someone calls #save! on the parent object, and so on.
- #initialize(field_set, options = { }) ⇒ ColumnData (constructor)
  Creates a new instance.
- #keys ⇒ Object
  Returns an Array of all field names that are currently set to something.
- #to_hash ⇒ Object
  Returns a representation of this data as a Hash.
- #to_json ⇒ Object
  Returns a String with the current contents of this object as JSON.
- #to_stored_data ⇒ Object
  Returns the exact String that should be stored in the database (compressed or not, with header or not, etc.).
- #touch! ⇒ Object
  Does nothing, other than making sure the JSON has been deserialized.
Constructor Details
#initialize(field_set, options = { }) ⇒ ColumnData
Creates a new instance. field_set is the FlexColumns::Definition::FieldSet that contains the set of fields defined for this flex column; options can contain:
- :storage_string
  The data present in the column in the database; this can be omitted if creating an instance for a row that has no data, or for a new row.
- :data_source
  Where did that data come from? This can be any object; it must respond to #describe_flex_column_data_source (no arguments), which should return a String that is used in thrown exceptions to let the client know what data caused the problem; it also must respond to #notification_hash_for_flex_column_data_source (no arguments), which should return a Hash that is used to generate the payload for the ActiveSupport::Notification calls this class makes. (This is, in practice, always an instance of the FlexColumnsContentsBase subclass generated for the column.)
- :unknown_fields
  Must pass :preserve or :delete. If there are keys in the serialized JSON that do not correspond to any fields that the FieldSet knows about, this determines what will happen to that data when re-serializing it to save: :preserve keeps that data, while :delete removes it. (In neither case is that data actually accessible; you must declare a field if you want access to it.)
- :length_limit
  If present, specifies the maximum length of data that can be stored in the underlying storage mechanism (the column). When serializing data, this object will raise an exception if the serialized form is longer than this limit. This is used to avoid cases where the database might otherwise silently truncate the data being stored (I’m looking at you, MySQL) and hence corrupt stored data.
- :storage
  This must be :binary, :text, or :json. If :text, standard, uncompressed JSON will always be stored. (It is not possible to store compressed data reliably in a text column, because the database will interpret the bytes as characters and may modify them or raise an exception if byte sequences are present that would be invalid characters in whatever encoding it’s using.) If :binary, then a very small header will be written that’s just for versioning (currently FC:01,), followed by a marker indicating if it’s compressed (1,) or not (0,), followed by either standard, uncompressed JSON encoded in UTF-8 or the GZipped version of the same. If :json, then we assume the database has a native JSON type (like PostgreSQL with sufficiently-recent ActiveRecord and PG gem), and deal in an actual Hash, which the database processes directly.
- :compress_if_over_length
  If present, must be set to an integer. If :storage is :binary and the JSON string is at least this many bytes long, then this class will try compressing it when producing its stored data (from #to_stored_data); the compressed version is used only if it is at most 95% (MIN_SIZE_REDUCTION_RATIO_FOR_COMPRESSION) as long as the uncompressed version.
- :binary_header
  Must be true or false. If false, then, even if :storage is :binary, no header will be written to the binary column. (As a consequence, compression will also be disabled, since compression requires the header.)
- :null
  Must be true or false. If false, assumes the underlying column in the database is defined as non-NULL (although this is not recommended), and therefore will set an empty string ("") on the column if there’s no data in it, rather than SQL NULL.
# File 'lib/flex_columns/contents/column_data.rb', line 56
def initialize(field_set, options = { })
  options.assert_valid_keys(:storage_string, :data_source, :unknown_fields, :length_limit, :storage, :compress_if_over_length, :binary_header, :null)

  @storage_string = options[:storage_string]
  @field_set = field_set
  @data_source = options[:data_source]
  @unknown_fields = options[:unknown_fields]
  @length_limit = options[:length_limit]
  @storage = options[:storage]
  @compress_if_over_length = options[:compress_if_over_length]
  @binary_header = options[:binary_header]
  @null = options[:null]

  raise ArgumentError, "Invalid JSON string: #{storage_string.inspect}" if storage_string && (! storage_string.kind_of?(String)) && (! storage_string.kind_of?(Hash))
  raise ArgumentError, "Must supply a FieldSet, not: #{field_set.inspect}" unless field_set.kind_of?(FlexColumns::Definition::FieldSet)
  raise ArgumentError, "Must supply a data source, not: #{data_source.inspect}" unless data_source
  raise ArgumentError, "Invalid value for :unknown_fields: #{unknown_fields.inspect}" unless [ :preserve, :delete ].include?(unknown_fields)
  raise ArgumentError, "Invalid value for :length_limit: #{length_limit.inspect}" if length_limit && (! (length_limit.kind_of?(Integer) && length_limit >= 8))
  raise ArgumentError, "Invalid value for :storage: #{storage.inspect}" unless [ :binary, :text, :json ].include?(storage)
  raise ArgumentError, "Invalid value for :compress_if_over_length: #{compress_if_over_length.inspect}" if compress_if_over_length && (! compress_if_over_length.kind_of?(Integer))
  raise ArgumentError, "Invalid value for :binary_header: #{binary_header.inspect}" unless [ true, false ].include?(binary_header)
  raise ArgumentError, "Invalid value for :null: #{null.inspect}" unless [ true, false ].include?(null)

  @field_contents_by_field_name = nil
  @unknown_field_contents_by_key = nil
end
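The :binary layout described under :storage and :compress_if_over_length can be illustrated with a small standalone sketch. This is not the library's implementation: sketch_binary_storage and the SKETCH_ constants are hypothetical names, and only the header text (FC:01,), the 1, / 0, markers, and the 95% ratio are taken from the description above. Zlib.gzip is assumed to be available (Ruby 2.4 or later).

```ruby
require 'json'
require 'zlib'

# Hypothetical names for illustration only; values come from the option docs above.
SKETCH_HEADER = "FC:01,".b
SKETCH_MAX_COMPRESSED_RATIO = 0.95

# Builds "header + compression marker + payload" for a JSON string.
# Compression is attempted only when the JSON is at least
# compress_if_over_length bytes, and kept only if it saves enough space.
def sketch_binary_storage(json_string, compress_if_over_length = nil)
  raw = json_string.b
  if compress_if_over_length && raw.bytesize >= compress_if_over_length
    compressed = Zlib.gzip(raw)
    if compressed.bytesize <= raw.bytesize * SKETCH_MAX_COMPRESSED_RATIO
      return SKETCH_HEADER + "1,".b + compressed
    end
  end
  SKETCH_HEADER + "0,".b + raw
end

# A short document stays uncompressed; a long, repetitive one is gzipped.
short_stored = sketch_binary_storage(JSON.generate({ "a" => 1 }), 200)
long_json = JSON.generate({ "tags" => ["flex_columns"] * 100 })
long_stored = sketch_binary_storage(long_json, 200)
```

Reading such a value back would mean checking the header, inspecting the marker, and gunzipping the remainder only when the marker is 1,.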
Instance Method Details
#[](field_name) ⇒ Object
Returns the data for the given field_name. Raises FlexColumns::Errors::NoSuchFieldError if there is no field of the given name. Returns nil if there is such a field, but no data for it.
# File 'lib/flex_columns/contents/column_data.rb', line 87
def [](field_name)
  field_name = validate_and_deserialize_for_field(field_name)
  field_contents_by_field_name[field_name]
end
#[]=(field_name, new_value) ⇒ Object
Sets the data for the given field_name to the given new_value. Raises FlexColumns::Errors::NoSuchFieldError if there is no field of the given name. Returns new_value.
# File 'lib/flex_columns/contents/column_data.rb', line 94
def []=(field_name, new_value)
  field_name = validate_and_deserialize_for_field(field_name)

  # We do this for a very good reason. When encoding as JSON, Ruby's JSON library happily accepts Symbols, but
  # encodes them as simple Strings in the JSON. (This makes sense, because JSON doesn't support Symbols.) This
  # means that if you save a value in a flex column as a Symbol, and then re-read that row from the database,
  # you'll get back a String, not the Symbol you put in.
  #
  # Unfortunately, this is different from what you'll get if there is no intervening save/load cycle, where it'd
  # otherwise stay a Symbol. This difference in behavior can be the source of some really annoying bugs. While
  # ActiveRecord has this annoying behavior, this is a chance to clean it up in a small way -- so, if you set a
  # Symbol, we return a String. (And, yes, this has no bearing on Symbols stored nested inside Arrays or Hashes;
  # and that's OK.)
  new_value = new_value.to_s if new_value.kind_of?(Symbol)

  old_value = field_contents_by_field_name[field_name]

  # We deliberately delete from the hash anything that's being set to +nil+; this is so that we don't end up just
  # binding keys to +nil+, and returning them in #keys, etc. (Yes, this means that you can't distinguish a key
  # explicitly set to +nil+ from a key that's not present; this is different from Ruby's semantics for a Hash,
  # but not by very much, and it makes use of +flex_columns+ a whole lot simpler.)
  if new_value == nil
    field_contents_by_field_name.delete(field_name)
    nil
  else
    field_contents_by_field_name[field_name] = new_value
  end
end
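The two behaviors explained in the comments above (Symbols becoming Strings, and nil assignments removing the key) can be reproduced with plain Ruby. The sketch below first shows the JSON round trip that motivates the Symbol conversion, then mimics the setter with a hypothetical SketchFieldStore class; it skips field validation and deserialization, so it is an illustration rather than the real method.

```ruby
require 'json'

# A Symbol value does not survive a JSON round trip: it comes back as a String.
original = { "status" => :active }
round_tripped = JSON.parse(JSON.generate(original))

# Hypothetical stand-in that mirrors only the Symbol and nil handling of #[]=.
class SketchFieldStore
  def initialize
    @contents = { }
  end

  def []=(name, value)
    # Normalize Symbols up front so reads match what a save/load cycle would return.
    value = value.to_s if value.kind_of?(Symbol)
    if value.nil?
      @contents.delete(name) # nil removes the key instead of binding it to nil
      nil
    else
      @contents[name] = value
    end
  end

  def [](name)
    @contents[name]
  end

  def keys
    @contents.keys
  end
end

store = SketchFieldStore.new
store[:status] = :active
status_after_set = store[:status]
store[:status] = nil
keys_after_nil = store.keys
```

Here status_after_set is the String "active" even though no save has happened, and keys_after_nil no longer lists :status.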
#deserialized? ⇒ Boolean
Has this object been deserialized? If it’s been deserialized, then we need to do things like run validations on it, save it back to the database when someone calls #save! on the parent object, and so on.
Not at all obvious: originally, we had a method called #touched? that let you know whether the given object had been changed at all. It simply got set on #[]=, above. The problem with this is that very frequently, flex_columns is used to store complex data structures (because that’s one of the things that’s dramatically easier in a serialized JSON blob than in a traditional relational structure). But if you have an array stored, and you call #<< on it to append an element, then #[]= never gets called at all, because it’s still the same object, just with different contents.
We could have worked around this by saving off a copy of each field when we deserialized, then comparing them using a deep equality (#== should work just fine) to determine if they’ve changed. However, this adds very significant overhead to each and every single use of a flex_column object, whether or not you rely on or care about this kind of tracking: we would have to #dup every flex column field every single time we deserialized, and, if you have large objects in there, that can get extremely expensive.
Since almost every object in Ruby is mutable (even Strings), there aren’t really any easy wins here. Numbers are the only commonplace objects that aren’t, and it’s not going to be a common use case that someone uses a flex_column with fields that each simply store one single number. (Storing an array or a hash of numbers is much more common, but then you’re talking about Arrays and Hashes, which are back to being mutable.)
Another option would be to #freeze all of the fields on a flex column, thus requiring clients to reassign them with a new object if they wanted to change them at all. That, however, presents an API that most users would hate: I don’t want to say user.prefs_map = user.prefs_map.merge(:foo => bar); I want to just say user.prefs_map[:foo] = bar.
Instead, once we deserialize a field, we just assume that it has changed. While this may end up causing the client to do extra work at times, it’s much higher-performance than doing the tracking every time.
(There is definitely room to add code that would make this configurable, on a per-flex-column or even per-field basis. As always, patches are welcome; as of this writing, it seems likely that it might just not be an issue big enough to worry about.)
# File 'lib/flex_columns/contents/column_data.rb', line 177
def deserialized?
  !! field_contents_by_field_name
end
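The failure mode that led away from a #touched? flag is easy to reproduce. The sketch below uses a hypothetical SketchTouchedTracker class (not part of flex_columns) that marks itself as touched only inside #[]=, as the original approach is described as doing, and then mutates a stored Array in place.

```ruby
# Hypothetical tracker that only notices changes made through #[]=.
class SketchTouchedTracker
  def initialize
    @contents = { }
    @touched = false
  end

  def []=(name, value)
    @touched = true
    @contents[name] = value
  end

  def [](name)
    @contents[name]
  end

  def touched?
    @touched
  end

  # Stands in for "the row was just saved, so nothing is dirty".
  def reset!
    @touched = false
  end
end

tracker = SketchTouchedTracker.new
tracker[:tags] = ["a"]
tracker.reset!

# In-place mutation goes through #[] and Array#<<, never through #[]=.
tracker[:tags] << "b"
missed_change = !tracker.touched? && tracker[:tags] == ["a", "b"]
```

missed_change ends up true: the contents changed, but the flag never noticed. Treating any deserialized column as changed avoids this without copying every field.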
#keys ⇒ Object
Returns an Array of all field names that are currently set to something.
# File 'lib/flex_columns/contents/column_data.rb', line 124
def keys
  deserialize_if_necessary!
  field_contents_by_field_name.keys
end
#to_hash ⇒ Object
Returns a representation of this data as a Hash. This should not be used in flex_columns to manipulate data, as it does not contain a full representation of a column (in particular, unknown-field data is not represented in the returned Hash); however, it’s useful to construct a string (e.g., FlexColumnsContentsBase#inspect) to help with debugging.
# File 'lib/flex_columns/contents/column_data.rb', line 133
def to_hash
  deserialize_if_necessary!
  field_contents_by_field_name.dup.with_indifferent_access
end
#to_json ⇒ Object
Returns a String with the current contents of this object as JSON. (This will deserialize from JSON, if it hasn’t already happened.)
Always returns a string encoded in UTF-8, if we’re running on a Ruby >= 1.9 (that is, with encoding support).
# File 'lib/flex_columns/contents/column_data.rb', line 185
def to_json
  deserialize_if_necessary!

  json_hash = to_json_hash

  as_string = JSON.generate(json_hash, :allow_nan => true)
  as_string = as_string.encode(Encoding::UTF_8) if as_string.respond_to?(:encode)

  as_string
end
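The :allow_nan => true option passed to JSON.generate above matters for values such as Float::NAN, which strict JSON cannot represent. The following standalone snippet uses only Ruby's standard json library, with made-up data, to show the difference between the two modes.

```ruby
require 'json'

data = { "ratio" => Float::NAN }

# With :allow_nan, NaN is written out as the bare token NaN.
lenient = JSON.generate(data, :allow_nan => true)
lenient = lenient.encode(Encoding::UTF_8) if lenient.respond_to?(:encode)

# Without it, the generator refuses and raises JSON::GeneratorError.
strict_error = begin
  JSON.generate(data)
  nil
rescue JSON::GeneratorError => e
  e
end
```

The lenient output is not strictly valid JSON, so a parser reading it back needs to allow NaN as well (for Ruby's parser, the :allow_nan option to JSON.parse).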
#to_stored_data ⇒ Object
Returns the exact String that should be stored in the database – compressed or not, with header or not, etc. Raises FlexColumns::Errors::JsonTooLongError if the string is too long to fit in the database.
(Under PostgreSQL, with appropriate ActiveRecord and PostgreSQL support,)
# File 'lib/flex_columns/contents/column_data.rb', line 199
def to_stored_data
  out = nil

  deserialize_if_necessary!
  return to_json_hash if storage == :json

  instrument("serialize") do
    if storage == :json
      out = to_json_hash
    else
      out = to_json
      if out.length < 8 && out =~ /^\s*\{\s*\}\s*$/i
        out = @null ? nil : ""
      else
        out = to_binary_storage(out) if storage == :binary
      end
    end
  end

  actual_length = out ? out.length : 0
  if length_limit && actual_length > length_limit
    raise FlexColumns::Errors::JsonTooLongError.new(data_source, length_limit, out)
  end

  out
end
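Two decisions in the listing above are worth isolating: an empty JSON object is stored as NULL or "" depending on :null, and the length check runs on the final value. The sketch below reimplements just those two steps for the :text case; sketch_stored_text and SketchTooLongError are hypothetical names, and instrumentation, binary storage, and the real JsonTooLongError are left out.

```ruby
# Hypothetical error class standing in for FlexColumns::Errors::JsonTooLongError.
class SketchTooLongError < StandardError; end

# Mirrors the empty-object and length-limit handling for text storage only.
def sketch_stored_text(json_string, null:, length_limit: nil)
  out =
    if json_string.length < 8 && json_string =~ /^\s*\{\s*\}\s*$/
      # An empty object carries no data: store NULL, or "" for non-NULL columns.
      null ? nil : ""
    else
      json_string
    end

  actual_length = out ? out.length : 0
  if length_limit && actual_length > length_limit
    raise SketchTooLongError, "#{actual_length} characters exceeds limit of #{length_limit}"
  end

  out
end

empty_for_nullable = sketch_stored_text("{}", null: true)
empty_for_non_null = sketch_stored_text("{ }", null: false)
normal = sketch_stored_text('{"a":1}', null: true, length_limit: 100)
```

Raising rather than storing an over-long value is what protects against silent truncation by the database.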
#touch! ⇒ Object
Does nothing, other than making sure the JSON has been deserialized. This therefore has the effect both of ensuring that the stored data (if any) is valid, and also will remove any unknown keys (on save) if :unknown_fields was set to :delete.
# File 'lib/flex_columns/contents/column_data.rb', line 141
def touch!
  deserialize_if_necessary!
end
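The effect of :unknown_fields on a deserialize-then-save cycle can be sketched without the library. In the snippet below, sketch_reserialize is a hypothetical helper and the field names are made up; it only illustrates the documented difference between :preserve and :delete, not how ColumnData tracks unknown keys internally.

```ruby
require 'json'

# Splits stored JSON into declared and undeclared keys, then re-serializes
# according to the :unknown_fields policy (:preserve or :delete).
def sketch_reserialize(stored_json, known_fields, unknown_fields)
  parsed = JSON.parse(stored_json)
  known, unknown = parsed.partition { |key, _| known_fields.include?(key) }.map(&:to_h)
  to_write = unknown_fields == :preserve ? known.merge(unknown) : known
  JSON.generate(to_write)
end

stored = '{"name":"Ann","legacy_flag":true}'
preserved = sketch_reserialize(stored, ["name"], :preserve)
deleted = sketch_reserialize(stored, ["name"], :delete)
```

With :preserve the undeclared legacy_flag key survives the cycle; with :delete it is dropped, which is the cleanup that calling touch! and then saving would trigger.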