Module: JsonCsv::CsvToJson::ClassMethods

Defined in:
lib/json_csv/csv_to_json.rb

Instance Method Summary

Instance Method Details

#apply_field_casting_type(value, field_casting_type) ⇒ Object



# File 'lib/json_csv/csv_to_json.rb', line 85

def apply_field_casting_type(value, field_casting_type)
  unless FIELD_CASTING_TYPES.include?(field_casting_type)
    raise ArgumentError,
          "Invalid cast type #{field_casting_type}"
  end

  case field_casting_type
  when TYPE_INTEGER
    raise ArgumentError, "\"#{value}\" is not an integer" unless /^[0-9]+$/.match?(value.to_s)

    value.to_i
  when TYPE_FLOAT
    unless value.to_s =~ /^[0-9]+(\.[0-9]+)*$/ || value.to_s =~ /^\.[0-9]+$/
      raise ArgumentError,
            "\"#{value}\" is not a float"
    end

    value.to_f
  when TYPE_BOOLEAN
    case value.downcase
    when 'true'
      true
    when 'false'
      false
    else
      raise ArgumentError, "\"#{value}\" is not a boolean"
    end
  else
    value # fall back to string, which is the original form
  end
end
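
The validation patterns in the method above decide which strings are accepted before casting. The following is a minimal standalone sketch, not part of the library: `integer_like?` and `float_like?` are hypothetical helper names, and the patterns are copied from the method body. It may help when checking how a given CSV value would be treated.

```ruby
# Hypothetical helpers; the patterns are copied from apply_field_casting_type.

# True when the integer branch would accept the value.
def integer_like?(value)
  /^[0-9]+$/.match?(value.to_s)
end

# True when the float branch would accept the value.
def float_like?(value)
  string = value.to_s
  /^[0-9]+(\.[0-9]+)*$/.match?(string) || /^\.[0-9]+$/.match?(string)
end

integer_like?('42')   # accepted
integer_like?('-42')  # rejected: the pattern allows no sign
float_like?('.5')     # accepted by the second pattern
float_like?('5.')     # rejected: digits must follow the dot
float_like?('1.2.3')  # accepted, because the fractional group may repeat
```

Under these patterns, signed numbers and exponent notation would raise ArgumentError, while a value such as 1.2.3 passes the check and `to_f` then reads only the leading 1.2.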

#csv_file_to_hierarchical_json_hash(path_to_csv, field_casting_rules = {}, strip_value_whitespace = true) ⇒ Object

Takes flat CSV data and yields to a block for each row, presenting that row as un-flattened JSON. This method works for large CSVs and uses very little memory because it only keeps one row in memory at a time. Sample usage:

csv_file_to_hierarchical_json_hash(path_to_csv, field_casting_rules = {}, strip_value_whitespace = true) { |row_json_hash, row_number| …your block logic here… }



# File 'lib/json_csv/csv_to_json.rb', line 27

def csv_file_to_hierarchical_json_hash(path_to_csv, field_casting_rules = {}, strip_value_whitespace = true)
  # Start with row 2 because this corresponds to the SECOND row of
  # 1-indexed CSV data (where headers are row 1)
  i = 2
  CSV.foreach(path_to_csv, headers: true, header_converters: lambda { |header|
    header.strip # remove leading and trailing header whitespace
  }) do |row_data_hash|
    yield csv_row_hash_to_hierarchical_json_hash(row_data_hash, field_casting_rules, strip_value_whitespace), i
    i += 1
  end
end
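
The row-at-a-time pattern used above can be tried in isolation with only the standard `csv` library. This is a sketch under assumptions: `each_row_with_number` is a hypothetical name, the temporary file contents are made up for illustration, and the rows are yielded as flat hashes rather than un-flattened JSON.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: streams a CSV file and yields each row as a flat hash
# together with its 1-indexed row number (headers occupy row 1).
def each_row_with_number(path)
  row_number = 2
  CSV.foreach(path, headers: true, header_converters: ->(header) { header.strip }) do |row|
    yield row.to_h, row_number
    row_number += 1
  end
end

Tempfile.create(['books', '.csv']) do |file|
  # Illustrative data; note the padded " title " header, which gets stripped.
  file.write(" title ,related_books[0].title\nDune,Emma\n")
  file.flush
  each_row_with_number(file.path) do |row, row_number|
    puts "row #{row_number}: #{row.keys.inspect}"
  end
end
```

Because the block runs once per row, memory use stays roughly constant regardless of file size, which matches the behavior described for the method above.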

#csv_row_hash_to_hierarchical_json_hash(row_data_hash, field_casting_rules, strip_value_whitespace = true) ⇒ Object



# File 'lib/json_csv/csv_to_json.rb', line 39

def csv_row_hash_to_hierarchical_json_hash(row_data_hash, field_casting_rules, strip_value_whitespace = true)
  hierarchical_hash = {}
  row_data_hash.each do |key, value|
    next if value.nil? || value == '' # ignore nil or empty string values

    put_value_at_json_path(hierarchical_hash, key, value, field_casting_rules)
  end
  # Clean up empty array elements, which may have come about from CSV data
  # that was 1-indexed instead of 0-indexed.
  JsonCsv::Utils.recursively_remove_blank_fields!(hierarchical_hash)
  JsonCsv::Utils.recursively_strip_value_whitespace!(hierarchical_hash) if strip_value_whitespace
  hierarchical_hash
end

#json_path_to_pieces(json_path) ⇒ Object

Takes the given json_path and splits it into individual json path pieces, e.g. takes “related_books[1].notes_from_reviewers[0]” and converts it to:

[“related_books”, 1, “notes_from_reviewers”, 0]


# File 'lib/json_csv/csv_to_json.rb', line 120

def json_path_to_pieces(json_path)
  # split on...
  # '].' (when preceded by a number)
  # OR
  # '[' (when followed by a number)
  # OR
  # ']' (when preceded by a number)
  # OR
  # '.' (always)
  # ...and remove empty elements (which only come up when you're working with
  # a json_path like '[0]', which splits between the first bracket and the number)
  pieces = json_path.split(/(?<=\d)\]\.|\[(?=\d)|(?<=\d)\]|\./).reject { |piece| piece == '' }
  pieces.map { |piece| piece.to_i.to_s == piece ? piece.to_i : piece } # numeric pieces should be actual numbers
end
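
To see the split in action outside the module, the same regular expression can be exercised in a standalone method. This is only a sketch: `split_json_path` is a hypothetical name, and its body is copied from the method above.

```ruby
# Hypothetical standalone copy of the splitting logic shown above.
def split_json_path(json_path)
  pieces = json_path.split(/(?<=\d)\]\.|\[(?=\d)|(?<=\d)\]|\./).reject { |piece| piece == '' }
  # Pieces that are plain integers become Integer array indexes.
  pieces.map { |piece| piece.to_i.to_s == piece ? piece.to_i : piece }
end

split_json_path('related_books[1].notes_from_reviewers[0]')
# => ["related_books", 1, "notes_from_reviewers", 0]
split_json_path('[0]')
# => [0]
split_json_path('title')
# => ["title"]
```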

#pieces_to_json_path(pieces) ⇒ Object

Generates a string json path from the given pieces, e.g. takes [“related_books”, 1, “notes_from_reviewers”, 0] and converts it to: “related_books[1].notes_from_reviewers[0]”



# File 'lib/json_csv/csv_to_json.rb', line 138

def pieces_to_json_path(pieces)
  json_path = ''
  pieces.each do |piece|
    if piece.is_a?(Integer)
      json_path += "[#{piece}]"
    else
      json_path += '.' unless json_path.empty?
      json_path += piece
    end
  end
  json_path
end
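
The joining rule above (integers become bracketed indexes, strings are joined with dots) can be sketched as a standalone method. `join_json_path` is a hypothetical name, and this version appends to one mutable string instead of reassigning it; the resulting paths should be the same.

```ruby
# Hypothetical standalone equivalent of the joining logic shown above.
def join_json_path(pieces)
  pieces.each_with_object(+'') do |piece, path|
    if piece.is_a?(Integer)
      path << "[#{piece}]"                # array index
    else
      path << '.' unless path.empty?      # separator between keys
      path << piece
    end
  end
end

join_json_path(['related_books', 1, 'notes_from_reviewers', 0])
# => "related_books[1].notes_from_reviewers[0]"
join_json_path([0, 'title'])
# => "[0].title"
```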

#put_value_at_json_path(obj, json_path, value, field_casting_rules = {}, full_json_path_from_top = json_path) ⇒ Object

For the given obj, puts the given value at the given json_path, creating nested elements as needed. This method calls itself recursively when placing a value at a nested path, and during this sequence of calls the obj param may either be a hash or an array.



# File 'lib/json_csv/csv_to_json.rb', line 57

def put_value_at_json_path(obj, json_path, value, field_casting_rules = {}, full_json_path_from_top = json_path)
  json_path_pieces = json_path_to_pieces(json_path)

  if json_path_pieces.length == 1
    # If the full_json_path_from_top matches one of the field_casting_rules,
    # then cast this field to the specified cast type
    full_json_path_from_top_as_field_casting_rule_pattern =
      real_json_path_to_field_casting_rule_pattern(full_json_path_from_top)
    obj[json_path_pieces[0]] =
      if field_casting_rules.key?(full_json_path_from_top_as_field_casting_rule_pattern)
        apply_field_casting_type(value,
                                 field_casting_rules[full_json_path_from_top_as_field_casting_rule_pattern])
      else
        value
      end
  else
    obj[json_path_pieces[0]] ||= (json_path_pieces[1].is_a?(Integer) ? [] : {})
    put_value_at_json_path(obj[json_path_pieces[0]], pieces_to_json_path(json_path_pieces[1..]), value,
                           field_casting_rules, full_json_path_from_top)
  end
end
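
The recursion above can be illustrated with a simplified standalone sketch. `put_at` is a hypothetical helper that takes already-split path pieces and omits field casting; it is meant only to show how intermediate hashes and arrays get created.

```ruby
# Hypothetical simplified version of the recursion shown above (no casting).
# obj may be a Hash or an Array, depending on the depth of the call.
def put_at(obj, pieces, value)
  head, *rest = pieces
  if rest.empty?
    obj[head] = value
  else
    # An Integer next piece means the container at this key is an array.
    obj[head] ||= (rest.first.is_a?(Integer) ? [] : {})
    put_at(obj[head], rest, value)
  end
end

doc = {}
put_at(doc, ['title'], 'Dune')
put_at(doc, ['related_books', 1, 'title'], 'Emma')
# doc is now:
# { 'title' => 'Dune', 'related_books' => [nil, { 'title' => 'Emma' }] }
```

The `nil` at index 0 appears because the path started at index 1. Blank elements of this kind are what the cleanup comment in csv_row_hash_to_hierarchical_json_hash refers to.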

#real_json_path_to_field_casting_rule_pattern(full_json_path_from_top) ⇒ Object

Takes a real json_path like “related_books[1].notes_from_reviewers[0]” and converts it to a field_casting_rule_pattern like: “related_books[x].notes_from_reviewers[x]”



# File 'lib/json_csv/csv_to_json.rb', line 81

def real_json_path_to_field_casting_rule_pattern(full_json_path_from_top)
  full_json_path_from_top.gsub(/\d+/, 'x')
end
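
As a quick illustration of the substitution above, here is a standalone sketch; `to_rule_pattern` is a hypothetical name wrapping the same `gsub` call.

```ruby
# Hypothetical standalone copy of the substitution shown above.
def to_rule_pattern(json_path)
  json_path.gsub(/\d+/, 'x')
end

to_rule_pattern('related_books[1].notes_from_reviewers[0]')
# => "related_books[x].notes_from_reviewers[x]"
to_rule_pattern('address2[0]')
# => "addressx[x]"
```

Judging from put_value_at_json_path, keys in field_casting_rules are looked up in this pattern form. Since every run of digits is replaced, digits inside a field name are rewritten as well, as the second call shows.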