Class: GenericSegmentTreeInternal

Inherits:
Object
  • Object
show all
Includes:
Shared::BinaryTreeArithmetic
Defined in:
lib/data_structures_rmolinari/generic_segment_tree_internal.rb

Overview

TODO:
  • provide a data-update operation like update_val_at(idx, val)

    • this is O(log n)

    • note that this may need some rework. Consider something like IndexOfMaxVal: @merge needs to know about the underlying data in that case. Hmmm. Maybe the lambda can close over the data in a way that makes it possible to change the data “from the outside”. Yes:

      a = [1,2,3]
      foo = ->() { a.max }
      foo.call # 3
      a = [1,2,4]
      foo.call # 4
      
  • Offer an optional parameter base_case_value_extractor (<– need better name) to be used in #determine_val in the case that left == tree_l && right == tree_r instead of simply returning @tree

    • Use case: cp-algorithms.com/data_structures/segment_tree.html#saving-the-entire-subarrays-in-each-vertex, such as finding the least element in a subarray l..r no smaller than a given value x. In this case we store a sorted version the entire subarray at each node and use a binary search on it.

    • the default value would simply be the identity function.

    • NOTE that in this case, we have different “combine” functions in #determine_val and #build. In #build we would combine sorted lists into a larger sorted list. In #determine_val we combine results via #min.

    • Think about the interface before doing this.

A Segment Tree, which can be used for various interval-related purposes, like efficiently finding the sum (or min or max) on a arbitrary subarray of a given array.

There is an excellent description of the data structure at cp-algorithms.com/data_structures/segment_tree.html. The Wikipedia article (en.wikipedia.org/wiki/Segment_tree) appears to describe a different data structure which is sometimes called an “interval tree.”

For more details (and some close-to-metal analysis of run time, especially for large datasets) see en.algorithmica.org/hpc/data-structures/segment-trees/. In particular, this shows how to do a bottom-up implementation, which is faster, at least for large datasets and cache-relevant compiled code.

This is a generic implementation.

We do O(n) work to build the internal data structure at initialization. Then we answer queries in O(log n) time.

Instance Method Summary collapse

Constructor Details

#initialize(combine:, single_cell_array_val:, size:, identity:) ⇒ GenericSegmentTreeInternal

Construct a concrete instance of a Segment Tree. See details at the links above for the underlying concepts here.

Parameters:

  • combine

    a lambda that takes two values and munges them into a combined value.

    • For example, if we are calculating sums over subintervals, combine.call(a, b) = a + b, while if we are doing maxima we will return max(a, b)

  • single_cell_array_val

    a lambda that takes an index i and returns the value we need to store in the #build operation for the subinterval i..i. This is often simply be the value data, but in some cases - like “index of max val” - it will be something else.

  • size

    the size of the underlying data array, used in certain internal arithmetic.

  • identity

    is the value to return when we are querying on an empty interval

    • for sums, this will be zero; for maxima, this will be -Infinity, etc



51
52
53
54
55
56
57
58
59
# File 'lib/data_structures_rmolinari/generic_segment_tree_internal.rb', line 51

def initialize(combine:, single_cell_array_val:, size:, identity:)
  @combine = combine
  @single_cell_array_val = single_cell_array_val
  @size = size
  @identity = identity

  @tree = []
  build(root, 0, @size - 1)
end

Instance Method Details

#query_on(left, right) ⇒ Object

The desired value (max, sum, etc.) on the subinterval left..right. The type of the return value depends on the concrete instance of the segment tree.

Parameters:

  • left

    the left end of the subinterval.

  • right

    the right end (inclusive) of the subinterval.



66
67
68
69
70
71
72
# File 'lib/data_structures_rmolinari/generic_segment_tree_internal.rb', line 66

def query_on(left, right)
  raise "Bad query interval #{left}..#{right}" if left.negative? || right >= @size

  return @identity if left > right # empty interval

  determine_val(root, left, right, 0, @size - 1)
end