Class: FiniteMDP::HashModel
- Inherits: Object
- Includes: Model
- Defined in: lib/finite_mdp/hash_model.rb
Overview
A finite Markov decision process model for which the transition probabilities and rewards are specified using nested hash tables.
The structure of the nested hash is as follows:
hash[:s] #=> a Hash that maps actions to successor states
hash[:s][:a] #=> a Hash from successor states to pairs (see next)
hash[:s][:a][:t] #=> an Array [probability, reward] for transition (s,a,t)
The states and actions can be arbitrary objects; see notes for Model.
The TableModel is an alternative way of storing the same data. An illustrative construction from such a nested hash is shown below.
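As a minimal sketch (the states, actions, probabilities and rewards here are made up for illustration), a small two-state model can be built directly from a nested hash of this shape:

require 'finite_mdp'

# From :s1, action :go reaches :s2 with probability 0.8 (reward 1.0) or
# stays in :s1 with probability 0.2 (reward 0.0); :s2 is absorbing.
model = FiniteMDP::HashModel.new(
  s1: {
    go: { s2: [0.8, 1.0], s1: [0.2, 0.0] }
  },
  s2: {
    stay: { s2: [1.0, 0.0] }
  }
)

model.check_transition_probabilities_sum # sanity check provided by the included Model module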
Instance Attribute Summary collapse
-
#hash ⇒ Hash<state, Hash<action, Hash<state, [Float, Float]>>>
See notes for HashModel for an explanation of this structure.
Class Method Summary collapse
-
.from_model(model, sparse = true) ⇒ HashModel
Convert a generic model into a hash model.
Instance Method Summary collapse
-
#actions(state) ⇒ Array<action>
Actions that are valid for the given state; see Model#actions.
-
#initialize(hash) ⇒ HashModel
constructor
A new instance of HashModel.
-
#next_states(state, action) ⇒ Array<state>
Possible successor states after taking the given action in the given state; see Model#next_states.
-
#reward(state, action, next_state) ⇒ Float?
Reward for a given transition; see Model#reward.
-
#states ⇒ Array<state>
States in this model; see Model#states.
-
#transition_probability(state, action, next_state) ⇒ Float
Probability of the given transition; see Model#transition_probability.
Methods included from Model
#check_transition_probabilities_sum, #terminal_states, #transition_probability_sums
Constructor Details
#initialize(hash) ⇒ HashModel
Returns a new instance of HashModel.
# File 'lib/finite_mdp/hash_model.rb', line 23

def initialize(hash)
  @hash = hash
end
Instance Attribute Details
#hash ⇒ Hash<state, Hash<action, Hash<state, [Float, Float]>>>
Returns the nested transition hash; see notes for FiniteMDP::HashModel for an explanation of this structure.
# File 'lib/finite_mdp/hash_model.rb', line 31

def hash
  @hash
end
Class Method Details
.from_model(model, sparse = true) ⇒ HashModel
Convert a generic model into a hash model.
# File 'lib/finite_mdp/hash_model.rb', line 109

def self.from_model(model, sparse = true)
  hash = {}
  model.states.each do |state|
    hash[state] ||= {}
    model.actions(state).each do |action|
      hash[state][action] ||= {}
      model.next_states(state, action).each do |next_state|
        pr = model.transition_probability(state, action, next_state)
        next unless pr > 0 || !sparse
        hash[state][action][next_state] =
          [pr, model.reward(state, action, next_state)]
      end
    end
  end
  FiniteMDP::HashModel.new(hash)
end
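A hypothetical usage sketch, reusing the model built in the overview above (any object implementing the Model interface, such as a FiniteMDP::TableModel, can be converted the same way):

# Copy the overview model into a fresh HashModel; with the default
# sparse = true, transitions with zero probability are omitted.
copy = FiniteMDP::HashModel.from_model(model)
copy.states.sort == model.states.sort #=> true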
Instance Method Details
#actions(state) ⇒ Array<action>
Actions that are valid for the given state; see Model#actions.
# File 'lib/finite_mdp/hash_model.rb', line 49

def actions(state)
  hash[state].keys
end
#next_states(state, action) ⇒ Array<state>
Possible successor states after taking the given action in the given state; see Model#next_states.
# File 'lib/finite_mdp/hash_model.rb', line 63

def next_states(state, action)
  hash[state][action].keys
end
#reward(state, action, next_state) ⇒ Float?
Reward for a given transition; see Model#reward.
# File 'lib/finite_mdp/hash_model.rb', line 94

def reward(state, action, next_state)
  _probability, reward = hash[state][action][next_state]
  reward
end
#states ⇒ Array<state>
States in this model; see Model#states.
# File 'lib/finite_mdp/hash_model.rb', line 38

def states
  hash.keys
end
#transition_probability(state, action, next_state) ⇒ Float
Probability of the given transition; see Model#transition_probability.
# File 'lib/finite_mdp/hash_model.rb', line 78

def transition_probability(state, action, next_state)
  probability, _reward = hash[state][action][next_state]
  probability || 0
end
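Putting the accessors together, an illustrative sketch that queries the two-state model from the overview (the values shown hold only for that made-up example):

model.states                                #=> [:s1, :s2]
model.actions(:s1)                          #=> [:go]
model.next_states(:s1, :go)                 #=> [:s2, :s1]
model.transition_probability(:s1, :go, :s2) #=> 0.8
model.reward(:s1, :go, :s2)                 #=> 1.0

Note that, as the source above shows, a transition absent from the nested hash has transition probability 0 and a nil reward.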