Class: FiniteMDP::HashModel
- Inherits: Object
- Includes: Model
- Defined in: lib/finite_mdp/hash_model.rb
Overview
A finite Markov decision process model for which the transition probabilities and rewards are specified using nested hash tables.
The structure of the nested hash is as follows:
hash[:s] #=> a Hash that maps actions to successor states
hash[:s][:a] #=> a Hash from successor states to pairs (see next)
hash[:s][:a][:t] #=> an Array [probability, reward] for transition (s,a,t)
The states and actions can be arbitrary objects; see notes for Model.
The TableModel is an alternative way of storing the same data. An illustrative construction from such a nested hash is shown below.
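As a minimal sketch (the states, actions, probabilities and rewards here are made up for illustration), a small two-state model can be built directly from a nested hash of this shape:

require 'finite_mdp'

# From :s1, action :go reaches :s2 with probability 0.8 (reward 1.0) or
# stays in :s1 with probability 0.2 (reward 0.0); :s2 is absorbing.
model = FiniteMDP::HashModel.new(
  s1: {
    go: { s2: [0.8, 1.0], s1: [0.2, 0.0] }
  },
  s2: {
    stay: { s2: [1.0, 0.0] }
  }
)

model.check_transition_probabilities_sum # sanity check provided by the included Model module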
Instance Attribute Summary collapse
-
#hash ⇒ Hash<state, Hash<action, Hash<state, [Float, Float]>>>
See notes for HashModel for an explanation of this structure.
Class Method Summary collapse
-
.from_model(model, sparse = true) ⇒ HashModel
Convert a generic model into a hash model.
Instance Method Summary collapse
-
#actions(state) ⇒ Array<action>
Actions that are valid for the given state; see Model#actions.
-
#initialize(hash) ⇒ HashModel
constructor
A new instance of HashModel.
-
#next_states(state, action) ⇒ Array<state>
Possible successor states after taking the given action in the given state; see Model#next_states.
-
#reward(state, action, next_state) ⇒ Float?
Reward for a given transition; see Model#reward.
-
#states ⇒ Array<state>
States in this model; see Model#states.
-
#transition_probability(state, action, next_state) ⇒ Float
Probability of the given transition; see Model#transition_probability.
Methods included from Model
#check_transition_probabilities_sum, #terminal_states, #transition_probability_sums
Constructor Details
#initialize(hash) ⇒ HashModel
Returns a new instance of HashModel.
# File 'lib/finite_mdp/hash_model.rb', line 23

def initialize(hash)
  @hash = hash
end
Instance Attribute Details
#hash ⇒ Hash<state, Hash<action, Hash<state, [Float, Float]>>>
Returns the nested transition hash; see notes for FiniteMDP::HashModel for an explanation of this structure.
# File 'lib/finite_mdp/hash_model.rb', line 31

def hash
  @hash
end
Class Method Details
.from_model(model, sparse = true) ⇒ HashModel
Convert a generic model into a hash model.
# File 'lib/finite_mdp/hash_model.rb', line 109

def self.from_model(model, sparse = true)
  hash = {}
  model.states.each do |state|
    hash[state] ||= {}
    model.actions(state).each do |action|
      hash[state][action] ||= {}
      model.next_states(state, action).each do |next_state|
        pr = model.transition_probability(state, action, next_state)
        next unless pr > 0 || !sparse
        hash[state][action][next_state] =
          [pr, model.reward(state, action, next_state)]
      end
    end
  end
  FiniteMDP::HashModel.new(hash)
end
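A hypothetical usage sketch, reusing the model built in the overview above (any object implementing the Model interface, such as a FiniteMDP::TableModel, can be converted the same way):

# Copy the overview model into a fresh HashModel; with the default
# sparse = true, transitions with zero probability are omitted.
copy = FiniteMDP::HashModel.from_model(model)
copy.states.sort == model.states.sort #=> true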
Instance Method Details
#actions(state) ⇒ Array<action>
Actions that are valid for the given state; see Model#actions.
# File 'lib/finite_mdp/hash_model.rb', line 49

def actions(state)
  hash[state].keys
end
#next_states(state, action) ⇒ Array<state>
Possible successor states after taking the given action in the given state; see Model#next_states.
# File 'lib/finite_mdp/hash_model.rb', line 63

def next_states(state, action)
  hash[state][action].keys
end
#reward(state, action, next_state) ⇒ Float?
Reward for a given transition; see Model#reward.
# File 'lib/finite_mdp/hash_model.rb', line 94

def reward(state, action, next_state)
  _probability, reward = hash[state][action][next_state]
  reward
end
#states ⇒ Array<state>
States in this model; see Model#states.
# File 'lib/finite_mdp/hash_model.rb', line 38

def states
  hash.keys
end
#transition_probability(state, action, next_state) ⇒ Float
Probability of the given transition; see Model#transition_probability.
# File 'lib/finite_mdp/hash_model.rb', line 78

def transition_probability(state, action, next_state)
  probability, _reward = hash[state][action][next_state]
  probability || 0
end
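Putting the accessors together, an illustrative sketch that queries the two-state model from the overview (the values shown hold only for that made-up example):

model.states                                #=> [:s1, :s2]
model.actions(:s1)                          #=> [:go]
model.next_states(:s1, :go)                 #=> [:s2, :s1]
model.transition_probability(:s1, :go, :s2) #=> 0.8
model.reward(:s1, :go, :s2)                 #=> 1.0

Note that, as the source above shows, a transition absent from the nested hash has transition probability 0 and a nil reward.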