object Chapter4
Source: Chapter4.scala
Value Members
- val allowedIterations: Long
- val emptyFn: StateValueFn[Position, DecayState[Double]]
- val epsilon: Double
- def figureFourOne(): Unit
- def fourOne(inPlace: Boolean): (StateValueFn[Position, DecayState[Double]], Long)
- def fourTwo(inPlace: Boolean): (StateValueFn[InvPair, DecayState[Double]], Config, Long)
The big differences from the book version are:
- Currently our Poisson distribution normalizes over the allowed values, rather than just truncating the probability of any value greater than the max to zero (see the sketch after this list).
- Our Greedy policy randomly chooses from the entire greedy set, vs always choosing the "first" entry the way NumPy does.
The Python version also keeps an actual greedy policy, which means that the policy starts by returning 0 no matter what, by design, instead of acting as a random policy until it knows any better. Without that behavior the generated values match.
TODO: currently the sweepUntil function only supports value iteration, updating on every single sweep. The book actually wants to do a full round of policy evaluation before doing any policy improvement. We need to support that.
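A minimal sketch of the Poisson difference described above, assuming nothing about ScalaRL's actual distribution API (all names here are invented for illustration): the book-style version simply drops the mass above `max`, so the remaining probabilities no longer sum to 1, while the ScalaRL-style version renormalizes over the allowed values.

```scala
object PoissonSketch {
  // Raw Poisson PMF, computed in log space for numerical stability:
  // p(k) = e^(-lambda) * lambda^k / k!
  def pmf(lambda: Double, k: Int): Double = {
    val logP = -lambda + k * math.log(lambda) -
      (1 to k).map(i => math.log(i.toDouble)).sum
    math.exp(logP)
  }

  // Book-style: values above `max` get probability zero; the
  // remaining masses sum to less than 1.
  def truncated(lambda: Double, max: Int): Seq[Double] =
    (0 to max).map(pmf(lambda, _))

  // ScalaRL-style: renormalize over the allowed values so the
  // truncated distribution sums to 1 again.
  def normalized(lambda: Double, max: Int): Seq[Double] = {
    val raw = truncated(lambda, max)
    val total = raw.sum
    raw.map(_ / total)
  }
}
```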
- val gamma: Double
- val gridConf: Config
- def main(items: Array[String]): Unit
- def runCarRental(): Unit
I'm leaving this in a nightmare state for now. To finish this out, we really need to:
- add support for alternating policy evaluation and policy-stability checks (see the sketch after this list);
- come up with some way of turning a particular policy's decisions into a heat map that's not so hardcoded;
- stop the graph library from exploding when I cancel a run, for Heatmap.
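A minimal sketch of the evaluation/improvement alternation the first TODO asks for, i.e. classic policy iteration from Chapter 4 of the book. Every name here (`policyIteration`, `evaluate`, `improve`, the `Map`-based policy encoding) is an assumption for illustration; ScalaRL's real types differ.

```scala
object PolicyIterationSketch {
  @annotation.tailrec
  def policyIteration[Obs](
      policy: Map[Obs, Int],                        // current deterministic policy
      evaluate: Map[Obs, Int] => Map[Obs, Double],  // full round of policy evaluation
      improve: Map[Obs, Double] => Map[Obs, Int]    // greedy policy improvement
  ): (Map[Obs, Int], Map[Obs, Double]) = {
    val values  = evaluate(policy)  // evaluate to (near) convergence first...
    val updated = improve(values)   // ...then improve once
    if (updated == policy) (policy, values) // policy-stability check
    else policyIteration(updated, evaluate, improve)
  }
}
```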
- def shouldStop[Obs, T](l: StateValueFn[Obs, T], r: StateValueFn[Obs, T], iterations: Long, verbose: Boolean = false)(implicit arg0: ToDouble[T]): Boolean
- def vfToSeqPoints(vf: StateValueFn[InvPair, DecayState[Double]]): Seq[Seq[Double]]
This currently is not great because we don't have a way of automatically binning the data and generating that graph; this conversion is custom.
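A minimal sketch of what this hand-rolled conversion might look like, assuming `InvPair` is a pair of inventory counts in `0 to maxCars` and that some `valueAt` function can read a `Double` out of the value function; both names are assumptions for illustration.

```scala
// Build the Seq[Seq[Double]] heat-map grid row by row: one row per
// first-location inventory, one column per second-location inventory.
def toGrid(valueAt: ((Int, Int)) => Double, maxCars: Int): Seq[Seq[Double]] =
  (0 to maxCars).map { a =>
    (0 to maxCars).map(b => valueAt((a, b)))
  }
```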