object Chapter4
Source: Chapter4.scala
Value Members
- val allowedIterations: Long
- val emptyFn: StateValueFn[Position, DecayState[Double]]
- val epsilon: Double
- def figureFourOne(): Unit
- def fourOne(inPlace: Boolean): (StateValueFn[Position, DecayState[Double]], Long)
- def fourTwo(inPlace: Boolean): (StateValueFn[InvPair, DecayState[Double]], Config, Long)
The big differences from the book version are:
- Currently our Poisson distribution normalizes over the allowed values, rather than just truncating the probability of any value greater than the max to zero (see the sketch after this list).
- Our Greedy policy randomly chooses from the entire greedy set, vs always choosing the "first" entry the way NumPy does.
The Python version also keeps an actual greedy policy, which means that the policy starts by returning 0 no matter what, by design, instead of acting as a random policy until it knows any better. Without that behavior the generated values match.
TODO: currently the sweepUntil function only supports value iteration, updating on every single sweep. The book actually wants to do a full round of policy evaluation before doing any policy improvement. We need to support that.
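A minimal sketch of the Poisson difference described above, assuming nothing about ScalaRL's actual distribution API (all names here are invented for illustration): the book-style version simply drops the mass above `max`, so the remaining probabilities no longer sum to 1, while the ScalaRL-style version renormalizes over the allowed values.

```scala
object PoissonSketch {
  // Raw Poisson PMF, computed in log space for numerical stability:
  // p(k) = e^(-lambda) * lambda^k / k!
  def pmf(lambda: Double, k: Int): Double = {
    val logP = -lambda + k * math.log(lambda) -
      (1 to k).map(i => math.log(i.toDouble)).sum
    math.exp(logP)
  }

  // Book-style: values above `max` get probability zero; the
  // remaining masses sum to less than 1.
  def truncated(lambda: Double, max: Int): Seq[Double] =
    (0 to max).map(pmf(lambda, _))

  // ScalaRL-style: renormalize over the allowed values so the
  // truncated distribution sums to 1 again.
  def normalized(lambda: Double, max: Int): Seq[Double] = {
    val raw = truncated(lambda, max)
    val total = raw.sum
    raw.map(_ / total)
  }
}
```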
- val gamma: Double
- val gridConf: Config
- def main(items: Array[String]): Unit
- def runCarRental(): Unit
I'm leaving this in a nightmare state for now. To finish this out, we really need to:
- add support for alternating policy evaluation and policy-stability checks (see the sketch after this list);
- come up with some way of turning a particular policy's decisions into a heat map that's not so hardcoded;
- stop the graph library from exploding when I cancel a run, for Heatmap.
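A minimal sketch of the evaluation/improvement alternation the first TODO asks for, i.e. classic policy iteration from Chapter 4 of the book. Every name here (`policyIteration`, `evaluate`, `improve`, the `Map`-based policy encoding) is an assumption for illustration; ScalaRL's real types differ.

```scala
object PolicyIterationSketch {
  @annotation.tailrec
  def policyIteration[Obs](
      policy: Map[Obs, Int],                        // current deterministic policy
      evaluate: Map[Obs, Int] => Map[Obs, Double],  // full round of policy evaluation
      improve: Map[Obs, Double] => Map[Obs, Int]    // greedy policy improvement
  ): (Map[Obs, Int], Map[Obs, Double]) = {
    val values  = evaluate(policy)  // evaluate to (near) convergence first...
    val updated = improve(values)   // ...then improve once
    if (updated == policy) (policy, values) // policy-stability check
    else policyIteration(updated, evaluate, improve)
  }
}
```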
- def shouldStop[Obs, T](l: StateValueFn[Obs, T], r: StateValueFn[Obs, T], iterations: Long, verbose: Boolean = false)(implicit arg0: ToDouble[T]): Boolean
- def vfToSeqPoints(vf: StateValueFn[InvPair, DecayState[Double]]): Seq[Seq[Double]]
This currently is not great because we don't have a way of automatically binning the data and generating that graph; this conversion is custom.
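A minimal sketch of what this hand-rolled conversion might look like, assuming `InvPair` is a pair of inventory counts in `0 to maxCars` and that some `valueAt` function can read a `Double` out of the value function; both names are assumptions for illustration.

```scala
// Build the Seq[Seq[Double]] heat-map grid row by row: one row per
// first-location inventory, one column per second-location inventory.
def toGrid(valueAt: ((Int, Int)) => Double, maxCars: Int): Seq[Seq[Double]] =
  (0 to maxCars).map { a =>
    (0 to maxCars).map(b => valueAt((a, b)))
  }
```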