object Chapter5
- Source: Chapter5.scala
Type Members
- type Loop[M[_], T] = (T) ⇒ M[T]
Value Members
- implicit val evaluator: Numeric[Real]
- def figureFiveFour(): Unit
  Uses the random policy as the behavior policy to evaluate the stickHigh policy, and compares ordinary and weighted off-policy sampling.
- def figureFiveOne(): Unit
  Explores the stickHigh strategy over a range of states, tracking what happens with a usable ace and with no usable ace.
- def figureFiveThree(): Unit
  Uses the random policy as the behavior policy to evaluate the stickHigh policy, and compares ordinary and weighted off-policy sampling.
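For reference, the two estimators that figureFiveThree and figureFiveFour compare are the standard ordinary and weighted importance-sampling estimators (textbook definitions, not code from this library), where π is the target policy (here stickHigh), b is the behavior policy (here random), G_t is the return and ρ the importance-sampling ratio:

```latex
\rho_{t:T-1} = \prod_{k=t}^{T-1} \frac{\pi(A_k \mid S_k)}{b(A_k \mid S_k)},
\qquad
V_{\text{ordinary}}(s) = \frac{\sum_{t \in \mathcal{T}(s)} \rho_{t:T(t)-1} \, G_t}{|\mathcal{T}(s)|},
\qquad
V_{\text{weighted}}(s) = \frac{\sum_{t \in \mathcal{T}(s)} \rho_{t:T(t)-1} \, G_t}{\sum_{t \in \mathcal{T}(s)} \rho_{t:T(t)-1}}
```

Ordinary importance sampling is unbiased but can have very high variance; weighted importance sampling is biased but typically has far lower variance, which is the contrast these figures plot.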
- def figureFiveTwo(): Unit
  This uses exploring starts to capture the optimal policy. The policy gets updated on every play, at the end of the trajectory walk (see the sketch below):
  - go through a single round of the game, then
  - update the policy to use the new value function.
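A minimal sketch of that exploring-starts loop, where every name (monteCarloES, playRound, exploringStart) is an illustrative stand-in rather than ScalaRL's actual API:

```scala
// Sketch of Monte Carlo control with exploring starts.
// All names here are hypothetical stand-ins, not ScalaRL's real API.
def monteCarloES[S, A](
    episodes: Int,
    // plays one game greedily w.r.t. the current estimates, starting from the
    // given (state, action) pair; returns the visited pairs with their returns
    playRound: (Map[(S, A), Double], (S, A)) => List[((S, A), Double)],
    exploringStart: () => (S, A)
): Map[(S, A), Double] = {
  var q      = Map.empty[(S, A), Double]
  var counts = Map.empty[(S, A), Int]
  (1 to episodes).foreach { _ =>
    // go through a single round of the game from a random (state, action) start...
    val trajectory = playRound(q, exploringStart())
    // ...then walk back along the trajectory and update the action-value
    // estimates; the greedy policy derived from q changes after every play.
    trajectory.foreach { case (sa, ret) =>
      val n   = counts.getOrElse(sa, 0) + 1
      val old = q.getOrElse(sa, 0.0)
      counts += sa -> n
      q += sa -> (old + (ret - old) / n)
    }
  }
  q
}
```

The real figureFiveTwo works in terms of the library's Generator, Policy and ActionValueFn machinery; the sketch only shows the shape of the play-then-update cycle.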
- def importanceSampling(): Unit
- val limited: Generator[State[AgentView, Action, Double, Generator]]
- def limitedM[M[_]](state: M[Blackjack[M]])(implicit arg0: Functor[M]): M[State[AgentView, Action, Double, M]]
  Is this appreciably slower? This is going to be useful, in any case, when I'm working with the tests.
- def main(items: Array[String]): Unit
- def random[M[_]]: Policy[AgentView, Action, Double, Cat, M]
- implicit val rng: RNG
- val starter: Generator[Blackjack[Generator]]
- def stickHigh[S[_]](hitBelow: Int): Policy[AgentView, Action, Double, Id, S]
  Simple blackjack policy for the demos below.
- def stickHighCat[S[_]](hitBelow: Int): Policy[AgentView, Action, Double, Cat, S]
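As a usage sketch for the two constructors above (the ScalaRL imports are assumed to be in scope, and the threshold of 20 is picked purely for illustration):

```scala
// Hypothetical usage of stickHigh and stickHighCat; hitBelow = 20 is an
// assumed example threshold, not a value taken from the library.
val deterministic: Policy[AgentView, Action, Double, Id, Generator] =
  Chapter5.stickHigh[Generator](hitBelow = 20)

val stochastic: Policy[AgentView, Action, Double, Cat, Generator] =
  Chapter5.stickHighCat[Generator](hitBelow = 20)
```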
- val uniformStarts: Generator[Blackjack[Generator]]
- def updateFn[Obs, A, R, G, M[_]](g: M[State[Obs, A, R, M]], agg: MonoidAggregator[SARS[Obs, A, R, M], G, Option[G]], policyFn: (ActionValueFn[Obs, A, G]) ⇒ Policy[Obs, A, R, M, M])(implicit arg0: Monad[M]): Loop[M, ActionValueFn[Obs, A, G]]
  Obs, A, R and M have to line up with the state. G is the type used to walk back along the trajectory: with no decay, supply a Double; with decay, supply a DecayState. The aggregation itself then happens internally, in the value function.
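Since the result is a Loop[M, ActionValueFn[Obs, A, G]], i.e. just a function T ⇒ M[T], driving it means repeatedly flatMapping. A small generic helper along these lines would run it for a fixed number of sweeps (a sketch built only on cats' Monad, not a helper shipped with ScalaRL):

```scala
import cats.Monad
import cats.syntax.functor._

// Runs any Loop[M, T] (a function T => M[T]) for `steps` iterations,
// starting from `init`. Generic sketch, not part of the library.
def iterateLoop[M[_], T](loop: T => M[T])(init: T, steps: Int)(implicit M: Monad[M]): M[T] =
  M.tailRecM((init, steps)) { case (t, remaining) =>
    if (remaining <= 0) M.pure(Right(t))
    else loop(t).map(next => Left((next, remaining - 1)))
  }
```

With M = Generator and a loop produced by updateFn, something like iterateLoop(loop)(initialFn, n) would yield a Generator of the trained ActionValueFn; the initial value function and iteration count are, again, assumptions for the example.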
ScalaRL
This is the API documentation for the ScalaRL functional reinforcement learning library.
Further documentation for ScalaRL can be found at the documentation site.
Check out the ScalaRL package list for all the goods.