object Chapter5
- Source: Chapter5.scala
Type Members
- type Loop[M[_], T] = (T) ⇒ M[T]
Value Members
- implicit val evaluator: Numeric[Real]
- def figureFiveFour(): Unit
  Uses the random policy as the behavior policy to evaluate the stickHigh policy, and compares ordinary and weighted off-policy sampling.
- def figureFiveOne(): Unit
  Explores the stickHigh strategy over a range of states, tracking what happens with a usable ace and with no usable ace.
- def figureFiveThree(): Unit
  Uses the random policy as the behavior policy to evaluate the stickHigh policy, and compares ordinary and weighted off-policy sampling.
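For reference, the two estimators that figureFiveThree and figureFiveFour compare are the standard ordinary and weighted importance-sampling estimators (textbook definitions, not code from this library), where π is the target policy (here stickHigh), b is the behavior policy (here random), G_t is the return and ρ the importance-sampling ratio:

```latex
\rho_{t:T-1} = \prod_{k=t}^{T-1} \frac{\pi(A_k \mid S_k)}{b(A_k \mid S_k)},
\qquad
V_{\text{ordinary}}(s) = \frac{\sum_{t \in \mathcal{T}(s)} \rho_{t:T(t)-1} \, G_t}{|\mathcal{T}(s)|},
\qquad
V_{\text{weighted}}(s) = \frac{\sum_{t \in \mathcal{T}(s)} \rho_{t:T(t)-1} \, G_t}{\sum_{t \in \mathcal{T}(s)} \rho_{t:T(t)-1}}
```

Ordinary importance sampling is unbiased but can have very high variance; weighted importance sampling is biased but typically has far lower variance, which is the contrast these figures plot.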
- def figureFiveTwo(): Unit
  This uses exploring starts to capture the optimal policy. The policy gets updated on every play, at the end of the trajectory walk (see the sketch below):
  - go through a single round of the game, then
  - update the policy to use the new value function.
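A minimal sketch of that exploring-starts loop, where every name (monteCarloES, playRound, exploringStart) is an illustrative stand-in rather than ScalaRL's actual API:

```scala
// Sketch of Monte Carlo control with exploring starts.
// All names here are hypothetical stand-ins, not ScalaRL's real API.
def monteCarloES[S, A](
    episodes: Int,
    // plays one game greedily w.r.t. the current estimates, starting from the
    // given (state, action) pair; returns the visited pairs with their returns
    playRound: (Map[(S, A), Double], (S, A)) => List[((S, A), Double)],
    exploringStart: () => (S, A)
): Map[(S, A), Double] = {
  var q      = Map.empty[(S, A), Double]
  var counts = Map.empty[(S, A), Int]
  (1 to episodes).foreach { _ =>
    // go through a single round of the game from a random (state, action) start...
    val trajectory = playRound(q, exploringStart())
    // ...then walk back along the trajectory and update the action-value
    // estimates; the greedy policy derived from q changes after every play.
    trajectory.foreach { case (sa, ret) =>
      val n   = counts.getOrElse(sa, 0) + 1
      val old = q.getOrElse(sa, 0.0)
      counts += sa -> n
      q += sa -> (old + (ret - old) / n)
    }
  }
  q
}
```

The real figureFiveTwo works in terms of the library's Generator, Policy and ActionValueFn machinery; the sketch only shows the shape of the play-then-update cycle.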
- def importanceSampling(): Unit
- val limited: Generator[State[AgentView, Action, Double, Generator]]
- def limitedM[M[_]](state: M[Blackjack[M]])(implicit arg0: Functor[M]): M[State[AgentView, Action, Double, M]]
  Is this appreciably slower? This is going to be useful, in any case, when I'm working with the tests.
- def main(items: Array[String]): Unit
- def random[M[_]]: Policy[AgentView, Action, Double, Cat, M]
- implicit val rng: RNG
- val starter: Generator[Blackjack[Generator]]
- def stickHigh[S[_]](hitBelow: Int): Policy[AgentView, Action, Double, Id, S]
  Simple blackjack policy for the demos below.
- def stickHighCat[S[_]](hitBelow: Int): Policy[AgentView, Action, Double, Cat, S]
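As a usage sketch for the two constructors above (the ScalaRL imports are assumed to be in scope, and the threshold of 20 is picked purely for illustration):

```scala
// Hypothetical usage of stickHigh and stickHighCat; hitBelow = 20 is an
// assumed example threshold, not a value taken from the library.
val deterministic: Policy[AgentView, Action, Double, Id, Generator] =
  Chapter5.stickHigh[Generator](hitBelow = 20)

val stochastic: Policy[AgentView, Action, Double, Cat, Generator] =
  Chapter5.stickHighCat[Generator](hitBelow = 20)
```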
- val uniformStarts: Generator[Blackjack[Generator]]
- def updateFn[Obs, A, R, G, M[_]](g: M[State[Obs, A, R, M]], agg: MonoidAggregator[SARS[Obs, A, R, M], G, Option[G]], policyFn: (ActionValueFn[Obs, A, G]) ⇒ Policy[Obs, A, R, M, M])(implicit arg0: Monad[M]): Loop[M, ActionValueFn[Obs, A, G]]
  Obs, A, R and M have to line up with the state. G is the type used to walk back along the trajectory: with no decay, supply a Double; with decay, supply a DecayState. The aggregation itself then happens internally, in the value function.
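Since the result is a Loop[M, ActionValueFn[Obs, A, G]], i.e. just a function T ⇒ M[T], driving it means repeatedly flatMapping. A small generic helper along these lines would run it for a fixed number of sweeps (a sketch built only on cats' Monad, not a helper shipped with ScalaRL):

```scala
import cats.Monad
import cats.syntax.functor._

// Runs any Loop[M, T] (a function T => M[T]) for `steps` iterations,
// starting from `init`. Generic sketch, not part of the library.
def iterateLoop[M[_], T](loop: T => M[T])(init: T, steps: Int)(implicit M: Monad[M]): M[T] =
  M.tailRecM((init, steps)) { case (t, remaining) =>
    if (remaining <= 0) M.pure(Right(t))
    else loop(t).map(next => Left((next, remaining - 1)))
  }
```

With M = Generator and a loop produced by updateFn, something like iterateLoop(loop)(initialFn, n) would yield a Generator of the trained ActionValueFn; the initial value function and iteration count are, again, assumptions for the example.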
ScalaRL
This is the API documentation for the ScalaRL functional reinforcement learning library.
Further documentation for ScalaRL can be found at the documentation site.
Check out the ScalaRL package list for all the goods.