object Chapter2
# Introduction to Chapter 2
This chapter is about bandits: Markov processes that only ever see a single state. The trick is to make the agents that play these single-state problems general enough to run on the same machinery that rolls full states forward.

What we need here are both the top and bottom graphs. The top graph shows the average reward across games at each step, so we march all of the games forward together and take the average reward per step.
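The "march them all forward and average per step" idea can be sketched without the library's machinery. This is a hypothetical, simplified version: `playGame` stands in for rolling one bandit forward, and `averagePerStep` produces the top graph's curve.

```scala
// Hypothetical sketch: run many independent bandit games, then average
// the reward observed at each time step across all games.
object AverageRewardSketch {
  val rng = new scala.util.Random(0)

  // One game: rewards are Gaussian noise around the arm's true mean.
  def playGame(trueMean: Double, steps: Int): List[Double] =
    List.fill(steps)(trueMean + rng.nextGaussian())

  // March all games forward together and average the reward per step.
  def averagePerStep(games: List[List[Double]]): List[Double] =
    games.transpose.map(step => step.sum / step.size)
}
```

`transpose` regroups the per-game reward lists into per-step lists, which is exactly the "march them all forward" shape.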
- Source
- Chapter2.scala
Linear Supertypes
- AnyRef
- Any
Value Members
- def average(s: Iterable[Double]): Double
- implicit val evaluator: Numeric[Real]
- def main(items: Array[String]): Unit
- def nArmedTestbed(nArms: Int, meanMean: Double, stdDev: Double): Generator[State[Unit, Arm, Double, Generator]]
  Generates the n-armed testbed.
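A minimal sketch of what an n-armed testbed looks like, without the library's `Generator`/`State` machinery. The assumption (suggested by the `meanMean`/`stdDev` parameter names, not confirmed by this page) is that each arm's true mean is drawn once from a normal distribution, and pulling an arm samples around that fixed mean.

```scala
// Simplified, hypothetical model of an n-armed testbed: the true mean of
// each arm is drawn once up front; pulls add unit Gaussian noise to it.
final case class Testbed(armMeans: Vector[Double], rng: scala.util.Random) {
  def pull(arm: Int): Double = armMeans(arm) + rng.nextGaussian()
}

object Testbed {
  def nArmed(nArms: Int, meanMean: Double, stdDev: Double, seed: Long): Testbed = {
    val rng = new scala.util.Random(seed)
    // Each arm's true mean ~ Normal(meanMean, stdDev).
    val means = Vector.fill(nArms)(meanMean + stdDev * rng.nextGaussian())
    Testbed(means, rng)
  }
}
```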
- def nonStationaryTestbed(nArms: Int, mean: Double, stdDev: Double): Generator[State[Unit, Arm, Double, Generator]]
  Generates a non-stationary distribution.
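One common way a bandit becomes non-stationary is for the arms' true means to take a random walk over time, so the best arm can change as you play. This sketch assumes that behavior; the `walkStdDev` parameter and the mutable state are illustrative inventions, not part of the library's API.

```scala
// Hypothetical non-stationary bandit: all arm means drift by a small
// Gaussian random walk after every pull.
final class NonStationaryBandit(
    nArms: Int, mean: Double, stdDev: Double, walkStdDev: Double, seed: Long) {
  private val rng = new scala.util.Random(seed)
  private var means = Array.fill(nArms)(mean + stdDev * rng.nextGaussian())

  def pull(arm: Int): Double = {
    val reward = means(arm) + rng.nextGaussian()
    // Nudge every arm's true mean, so the optimal arm can change.
    means = means.map(_ + walkStdDev * rng.nextGaussian())
    reward
  }
}
```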
- def play(policy: Policy[Unit, Arm, Double, Cat, Generator]): List[Double]
- def playBandit[Obs, A, R](policy: Policy[Obs, A, R, Generator, Generator], stateGen: Generator[State[Obs, A, R, Generator]], nRuns: Int, timeSteps: Int)(reduce: (List[SARS[Obs, A, R, Generator]]) ⇒ R): (List[Moment[Obs, A, R, Generator]], List[R])
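Reading the signature, `playBandit` appears to run a policy against `nRuns` freshly sampled bandit states for `timeSteps` steps each, then apply `reduce` to each run's trajectory. A stripped-down sketch of that loop, with plain functions standing in for the library's `Policy`, `Generator`, and `State` types (all names here are hypothetical):

```scala
// Hypothetical skeleton of the playBandit loop: for each run, sample a
// fresh game, play it for timeSteps steps, and reduce the trajectory.
object PlayBanditSketch {
  def playBandit[R](
      freshGame: () => Int => R,  // sample a new bandit; pulling an arm yields R
      chooseArm: () => Int,       // stand-in for the policy's action choice
      nRuns: Int,
      timeSteps: Int)(reduce: List[R] => R): List[R] =
    List.fill(nRuns) {
      val pull = freshGame()      // fresh state per run
      reduce(List.fill(timeSteps)(pull(chooseArm())))
    }
}
```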
- implicit val rng: RNG
  This is needed to actually call get on anything.
ScalaRL
This is the API documentation for the ScalaRL functional reinforcement learning library.
Further documentation for ScalaRL can be found at the documentation site.
Check out the ScalaRL package list for all the goods.