package bandit
Type Members
- case class Gradient[Obs, A, R, T, S[_]](config: Config[R, T], valueFn: ActionValueFn[Obs, A, Item[T]])(implicit evidence$1: Equiv[A], evidence$2: ToDouble[R], evidence$3: ToDouble[T]) extends Policy[Obs, A, R, Cat, S] with Product with Serializable
This policy needs to track its average reward internally. Then, if the gradient baseline is set, it uses that average to generate the notes.
T is the "average" type.
- case class Greedy[Obs, A, R, T, S[_]](config: Config[R, T], valueFn: ActionValueFn[Obs, A, T])(implicit evidence$1: Ordering[T]) extends Policy[Obs, A, R, Cat, S] with Product with Serializable
- case class UCB[Obs, A, R, T, S[_]](config: Config[R, T], valueFn: ActionValueFn[Obs, A, Choice[T]], time: Time) extends Policy[Obs, A, R, Cat, S] with Product with Serializable
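The description of Gradient above mentions an internally tracked average reward that serves as an optional baseline. As a rough illustration of that idea, here is a standalone sketch of one gradient-bandit preference update. It does not use the ScalaRL API: `GradientBanditSketch`, `State`, `softmax`, `update`, `stepSize` and `useBaseline` are all hypothetical names, and using the average of earlier rewards (excluding the current one) as the baseline is an assumption of this sketch.

```scala
// Standalone sketch; none of these names come from ScalaRL.
object GradientBanditSketch {
  // Per-action preferences plus a running average reward (the baseline).
  final case class State(prefs: Vector[Double], avgReward: Double, steps: Int)

  // Softmax over preferences gives the action probabilities.
  def softmax(prefs: Vector[Double]): Vector[Double] = {
    val max = prefs.max // subtract the max for numerical stability
    val exps = prefs.map(p => math.exp(p - max))
    val total = exps.sum
    exps.map(_ / total)
  }

  // One update after taking `action` and observing `reward`.
  def update(
      s: State,
      action: Int,
      reward: Double,
      stepSize: Double,
      useBaseline: Boolean
  ): State = {
    val probs = softmax(s.prefs)
    // Assumption: the baseline is the average of rewards seen before this step.
    val baseline = if (useBaseline) s.avgReward else 0.0
    val newPrefs = s.prefs.indices.toVector.map { a =>
      val indicator = if (a == action) 1.0 else 0.0
      s.prefs(a) + stepSize * (reward - baseline) * (indicator - probs(a))
    }
    // Incremental mean: fold the new reward into the running average.
    val steps = s.steps + 1
    val avg = s.avgReward + (reward - s.avgReward) / steps
    State(newPrefs, avg, steps)
  }
}
```

With the baseline turned off, the update compares the reward against zero instead of against the running average. The average is still tracked in that case, so the baseline can be switched on later without losing history.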
ScalaRL
This is the API documentation for the ScalaRL functional reinforcement learning library.
Further documentation for ScalaRL can be found at the documentation site.
Check out the ScalaRL package list for all the goods.