trait Policy[Obs, A, R, M[_], S[_]] extends AnyRef
This is how agents actually choose what comes next. This is a stochastic policy. We have to be able to match it up with a state that has the same monadic return type, but for now that type is hardcoded.
Type parameters:
- Obs - the observation offered by this state
- A - the action type
- R - the reward
- M - the monadic type offered by the policy
- S - the monad for the state
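To make the shape of these type parameters concrete, here is a minimal, self-contained sketch that does not use ScalaRL itself: a stripped-down trait with the same five parameters and a trivial deterministic instance. The choose method and the SketchPolicy, Constant, and SketchUsage names are illustrative assumptions, not part of the API documented on this page; only cats.Id is a real library type.

```scala
import cats.Id

// Illustration only: a stand-in trait with the same five type parameters.
// This is NOT the ScalaRL Policy trait; `choose` is an assumption made
// purely for this sketch.
trait SketchPolicy[Obs, A, R, M[_], S[_]] {
  // Given an observation, return an action wrapped in the policy's monad M.
  def choose(obs: Obs): M[A]
}

// Deterministic example: with M = Id, choosing returns a bare action.
// A stochastic policy would pick a distribution monad for M instead, so
// that `choose` yields a distribution over actions. S is the monad the
// state lives in; this toy policy never touches it.
final case class Constant[Obs, A, R, S[_]](action: A)
    extends SketchPolicy[Obs, A, R, Id, S] {
  def choose(obs: Obs): Id[A] = action
}

object SketchUsage {
  // Always picks "left", whatever the Int observation happens to be.
  val p = Constant[Int, String, Double, Option]("left")
  val chosen: String = p.choose(42) // Id[String] is just String
}
```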
- Self Type: Policy[Obs, A, R, M, S]
- Source: Policy.scala
Concrete Value Members
- def contramapObservation[P](f: (P) ⇒ Obs)(implicit S: Functor[S]): Policy[P, A, R, M, S]
- def contramapReward[T](f: (T) ⇒ R)(implicit S: Functor[S]): Policy[Obs, A, T, M, S]
- def learn(sars: SARS[Obs, A, R, S]): This
- def mapK[N[_]](f: FunctionK[M, N]): Policy[Obs, A, R, N, S]
Just an idea to see if I can make stochastic deciders out of deterministic deciders. We'll see how this develops.
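These combinators compose naturally. The sketch below is a rough illustration, assuming the Policy trait documented above is in scope along with cats; the adapt helper, its enclosing object, and the inputs base, f, g, and nat are all hypothetical names introduced here, while the combinator signatures themselves come from the member list above.

```scala
import cats.Functor
import cats.arrow.FunctionK

object PolicyAdapters {
  // Hypothetical helper: widen an existing policy to a new observation
  // type P, a new reward type T, and a new effect type N, using only the
  // combinators listed above.
  def adapt[P, T, Obs, A, R, M[_], N[_], S[_]: Functor](
      base: Policy[Obs, A, R, M, S]
  )(f: P => Obs, g: T => R, nat: FunctionK[M, N]): Policy[P, A, T, N, S] =
    base
      .contramapObservation(f) // Policy[P, A, R, M, S]
      .contramapReward(g)      // Policy[P, A, T, M, S]
      .mapK(nat)               // Policy[P, A, T, N, S]
}
```

learn is the odd one out: rather than reshaping the policy's types, it folds a single SARS transition (presumably a state, action, reward, next-state record) into the policy and returns This, an updated policy of the same shape.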
ScalaRL
This is the API documentation for the ScalaRL functional reinforcement learning library.
Further documentation for ScalaRL can be found at the documentation site.
Check out the ScalaRL package list for all the goods.