In Reinforcement Learning (“RL”), we use the following terminology:
- *States* $s \in S$ that the agent can be in. If all states are observed, we also know $S$, the set of all possible states. ^045824
- *Actions* $a \in A$ that the agent can take to get from $s \mapsto s^\prime$. ^d84ec0
- *Transitions* $T(s,a,s^\prime)$ give the probability of going from $s \mapsto s^\prime$ by taking action $a$. They can be expressed as a conditional probability: ^c23f52
$$ T(s,a,s^\prime) = \mathbf P(s^\prime \mid s,a) \qquad \sum_{s^\prime \in S} T(s,a,s^\prime) = 1 \quad \text{for all } s \in S,\ a \in A $$
- *Rewards* $R(s_t)$ depend on the state; the agent receives them for desired behavior. (A reward can also be defined on a state-action pair.) ^01e643
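
To make the four terms concrete, here is a minimal sketch of a tiny tabular model in plain Python. Everything in it is an illustrative assumption rather than part of the definitions above: the state names (`cool`, `hot`), the action names (`slow`, `fast`), the probabilities, the reward values, and the helper functions `is_valid_transition_model` and `expected_next_reward`. The check relies on the fact that, for a fixed state-action pair, the transition probabilities over successor states form a distribution and therefore sum to 1.

```python
import math

# Hypothetical example: two states and two actions (names are made up).
states = ["cool", "hot"]
actions = ["slow", "fast"]

# T[(s, a)][s_next] = P(s_next | s, a); the numbers are illustrative only.
T = {
    ("cool", "slow"): {"cool": 1.0, "hot": 0.0},
    ("cool", "fast"): {"cool": 0.5, "hot": 0.5},
    ("hot", "slow"): {"cool": 0.5, "hot": 0.5},
    ("hot", "fast"): {"cool": 0.0, "hot": 1.0},
}

# State-based reward R(s); values are illustrative only.
R = {"cool": 1.0, "hot": -1.0}


def is_valid_transition_model(T, states, actions):
    """Return True if every (s, a) pair has a proper distribution over s_next."""
    for s in states:
        for a in actions:
            row = T[(s, a)]
            # Probabilities must be non-negative.
            if any(p < 0 for p in row.values()):
                return False
            # Probabilities over successor states must sum to 1.
            total = sum(row.get(s_next, 0.0) for s_next in states)
            if not math.isclose(total, 1.0):
                return False
    return True


def expected_next_reward(T, R, s, a):
    """Expected reward of the successor state: sum over s_next of T(s, a, s_next) * R(s_next)."""
    return sum(p * R[s_next] for s_next, p in T[(s, a)].items())


if __name__ == "__main__":
    print(is_valid_transition_model(T, states, actions))
    print(expected_next_reward(T, R, "cool", "fast"))
```

A dictionary keyed by `(s, a)` keeps the sketch readable; for larger state spaces, a three-dimensional array indexed as `[s, a, s_next]` would be the more common choice.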