Human drivers¶
- class routerl.environment.HumanAgent(id, start_time, origin, destination, params, initial_knowledge)[source]
Class representing human drivers, responsible for modeling their learning process and decision-making in route selection.
- Parameters:
id (int) – The id of the agent.
start_time (float) – The start time of the agent.
origin (float) – The origin of the agent.
destination (float) – The destination of the agent.
params (dict) – The parameters for the learning model of the agent, as specified here.
initial_knowledge (float) – The initial knowledge of the agent.
- act(observation) int [source]
Returns the agent’s action (route of choice) based on the current observation from the environment.
- Parameters:
observation (list) – The observation of the agent.
- Returns:
int – The action of the agent.
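How an observation maps to a route choice depends on the learning model configured through params. Purely as an illustration — not RouteRL's actual implementation — a logit-style choice over learned per-route cost estimates could be sketched as follows (the function name, beta, and the cost-estimate list are all assumptions):

```python
import math
import random

def choose_route(cost_estimates, beta=-1.0, rng=random):
    """Hypothetical sketch: sample a route index from softmax(beta * cost).

    cost_estimates is one learned travel-time estimate per route; with a
    negative beta, cheaper routes are chosen more often. The real
    HumanAgent's decision rule is determined by its params.
    """
    utilities = [beta * c for c in cost_estimates]
    m = max(utilities)  # subtract the max for a numerically stable softmax
    weights = [math.exp(u - m) for u in utilities]
    total = sum(weights)
    probs = [w / total for w in weights]
    return rng.choices(range(len(cost_estimates)), weights=probs, k=1)[0]
```

With beta < 0 the choice is mostly greedy on estimated cost but keeps some exploration, which is the qualitative behaviour act is expected to produce.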
- get_reward(observation: list[dict]) float [source]
Calculates the reward of the individual agent.
- Parameters:
observation (list[dict]) – The observation of the agent.
- Returns:
float – The agent's own travel time.
- get_state(_) None [source]
Placeholder for the agent's state. Human agents do not maintain a state representation, so this method always returns None.
- Parameters:
_ (Any) – Ignored.
- Returns:
None
- property last_reward: float
The last reward received by the agent.
- Returns:
float – The last reward of the agent.
- learn(action, observation) None [source]
Updates the agent’s knowledge based on the action taken and the resulting observations.
- Parameters:
action (int) – The action of the agent.
observation (list[dict]) – The observation of the agent.
- Returns:
None
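A minimal sketch of what learn might do with the observed outcome — folding the newly experienced travel time for the chosen route into a running per-route cost estimate via exponential smoothing. The smoothing weight alpha and the estimate list are assumptions standing in for whatever the agent's params actually configure:

```python
def learn(cost_estimates, action, observed_travel_time, alpha=0.2):
    """Hypothetical sketch of the knowledge update, not RouteRL's code.

    Updates the chosen route's cost estimate in place, weighting the new
    observation by alpha and the prior estimate by (1 - alpha).
    """
    old = cost_estimates[action]
    cost_estimates[action] = (1 - alpha) * old + alpha * observed_travel_time
```

Only the chosen route's estimate changes, matching the signature above: the update uses the action taken and the resulting observation, and returns None.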