Human drivers¶
- class routerl.environment.HumanAgent(id, start_time, origin, destination, params, initial_knowledge)[source]
Class representing human drivers, responsible for modeling their learning process and decision-making in route selection.
- Parameters:
id (int) – The id of the agent.
start_time (float) – The start time of the agent.
origin (float) – The origin of the agent.
destination (float) – The destination of the agent.
params (dict) – The parameters for the learning model of the agent, as specified here.
initial_knowledge (float) – The initial knowledge of the agent.
- act(observation) int [source]
Returns the agent’s action (route of choice) based on the current observation from the environment.
- Parameters:
observation (list) – The observation of the agent.
- Returns:
int – The action of the agent.
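How an observation maps to a route choice depends on the learning model configured through params. Purely as an illustration — not RouteRL's actual implementation — a logit-style choice over learned per-route cost estimates could be sketched as follows (the function name, beta, and the cost-estimate list are all assumptions):

```python
import math
import random

def choose_route(cost_estimates, beta=-1.0, rng=random):
    """Hypothetical sketch: sample a route index from softmax(beta * cost).

    cost_estimates is one learned travel-time estimate per route; with a
    negative beta, cheaper routes are chosen more often. The real
    HumanAgent's decision rule is determined by its params.
    """
    utilities = [beta * c for c in cost_estimates]
    m = max(utilities)  # subtract the max for a numerically stable softmax
    weights = [math.exp(u - m) for u in utilities]
    total = sum(weights)
    probs = [w / total for w in weights]
    return rng.choices(range(len(cost_estimates)), weights=probs, k=1)[0]
```

With beta < 0 the choice is mostly greedy on estimated cost but keeps some exploration, which is the qualitative behaviour act is expected to produce.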
- get_reward(observation: list[dict]) float [source]
Calculates the reward of the individual agent.
- Parameters:
observation (list[dict]) – The observation of the agent.
- Returns:
float – The agent's own travel time.
- get_state(_) None [source]
Placeholder for the agent's state. Human agents do not maintain a state representation, so this method always returns None.
- Parameters:
_ (Any) – Ignored.
- Returns:
None
- property last_reward: float
The last reward received by the agent.
- Returns:
float – The last reward of the agent.
- learn(action, observation) None [source]
Updates the agent’s knowledge based on the action taken and the resulting observations.
- Parameters:
action (int) – The action of the agent.
observation (list[dict]) – The observation of the agent.
- Returns:
None
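A minimal sketch of what learn might do with the observed outcome — folding the newly experienced travel time for the chosen route into a running per-route cost estimate via exponential smoothing. The smoothing weight alpha and the estimate list are assumptions standing in for whatever the agent's params actually configure:

```python
def learn(cost_estimates, action, observed_travel_time, alpha=0.2):
    """Hypothetical sketch of the knowledge update, not RouteRL's code.

    Updates the chosen route's cost estimate in place, weighting the new
    observation by alpha and the prior estimate by (1 - alpha).
    """
    old = cost_estimates[action]
    cost_estimates[action] = (1 - alpha) * old + alpha * observed_travel_time
```

Only the chosen route's estimate changes, matching the signature above: the update uses the action taken and the resulting observation, and returns None.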