Human drivers

class routerl.environment.HumanAgent(id, start_time, origin, destination, params, initial_knowledge)[source]

Class representing human drivers, modeling their learning process and decision-making in route selection.

Parameters:
  • id (int) – The id of the agent.

  • start_time (float) – The start time of the agent.

  • origin (float) – The origin of the agent.

  • destination (float) – The destination value of the agent.

  • params (dict) – The parameters of the agent's learning model, as specified in the parameter documentation.

  • initial_knowledge (float) – The initial knowledge of the agent.
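The constructor signature above can be illustrated with a minimal, self-contained stand-in. This is a sketch only: the parameter keys (`learning_rate`, `n_routes`), the `travel_time` observation key, and the internal cost-estimate representation are assumptions for illustration, not the actual RouteRL implementation.

```python
class SketchHumanAgent:
    """Minimal stand-in mirroring the documented HumanAgent interface."""

    def __init__(self, id, start_time, origin, destination, params, initial_knowledge):
        self.id = id
        self.start_time = start_time
        self.origin = origin
        self.destination = destination
        # Hypothetical parameter keys; the real keys are defined in the
        # RouteRL parameter documentation.
        self.alpha = params.get("learning_rate", 0.2)
        self.n_routes = params.get("n_routes", 2)
        # One cost estimate per candidate route, seeded from initial knowledge.
        self.cost_estimates = [float(initial_knowledge)] * self.n_routes
        self._last_reward = 0.0

    def act(self, observation):
        # Pick the route currently believed to be cheapest.
        return min(range(self.n_routes), key=lambda r: self.cost_estimates[r])

    def get_reward(self, observation):
        # Reward is the agent's own travel time ("travel_time" is a
        # hypothetical key used here for illustration).
        self._last_reward = observation[self.id]["travel_time"]
        return self._last_reward

    def learn(self, action, observation):
        # Smooth the chosen route's cost estimate toward the observed time.
        t = self.get_reward(observation)
        self.cost_estimates[action] += self.alpha * (t - self.cost_estimates[action])

    @property
    def last_reward(self):
        return self._last_reward
```

A day-to-day simulation would repeatedly call `act`, run the traffic model, then call `learn` with the resulting observations.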

act(observation) int[source]

Returns the agent’s action (route of choice) based on the current observation from the environment.

Parameters:

observation (list) – The observation of the agent.

Returns:

int – The action of the agent.
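The decision rule behind `act` is not specified here. A common model for human route choice is a logit rule, where routes with lower estimated cost are chosen with higher probability; the sketch below assumes that model, with a hypothetical sensitivity parameter `beta`.

```python
import math
import random

def logit_route_choice(cost_estimates, beta=0.5, rng=None):
    """Return a route index; cheaper estimated routes are chosen more often."""
    rng = rng or random.Random()
    # Softmax over negative costs: weight each route by exp(-beta * cost).
    weights = [math.exp(-beta * c) for c in cost_estimates]
    r = rng.random() * sum(weights)
    for i, w in enumerate(weights):
        r -= w
        if r <= 0.0:
            return i
    return len(weights) - 1
```

With `beta = 0`, choices are uniform; larger `beta` makes the agent increasingly greedy toward the cheapest route.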

get_reward(observation: list[dict]) float[source]

Calculates the reward of the agent.

Parameters:

observation (list[dict]) – The observation of the agent.

Returns:

float – The agent's own travel time.

get_state(_) None[source]

Sets the current state of the agent.

Parameters:

_ (Any) – The current state of the agent.

Returns:

None

property last_reward: float

Returns the last reward of the agent.

Returns:

float – The last reward of the agent.

learn(action, observation) None[source]

Updates the agent’s knowledge based on the action taken and the resulting observations.

Parameters:
  • action (int) – The action of the agent.

  • observation (list[dict]) – The observation of the agent.

Returns:

None
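The form of the knowledge update performed by `learn` is not documented here. One common sketch for day-to-day route-choice learning is exponential smoothing of the stored cost estimate toward the experienced travel time; this is an assumed form for illustration, not necessarily RouteRL's update rule.

```python
def smoothed_cost_update(estimate, observed_travel_time, alpha=0.2):
    # Move the stored per-route cost estimate a fraction `alpha`
    # toward the newly observed travel time.
    return estimate + alpha * (observed_travel_time - estimate)
```

Here `alpha` close to 1 makes the agent react strongly to the latest trip, while `alpha` close to 0 makes its beliefs change slowly.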