Agent endpoint

The agent exposes a REST service that the controller consumes to execute Keywords. The endpoint of this service is called the agent endpoint. It is configured automatically at agent startup and communicated to the controller at registration. If required, the endpoint can be adjusted as follows:

Agent port

By default, the agent lets the system find a free port to listen on at startup. If required, you can set the agent port explicitly using the parameter agentPort. The agent service then runs on that specific port.

For example:

# Configure the agent to listen on port 8080
agentPort: 8080

Agent host

By default, the hostname of the agent endpoint is determined automatically at agent startup. If required, you can set the agent hostname explicitly using the parameter agentHost.

For example:

# Configure the agent to be called by the controller using the hostname agent-host.mynetwork.net
agentHost: agent-host.mynetwork.net

Agent URL

By default, the URL of the endpoint that is communicated to the controller is built as follows: http://<agentHost>:<agentPort>. If required (for instance, if your agent is behind a proxy), you can set the URL of the agent endpoint explicitly using the parameter agentUrl.

For example:

agentUrl: https://myAgentHost:7000
agentUrl: http://192.168.10.50:8888
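
As an illustration, agentPort and agentUrl can be combined when the agent sits behind a reverse proxy. The following is only a sketch under stated assumptions: the proxy address agents.mynetwork.net:8443 is hypothetical, and it assumes that agentUrl only changes the address announced to the controller while the agent keeps listening locally on agentPort.

```yaml
# Port the agent service listens on locally (behind the proxy)
agentPort: 8080
# Address announced to the controller at registration
# (hypothetical proxy address, replace with your own)
agentUrl: https://agents.mynetwork.net:8443
```

In such a setup, the proxy would be expected to forward incoming requests to port 8080 on the agent machine.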

SSL

By default, the agent exposes its service as plain HTTP. If required, you can enable SSL for the agent service using the parameter ssl.

Before enabling SSL for the agent service, you will need a valid SSL certificate for your agent endpoint. You can either use a self-signed certificate or obtain one from a certificate authority (CA).

In both cases, you will get the following files:

the private key file (.key)
the certificate file (.cert)

The agent requires the certificate in a Java KeyStore in JKS format. To generate the JKS KeyStore based on your .key and .cert files, follow the steps described here.

As soon as you have your certificate in JKS format as a .jks file, you can enable SSL for the agent service using the following parameters:

ssl: true
keyStorePath: /path/to/cert.jks
keyStorePassword: '<password>' # The password for the key store, as chosen at key store generation
keyManagerPassword: '<password>' # The password for the specific key within the key store, as chosen at key store generation

Agent tokens

As described here in more detail, Keywords are executed on so-called agent tokens. One agent can emit multiple groups of agent tokens. The token groups are configured using the parameter tokenGroups. Each token group has its own number of tokens (capacity) and its own attributes. The following paragraphs explain the configuration of token groups.

Capacity

The number of agent tokens emitted by a token group is defined by the parameter capacity. This parameter defines how many tokens are made available to the controller for Keyword execution, and thus the maximum number of parallel Keyword executions allowed on the agent for that group. Set this value according to the resources of your agent.

For example:

# Defines 1 token group with 200 tokens
tokenGroups:
- capacity: 200
  tokenConf:
    attributes:
    properties:

Attributes

As described here, specific agent tokens can be selected for execution using so-called agent attributes. The agent attributes are defined as key-value pairs in the parameter tokenConf.attributes.

For example, we could define a pool of “Windows” agents by setting an “OS” attribute as shown below:

tokenGroups:
- capacity: 1
  tokenConf:
    attributes:
      OS: Windows

In this example, the OS attribute can be used to route the workload to the specific machines hosting this agent, which lets you choose where test plans run (see this link).
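
Since one agent can emit several token groups, capacity and attributes can be set per group. The following sketch declares two groups on the same agent. The attribute name type and its values are purely illustrative and not predefined by the agent, and the capacities are arbitrary.

```yaml
tokenGroups:
# Group 1: up to 10 parallel executions for lightweight Keywords
- capacity: 10
  tokenConf:
    attributes:
      OS: Windows
      type: api        # hypothetical attribute
# Group 2: a single token reserved for resource-intensive Keywords
- capacity: 1
  tokenConf:
    attributes:
      OS: Windows
      type: browser    # hypothetical attribute
```

With such a configuration, selecting tokens on the type attribute would limit the heavier Keywords to one execution at a time on this agent, while the lighter ones can run in parallel.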

Configuration placeholders

Placeholders can be used in AgentConf.yaml and AgentConf.json to define dynamic values that have to be set at startup outside the configuration file.

For instance, you might want to set the gridHost at startup. You could achieve this using a placeholder like this in your AgentConf.yaml:

gridHost: ${myGridHost}

and set the variable myGridHost in the startup script startAgent.bat(sh|command):

"%JAVA_PATH%java.exe" %JAVA_OPTS% -cp "..\lib\*;" step.controller.ControllerServer -config=../conf/step.properties -myGridHost=http://mygridhost.net:8081

Agent cleanup configuration

Two caching mechanisms with automatic cleanup are provided out of the box. While they are configured separately, it is important to understand that files stored in the file manager cannot be cleaned up as long as they are used by the execution context cache. The default configuration is meant for optimal performance. Reducing the time to live can decrease the execution throughput and increase the resource consumption (network, CPU and memory) on both the agent and the controller.

File Manager cleanup configuration

When the controller needs to transfer files to an agent for the execution of Keywords, these files are cached locally on the agent to improve performance. This cache is cleaned up automatically, but the following properties can be customized in the agent configuration to adjust this behaviour.

fileManagerConfiguration:
  # enable or disable the cleanup (true by default)
  enableCleanup: true
  # define the files that are eligible for cleanup based on their last access time (default 1440 minutes).
  # When set to 0, files are cleaned up as soon as they're not used anymore without waiting for the clean up job (also see execution context cache cleanup)
  cleanupTimeToLiveMinutes: 1440
  # define the frequency of the cleanup job in minutes (default 60)
  cleanupFrequencyMinutes: 60

Execution context cache cleanup configuration

For the execution of Keywords on the Java agent, the Keyword packages (JAR files) must be loaded in the context of the agent process. For performance reasons, these packages are cached in the so-called execution context cache. This cache is cleaned up automatically, but the following properties can be customized in the agent configuration to adjust this behaviour.

executionContextCacheConfiguration:
  # enable or disable the cleanup (true by default)
  enableCleanup: true
  # define the files that are eligible for cleanup based on their last access time (default 1440 minutes).
  # When set to 0, files are cleaned up as soon as they're not used anymore without waiting for the clean up job (also see execution context cache cleanup)
  cleanupTimeToLiveMinutes: 1440
  # define the frequency of the cleanup job in minutes (default 60)
  cleanupFrequencyMinutes: 60
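
As an illustration, an agent with limited disk space could shorten both caches together. The values below are arbitrary and only a sketch. Because files in the file manager cannot be cleaned up while the execution context cache still uses them, the two time-to-live values are kept aligned here. As noted above, lower values can reduce throughput.

```yaml
fileManagerConfiguration:
  enableCleanup: true
  # files not accessed for 2 hours become eligible for cleanup
  cleanupTimeToLiveMinutes: 120
  # run the cleanup job every 15 minutes
  cleanupFrequencyMinutes: 15
executionContextCacheConfiguration:
  enableCleanup: true
  # same time to live as the file manager (illustrative choice)
  cleanupTimeToLiveMinutes: 120
  cleanupFrequencyMinutes: 15
```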

Agents kubernetes probes

If you are running Kubernetes or any other platform for managing containerized workloads and services, you could use:

the /running API endpoint as a liveness probe (the Agent process is running)
the /registered API endpoint as a readiness probe (the Agent has successfully registered to the Step GRID)

Below is an example of what your Kubernetes manifest could contain:

livenessProbe:
  httpGet:
    path: /running
    port: 12345
  initialDelaySeconds: 5
  periodSeconds: 5
readinessProbe:
  httpGet:
    path: /registered
    port: 12345
  initialDelaySeconds: 10
  periodSeconds: 5
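
The probes above target a fixed port (12345), whereas the agent picks a free port at startup by default. Assuming the /running and /registered endpoints are served on the agent port, the port would need to be pinned in the agent configuration so that it matches the manifest. A minimal sketch:

```yaml
# AgentConf.yaml: pin the agent port so it matches the probe port in the manifest
agentPort: 12345
```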

Exposing JVM metrics for prometheus endpoints

It is possible to expose the default JVM metrics using the Prometheus format by adding the below parameter to your Agent configuration file:

exposeMetrics: true

The metrics are then exposed under the /metrics path, on the same port the Agent is running on.
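
On the Prometheus side, a scrape job can then point at this path. The following fragment of a Prometheus configuration is a sketch only: the job name step-agent and the target agent-host.mynetwork.net:8080 are placeholders and assume an agent whose port has been set explicitly to 8080.

```yaml
scrape_configs:
  - job_name: step-agent             # hypothetical job name
    metrics_path: /metrics           # path exposed by the agent
    static_configs:
      - targets:
          - agent-host.mynetwork.net:8080   # <agentHost>:<agentPort> of your agent
```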

Forked Agent Mode (Java Agent Only)

Availability: Step 29+
Scope: Java Agent only (not supported by .NET or JavaScript agents)

Overview

By default, the Java Agent executes all Keywords (automation scripts) inside a single JVM, isolating sessions with per-session classloaders for maximum performance.
However, this in-process mode can lead to:

Potential JVM leaks (threads, static objects) if Keywords don’t clean up their resources properly,
Side effects from shared singletons or global state,
Framework incompatibilities where libraries expect the JVM to terminate (e.g., deleteOnExit or shutdown hooks).

Forked Agent Mode provides process-level isolation: each Keyword session runs in its own JVM (a forked agent) that starts on demand and shuts down at the end of the session.

Benefits:

Resources are fully reclaimed on JVM exit (leak containment),
Frameworks depending on JVM shutdown behave correctly,
Sessions are isolated from each other (no shared static state).

Trade-off: higher startup overhead per session compared to the default, classloader-based mode.

Configuration

Enabling the Forked Agent Mode

To enable the Forked Agent Mode, add the following to your AgentConf.yaml:

agentForkerConfiguration:
  enabled: true

When enabled, the agent spawns a dedicated JVM for each Keyword session and terminates it after completion. The default configuration should suit most common needs. For advanced configuration of the Forked Agent Mode, a comprehensive list of the configuration parameters is provided in the following section.

Configuration reference

All settings live under agentForkerConfiguration.

Parameter	Type	Default	Description
enabled	boolean	`false`	Enables the Forked Agent Mode.
javaPath	string	(inherits main agent Java)	Path to the Java executable used to start the forked JVM. If unset, the main agent’s Java is used.
vmArgs	string	(none)	Extra JVM arguments for the forked JVM (e.g. `-Xmx512m -Dfile.encoding=UTF-8`).
agentConf	string	(none)	Path to a custom configuration file for the forked agent. This file allows overriding default configuration values.
logbackConf	string	(none)	Path to a custom Logback configuration file to be used by the forked agent. Useful for adjusting logging behavior of the forked agent
startTimeoutMs	long	`20000`	Timeout in milliseconds to wait for the forked agent to fully start. If the agent doesn’t start within this time, startup is considered to have failed.
shutdownTimeoutMs	long	`10000`	Timeout in milliseconds to allow for a graceful shutdown of the forked agent. If the shutdown is not completed within this time, the agent is forcibly terminated.
callTimeoutOffsetMs	int	`1000`	When delegating calls to the forked agent, this offset in milliseconds is subtracted from the overall call timeout to ensure the forked agent has sufficient time to process and handle the timeout internally.
gridPort	int	`0`	The port number on which the embedded grid service will be started inside the main agent. Forked agents will join this grid. A value of 0 indicates that a random available port will be used.
agentPortRangeStart	int	`0`	The starting port of the range used by the forked agent to expose its services. This is the first port in the inclusive range the agent may bind to. By default, a random available port will be used.
agentPortRangeEnd	int	`0`	The ending port of the range used by the forked agent to expose its services. This is the last port in the inclusive range the agent may bind to. By default, a random available port will be used.
tempDirectoryDeletionRetryWait	long	`100`	When a forked agent is shut down, its temporary directory is deleted. If immediate deletion fails, the system retries up to five times. This value defines the wait time in milliseconds between each retry attempt.
workingDirectory	string	`"work/forked-agents"`	Working directory where the forked agent creates its execution directories.

Example

agentForkerConfiguration:
  enabled: true
  javaPath: /usr/lib/jvm/java-17/bin/java
  vmArgs: "-Xmx512m -Dfile.encoding=UTF-8"
  agentConf: /opt/step/conf/agent-forked.yaml
  logbackConf: /opt/step/conf/logback-forked.xml
  startTimeoutMs: 20000
  shutdownTimeoutMs: 10000
  callTimeoutOffsetMs: 1000
  gridPort: 0
  agentPortRangeStart: 50000
  agentPortRangeEnd: 50100
  tempDirectoryDeletionRetryWait: 100
  workingDirectory: work/forked-agents

Advanced configuration

Exposing agent services

In some advanced scenarios, users may need to control the agent from within a Keyword. To accommodate this, the Java agent can optionally expose internal agent control services to Keywords.

Supported operations

Currently, the only supported operation is the graceful shutdown of the agent. This allows controlled termination of the agent process from within a Keyword. Additional control operations may be introduced in future versions.

Note: This functionality is currently only available in the Java agent. The .NET and JavaScript agents do not support this capability at this time.

Enabling Agent Control Services

By default, the agent control services are not exposed to Keywords for safety and encapsulation reasons. To enable this feature, set the following property in the agent configuration file:

exposeAgentControlServices: true

Usage Example (Java + step-api 1.4+)

Once enabled, the agent control service can be accessed within a Java Keyword using the TokenSession:

tokenSession.get(AgentControlServices.class).shutdownAgent();

This will gracefully shut down the running agent instance. Version requirement: This feature requires step-api version 1.4 or newer.
