Agent Sizing Guide
Agents are responsible for executing automation (automated tests, load tests, etc.) through Keywords. Their resource requirements can vary significantly depending on the type of automation, the automation tools used, and the nature of the application under test. This guide provides practical recommendations to help you size agents effectively for reliable, stable, and performant execution.
For the general platform requirements and sizing recommendations, refer to the Requirements page.
Why Agent Sizing Matters
Properly sizing your agents is critical for ensuring predictable and efficient automation execution. Insufficiently provisioned agents may lead to:
- Slow executions and inaccurate performance measurements
- Unstable executions (e.g., timeouts, element resolution failures)
- Agent crashes due to system overload
Right-sizing your agents helps ensure that your automations run smoothly and predictably.
Key Factors Influencing Agent Sizing
Type of Automation
The nature of the automation significantly influences resource consumption.
- API Automation: API automation typically involves executing API clients (e.g., HTTP calls) and is considered lightweight, demanding minimal CPU and memory.
- UI Automation (Web, Mobile, Desktop Clients): UI automation is resource-intensive because it requires launching and managing full-fledged clients (e.g., browsers, mobile device emulators, fat clients). The agent’s sizing is directly influenced by the client’s runtime behavior.
Recommendation: Use minimal sizing for pure API tests; allocate more generous resources for UI tests, especially if dealing with complex or heavy client applications.
Application Under Test (AUT)
The characteristics of the AUT, particularly its client, impact how much CPU and memory an agent consumes during execution.
- Web Applications: Heavy JavaScript-based SPAs (Single Page Applications) require significantly more CPU than simple static websites.
- Mobile/Desktop Clients: These often demand additional system-level resources (GPU, memory, disk I/O) depending on their type, complexity, and rendering needs.
Note: Step’s UI automation sizing recommendations assume a typical web application. Heavier clients may require custom adjustments.
Automation Tooling
Even with the same AUT and automation type, the automation tool or library used can affect agent sizing.
- Web Automation: For instance, Cypress and Selenium, and the way they are integrated into Step, lead to different runtime behaviors, resulting in different CPU/memory footprints.
- API Automation: For instance, using Step’s native HTTP Keywords (which execute within the agent process) is less demanding than invoking third-party tools like k6 or Postman CLI, which require spawning external processes.
Tip: Favor native integrations when possible to keep agent footprint smaller and easier to manage.
Parallelism
Number of Agent Tokens
Step agents support concurrent execution of Keywords, driven by the number of configured tokens. Each agent may define one or more token pools, each with n tokens.
- A pool with 5 tokens can run up to 5 Keywords concurrently.
- This concurrency directly affects the required resources: more tokens = higher CPU/RAM needs.
Rule of Thumb: Multiply the resource needs for one token by the number of tokens to estimate the required capacity.
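As a rough illustration of this rule of thumb, here is a minimal sketch that scales a single-token baseline by the number of tokens. The per-token figures are placeholders, not measured or recommended values.

```python
# Rough estimate of agent capacity based on the rule of thumb above.
# The per-token baseline is an illustrative placeholder, not a measurement.

PER_TOKEN_BASELINE = {"cpu_millicores": 250, "memory_mib": 512}  # hypothetical figures

def estimate_agent_resources(tokens: int) -> dict:
    """Multiply the per-token baseline by the number of configured tokens."""
    return {resource: value * tokens for resource, value in PER_TOKEN_BASELINE.items()}

# A pool with 5 tokens needs roughly 5x the single-token baseline:
print(estimate_agent_resources(5))  # {'cpu_millicores': 1250, 'memory_mib': 2560}
```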
Internal Parallelism in Keywords
Some automation tools (e.g., k6, JMeter) support internal parallelism, meaning they can execute multiple virtual users (VUs) or threads concurrently within a single Keyword.
For example, if an agent has 3 tokens, and each Keyword triggers a scenario configured with 10 VUs, the effective concurrency becomes: 3 tokens × 10 VUs = 30 virtual users
This combined parallelism (token-level × tool-level) significantly increases the load on the agent. However, the actual resource impact depends heavily on how the tool implements internal parallelism.
Important: When sizing agents, you must account for both token-based and tool-based parallelism. In general, tool-level parallelization is more efficient, as it reduces the overhead of managing multiple tool instances.
Example
In a load testing scenario using Grafana k6, each instance of the k6 CLI introduces overhead.
- Running 10 tokens × 10 VUs = 100 VUs spread across 10 separate k6 processes
- Running 1 token × 100 VUs = 100 VUs in a single k6 process
Although the total number of VUs is the same, the first scenario generates more system load due to the overhead of spawning and running 10 separate k6 processes.
This distinction is crucial when designing efficient load tests and is explicitly reflected in the recommended sizings provided below.
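To make the trade-off concrete, the following sketch recomputes the figures from the example above: the effective number of VUs and the number of k6 processes the agent has to manage for each configuration.

```python
# Effective concurrency is token-level parallelism multiplied by tool-level
# parallelism. With k6, each token running a scenario spawns its own k6
# process, so the token count also determines the process overhead.

def effective_load(tokens: int, vus_per_keyword: int) -> dict:
    return {
        "total_vus": tokens * vus_per_keyword,
        "tool_processes": tokens,
    }

# Same total number of VUs, very different process overhead:
print(effective_load(10, 10))  # {'total_vus': 100, 'tool_processes': 10}
print(effective_load(1, 100))  # {'total_vus': 100, 'tool_processes': 1}
```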
Sizing Guide
To size your agents properly, it is critical to understand the key factors described in the previous section.
To proceed with sizing, we recommend the following approach:
Step 1: Define the Values for the Four Key Factors
Before selecting a recommendation, identify the characteristics of your use case. The following table will help you collect the necessary information:
Key factor | Your Value | Example Values |
---|---|---|
Type of automation | | UI or API |
Client type | | Standard web application, Heavy web application, Mobile, Desktop |
Automation tool | | Selenium, Playwright, Cypress, k6, JMeter, etc. |
Parallelism | | 1 to n |
Step 2: Use the Reference Recommendations
Once you’ve identified your parameters, use the table below to find the closest sizing recommendation. These are based on the official Requirements and represent validated defaults for the most common scenarios.
Type of automation | Client type | Automation tool | Parallelism | Memory | CPU | Note | Corresponding flavour |
---|---|---|---|---|---|---|---|
UI | Standard web application | Selenium, Playwright | 1 | 1800Mi | 1750m | | ui-standard |
UI | Standard web application | Cypress | 1 | 1800Mi | 1750m | | ui-standard |
API | N/A | Grafana k6 | 100 | 1800Mi | 1750m | Tested with: | ui-standard |
API | N/A | Native HTTP Keywords | 100 | 1800Mi | 1750m | | api-standard |
Disclaimer: These values serve as a starting point. Depending on your environment and automation specifics, further tuning may be needed.
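If you want to keep these defaults close to your provisioning scripts, one option is to encode the table as a small lookup, as in the sketch below. The dictionary layout and key names are illustrative; only the memory, CPU, and flavour values come from the table above.

```python
# Reference sizing defaults keyed by (automation type, tool). The structure is
# illustrative; the figures are taken from the recommendations table above.

REFERENCE_SIZINGS = {
    ("UI", "Selenium"):              {"memory": "1800Mi", "cpu": "1750m", "flavour": "ui-standard"},
    ("UI", "Playwright"):            {"memory": "1800Mi", "cpu": "1750m", "flavour": "ui-standard"},
    ("UI", "Cypress"):               {"memory": "1800Mi", "cpu": "1750m", "flavour": "ui-standard"},
    ("API", "Grafana k6"):           {"memory": "1800Mi", "cpu": "1750m", "flavour": "ui-standard"},
    ("API", "Native HTTP Keywords"): {"memory": "1800Mi", "cpu": "1750m", "flavour": "api-standard"},
}

def recommended_sizing(automation_type: str, tool: str) -> dict:
    """Return the closest validated default for a given use case."""
    return REFERENCE_SIZINGS[(automation_type, tool)]

print(recommended_sizing("API", "Native HTTP Keywords"))
```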
Step 3: Adapt the Sizing to Your Context
If your parameters differ from the examples above, adjust accordingly:
- Heavier Clients (e.g., fat desktop apps or mobile emulators): Increase memory and CPU beyond the default UI recommendation.
- Higher Parallelism: Multiply the CPU and memory values proportionally. E.g., if the baseline is for 1 token, multiply by 5 for 5 tokens.
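For the parallelism adjustment, here is a minimal sketch of the proportional scaling, assuming the single-token UI baseline from the reference table and simple Kubernetes-style quantity strings. The helper functions are hypothetical, not part of Step.

```python
# Scale a single-token baseline to N tokens. The helpers parse simple
# Kubernetes-style quantities ("1750m" CPU, "1800Mi" memory) and are
# illustrative only; the baseline matches the UI row of the table above.

def scale_cpu(cpu: str, factor: int) -> str:
    return f"{int(cpu.removesuffix('m')) * factor}m"

def scale_memory(memory: str, factor: int) -> str:
    return f"{int(memory.removesuffix('Mi')) * factor}Mi"

BASELINE = {"cpu": "1750m", "memory": "1800Mi"}  # 1 token, standard UI automation

tokens = 5
print(scale_cpu(BASELINE["cpu"], tokens))        # 8750m
print(scale_memory(BASELINE["memory"], tokens))  # 9000Mi
```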
Step 4: Validate the Sizing in Practice
To ensure your sizing is adequate and resilient, run a sequence of baseline and stress tests:
- Baseline Test (oversized agent): Run your scenario without parallelism for ~100 iterations to establish a stable reference.
- Baseline Test (sized agent): Re-run the same scenario on your actual planned agent sizing.
- Parallelism Test (target setup): Run your scenario with the target parallelism on the planned agent sizing (tokens + VUs).
Goal: All tests should complete successfully, with consistent performance and without crashes or throttling.
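As a rough way of judging "consistent performance" across these runs, you could compare the iteration timings collected from the oversized baseline and the sized-agent execution, as in the hypothetical helper below. The 20% tolerance is an arbitrary example, not a recommendation.

```python
# Hypothetical helper for comparing the baseline run (oversized agent) with
# the run on the planned sizing. Flags the sizing if the median iteration
# time degrades by more than an arbitrary tolerance (20% here).
from statistics import median

def sizing_looks_adequate(baseline_ms: list[float],
                          sized_ms: list[float],
                          max_slowdown: float = 1.2) -> bool:
    """True if the sized agent stays within the allowed slowdown factor."""
    return median(sized_ms) / median(baseline_ms) <= max_slowdown

# Made-up per-iteration timings in milliseconds:
print(sizing_looks_adequate([820, 790, 805], [840, 860, 815]))  # True
```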
How to Monitor Agent Resource Usage
Once your agents are running in a production environment, continuous monitoring is essential to ensure they remain healthy, performant, and stable over time.
What to Monitor
Focus on the following key system metrics for each agent:
- CPU Usage: Track average and peak CPU usage over time. Sustained high usage may indicate under-provisioning or excessive parallelism.
- CPU Throttling (Kubernetes environments): Monitor for CPU throttling to detect when the agent is being limited because it exceeds its CPU quota. This can lead to degraded performance or timeouts.
- Memory Usage: Observe memory consumption trends. Sudden spikes or consistently high usage may result in OOM (Out-of-Memory) terminations or instability during test execution.
Tip: Use Kubernetes-native tools (e.g., Prometheus, Grafana, Lens) or your cloud provider’s monitoring suite to visualize and track these metrics.
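For example, if your agents run on Kubernetes and are scraped by Prometheus, queries along the following lines can surface these three metrics. The Prometheus address, the pod label selector, and the choice of cAdvisor metric names are assumptions to adapt to your own setup.

```python
# Minimal sketch: pull agent CPU/memory metrics from the Prometheus HTTP API.
# The Prometheus URL and the pod label selector are assumptions; the metric
# names are the usual cAdvisor ones and may differ in your environment.
import requests

PROMETHEUS_URL = "http://prometheus.monitoring:9090"  # assumed in-cluster address
AGENT_PODS = 'pod=~"step-agent-.*"'                   # assumed agent pod naming

QUERIES = {
    "cpu_usage_cores": f"rate(container_cpu_usage_seconds_total{{{AGENT_PODS}}}[5m])",
    "cpu_throttled_periods": f"rate(container_cpu_cfs_throttled_periods_total{{{AGENT_PODS}}}[5m])",
    "memory_working_set_bytes": f"container_memory_working_set_bytes{{{AGENT_PODS}}}",
}

def instant_query(expr: str) -> list:
    """Run an instant PromQL query and return the raw result vector."""
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": expr}, timeout=10)
    resp.raise_for_status()
    return resp.json()["data"]["result"]

for name, expr in QUERIES.items():
    print(name, instant_query(expr))
```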
Configure Alerts
Set up alerts to proactively catch issues before they impact test runs. Suggested thresholds:
Metric | Suggested Alert Threshold |
---|---|
CPU usage | > 80% sustained usage over 5–10 min |
CPU throttling | Any throttling consistently observed |
Memory usage | > 80–90% of provisioned limit over time |
Note: Thresholds should be tuned based on your workloads and empirical performance data during validation and early production runs.
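A simple way to express the "sustained over a window" conditions above, for instance in a custom health check, is to require that every sample in the observation window exceed the threshold. The sketch below is purely illustrative.

```python
# Illustrative check for "sustained usage": fire only if every sample in the
# observation window (e.g., one sample per minute over 5-10 minutes) exceeds
# the threshold, expressed as a fraction of the provisioned limit.

def sustained_above(samples: list[float], threshold: float) -> bool:
    return bool(samples) and all(sample > threshold for sample in samples)

cpu_usage_ratio = [0.83, 0.86, 0.91, 0.88, 0.85]  # made-up samples, one per minute
print(sustained_above(cpu_usage_ratio, 0.80))     # True -> sustained > 80%
```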