Tools | Step Documentation

Reference for all tools exposed by the Step MCP Server. For setup and installation, see Getting Started.

All tools accept two optional shared parameters: profile (which Step instance and credentials to use) and project (which project to act on). The discovery tools below are the exception, because they don’t require a project. When either parameter is omitted, it falls back to the configured default (see Profile and project resolution). Every tool’s response includes a resolvedContext block reporting the profile, baseUrl, and project actually used.
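As a rough illustration of the shared parameters, the sketch below builds an MCP `tools/call` JSON-RPC request and adds profile and project only when they are given, so the configured defaults apply otherwise. The helper name `build_tool_call` and the example values `build-server` and `checkout-tests` are hypothetical; only the parameter and tool names come from this page.

```python
import json
from typing import Any, Optional


def build_tool_call(
    tool: str,
    arguments: Optional[dict[str, Any]] = None,
    profile: Optional[str] = None,
    project: Optional[str] = None,
    request_id: int = 1,
) -> dict[str, Any]:
    """Build an MCP `tools/call` request (hypothetical helper)."""
    args = dict(arguments or {})
    # Omitted shared parameters fall back to the configured defaults,
    # so they are only sent when explicitly provided.
    if profile is not None:
        args["profile"] = profile
    if project is not None:
        args["project"] = project
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": args},
    }


request = build_tool_call(
    "step_search_executions",
    {"status": "FAILED"},
    profile="build-server",    # hypothetical profile name
    project="checkout-tests",  # hypothetical project name
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library normally builds this envelope for you; the point here is only where profile and project sit among the tool arguments.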

Discovery

Read-only tools that let an agent discover valid profile and project names before acting, without reading the configuration file or exposing its secrets.

`step_list_profiles`

Lists the configured profiles with safe, non-secret metadata: name, base URL, default project, whether a credential is configured, its source (user or workspace file), and which profile is the default. API tokens are never returned. Useful to resolve a human reference (“the build server”) to a profile name before acting. Takes no parameters.

`step_list_projects`

Lists the projects available for a profile on Step Enterprise, so an agent can pick a valid project before calling a project-scoped tool. On the open-source edition it reports that project listing is not supported.

Key parameters:

profile (optional; falls back to the configured default profile when omitted)

Authoring

Helps an agent assemble a deployable Step Automation Package. The MCP server’s role is deliberately narrow: it provides the building blocks, while the agent produces the test content itself.

`step_initialize_project`

Scaffolds a local Step Automation Package from scratch. The output is shaped by two parameters:

keywordSource: prebuilt-library, custom-code, or imported-asset
useCase: functional-testing, load-testing, monitoring, or rpa

Every combination is valid. Returns a filesystemReceipt listing all created files. Use dryRun=true to preview without writing to disk. Chain with step_validate_plan after scaffolding.
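A minimal sketch of assembling the arguments for step_initialize_project is shown below. It checks the two enumerated parameters locally and defaults to a dry run. The function name `initialize_project_args` is hypothetical, and the tool may accept further parameters that this page does not list.

```python
from itertools import product

# Allowed values as documented above.
KEYWORD_SOURCES = ("prebuilt-library", "custom-code", "imported-asset")
USE_CASES = ("functional-testing", "load-testing", "monitoring", "rpa")


def initialize_project_args(
    keyword_source: str, use_case: str, dry_run: bool = True
) -> dict:
    """Build arguments for step_initialize_project (hypothetical helper)."""
    if keyword_source not in KEYWORD_SOURCES:
        raise ValueError(f"unknown keywordSource: {keyword_source!r}")
    if use_case not in USE_CASES:
        raise ValueError(f"unknown useCase: {use_case!r}")
    # dryRun=true previews the scaffold without writing to disk.
    return {
        "keywordSource": keyword_source,
        "useCase": use_case,
        "dryRun": dry_run,
    }


# Every keywordSource/useCase pairing is valid, giving 3 x 4 combinations.
all_combinations = list(product(KEYWORD_SOURCES, USE_CASES))
print(len(all_combinations))  # 12

preview_args = initialize_project_args("custom-code", "load-testing")
print(preview_args)
```

Defaulting `dry_run` to True is a design choice of this sketch, not the tool's documented default: it lets an agent inspect the planned files before anything is written.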

`step_validate_plan`

Validates a Step Automation Package at the requested depth. The depth parameter accepts parse, references, resolve, and dry-run, but only parse is currently implemented. Use depth='parse' (fastest) to check that automation-package.yaml exists, is valid YAML, and conforms to its declared schemaVersion.

Typically chained directly after step_initialize_project by passing its filesystemReceipt.root as packagePath.
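The chaining described above might look like the following sketch, which takes the scaffolding result and derives the arguments for step_validate_plan. The exact response shape is an assumption: the code only relies on a filesystemReceipt object with a root field, as named on this page. The function name and sample path are hypothetical.

```python
ALL_DEPTHS = ("parse", "references", "resolve", "dry-run")
IMPLEMENTED_DEPTHS = ("parse",)  # per this page, only parse is implemented


def validate_plan_args(init_result: dict, depth: str = "parse") -> dict:
    """Derive step_validate_plan arguments from a scaffolding result."""
    if depth not in ALL_DEPTHS:
        raise ValueError(f"unknown depth: {depth!r}")
    if depth not in IMPLEMENTED_DEPTHS:
        raise NotImplementedError(f"depth {depth!r} is not implemented yet")
    # Assumed response shape: {"filesystemReceipt": {"root": "<path>", ...}}
    root = init_result["filesystemReceipt"]["root"]
    return {"packagePath": root, "depth": depth}


# Hypothetical result of a previous step_initialize_project call.
sample_init_result = {"filesystemReceipt": {"root": "/tmp/step-demo-package"}}
validate_args = validate_plan_args(sample_init_result)
print(validate_args)
```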

Execution

Runs a Step Automation Package through the Step CLI centrally on a Step instance. The Step CLI must be installed and on your PATH.

Note: Packages that contain Node.js keywords must run centrally, because local execution is not supported for them.

Typical use cases:

Run an existing automation package and feed the results straight into Analysis
Re-run a package to confirm a fix or reproduce a failure

`step_execute_automation_package`

Executes a Step Automation Package through the Step CLI (step ap execute --async) and returns immediately with the execution ID(s). Follow up with step_search_executions or step_fetch_execution_overview to track results.

Key parameters:

packagePath (required, absolute path to the package folder or a built .tar.gz / .zip archive)
local (default false; run on the local agent instead of centrally on the Step server)
plans (optional; specific plan names, otherwise all plans run)
force (default false; pass --force to bypass CLI/server version-mismatch errors)
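The parameters above can be assembled as in the sketch below, which rejects relative paths before the tool is called. The helper name and the example path and plan name are hypothetical, and passing plans as a list of strings is an assumption about the parameter's type.

```python
import os
from typing import Optional, Sequence


def execute_package_args(
    package_path: str,
    plans: Optional[Sequence[str]] = None,
    local: bool = False,
    force: bool = False,
) -> dict:
    """Build arguments for step_execute_automation_package (hypothetical helper)."""
    # packagePath is documented as an absolute path to a folder or archive.
    if not os.path.isabs(package_path):
        raise ValueError("packagePath must be an absolute path")
    args = {"packagePath": package_path, "local": local, "force": force}
    # Leaving plans out runs every plan in the package.
    if plans:
        args["plans"] = list(plans)
    return args


execute_args = execute_package_args(
    "/opt/packages/checkout-tests.zip",  # hypothetical archive path
    plans=["Checkout smoke test"],       # hypothetical plan name
)
print(execute_args)
```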

Analysis

Read-only tools that let an agent navigate and diagnose Step execution results in a token-efficient way, from a high-level health overview down to individual failures, attachments, and performance metrics. The same tools work across functional tests, load tests, synthetic monitoring, and custom workflows.

Typical use cases:

Diagnose why an execution failed and summarise the root cause
Triage a large run by navigating only the failing paths
Inspect logs, stack traces, and attachments for a specific failure
Investigate performance regressions or resource exhaustion from execution metrics
Find past executions by status, time window, user, or parameters

A typical troubleshooting flow:

1. Find the execution with step_search_executions.
2. Assess overall health with step_fetch_execution_overview.
3. Page through individual failures with step_fetch_execution_report_nodes.
4. Pull raw evidence (logs, stack traces) with step_download_attachment_content.
5. Investigate performance with step_query_performance_metrics.

`step_search_executions`

Searches for executions in Step. All filters are optional and combined with AND logic; results are sorted by start time (descending). Useful as an entry point when you don’t already have an execution ID.

Key parameters:

description
status (RUNNING / PASSED / FAILED / TECHNICAL_ERROR)
user
startTimeFrom / startTimeTo (epoch ms)
executionParameters
offset, limit (max 200, default 25)
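Because the time filters expect epoch milliseconds, a small conversion helper is useful. The sketch below builds search arguments for failed executions in a time window and caps limit at the documented maximum. The helper names are hypothetical, and the fixed date is only there to make the example reproducible.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
MAX_LIMIT = 200  # documented maximum for limit


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def failed_in_window_args(
    window_end: datetime, hours: int = 24, limit: int = 25, offset: int = 0
) -> dict:
    """Build step_search_executions arguments (hypothetical helper)."""
    window_start = window_end - timedelta(hours=hours)
    # All filters are combined with AND logic by the tool.
    return {
        "status": "FAILED",
        "startTimeFrom": to_epoch_ms(window_start),
        "startTimeTo": to_epoch_ms(window_end),
        "offset": offset,
        "limit": min(limit, MAX_LIMIT),
    }


search_args = failed_in_window_args(
    datetime(2024, 1, 2, tzinfo=timezone.utc), limit=500
)
print(search_args)
```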

`step_fetch_execution_overview`

Assesses the overall health of an execution. Use this first.

Returns a token-optimized view of the Aggregated Tree, showing only the failing paths by default. Set filterStatus to ALL to include passing ones.

With includePreview enabled (default), it embeds a small sample of failed nodes and inlines small attachments to avoid extra round trips.

Key parameters:

executionId (required)
filterStatus (FAILED / ALL, default FAILED)
includePreview (default true)

`step_fetch_execution_report_nodes`

Pages through more instances of a specific failure by querying the raw execution tree. The artifactHash identifies a specific failing node type in the Aggregated Tree and is returned by step_fetch_execution_overview.

Key parameters:

executionId (required)
artifactHash (required)
limit (default 5)
offset (default 0)
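Paging with limit and offset could be wrapped in a generator, as sketched below. Several things here are assumptions rather than documented behaviour: that offset counts nodes (not pages), that the tool result can be treated as a list of nodes, and that a short page signals the end. The `call_tool` callable stands in for whatever MCP client is in use, and `fake_call_tool` is a stub with made-up data.

```python
from typing import Any, Callable, Iterator

ToolCaller = Callable[[str, dict], list]


def iter_report_nodes(
    call_tool: ToolCaller,
    execution_id: str,
    artifact_hash: str,
    limit: int = 5,
    max_nodes: int = 50,
) -> Iterator[Any]:
    """Yield report nodes page by page, stopping at the first short page."""
    offset = 0
    while offset < max_nodes:  # hard cap to keep token usage bounded
        page = call_tool(
            "step_fetch_execution_report_nodes",
            {
                "executionId": execution_id,
                "artifactHash": artifact_hash,
                "limit": limit,
                "offset": offset,
            },
        )
        yield from page
        if len(page) < limit:
            break
        offset += limit


def fake_call_tool(name: str, arguments: dict) -> list:
    """Stub server holding 12 made-up nodes, used only for this example."""
    nodes = [{"id": f"node-{i}"} for i in range(12)]
    start = arguments["offset"]
    return nodes[start:start + arguments["limit"]]


nodes = list(iter_report_nodes(fake_call_tool, "exec-123", "hash-abc"))
print(len(nodes))  # 12
```

The `max_nodes` cap reflects the token-efficiency goal of the analysis tools: an agent rarely needs every instance of the same failure.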

`step_download_attachment_content`

Downloads the contents of an attachment identified on a ReportNode. Small text files are returned inline. Larger text files and binary files (Playwright trace bundles, HAR files) are saved to disk and the file path is returned.

Key parameters:

attachmentId (required)
attachmentType (required)

`step_query_performance_metrics`

Pulls time-series metric aggregates or bucketed histogram data across the execution timeline. Essential for analysing performance regressions, load tests, and resource exhaustion. Start and end times default to the execution window.

Key parameters:

executionId (required)
metricType (default response-time; also histogram, gauge, counter)
numberOfBuckets (max 200, default 100; use 1 for a single aggregate across the whole execution)
percentiles (default [80, 90, 99])
startTime / endTime (epoch ms, optional)
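The sketch below shows two ways the parameters above might be combined: a bucketed timeline with the defaults, and a single aggregate across the whole execution using numberOfBuckets=1. The helper name and the execution ID are hypothetical. The lower bound of 1 for numberOfBuckets is an assumption of this sketch; the page documents only the maximum.

```python
from typing import Optional, Sequence

METRIC_TYPES = ("response-time", "histogram", "gauge", "counter")
MAX_BUCKETS = 200  # documented maximum for numberOfBuckets


def performance_query_args(
    execution_id: str,
    metric_type: str = "response-time",
    number_of_buckets: int = 100,
    percentiles: Sequence[int] = (80, 90, 99),
    start_time: Optional[int] = None,
    end_time: Optional[int] = None,
) -> dict:
    """Build step_query_performance_metrics arguments (hypothetical helper)."""
    if metric_type not in METRIC_TYPES:
        raise ValueError(f"unknown metricType: {metric_type!r}")
    if not 1 <= number_of_buckets <= MAX_BUCKETS:
        raise ValueError(f"numberOfBuckets must be between 1 and {MAX_BUCKETS}")
    args = {
        "executionId": execution_id,
        "metricType": metric_type,
        "numberOfBuckets": number_of_buckets,
        "percentiles": list(percentiles),
    }
    # Omitted times default to the execution window on the server side.
    if start_time is not None:
        args["startTime"] = start_time
    if end_time is not None:
        args["endTime"] = end_time
    return args


timeline_args = performance_query_args("exec-123")
summary_args = performance_query_args("exec-123", number_of_buckets=1)
print(timeline_args)
print(summary_args)
```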

Prompts

The server registers one MCP prompt that pre-configures an agent for root-cause analysis.

`troubleshoot_step_execution`

Directs the agent to perform an optimization-first root-cause analysis of a Step execution. It injects a system directive that enforces the recommended workflow:

Call step_fetch_execution_overview (with includePreview=true) first.
Escalate to step_fetch_execution_report_nodes or step_query_performance_metrics only when the preview indicates a larger systemic or performance issue.
Never brute-force raw nodes when an aggregation map is available.

Argument:

executionId (required)
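For clients that speak MCP directly, requesting this prompt would use the protocol's `prompts/get` method. The sketch below builds such a request. The helper name and the execution ID are hypothetical, and most MCP client libraries wrap this call for you.

```python
import json


def build_prompt_request(execution_id: str, request_id: int = 1) -> dict:
    """Build an MCP `prompts/get` request for the troubleshooting prompt."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "prompts/get",
        "params": {
            "name": "troubleshoot_step_execution",
            # MCP prompt arguments are passed as string values.
            "arguments": {"executionId": execution_id},
        },
    }


prompt_request = build_prompt_request("exec-123")  # hypothetical execution ID
print(json.dumps(prompt_request, indent=2))
```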
