Tools
Reference for all tools exposed by the Step MCP Server. For setup and installation, see Getting Started.
Unless noted otherwise, tools accept two optional shared parameters: profile (which Step instance and credentials to use) and project (which project to act on). The discovery tools below are the exception: they don’t require a project. When omitted, parameters fall back to the configured defaults (see Profile and project resolution). Every tool’s response includes a resolvedContext block reporting the profile, baseUrl, and project actually used.
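As a rough illustration of the shared parameters, the sketch below wraps tool arguments in an MCP JSON-RPC tools/call request and reads the resolvedContext block from a result. The profile and project names, the base URL, the sample result, and both helper names are assumptions made for illustration; the sketch also assumes the tool result has already been decoded into a dictionary, which may not match how a given client surfaces it.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Wrap tool arguments in an MCP JSON-RPC tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def resolved_context(result):
    """Return the resolvedContext block of a decoded result, or {} if absent."""
    return result.get("resolvedContext", {})


# Hypothetical profile and project names; omit either to use the defaults.
request = build_tool_call(
    "step_search_executions",
    {"profile": "build-server", "project": "checkout-tests"},
)

# Hypothetical decoded result, shown only to illustrate the three reported fields.
sample_result = {
    "resolvedContext": {
        "profile": "build-server",
        "baseUrl": "https://step.example.com",
        "project": "checkout-tests",
    }
}

print(json.dumps(request, indent=2))
print(resolved_context(sample_result)["baseUrl"])
```

Checking resolvedContext after each call is a cheap way to confirm that a fallback default did not silently point the agent at the wrong instance.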
Discovery
Read-only tools that let an agent discover valid profile and project names before acting — without reading the configuration file or exposing its secrets.
step_list_profiles
Lists the configured profiles with safe, non-secret metadata: name, base URL, default project, whether a credential is configured, its source (user or workspace file), and which profile is the default. API tokens are never returned. Useful to resolve a human reference (“the build server”) to a profile name before acting. Takes no parameters.
step_list_projects
Lists the projects available for a profile on Step Enterprise, so an agent can pick a valid project before calling a project-scoped tool. On the open-source edition it reports that project listing is not supported.
Key parameters:
- profile (optional; resolved as usual)
Authoring
Helps an agent assemble a deployable Step Automation Package. The MCP server’s role is deliberately narrow: it provides the building blocks, while the agent produces the test content itself.
step_initialize_project
Scaffolds a local Step Automation Package from scratch. The output is shaped by two parameters:
- keywordSource: prebuilt-library, custom-code, or imported-asset
- useCase: functional-testing, load-testing, monitoring, or rpa
Every combination is valid. Returns a filesystemReceipt listing all created files. Use dryRun=true to preview without writing to disk. Chain with step_validate_plan after scaffolding.
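A minimal sketch of how an agent might assemble the arguments for step_initialize_project, checking the two enumerated values before sending them. The helper name is hypothetical, and only the parameters named above are included; any further parameters the tool may need (for example a target location) are not covered here.

```python
KEYWORD_SOURCES = {"prebuilt-library", "custom-code", "imported-asset"}
USE_CASES = {"functional-testing", "load-testing", "monitoring", "rpa"}


def initialize_project_args(keyword_source, use_case, dry_run=True):
    """Build arguments for step_initialize_project (hypothetical helper).

    dry_run defaults to True here so that the first call only previews
    the scaffold instead of writing to disk.
    """
    if keyword_source not in KEYWORD_SOURCES:
        raise ValueError(f"unknown keywordSource: {keyword_source!r}")
    if use_case not in USE_CASES:
        raise ValueError(f"unknown useCase: {use_case!r}")
    return {
        "keywordSource": keyword_source,
        "useCase": use_case,
        "dryRun": dry_run,
    }


preview_args = initialize_project_args("custom-code", "load-testing")
print(preview_args)
```

Defaulting to a dry run is a design choice of this sketch, not of the tool: it lets the agent inspect the planned files first and then repeat the call with dry_run=False.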
step_validate_plan
Validates a Step Automation Package at the requested depth. Available values for depth: parse, references, resolve, dry-run — currently only parse is implemented. Use depth='parse' (fastest) to check that automation-package.yaml exists, is valid YAML, and conforms to its declared schemaVersion.
Typically chained directly after step_initialize_project by passing its filesystemReceipt.root as packagePath.
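The chaining described above could look roughly like this. The sample scaffold result and its path are invented for illustration, and the sketch assumes the result has been decoded into a dictionary with filesystemReceipt at the top level; only the root field named in the text is used.

```python
VALID_DEPTHS = ("parse", "references", "resolve", "dry-run")
IMPLEMENTED_DEPTHS = ("parse",)  # per the text, only parse is implemented today


def validate_plan_args(init_result, depth="parse"):
    """Build step_validate_plan arguments from a scaffold result (hypothetical helper)."""
    if depth not in VALID_DEPTHS:
        raise ValueError(f"unknown depth: {depth!r}")
    if depth not in IMPLEMENTED_DEPTHS:
        raise NotImplementedError(f"depth {depth!r} is not implemented yet")
    # filesystemReceipt.root is passed on as packagePath.
    return {
        "packagePath": init_result["filesystemReceipt"]["root"],
        "depth": depth,
    }


# Hypothetical decoded result of step_initialize_project.
sample_init_result = {"filesystemReceipt": {"root": "/work/my-package"}}

print(validate_plan_args(sample_init_result))
```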
Execution
Runs a Step Automation Package through the Step CLI centrally on a Step instance. The Step CLI must be installed and on your PATH.
Note: Central execution is required because local execution is not supported for Node.js keyword packages.
Typical use cases:
- Run an existing automation package and feed the results straight into Analysis
- Re-run a package to confirm a fix or reproduce a failure
step_execute_automation_package
Executes a Step Automation Package through the Step CLI (step ap execute --async) and returns immediately with the execution ID(s). Follow up with step_search_executions or step_fetch_execution_overview to track results.
Key parameters:
- packagePath (required; absolute path to the package folder or a built .tar.gz/.zip archive)
- local (default false; run on the local agent instead of centrally on the Step server)
- plans (optional; specific plan names, otherwise all plans run)
- force (default false; pass --force to bypass CLI/server version-mismatch errors)
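A small sketch of assembling arguments for step_execute_automation_package, rejecting relative paths up front since packagePath must be absolute. The helper name, the package folder, and the plan name are hypothetical, and the sketch assumes plans is passed as a list of strings.

```python
import os


def execute_package_args(package_path, plans=None, local=False, force=False):
    """Build arguments for step_execute_automation_package (hypothetical helper)."""
    if not os.path.isabs(package_path):
        raise ValueError("packagePath must be an absolute path")
    args = {"packagePath": package_path, "local": local, "force": force}
    if plans:
        # Assumed to be a list of plan names; omit it to run all plans.
        args["plans"] = list(plans)
    return args


# Hypothetical package folder and plan name.
package_dir = os.path.abspath("demo-package")
print(execute_package_args(package_dir, plans=["smoke-test"]))
```

Because the tool returns immediately with the execution ID(s), the result of this call is typically the input to step_search_executions or step_fetch_execution_overview rather than a final verdict.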
Analysis
Read-only tools that let an agent navigate and diagnose Step execution results in a token-efficient way, from a high-level health overview down to individual failures, attachments, and performance metrics. The same tools work across functional tests, load tests, synthetic monitoring, and custom workflows.
Typical use cases:
- Diagnose why an execution failed and summarise the root cause
- Triage a large run by navigating only the failing paths
- Inspect logs, stack traces, and attachments for a specific failure
- Investigate performance regressions or resource exhaustion from execution metrics
- Find past executions by status, time window, user, or parameters
A typical troubleshooting flow:
- Find the execution with step_search_executions.
- Assess overall health with step_fetch_execution_overview.
- Page through individual failures with step_fetch_execution_report_nodes.
- Pull raw evidence (logs, stack traces) with step_download_attachment_content.
- Investigate performance with step_query_performance_metrics.
step_search_executions
Searches for executions in Step. All filters are optional and combined with AND logic; results are sorted by start time (descending). Useful as an entry point when you don’t already have an execution ID.
Key parameters:
- description
- status (RUNNING/PASSED/FAILED/TECHNICAL_ERROR)
- user
- startTimeFrom / startTimeTo (epoch ms)
- executionParameters
- offset, limit (max 200, default 25)
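Since the time filters are expressed in epoch milliseconds, a sketch like the following may help when building a "failed in the last N hours" query. The helper names are hypothetical, and the sketch assumes status takes a single value.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    return int(moment.timestamp() * 1000)


def failed_since_args(hours, now=None, limit=25):
    """Build step_search_executions arguments for recent failures (hypothetical helper)."""
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    now = now or datetime.now(timezone.utc)
    return {
        "status": "FAILED",
        "startTimeFrom": to_epoch_ms(now - timedelta(hours=hours)),
        "startTimeTo": to_epoch_ms(now),
        "offset": 0,
        "limit": limit,
    }


print(failed_since_args(24))
```

Filters are combined with AND logic, so adding user or description to this dictionary narrows the result set rather than widening it.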
step_fetch_execution_overview
Assesses the overall health of an execution. Use this first.
Returns a token-optimized view of the Aggregated Tree, showing only the failing paths by default. Set filterStatus to ALL to include passing ones.
With includePreview enabled (default), it embeds a small sample of failed nodes and inlines small attachments to avoid extra round trips.
Key parameters:
- executionId (required)
- filterStatus (FAILED/ALL, default FAILED)
- includePreview (default true)
step_fetch_execution_report_nodes
Pages through more instances of a specific failure by querying the raw execution tree. The artifactHash is a hash identifying a specific failing node type in the Aggregated Tree — returned by step_fetch_execution_overview.
Key parameters:
- executionId (required)
- artifactHash (required)
- limit (default 5)
- offset (default 0)
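Paging with limit and offset could be sketched as a generator of argument sets, one per page. The helper name, the execution ID, and the hash value are placeholders, and the sketch assumes offset counts items rather than pages.

```python
def report_node_pages(execution_id, artifact_hash, page_size=5, max_pages=3):
    """Yield step_fetch_execution_report_nodes arguments page by page (hypothetical helper)."""
    for page in range(max_pages):
        yield {
            "executionId": execution_id,
            "artifactHash": artifact_hash,
            "limit": page_size,
            # Assumption: offset is an item offset, so it advances by page_size.
            "offset": page * page_size,
        }


# Placeholder identifiers; the real artifactHash comes from step_fetch_execution_overview.
pages = list(report_node_pages("exec-123", "hash-abc"))
print([p["offset"] for p in pages])
```

Capping max_pages keeps the agent from paging through every instance of a failure when a handful of samples is usually enough to characterise it.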
step_download_attachment_content
Downloads the contents of an attachment identified on a ReportNode. Small text files are returned inline. Larger text files and binary files (Playwright trace bundles, HAR files) are saved to disk and the file path is returned.
Key parameters:
- attachmentId (required)
- attachmentType (required)
step_query_performance_metrics
Pulls time-series metric aggregates or bucketed histogram data across the execution timeline. Essential for analysing performance regressions, load tests, and resource exhaustion. Start and end times default to the execution window.
Key parameters:
- executionId (required)
- metricType (default response-time; also histogram, gauge, counter)
- numberOfBuckets (max 200, default 100; use 1 for a single aggregate across the whole execution)
- percentiles (default [80, 90, 99])
- startTime / endTime (epoch ms, optional)
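The parameter limits above can be checked before a query is sent. The sketch below is one way to do that; the helper name and the execution ID are hypothetical, and the defaults simply mirror the documented ones.

```python
METRIC_TYPES = {"response-time", "histogram", "gauge", "counter"}


def performance_query_args(execution_id, metric_type="response-time",
                           number_of_buckets=100, percentiles=(80, 90, 99),
                           start_time=None, end_time=None):
    """Build step_query_performance_metrics arguments (hypothetical helper)."""
    if metric_type not in METRIC_TYPES:
        raise ValueError(f"unknown metricType: {metric_type!r}")
    if not 1 <= number_of_buckets <= 200:
        raise ValueError("numberOfBuckets must be between 1 and 200")
    args = {
        "executionId": execution_id,
        "metricType": metric_type,
        "numberOfBuckets": number_of_buckets,
        "percentiles": list(percentiles),
    }
    # Leave the times out to fall back to the execution window.
    if start_time is not None:
        args["startTime"] = start_time
    if end_time is not None:
        args["endTime"] = end_time
    return args


# A single bucket gives one aggregate across the whole execution.
summary_args = performance_query_args("exec-123", number_of_buckets=1)
print(summary_args)
```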
Prompts
The server registers one MCP prompt that pre-configures an agent for root-cause analysis.
troubleshoot_step_execution
Directs the agent to perform an optimization-first root cause analysis on a Step execution run. It injects a system directive that enforces the recommended workflow:
- Call step_fetch_execution_overview (with includePreview=true) first.
- Escalate to step_fetch_execution_report_nodes or step_query_performance_metrics only when the preview indicates a larger systemic or performance issue.
- Never brute-force raw nodes when an aggregation map is available.
Argument:
- executionId (required)
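For clients that speak MCP directly, requesting this prompt could look roughly like the following. The envelope follows the MCP prompts/get convention; the helper name and the execution ID are placeholders, and most MCP clients expose prompts through their own UI or SDK rather than raw JSON-RPC.

```python
import json


def build_prompt_request(execution_id, request_id=1):
    """Build an MCP prompts/get request for troubleshoot_step_execution (hypothetical helper)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "prompts/get",
        "params": {
            "name": "troubleshoot_step_execution",
            # MCP prompt arguments are passed as strings.
            "arguments": {"executionId": str(execution_id)},
        },
    }


prompt_request = build_prompt_request("exec-123")
print(json.dumps(prompt_request, indent=2))
```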