Coding agents running iOS simulator tests on one Mac

Coding agents and scripts run xcodebuild/Simulator tests concurrently on one Mac, and runs start colliding, wedging, and failing unpredictably.

Strong fit · Likely class: concurrent simulator contention / shared-state collisions · Updated 2026-06-09

Symptom

You have one or more coding agents (plus your own terminals, scripts, and maybe a local CI-like job) running iOS/macOS tests on the same Mac. Individually each run might be fine. Together they start colliding: runs hang, simulators wedge, and failures stop being reproducible.

What it usually looks like

Two agents kick off xcodebuild test for different apps at the same time, and one or both stall.
Devices end up in odd states — booted by one run, erased by another, shut down mid-test.
simctl or the whole subsystem wedges right when activity peaks. See CoreSimulatorService deadlock.
Result bundles, logs, or DerivedData from different runs overwrite each other.
A run builds successfully but fails before XCTest attaches, and the agent cannot tell from raw output whether it is an app regression or simulator / runner setup trouble.
Failures correlate with how busy the Mac is, not with any one test — the hallmark of a contention problem rather than a code problem.

Why it happens / likely failure classes

This is the situation XCSteward was born from. The simulator subsystem is largely shared, per-user, and not designed for several uncoordinated drivers at once:

One CoreSimulatorService for the whole user session. Every agent, script, and Simulator window funnels through it. Uncoordinated concurrent operations contend for its locks and can deadlock it.
Devices are shared global state. One run booting, erasing, or shutting down a device can pull it out from under another run that assumed it was stable.
Implicit destinations collide. Two runs using generic/platform=iOS Simulator or the same device by name can land on the same device. See xcodebuild hangs resolving the destination.
Shared artifact paths. DerivedData, result bundles, and temp dirs collide when runs are not isolated.
No backpressure. Nothing stops a fifth run from starting when the host is already saturated.

Concurrency is an amplifier, not the sole cause. Each of these failures can happen in a single run — but multiple agents on one Mac make them frequent and hard to reason about.

Quick checks

# How many simulator-related processes are running right now?
pgrep -lf 'xcodebuild|simctl|Simulator'

# Which devices are booted right now (and may be shared between runs)?
xcrun simctl list devices | grep -i booted

# Is the subsystem still responsive under load?
time xcrun simctl list devices available

If failures cluster when several of these processes are active at once, treat the problem as contention, not flaky tests.
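A hung subsystem will also hang the responsiveness check itself, so it can help to put a deadline on it. Stock macOS has no timeout command, so the sketch below polls a background process instead. It assumes bash; run_bounded and the 124 exit code are illustrative choices, not part of any tool mentioned here.

```shell
#!/usr/bin/env bash
# Run a command with a deadline in seconds.
# Returns the command's own exit status, or 124 if the deadline passed.
run_bounded() {
  local limit="$1"; shift
  "$@" &
  local pid=$!
  local waited=0
  # kill -0 only tests whether the process still exists.
  while kill -0 "$pid" 2>/dev/null; do
    if [ "$waited" -ge "$limit" ]; then
      kill "$pid" 2>/dev/null || true
      wait "$pid" 2>/dev/null || true
      return 124
    fi
    sleep 1
    waited=$((waited + 1))
  done
  # The command finished on its own; report its exit status.
  wait "$pid"
}

# Usage sketch:
#   run_bounded 15 xcrun simctl list devices available >/dev/null \
#     || echo "simctl failed or did not answer within 15s" >&2
```

The one-second polling interval keeps the function simple; a check that finishes instantly may still take about a second to be noticed.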

Manual mitigations

Serialize simulator work. Run one xcodebuild/simulator job at a time — e.g. a shell lock so agents queue instead of colliding:

# crude global mutex around simulator runs
# note: flock(1) is not shipped with macOS; install it first (e.g. brew install flock)
(
  flock -w 1800 9 || { echo "timed out waiting for the simulator lock" >&2; exit 1; }
  xcodebuild test -scheme YourScheme \
    -destination "platform=iOS Simulator,id=<UDID>"
) 9>/tmp/xcsim.lock
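If installing flock is not an option, a similar mutex can be sketched with mkdir, which either creates the directory or fails atomically. This is a minimal sketch, assuming bash; SIM_LOCK_DIR, with_sim_lock, and the exit code 75 are hypothetical names and values chosen for illustration.

```shell
#!/usr/bin/env bash
# Directory used as the lock; only one caller can create it at a time.
SIM_LOCK_DIR="${SIM_LOCK_DIR:-/tmp/xcsim.lock.d}"

# with_sim_lock <max wait in seconds> <command...>
with_sim_lock() {
  local max_wait="$1"; shift
  local waited=0
  until mkdir "$SIM_LOCK_DIR" 2>/dev/null; do
    if [ "$waited" -ge "$max_wait" ]; then
      echo "timed out waiting for $SIM_LOCK_DIR" >&2
      return 75
    fi
    sleep 1
    waited=$((waited + 1))
  done
  # Run the command, then release the lock whether it passed or failed.
  local status=0
  "$@" || status=$?
  rmdir "$SIM_LOCK_DIR"
  return "$status"
}

# Usage sketch:
#   with_sim_lock 1800 xcodebuild test -scheme YourScheme \
#     -destination "platform=iOS Simulator,id=<UDID>"
```

Unlike flock, this lock is not released automatically if the holder is killed, so a crashed run leaves a stale directory that has to be removed by hand.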

Give each run its own device by UDID, never a generic/by-name destination.
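One way to get a private UDID is to create a throwaway device per run and wait for it to finish booting before handing it to xcodebuild. The sketch below uses the simctl create and bootstatus subcommands; the function names, the agent- name prefix, and the default device type and runtime identifiers are assumptions and must match what is actually installed (see xcrun simctl list devicetypes and xcrun simctl list runtimes).

```shell
#!/usr/bin/env bash
# Create a dedicated simulator for one run and print its UDID.
create_run_device() {
  local run_id="$1"
  # Example identifiers only; substitute ones installed on this Mac.
  local device_type="${2:-com.apple.CoreSimulator.SimDeviceType.iPhone-15}"
  local runtime="${3:-com.apple.CoreSimulator.SimRuntime.iOS-17-5}"
  # simctl create prints the new device's UDID on stdout.
  xcrun simctl create "agent-$run_id" "$device_type" "$runtime"
}

# Boot the device if needed and block until boot has completed.
boot_and_wait() {
  local udid="$1"
  xcrun simctl bootstatus "$udid" -b
}

# Usage sketch:
#   UDID="$(create_run_device "$RUN_ID")"
#   boot_and_wait "$UDID"
#   xcodebuild test -scheme YourScheme \
#     -destination "platform=iOS Simulator,id=$UDID"
```

Waiting on bootstatus rather than checking for the Booted state addresses the "ready versus merely Booted" distinction, at the cost of one more call that can itself stall when CoreSimulator is wedged.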

Isolate artifacts per run:

# unique per-run suffix so concurrent runs never share paths
RUN_ID="$(uuidgen)"
xcodebuild test -scheme YourScheme \
  -destination "platform=iOS Simulator,id=<UDID>" \
  -derivedDataPath "/tmp/dd-$RUN_ID" \
  -resultBundlePath "/tmp/result-$RUN_ID.xcresult"

Cap concurrency so the host is not saturated, and clean up devices and processes between runs.
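Both halves of that advice can be sketched in a few lines of bash. The cap below simply counts running xcodebuild processes, and the cleanup shuts down and deletes a per-run device. MAX_XCODEBUILD, under_cap, and cleanup_device are hypothetical names, and the cap is a soft check (two runs can pass it at the same moment), so treat it as a complement to a lock rather than a replacement.

```shell
#!/usr/bin/env bash
# Succeed only if fewer than MAX_XCODEBUILD xcodebuild processes are running.
under_cap() {
  local max="${MAX_XCODEBUILD:-1}"
  local running
  # pgrep exits non-zero when nothing matches; that just means a count of 0.
  running="$(pgrep -x xcodebuild | wc -l | tr -d ' ' || true)"
  [ "${running:-0}" -lt "$max" ]
}

# Shut down and delete the device that belonged to this run.
cleanup_device() {
  local udid="$1"
  # Shutdown fails if the device is already off; that is fine here.
  xcrun simctl shutdown "$udid" 2>/dev/null || true
  xcrun simctl delete "$udid"
}

# Usage sketch:
#   under_cap || { echo "host busy, try again later" >&2; exit 75; }
#   trap 'cleanup_device "$UDID"' EXIT
```

Registering the cleanup with trap means the device is removed even when the test command fails, which is the point of cleaning up between runs.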

These work, but they turn into a pile of glue you have to maintain — which is the gap XCSteward aims to fill.

When XCSteward may help

This is the central use case XCSteward is designed for:

A single controlled execution lane with a scheduler/queue, so multiple agents and scripts submit runs that execute in a coordinated order instead of fighting over the subsystem.
Guardrails around unsafe concurrent activity — preventing two runs from grabbing the same device or hammering CoreSimulator at once.
Readiness checks, timeouts, and deterministic cleanup so one bad run does not poison the next.
Isolated artifacts per run so logs and result bundles never collide.
Human-visible monitoring for long runs: plain submit --wait prints the queued job id, job directory, watch/follow commands, and compact updates; status <job-id> --watch and logs <job-id> --follow let a human keep observing an existing job.
A JSON contract for agents, including projects --json, profile show <name> --json, profile init --detect --json, status <job-id> --watch --json as newline-delimited JobSummary objects, explain <job-id> --json, phase-aware --progress, --metadata, --label, and repeatable --env KEY=VALUE for per-run environment injection. XCSteward records env override keys, not sensitive values.
Explicit pre-XCTest classification: runner_bootstrap_failure means runner or environment setup failed before XCTest attached, so agents can distinguish CoreSimulator, destination, launch-session, artifact, or runner setup trouble from real test failures and branch on result_class.
Timeout-before-attach detail: pre_xctest_timeout means the test command hit its timeout before XCSteward observed XCTest attach/test execution evidence, so agents do not mistake a bootstrap problem for a timed-out test case.
Pending-log handling so logs <job-id> can report that the combined log is not ready yet during queued/bootstrap setup and point back to status <job-id> --watch.
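To illustrate what branching on result_class might look like from an agent's side, here is a minimal bash sketch. Only the two class names come from the list above; next_action and the action labels are hypothetical, and how the value is pulled out of the JSON (for example from explain <job-id> --json) is left out because the exact output shape is not shown here.

```shell
#!/usr/bin/env bash
# Map a result_class value to the next step an agent should take.
next_action() {
  case "$1" in
    runner_bootstrap_failure|pre_xctest_timeout)
      # Setup failed before XCTest attached: look at the environment,
      # not at the code under test.
      echo "check-environment"
      ;;
    *)
      # Anything else is treated here as a result worth reading as a test outcome.
      echo "inspect-test-results"
      ;;
  esac
}

# Usage sketch:
#   action="$(next_action "$result_class")"
```

Keeping the mapping in one function makes it easy to extend if more classes need special handling.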

If your pain shows up specifically when agents share a Mac, this is the class of failure XCSteward most wants to be tested against.

When XCSteward probably will not help

It does not make your tests themselves parallel-safe — shared app state, fixtures, or backend data that collide across runs are your responsibility.
It is not a distributed test grid; it coordinates work on one Mac, it does not add machines.
It will not fix genuinely broken or flaky tests that fail on their own — see the failure-mode library for where it does and does not fit.

Common questions

What does "Coding agents running iOS simulator tests on one Mac" usually mean? It usually points to concurrent simulator contention / shared-state collisions. Coding agents and scripts run xcodebuild/Simulator tests concurrently on one Mac, and runs start colliding, wedging, and failing unpredictably. Start by checking simulator readiness, destination selection, CoreSimulator/simctl responsiveness, and whether another xcodebuild, simctl, or Simulator process is already active before treating it as a test-code failure.
Can XCSteward help with "Coding agents running iOS simulator tests on one Mac"? This is a strong fit when the failure is operational: simulator readiness, destination resolution, CoreSimulator responsiveness, cleanup, timeouts, or local concurrency. XCSteward may help by making those phases bounded, serialized, and easier to inspect. It will not fix broken tests, code signing, missing runtimes, or vendor image bugs.
What should I check first? Check whether xcrun simctl commands return promptly, whether xcodebuild can resolve a concrete simulator destination, whether the device is truly ready rather than merely Booted, and whether concurrent agents, scripts, or manual runs are touching the same simulator subsystem.