You have one or more coding agents (plus your own terminals, scripts, and maybe
a local CI-like job) running iOS/macOS tests on the same Mac. Individually each
run might be fine. Together they start colliding: runs hang, simulators wedge,
and failures stop being reproducible.
What it usually looks like
Two agents kick off xcodebuild test for different apps at the same time,
and one or both stall.
Devices end up in odd states — booted by one run, erased by another, shut down
mid-test.
Result bundles, logs, or DerivedData from different runs overwrite each
other.
A run builds successfully but fails before XCTest attaches, and the agent
cannot tell from raw output whether it is an app regression or simulator /
runner setup trouble.
Failures correlate with how busy the Mac is, not with any one test — the
hallmark of a contention problem rather than a code problem.
Why it happens / likely failure classes
This is the situation XCSteward was born from. The simulator subsystem is
largely shared, per-user, and not designed for several uncoordinated drivers
at once:
One CoreSimulatorService for the whole user session. Every agent, script,
and Simulator window funnels through it. Uncoordinated concurrent operations
contend for its locks and can deadlock it.
Devices are shared global state. One run booting, erasing, or shutting down
a device can pull it out from under another run that assumed it was stable.
Implicit destinations collide. Two runs using generic/platform=iOS Simulator or the same device by name can land on the same device. See
xcodebuild hangs resolving the destination.
Shared artifact paths. DerivedData, result bundles, and temp dirs collide
when runs are not isolated.
No backpressure. Nothing stops a fifth run from starting when the host is
already saturated.
Concurrency is an amplifier, not the sole cause. Each of these failures can
happen in a single run — but multiple agents on one Mac make them frequent and
hard to reason about.
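To make the missing backpressure concrete, here is a minimal sketch of a slot limiter built on mkdir, which is atomic and present on a stock Mac. The names SLOT_ROOT, MAX_SLOTS, acquire_slot, and release_slot are invented for illustration; they are not part of XCSteward or any Apple tool.

```shell
# Hypothetical slot limiter: at most MAX_SLOTS runs hold a slot at once.
# SLOT_ROOT and MAX_SLOTS are illustrative names, not settings of any real tool.
SLOT_ROOT="${SLOT_ROOT:-${TMPDIR:-/tmp}/xcsim-slots}"
MAX_SLOTS="${MAX_SLOTS:-2}"

# Try to claim a free slot. mkdir either creates the directory or fails,
# so two callers can never claim the same slot.
# Prints the claimed slot path on success; returns 1 when every slot is taken.
acquire_slot() {
  mkdir -p "$SLOT_ROOT" || return 1
  as_i=1
  while [ "$as_i" -le "$MAX_SLOTS" ]; do
    if mkdir "$SLOT_ROOT/slot-$as_i" 2>/dev/null; then
      echo "$SLOT_ROOT/slot-$as_i"
      return 0
    fi
    as_i=$((as_i + 1))
  done
  return 1
}

# Give the slot back so a waiting run can claim it.
release_slot() {
  rmdir "$1"
}
```

A wrapper could call slot=$(acquire_slot) before starting xcodebuild, refuse or retry when that fails, and call release_slot "$slot" from an EXIT trap. A run that is killed outright leaves its slot directory behind, so stale slots would need their own cleanup.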
Quick checks
# How many simulator-related processes are running right now?
pgrep -lf 'xcodebuild|simctl|Simulator'
# Are multiple runs targeting the same device?
xcrun simctl list devices | grep -i booted
# Is the subsystem still responsive under load?
time xcrun simctl list devices available
If failures cluster when several of these are active at once, treat it as
contention, not flaky tests.
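A wedged subsystem may never return from a simctl call, so a responsiveness check is more useful when it is bounded. The sketch below assumes nothing beyond a POSIX shell; responds_within is a hypothetical helper name, and the 124 exit code is just a convention borrowed from GNU timeout, which macOS does not ship.

```shell
# responds_within SECONDS CMD [ARGS...]
# Runs CMD in the background and polls once per second.
# Returns CMD's own exit status if it finishes in time,
# or 124 if it was still running at the limit and had to be killed.
responds_within() {
  rw_limit=$1
  shift
  "$@" >/dev/null 2>&1 &
  rw_pid=$!
  rw_waited=0
  while kill -0 "$rw_pid" 2>/dev/null; do
    if [ "$rw_waited" -ge "$rw_limit" ]; then
      kill "$rw_pid" 2>/dev/null || true
      wait "$rw_pid" 2>/dev/null || true
      return 124
    fi
    sleep 1
    rw_waited=$((rw_waited + 1))
  done
  wait "$rw_pid"
}
```

For example, responds_within 10 xcrun simctl list devices available returning 124 would suggest the simulator service is stuck rather than slow. The helper only signals the process it started, so child processes of that command are not covered.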
Manual mitigations
Serialize simulator work. Run one xcodebuild/simulator job at a time —
e.g. a shell lock so agents queue instead of colliding:
# crude global mutex around simulator runs
# note: flock(1) is not bundled with macOS; this assumes it was installed separately
(
  flock -w 1800 9 || { echo "another run holds the lock"; exit 1; }
  xcodebuild test -scheme YourScheme \
    -destination "platform=iOS Simulator,id=<UDID>"
) 9>/tmp/xcsim.lock
Give each run its own device by UDID, never a generic/by-name destination.
Cap concurrency so the host is not saturated, and clean up devices and
processes between runs.
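As a rough sketch of the per-run device and cleanup advice, the helper below creates a throwaway simulator, targets it by UDID, keeps DerivedData and the result bundle in a per-run directory, and deletes the device afterwards. run_isolated and RUNS_ROOT are hypothetical names, and the device type you pass in is assumed to be installed on the host.

```shell
# run_isolated SCHEME DEVICE_TYPE
# Hypothetical helper: one throwaway device and one artifact directory per run.
run_isolated() {
  ri_scheme=$1
  ri_device_type=$2    # e.g. "iPhone 15"; must be a device type installed on this Mac
  ri_run_id="run-$(date +%Y%m%d-%H%M%S)-$$"
  ri_out="${RUNS_ROOT:-$HOME/xcsim-runs}/$ri_run_id"   # RUNS_ROOT is an illustrative name
  mkdir -p "$ri_out" || return 1

  # simctl create prints the new device's UDID on stdout.
  ri_udid=$(xcrun simctl create "$ri_run_id" "$ri_device_type") || return 1

  # Target the device by UDID and keep every artifact under this run's directory.
  ri_status=0
  xcodebuild test \
    -scheme "$ri_scheme" \
    -destination "platform=iOS Simulator,id=$ri_udid" \
    -derivedDataPath "$ri_out/DerivedData" \
    -resultBundlePath "$ri_out/result.xcresult" \
    >"$ri_out/xcodebuild.log" 2>&1 || ri_status=$?

  # Remove the device whether or not the tests passed.
  xcrun simctl shutdown "$ri_udid" >/dev/null 2>&1 || true
  xcrun simctl delete "$ri_udid" >/dev/null 2>&1 || true
  return "$ri_status"
}
```

Calling run_isolated YourScheme "iPhone 15" returns the xcodebuild exit status. Creating a device per run costs time and disk, so a pool of pre-created devices may suit you better; this sketch also does nothing to limit how many runs are active at once.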
These work, but they turn into a pile of glue you have to maintain — which is the
gap XCSteward aims to fill.
When XCSteward may help
This is the central use case XCSteward is designed for:
A single controlled execution lane with a scheduler/queue, so multiple
agents and scripts submit runs that execute in a coordinated order instead of
fighting over the subsystem.
Guardrails around unsafe concurrent activity — preventing two runs from
grabbing the same device or hammering CoreSimulator at once.
Readiness checks, timeouts, and deterministic cleanup so one bad run does
not poison the next.
Isolated artifacts per run so logs and result bundles never collide.
Human-visible monitoring for long runs: plain submit --wait prints the
queued job id, job directory, watch/follow commands, and compact updates;
status <job-id> --watch and logs <job-id> --follow let a human keep
observing an existing job.
A JSON contract for agents, including projects --json,
profile show <name> --json, profile init --detect --json,
status <job-id> --watch --json as newline-delimited JobSummary objects,
explain <job-id> --json, phase-aware --progress, --metadata,
--label, and repeatable --env KEY=VALUE for per-run environment
injection. XCSteward records env override keys, not sensitive values.
Explicit pre-XCTest classification: runner_bootstrap_failure means
runner or environment setup failed before XCTest attached, so agents can
distinguish CoreSimulator, destination, launch-session, artifact, or runner
setup trouble from real test failures and branch on result_class.
Timeout-before-attach detail: pre_xctest_timeout means the test command
hit its timeout before XCSteward observed XCTest attach/test execution
evidence, so agents do not mistake a bootstrap problem for a timed-out test
case.
Pending-log handling so logs <job-id> can report that the combined log is
not ready yet during queued/bootstrap setup and point back to
status <job-id> --watch.
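To illustrate what branching on result_class could look like on the agent side, here is a small sketch that reads newline-delimited JSON on stdin and prints one verdict per line. It assumes each object carries a top-level "result_class" string, which the text above implies but does not spell out. classify_result and its output labels are invented for this example, and a real agent would be better served by a proper JSON parser than by sed.

```shell
# classify_result: reads newline-delimited JSON on stdin, prints one verdict per line.
# ASSUMPTION: each object has a top-level "result_class" string field.
# The output labels are made up for this sketch.
classify_result() {
  while IFS= read -r cr_line; do
    # Pull out the value of "result_class"; empty when the field is absent.
    cr_class=$(printf '%s\n' "$cr_line" |
      sed -n 's/.*"result_class"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p')
    case "$cr_class" in
      runner_bootstrap_failure) echo "setup-failure" ;;  # failed before XCTest attached
      pre_xctest_timeout)       echo "setup-timeout" ;;  # timed out before XCTest attach was seen
      "")                       echo "no-class" ;;       # field missing on this line
      *)                        echo "other:$cr_class" ;;
    esac
  done
}
```

An agent could pipe the output of status <job-id> --watch --json into classify_result and retry or escalate on the two setup verdicts instead of reporting a test regression.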
If your pain shows up specifically when agents share a Mac, this is the class of
failure XCSteward most wants to be tested against.
When XCSteward probably will not help
It does not make your tests themselves parallel-safe — shared app state,
fixtures, or backend data that collide across runs are your responsibility.
It is not a distributed test grid; it coordinates work on one Mac, it does
not add machines.
It will not fix genuinely broken or flaky tests that fail on their own —
see the failure-mode library for where it does and does not fit.
Common questions
What does "Coding agents running iOS simulator tests on one Mac" usually mean?
It usually points to concurrent simulator contention / shared-state collisions. Coding agents and scripts run xcodebuild/Simulator tests concurrently on one Mac, and runs start colliding, wedging, and failing unpredictably. Start by checking simulator readiness, destination selection, CoreSimulator/simctl responsiveness, and whether another xcodebuild, simctl, or Simulator process is already active before treating it as a test-code failure.
Can XCSteward help with "Coding agents running iOS simulator tests on one Mac"?
This is a strong fit when the failure is operational: simulator readiness, destination resolution, CoreSimulator responsiveness, cleanup, timeouts, or local concurrency. XCSteward may help by making those phases bounded, serialized, and easier to inspect. It will not fix broken tests, code signing, missing runtimes, or vendor image bugs.
What should I check first?
Check whether xcrun simctl commands return promptly, whether xcodebuild can resolve a concrete simulator destination, whether the device is truly ready rather than merely Booted, and whether concurrent agents, scripts, or manual runs are touching the same simulator subsystem.
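One way to tell "truly ready" from "merely Booted" is to combine the device list with simctl bootstatus, which returns only once the boot sequence has finished. This is a minimal sketch; device_ready is a hypothetical helper name.

```shell
# device_ready UDID
# Succeeds only if the device is listed as Booted AND bootstatus confirms
# that the boot sequence has actually completed.
device_ready() {
  dr_udid=$1
  # "(Booted)" in the device list only says the device was started.
  xcrun simctl list devices | grep -F "$dr_udid" | grep -q "(Booted)" || return 1
  # bootstatus blocks until boot has finished, then exits.
  xcrun simctl bootstatus "$dr_udid" >/dev/null 2>&1
}
```

Because bootstatus blocks, a wedged subsystem can make this check hang, so consider bounding it with a timeout of your own.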
Related failure modes
AI coding agent hits xcodebuild timeouts — A coding agent running iOS tests keeps hitting xcodebuild timeouts or hangs — frequently because its runs collide with other simulator activity.