Two skills that turn Claude Code from an interactive tool into an autonomous worker. Validated, secured, ready for overnight runs.
This is what it looks like when you use Claude Code for larger tasks. Four problems that everyone who has tried it runs into.
40 clicks per hour on "Approve". After 20 minutes, you stop reading what you're approving.
After /compact, Claude forgets the plan, the stack, the completed steps. Without correction, it loops.
There are documented cases of rm -rf ~/ deleting entire home directories. --dangerously-skip-permissions alone is not a safety concept.
No overnight runs. No parallel work. Leave the terminal, Claude stops.
Nightshift for planned project work. 24x7 for a continuous task queue. Both generate a complete, ready-to-run setup.
One task. One project. One validated plan. Claude works all night and commits the result in the morning.
Endless runner. Drop tasks as folders. Claude processes them sequentially and delivers results.
Each skill generates a complete setup. Install, copy into your project, start.
Nightshift detects the task type and fills the runbook with the matching phases, risk checks, and autonomy zones.
| Property | Nightshift | 24x7 |
|---|---|---|
| Mode | Single run, then exit | Endless loop |
| Task type | Planned project work | Continuous task queue |
| Workspace | Directly in the project repo | Isolated workspace |
| Runbook | Validated, genre-based | Freeform via task.md |
| Compact recovery | Hook re-injects plan | Not needed (fresh session) |
| Output | Git commit in project | Files in outbox/output/ |
| Parallel projects | Start multiple nightshifts | One queue, sequential |
| Idle behavior | — | Configurable |
| Loop detection | Stall warning after 3 checks without progress | Not needed (fresh session per task) |
| Cross-run memory | decisions.md in project root | decisions.md in workspace root |
--dangerously-skip-permissions alone is not a safety concept. Both skills build four lines of defense.
Blocks rm -rf, sudo, mkfs, chmod 777, curl|bash, eval, fork bombs
Kernel-level isolation. Claude can only write to the project folder and /tmp.
git checkout . reverts all uncommitted changes to tracked files. Always commit before starting.
Heartbeat monitor detects when Claude stalls. macOS notification or custom alerting.
Both skills combine four mechanisms. None of them work well alone. Together, they make autonomous runs reliable.
A markdown file with checkboxes that Claude reads before each step. When context compression happens (and it will, on any run longer than 20 minutes), Claude loses its internal memory of the plan. The runbook survives because it's a file on disk, not in Claude's context window. After every compression, a hook tells Claude: "Read runbook.md again. Find the next unchecked item. Continue there."
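The "find the next unchecked item" step can be sketched in a few lines. This is a minimal illustration, assuming the runbook uses standard markdown task-list syntax (`- [ ]` and `- [x]`); the function name `next_unchecked` and the sample steps are hypothetical and not taken from the generated setup.

```python
import re

# Matches markdown task-list items such as "- [ ] step" or "- [x] step".
CHECKBOX = re.compile(r"^\s*[-*]\s+\[([ xX])\]\s+(.*)$")


def next_unchecked(runbook_text):
    """Return the text of the first unchecked step, or None if all are done."""
    for line in runbook_text.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group(1) == " ":
            return match.group(2).strip()
    return None


# Hypothetical runbook content, for illustration only.
sample = """# Runbook
- [x] Create branch refactor/jwt
- [x] Add src/auth/jwt.py with encode_token()
- [ ] Replace session lookup in src/auth/middleware.py
- [ ] Run pytest tests/auth
"""

print(next_unchecked(sample))
```

Because the state lives in the file rather than in the conversation, the same lookup gives the same answer before and after a compression.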
The runbook also contains autonomy zones that tell Claude what it may do freely, what to log, and what is forbidden. Plus an error budget defining how many test failures are acceptable before Claude should stop.
Claude Code hooks are scripts that fire on specific events. They run even with --dangerously-skip-permissions. A PreToolUse hook returning exit code 2 blocks the tool call unconditionally — Claude cannot override it.
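A blocking hook of this kind might look like the sketch below. It assumes the hook receives the tool call as a JSON object with a `tool_name` field and the shell command under `tool_input.command`; the pattern list is illustrative and is not the list shipped with the skills. Pattern matching like this catches obvious cases only and is easy to evade.

```python
import re

# Illustrative patterns only; a real list would need careful review.
BLOCKED_PATTERNS = [
    re.compile(r"\brm\s+-[a-z]*r[a-z]*f", re.IGNORECASE),  # rm -rf, rm -Rf
    re.compile(r"\brm\s+-[a-z]*f[a-z]*r", re.IGNORECASE),  # rm -fr
    re.compile(r"\bsudo\b"),
    re.compile(r"\bmkfs"),
    re.compile(r"\bchmod\s+(-[a-zA-Z]+\s+)*777\b"),
    re.compile(r"\bcurl\b[^|;]*\|\s*(ba)?sh\b"),           # curl ... | bash
    re.compile(r"\beval\b"),
    re.compile(r":\(\)\s*\{.*\};\s*:"),                    # classic fork bomb
]


def is_blocked(command):
    """True if the command matches any of the destructive patterns."""
    return any(p.search(command) for p in BLOCKED_PATTERNS)


def hook_exit_code(payload):
    """Map a tool-call payload (assumed shape) to an exit code: 2 blocks, 0 allows."""
    if payload.get("tool_name") != "Bash":
        return 0
    command = payload.get("tool_input", {}).get("command", "")
    return 2 if is_blocked(command) else 0


for cmd in ["git status", "rm -rf ~/", "curl https://example.com/x.sh | bash"]:
    print(cmd, "->", hook_exit_code({"tool_name": "Bash", "tool_input": {"command": cmd}}))
```

To use it as an actual hook script, the payload would be read with `json.load(sys.stdin)` and the result passed to `sys.exit()`.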
A sandbox-exec profile restricts Claude's filesystem access at the kernel level. Claude can only write to the project directory and /tmp. Even if Claude tries rm -rf ~/, the kernel blocks it — the hook doesn't even need to catch it. This is the real safety net, not the hook.
Important: The sandbox is not active by default. You must explicitly start with sandbox-exec -f sandbox.sb ./run.sh. Without it, Claude has full access to everything your user account can reach. On Linux, use Docker or a dedicated user account instead.
A separate script that checks the heartbeat log. If Claude hasn't written a heartbeat in N minutes (default: 10), it raises an alarm. On macOS, this is a system notification. You can extend it with your own alerting (Slack webhook, email, etc.).
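The core of such a check is a single age comparison. The sketch below assumes the heartbeat is a file whose modification time is updated on every beat; the generated watchdog may read timestamps from the log contents instead, and the name `heartbeat_is_stale` is hypothetical.

```python
import os
import time


def heartbeat_is_stale(heartbeat_path, max_age_minutes=10, now=None):
    """True if the heartbeat file is missing or was last touched too long ago."""
    now = time.time() if now is None else now
    try:
        last_beat = os.path.getmtime(heartbeat_path)
    except FileNotFoundError:
        # No heartbeat at all is treated the same as a stale one.
        return True
    return (now - last_beat) > max_age_minutes * 60
```

A watchdog loop would call this once per minute or so and trigger the notification, or a custom alert, the first time it returns `True`.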
The watchdog catches crashes (no heartbeat). But what about loops? Claude retries the same failing test 50 times — the heartbeat looks healthy, but no progress is made. The Stop hook now tracks completed steps between checks. If 3 consecutive checks show no new checkmarks in the runbook, a STALL WARNING is injected into Claude's context.
The warning tells Claude to check its error budget and skip the step if allowed. This catches the most expensive failure mode in autonomous runs: burning hours of API credits on a loop that a human would have interrupted after 5 minutes.
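The "3 checks without progress" rule can be expressed as a small counter. This is a sketch under the assumption that progress is measured as the number of checked boxes in the runbook and that the first check only establishes a baseline; the class name `StallDetector` is hypothetical.

```python
class StallDetector:
    """Flags a stall after N consecutive checks without new checkmarks."""

    def __init__(self, max_idle_checks=3):
        self.max_idle_checks = max_idle_checks
        self.last_count = None   # checked-box count seen at the previous check
        self.idle_checks = 0     # consecutive checks without progress

    def check(self, checked_count):
        """Record one check; return True when a stall warning is due."""
        if self.last_count is not None and checked_count <= self.last_count:
            self.idle_checks += 1
        else:
            self.idle_checks = 0
        self.last_count = checked_count
        return self.idle_checks >= self.max_idle_checks


detector = StallDetector()
# Two steps done, then three checks with no change, then progress resumes.
results = [detector.check(n) for n in [2, 2, 2, 2, 3]]
print(results)  # [False, False, False, True, False]
```

The counter resets as soon as a new checkmark appears, so a slow but advancing run is never flagged.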
Autonomous runs are ephemeral — Claude starts fresh each time. But architecture decisions made in one run should inform the next. If Tuesday's run chose PostgreSQL over SQLite, Wednesday's run should know that.
Both skills now maintain a decisions.md file. Claude reads it at the start of every run and appends relevant decisions at the end. Format: date, decision, reasoning. Append-only, never overwrite. It's not learning — it's structured remembering.
Nightshift: decisions.md in the project root, persists across overnight runs. 24x7: decisions.md in the workspace root, shared across all tasks.
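An append-only log with date, decision, and reasoning needs little more than a file opened in append mode. The one-line entry layout below is an assumption made for illustration; the format the skills actually write may differ.

```python
from datetime import date


def append_decision(path, decision, reasoning, on=None):
    """Append one entry to the decisions file; existing content is never rewritten."""
    on = on or date.today()
    entry = f"- {on.isoformat()} | {decision} | {reasoning}\n"
    # Mode "a" only ever adds to the end of the file.
    with open(path, "a", encoding="utf-8") as f:
        f.write(entry)
    return entry
```

For example, `append_decision("decisions.md", "Use PostgreSQL over SQLite", "Need concurrent writers")` adds one dated line that the next run can read at startup.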
Everything you need to do, in order. Takes about 5 minutes to set up, then Claude runs for hours.
Download nightshift.skill and unzip it into Claude's skill directory: unzip nightshift.skill -d ~/.claude/skills/. This makes the skill available in Claude Code and Claude.ai. You only do this once.
Open Claude and say something like: "Set up a nightshift run for /my/project — refactor auth module to use JWT tokens". Claude detects the genre (refactoring), asks you to confirm, generates a runbook with concrete steps, validates it against 15 checks, and produces all setup files.
Claude generates a setup package. Copy the files into your project: runbook.md, .claude/settings.json, the shell scripts, and the sandbox profile. Append CLAUDE-nightshift.md to your existing CLAUDE.md.
This is your safety net. Run git add -A && git commit -m "Checkpoint before Nightshift". If anything goes wrong during the run, git checkout . restores every tracked file to this state (files newly created by the run remain until you delete them or run git clean -fd). Never skip this step.
For maximum safety: sandbox-exec -f nightshift-sandbox.sb ./nightshift-run.sh. For background operation (terminal can be closed): ./nightshift-run-bg.sh. The PID lock prevents accidental double starts. A cost warning is displayed — this run uses API credits.
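A PID lock of the kind mentioned above typically relies on exclusive file creation. The sketch below shows the idea; the lock file name is up to the caller, the function names are hypothetical, and stale locks left behind by a crashed run are not handled here.

```python
import os


def acquire_pid_lock(lock_path):
    """Create the lock file atomically; return False if a run already holds it."""
    try:
        # O_EXCL makes creation fail if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))
    return True


def release_pid_lock(lock_path):
    """Remove the lock file; missing files are ignored."""
    try:
        os.remove(lock_path)
    except FileNotFoundError:
        pass
```

A start script would call `acquire_pid_lock` first, exit with a message if it returns `False`, and release the lock when the run ends.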
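Look at the git log: git log --oneline -5. Check what changed: git diff HEAD~1. Open runbook.md to see which steps were completed [x]. If something went wrong, git checkout . discards uncommitted changes; to also drop commits made by the run, reset to the checkpoint with git reset --hard <checkpoint-hash>.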
Set it up once, then drop tasks as folders whenever you need something done.
Install the skill: unzip 24x7.skill -d ~/.claude/skills/. Then tell Claude: "Set up a 24x7 runner at /my/workspace with idle behavior cleanup". Claude generates the workspace structure with runner, watchdog, hooks, and sandbox.
Copy the generated files to your workspace, make scripts executable, and start: ./runner-bg.sh. The runner now polls the inbox every 30 seconds. Start the watchdog in a second terminal: ./watchdog.sh
Create a folder in inbox/ with a task.md describing the assignment and optionally a materials/ folder with input files. The runner detects it automatically, moves it to working/, spawns a fresh Claude session, and routes the result to outbox/ or failed/.
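One pass of that queue logic can be sketched as follows. The generated runner is a bash script; this Python version only illustrates the folder flow. The worker is passed in as a callable (in a real setup it would start a fresh `claude -p` session), and the function name `process_next_task` is hypothetical.

```python
import shutil
from pathlib import Path


def process_next_task(workspace, run_worker):
    """Move the next inbox task through working/ to outbox/ or failed/.

    run_worker(task_dir) should return True on success. Returns the final
    task location, or None if the inbox holds no task folder.
    """
    ws = Path(workspace)
    for name in ("inbox", "working", "outbox", "failed"):
        (ws / name).mkdir(parents=True, exist_ok=True)

    # A task is any inbox folder that contains a task.md; oldest name first.
    candidates = sorted(
        p for p in (ws / "inbox").iterdir()
        if p.is_dir() and (p / "task.md").is_file()
    )
    if not candidates:
        return None

    task_dir = ws / "working" / candidates[0].name
    shutil.move(str(candidates[0]), str(task_dir))
    try:
        ok = bool(run_worker(task_dir))
    except Exception:
        ok = False  # a crashing worker counts as a failed task

    dest = ws / ("outbox" if ok else "failed") / task_dir.name
    shutil.move(str(task_dir), str(dest))
    return dest
```

The daemon is then just a loop that calls this function and sleeps for the polling interval whenever it returns `None`.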
Check outbox/your-task/output/ for the deliverables. Read outbox/your-task/log.md for what Claude did. Failed tasks land in failed/ with error information in their log.md.
A single Claude session running for hours suffers from "context rot" — Claude becomes increasingly unreliable as context accumulates and gets compressed. The 24x7 runner avoids this entirely: the bash loop is the daemon (runs forever), Claude is the worker (fresh session per task, exits cleanly). Every task gets Claude at full quality.
Autonomous Claude Code is powerful but not magic. Here's what you need to know before you start.
Both skills run Claude Code in headless mode. Every tool call, every file read, every response consumes API credits. A Nightshift run might cost $5–50 depending on complexity. A 24x7 runner generates continuous costs. Monitor your usage at console.anthropic.com. For 24x7, set idle to "sleep" if cost is a concern — this prevents Claude from burning credits when no tasks are waiting.
The generated sandbox profile is just a file. You must explicitly start with sandbox-exec -f sandbox.sb ./run.sh. Without it, Claude has full access to your entire user account. The PreToolUse hook catches obvious destructive commands via pattern matching, but it's not a real security boundary — the sandbox is. On Linux, use Docker or a dedicated user account instead of sandbox-exec.
Vague steps produce vague results. "Implement auth" can mean anything — Claude will guess, and in headless mode, nobody corrects the guess. Good steps include file paths, function names, or shell commands. The 15-point validation catches the worst offenders, but you should review the runbook before starting. If a runbook needs more than 20 steps, split it into multiple runs.
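A rough self-check along these lines can be run on a runbook before starting. The heuristics below (inline code, something path-like, or a `name()` reference counts as concrete) are assumptions chosen for illustration and are not the skill's 15-point validation; they will misjudge some steps in both directions.

```python
import re

# A step counts as "concrete" if it mentions at least one of these.
CONCRETE_HINTS = [
    re.compile(r"`[^`]+`"),             # inline code or shell command
    re.compile(r"\b[\w.-]+/[\w./-]+"),  # path-like token, e.g. src/auth/jwt.py
    re.compile(r"\b\w+\(\)"),           # function reference, e.g. encode_token()
]


def vague_steps(steps):
    """Return the steps that contain no file path, function name, or command."""
    return [s for s in steps if not any(p.search(s) for p in CONCRETE_HINTS)]


def review_runbook(steps, max_steps=20):
    """Collect human-readable warnings for a list of runbook steps."""
    warnings = [f"Vague step: {s!r}" for s in vague_steps(steps)]
    if len(steps) > max_steps:
        warnings.append(
            f"{len(steps)} steps exceed {max_steps}; consider splitting the run"
        )
    return warnings


steps = [
    "Implement auth",
    "Add encode_token() to src/auth/jwt.py",
    "Run `pytest tests/auth`",
]
print(review_runbook(steps))
```

Anything this flags is worth rewriting by hand before an unattended run.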
Without a clean git state, you have no rollback. The runner warns you about uncommitted changes (non-blocking), but it's your responsibility to commit first. git checkout . is your emergency brake — it only works if you committed before the run.
sandbox-exec is a macOS-only feature (deprecated but functional). On Linux, run Claude inside a Docker container with mounted volumes, or create a dedicated user account with limited permissions. The hooks and watchdog work on any platform — only the kernel-level sandbox is macOS-specific.
Nightshift handles compression with two mechanisms: the SessionStart hook re-injects the plan after /compact, and the Stop hook repeats constraints every 5 steps. This works well for 10–20 step runbooks. For very long tasks (30+ steps), quality degrades even with these mechanisms. Split into multiple runs instead. The 24x7 skill avoids this entirely with fresh sessions per task.
The claude command must be installed and authenticated. These skills generate setups that use claude -p (headless mode).
The runner scripts are bash. The build script is Python. Hooks use jq to parse JSON. All three must be in your PATH.
macOS is required only for sandbox-exec kernel-level isolation. The skills also work on Linux without the sandbox; use Docker or a dedicated user instead.
Nightshift commits results to your repo. Without git, there's no rollback. For 24x7, any directory works.