Riding the Sandworm
Hunting Shai-Hulud / Miasma - adversary actions & a field guide to finding droppers, workers and forged commits
Actor: TeamPCP (~UNC6780) · two strains, evolving · npm + GitHub · static-only methodology · TLP:AMBER
1. What you are hunting
Shai-Hulud (a.k.a. Miasma / “The Second Coming”), actor TeamPCP (~UNC6780), is a self-propagating supply-chain worm. It executes on a developer machine, harvests every credential it can reach, and re-uses those credentials to (1) publish poisoned package versions, (2) inject an obfuscated dropper into source repositories, and (3) stage stolen secrets in throw-away “dead-drop” repos. It exists in two strains that are actively evolving — treat it as a family, not a fixed IOC set:
“classic” Shai-Hulud (worker) — a throw-away repo running an obfuscated script via a GitHub Actions workflow. Loud: persona author “THE ASSET”, real timestamps, no identity forgery.
Miasma (injection) — an oversized .github/setup.js committed into an existing repo using a stolen maintainer token. Stealthy: the commit impersonates a real maintainer and is back-dated years to hide in history.
The single most useful fact for a hunter: the commit author / e-mail / date are attacker-controlled (the “author-pusher gap”). NEVER sequence or scope by commit date. Anchor on the server-side push time and on what cannot be forged: a GPG signature, a file’s size, a workflow’s contents.
2. The adversary’s actions, in detail
2.1 The loop
Infection closes a loop on every newly-compromised machine: EXECUTE (auto-run on install or editor/CI trigger) → HARVEST (env vars, ~/.npmrc, cloud metadata 169.254.169.254, git/npm tokens) → PUBLISH / INJECT / STAGE. Auto-run is wired through npm pre/postinstall hooks, package.json scripts, agent/editor triggers (.claude/.gemini SessionStart, .cursor rules, .vscode folderOpen) and GitHub Actions workflows.
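As an illustration of checking the npm side of that auto-run wiring without executing anything, the sketch below reads a package.json as text and lists the lifecycle scripts that npm runs automatically on install. The function name and the sample manifest are hypothetical; the hook list covers the pre/postinstall hooks named above plus npm's plain install hook, and is not a complete inventory of the worm's triggers.

```python
import json

# npm lifecycle scripts that run automatically during `npm install`.
AUTO_RUN_HOOKS = ("preinstall", "install", "postinstall")


def auto_run_scripts(package_json_text: str) -> dict:
    """Return the auto-run lifecycle scripts declared in a package.json string.

    Purely static: the manifest is parsed as JSON and nothing is executed.
    """
    try:
        manifest = json.loads(package_json_text)
    except json.JSONDecodeError:
        return {}
    scripts = manifest.get("scripts") if isinstance(manifest, dict) else None
    if not isinstance(scripts, dict):
        return {}
    return {
        name: cmd
        for name, cmd in scripts.items()
        if name in AUTO_RUN_HOOKS and isinstance(cmd, str)
    }


# Hypothetical manifest used only for illustration.
sample = '{"name": "demo", "scripts": {"test": "jest", "postinstall": "node setup.js"}}'
print(auto_run_scripts(sample))  # {'postinstall': 'node setup.js'}
```

A non-empty result is only a lead for manual review: many legitimate packages declare install hooks, so the command each hook runs still has to be read.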
2.2 Two dropper strains
The size gap matters operationally: a detector tuned to “~5 MB” misses the ~685 KB worker entirely.
2.3 The worker delivery (classic)
A throw-away repo is created and an obfuscated script (any name or extension; JS or Python) is committed to the root or a dotdir, plus a workflow that runs it (shown from a static read; never run):
# .github/workflows/run.yml
on: workflow_dispatch
jobs:
  run:
    runs-on: ubuntu-latest
    permissions:
      issues: write
    steps:
      - uses: actions/checkout@v6
      - uses: oven-sh/setup-bun@v2
      - env:
          GITHUB_TOKEN2: ${{ secrets.PATS }}
        run: bun run index.js
Signature: oven-sh/setup-bun + bun run <script> + secrets.PATS / GITHUB_TOKEN2. The script is the dropper, regardless of name.
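A minimal sketch of matching that signature against workflow text that has already been retrieved (how it was retrieved is out of scope here). The function name, the returned keys and the sample text are illustrative assumptions. The check is a plain substring and regex match, so it never parses or runs the workflow.

```python
import re

CREDENTIAL_MARKERS = ("secrets.PATS", "GITHUB_TOKEN2")
BUN_SETUP = "oven-sh/setup-bun"
BUN_RUN = re.compile(r"\bbun\s+run\s+(\S+)")


def worker_workflow_signals(workflow_text: str) -> dict:
    """Report which parts of the worker signature appear in a workflow file."""
    markers = [m for m in CREDENTIAL_MARKERS if m in workflow_text]
    match = BUN_RUN.search(workflow_text)
    return {
        "credential_markers": markers,
        "uses_setup_bun": BUN_SETUP in workflow_text,
        # The script named here is the file to size-check as the dropper.
        "bun_script": match.group(1) if match else None,
        # Flag on the credential markers alone; the bun steps are supporting evidence.
        "flag": bool(markers),
    }


# Workflow text reproduced from the static read; used here as test data only.
sample_workflow = """
on: workflow_dispatch
jobs:
  run:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v6
      - uses: oven-sh/setup-bun@v2
      - env:
          GITHUB_TOKEN2: ${{ secrets.PATS }}
        run: bun run index.js
"""
signals = worker_workflow_signals(sample_workflow)
print(signals["flag"], signals["bun_script"])  # True index.js
```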
2.4 The injection delivery + commit-identity forgery (Miasma)
With a stolen maintainer token the worm commits the ~5 MB .github/setup.js into an existing repo and forges the commit to hide it:
Author spoofing: author/e-mail set to a real maintainer of that repo (sometimes a different person than the pusher).
Back-dating rule: the malicious commit clones the current HEAD commit’s exact timestamp (0-day gap from its parent). When HEAD is a PR-merge it replays the real “Merge pull request #N” text and the PR’s exact merge time; otherwise a plausible dev message + [skip ci]. Stale repos → inherited dates are years old.
Unsigned: signature verification is the one field the attacker cannot forge. Forged commits are verified=false while the real maintainer signs.
Measured across 1,311 injection commits (anchored on real push time): 93% unsigned, 58% author-spoofed, 75% back-dated (median ~2.8 yr, max ~12.6 yr), 54% [skip ci], 0% future-dated. The malicious commit clones an existing old commit’s timestamp, so the forged date is always in the past.
2.5 Dead-drop exfiltration
Harvested secrets are written to disposable repos on the victim’s own account (markers: “Sha1-Hulud: The Second Coming”, “A Mini Shai-Hulud has Appeared”). The dead-drops are exfil-only: in the historical firehose the dead-drop owners made zero cross-account actions. Expect hundreds of drops per victim (≈48× loop inflation); dedupe by owner for a real victim count.
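The owner-level dedupe can be sketched as below, assuming the dead-drop roster is available as a list of "owner/repo" strings. The function name and the sample repo names are hypothetical.

```python
from collections import Counter


def victims_from_drops(repo_full_names):
    """Collapse 'owner/repo' dead-drop names into per-owner drop counts."""
    owners = Counter()
    for full_name in repo_full_names:
        owner, sep, _repo = full_name.partition("/")
        if sep and owner:
            owners[owner.lower()] += 1  # GitHub logins are case-insensitive
    return owners


# Hypothetical roster: four drops that belong to two accounts.
drops = ["alice/drop-1", "alice/drop-2", "Alice/drop-3", "bob/drop-1"]
victims = victims_from_drops(drops)
print(len(drops), "drops ->", len(victims), "victim accounts")  # 4 drops -> 2 victim accounts
```

Counting owners rather than repos is what removes the loop inflation: the per-owner counts show how often the loop re-ran, while the number of keys is the victim count.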
3. Strain evolution & campaign timeline
The forgery is a Miasma adaptation, not present in the classic worker. The worker uses fresh throw-away repos with no history to blend into, so it forges nothing. Miasma injects into existing repos with real maintainers and history, so it added impersonation + multi-year back-dating to disappear into that history.
Trajectory: loud, disposable workers → stealthy, history-blending injection. The tradecraft is evolving; expect later waves to carry more sophisticated evasion. Hunt by behaviour, not by today’s IOCs.
Figure 1 · Strain & infection timeline: the two waves of dead-drops, the Miasma injection window, and the two documented evolution points (identity forgery; marker mutation).
4. Hunting playbook
4.1 Two rules before anything else
Never execute a sample. Everything below is static: tree metadata (paths + sizes), commit metadata, workflow YAML. The ingested dropper blob is never fetched or run, so the hunt cannot infect the hunter.
Anchor on the un-forgeable. Server-side push time (Events/GH Archive), GPG verification, file size, workflow contents. Ignore author.date for sequencing.
4.2 Source matrix - pick the right reach
4.3 Step-by-step hunt
Seed. Pull OSV MAL-2026-* (poisoned packages) and the community dead-drop marker lists.
Dedupe dead-drops by owner. Hundreds of drops collapse to a handful of victim accounts; that is your compromised-user roster.
Enumerate each victim’s repos and size-scan them. This finds BOTH strains and recovers repos the name/marker rosters missed.
Confirm the strain. Miasma = oversized .github/setup.js (≥3 MB). Worker = root/dotdir script ≥200 KB + a workflow with the secrets.PATS / GITHUB_TOKEN2 signature.
For injections, audit the forgery. Pull the repo’s pushes from GH Archive at the real push hours; for each commit compute push_time − author.date, check verified and [skip ci].
Pivot to blast radius. Map each push-actor to their employer/orgs (profile + memberships); org-owned repos they can write to are the next-hop risk.
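The size-scan and strain-confirmation steps can be sketched over tree metadata alone. The example assumes each entry is a dict with path, type and size keys, the shape returned by a recursive Git tree listing, and that the listing has already been fetched. The function names, the "suspect" label, the decimal reading of MB/KB and the sample entries are assumptions for illustration.

```python
SCRIPT_EXTS = (".js", ".cjs", ".mjs", ".ts", ".py", ".sh")
# Dot-prefixed locations that legitimately hold large files (exclusion list from 4.4).
EXCLUDED_PREFIXES = (".yarn/", ".pnp", ".next/", ".nuxt/", ".output/")
MIASMA_MIN_BYTES = 3_000_000   # ">= 3 MB", read as decimal megabytes (assumption)
WORKER_MIN_BYTES = 200_000     # ">= 200 KB"


def in_root_or_dotdir(path):
    """True for files in the repo root or under any top-level dot-directory."""
    return "/" not in path or path.split("/", 1)[0].startswith(".")


def classify_tree(entries, has_worker_workflow=False):
    """Return (path, verdict) pairs for oversized scripts in the root or a dotdir."""
    verdicts = []
    for entry in entries:
        if entry.get("type") != "blob":
            continue
        path, size = entry["path"], entry.get("size", 0)
        if not path.lower().endswith(SCRIPT_EXTS):
            continue
        if not in_root_or_dotdir(path) or path.startswith(EXCLUDED_PREFIXES):
            continue
        if path == ".github/setup.js" and size >= MIASMA_MIN_BYTES:
            verdicts.append((path, "miasma"))
        elif size >= WORKER_MIN_BYTES:
            # Without the workflow signature the file is only a lead, not a verdict.
            verdicts.append((path, "worker" if has_worker_workflow else "suspect"))
    return verdicts


# Hypothetical tree entries for illustration.
tree = [
    {"path": ".github/setup.js", "type": "blob", "size": 5_100_000},
    {"path": "index.js", "type": "blob", "size": 685_000},
    {"path": "dist/bundle.js", "type": "blob", "size": 4_000_000},
    {"path": ".yarn/releases/yarn.cjs", "type": "blob", "size": 2_500_000},
    {"path": "src", "type": "tree"},
]
print(classify_tree(tree, has_worker_workflow=True))
```

Paths such as dist/ or node_modules/ need no explicit exclusion in this sketch because they are neither in the root nor in a dot-directory.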
4.4 The detection heuristics (copy these)
DROPPER (both strains, name+size-agnostic) — from the repo tree:
flag a blob where ext in {.js .cjs .mjs .ts .py .sh}
AND size >= 200_000 # worker ~685KB ; Miasma ~5MB
AND ( path has no '/' # repo ROOT
OR first_segment.startswith('.') ) # any DOTDIR
AND not path matches dist/|public/|libs/|node_modules/
|.yarn/|.pnp|.next/|.nuxt/|.output/ # legit big files
WORKER workflow — from .github/workflows/*.yml (static):
contains 'secrets.PATS' OR 'GITHUB_TOKEN2' # credential exfil (unambiguous)
often with oven-sh/setup-bun + 'bun run <script>'
FORGERY (Miasma) — per commit, via Git Data API + GH Archive push time:
backdated = (push_time - author.date) > 7 days # often years in the past
spoofed = commit author.login != real pusher
unsigned = verification.verified == false AND maintainer normally signs
ci_evasion = '[skip ci]' in message
Location does double duty: legitimate oversized files (Monaco, Vite/Angular bundles, Storybook) live under dist/, public/ or libs/, never in the root or a dotdir, so the root-or-dotdir rule catches both droppers and excludes those false positives. The few legitimate dot-prefixed locations (.yarn/, .pnp, .next/, .nuxt/, .output/) are handled by the explicit exclusion list.
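The FORGERY checks can be sketched as a per-commit function. It assumes the commit metadata and the real push time have already been collected and flattened into the hypothetical dict shape shown; the key names, function names and sample values are illustrative, and the 7-day threshold is the one from the heuristic above.

```python
from datetime import datetime, timedelta

BACKDATE_THRESHOLD = timedelta(days=7)


def parse_ts(value):
    """Parse an ISO-8601 timestamp such as '2026-01-15T10:00:00Z'."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def forgery_flags(commit, push_time, pusher_login, maintainer_signs=True):
    """Evaluate the four forgery signals for one commit.

    `commit` is a hypothetical flattened record with the keys
    author_date, author_login, verified and message.
    """
    gap = parse_ts(push_time) - parse_ts(commit["author_date"])
    author = commit.get("author_login")
    return {
        "backdated": gap > BACKDATE_THRESHOLD,
        # None means the author e-mail did not resolve to a login: undetermined.
        "spoofed": None if author is None else author.lower() != pusher_login.lower(),
        "unsigned": (not commit["verified"]) and maintainer_signs,
        "ci_evasion": "[skip ci]" in commit["message"],
        "gap_days": gap.days,
    }


# Hypothetical injection commit: authored "in 2023", actually pushed in 2026.
forged = {
    "author_date": "2023-03-01T12:00:00Z",
    "author_login": "real-maintainer",
    "verified": False,
    "message": "chore: update setup [skip ci]",
}
flags = forgery_flags(forged, "2026-01-15T10:00:00Z", pusher_login="other-account")
print(flags)
```

No single flag is conclusive. Per the measured rates, a commit can be forged without being author-spoofed or carrying [skip ci], so the flags are better read together than as independent verdicts.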
4.5 Pitfalls that produce wrong verdicts
Name-scoping. The dropper is not always setup.js; match on size + location, and cover both JS and Python.
Date-windowed commit search. --since filters skip back-dated commits by construction; fetch by SHA.
“Recent commits” views. Back-dating buries the dropper below the fold; use the tree, not git log.
Remediated ≠ never-infected. A reverted HEAD hides the injection; check surviving commit objects by SHA.
The 300-event API cap. Active maintainers are undercounted; use GH Archive for completeness.
5. Observed scope & the companion data
Red Hat: messaging maintainers compromised (accounts REDACTED); Miasma injections into rh-messaging/* and jboss-container-images/* (HEADs remediated).
SAP: a compromised build-bot account (REDACTED) is a public SAP org member; dead-drop victims skew to SAP CAP/UI5/Fiori (poisoned @cap-js/*).
Worker fleets under several compromised accounts (REDACTED), using testing-* / experiments* throwaway repos; crypto/Web3 repos carry live droppers.
GitHub-wide discovery (marker + worker-credential + Dune-naming search) added ~164 compromised accounts and >1,200 repos beyond the original roster. Treat this as a FLOOR, since code search under-indexes the small/ephemeral repos this campaign favours.




