Signed-malware supply-chain response — the CISO playbook

Short definition

Operational playbook for the first 72 hours after a package you trusted — and that carried a valid signature — is reported compromised. Decision gates, credential rotation order, evidence list, regulator notification triggers.

Why this matters now

Signed packages have been the trust anchor of the modern software supply chain for five years; in 2025-2026 that anchor cracked. Shai-Hulud, Mini Shai-Hulud, and the May 2026 Tanstack incident all shipped malicious code under valid Sigstore signatures. The defensive question is no longer "is the signature valid" but "what do I do in the next 72 hours when the signature was valid and the code was not".

Key points

▸ Signature validity is no longer sufficient evidence of package safety: a valid signature proves provenance, not intent.
▸ The clock starts at the first credible third-party report, not at internal confirmation. Most organisations lose 6-12 hours to "let us verify" delay.
▸ Credential rotation order matters: OIDC tokens issued during the compromise window first, then GitHub PATs in workspaces that ran installs, then long-lived cloud keys in CI, then deploy keys.
▸ CI runner workspaces are evidence; do not delete them until provenance is captured. Most teams blow this in the first hour.
▸ If the malware exfiltrated customer data, NIS2 Art. 23 and GDPR Art. 33 fire concurrently. The same evidence base feeds both.
▸ Behavioural traffic detection catches signed-but-malicious egress regardless of signature status. Signature checks are necessary but no longer sufficient.

When this playbook fires

Use this playbook when any of the following becomes true:

A package you consume (npm, PyPI, RubyGems, Maven, crates.io, Go module, container image, GitHub Action) is reported compromised in a GitHub Security Advisory and the affected version range overlaps a version you installed in the last 30 days.
A vendor advisory or named-incident report (e.g. the May 2026 Tanstack incident GHSA-g7cv-rxg3-hmpx) names a package or maintainer account you depend on.
A signature on a package you trust is reissued from an unexpected identity or build pipeline, even before exploitation is reported — Sigstore Rekor transparency-log anomalies count.
A CI/CD secret you own appears in a public exposure dump, paste site, or stealer log, and the path of exposure points back to a package install step.

Do NOT use this playbook for: classical CVEs in your own first-party code (use your standard vulnerability-management workflow), credential-stuffing or account-takeover incidents without a supply-chain vector, or zero-day exploits against signed binaries that did not pass through your build system. Those have different containment trees.
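
The first trigger condition (an affected version range that overlaps something you installed in the last 30 days) can be sketched as a small check. This is a minimal illustration, not a complete version resolver: the `Install` record, the `playbook_fires` helper, and the package name are hypothetical, and `parse_version` assumes plain dotted numeric versions with no pre-release tags.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


def parse_version(text: str) -> tuple[int, ...]:
    """Parse a plain dotted version such as '1.4.2' (assumption: no pre-release tags)."""
    return tuple(int(part) for part in text.split("."))


@dataclass(frozen=True)
class Install:
    """Hypothetical record of one install, e.g. reconstructed from CI logs."""
    package: str
    version: str
    installed_at: datetime


def playbook_fires(install: Install, package: str, first_bad: str, last_bad: str,
                   now: datetime, lookback_days: int = 30) -> bool:
    """True if the install falls inside the affected range and the lookback window."""
    if install.package != package:
        return False
    in_range = (parse_version(first_bad)
                <= parse_version(install.version)
                <= parse_version(last_bad))
    recent = now - install.installed_at <= timedelta(days=lookback_days)
    return in_range and recent


if __name__ == "__main__":
    now = datetime(2026, 5, 20, tzinfo=timezone.utc)
    seen = Install("example-pkg", "1.4.2", datetime(2026, 5, 10, tzinfo=timezone.utc))
    print(playbook_fires(seen, "example-pkg", "1.4.0", "1.4.10", now))
```

Comparing versions as integer tuples avoids the string-comparison trap where "1.4.10" sorts before "1.4.2". Real ecosystems have richer version grammars, so a production check would use the ecosystem's own version library.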

The clock — start condition and gates

The countdown starts at first credible third-party report, not at internal verification. A "credible third-party report" is: a GitHub Security Advisory, a CVE assigned by a CNA, a vendor security bulletin, or an incident-tracking post from a recognised researcher with reproducible IoCs.

The gates:

Hour 0 → Hour 1: triage, freeze, scope. You must know which projects pulled the package and which CI runs executed an install during the compromise window.
Hour 1 → Hour 4: credential containment. Every secret that touched a compromised runner is presumed exfiltrated. Rotate, do not "monitor".
Hour 4 → Hour 24: eradication and rebuild. Pin to last-known-good, replace credentials in production, rebuild containers from a clean base.
Hour 24 → Hour 72: regulator notification (if data was exfiltrated), customer notification (if a downstream artefact was shipped), upstream coordination (advisory, PR to fix).

The deadlines are not negotiable. Waiting for "alignment" with the affected vendor before starting your own rotation is the most common single failure mode in this scenario.
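
The four gates above are fixed offsets from the first credible report, so the schedule can be computed the moment the report lands. The sketch below assumes a single UTC timestamp for that report; the names `GATES`, `gate_schedule`, and `current_gate` are illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Gate boundaries in hours after the first credible third-party report,
# taken from the four phases of this playbook.
GATES = [
    ("triage, freeze, scope", 0, 1),
    ("credential containment", 1, 4),
    ("eradication and rebuild", 4, 24),
    ("disclosure and notification", 24, 72),
]


def gate_schedule(first_report: datetime):
    """Return (name, start, end) for every gate, anchored at the first report."""
    return [
        (name, first_report + timedelta(hours=start), first_report + timedelta(hours=end))
        for name, start, end in GATES
    ]


def current_gate(first_report: datetime, now: datetime):
    """Return the name of the gate that 'now' falls in, or None outside the 72 hours."""
    for name, start, end in gate_schedule(first_report):
        if start <= now < end:
            return name
    return None


if __name__ == "__main__":
    t0 = datetime(2026, 5, 11, 9, 0, tzinfo=timezone.utc)
    for name, start, end in gate_schedule(t0):
        print(f"{name}: {start.isoformat()} to {end.isoformat()}")
```

Anchoring everything to the report timestamp, rather than to internal confirmation, keeps the "let us verify" delay from silently shifting every later deadline.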

Hour 0–1 — triage, freeze, scope

Goal: stop the bleeding and know the blast radius.

Triage actions (parallelisable, assign owners):

Confirm the report against the canonical source (CISA KEV catalog, GitHub Security Advisories, the advisory ID from the vendor). Do not act on a single tweet without a corroborating advisory.
Identify the affected package + version range + ecosystem.
Determine the compromise window (when the malicious version went live, when it was pulled).

Freeze actions:

Block the affected version range in your private package mirror (Artifactory, Nexus, GitHub Packages) — version-pin to last-known-good in a registry-side allowlist.
Pause autoinstall in CI: disable any pipeline that runs npm install, pip install, bundle install, or equivalent against the public registry until the mirror block is in place.
Halt deploys to production for services that consumed the affected package in the last 30 days.

Scope actions:

Grep your lockfiles (package-lock.json, pnpm-lock.yaml, poetry.lock, Gemfile.lock, go.sum) across every repository for the affected package name and the compromised version.
Pull CI run logs for the compromise window and identify which runners executed install against the compromised version.
For container images: identify which images were built from a base that pulled the package, and which environments are running them. SBOMs make this a single query; without SBOMs this is a multi-hour manual reconciliation.
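
As one illustration of the lockfile scan, the sketch below walks a directory tree for `package-lock.json` files and reports entries that match the affected package and versions. It assumes npm lockfiles in the `lockfileVersion` 2 or 3 layout, where a top-level `packages` object is keyed by install path; other ecosystems need their own parsers. The function names and the example package are hypothetical.

```python
import json
from pathlib import Path


def find_compromised(lock_text: str, package: str, bad_versions: set[str]):
    """Return (install_path, version) pairs for the affected package in one lockfile."""
    lock = json.loads(lock_text)
    hits = []
    for path, meta in lock.get("packages", {}).items():
        # Keys look like "node_modules/a/node_modules/b"; the package name is
        # whatever follows the last "node_modules/" segment.
        name = path.rpartition("node_modules/")[2]
        if name == package and meta.get("version") in bad_versions:
            hits.append((path, meta["version"]))
    return hits


def scan_repos(root: str, package: str, bad_versions: set[str]):
    """Scan every package-lock.json under root; map lockfile path to its hits."""
    results = {}
    for lockfile in Path(root).rglob("package-lock.json"):
        hits = find_compromised(lockfile.read_text(encoding="utf-8"), package, bad_versions)
        if hits:
            results[str(lockfile)] = hits
    return results


if __name__ == "__main__":
    # Hypothetical package name and versions, for illustration only.
    for lockfile, hits in scan_repos(".", "example-pkg", {"2.0.1", "2.0.2"}).items():
        print(lockfile, hits)
```

Matching on the install path catches transitive copies nested under another dependency, which a plain text search for the top-level dependency name in `package.json` would miss.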

Hour 1–4 — credential containment (rotation order)

Every secret that was present in the memory space of a CI runner during a compromised install is presumed exfiltrated. Treat presumption as fact.

Rotation order (priority high → low):

OIDC tokens issued to compromised runners during the install. These are short-lived but were almost certainly exchanged for longer-lived cloud credentials inside the malicious code. Revoke the cloud session tokens issued during the compromise window on every cloud provider you use (npm provenance attestations document the OIDC issuer relationship).
GitHub Personal Access Tokens that were exposed in workspace env vars during the install — both organisation-scoped and user-scoped. Audit recent token use via GitHub audit log for the window.
Long-lived cloud keys (AWS access keys, GCP service account keys, Azure SP secrets) that were in CI secrets when the runner ran the install. Even if you use OIDC primarily, any fallback long-lived key is presumed leaked.
Deploy keys, and any SSH keys reachable through agent forwarding, that touched a workspace running the install.
Package registry tokens (npm publish tokens, PyPI API tokens, GitHub Packages tokens, container registry push credentials). Worm-style malware re-uses these to publish itself further — rotation here prevents you from becoming the next downstream vector.
Third-party SaaS API keys in CI (Datadog, Sentry, Slack, Snyk). These are not directly weaponisable in the build but appear on stealer markets.

Document the rotation in a single timeline log with timestamps and approver per item. This log is evidence for both upstream advisories and downstream customer notifications.
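
The rotation order and the timeline log can be kept in one small structure so that the queue and the evidence come from the same data. This is a sketch under assumptions: the `kind` labels, the `Credential` record, and the log fields are hypothetical names chosen to mirror the six categories above, not a real inventory schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Priority follows the rotation order in this playbook (1 = rotate first).
ROTATION_PRIORITY = {
    "oidc_session": 1,
    "github_pat": 2,
    "cloud_key": 3,
    "deploy_key": 4,
    "registry_token": 5,
    "saas_api_key": 6,
}


@dataclass(frozen=True)
class Credential:
    name: str
    kind: str  # must be one of the ROTATION_PRIORITY keys


def rotation_queue(credentials):
    """Order credentials by playbook priority; ties keep their input order."""
    return sorted(credentials, key=lambda cred: ROTATION_PRIORITY[cred.kind])


def log_rotation(log: list, cred: Credential, approver: str, when: datetime) -> None:
    """Append one timeline entry with timestamp and approver for a rotated credential."""
    log.append({
        "credential": cred.name,
        "kind": cred.kind,
        "approver": approver,
        "rotated_at": when.isoformat(),
    })


if __name__ == "__main__":
    inventory = [
        Credential("sentry-ci", "saas_api_key"),
        Credential("aws-ci-fallback", "cloud_key"),
        Credential("npm-publish", "registry_token"),
        Credential("aws-oidc-sessions", "oidc_session"),
    ]
    timeline = []
    for cred in rotation_queue(inventory):
        log_rotation(timeline, cred, "oncall-lead", datetime.now(timezone.utc))
    print([entry["credential"] for entry in timeline])
```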

Hour 4–24 — eradication, rebuild, hunt

Goal: get back to a clean state, then look for what the malware did before you contained it.

Eradication:

Pin every project to the last-known-good version of the affected package. If no safe version exists in the same major, fork-and-fix or remove the dependency.
Force npm ci / pip install --no-cache-dir / docker build --no-cache on every affected pipeline; cached compromised artefacts are the most common reinfection vector at this stage.
Rebuild container images from clean bases. Do not patch in place; the malware may have modified node_modules or site-packages in the layered image.
Revoke and reissue any artefact that was signed or attested using a compromised credential.

Hunt (this is where most teams stop too early):

For each rotated credential, query upstream provider logs for use of that credential between the install timestamp and the rotation timestamp. Anything outside your normal pattern is presumed adversary activity.
Hunt egress from CI runners during the compromise window: never-before-seen domains, raw IP egress, DNS-over-HTTPS to unknown resolvers, bursts of HTTPS to consumer paste services. These patterns are typical of exfiltration payloads.
Hunt in production for indicators of executed payloads: cron jobs, systemd timers, scheduled tasks, Windows registry Run keys, unusual outbound traffic from production hosts that ran the affected version.
Cross-check your SBOM diff against the SLSA v1.2 build provenance for the affected releases — provenance mismatches reveal which release artefacts were rebuilt under the compromised pipeline.
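
Two of the egress signals above, never-before-seen domains and raw IP egress, are simple enough to sketch against a flat list of connection events. The event shape (`ts`, `host`), the baseline set, and the function names are assumptions for illustration; real telemetry would come from flow logs or a proxy and would need normalising first.

```python
import ipaddress
from datetime import datetime


def is_raw_ip(host: str) -> bool:
    """True if the destination is a literal IPv4/IPv6 address rather than a hostname."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def hunt_egress(events, baseline_hosts: set[str],
                window_start: datetime, window_end: datetime):
    """Flag destinations inside the window that are raw IPs or absent from the baseline."""
    findings = []
    for event in events:
        if not (window_start <= event["ts"] <= window_end):
            continue
        host = event["host"]
        if is_raw_ip(host):
            findings.append((host, "raw IP egress"))
        elif host not in baseline_hosts:
            findings.append((host, "never-before-seen domain"))
    return findings


if __name__ == "__main__":
    start, end = datetime(2026, 5, 11, 9, 0), datetime(2026, 5, 11, 12, 0)
    events = [
        {"ts": datetime(2026, 5, 11, 9, 30), "host": "registry.npmjs.org"},
        {"ts": datetime(2026, 5, 11, 9, 31), "host": "203.0.113.7"},
        {"ts": datetime(2026, 5, 11, 9, 32), "host": "paste.example"},
    ]
    for host, reason in hunt_egress(events, {"registry.npmjs.org"}, start, end):
        print(host, reason)
```

The baseline should be built from runner traffic before the compromise window; a baseline that includes the window itself would whitelist the exfiltration endpoint.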

Hour 24–72 — disclosure and regulator notification

Goal: close the loop with upstream, downstream, and regulators.

Upstream coordination:

If your team identified the issue independently, submit a coordinated GitHub Security Advisory via the affected repository's "Security" tab. Embargo until upstream confirms.
If a fix is available, contribute it via PR to the affected project. If maintainers are unresponsive, request advisory disclosure through GitHub or MITRE.

Downstream notification:

If you shipped a downstream artefact (a customer-facing build, a library you publish, a container image others pull) that contained the compromised dependency, notify your customers. Include: the compromise window, the affected versions of your artefact, the recommended remediation (downgrade or upgrade), and any IoCs to hunt in their own environment.

Regulator notification (EU operators):

NIS2 essential/important entities: this is a significant incident under NIS2 Art. 23 if it affects the continuity of an essential or important service, or if customer data was exfiltrated. Early warning within 24 hours of becoming aware, incident notification within 72 hours, final report within one month.
DORA financial entities: parallel notification of a major ICT-related incident under DORA Art. 19, with an initial notification within 4 hours of classifying the incident as major and an intermediate report within 72 hours.
GDPR: if personal data was exfiltrated from CI logs, build artefacts, or downstream production, Art. 33 notification to the supervisory authority within 72 hours of awareness.
Sector regulators (energy, healthcare, finance, public administration): check the specific sectoral notification table — some have shorter deadlines than NIS2.

Use the same evidence base for all notifications. Build it once during the hunt phase, export per regime.
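
Because the regimes run in parallel, it helps to compute every applicable deadline from one function and sort them. The sketch below is a simplification under stated assumptions: it measures every offset from a single awareness timestamp, whereas each regime defines its own clock start, and it covers only the three EU regimes and hour counts named above. The function name and labels are hypothetical, and this is not legal advice.

```python
from datetime import datetime, timedelta, timezone


def notification_deadlines(aware_at: datetime, *, nis2_entity: bool,
                           dora_entity: bool, personal_data: bool):
    """Return (label, deadline) pairs for the applicable regimes, earliest first."""
    deadlines = []
    if nis2_entity:
        deadlines.append(("NIS2 24-hour early warning", aware_at + timedelta(hours=24)))
        deadlines.append(("NIS2 72-hour notification", aware_at + timedelta(hours=72)))
    if dora_entity:
        deadlines.append(("DORA 4-hour notification", aware_at + timedelta(hours=4)))
        deadlines.append(("DORA 72-hour intermediate report", aware_at + timedelta(hours=72)))
    if personal_data:
        deadlines.append(("GDPR Art. 33 notification", aware_at + timedelta(hours=72)))
    # sorted() is stable, so regimes sharing a deadline keep their insertion order.
    return sorted(deadlines, key=lambda item: item[1])


if __name__ == "__main__":
    aware = datetime(2026, 5, 11, 9, 0, tzinfo=timezone.utc)
    for label, due in notification_deadlines(
            aware, nis2_entity=True, dora_entity=True, personal_data=True):
        print(f"{due.isoformat()}  {label}")
```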

Evidence checklist — what to preserve and present

Across the response, plan to have ready (each item signed at write time, timestamped, with chain-of-custody — manual after-the-fact assembly under regulator deadline is where teams fail):

Triage timeline log: timestamps of first report, first internal confirmation, freeze, rotation completion, eradication completion.
Lockfile snapshots: pre-incident and post-remediation, for every affected repository. Diff demonstrates which versions were pinned.
CI run logs for the compromise window across all affected pipelines, with original timestamps preserved.
SBOM diff: pre-incident vs post-remediation, for every shipped artefact.
Credential rotation log: each rotated credential, original creation time, rotation time, approver, scope of access.
Egress telemetry from CI runners and production hosts that ran the affected version, covering the compromise window plus 7 days before and 7 days after.
Provenance attestations: SLSA build attestations for affected releases, demonstrating which were rebuilt under the compromised pipeline (referenced via SLSA v1.2 specification).
Upstream coordination record: GHSA submissions, PR links, vendor email chain.
Customer notification log: who was notified, when, via which channel, with what content.
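
"Signed at write time, timestamped, with chain-of-custody" can be approximated with a hash-chained manifest in which each record carries the digest of the evidence, a timestamp, and a keyed MAC that also covers the previous record. The sketch below uses HMAC with a shared key as a stand-in for a real signature; that choice is an assumption made to keep the example standard-library only, and a production system would more likely use asymmetric signing and a trusted timestamp source. All function and field names are hypothetical.

```python
import hashlib
import hmac
import json


def append_evidence(chain: list, key: bytes, label: str,
                    content: bytes, timestamp: str) -> dict:
    """Append a record whose MAC covers the content hash, timestamp, and previous MAC."""
    record = {
        "label": label,
        "sha256": hashlib.sha256(content).hexdigest(),
        "timestamp": timestamp,
        "prev": chain[-1]["mac"] if chain else "",
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["mac"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    chain.append(record)
    return record


def verify_chain(chain: list, key: bytes) -> bool:
    """Recompute every MAC and check each record links to the one before it."""
    prev = ""
    for record in chain:
        body = {k: v for k, v in record.items() if k != "mac"}
        payload = json.dumps(body, sort_keys=True).encode()
        expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
        if record["prev"] != prev or not hmac.compare_digest(expected, record["mac"]):
            return False
        prev = record["mac"]
    return True


if __name__ == "__main__":
    key = b"demo-key-do-not-use"  # placeholder; keep real keys in a secrets manager
    chain = []
    append_evidence(chain, key, "ci-run-log", b"...log bytes...", "2026-05-11T09:00:00Z")
    append_evidence(chain, key, "lockfile-pre", b"{}", "2026-05-11T09:05:00Z")
    print(verify_chain(chain, key))
```

Chaining each record to the previous MAC means an item cannot be altered, removed, or reordered after the fact without breaking verification for everything that follows it.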

The operational lesson from peer organisations: pre-build this evidence chain with a continuous-evidence platform that signs each artefact at write time. Behavioural egress telemetry against the CI runners — captured by an on-prem appliance running AI Traffic Analysis with four parallel inference heads on the wire — produces the exfiltration timeline as a byproduct of normal operation, rather than as a forensic reconstruction under deadline. The signature check told you the package was authentic; the traffic told you what it did.

Common failure modes

1. "Let us verify before we act" delay. Teams routinely lose 6-12 hours to internal verification before starting credential rotation. The advisory was already verified by the issuer; your verification window is for blast-radius scoping, not for deciding whether to act. Start the rotation in parallel with the verification.

2. Deleting CI workspaces too early. Workspaces from compromised runs are forensic evidence. Snapshot them before any cleanup. Most organisations destroy this evidence in the first hour because automation aggressively reclaims runner storage.

3. Skipping the OIDC token revocation. Modern CI uses short-lived OIDC tokens that "expire on their own". They do, but the cloud credentials obtained by exchanging those tokens (for example, AWS STS sessions issued through AssumeRoleWithWebIdentity) stay valid for up to several hours and can be reused. Revoke the cloud sessions issued during the window on every provider, not just at the OIDC issuer.

4. Trusting SBOM coverage you do not have. Most SBOM tooling covers direct dependencies well, transitive dependencies poorly, container-base packages worse, and build-time-only tools not at all. Audit your SBOM coverage before the incident; relying on incomplete SBOMs during scoping leaves blind spots in exactly the places the malware may have used.

5. Notifying customers without IoCs. A customer notification that says "we used a compromised package but we are not sure what it did" generates more support load than it deflects. Always include IoCs the customer can hunt — domains, IPs, file hashes, process names, registry keys — so they can answer the question without coming back.

6. Treating it as a one-off. Worm-style supply-chain malware (Shai-Hulud family) is designed to re-publish itself through the credentials it steals. The May 2026 Tanstack incident reused stolen tokens from prior compromises. Assume your rotated credentials may have been used to publish to other registries before rotation — audit your own publish history during the compromise window.
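
The publish-history audit from failure mode 6 reduces to a filter: anything published under your tokens between the start of the compromise window and the rotation time that does not match a release you intended. The sketch assumes you have already exported publish events from each registry into dictionaries with `package`, `version`, and `published_at` fields; those field names and the function name are hypothetical.

```python
from datetime import datetime, timezone


def suspicious_publishes(publishes, window_start: datetime, rotated_at: datetime,
                         expected_releases: set[tuple[str, str]]):
    """Return publish events inside the exposure window that match no intended release."""
    return [
        event for event in publishes
        if window_start <= event["published_at"] <= rotated_at
        and (event["package"], event["version"]) not in expected_releases
    ]


if __name__ == "__main__":
    utc = timezone.utc
    history = [
        {"package": "our-lib", "version": "3.1.0",
         "published_at": datetime(2026, 5, 11, 8, 0, tzinfo=utc)},
        {"package": "our-lib", "version": "3.1.1",
         "published_at": datetime(2026, 5, 11, 10, 15, tzinfo=utc)},
    ]
    flagged = suspicious_publishes(
        history,
        window_start=datetime(2026, 5, 11, 9, 0, tzinfo=utc),
        rotated_at=datetime(2026, 5, 11, 13, 0, tzinfo=utc),
        expected_releases={("our-lib", "3.1.0")},
    )
    print([(e["package"], e["version"]) for e in flagged])
```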

Cross-regime notes

A signed-malware supply-chain incident typically triggers multiple notification regimes in parallel. Common combinations:

EU SaaS provider: NIS2 Art. 23 (if essential/important) + GDPR Art. 33 (if personal data) + DORA Art. 19 (if financial entity) + customer contractual notification (always). Same evidence base, four exports.
US federal contractor: CISA reporting (CIRCIA when in force) + sector-specific (FedRAMP, DFARS 252.204-7012) + state-level data-breach laws where personal data crossed jurisdictions.
UK regulated: NIS Regulations 2018 (if an operator of essential services or a relevant digital service provider) + UK GDPR (reported to the ICO) + sector-specific (e.g. FCA).
Multinational: assume that the highest-bar regime in your customer base is the bar. The same playbook satisfies all.

The operational reference for the EU-side cadence is the NIS2 Title 13 incident-timeline playbook. For the methodology angle on threat-led testing that exercises supply-chain scenarios proactively, see TLPT (Threat-Led Penetration Testing) — supply-chain compromise is a standard TIBER-EU scenario for finance entities. The defensive position the Zero Hunt engineering team takes — and the position NIST SP 800-204D codifies for federal supply chains — is that signature verification is necessary but not sufficient: behavioural traffic monitoring and continuous provenance attestation must run alongside.
