1 change: 1 addition & 0 deletions src/SUMMARY.md
@@ -153,6 +153,7 @@
- [User Namespace](linux-hardening/privilege-escalation/container-security/protections/namespaces/user-namespace.md)
- [UTS Namespace](linux-hardening/privilege-escalation/container-security/protections/namespaces/uts-namespace.md)
- [Escaping from Jails](linux-hardening/privilege-escalation/escaping-from-limited-bash.md)
- [Copy Fail Af Alg Splice Page Cache Overwrite Cve 2026 31431](linux-hardening/privilege-escalation/linux-kernel-exploitation/copy-fail-af_alg-splice-page-cache-overwrite-cve-2026-31431.md)
- [Posix Cpu Timers Toctou Cve 2025 38352](linux-hardening/privilege-escalation/linux-kernel-exploitation/posix-cpu-timers-toctou-cve-2025-38352.md)
- [euid, ruid, suid](linux-hardening/privilege-escalation/euid-ruid-suid.md)
- [Interesting Groups - Linux Privesc](linux-hardening/privilege-escalation/interesting-groups-linux-pe/README.md)
@@ -0,0 +1,128 @@
# Copy Fail: AF_ALG + splice page-cache overwrite (CVE-2026-31431)

{{#include ../../../banners/hacktricks-training.md}}

This page documents **Copy Fail**: a Linux kernel local privilege escalation where **`AF_ALG` + `splice()`** turns **readable file page-cache pages** into part of a **writable AEAD destination scatterlist**, and `authencesn` then performs a **deterministic 4-byte write past the contractual output boundary**.

- Affected component: `crypto/algif_aead.c` in-place decrypt path + `crypto/authencesn.c`
- Primitive: controlled **4-byte page-cache write** into any file readable by the attacker
- Reachability: unprivileged local user, `AF_ALG` available, `algif_aead` loaded
- Impact: immediate system-wide corruption of the page-cache copy used by `read()`, `mmap()`, and `execve()`

This is closer to **Dirty Pipe style page-cache abuse** than to a classic memory-corruption race, and unlike Dirty COW it does not depend on winning a race:

- no race window
- no repeated retries
- no on-disk file modification
- same exploit flow across many distros because the primitive is structural, not offset-dependent

## Core idea

`splice()` moves data between two file descriptors, one of which must be a pipe, **by reference** instead of copying it. If a readable file is spliced into a pipe and then from the pipe into an `AF_ALG` AEAD socket, the crypto input scatterlist can reference the **same page-cache pages** backing that file.

For AEAD decrypt, `algif_aead` historically optimized the request into an **in-place** layout:

- **AAD** and **ciphertext** were copied into the user RX buffer
- the final **authentication tag** was **not copied**
- instead, tag scatterlist entries were appended to the destination with `sg_chain()`
- `req->src = req->dst`, so those appended tag pages became part of a **writable destination chain**

If the tag pages come from spliced file data, the writable destination chain now includes **page-cache pages of a read-only file**.

## The bug in `authencesn`

`authencesn` is an AEAD wrapper used for IPsec Extended Sequence Numbers (ESN). During decrypt it uses the destination scatterlist as scratch space and writes **4 bytes past the legitimate decrypt output**:

```c
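/* Last argument: 0 = read from the scatterlist into tmp, 1 = write tmp into the scatterlist. */
scatterwalk_map_and_copy(tmp, dst, 0, 8, 0);                        /* read dst[0..7] into tmp[0..1] */
scatterwalk_map_and_copy(tmp, dst, 4, 4, 1);                        /* write tmp[0] to dst[4..7] */
scatterwalk_map_and_copy(tmp + 1, dst, assoclen + cryptlen, 4, 1);  /* write tmp[1] at dst[assoclen + cryptlen] */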
```

The last write stores **`seqno_lo`** (attacker-controlled AAD bytes `4..7`) at `dst[assoclen + cryptlen]`, which is **after the tag** and therefore **outside the contract for AEAD decrypt output**.

When `algif_aead` has chained page-cache-backed tag pages into `dst`, that write crosses out of the RX buffer and lands in the victim file's page cache.
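
To make the offsets concrete, the three calls can be modelled on a flat `bytearray`. This is only a toy model under the assumption that `dst` is one contiguous buffer (the real scatterlist is a chain of pages); the helper name `esn_scratch_writes` and the sample bytes are illustrative and are not kernel code.

```python
def esn_scratch_writes(dst: bytearray, assoclen: int, cryptlen: int) -> bytearray:
    """Toy model of the three scatterwalk_map_and_copy() calls shown above."""
    end = assoclen + cryptlen
    if len(dst) < end + 4:
        raise ValueError("dst must extend at least 4 bytes past assoclen + cryptlen")
    tmp = bytes(dst[0:8])          # call 1: read dst[0..7] into tmp
    dst[4:8] = tmp[0:4]            # call 2: write tmp[0] at offset 4
    dst[end:end + 4] = tmp[4:8]    # call 3: write tmp[1] at assoclen + cryptlen
    return dst


# Hypothetical layout: 8 bytes of AAD, 16 bytes of ciphertext, then 8 bytes
# standing in for whatever pages were chained after the output region.
aad = b"AAAA" + b"WXYZ"
dst = bytearray(aad + b"c" * 16 + b"\x00" * 8)
esn_scratch_writes(dst, assoclen=len(aad), cryptlen=16)
print(bytes(dst[24:28]))  # the 4 bytes that landed past AAD + ciphertext
```

In this model the bytes taken from AAD offsets `4..7` end up at offset `assoclen + cryptlen`, which is exactly the region that `algif_aead` may have backed with foreign pages.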

## Why this becomes a useful primitive

The attacker controls:

- **Which file**: any file readable by the attacker
- **Which offset**: via splice offset/length and AEAD `assoclen`
- **Which value**: the 4 bytes written come from attacker-controlled AAD bytes `4..7`

Even if authentication fails and `recvmsg()` returns an error, the **page-cache overwrite persists** because the scratch write already happened.

The corrupted page is **not marked dirty for writeback**, so:

- the on-disk file remains unchanged
- integrity checks that compare the on-disk contents (offline or block-level) see the original bytes and miss the attack
- all later `read()`, `mmap()`, and `execve()` users consume the modified in-memory page
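
Because the cached and on-disk copies diverge, one possible check is to compare a normal read against an `O_DIRECT` read of the same range. The sketch below is a defensive illustration, not a vetted detector: the function names are hypothetical, it assumes the filesystem supports `O_DIRECT` (tmpfs and some overlay or FUSE setups do not), and it assumes page-sized reads satisfy the device's alignment requirements.

```python
import mmap
import os


def read_cached(path: str, offset: int, length: int) -> bytes:
    """Read through the page cache, as ordinary read() callers do."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return os.pread(fd, length, offset)
    finally:
        os.close(fd)


def read_direct(path: str, offset: int, length: int) -> bytes:
    """Read with O_DIRECT, bypassing the page cache (may raise OSError)."""
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        buf = mmap.mmap(-1, length)  # anonymous mapping gives a page-aligned buffer
        try:
            n = os.preadv(fd, [buf], offset)
            return buf[:n]
        finally:
            buf.close()
    finally:
        os.close(fd)


def mismatched_offsets(path: str, page: int = mmap.PAGESIZE) -> list[int]:
    """Return page offsets where the cached view differs from the direct view."""
    bad = []
    for off in range(0, os.path.getsize(path), page):
        if read_cached(path, off, page) != read_direct(path, off, page):
            bad.append(off)
    return bad
```

A call such as `mismatched_offsets("/usr/bin/su")` would be expected to return an empty list on a healthy system; a non-empty result only says the two views differ and still needs manual review.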

## Typical LPE path

The public write-up targets a **setuid-root binary** such as `/usr/bin/su`:

1. Open `AF_ALG` and bind to `authencesn(hmac(sha256),cbc(aes))`
2. Send AAD where bytes `4..7` contain the 4-byte chunk to write
3. `splice()` target file data into the AEAD input so the final tag region references the target file's page-cache pages
4. Trigger `recv()` / `recvmsg()` to force decrypt
5. Repeat until the page-cache copy of the setuid binary is patched
6. Execute the binary so the kernel loads the modified cached image and runs attacker code as root

Conceptual PoC skeleton:

```python
a = socket.socket(38, 5, 0) # AF_ALG, SOCK_SEQPACKET
a.bind(("aead", "authencesn(hmac(sha256),cbc(aes))"))
# set key, accept request socket
u.sendmsg([b"A"*4 + payload_chunk], [cmsg_headers], MSG_MORE)
os.splice(target_fd, pipe_wr, offset)
os.splice(pipe_rd, alg_fd, offset)
u.recv(...) # triggers decrypt -> page-cache write
```

## How the bug became exploitable

- **2011**: `authencesn` introduced for IPsec ESN handling (`a5079d084f8b`)
- **2015**: `authencesn` converted to the new AEAD interface and kept the out-of-contract scratch write (`104880a6b470`)
- **2017**: `algif_aead` switched decrypt to an in-place design and chained tag pages into the destination (`72548b093ee3`)

That 2017 change is what turned an internal scratch write into a **page-cache write primitive** reachable from unprivileged userspace.

## Fix and mitigations

Mainline fixed this by reverting `algif_aead` back to **out-of-place** operation (`a664bf3d603d`), so page-cache pages can remain in the source scatterlist but no longer become part of the writable destination chain.

Useful mitigations:

- patch to a kernel carrying `a664bf3d603d` or a distro backport
- block `AF_ALG` socket creation with seccomp for untrusted workloads
- disable `algif_aead` if you need an immediate stopgap

Example emergency mitigation:

```bash
echo "install algif_aead /bin/false" > /etc/modprobe.d/disable-algif-aead.conf
rmmod algif_aead 2>/dev/null || true
```
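
After applying a stopgap, it can help to confirm the resulting state. The following is a minimal sketch with hypothetical helper names: it parses `/proc/modules` and tries to create an `AF_ALG` socket. It assumes `algif_aead` is built as a module (a built-in would not be listed in `/proc/modules`), and note that the socket probe only shows whether `AF_ALG` itself is reachable from this process, not whether the AEAD interface is.

```python
import socket


def loaded_modules(proc_modules_text: str) -> set[str]:
    """Parse the content of /proc/modules into a set of module names."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def af_alg_socket_allowed() -> bool:
    """Return True if this process can still create an AF_ALG socket."""
    family = getattr(socket, "AF_ALG", 38)  # 38 is AF_ALG on Linux
    try:
        s = socket.socket(family, socket.SOCK_SEQPACKET, 0)
    except OSError:
        return False
    s.close()
    return True


def report() -> dict[str, bool]:
    """Summarize module and socket reachability for the current process."""
    try:
        with open("/proc/modules") as fh:
            mods = loaded_modules(fh.read())
    except OSError:
        mods = set()  # not Linux, or /proc is not mounted
    return {
        "algif_aead_loaded": "algif_aead" in mods,
        "af_alg_socket_allowed": af_alg_socket_allowed(),
    }


if __name__ == "__main__":
    print(report())
```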

For containerized environments, `AF_ALG` should be treated as a **kernel attack surface**. Even without this specific CVE, it is a good candidate for seccomp denial in CI runners, sandboxes, and multi-tenant containers.
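
As a rough illustration of such a denial, the sketch below generates a rule in the Docker/OCI seccomp profile JSON format that makes `socket()` fail when its first argument is `AF_ALG` (38). The function names are hypothetical, and the standalone profile uses a default-allow policy purely to keep the example small; in practice the rule would more likely be merged into the profile you already run, and how it interacts with existing `socket` rules depends on the runtime. It also assumes socket creation goes through the `socket` syscall, which is not the case for 32-bit x86 binaries that use `socketcall`.

```python
import json

AF_ALG = 38  # Linux address family number for AF_ALG


def deny_af_alg_rule() -> dict:
    """Seccomp rule: return an error when socket() is called with family AF_ALG."""
    return {
        "names": ["socket"],
        "action": "SCMP_ACT_ERRNO",
        "args": [{"index": 0, "value": AF_ALG, "op": "SCMP_CMP_EQ"}],
    }


def minimal_profile() -> dict:
    """Default-allow profile carrying only the AF_ALG denial (illustrative only)."""
    return {"defaultAction": "SCMP_ACT_ALLOW", "syscalls": [deny_af_alg_rule()]}


if __name__ == "__main__":
    print(json.dumps(minimal_profile(), indent=2))
```

The printed JSON could be saved to a file and passed to a container runtime, for example with Docker's `--security-opt seccomp=<file>` option.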

## Detection / review notes

- A page-cache-only patch means the suspicious effect may be visible only in memory, not on disk.
- Look for unusual `AF_ALG` use on systems that do not intentionally expose kernel crypto sockets to workloads.
- When auditing zero-copy kernel interfaces, treat any path that combines **`splice()`-backed page references** with **scatterlists reused as destinations** as high risk.
- A useful reviewer rule is: if an algorithm writes beyond its documented output length, any caller that chains foreign pages into `dst` may turn it into a write primitive.
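
For the `AF_ALG` usage check, one option is to filter Linux audit `SYSCALL` records for `socket()` calls whose first argument is `AF_ALG`. The sketch below assumes you already have an audit rule that logs `socket()` calls, that records are in the raw `key=value` audit log format with arguments in hexadecimal, and that the host is x86_64 (where `socket` is syscall 41). The function names are hypothetical.

```python
import re

AF_ALG = 38  # appears as a0=26 (hex) in raw audit records
_FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_fields(record: str) -> dict[str, str]:
    """Split a raw audit record into a key -> value mapping."""
    return {key: value.strip('"') for key, value in _FIELD.findall(record)}


def af_alg_socket_events(lines, socket_syscall_nr: int = 41):
    """Yield (pid, comm, exe) for socket() calls whose family is AF_ALG."""
    for line in lines:
        if "type=SYSCALL" not in line:
            continue
        fields = parse_fields(line)
        if fields.get("syscall") != str(socket_syscall_nr):
            continue
        try:
            family = int(fields.get("a0", ""), 16)
        except ValueError:
            continue  # malformed or missing a0
        if family == AF_ALG:
            yield fields.get("pid"), fields.get("comm"), fields.get("exe")
```

Feeding it the lines of an audit log (for example `af_alg_socket_events(open(path))`, with `path` pointing at your audit log) gives a short list of processes to review; legitimate users of kernel crypto sockets will show up too, so treat hits as leads, not verdicts.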

## References

- [Xint write-up: Copy Fail: 732 Bytes to Root on Every Major Linux Distributions](https://xint.io/blog/copy-fail-linux-distributions)
- [Copy Fail advisory / mitigation page](https://copy.fail/)
- [Linux fix: `crypto: algif_aead - Revert to operating out-of-place` (`a664bf3d603d`)](https://github.com/torvalds/linux/commit/a664bf3d603dc3bdcf9ae47cc21e0daec706d7a5)
- [Linux commit: `crypto: algif_aead - copy AAD from src to dst` (`72548b093ee3`)](https://github.com/torvalds/linux/commit/72548b093ee3)
- [Linux commit: `crypto: authencesn - Convert to new AEAD interface` (`104880a6b470`)](https://github.com/torvalds/linux/commit/104880a6b470)
- [Linux commit: `crypto: authencesn - Add algorithm to handle IPsec extended sequence numbers` (`a5079d084f8b`)](https://github.com/torvalds/linux/commit/a5079d084f8b)

{{#include ../../../banners/hacktricks-training.md}}