[Bug] regression: increased resident memory usage with mmap since master-691-563137a

### Git commit

563137a5926ac9455420240c2a7d0f3f15eb9bd0 (`master-691-563137a`)

### Operating System & Version

Debian 13

### GGML backends

Vulkan, HIP

### Command-line arguments used

sd-cli --backend ROCm0 -p 'sunset' --diffusion-model z_image_turbo-Q8_0.gguf --llm Qwen3-4B-UD-Q4_K_XL.gguf --vae ae_bf16.safetensors --cfg-scale 1 --steps 9 -W 1024 -H 1024  --fa --mmap --offload-to-cpu

### Steps to reproduce

Since `master-691-563137a`, offloading with `--mmap` uses a large amount of resident memory instead of the expected shared memory. Values reported by `top` during inference:

| Build | RES | SHR |
|:---|:---|:---|
| master-690 | 7.7G | <1.0G |
| master-690 --mmap | 10.0G | 9.4G |
| master-691 | 7.6G | <1.0G |
| master-691 --mmap | 16.1G | <1.0G |
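
The regression is easier to see as non-shared resident memory, i.e. RES minus SHR. A small sketch of that arithmetic using the `top` readings above, assuming a `<1.0G` reading is treated as an upper bound of 1.0 (so the result is then a lower bound); `private_gib` is a hypothetical helper name:

```python
# (RES, SHR, shr_is_upper_bound) in GiB, as read from top.
# A SHR reading of "<1.0G" is assumed to mean "at most 1.0",
# which makes RES - SHR a lower bound for that row.
ROWS = {
    "master-690":        (7.7, 1.0, True),
    "master-690 --mmap": (10.0, 9.4, False),
    "master-691":        (7.6, 1.0, True),
    "master-691 --mmap": (16.1, 1.0, True),
}

def private_gib(res, shr):
    """Non-shared resident memory in GiB, rounded to one decimal."""
    return round(res - shr, 1)

for name, (res, shr, bound) in ROWS.items():
    prefix = ">" if bound else ""
    print(f"{name:18s} {prefix}{private_gib(res, shr)}G")
```

With these numbers, `--mmap` on master-690 leaves about 0.6G of non-shared memory, while on master-691 it leaves more than 15.1G.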

(Note: RES is expected to be larger with mmap, since it also counts the memory-mapped areas, and those are never released from memory. That increase should show up as shared memory, though, not as a plain increase of resident memory usage.)
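
To split resident memory into file-backed and anonymous parts more directly than `top` does, the per-process counters in `/proc/<pid>/status` can be read. A minimal sketch, assuming Linux 4.5 or later (where `RssAnon`, `RssFile` and `RssShmem` are exposed); the helper names `parse_rss` and `rss_of` are hypothetical and not part of sd-cli:

```python
import re
import sys

# Fields from /proc/<pid>/status, reported by the kernel in kB.
# top's SHR column corresponds to RssFile + RssShmem.
FIELDS = ("VmRSS", "RssAnon", "RssFile", "RssShmem")

def parse_rss(status_text):
    """Return {field: MiB} for the RSS fields found in a status dump."""
    out = {}
    for line in status_text.splitlines():
        m = re.match(r"(\w+):\s+(\d+)\s+kB", line)
        if m and m.group(1) in FIELDS:
            out[m.group(1)] = int(m.group(2)) / 1024.0
    return out

def rss_of(pid):
    """Read and parse /proc/<pid>/status for a running process."""
    with open(f"/proc/{pid}/status") as f:
        return parse_rss(f.read())

if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python rss.py <pid of the sd-cli process>
    for name, mib in rss_of(sys.argv[1]).items():
        print(f"{name:9s} {mib:10.1f} MiB")
```

If the mapped model files are being used in place, most of the growth should appear under `RssFile`; growth under `RssAnon` would suggest the data is being copied into regular allocations.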

The log keeps showing mmap being used for the models:

```
[INFO ] stable-diffusion.cpp:491  - Version: Z-Image 
[DEBUG] model_loader.cpp:813  - using mmap for I/O
[INFO ] model_loader.cpp:819  - using mmap for 'z_image_turbo-Q8_0.gguf'
[DEBUG] model_loader.cpp:813  - using mmap for I/O
[INFO ] model_loader.cpp:819  - using mmap for 'Qwen3-4B-UD-Q4_K_XL.gguf'
[DEBUG] model_loader.cpp:813  - using mmap for I/O
[INFO ] model_loader.cpp:819  - using mmap for 'ae_bf16.safetensors'
```

Later releases show different values, but similar behavior: increased RES usage with low shared memory.

### What you expected to happen

With `--mmap`, the model data is held in shared memory, reducing RAM pressure.

### What actually happened

RAM usage increased: with `--mmap`, RES reaches 16.1G while SHR stays below 1.0G.

### Logs / error messages / stack trace

_No response_

### Additional context / environment details

_No response_

[Bug] regression: increased resident memory usage with mmap since master-691-563137a #1675
