No speed gain on 20 GB VRAM with 27B Q4

On a 7900 XT with 20 GB of VRAM, I have tried both the HIP and Vulkan backends and cannot get a noticeable speed increase with the default settings recommended in the guide:

"$BEE_SERVER" --host 0.0.0.0 --port $PORT \
 -m $MODEL -md $DRAFT \
 --jinja --chat-template-kwargs '{"enable_thinking":true}' \
 -ngld all -ngl all -np 1 --reasoning on --cache-ram 0 \
 --spec-type dflash --spec-dflash-cross-ctx 512 \
 --kv-unified -b 2048 -ub 256 \
 --spec-draft-n-max 3 \
 --log-timestamps --log-prefix --log-colors off \
 --no-mmap --mlock --no-host \
 --temp 0.6 --top-k 20 --min-p 0.0 \
 -ctk turbo3 -ctv turbo3 \
 -fa on --metrics -c 64000

 MODEL=Qwen3.6-27B-Q4_K_M.gguf
 DRAFT=dflash-draft-3.6-q4_k_m.gguf

I set the context to 64k because that is the minimum Hermes requires.
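To put a number on "no noticeable speed increase", the two runs can be compared as generated tokens per second. The sketch below is only a helper for that arithmetic. The function names `tok_per_sec` and `speedup` are hypothetical and are not part of the server or the guide. It assumes the baseline is the same command run without the draft model, and that the token count and wall-clock time are read from each run's own output.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the server): turn raw measurements
# into comparable numbers for a baseline run and a speculative run.

# tok_per_sec TOKENS SECONDS -> generated tokens per second, 2 decimals
tok_per_sec() {
    awk -v t="$1" -v s="$2" 'BEGIN {
        if (s <= 0) { print "n/a"; exit }   # guard against division by zero
        printf "%.2f\n", t / s
    }'
}

# speedup BASELINE_TPS SPEC_TPS -> ratio of speculative to baseline throughput
speedup() {
    awk -v b="$1" -v d="$2" 'BEGIN {
        if (b <= 0) { print "n/a"; exit }   # guard against division by zero
        printf "%.2fx\n", d / b
    }'
}

# Example with made-up measurements (assumed values, not real results):
# 512 tokens in 20 s without the draft model, 512 tokens in 18 s with it.
base=$(tok_per_sec 512 20)
spec=$(tok_per_sec 512 18)
echo "baseline: $base tok/s, speculative: $spec tok/s, speedup: $(speedup "$base" "$spec")"
```

A ratio close to 1.00x on the same prompt and the same `-c` value would back up the report with numbers. The example values above are placeholders, not measurements from this setup.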


No speed gain on 20gb vram with 27b q4 #22