
Emulated Netlink sockets cause infinite busy loops under event-loop #112

Open
doanbaotrung wants to merge 1 commit into sysprog21:main from open-sources-port:fix/emulated_netlink_sockets


@doanbaotrung doanbaotrung commented Jun 25, 2026

Collaborator

When guest applications (such as avahi-daemon) run event loops that
wait for netlink messages using ppoll() or select(), the emulated
descriptor immediately reports POLLIN because the write end of its
backing pipe is closed. The application's subsequent recvmsg() or
read() call then returns 0 bytes.

Because no sender credentials can be resolved from a 0-byte read,
the application ignores the read and immediately queries ppoll()
again. This causes the guest application to spin in an infinite
busy-loop consuming 100% CPU.
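
The readiness behavior described above can be reproduced with an ordinary pipe. The following is a minimal sketch, not code from this PR: `probe_pipe_readiness` is a hypothetical helper, and it assumes the emulated descriptor behaves like the read end of a pipe. Depending on the host, the closed-writer case is reported as POLLHUP, POLLIN, or both, so the sketch accepts either.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Poll the read end of a fresh pipe without blocking.
 * If close_writer is nonzero, the write end is closed first.
 * Returns 1 if poll() reports the read end as ready, 0 if not, -1 on error.
 * *nread receives the result of a follow-up read() when ready, else -1. */
int probe_pipe_readiness(int close_writer, long *nread)
{
    int fds[2];

    assert(nread != NULL);
    *nread = -1;
    if (pipe(fds) < 0)
        return -1;
    if (close_writer) {
        close(fds[1]);
        fds[1] = -1;
    }

    struct pollfd pfd = {.fd = fds[0], .events = POLLIN};
    int n = poll(&pfd, 1, 0); /* timeout 0: never block */
    int ready = n > 0 && (pfd.revents & (POLLIN | POLLHUP)) != 0;
    if (ready) {
        char c;
        *nread = (long) read(fds[0], &c, 1); /* end-of-file yields 0 */
    }

    close(fds[0]);
    if (fds[1] >= 0)
        close(fds[1]);
    return n < 0 ? -1 : ready;
}
```

With the writer closed, the descriptor is always ready and every read returns 0 bytes, which is exactly the condition that keeps an event loop spinning. With the writer held open and no data queued, poll() reports nothing.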

To address this, implement a non-blocking self-pipe signaling
mechanism inside src/syscall/netlink.c:

  1. Store and use pipe_rd in netlink_state_t directly to avoid
    lockless global state reads in netlink_clear_readable.
  2. Honor MSG_DONTWAIT and O_NONBLOCK flags and implement proper
    blocking semantics by polling the pipe outside nl_lock.
  3. Signal readability only on empty-to-nonempty transition to
    prevent pipe buffer drift.
  4. Handle zero-length reads/recvs before empty-buffer check.

Close #105
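
As a rough illustration of the self-pipe approach listed above, here is a minimal sketch. The `demo_sock_*` names and the `demo_sock_t` type are hypothetical and are not the functions in src/syscall/netlink.c; error mapping is reduced to -1 to keep the example short.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

typedef struct {
    int pipe_rd; /* end handed to poll()/select() */
    int pipe_wr; /* kept open so the read end never sees end-of-file */
} demo_sock_t;

static int demo_nonblock(int fd)
{
    int fl = fcntl(fd, F_GETFL);
    if (fl < 0)
        return -1;
    return fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

/* Create the pipe with both ends non-blocking. Returns 0 or -1. */
int demo_sock_init(demo_sock_t *s)
{
    int fds[2];

    assert(s != NULL);
    if (pipe(fds) < 0)
        return -1;
    if (demo_nonblock(fds[0]) < 0 || demo_nonblock(fds[1]) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    s->pipe_rd = fds[0];
    s->pipe_wr = fds[1];
    return 0;
}

/* Post one token so pollers see POLLIN. If the pipe is full (EAGAIN),
 * the descriptor is already readable, so the failure is ignored. */
void demo_signal_readable(demo_sock_t *s)
{
    char token = 1;
    ssize_t r;

    do
        r = write(s->pipe_wr, &token, 1);
    while (r < 0 && errno == EINTR);
}

/* Drain every token so pollers stop seeing POLLIN. */
void demo_clear_readable(demo_sock_t *s)
{
    char sink[64];

    for (;;) {
        ssize_t r = read(s->pipe_rd, sink, sizeof(sink));
        if (r > 0 || (r < 0 && errno == EINTR))
            continue;
        break; /* EAGAIN: the pipe is empty */
    }
}

/* Non-blocking readiness check, as an event loop would perform it. */
int demo_is_readable(const demo_sock_t *s)
{
    struct pollfd pfd = {.fd = s->pipe_rd, .events = POLLIN};
    return poll(&pfd, 1, 0) > 0 && (pfd.revents & POLLIN) != 0;
}

void demo_sock_close(demo_sock_t *s)
{
    close(s->pipe_rd);
    close(s->pipe_wr);
}
```

The key design point is that readiness is driven only by explicit tokens: a freshly created socket is not readable, it becomes readable after a signal, and it stops being readable once drained.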


Summary by cubic

Fixes infinite busy loops in emulated Netlink sockets by signaling readiness only when data exists and enforcing correct blocking. Stops 100% CPU spin in ppoll/select loops (e.g., avahi-daemon) and honors MSG_DONTWAIT/O_NONBLOCK.

  • Bug Fixes
    • Add a per-socket non-blocking self-pipe; store both the read and write ends in the socket state (avoiding lockless reads), set O_NONBLOCK on both ends, keep the write end open, and clean up on close.
    • Signal only on empty→non-empty; drain the pipe when buffers empty; handle 0-length read/recv/recvmsg first to avoid spurious POLLIN.
    • Enforce blocking correctly: return -EAGAIN for non-blocking; otherwise poll the pipe outside nl_lock; pass flags from sys_recvfrom to netlink_recv.

Written for commit 035e92a. Summary will update on new commits.


cubic-dev-ai[bot]

This comment was marked as resolved.

@doanbaotrung

Collaborator Author

Fix #105

@jserv jserv requested a review from Max042004 June 28, 2026 21:58
Max042004

This comment was marked as off-topic.

@jserv jserv left a comment

Contributor

Netlink busy-loop fix review. The core approach is sound: keeping the pipe write end open and signaling readiness only when a response is buffered fixes the spurious peer-closed POLLIN that spun avahi-daemon. A few correctness and robustness points below.

Append Close #105 at the end of git commit messages.

Comment thread src/syscall/netlink.c Outdated
Comment thread src/syscall/netlink.c Outdated
Comment thread src/syscall/netlink.c
@@ -605,6 +631,9 @@ int64_t netlink_sendmsg(int guest_fd, guest_t *g, uint64_t msg_gva, int flags)
}

int ret = nl_process_request(ns, req, rlen);

Contributor

netlink_signal_readable() writes a token on every successful send. Since each request overwrites ns->buf (buf_pos reset to 0), repeated sends without an intervening recv keep pushing bytes into the pipe until it fills and the writes silently drop on EAGAIN. Harmless today, but it makes the pipe level drift from ns->buf state and can mask a real missed-wakeup later. Signal only on the empty-to-nonempty transition: capture bool was_empty = ns->buf_pos >= ns->buf_len before nl_process_request() and write only when it goes non-empty. Same at line 667 (netlink_send).
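
The transition rule suggested here can be modeled without any file descriptors. The sketch below is an illustration only: `demo_state_t`, `demo_send`, and `demo_recv` are hypothetical, and a counter stands in for the bytes sitting in the wakeup pipe. Only `buf_pos`, `buf_len`, and `was_empty` follow the names used in the review.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t buf_pos;  /* next unread byte of the buffered response */
    size_t buf_len;  /* total bytes of the buffered response */
    unsigned tokens; /* stand-in for bytes queued in the wakeup pipe */
} demo_state_t;

static bool demo_empty(const demo_state_t *ns)
{
    return ns->buf_pos >= ns->buf_len;
}

/* Stand-in for request processing: each request overwrites the response. */
static void demo_process_request(demo_state_t *ns, size_t resp_len)
{
    ns->buf_pos = 0;
    ns->buf_len = resp_len;
}

/* Signal only when the buffer goes from empty to non-empty. */
void demo_send(demo_state_t *ns, size_t resp_len)
{
    assert(ns != NULL);
    bool was_empty = demo_empty(ns);
    demo_process_request(ns, resp_len);
    if (was_empty && !demo_empty(ns))
        ns->tokens++;
}

/* Consume up to want bytes; clear the signal once the buffer is empty. */
size_t demo_recv(demo_state_t *ns, size_t want)
{
    size_t avail = demo_empty(ns) ? 0 : ns->buf_len - ns->buf_pos;
    size_t n = want < avail ? want : avail;

    ns->buf_pos += n;
    if (n > 0 && demo_empty(ns))
        ns->tokens = 0; /* models draining the pipe */
    return n;
}
```

With this rule, three back-to-back sends without a recv leave exactly one token pending, so the pipe level stays in step with the buffer state instead of drifting upward.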

Comment thread src/syscall/netlink.c
if (pipe(pipefd) < 0)
    return -LINUX_EMFILE;

if (fd_set_nonblock(pipefd[0]) < 0 || fd_set_nonblock(pipefd[1]) < 0) {

Contributor

fd_set_nonblock() failure returns -LINUX_EMFILE, which is misleading -- this is an fcntl failure, not fd exhaustion. Preserve the real errno (return -linux_errno() from the failing call) after closing both pipe fds.
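
One way to follow this advice is to capture errno before the cleanup calls can overwrite it. The sketch below is an assumption-laden illustration: `demo_set_nonblock` and `demo_pipe_set_nonblock` are hypothetical, and it returns the negated host errno where the project would translate through linux_errno().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

static int demo_set_nonblock(int fd)
{
    int fl = fcntl(fd, F_GETFL);
    if (fl < 0)
        return -1;
    return fcntl(fd, F_SETFL, fl | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Make both pipe ends non-blocking.
 * On failure, close both ends and return the negated errno of the call
 * that actually failed, rather than a fixed error code. */
int demo_pipe_set_nonblock(int pipefd[2])
{
    assert(pipefd != NULL);
    if (demo_set_nonblock(pipefd[0]) < 0 ||
        demo_set_nonblock(pipefd[1]) < 0) {
        int saved = errno; /* capture before close() can overwrite it */
        close(pipefd[0]);
        close(pipefd[1]);
        return -saved;
    }
    return 0;
}
```

Saving errno first matters because close() may itself fail and replace the value, which would hide the original cause from the caller.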

Comment thread src/syscall/netlink.c Outdated
@doanbaotrung doanbaotrung changed the title from "Emulated Netlink sockets cause infinite busy loops under event-loop" to "Draft: Emulated Netlink sockets cause infinite busy loops under event-loop" Jul 2, 2026
@doanbaotrung doanbaotrung force-pushed the fix/emulated_netlink_sockets branch 3 times, most recently from bcf3579 to d5fcee1 Compare July 2, 2026 14:52
@doanbaotrung

Collaborator Author

Address netlink review feedback on busy-loop fix

  1. Store and use pipe_rd in netlink_state_t directly to avoid
    lockless global state reads in netlink_clear_readable.
  2. Honor MSG_DONTWAIT and O_NONBLOCK flags and implement proper
    blocking semantics by polling the pipe outside nl_lock.
  3. Signal readability only on empty-to-nonempty transition to
    prevent pipe buffer drift.
  4. Handle zero-length reads/recvs before empty-buffer check.
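
Items 2 and 4 above can be sketched as a receive path that checks for a zero-length request first, returns -EAGAIN in non-blocking mode, and otherwise waits on the pipe with the lock released. This is a simplified illustration, not the PR's code: all `demo_nl_*` names are hypothetical, a pthread mutex stands in for nl_lock, host errno values stand in for the LINUX_* constants, and the buffer is treated as a byte stream rather than as netlink datagrams.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <pthread.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>

typedef struct {
    pthread_mutex_t lock; /* stand-in for nl_lock */
    int pipe_rd, pipe_wr; /* non-blocking wakeup pipe */
    char buf[64];
    size_t buf_pos, buf_len;
} demo_nl_t;

int demo_nl_init(demo_nl_t *ns)
{
    int fds[2];

    if (pipe(fds) < 0)
        return -errno;
    /* fcntl() error handling is omitted to keep the sketch short. */
    fcntl(fds[0], F_SETFL, fcntl(fds[0], F_GETFL) | O_NONBLOCK);
    fcntl(fds[1], F_SETFL, fcntl(fds[1], F_GETFL) | O_NONBLOCK);
    pthread_mutex_init(&ns->lock, NULL);
    ns->pipe_rd = fds[0];
    ns->pipe_wr = fds[1];
    ns->buf_pos = ns->buf_len = 0;
    return 0;
}

/* Producer: buffer a response and wake pollers on empty-to-nonempty. */
void demo_nl_post(demo_nl_t *ns, const void *data, size_t len)
{
    assert(len <= sizeof(ns->buf));
    pthread_mutex_lock(&ns->lock);
    bool was_empty = ns->buf_pos >= ns->buf_len;
    memcpy(ns->buf, data, len);
    ns->buf_pos = 0;
    ns->buf_len = len;
    if (was_empty && len > 0) {
        char token = 1;
        ssize_t r = write(ns->pipe_wr, &token, 1);
        (void) r; /* a full pipe already means "readable" */
    }
    pthread_mutex_unlock(&ns->lock);
}

long demo_nl_recv(demo_nl_t *ns, void *out, size_t len, bool nonblock)
{
    if (len == 0)
        return 0; /* zero-length request: never blocks, never EAGAIN */

    pthread_mutex_lock(&ns->lock);
    while (ns->buf_pos >= ns->buf_len) {
        int rd = ns->pipe_rd; /* read the fd while still holding the lock */
        pthread_mutex_unlock(&ns->lock);
        if (nonblock)
            return -EAGAIN;
        struct pollfd pfd = {.fd = rd, .events = POLLIN};
        if (poll(&pfd, 1, -1) < 0) /* wait without holding the lock */
            return -errno;         /* includes EINTR */
        pthread_mutex_lock(&ns->lock); /* re-check the buffer under lock */
    }

    size_t avail = ns->buf_len - ns->buf_pos;
    size_t n = len < avail ? len : avail;
    memcpy(out, ns->buf + ns->buf_pos, n);
    ns->buf_pos += n;
    if (ns->buf_pos >= ns->buf_len) {
        char sink[16];
        while (read(ns->pipe_rd, sink, sizeof(sink)) > 0)
            ; /* buffer is empty again: drain the wakeup tokens */
    }
    pthread_mutex_unlock(&ns->lock);
    return (long) n;
}
```

Releasing the lock around poll() keeps a blocked reader from stalling senders that need the same lock to buffer a response, and re-checking the buffer after reacquiring it guards against another reader having consumed the data in the meantime.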

@doanbaotrung doanbaotrung changed the title from "Draft: Emulated Netlink sockets cause infinite busy loops under event-loop" to "Emulated Netlink sockets cause infinite busy loops under event-loop" Jul 2, 2026
@doanbaotrung doanbaotrung requested a review from Max042004 July 2, 2026 14:56

@jserv jserv left a comment

Contributor

Squash commits.

Comment thread src/syscall/netlink.c
Comment on lines +715 to +717
while (ns->buf_pos >= ns->buf_len) {
    bool nonblock = (flags & LINUX_MSG_DONTWAIT) ||
                    (fd_table[guest_fd].linux_flags & LINUX_O_NONBLOCK);

Contributor

Add proper comments.

Collaborator Author

Added

Comment thread src/syscall/netlink.c Outdated
Comment thread src/syscall/netlink.c Outdated
@doanbaotrung doanbaotrung force-pushed the fix/emulated_netlink_sockets branch 2 times, most recently from c6fedec to 6f282ae Compare July 2, 2026 15:12

@jserv jserv left a comment

Contributor

Be mindful of the preferred code style.

Comment thread src/syscall/netlink.c Outdated
Comment on lines +644 to +646
if (was_empty && ns->buf_pos < ns->buf_len) {
    netlink_signal_readable(ns);
}

Contributor

Use brackets only when necessary.

Collaborator Author

fixed

Comment thread src/syscall/netlink.c Outdated
Comment on lines +681 to +683
if (was_empty && ns->buf_pos < ns->buf_len) {
    netlink_signal_readable(ns);
}

Contributor

Use brackets only when necessary.

Collaborator Author

fixed

Comment thread src/syscall/netlink.c Outdated
Comment on lines +735 to +737
if (errno == EINTR) {
    return -LINUX_EINTR;
}

Contributor

Use brackets only when necessary.

Collaborator Author

fixed

Comment thread src/syscall/netlink.c Outdated
Comment on lines +773 to +775
if (ns->buf_pos >= ns->buf_len) {
    netlink_clear_readable(ns);
}

Contributor

Use brackets only when necessary.

Collaborator Author

fixed

Comment thread src/syscall/netlink.c Outdated
Comment on lines +985 to +987
if (errno == EINTR) {
    return -LINUX_EINTR;
}

Contributor

Use brackets only when necessary.

Collaborator Author

fixed

Comment thread src/syscall/netlink.c Outdated
Comment on lines +1009 to +1011
if (ns->buf_pos >= ns->buf_len) {
    netlink_clear_readable(ns);
}

Contributor

Use brackets only when necessary.

Collaborator Author

fixed

@doanbaotrung doanbaotrung force-pushed the fix/emulated_netlink_sockets branch from 6f282ae to 035e92a Compare July 2, 2026 15:31
@doanbaotrung

Collaborator Author

Noted. I'll be mindful to use brackets only when necessary.



Development

Successfully merging this pull request may close these issues.

Emulated Netlink sockets cause infinite busy loops under event-loop pollers (e.g., avahi-daemon)

3 participants