Skip to content

refactor wasmtime_wasi_http::handler#13404

Open
dicej wants to merge 4 commits into
bytecodealliance:mainfrom
dicej:wasmtime-wasi-http-handler-refactor
Open

refactor wasmtime_wasi_http::handler#13404
dicej wants to merge 4 commits into
bytecodealliance:mainfrom
dicej:wasmtime-wasi-http-handler-refactor

Conversation

@dicej
Copy link
Copy Markdown
Contributor

@dicej dicej commented May 18, 2026

The goal here is to make this module more flexible and easier to reuse.

The heart of the changes is an overhaul of the HandlerState trait. Previously, it mostly consisted of functions returning Duration or usize representing various timeouts and reuse limits. The drawback to that API is that it prevented the embedder from dynamically adjusting timeouts and limits based on service load and resource usage. The new API uses pollable traits to provide such dynamism, allowing the embedder to e.g. expire and reclaim idle instances early if a pooling allocator limit has been reached.

Other high-level changes:

  • ProxyHandler::spawn has been replaced with ProxyHandler::handle, which takes a http::Request and returns a Result<http::Response, wasmtime::Error>, saving the embedder from needing to manage task spawning, type conversions, channels, etc.

  • ProxyHandler::handle may return a wasmtime::Error which may be downcast to an InstantiationError, which itself may contain a PoolConcurrencyLimitError representing a pooling allocator limit being reached. In this case, the embedder may wish to expire and reclaim instances more aggressively and arrange to retry the handle call(s) as instances are reclaimed.

  • Implementing HandlerState now requires providing an InstanceExpiration implementation as well. An instance of this type will be created for each worker and polled as part of the worker future to determine when the worker and its associated instance should be expired.

  • HandlerState::drop is called when the worker exits (normally or with a failure), giving the embedder the chance to interrogate the store used by that worker (for e.g. telemetry) prior to dropping it.

  • I've added a new p2::types::WasiHttpCtxView::new_response_outparam_from_callback function as a more flexible alternative to the existing new_response_outparam for cases where you don't have a tokio::sync::oneshot::Sender handy.

@dicej dicej requested a review from alexcrichton May 18, 2026 18:13
@dicej dicej requested review from a team as code owners May 18, 2026 18:13
The goal here is to make this module more flexible and easier to reuse.

The heart of the changes is an overhaul of the `HandlerState` trait.
Previously, it mostly consisted of functions returning `Duration` or `usize`
representing various timeouts and reuse limits.  The drawback to that API is
that it prevented the embedder from dynamically adjusting timeouts and limits
based on service load and resource usage.  The new API uses pollable traits to
provide such dynamism, allowing the embedder to e.g. expire and reclaim idle
instances early if a pooling allocator limit has been reached.

Other high-level changes:

- `ProxyHandler::spawn` has been replaced with `ProxyHandler::handle`, which
  takes a `http::Request` and returns a `Result<http::Response,
  wasmtime::Error>`, saving the embedder from needing to manage task spawning,
  type conversions, channels, etc.

- `ProxyHandler::handle` may return a `wasmtime::Error` which may be downcast to
  an `InstantiationError`, which itself may contain a
  `PoolConcurrencyLimitError` representing a pooling allocator limit being
  reached.  In this case, the embedder may wish to expire and reclaim instances
  more aggressively and arrange to retry the `handle` call(s) as instances are
  reclaimed.

- Implementing `HandlerState` now requires providing an `InstanceExpiration`
  implementation as well.  An instance of this type will be created for each
  worker and polled as part of the worker future to determine when the worker
  and its associated instance should be expired.

- `HandlerState::drop` is called when the worker exits (normally or with a
  failure), giving the embedder the chance to interrogate the store used by that
  worker (for e.g. telemetry) prior to dropping it.

- I've added a new
  `p2::types::WasiHttpCtxView::new_response_outparam_from_callback` function as
  a more flexible alternative to the existing `new_response_outparam` for cases
  where you don't have a `tokio::sync::oneshot::Sender` handy.
@dicej dicej force-pushed the wasmtime-wasi-http-handler-refactor branch from 1ac6fea to 6887760 Compare May 18, 2026 18:24
Comment thread tests/all/cli_tests.rs Outdated
this is half a print to stderr
\n\
after empty
start a print 1234
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There needs to be some way to determine which of the running instances (though, ideally, also the request where possible, though that now requires guest support...) a print line came from - is this just not done yet in the current design?

Copy link
Copy Markdown
Contributor Author

@dicej dicej May 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the possibility of concurrent execution in WASIp3, printing the request ID can't be done in general; i.e. we can't unambiguously determine which request generated which stdio activity.

Previously, I had written "best effort" code to plumb the request ID through in the case where instance reuse is disabled. I removed it in this PR because it seemed like more trouble than it was worth given that the vision going forward is to reuse instances by default. I can restore it, though, if desired.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we only need to distinguish instance IDs and not request IDs, then yes, we can do that without ambiguity.

Copy link
Copy Markdown
Contributor

@pchickey pchickey May 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, this host code is going to still execute p2 guests where instances are not reused, so we should retain the way to tag those instances in the print, otherwise it degrades the experience for existing users of p2.

I agree we need to do plumbing elsewhere to get a nice user experience in all cases with p3, but with backpressure on instance reuse there is still the possibility of many instances in p3 as well, so we will still need a way to disambiguate even if each instance keeps their own internal counter of request. So, even in the p3 world, theres still a need for this in order to get output the user can make sense of. (And it shows theres probably a need for some mechanism to get a unique id for a wasi-http request without depending on it being injected into a header, but thats a totally different problem for now...)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so if I (re-)add the ability to tag stdio output on a per-instance basis (without trying to do it on a per-request basis), will that be sufficient, at least for the time being?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pushed an update; let me know what you think.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Thats a good spot for this to be presently, and I will pursue the spec changes that I think are required to fix the rest separately.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we can really do meaningfully better than this: there is no way to reliably tie a line printed to stdout/err back to a connection within an instance if multiple are in flight concurrently, since tasks from any of them might be processed freely and with arbitrary scheduling, or even without a Component Model task being visible at all. E.g. if for two incoming requests timer-based actions are used, they might be coalesced into a single external async task representing the closest deadline.

This allows associated functions like `on_request_start` to maintain per-worker
state.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants