Nhse o40 orkv.i141 silvermachine by martinsumner · Pull Request #3 · OpenRiak/riak_api

martinsumner · 2026-04-24T11:35:59Z

Switch HTTP API to use SilverMachine not WebMachine/Mochiweb.

Intentions are:

- improve performance of common requests, especially with significant volumes of index entries and/or user object metadata;
- simplify the callbacks required within the application;
- only implement the basics of HTTP; the full extent of rules (i.e. the equivalent of the webmachine decision_core) may be implemented in the callbacks, but is not applied by default;
- allow for streaming of both inbound request bodies and outbound response bodies, to ready support for file upload/download;
- simplify the task of providing SSL handshake information to the application.

ThomasArts · 2026-05-07T11:52:33Z

+            {ok, PeerIP, Cert} = riak_api_web_socket:get_peer(Socket),
+            loop(Socket, <<>>, PeerIP, Cert, Port);
+        {error, timeout} ->
+            init(Server, Listener, Port);


No max number of attempts?

I think not. We start a listener, and a pool of acceptors - however, there is no telling how long it might be before there is a connection. If we reached a max number of loops, we still want to maintain the pool, so even if we exit, it would only be to start another acceptor and put it in the loop.

This was copied from Elli:

https://github.com/elli-lib/elli/blob/main/src/elli_http.erl#L57-L81

However, do we actually need the timeout at all? We could use gen_tcp:accept/1 (with an infinite timeout) and then not have to loop on timeout - as the purpose of the loop in Elli (reading the comments) is to support a code upgrade, which we will probably not require.
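The two options discussed here can be put side by side. This is a minimal sketch and not the code in this PR: the module name `acceptor_sketch` and both function names are hypothetical, and only `gen_tcp:accept/1` and `gen_tcp:accept/2` are taken from OTP.

```erlang
-module(acceptor_sketch).
-export([accept_with_timeout/2, accept_blocking/1]).

%% Timeout-and-loop style, as in the snippet under review. On timeout the
%% function re-enters itself through a fully qualified call, which is the
%% mechanism that lets a long-lived process pick up newly loaded code.
accept_with_timeout(ListenSocket, TimeoutMs) ->
    case gen_tcp:accept(ListenSocket, TimeoutMs) of
        {ok, Socket} ->
            {ok, Socket};
        {error, timeout} ->
            %% No attempt counter: an idle listener may legitimately wait
            %% for an unbounded time before the first connection arrives.
            ?MODULE:accept_with_timeout(ListenSocket, TimeoutMs);
        {error, Reason} ->
            {error, Reason}
    end.

%% Alternative raised in the thread: block until a connection arrives.
%% gen_tcp:accept/1 uses an infinite timeout, so no loop is needed.
accept_blocking(ListenSocket) ->
    gen_tcp:accept(ListenSocket).
```

If hot code upgrade of the acceptor is not a requirement, the blocking form removes the question of a maximum number of attempts altogether, at the cost of the process staying on the old code until it next returns from accept.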

f34nk · 2026-05-12T11:57:30Z

+        State#socket_state{
+            pool_size = State#socket_state.pool_size - 1,
+            acceptor_pool =
+                sets:del_element(Pid, State#socket_state.acceptor_pool)


If I don't misunderstand something, this is a potential bug.
Please correct me if I'm wrong :)

While a new acceptor is started on accept when:

PS < State#socket_state.max_pool_size

EXIT decrements pool_size and removes the pid.

When PS >= max_pool_size, every acceptor can be busy serving connections, so no process is blocked in accept.

When those workers exit, the listener still never gets a new blocking acceptor unless something else spawns one.

From my understanding, the behaviour will be that some connections get served as normal, but new connection attempts will time out (?) while the listener is still alive.

I will try to come up with a test that reproduces this.

I think you're right. Once the max_pool_size has been exceeded, and all acceptors are busy handling connections - I don't think there is anything to start new acceptors.

I think the model for acceptor pools requires a rethink.

It might be easier to follow if we track awaiting_acceptor and busy_acceptor. When an awaiting_acceptor gets a connection, it is moved to busy_acceptor, and if size(awaiting_acceptor) + size(busy_acceptor) < max_pool_size a new acceptor is created.

If an acceptor exit is detected, it needs to be removed from the pool (presumably the busy_acceptor), but maybe check awaiting_acceptor if it is not in the busy list (perhaps it can exit before accepting a connection). If the size(awaiting_acceptor) < pool_size AND size(awaiting_acceptor) + size(busy_acceptor) < max_pool_size - then create a replacement acceptor.
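The accept-side transition described above can be sketched as a pure function over two sets. This is an illustration only, not the change made in riak_api_web_socket: the module name `pool_sketch`, the map keys and the helper `add_awaiting/2` are hypothetical, and atoms can stand in for pids when experimenting in the shell.

```erlang
-module(pool_sketch).
-export([new/2, add_awaiting/2, accepted/2]).

%% Pool state as a map; `awaiting` and `busy` are sets of acceptor pids.
new(PoolSize, MaxPoolSize) ->
    #{awaiting => sets:new(),
      busy => sets:new(),
      pool_size => PoolSize,
      max_pool_size => MaxPoolSize}.

%% Record a freshly started acceptor that is blocked in accept.
add_awaiting(Pid, #{awaiting := A} = State) ->
    State#{awaiting := sets:add_element(Pid, A)}.

%% An awaiting acceptor has accepted a connection: move it to the busy set,
%% then decide from the updated sizes whether another acceptor is allowed.
accepted(Pid, #{awaiting := A, busy := B, max_pool_size := Max} = State) ->
    A1 = sets:del_element(Pid, A),
    B1 = sets:add_element(Pid, B),
    Action =
        case sets:size(A1) + sets:size(B1) < Max of
            true -> start_acceptor;
            false -> no_action
        end,
    {Action, State#{awaiting := A1, busy := B1}}.
```

Returning an action rather than spawning inside the function keeps the bookkeeping testable without sockets; the caller would start the acceptor and then call `add_awaiting/2` with its pid.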

See 27d39b8

This contains a change to riak_api_web_socket acceptor pool management, and new connection tests in tests/riak_api_web_ets_store

Nice!

To follow up on my comment - it took a while to wrap my head around this :)
This test reproduces the issue with the "single" acceptor_pool. My adhoc solution was to restart an acceptor on EXIT (it works to demonstrate the issue).

I very much like your approach: separating acceptor_pool and connected_pool makes this much clearer to follow.

The only thing I am suspicious about is, that the accepted check happens BEFORE setting the new state.

For example, this edge case: AP = 0, CP = 3 and max_pool_size = 3.
The guard sees 0 + 3 < 3 -> which does not spawn a new acceptor (as it should ??). Only AFTER a CP process exits, a new acceptor gets triggered.

I am not sure if this is what we want.

Maybe better: compute the new state and then decide what to do.

Something like this:

    AP1 = sets:del_element(ExP, AP),
    CP1 = sets:del_element(ExP, CP),
    NeedIdle = sets:size(AP1) < State#socket_state.pool_size,
    BelowMax = sets:size(AP1) + sets:size(CP1) < State#socket_state.max_pool_size.

Then start a replacement when NeedIdle andalso BelowMax is true.
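To make the proposal concrete, the same expressions can be wrapped in a standalone function that takes the pool sizes as arguments instead of reading them from `#socket_state{}`. This is a sketch under that assumption; the module name `pool_exit_sketch` and the function `on_exit/5` are hypothetical and do not appear in the PR.

```erlang
-module(pool_exit_sketch).
-export([on_exit/5]).

%% ExP is the pid that exited; AP is the set of acceptors blocked in accept,
%% CP is the set of acceptors currently serving a connection.
%% The sets are updated first, and the decision is based on the new sizes.
on_exit(ExP, AP, CP, PoolSize, MaxPoolSize) ->
    %% The exiting pid may be in either set, so remove it from both.
    AP1 = sets:del_element(ExP, AP),
    CP1 = sets:del_element(ExP, CP),
    NeedIdle = sets:size(AP1) < PoolSize,
    BelowMax = sets:size(AP1) + sets:size(CP1) < MaxPoolSize,
    %% First element tells the caller whether to start a replacement.
    {NeedIdle andalso BelowMax, AP1, CP1}.
```

For the edge case in this comment (AP empty, CP of size 3, max_pool_size = 3), and assuming a pool_size of 1, the exit of one connected process leaves 0 idle and 2 connected, so both conditions hold and a replacement acceptor is requested.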

Wdyt?

I've done a different version of your proposed modification: 18a1f78

I think it is much clearer to update the sets and then have the case statement based on sets:size rather than assumed size. Hopefully the above reads OK (I set it out this way to reduce some code duplication).

I've added the commit with your test into the test suite. Thanks

No intention to support trailer fields. Cannot merge with headers, as headers already processed by callback. Actually looking at trailer fields provides further risk (how many should be supported, should they be included in content size etc).

hmmr

Overall, designing this riak component as a pared-down http server, with no attempt to present it as a general purpose http server outside of riak, is clearly the right approach. A handful of non-critical issues; otherwise, ready to approve.

martinsumner added 30 commits March 26, 2026 15:45

Initial Commit

af08451

WIP. A framework of modules and functions for SilverMachine.

Initial test of get_body

fd5c2b4

Fix formatting of new files

256151c

Apply only to new files, or heavily altered files so that broader change history is maintained.

Add basic chunking support (receive body)

ea3034b

Add further tests - plus date caching

b815a5c

Add behaviour

3c11b94

Also add callback module and unit test the sending of responses both streamed and whole.

Formatting

b471a74

Accepting chunked requests in slices

57d1867

Plus some further testing/formatting

Formatting

df18274

Add initial end-to-end test

f9469ae

Revert clock back to ETS, and add negative tests

2812d4b

Add monitoring of exits on acceptors

81d1b4e

Pass the split path as well as path

f4a27d3

Extend tests, add KV store test

be07044

Add file upload/download test

7ccac06

Formatting

52afe86

Chunked puts

ab1a5b2

Check keepalive in HTTP 1.0

65a4040

Remove webmachine/mochiweb

9333085

Retain mochijson2 and mochinum to ease transition

Remove mochijson2/mochinum - included in rhc

a011b1d

Remove leading empty bin if it occurs

3a1714f

Where URI starts "/" will add a leading <<>> to the split path - which is confusing and easy to forget about.

Use nomatch, and context to be last not first

01ddd3f

Address initial feedback

Optimise clock management

81beed1

Also provides functions for converting Last Modified Date in KV GET.

Tidy-up initial configuration

c553fe2

Fix return tuple from security function

697631e

Update riak_api_web_security.erl

ee81ccc

Be explicit peer is IP

adf5b51

Missed peer()

fa6a194

Correct security type for peer

94ec8f7

Align params type with uri_string:dissect_query

9d66650

martinsumner added 5 commits April 24, 2026 15:40

Format fix

917f6cd

Make type definition more consistent

ffa7553

And export type

a417d2e

Increase default backlog

99b9bab

128 aligns with mochiweb - otherwise some tests with riak_test that use large bursts of connections may fail

Basic doc added, with doc-inspired tidying

1ce27c5

martinsumner marked this pull request as ready for review May 1, 2026 11:14

martinsumner added 2 commits May 5, 2026 12:56

Allow overlapping routes with different methods

8696bd2

Without requiring knowledge when implementing routes of the routes priority relative to others.

Minor doc fix

cf7c0da

ThomasArts reviewed May 7, 2026

f34nk reviewed May 8, 2026

Comment thread README.md Outdated

ThomasArts reviewed May 11, 2026

f34nk reviewed May 12, 2026

martinsumner and others added 4 commits May 12, 2026 15:44

Apply suggestions from code review - comment typos

126431e

Co-authored-by: Thomas Arts <thomas.arts@quviq.com>

Check additional percent encoding scenarios

f40e261

Formatting issue

a96f6c4

hmmr requested changes May 14, 2026

martinsumner added 9 commits May 14, 2026 15:37

Changes following initial review

27d39b8

Extend timeout on connection test

b9e1271

Typo missed from review

86fd824

Make path normalized or split/decoded

ddb99d3

Handle different error scenarios

Further updates following review

db87d3b

Ordering of directives

1c87497

Limit chunk size to < 4GB

16e8423

Change definition of timeout

b6e3f56

Change PB listener to use new spec - match http

c6d9b60

hmmr approved these changes May 18, 2026

Comment thread src/riak_api_web.erl Outdated

f34nk and others added 3 commits May 18, 2026 14:18

Add acceptor pool drain EUnit test

0ff0c13

Add a regression test for the empty acceptor pool case after max pool use and worker exits. Register the new module in the erlfmt file list.

Formatting

0968cbc

simplify/clarify acceptor pool management on accepted

18a1f78
