Replace post meta sync storage with dedicated sync_updates table #11068

josephfusco wants to merge 24 commits into WordPress:trunk from josephfusco:feature/sync-updates-table

Conversation
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list.

Core Committers: Use this line as a base for the props when committing in SVN.

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.
Test using WordPress Playground

The changes in this pull request can be previewed and tested using a WordPress Playground instance. WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

For more details about its limitations, check out the Limitations page in the WordPress Playground documentation.
peterwilsoncc left a comment:
First pass note inline.
This approach had occurred to me and I think it's probably quite a good idea.
But, as always, there is a but.
I know @m has been reluctant in the past to add new tables to the database schema. I guess (and Matt will be able to correct me) that's to avoid ending up with a database schema of dozens and dozens of tables.

Not all sites run the database upgrade routine on a regular basis (I think wordpress.org is often behind), as it can be quite a burden on a site. If this approach is to be taken, it would need to include a check for the table's existence in a couple of locations:

- The option_{$option} hook
- Before displaying the options field on the writing page
Some form of filtering would need to be available for sites that are using custom data stores.
Sorry to be that person. I'm not trying to be negative, just share knowledge I've learnt over the last few years.
@peterwilsoncc Thanks for the thorough review — please don't apologize, this is exactly the kind of institutional knowledge that's invaluable.
Completely agree. I believe I've audited every code path that reads the option.

When the setting is saved, the flow through options.php → update_option() writes to the wp_options table. No code is hooked onto that option change that would interact with the sync_updates table, so toggling the setting is safe regardless of table state.

The checkbox on the writing page only reads and writes the option. The code path that touches the table is the REST route registration in rest-api.php.

Let me know if I've missed anything or if you think additional safeguards would be worthwhile!
Yeah, this has been an issue, but I think it's primarily/only really an issue for database tables that are used for frontend features. Given that the proposed sync_updates table only serves collaborative editing in the admin, that concern seems less severe here.

I haven't touched custom tables for a long time, so I'm not aware of the state of the art here. But I do know that WooCommerce made the switch to using custom tables instead of postmeta for the sake of scalability, simplicity, and reliability. Doing a quick search in WPDirectory, I see that Jetpack, Yoast, Elementor, WPForms, Wordfence, and many others use custom tables as well.
355e7f2 to
7f57c63
Compare
This isn't quite correct. On multisite installs, site admins (i.e., non-super admins) don't get prompted to upgrade the database tables, and globally upgrading tables by super admins can be prevented using a filter:

add_filter( 'map_meta_cap', function( $caps, $cap ) {
	if ( 'upgrade_network' === $cap ) {
		$caps[] = 'do_not_allow';
	}
	return $caps;
}, 10, 4 );

To reproduce:
Keep in mind that the sub-site admin can enable site RTC on the writing page, even though they can't update the tables.
Not really relevant for the core schema.
Seems relevant to me, as plugins may opt to introduce new tables if the core schema doesn't provide an efficient way to store the desired data. In the same way, if a new feature for core can't be represented efficiently in the core schema, then so too should a new table be considered.
@westonruter Well, no, because my point wasn't about whether the tables would be more performant (I said it's probably a good idea in my comment); my point was that Matt has a reluctance to add more tables to the database schema. Whether that reluctance still remains is worth asking, but if it does then the plugin actions aren't relevant to core.
Co-authored-by: Peter Wilson <519727+peterwilsoncc@users.noreply.github.com>
Adds a db_version check before registering the collaboration REST routes so sites that haven't run the upgrade routine yet don't hit a fatal error from the missing sync_updates table.
Schedule a daily cron job to delete sync update rows older than 1 day, preventing unbounded table growth from abandoned collaborative editing sessions. The expiration is filterable via the wp_sync_updates_expiration hook.
Store the room identifier string directly instead of its md5 hash. Room strings like "postType/post:42" are short, already validated by the REST API schema, and storing them verbatim improves debuggability.
Hardcode WEEK_IN_SECONDS for cron cleanup, matching the auto-draft cleanup precedent in core. Remove the wp_sync_updates_expiration filter to keep the API surface minimal for v1.
The room string is already constrained by REST schema regex validation, but esc_html() future-proofs against the regex loosening or the message being reused in an HTML context.
Cover gaps in sync_updates test coverage:
- Cron deletes rows older than 7 days
- Cron preserves rows within the 7-day window
- Cron boundary behavior at exactly 7 days
- Cron selective deletion with mixed old and recent rows
- Cleanup hook registered in default-filters.php
- Sync routes not registered when db_version is below 61698
Co-authored-by: Weston Ruter <westonruter@gmail.com>
Co-authored-by: Weston Ruter <westonruter@gmail.com>
Extract the db_version >= 61698 and option check into a named function, following the precedent set by wp_check_term_meta_support_prefilter() in taxonomy.php.
Use the new helper at all remaining call sites: the script injection in collaboration.php, the cron cleanup guard, the REST route registration in rest-api.php, and the cron scheduling in admin.php. The cron handler now also checks the option, which is correct: if collaboration is disabled, cleanup should be skipped too.
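As an illustration of the guard logic this helper encapsulates (the helper's actual name isn't quoted in this thread, so the function below is a hypothetical stand-in; the 61698 threshold is taken from the commit messages):

```php
<?php
// Hypothetical stand-in for the availability guard described above:
// collaboration storage requires both a new-enough schema and the option.
function sync_updates_available( int $db_version, bool $option_enabled ): bool {
	// Sites below db_version 61698 have not created the sync_updates table yet.
	return $db_version >= 61698 && $option_enabled;
}
```

Every call site (script injection, cron guard, REST registration, cron scheduling) can then share this single check instead of repeating it.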
Add Playwright tests covering presence awareness, real-time sync, and undo/redo for the collaboration feature, along with shared test fixtures and utilities.
Extract repeated patterns across collaboration test files into CollaborationUtils methods: createCollaborativePost, insertBlockViaEvaluate, assertEditorHasContent, and assertAllEditorsHaveContent. Replace hardcoded timeout values with the exported SYNC_TIMEOUT constant. Fix docblock for get_updates_after_cursor in the WP_Sync_Storage interface.
In wp_delete_old_sync_updates():

	global $wpdb;

	$wpdb->query(
Given that this cleanup routine is running via cron, I wonder if it should be more resilient to large amounts of data removal. Imagine a scenario of 1,000 updates being removed in one query; we should perhaps batch-process this in reasonable amounts, or grab the correct data from a SELECT query first.
I think this would definitely be a concern in the case of deleting 1,000 posts or postmeta, given that actions are expected to run for each deletion. But a SQL DELETE like this should be very fast. For prior art, the delete_expired_transients() function does a similar DELETE query and it also lacks any LIMIT or batching. And this is even with option_value not being indexed. To ensure the deletion of wp_sync_updates is as fast or faster, an index could be added to created_at: https://github.com/WordPress/wordpress-develop/pull/11068/changes#r2879822934
Can we assume updates that are a week+ old are safe to delete? I asked Gemini this question and it said:
The wp_delete_old_sync_updates function unconditionally deletes updates older than 7 days.
- Impact: If a user does not sync for > 7 days, or if the latest "compaction" (snapshot) update is > 7 days old, the server will delete the history they rely on. When the client reconnects, the server will return "no updates" (or a partial list starting after the gap). The client will assume it is up-to-date or apply updates on top of an inconsistent state, leading to document corruption.
- Context: The server does not signal "History Lost" or force a resync. The removal of history is safer than the old infinite-growth model, but without a protocol to detect "too old" cursors (e.g., checking if requested_cursor < min_table_id), this causes silent data corruption for offline users.
@pkevan @westonruter Agreed - a single DELETE on the indexed created_at column should be fast even at volume, same as delete_expired_transients() handles it today without batching.
@mindctrl The Gemini analysis assumes a client could hold a stale cursor across the 7-day window, but the cursor is in-memory and resets to 0 on every editor load - a returning user always starts fresh.
The behavior here was modeled after wp_delete_auto_drafts(): daily cron, hardcoded 7-day threshold, non-filterable. Auto-drafts are orphaned posts from abandoned editing sessions; sync updates are orphaned rows from abandoned collaboration sessions. Compaction keeps row volume low during active use, so the cleanup is a safety net for abandoned sessions, not a routine data path.
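The cursor semantics being relied on here can be sketched with a small WordPress-free model (illustrative names, not the actual WP_Sync_Storage implementation): auto-increment IDs serve as cursors, compaction is a single atomic delete, and a returning client with cursor 0 simply receives every retained row.

```php
<?php
// Illustrative model of the append-only log semantics; not core code.
class Sync_Log_Model {
	private array $rows    = array(); // id => payload.
	private int   $next_id = 1;

	public function append( string $payload ): int {
		$id                = $this->next_id++;
		$this->rows[ $id ] = $payload;
		return $id; // The auto-increment ID doubles as the cursor.
	}

	public function get_updates_after_cursor( int $cursor ): array {
		return array_filter(
			$this->rows,
			fn( $id ) => $id > $cursor,
			ARRAY_FILTER_USE_KEY
		);
	}

	public function remove_updates_before_cursor( int $cursor ): void {
		// One atomic "DELETE WHERE id < cursor": no delete-then-re-add window.
		$this->rows = array_filter(
			$this->rows,
			fn( $id ) => $id >= $cursor,
			ARRAY_FILTER_USE_KEY
		);
	}
}
```

After compacting everything before the latest snapshot, a fresh editor load (cursor 0) still receives the snapshot row, which is why no 7-day-old history needs to survive the cron cleanup.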
	public function set_awareness_state( string $room, array $awareness ): bool {
		// Awareness is high-frequency, short-lived data (cursor positions, selections)
		// that doesn't need cursor-based history. Transients avoid row churn in the table.
		return set_transient( $this->get_awareness_transient_key( $room ), $awareness, MINUTE_IN_SECONDS );
	}
Will using transients that expire in 1 minute be somewhat wasteful resource-wise? Could this not just be implemented in the custom table with an appropriate cleanup routine?
Good question. Awareness is last-write-wins - only the latest state per room matters, no history. Transients are appropriate because they self-expire via the 1-minute TTL (handles presence disappearing when a user leaves) and on sites with an external object cache, set_transient() routes through wp_cache_set() so these never hit the database.
Moving it into the custom table would add more overhead - an INSERT per poll per user per room during active editing, plus another cleanup routine for stale awareness rows. The transient approach avoids that churn and keeps the table focused on ordered sync updates.
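The last-write-wins-with-TTL behavior described above can be modeled without WordPress (illustrative code, not the actual transient implementation; the 60-second default mirrors MINUTE_IN_SECONDS):

```php
<?php
// Illustrative model of last-write-wins awareness with a TTL; not core code.
class Awareness_Model {
	private array $states = array(); // room => array( 'state' => ..., 'expires' => timestamp ).

	public function set( string $room, array $state, int $now, int $ttl = 60 ): void {
		// Last write wins: each set() replaces the room's previous state outright,
		// so there is never more than one entry's worth of data per room.
		$this->states[ $room ] = array(
			'state'   => $state,
			'expires' => $now + $ttl,
		);
	}

	public function get( string $room, int $now ): ?array {
		$entry = $this->states[ $room ] ?? null;
		if ( null === $entry || $now >= $entry['expires'] ) {
			return null; // Expired presence disappears, like a lapsed transient.
		}
		return $entry['state'];
	}
}
```

A user who stops polling simply ages out after the TTL; no cron cleanup is needed for awareness data.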
	public function get_updates_after_cursor( string $room, int $cursor ): array {
		global $wpdb;

		// Snapshot the current max ID for this room to define a stable upper bound.
It would be useful to see some analysis of how these new queries perform with large data sets, to see whether any optimizations are possible or needed.
Co-authored-by: Weston Ruter <westonruter@gmail.com>
Replace the opaque JSON blob with dedicated columns so the sync_updates table is legible and queryable without JSON parsing. Removes the json_encode/json_decode round-trip on every write and read. No migration needed — the table has not shipped in a stable release. Props mindctrl.
…hfusco/wordpress-develop into feature/sync-updates-table
@westonruter @josephfusco As mentioned in Slack and on the ticket, Matt has approved an additional table for RTC. As this is a first design and wasn't discussed architecturally, I would like to slow right down and consider the architecture on the ticket prior to proceeding here. As you'll have seen, I've posted in Slack but will comment on the ticket once I've thought about things further. Prior to that discussion, I think it's best if this PR is put on hold so we can focus on any other RTC issues in the meantime. I think there are a few things in the JavaScript client and the polling class that need work in the meantime.
Co-authored-by: Weston Ruter <westonruter@gmail.com>
…ble. Reverts the column split from c920b97 and removes the round-trip test from ceccbb4. The sync_updates table is an append-only event log and should remain protocol-agnostic — the storage layer stores and orders; the sync server interprets. This reverts commit c920b97. This reverts commit ceccbb4.
The real-time collaboration sync layer currently stores messages as post meta, which works but creates side effects at scale. This moves it to a dedicated wp_sync_updates table purpose-built for the workload.

The beta1 implementation stores sync messages as post meta on a private wp_sync_storage post type. Post meta is designed for static key-value data, not high-frequency transient message passing. This mismatch causes:

- Cache thrashing — Every sync write triggers wp_cache_set_posts_last_changed(), invalidating site-wide post query caches unrelated to collaboration.
- Compaction race condition — The "delete all, re-add some" pattern in remove_updates_before_cursor() loses messages under concurrent writes. The window between delete_post_meta() and the add_post_meta() loop is unprotected.
- Cursor race condition — Timestamp-based cursors (microtime() * 1000) miss updates when two writes land within the same millisecond.

A purpose-built table with auto-increment IDs eliminates all three at the root: no post meta hooks fire, compaction is a single atomic DELETE, and auto-increment IDs guarantee unique ordering. The WP_Sync_Storage interface and WP_HTTP_Polling_Sync_Server are unchanged.

Also adds a wp_sync_storage filter so hosts can substitute alternative backends (Redis, WebSocket) without patching core, and includes a beta1 upgrade path that cleans up orphaned wp_sync_storage posts.

Credits
- meta_id cursor approach, awareness → transients, race condition testing (wordpress-develop #11067)
- meta_value indexing limitations

Trac ticket: https://core.trac.wordpress.org/ticket/64696
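The cursor race condition called out in the description can be shown with a toy example (illustrative values; not core code):

```php
<?php
// Toy demonstration of the timestamp-cursor race described in this PR:
// two writes that land in the same millisecond are indistinguishable to a
// timestamp cursor, while auto-increment IDs always order them uniquely.
$updates = array(
	array( 'id' => 1, 'ts' => 1700000000123, 'payload' => 'first' ),
	array( 'id' => 2, 'ts' => 1700000000123, 'payload' => 'second' ), // Same millisecond.
);

// A client that read 'first' stores its timestamp as the cursor...
$ts_cursor = 1700000000123;
// ...and then never receives 'second', whose timestamp is not strictly greater.
$missed_poll = array_filter( $updates, fn( $u ) => $u['ts'] > $ts_cursor );

// With auto-increment IDs, the same poll is unambiguous.
$id_cursor  = 1;
$delivered  = array_filter( $updates, fn( $u ) => $u['id'] > $id_cursor );
```

Polling with the timestamp cursor returns nothing, silently dropping the second update; polling with the ID cursor returns exactly the missing row.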
Use of AI Tools
Co-authored with Claude Code (Opus 4.6), used to synthesize discussion across related tickets and PRs into a single implementation. All code was reviewed and tested before submission.
This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.