
fix(super-editor): emit OOXML property children in canonical schema order #3702

Open
dbrogan-OSI wants to merge 1 commit into superdoc-dev:main from dbrogan-OSI:fix/exporter-ooxml-property-child-order


@dbrogan-OSI


Problem

OOXML property containers (w:rPr, w:pPr, w:numPr, w:trPr, w:tcPr, w:tblPr, w:tblPrEx, w:sectPr) are defined by ECMA-376 as xsd:sequence complex types (CT_RPr, CT_PPr, CT_NumPr, …). Their child elements must appear in a fixed schema order.

On export, SuperDoc emits these children in JavaScript insertion order:

  • decodeProperties() (v3/handlers/utils.js) iterates Object.keys(attrs), i.e. ProseMirror attribute insertion order.
  • generateRunProps() (exporter.js) emits run marks in mark-array order.

Neither matches the schema sequence, so children are emitted out of schema order: w:numId before w:ilvl on every list item, w:color before w:rFonts, w:hidden before w:trHeight in w:trPr, w:tblLook before w:tblCellMar in w:tblPr, etc.
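The insertion-order behaviour described above can be reproduced in isolation. The snippet below is a minimal sketch, not the actual decodeProperties() code: the numberingProperties object and the string-based serialization are assumptions made for illustration only.

```javascript
// Hypothetical attrs object. Key order reflects the order in which the
// properties were assigned, not the CT_NumPr schema sequence.
const numberingProperties = {};
numberingProperties.numId = 3;
numberingProperties.ilvl = 0;

// Emit one child per key in Object.keys() order. For non-numeric string
// keys this is insertion order.
const children = Object.keys(numberingProperties).map(
  (key) => `<w:${key} w:val="${numberingProperties[key]}"/>`
);

console.log(children.join('\n'));
// <w:numId w:val="3"/>
// <w:ilvl w:val="0"/>
```

The output puts w:numId ahead of w:ilvl, which is the ordering shown in the "Before" example.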

Why this is easy to miss

Desktop Word silently repairs out-of-order children on open, so the document looks fine there. Word for the web rejects the package and opens the document read-only as "corrupt." We hit this repeatedly against real exports; it is the cause of "document opens read-only in Word for the web."

Before / after (a list item)

Before (rejected by Word online):

<w:numPr>
  <w:numId w:val="3"/>
  <w:ilvl w:val="0"/>
</w:numPr>

After (canonical CT_NumPr order):

<w:numPr>
  <w:ilvl w:val="0"/>
  <w:numId w:val="3"/>
</w:numPr>

Fix

Add a shared canonical child-order table keyed by container local name (v3/handlers/ooxml-property-order.js) and stable-sort the emitted children at the two emit sites:

  • createNestedPropertiesTranslator.decode in utils.js: covers w:rPr, w:pPr, w:numPr, w:trPr, w:tcPr, w:tblPr.
  • generateRunProps in exporter.js: the run-mark w:rPr path that bypasses the v3 translator.

Unknown/extension elements (w14:*, mc:AlternateContent) sort after all known children with their relative order preserved (stable). Containers without a canonical sequence are returned unchanged.
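The sorting rule described above could look roughly like the following. This is a sketch under stated assumptions, not the contents of ooxml-property-order.js: the names CHILD_ORDER, localName, and sortChildren are hypothetical, the `{ name }` element shape is assumed, and the table lists only a small subset of each schema sequence.

```javascript
// Illustrative order table keyed by container local name (subset only).
const CHILD_ORDER = {
  numPr: ['ilvl', 'numId'],
  rPr: ['rStyle', 'rFonts', 'b', 'i', 'color', 'sz'],
};

// Strip the namespace prefix: 'w:rPr' -> 'rPr'.
function localName(qualifiedName) {
  const idx = qualifiedName.indexOf(':');
  return idx === -1 ? qualifiedName : qualifiedName.slice(idx + 1);
}

function sortChildren(containerName, children) {
  const order = CHILD_ORDER[localName(containerName)];
  // No canonical sequence for this container: return the input unchanged.
  if (!order) return children;

  const rank = (child) => {
    // Only w:-prefixed elements are matched against the table, so that
    // extension elements such as w14:* always count as unknown.
    if (!child.name.startsWith('w:')) return order.length;
    const i = order.indexOf(localName(child.name));
    return i === -1 ? order.length : i;
  };

  // Array.prototype.sort is stable (ES2019+), so children with equal rank,
  // including all unknown ones, keep their relative input order.
  return [...children].sort((a, b) => rank(a) - rank(b));
}

const sorted = sortChildren('w:rPr', [
  { name: 'w14:ligatures' },
  { name: 'w:color' },
  { name: 'mc:AlternateContent' },
  { name: 'w:rFonts' },
]);
console.log(sorted.map((c) => c.name));
// [ 'w:rFonts', 'w:color', 'w14:ligatures', 'mc:AlternateContent' ]
```

Giving every unknown child the same rank (one past the last known index) is what lets the stable sort keep extension elements in their original relative order.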

This generalizes the existing CT_TC_MAR_CHILD_ORDER pattern (tcMar/tblCellMar) into one reusable table rather than per-container wrappers.

w:sectPr and w:tblPrEx sequences are included in the table for completeness; their emission currently runs through separate v2 paths, so wiring them up is left as a follow-up.

Tests

  • New ooxml-property-order.test.js: utility unit tests for all wired containers + integration through real w:numPr, w:trPr, and w:rPr translators (13 tests).
  • All existing converter/exporter suites pass.

Test plan

  • pnpm --filter super-editor test for the changed converter/exporter tests
  • Open an exported list/table doc in Word for the web and confirm it is no longer read-only / flagged corrupt

Made with Cursor

…rder

OOXML property containers (w:rPr, w:pPr, w:numPr, w:trPr, w:tcPr, w:tblPr,
w:tblPrEx, w:sectPr) are defined by ECMA-376 as xsd:sequence complex types
(CT_RPr, CT_PPr, ...) whose child elements MUST appear in a fixed order.

On export, SuperDoc emitted these children in JavaScript insertion order:
decodeProperties() iterates Object.keys(attrs) (ProseMirror attr order) and
generateRunProps() emits run marks in mark-array order. Neither matches the
schema sequence, so children came out arbitrarily (e.g. numId before ilvl on
every list item, color before rFonts, trPr hidden before trHeight, tblPr
tblLook before tblCellMar).

Desktop Word silently repairs out-of-order children on open and hid the bug,
but Word for the web rejects the package and opens the document read-only as
"corrupt" (reproduced repeatedly against real exports).

Fix: add a shared canonical child-order table keyed by container local name
(ooxml-property-order.js) and stable-sort the emitted children at the two emit
sites: createNestedPropertiesTranslator.decode (covers rPr/pPr/numPr/trPr/
tcPr/tblPr) and exporter.js generateRunProps (the run-mark rPr path). Unknown
or extension elements (w14:*, mc:AlternateContent) sort after known children
with their relative order preserved. This generalizes the existing
CT_TC_MAR_CHILD_ORDER pattern (tcMar/tblCellMar) into one reusable table.

The fix is independent of translator array order, so the non-canonical
ordering of the propertyTranslators arrays themselves no longer matters.

Co-authored-by: Cursor <cursoragent@cursor.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3415911db0

Comment on lines +24 to +25
rPr: [
'rStyle',


P2: Preserve paragraph rPr track-change order

When w:rPr is the paragraph-mark run properties under w:pPr, the same rPr translator can emit trackInsert/trackDelete as w:ins/w:del; those children belong before the normal run-property sequence for CT_ParaRPr. Because this new order table omits them, the sorter treats them as unknown and moves them after known props, so a valid tracked paragraph mark such as <w:rPr><w:ins/><w:b/></w:rPr> round-trips as <w:b/><w:ins/>, which violates the fixed OOXML sequence and can still make Word reject the document.
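The reordering described in this comment can be illustrated with a small sketch. The tables and the sortByOrder helper below are hypothetical and list only a few elements; PARA_MARK_RPR_ORDER is one assumed way to address the issue (a separate order for paragraph-mark rPr with ins/del first), not necessarily what the PR will adopt.

```javascript
// Subset of the run-property order, without track-change elements.
const RPR_ORDER = ['rStyle', 'rFonts', 'b', 'i', 'color'];
// Assumed variant for paragraph-mark rPr: track-change elements come first.
const PARA_MARK_RPR_ORDER = ['ins', 'del', ...RPR_ORDER];

// Stable sort by index in `order`; names not in the table sort last.
function sortByOrder(order, names) {
  const rank = (name) => {
    const i = order.indexOf(name);
    return i === -1 ? order.length : i;
  };
  return [...names].sort((a, b) => rank(a) - rank(b));
}

const input = ['ins', 'b'];
console.log(sortByOrder(RPR_ORDER, input));           // [ 'b', 'ins' ]
console.log(sortByOrder(PARA_MARK_RPR_ORDER, input)); // [ 'ins', 'b' ]
```

With the first table, ins is treated as unknown and pushed after b, which is the round-trip regression the comment describes; with the second, the original order is preserved.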
