CIS 610: Assignment 4 Analysis

I implemented a generic version of sort, meaning that it will sort any ASCII strings, not only those with uppercase characters. I wanted to pit my sort against UNIX sort(1). As a test case, I took /usr/share/dict/words, which contains 479625 strings. As it is already pretty much sorted, I created another file which contains the same strings but in reverse order:

perl -e 'while (<>) { push @words, $_ } print reverse @words;' < /usr/share/dict/words > reverse-words.txt

Going Versus System sort(1):

[dmitri@localhost ass4]$ time ./mysort -f /usr/share/dict/words >/dev/null

real    0m9.096s
user    0m8.437s
sys     0m0.656s
[dmitri@localhost ass4]$ time ./mysort -f /tmp/reverse-words.txt >/dev/null

real    0m9.175s
user    0m8.557s
sys     0m0.592s
[dmitri@localhost ass4]$ time /bin/sort /usr/share/dict/words >/dev/null

real    0m11.333s
user    0m11.125s
sys     0m0.180s
[dmitri@localhost ass4]$ time /bin/sort /tmp/reverse-words.txt >/dev/null

real    0m11.850s
user    0m11.685s
sys     0m0.160s
[dmitri@localhost ass4]$
        

We can see that my sort ran 25% faster. Now, for the bad news. I wondered how quicksort is going to stack up against our efficient radix sort. Fortunately, I did not have to implement it myself, as there is already qsort(3) in the standard library. The -q flag in my program causes it to use quick sort instead:

Going Versus Quicksort:

[dmitri@localhost ass4]$ time ./mysort -q -f /usr/share/dict/words >/dev/null

real    0m3.838s
user    0m3.280s
sys     0m0.560s
[dmitri@localhost ass4]$ time ./mysort -q -f /tmp/reverse-words.txt >/dev/null

real    0m4.137s
user    0m3.564s
sys     0m0.568s
[dmitri@localhost ass4]$
        

We can see that quicksort, while sensitive to whether the input is sorted or not, still beats my radix sort by at least a factor of 2. Most words in that file are pretty short. I believe radix sort should be used for sets containing much longer strings, and quicksort used otherwise.