Best way to write compare-functions

The workhorse of many (JavaScript) programs is sort(). When you want to sort objects (or numbers, actually) you need to supply a compare-function. Those are nice functions because they are very testable and reusable, but sorting is also a bit expensive (perhaps the most expensive thing your program does) so you want them fast.

For the rest of this article I will assume we are sorting som Order objects based status, date and time (all strings).

The naive way to write this is:

function compareOrders1(a,b) {
if ( a.status < b.status ) return -1;
if ( a.status > b.status ) return 1;
if ( a.date < b.date ) return -1;
if ( a.date > b.date ) return 1;
if ( a.time < b.time ) return -1;
if ( a.time > b.time ) return 1;
return 0;
}

There are somethings about this that is just not appealing: too verbose, risk of a typo, and not too easy to read.

Another option follows:

function cmpStrings(a,b) {
if ( a < b ) return -1;
if ( a > b ) return 1;
return 0;
}

function compareOrders2(a,b) {
return cmpStrings(a.status,b.status)
|| cmpStrings(a.date ,b.date )
|| cmpStrings(a.time ,b.time );
}

Note that the first function (cmpStrings) is highly reusable, so this is shorter code. However, there is still som repetition, so I tried:

function cmpProps(a,b,p) {
return cmpStrings(a[p], b[p]);
}

function compareOrders3(a,b) {
return cmpProps(a,b,'status')
|| cmpProps(a,b,'date')
|| cmpProps(a,b,'time');
}

There is something nice about not repeating status, date and time, but there is something not so appealing about quoting them as strings. If you want to go more functional you can do:

function compareOrders4(a,b) {
function c(p) {
return cmpStrings(a[p],b[p]);
}
return c('status') || c('date') || c('time');
}

To my taste, that is a bit too functional and obscure. Finally, since it comes to mind and some people may suggest it, you can concatenate strings, like:

function compareOrders5(a,b) {
return cmpStrings(
a.status + a.date + a.time,
b.status + b.date + b.time
);
}

Note that in case fields “overlap” and/or have different length, this could give unexpected results.

Benchmarks

I tried the five different compare-functions on two different machines and got this kind of results (i5 N=100000, ARM N=25000), with slightly different parameters.

In these tests I used few unique values of status and date to often hit the entire compare function.

(ms)   i5    i5    ARM
#1 293 354 507
#2 314 351 594
#3 447 506 1240
#4 509 541 1448
#5 866 958 2492

This is quite easy to understand. #2 does exactly what #1 does and the function overhead is eliminated by the JIT. #3 is trickier for the JIT since a string is used to read a property. That is true also for #4, which also requires a function to be generated. #5 puts two strings on the stack needlessly when often only the first two strings are needed to compare anyway.

Conclusion & Recommendation

My conclusion is that #3 may be the best choice, despite it is slightly slower. I find #2 clearly preferable to #1, and I think #4 and #5 should be avoided.

Leave a Comment


NOTE - You can use these HTML tags and attributes:
<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>

Time limit is exhausted. Please reload CAPTCHA.

This site uses Akismet to reduce spam. Learn how your comment data is processed.