Tag Archives: JavaScript

All JavaScript objects are not equally fast

One thing I like about JavaScript and Node.js is having JSON in the entire stack. I store JSON on disk, process JSON data server side, send JSON over HTTP, process JSON data client side, and the web GUI can easily present JSON (I work with Angular).

As a result of this, not all objects are created the same. Let's say I keep track of Entries: I have an Entry constructor that initializes new objects with all fields (no more, no less). At the same time I receive Entry objects as JSON data over the network.

A strategy is needed:

  1. Have a mix of raw JSON-Entries and objects that are instanceof Entry
  2. Create real Entry objects from all JSON data
  3. Only work with raw JSON-Entries

Note that if you don't go with (2) you can't use the prototype, expect objects to have functions, or use instanceof to identify objects.
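
A minimal sketch of the limitation, with Entry as a stand-in constructor:

function Entry(id) {
  this.id = id;
}

Entry.prototype.describe = function() {
  return 'Entry ' + this.id;
};

var e1 = new Entry(1);
var e2 = JSON.parse('{"id":2}');

console.log(e1 instanceof Entry);  // true
console.log(e2 instanceof Entry);  // false: just a plain object
console.log(e1.describe());        // Entry 1
console.log(e2.describe);          // undefined: no prototype functions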

Another perhaps not obvious aspect is that performance is not the same. When you create a JavaScript object using new, the runtime actually creates a hidden class with fast-to-access properties. Such object properties are faster than those of:

  • an empty object {} with properties set afterwards
  • an object created with JSON.parse()
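
The three origins, as a sketch (Point is the constructor from the benchmark code below):

var a = new Point();                      // constructor: stable hidden class
var b = {};                               // empty object...
b.a = -1; b.b = -1;                       // ...with properties set afterwards
var c = JSON.parse('{"a":-1,"b":-1}');    // object created by JSON.parse()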

I wrote a program to test this. The simplified explanation is that I obtained an array of objects that I then sorted/calculated a few (6) times. For a particular computer and problem size I got these results:

TIME   PARAMETER   DESCRIPTION
3.3s       R       Produce random objects using "new"
4.4s       L       Load objects from json-file using JSON.parse()
3.0s       L2      json-file, JSON.parse(), send raw objects to constructor
3.2s       L3      load objects using require() from a js-file

I will be honest and say that the implementation of the compare function sent to sort() matters. Some compare functions suffered more or less from the different object origins. Some compare functions are more JIT-optimised and run faster the second time. However, the consistent finding is that raw JSON objects are about 50% slower than objects created with new and a constructor function.

What is not presented above is the cost of parsing and creating objects.

My conclusion from this is that unless you have very strict performance requirements you can use the raw JSON-objects you get over the network.

Below is the source code (for Node.js). Apart from the parameters R, L, L2 and L3 there is also an S(tore) parameter. It creates the json- and js-files used by the Load options. So typically you run the program with the S option first, and then the other options. A typical run looks like this:

$ node ./obj-perf.js S
Random: 492ms
Store: 1122ms

$ node ./obj-perf.js R
Random: 486ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3350ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3361ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3346ms

$ node ./obj-perf.js L
Load: 376ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 4382ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 4408ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 4453ms

$ node ./obj-perf.js L2
Load: 654ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3018ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 2974ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 2890ms

$ node ./obj-perf.js L3
Load: 1957ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3436ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3264ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3199ms

The columns with numbers (110463, 110621, …) are checksums calculated between the sorts. They should be identical between runs; apart from that they don't matter.

const nodeFs = require('fs');

function Random(seed) {
  this._seed = seed % 2147483647;
  if (this._seed <= 0) this._seed += 2147483646;
}

Random.prototype.next = function () {
  return this._seed = this._seed * 16807 % 2147483647;
};

function Timer() {
  this.time = Date.now();
}

Timer.prototype.split = function() {
  var now = Date.now();
  var ret = now - this.time;
  this.time = now;
  return ret;
};

function Point() {
  this.a = -1;
  this.b = -1;
  this.c = -1;
  this.d = -1;
  this.e = -1;
  this.f = -1;
  this.x =  0;
}

function pointInit(point, rand) {
  var p;
  for ( p in point ) {
    point[p] = rand.next() % 100000;
  }
}

function pointLoad(json) {
  var p;
  var point = new Point();
  for ( p in point ) {
    point[p] = json[p];
  }
  return point;
}

function pointCmp(a,b) {
  return pointCmpX[a.x](a,b,a.x);
}

function pointCmpA(a,b) {
  if ( a.a !== b.a ) return a.a - b.a;
  return pointCmpB(a,b);
}

function pointCmpB(a,b) {
  if ( a.b !== b.b ) return a.b - b.b;
  return pointCmpC(a,b);
}

function pointCmpC(a,b) {
  if ( a.c !== b.c ) return a.c - b.c;
  return pointCmpD(a,b);
}

function pointCmpD(a,b) {
  if ( a.d !== b.d ) return a.d - b.d;
  return pointCmpE(a,b);
}

function pointCmpE(a,b) {
  if ( a.e !== b.e ) return a.e - b.e;
  return pointCmpF(a,b);
}

function pointCmpF(a,b) {
  if ( a.f !== b.f ) return a.f - b.f;
  return pointCmpA(a,b);
}

var pointCmpX = [pointCmpA,pointCmpB,pointCmpC,pointCmpD,pointCmpE,pointCmpF];

function pointDist(a,b) {
  return Math.min(
    (a.a-b.a)*(a.a-b.a),
    (a.b-b.b)*(a.b-b.b),
    (a.c-b.c)*(a.c-b.c),
    (a.d-b.d)*(a.d-b.d),
    (a.e-b.e)*(a.e-b.e),
    (a.f-b.f)*(a.f-b.f)
  );
}

function getRandom(N) {
  var i;
  var points = new Array(N);
  var rand   = new Random(14);

  for ( i=0 ; i<N ; i++ ) {
    points[i] = new Point();
    pointInit(points[i], rand);
  }
  return points;
}

function test(points) {
  var i,j;
  var dist;
  var dists = [];

  for ( i=0 ; i<6 ; i++ ) {
    dist = 0;
    for ( j=0 ; j<points.length ; j++ ) {
      points[j].x = i;
    }
    points.sort(pointCmp);
    for ( j=1 ; j<points.length ; j++ ) {
      dist += pointDist(points[j-1],points[j]);
    }
    dists.push(dist);
  }
  return 'DISTS=' + dists.join(', ');
}

function main_store(N) {
  var timer = new Timer();
  var points = getRandom(N);
  console.log('Random: ' + timer.split() + 'ms');
  nodeFs.writeFileSync('./points.json', JSON.stringify(points));
  nodeFs.writeFileSync('./points.js', 'exports.points=' +
                                      JSON.stringify(points) + ';');
  console.log('Store: ' + timer.split() + 'ms');
}

function main_test(points, timer) {
  var i, r;
  for ( i=0 ; i<3 ; i++ ) {
    r = test(points);
    console.log(r + ' : ' + timer.split() + 'ms');
  }
}

function main_random(N) {
  var timer = new Timer();
  var points = getRandom(N);
  console.log('Random: ' + timer.split() + 'ms');
  main_test(points, timer);
}

function main_load() {
  var timer = new Timer();
  var points = JSON.parse(nodeFs.readFileSync('./points.json'));
  console.log('Load: ' + timer.split() + 'ms');
  main_test(points, timer);
}

function main_load2() {
  var timer = new Timer();
  var points = JSON.parse(nodeFs.readFileSync('./points.json')).map(pointLoad);
  console.log('Load: ' + timer.split() + 'ms');
  main_test(points, timer);
}

function main_load3() {
  var timer = new Timer();
  var points = require('./points.js').points;
  console.log('Load: ' + timer.split() + 'ms');
  main_test(points, timer);
}

function main() {
  var N = 300000;
  switch ( process.argv[2] ) {
  case 'R':
    main_random(N);
    break;
  case 'S':
    main_store(N);
    break;
  case 'L':
    main_load();
    break;
  case 'L2':
    main_load2();
    break;
  case 'L3':
    main_load3();
    break;
  default:
    console.log('Unknown mode=' + process.argv[2]);
    break;
  }
}

main();

JavaScript: await async

With Node.js version 8 there is finally a truly attractive alternative to good old callbacks.

I was never a fan of promises, and implementing await-async as a library is not pretty. Now when await and async are keywords in JavaScript things change.

The program below demonstrates a simple async function doing IO: ascertainDir. It creates a directory, but if the directory already exists no error is thrown (if there is already a file with the same name no error is thrown either, which is a bug, but it will do for the purpose of this article).

There are four modes of the program: CB (callback), PROMISE, AWAIT-LIB and AWAIT-NATIVE. Creating a folder (x) should work. Creating a folder in a nonexistent folder (x/x/x) should fail. Below is the output of the program, and as you see the end result is the same for the different asynchronous strategies.

$ node ./await-async.js CB a
Done: a
$ node ./await-async.js CB a/a/a
Done: Error: ENOENT: no such file or directory, mkdir 'a/a/a'

$ node ./await-async.js PROMISE b
Done: b
$ node ./await-async.js PROMISE b/b/b
Done: Error: ENOENT: no such file or directory, mkdir 'b/b/b'

$ node ./await-async.js AWAIT-LIB c
Done: c
$ node ./await-async.js AWAIT-LIB c/c/c
Done: Error: ENOENT: no such file or directory, mkdir 'c/c/c'

$ node ./await-async.js AWAIT-NATIVE d
Done: d
$ node ./await-async.js AWAIT-NATIVE d/d/d
Done: Error: ENOENT: no such file or directory, mkdir 'd/d/d'

The program itself follows:

     1	var nodefs = require('fs')
     2	var async = require('asyncawait/async')
     3	var await = require('asyncawait/await')
     4	
     5	
     6	function ascertainDirCallback(path, callback) {
     7	  if ( 'string' === typeof path ) {
     8	    nodefs.mkdir(path, function(err) {
     9	      if (!err) callback(null, path)
    10	      else if ('EEXIST' === err.code) callback(null, path)
    11	      else callback(err, null)
    12	    })
    13	  } else {
    14	    callback('mkdir: invalid path argument')
    15	  }
    16	};
    17	
    18	
    19	function ascertainDirPromise(path) {
    20	  return new Promise(function(fullfill,reject) {
    21	    if ( 'string' === typeof path ) {
    22	      nodefs.mkdir(path, function(err) {
    23	        if (!err) fullfill(path)
    24	        else if ('EEXIST' === err.code) fullfill(path)
    25	        else reject(err)
    26	      })
    27	    } else {
    28	      reject('mkdir: invalid path argument')
    29	    }
    30	  });
    31	}
    32	
    33	
    34	function main() {
    35	  var method = 0
    36	  var dir    = 0
    37	  var res    = null
    38	
    39	  function usage() {
    40	    console.log('await-async.js CB/PROMISE/AWAIT-LIB/AWAIT-NATIVE directory')
    41	    process.exit(1)
    42	  }
    43	
    44	  switch ( process.argv[2] ) {
    45	  case 'CB':
    46	  case 'PROMISE':
    47	  case 'AWAIT-LIB':
    48	  case 'AWAIT-NATIVE':
    49	    method = process.argv[2]
    50	    break
    51	  default:
    52	    usage();
    53	  }
    54	
    55	  dir = process.argv[3]
    56	
    57	  if ( process.argv[4] ) usage()
    58	
    59	  switch ( method ) {
    60	  case 'CB':
    61	    ascertainDirCallback(dir, function(err, path) {
    62	      console.log('Done: ' + (err ? err : path))
    63	    })
    64	    break
    65	  case 'PROMISE':
    66	    res = ascertainDirPromise(dir)
    67	    res.then(function(path) {
    68	      console.log('Done: ' + path)
    69	    },function(err) {
    70	      console.log('Done: ' + err)
    71	    });
    72	    break
    73	  case 'AWAIT-LIB':
    74	    (async(function() {
    75	      try {
    76	        res = await(ascertainDirPromise(dir))
    77	        console.log('Done: ' + res)
    78	      } catch(e) {
    79	        console.log('Done: ' + e)
    80	      }
    81	    })());
    82	    break
    83	  case 'AWAIT-NATIVE':
    84	    (async function() {
    85	      try {
    86	        res = await ascertainDirPromise(dir)
    87	        console.log('Done: ' + res)
    88	      } catch(e) {
    89	        console.log('Done: ' + e)
    90	      }
    91	    })();
    92	    break
    93	  }
    94	}
    95	
    96	main()

Please note:

  1. The anonymous function on line 74 would not be needed if main() itself was async()
  2. The anonymous function on line 84 would not be needed if main() itself was async
  3. A function that returns a Promise (line 19) works as an async function without the async keyword.

Callback
Callback is the old, simple method of dealing with asynchronous things in JavaScript. A major complaint has been “callback hell”: if you call several functions in sequence it can get rather messy. I can agree with that, BUT I think each asynchronous call deserves its own error handling anyway (and with proper error handling the other options tend to be equally tedious).

Promise
I don't think using a promise (66-71) is very nice. It is of course a matter of habit. One thing is that not all requests on the success path are actual successes in real life, and not all errors are errors (like in ascertainDir). Very commonly you make an HTTP request which itself is good, but the data you receive is not good, so you want to proceed with error handling. This means that the fulfill case needs to execute the same code as the reject case for some “successful” replies. Promises can be chained, but that typically results in ignoring proper error handling.
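
A sketch of that situation (httpGet and the shape of reply are made up for illustration):

function getUser(id) {
  return httpGet('/user/' + id).then(function(reply) {
    if ( 'ERROR' === reply.status ) {
      // the request itself succeeded, but the reply means failure:
      // turn the fulfilled promise into a rejection
      return Promise.reject(new Error(reply.message))
    }
    return reply.user
  })
}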

asyncawait library
I think the syntax of the asyncawait library is rather horrible, but it works as a proof-of-concept for the real thing.

async await native keywords
With the async/await keywords in JavaScript, suddenly asynchronous code can be handled just like in Java or C#. Since it is familiar it is appealing! No doubt it is clean and practical. I would hesitate to mix it with callbacks or promises, and would rather wait until I can do a complete rewrite.

Common sources of bugs in JavaScript are people trying to return from within callback (or promise) functions, people not realising that the rest of the code continues to run after the asynchronous call, or things related to variable scopes. I guess in most cases await/async makes these things cleaner and easier, but I would expect problems where it causes unexpected effects when not properly used.
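
A sketch of that first mistake, and what it looks like with async/await instead (util.promisify has been part of Node since version 8):

var util   = require('util')
var nodefs = require('fs')

// The classic mistake: the inner return goes nowhere, and
// getSizeBroken() itself returns undefined before the callback runs
function getSizeBroken(path) {
  nodefs.stat(path, function(err, stats) {
    return stats.size
  })
}

// With async/await the intent reads as written
var stat = util.promisify(nodefs.stat)

async function getSize(path) {
  return (await stat(path)).size
}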

Finally, if you start using the async/await keywords there is no polyfill or fallback for older browsers (maybe Babel can do that for you). As usual, IE seems to lag behind, and you can forget about Node v6 (or earlier). Depending on your situation, this could be a show stopper or no issue at all.

Watch something?
For more details, I can recommend this video on 5 architectures of asynchronous JavaScript.

Lodash Performance Sucks!

To continue my Functional Programming Sucks series of posts I will have a closer look at reduce().

I have complained about Lodash (and Underscore) for different reasons. One complaint was performance, but there I had just read the code and presumed it was going to be slow, without measuring. Then I complained about the performance of Functional Programming in general.

I thought it would be interesting to “improve” the Functional code with Lodash functions, and to my surprise (I admit I was both wrong and surprised) I found Lodash made it faster! After reading a little more about it I discovered this is a well known fact.

So, here are four different implementations of a function that checks if the elements (numbers) in an array are ordered (cnt is incremented if the array is sorted, such was the original problem).

// Standard reduce()
    this.test = function(p) {
        if ( false !== p.reduce(function(acc,val) {
            if ( false === acc || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

// Lodash reduce(), and some other Lodash waste
    this.test = function(p) {
        if ( false !== LO.reduce(p,function(acc,val) {
            if ( false === acc || val < acc ) return false;
    //      if ( !LO.isNumber(acc) || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

// My own simpleReduce(), implemented in 4 minutes (see below)
    this.test = function(p) {
        if ( false !== simpleReduce(p,function(acc,val) {
            if ( false === acc || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

// A simple imperative version
    this.test = function(p) {
        var i;
        for ( i=1 ; i < p.length ; i++ ) {
            if ( p[i] < p[i-1] ) return;
        }
        cnt++;
    };

// my own implementation reduce()
    function simpleReduce(array, func, initval) {
         var i;
         var v = initval;
         for ( i=0 ; i<array.length ; i++ ) {
             v = func(v, array[i]);
         }
         return v;
    }

The interesting thing here is that the standard library reduce() is the slowest.
However, my simpleReduce is faster than Lodash reduce().

(seconds)                      ======== reduce() ========
                               Std Lib   Lodash   Simple   Imperative
Raspberry Pi v1 (ARMv6 @ 700)    21        13      9.3        4.8
MacBook Air (Core i5 @ 1400)      0.46      0.23   0.19       0.16

Conclusion
The conclusion is that, from a performance perspective, Functional Programming sucks. Lodash sucks too, but a little less than the standard library (however, if you decorate all your code with isEmpty, isString, isNumber and that crap it will get worse).

That said, the generic nature of Lodash comes at a cost. The simplest reduce() imaginable outperforms Lodash. As I see it, this leaves Lodash in a pretty bad (or small) place:

  • Compared to the standard library it is an extra dependency with limited performance benefits
  • The generic nature of Lodash comes at both a performance cost and it allows for sloppy coding
  • A hand-written reduce() outperforms Lodash and is a good exercise for anyone to write. I expect this is quite true also for other functions like take or takeRight (see the sketch after this list).
  • For best performance, avoid Functional Programming (and in this case the imperative version is arguably more readable than the FP reduce() versions)
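
For example, a hand-written take() in the same spirit could look like this (a sketch I have not benchmarked):

// my own implementation of take()
    function simpleTake(array, n) {
        var i;
        var l = n < array.length ? n : array.length;
        var r = new Array(l);
        for ( i=0 ; i<l ; i++ ) {
            r[i] = array[i];
        }
        return r;
    }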

What's up with the Standard Library???
JavaScript is a scripted language (interpreted with a JIT compiler) that has a standard library written in C++. How can anything written in JavaScript execute faster than anything in the standard library that does the same thing?

First, kudos to the JIT designers! Amazing job! Perhaps the standard library people can learn from you?

I can imagine the standard library functions are doing some tests or validations that are somehow required by the standard, and that a faster and less strict version of reduce() would possibly break existing code (although this sounds far fetched).

I can almost not imagine that there is a cost of going from JavaScript to native and back to JavaScript: that function calls to native code come with overhead, like going from user space to kernel space. It sounds strange.

I have read that there are optimization techniques applied to Lodash (like lazy evaluation), but I certainly didn't do anything like that in my simpleReduce().

For Node.js, optimizing the standard library truly would make sense. In the standard library of a single-threaded server application, every cycle counts.

UPDATE: I tried replacing parts of the above code with native code: 1) the lambda function that is passed to reduce(), 2) the imperative version. That is, I wrote C++ code for V8 and used it instead of JavaScript code. In both cases this was slower! Obviously there is some overhead in going between native code and the JavaScript JIT, and for rather small functions this overhead makes C++ “slower” than JavaScript. My idea was to write a C++ reduce() function, but I think the two functions I wrote are enough to show what is happening here. Conclusion: don't write small native C++ functions for performance, and for maximum performance it can be worth rewriting the standard library in JavaScript (although doing so is insane)!

All FP-sucks related articles
Functional Programming Sucks
Underscore.js sucks! Lodash sucks!
Functional Programming Sucks! (it is slow)
Lodash Performance Sucks! (this one)

Functional Programming Sucks! (it is slow)

Update 2017-07-17: Below I present numbers showing that functional code is slower than imperative code. It seems this has changed with newer versions of Node.js: functional code has not become faster, but imperative code has become slower. You can read a little more about it in the comments. I will look more into this. Keep in mind that the findings below may be more accurate for Node.js v4-6 than for v8.

Functional programming is very popular with contemporary JavaScript programmers. As I have written before, Functional programming sucks and functional libraries for JavaScript also suck.

In this post I will explain more why Functional Programming sucks. I will start with the conclusion. Read on as long as you want more details.

Functional Programming practices are bad for performance
It is very popular to feed lambda functions to map(), reduce(), filter() and others. If you do this carelessly the performance loss is significant.

It is also popular to work with immutable data: you avoid functions that change (mutate) current state (side effects) and instead produce a new state (a pure function). This puts a lot of pressure on the garbage collector and it can destroy performance.

The Benchmark Problem
Sometimes I entertain myself solving problems on Hackerrank.com. I particularly like the mathematical challenges in the Project Euler section (Project Euler is also an independent organisation – HackerRank uses the Project Euler problems to create programming challenges).

This article refers to Project Euler 32. I will not go into details, but the solution is basically:

  1. Generate all permutations of the numbers 1,2,3,4,5,6,7,8,9 (there are 9! of them)
  2. For each permutation, check if it is “good” (very few are)
  3. Print the sum of the good instances

The first two steps give good benchmark problems. I have made different implementations of (1) and (2) and then compared the results.

Benchmark Results
I have three different permutation generators (all recursive functions):

  1. Pure function, immutable data (it may not be strictly pure)
  2. Function that mutates its own internal state, but not its input
  3. Function that mutates shared data (no allocation/garbage collection)

I also have three different test functions:

  1. Tests the original Project Euler problem
  2. Simplified test using reduce() and a lambda function
  3. Simplified test implemented as a standard loop

I benchmarked on two different systems using Node.js version 6. I have written elsewhere that Node.js performance on Raspberry Pi sucks.

(seconds)                      == Project Euler Test ==   ===== Simplified Test =====
Test Function:                                            Functional   Imperative
Permutation Gen:               Pure    Semi    Shared     Shared       Shared   Pure
Raspberry Pi v1 (ARMv6 @ 700)  69      23      7.4        21           3.7      62
MacBook Air (Core i5 @ 1400)    0.77    0.29    0.13       0.40         0.11     0.74

Comparing columns 1-2-3 shows the performance of different generators (for Project Euler test)
Comparing columns 4-5 shows the performance of two different test functions (using fast generator)
Comparing columns 5-6 shows the performance of two different generators (for fast simple test)

This shows that the benefit of using shared/mutable data (not running the garbage collector) instead of immutable data is 5x performance on the Intel CPU, and even more on the ARM. Also, the cost of using reduce() with a lambda function is more than 3x in overall performance on the Intel CPU, and even more on the ARM.

For both the test function and the permutation generator, making either of them functional (slow) significantly slows down the entire program.

The conclusion of this is that unless you are quite sure your code will never be performance critical you should avoid functional programming practices. It is a lot easier to write imperative code than to later scale out your architecture when your code does not perform.

However, the pure immutable implementation of the permutation generator is arguably much simpler than the iterative (faster) counterpart. When you look at the code you may decide that the performance penalty is acceptable to you. When it comes to reduce() with a lambda function, I think the imperative version is easier to read (and much faster).

Please notice that if your code consists of nice testable, replaceable parts without side effects you can optimize later on. The functional principles are more valuable at a higher level. If you define your functions in a way that they behave like nice FP functions, it does not matter if they are implemented using imperative principles (for performance).
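
A sketch of what I mean: a function with an FP-friendly contract (argument not mutated, new array returned) that is implemented imperatively:

function sortedCopy(a) {
    var i;
    var r = new Array(a.length);
    for ( i=0 ; i<a.length ; i++ ) {
        r[i] = a[i];                          // copy instead of mutating the argument
    }
    r.sort(function(x,y) { return x - y; });  // sort the private copy in place
    return r;                                 // callers can treat this as a pure function
}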

Generating Permutations
I used the following simple method for generating permutations. I start with two arrays and I send them to my permute-function:

  head = [];
  tail = [1,2,3,4];

  permute(head,tail);

My permute function checks if tail is empty, and if so it tests/evaluates head.
Otherwise it generates 4 new sets of head and tail (one for each element in tail):

  permute( [1] , [2,3,4] )
  permute( [2] , [1,3,4] )
  permute( [3] , [1,2,4] )
  permute( [4] , [1,2,3] )

The difference in implementation is:

  • Pure version generates all the above 8 arrays as new arrays using standard array functions
  • Semi pure version generates its own 2 arrays (head and tail) and then uses a standard loop to change the values of the arrays between the (recursive) calls to permute.
  • Shared version simply creates a single head array and 9 tail arrays (one for each recursion step) up front. It then reuses these arrays throughout the 9! iterations. (These are not global variables; they are hidden and private to the permutation generator.)

The simplified test
The simplified test checks if the array is sorted: [1,2,3,4]. Of all permutations, there is always exactly one that is sorted. It is a simple test to implement (especially with a loop).

// These functions are part of a "test-class" starting like:
function testorder1() {
    var cnt = 0;

// Functional test
    this.test = function(p) {
        if ( false !== p.reduce(function(acc,val) {
            if ( false === acc || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

// Iterative test (much faster)
    this.test = function(p) {
        var i;
        for ( i=1 ; i<p.length ; i++ ) {
            if ( p[i] < p[i-1] ) return;
        }
        cnt++;
    };

I tried to optimise the functional reduce() version by breaking out a named function. That did not help. I also tried to let the function always return the same type (now it returns false OR a number) but that also made no difference at all.

All the code
For those who want to run this themselves or compare the permutation functions here is the entire program.

As mentioned above, the slowest (immutable data) permutation function is a lot smaller and easier to understand than the fastest (shared data) implementation.


'use strict';

// UTILITIES

function arrayToNum(p, s, e) {
    var r = 0;
    var m = 1;
    var i;
    for ( i=e-1 ; s<=i ; i-- ) {
        r += m * p[i];
        m *= 10;
    }
    return r;
}

function arrayWithZeros(n) {
    var i;
    var a = new Array(n);
    for ( i=0 ; i<a.length ; i++ ) a[i] = 0;
    return a;
}


// PERMUTATION ENGINES

function permutations0(n, callback) {
    // intentionally left empty (a no-op engine)
}

// IMMUTABLE (SLOWEST)

function permutations1(n, callback) {
    var i;
    var numbers = [];
    for ( i=1 ; i<=n ; i++ ) numbers.push(i);
    permute1([],numbers,callback);
}

function permute1(head, tail, callback) {
    if ( 0 === tail.length ) {
        callback(head);
        return;
    }

    tail.forEach(function(t, i, a) {
        permute1( [t].concat(head),
                  a.slice(0,i).concat(a.slice(i+1)),
                  callback);

    });
}

// MUTATES ITS OWN DATA, BUT NOT ITS ARGUMENTS

function permutations2(n, callback) {
    var i;
    var numbers = [];
    for ( i=1 ; i<=n ; i++ ) numbers.push(i);
    permute2([],numbers,callback);
}

function permute2(head, tail, callback) {
    if ( 0 === tail.length ) {
        callback(head);
        return;
    }
    var h2 = [tail[0]].concat(head);
    var t2 = tail.slice(1);
    var i  = 0;
    var tmp;
    
    while (true) {
        permute2(h2, t2, callback);
        if ( i === t2.length ) return;
        tmp   = h2[0];
        h2[0] = t2[i];
        t2[i] = tmp;
        i++;
    }
}

// MUTATES ALL DATA (INTERNALLY) (FASTEST)

function permutations3(n, callback) {
    var i;
    var head  = arrayWithZeros(n);
    var tails = new Array(n+1);

    for ( i=1 ; i<=n ; i++ ) {
        tails[i] = arrayWithZeros(i);
    }

    for ( i=1 ; i<=n ; i++ ) {
        tails[n][i-1] = i;
    }

    function permute3(x) {
        var j;
        var tail_this;
        var tail_next;
        var tmp;
        if ( 0 === x ) {
            callback(head);
            return;
        }
        tail_this = tails[x];
        tail_next = tails[x-1];

        for ( j=1 ; j<x ; j++ ) {
            tail_next[j-1] = tail_this[j];
        }

        j=0;
        while ( true ) {
            head[x-1] = tail_this[j];
            permute3(x-1);
             
            j++;
            if ( j === x ) return;

            tmp            = head[x-1];
            head[x-1]      = tail_next[j-1];
            tail_next[j-1] = tmp;
        }
    }

    permute3(n);
}

// TEST FUNCTIONS

function testprint() {
    this.test = function(p) {
        console.log(JSON.stringify(p));
    };

    this.done = function() {
        return 'Done';
    };
}

// CHECKS IF PERMUTATION IS ORDERED - FUNCTIONAL (SLOWEST)

function testorder1() {
    var cnt = 0;

    this.test = function(p) {
        if ( false !== p.reduce(function(acc,val) {
            if ( false === acc || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

    this.done = function() {
        return cnt;
    };
}

// CHECKS IF PERMUTATION IS ORDERED - IMPERATIVE (FASTEST)

function testorder2() {
    var cnt = 0;

    this.test = function(p) {
        var i;
        for ( i=1 ; i<p.length ; i++ ) {
            if ( p[i] < p[i-1] ) return;
        }
        cnt++;
    };

    this.done = function() {
        return cnt;
    };
}

// TEST FUNCTION FOR PROJECT EULER 32

function testeuler() {
    var sums = {};

    this.test = function(p) {
        var w1, w2, w;
        var m1, m2, mx;
        w =  Math.floor(p.length/2);
        w1 = 1;
        w2 = p.length - w - w1;
    
        while ( w1 <= w2 ) {
            m1 = arrayToNum(p,     0, w1      );
            m2 = arrayToNum(p,    w1, w1+w2   );
            mx = arrayToNum(p, w1+w2, p.length);
        
            if ( m1 < m2 && m1 * m2 === mx ) {
                sums['' + mx] = true;
            }
        
            w1++;
            w2--;
        }
    };

    this.done = function() {
        var i;
        var r = 0;
        for ( i in sums ) {
            r += +i;
        }
        return r;
    };
}

// MAIN PROGRAM BELOW

function processData(input, parg, targ) {
    var r;

    var test = null;
    var perm = null;

    switch ( parg ) {
    case '0':
        perm = permutations0;
        break;
    case '1':
        perm = permutations1;
        break;
    case '2':
        perm = permutations2;
        break;
    case '3':
        perm = permutations3;
        break;
    }

    switch ( targ ) {
    case 'E':
        test = new testeuler;
        break;
    case 'O1':
        test = new testorder1;
        break;
    case 'O2':
        test = new testorder2;
        break;
    case 'P':
        test = new testprint();
        break;
    }


    r = perm(+input, test.test);
    console.log(test.done());
} 

function main() {
    var input = '';
    var parg = '1';
    var targ = 'E';
    var i;

    for ( i=2 ; i<process.argv.length ; i++ ) {
        switch ( process.argv[i] ) {
        case '0':
        case '1':
        case '2':
        case '3':
            parg = process.argv[i];
            break;
        case 'E':
        case 'O1':
        case 'O2':
        case 'P':
            targ = process.argv[i];
            break;
        }
    }
    

    process.stdin.resume();
    process.stdin.setEncoding('ascii');
    process.stdin.on('data', function (s) {
        input += s;
    });

    process.stdin.on('end', function () {
       processData(input, parg, targ);
    });
}

main();

This is how I run the code (use a value lower than 9 to get fewer than 9! permutations):

### Project Euler Test: 3 different permutation generators ###
$ echo 9 | time node projecteuler32.js 3 E
45228
8.95user ...
$ echo 9 | time node projecteuler32.js 2 E
45228
25.03user ...
$ echo 9 | time node projecteuler32.js 1 E
45228
70.34user ...

### Simple check-order test, two different versions. Fastest permutations.
$ echo 9 | time node projecteuler32.js 3 O1
1
23.71user ...
$ echo 9 | time node projecteuler32.js 3 O2
1
4.72user ...

(the timings here may not exactly match the above figures)

All FP-sucks related articles
Functional Programming Sucks
Underscore.js sucks! Lodash sucks!
Functional Programming Sucks! (it is slow) (this one)
Lodash Performance Sucks!

Underscore.js sucks! Lodash sucks!

In a world of functional programming hype there are two very popular JavaScript frameworks: Underscore.js and Lodash. Don't use them! It is a horrible idea. They suck just like functional programming sucks!

They make claims like:

  • Lodash: A modern JavaScript utility library delivering […] performance
  • Underscore: JavaScript library that provides a whole mess of useful functional programming helpers.

The road to hell is paved with good intentions. This is how it goes.

1. Sloppy data types
There are many good things about JavaScript. The sloppy dynamic typing is perhaps not one of them. The following are for example true:

  • '27' == 27
  • undefined == null
  • 0 == ''
  • 'object' === typeof null

Now that I consider myself an experienced programmer I find it quite convenient not to need to be explicit about data types. But I don't mix and change types! If a variable is a number from the beginning it keeps being a number. I carefully pick types – Object, Array, Number, String – and stick to that type (for a variable or property). Occasionally – mostly for return variables – I break the rule.

Lodash and Underscore are about allowing yourself to be sloppy:

  • Don't know if it's an object or an array: use map, forEach, filter, reduce and many more
  • Don't know if it is empty (or what being empty even means): use isEmpty
  • Don't know if it is a String object or a String primitive or something else: use isString

If you don't know what it is, you already have a much bigger problem than how to do something with it.
If you mix String objects and String primitives AND other things, and you want to know if it is any kind of string, you are doing something wrong.

So Step 1 with Lodash and Underscore is that you:

  1. Add a dependency
  2. Allow sloppy and inconsistent typing
  3. Let no one presume anything about your types anymore
  4. Make your code impossible to maintain or extend without Lodash or Underscore

2. Performance!
My experience after many years in software development is that when an application is not well received by the users it is very often because of (bad) performance. And bad performance causes weird, hard-to-reproduce bugs and instability as well.

An important type of optimization that the JIT can do relies on the runtime generating classes with strict types for your objects (it guesses and learns the types as the program runs). If you allow a property to assume values of different types you are likely to destroy this optimization.
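
A hypothetical illustration of the kind of type stability that lets the JIT specialize:

function Account(balance) {
    this.balance = balance;       // always a Number: the JIT can specialize
}

var good = new Account(100);
good.balance += 1;                // balance stays a Number

var bad = new Account(100);
bad.balance = '100 SEK';          // balance is now sometimes a String:
                                  // hot code using it can no longer be specialized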

Let's look at the little function isEmpty.

/** Underscore **/
_.isEmpty = function(obj) {
    if (obj == null) return true;
    if (isArrayLike(obj) && (_.isArray(obj) || _.isString(obj) || _.isArguments(obj))) return obj.length === 0;
    return _.keys(obj).length === 0;
};

/** Lodash **/
function isEmpty(value) {
    if (isArrayLike(value) &&
        (isArray(value) || isString(value) ||
         isFunction(value.splice) || isArguments(value))) {
        return !value.length;
    }
    return !nativeKeys(value).length;
}

If you KNOW the datatype you can just do:

/** String **/
if ( 0 === s.length )

/** String that may be null **/
if ( null === s || 0 === s.length )

/** Array **/
if ( 0 === a.length )

/** Object **/
function objectIsEmpty(o) {
    for ( var x in o ) return false;
    return true;
}

(you may want to check o.hasOwnProperty(x) depending on what you actually want – but if you don't know what you want, using Lodash or Underscore will produce equally unexpected results as my code)

The worst thing about the Underscore and Lodash implementations is the last line of each:

    return _.keys(obj).length === 0;
    return !nativeKeys(value).length;

Unless the JIT compiler and runtime are very smart, those two will produce an entire array of strings on the heap – just to check if the array length is 0! Even though in most practical cases this will have an acceptable overhead, it can get very expensive.

This IS the beauty of FP generic programming. Focus on WHAT, not HOW, and any little innocent check (like isEmpty) can turn horribly expensive.

However, to be honest, I took some plain functional code and replaced plain JavaScript with Lodash functions (forEach, reduce, isNumber, isEmpty and a few more) and the code actually got faster! Still much slower than imperative code, but slightly faster than not using Lodash.

3. Complexity and Optimization
Now that you have added an extra dependency, made your data objects sloppy, and made your application harder for the JIT to optimize, perhaps your application is not as quick as you need it to be. If you are writing a front end you are probably quite fine. But if you are coding a Node.js backend, performance matters a lot more and waste is less acceptable. If you are really unlucky these sloppy types give you hard-to-find bugs and your backend is not completely stable and reliable.

What do you do? Common practices in the business could be things like:

  • Minification/uglification
  • Scale out service architecture
  • Failover technology
  • Spend time optimizing code and modules

This is how a little sloppiness, laziness and convenience in the initial decisions about your architecture can later cause you huge problems and costs.

Of course, just including Lodash and using isEmpty() in a few places is not going to do much (if any) harm.
But finding Lodash or Underscore generally preferable to not using them is the kind of low quality thinking that causes software to be bad.

Conclusion
Be explicit, careful, consistent and smart about the data types you use.
Be restrictive with libraries and frameworks that introduce overhead and hide relevant details.

Use the standard library. However, you may find that, for example, the Array functions of Lodash outperform the standard library equivalents. That may not be true in the future (and I really wonder how it can happen at all).

All FP-sucks related articles
Functional Programming Sucks
Underscore.js sucks! Lodash sucks! (this one)
Functional Programming Sucks! (it is slow)
Lodash Performance Sucks!

Functional Programming Sucks*

(*like everything else that is mindlessly applied the wrong way to solve the wrong problems)

Functional Programming Rocks!
You can do amazing things with Haskell. The historical importance of Lisp is huge. Math, computer science, algorithms and engineering can come together beautifully with functional programming. Writing small reusable, testable functions with no side effects is a great thing to do in any language – but more than anywhere else those virtues are emphasized in functional programming. I love first class functions!

But…

The unfortunate JavaScript Hype
As JavaScript made map, filter and reduce part of the standard, those have become the foremost frontier of functional programming. Quite seriously, many people argue that if you replace your for-loops with map, filter, reduce and forEach you have achieved something significant when it comes to readability. What do you think?

// non-functional smelly loop
redFruits = [];
for ( var f in fruits ) {
  if ( 'red' === fruits[f].color ) redFruits.push(fruits[f]);
}

// fantastic functional alternative
redFruits = fruits.filter(function(f) {
  return 'red' === f.color;
});

As a bonus with functional, you can use lambda functions… that is, functions with no name that cannot be reused. You can even use arrow functions if you think this is too easy a concept and you think the code gets more readable by omitting the confusing word function.

While there are a lot of balanced articles about using functional ideas in JavaScript, for a lot of people eliminating for-loops at any cost seems to be functional enough.

The argument here is that your code should be declarative: it should express intention rather than instructions. In the case above with the fruit filter it does make some sense. But for that argument to make any sense, the programmer must not abuse map, filter, forEach or reduce to eliminate a for-loop just for the sake of it. I have seen things like:

// functional
applePrice = fruits[Object.keys(fruits).filter(function(f) {
  return 'apple' === fruits[f].name;
})[0]].price;

// when this would work
for ( var f in fruits ) {
  if ( 'apple' === fruits[f].name ) {
    applePrice = fruits[f].price;
    break;
  }
}

I have several objections to the functional version. The Object.keys thing is hardly pleasant to the eye. It is false marketing to use filter: the intention is find or search, not filter. So you still need to read the details (just as with the for-loop); you are just first fooled into thinking it's a filter (and chaining is very popular, so then you have no function names and no variable names). But perhaps the worst problem is the lack of error handling. Functional code is not meant to have side effects, and error handling is exactly that. If ‘apple’ is not found you get an ugly Error. You can of course try/catch, or make a temporary array variable apples and check that its length is one, but people who prefer functional style usually don't do it.
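
A sketch of a middle ground: a named helper that honestly says find, and makes the not-found case explicit (findFruitPrice is made up here):

function findFruitPrice(fruits, name) {
  for ( var f in fruits ) {
    if ( name === fruits[f].name ) return fruits[f].price;
  }
  return null;  // the caller decides what a missing fruit means
}

var applePrice = findFruitPrice(fruits, 'apple');
if ( null === applePrice ) {
  // error handling here
}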

I understand your objection: people can write crappy code with any language and paradigm, and just because I have seen bad applications of filter does not mean there is anything wrong with filter or FP. Of course. The bad thing is recommending people to use it like a silver bullet. The bad thing is that good FP is actually difficult, and junior programmers will get it wrong trying to be fashionable.

Here is another example to show I am not inventing this rant.

“Functional is preferable” hype
Another favorite example of this functional hype is from Rosetta Code. Look at this amazing collection of implementations of a simple algorithm: the Luhn Algorithm. I suggest:

  1. Read the introduction so you get some idea
  2. Look at the C-implementation: imperative and simple. Testable and no side effects.
  3. Look at the functional implementations: C++11, Common Lisp, Haskell, JavaScript (ES5.1 version), PicoLisp, Python, Rust, Scheme

Look at Scala: there are two versions, a Functional Style (recommended) and an Imperative style. The people at IOCCC would be jealous of this shit!

I can only come to one conclusion: the emperor is naked.

I mean, if you code in PicoLisp, for fun or for a good reason, and you have to implement the Luhn algorithm, please do so! But to say “recommended” about the functional Scala code, or to think that the C++11 code is in any way more reasonable than the original C code… it makes no sense. No real systems programmer will choose Rust over C based on this. And Python – a language known for its clarity… such a sad example.

Trendy fashionable “functional programmers” suck
So, the problem here is not primarily with Functional Programming itself or the code that guru coders write with Functional Programming. The problem is that people with far too little theoretical background, training and experience follow the hype, and the result is ugly.

Functional Programming Sucks (at)
There are some things that Functional Programming is not very good at: state and side effects. This is by design. From an FP perspective (and a general perspective as well) state and side effects are nasty and problematic: they should be avoided, and when they can't be avoided they need special attention (like monads).

The problem here is that JavaScript is mostly a programming language for the web. A simple (modern, single page) web application:

  1. Loads data from a server
  2. Presents data to the user
  3. Lets the user update the data
  4. Sends the data back to the server

This is (almost) all about state and side effects! You may have some functions for validation, filtering, sorting and calculations on the client, and those benefit from Functional Programming ideas. But your core enterprise is about state, side effects and error handling! Functional Programming is so unsuitable for this that a simple web page just can't be written functionally, unless you start breaking rules. And this is of course what programmers end up doing, because at the end of the day they have a real application to build for real users.

The difficult thing is to break the right rules in the right way at the right place for the right reason. This is architecture – mixing concepts and designing how your application lives, breathes and beats. Architecture is difficult, and it is not made easier by relying on unsuitable silver bullets.

Functional Reactive Programming
There is one thing that is even more hyped and even more theoretical than Functional Programming, and that is Functional Reactive Programming.

But it makes sense (as I understand it). It is not about taking Functional Programming one step further but about putting it in context. You have data that changes (signals, events, behaviours, whatever). That data can be pipelined with high quality through FP logic to produce the content of your GUI or the requests to the server.

Keep the functional parts functional, and let other parts of your application just deal with I/O, GUI and state. Dividing your code into separate modules with clear responsibilities and clear interactions has always been a good idea.

Performance
My experience is that when a real world application is not well received by the users, a lot of the time it's because performance sucks. When performance is bad, usability is also bad, and stability gets bad (especially on the server side).

Functional programming is typically bad for performance.

  • anonymous functions can often not be JIT compiled and will never be optimized
  • recursion is nice, but rarely faster than a loop
  • the chaining concept creates arrays and arrays and arrays just to throw them away, which is fun for the garbage collector, but not fast
  • the immutable object concept is about generating new read-only copies of objects instead of changing objects, which is expensive and wasteful

Perhaps worse: since functional programming and the proper use of map(), filter(), reduce() and friends is hard (easy things get difficult), not so experienced programmers end up writing implementations with unnecessary computational complexity (O(n) turns into O(n^2)). It is not funny – you can't afford that for most anything that goes to production.
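
A hypothetical example of how that happens (orders and customers are made-up arrays): a filter inside a map scans all customers once per order, while an index built up front avoids the rescanning.

// accidental O(n^2): for every order, scan all customers again
var rows = orders.map(function(o) {
  var c = customers.filter(function(cu) { return cu.id === o.customerId; })[0];
  return { order:o.id, customer:c.name };
});

// O(n): build an index once, then look up
var byId = {};
customers.forEach(function(c) { byId[c.id] = c; });
var rows2 = orders.map(function(o) {
  return { order:o.id, customer:byId[o.customerId].name };
});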

Agile, refactoring and generic code
It is HARD to design code properly from the beginning! Good objects, classes, functions, modules, packages, namings, dependency trees and architecture don't come for free! Agile and refactoring are about accepting that the design will not be optimal the first time; thus we should not bother too much about it up front, but rather fix the code when we have learnt more about the problem and our code starts getting (too) ugly.

A strong argument for FP is that it is highly generic – which is true. But until a programmer has spent much time with her domain and problem she will not know what things can and should be made generic. Making things too generic is called overengineering, and it is perhaps the worst sickness in our industry.

I usually:

  • start with one source file rather than many
  • allow myself some copy-paste until I see what code really gets repeated
  • make my code as specific as possible, unless I see an obvious generalisation or the actual need for generalisation emerges
  • don't worry too much about global variables in the beginning (after a while there will be a natural place for them or for what they represent)
  • allow quite long functions until I see what parts of them actually do something meaningful on their own
  • code quite defensively with lots of error handling (it usually pays off quite quickly)

This works for getting quick practical results. Later, during refactoring, when the code base has grown and I have learnt more about the domain, I can break critical pieces of code out into nice generic functions. But thinking FP first – no way!

All FP-sucks related articles
Functional Programming Sucks (this one)
Underscore.js sucks! Lodash sucks!
Functional Programming Sucks! (it is slow)
Lodash Performance Sucks!

Sort strings without case sensitivity

In JavaScript, I wanted to sort arrays of strings without caring about case. It was more complicated than I first thought.

The background is that I present lists like this in a GUI:

  • AMD
  • Apple
  • Gigabyte
  • IBM
  • Intel
  • Microsoft
  • MSI
  • Nokia
  • Samsung
  • Sony

I want AMD and MSI (spelled in all caps) to be sorted without respect to case. Standard sort() would put MSI before Microsoft.

Obviously I am not the first one wanting to do this, and I found an article on Stack Overflow. It suggests the following solution:

Use toLowerCase()
You can make your own string compare function that uses toLowerCase and send it as an argument to sort():

function cmpCaseless(a,b) {
    a = a.toLowerCase();
    b = b.toLowerCase();
    if ( a < b ) return -1;
    if ( a > b ) return  1;
    return 0;
}

myStringArray.sort(cmpCaseless);

This has a number of problems. The article above mentions that it is not stable. That is probably true in some cases, but I was of course worried about performance: making two String objects for each compare should keep the garbage collector quite busy, not to mention the waste of copying and lowercasing potentially quite long strings when usually the first character is enough. When I started experimenting I found another, more critical flaw though: in Swedish we have three extra characters in the alphabet – Å, Ä, Ö, in that order. The above cmpCaseless orders them Ä, Å, Ö, which sounds like a little problem, but it is simply unacceptable.

Use localeCompare
There is a more competent (or so I thought, read on) way to compare strings in JavaScript: the localeCompare function. This one simply treats A, Å, Ä and O, Ö as the same character, which is far more unacceptable than the toLowerCase problem.

However, it also has a “locales” option (a second, optional argument). If I set it to ‘sv’ I get the sort order that I want, but performance is horrible. And I still have to use toLowerCase as well as localeCompare:

function localeCompare(a,b) {
    return a.toLowerCase().localeCompare(b.toLowerCase());
}

function localeCompare_sv(a,b) {
    return a.toLowerCase().localeCompare(b.toLowerCase(), 'sv');
}

localeCompare() also takes an extra options argument with a “sensitivity” parameter, but it is no good for our purposes.
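
For reference, such a call looks like this (‘accent’ is one of the four sensitivity values: ‘base’, ‘accent’, ‘case’, ‘variant’):

a.localeCompare(b, 'sv', { sensitivity: 'accent' });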

Rolling my own
Of course, I ended up building my own function to do caseless string compare. The strategy is to compare one character at a time, not making any new String objects, and to fall back to localeCompare only if both characters are above ASCII 127:

function custom(a,b) {
    var i, al, bl, l;
    var ac, bc;
    al = a.length;
    bl = b.length;
    l = al < bl ? al : bl;
        
    for ( i=0 ; i<l ; i++ ) {
        ac = a.codePointAt(i);  // or charCodeAt() for better compatibility
        bc = b.codePointAt(i);
        if ( 64 < ac && ac < 91 ) ac += 32;
        if ( 64 < bc && bc < 91 ) bc += 32;
        if ( ac !== bc ) { 
            if ( 127 < ac && 127 < bc ) {
                ac = a.substr(i,1).toLowerCase();
                bc = b.substr(i,1).toLowerCase();
                if ( ac !== bc ) return ac.localeCompare(bc);
            } else {
                return ac-bc;
            }
        }
    }
    return al-bl;
}

One fascinating thing is that here I can use localeCompare() without 'sv'.

Test for yourself
I built a simple webpage where you can test everything yourself.

Conclusion
Defining a string sort order is not trivial when you don't just have ASCII characters. If you look at the ASCII table you see that non-alphabetic characters are spread out:

  • SPACE, #, 1-9, and many more come before both A-Z and a-z
  • Underscore: _, and a few other characters come after A-Z but before a-z
  • Pipe: | and a few other characters come after A-Z and a-z

When it comes to characters beyond ASCII 127 it just gets more complicated: how do you sort European latin letters, Greek letters, arrows and other symbols?

For this reason, I think it makes sense to define your own sorting function and clearly define the behaviour for the characters you know you care about. If it really matters in your application.

My function above is significantly faster than the alternatives.

Disclaimer
These results can probably be inconsistent over different web browsers.

Playing with smart.js and V7

I have been playing quite much with Node.js lately, and I have put some effort into trying to build it for typical OpenWRT hardware. It turns out that Node.js/V8 is not, and will never be, suitable for hardware without an FPU and at least 128MB RAM.

My curiosity led me to smart.js, which is based on the V7 JavaScript engine. Among the positive details were POSIX compatibility, ECMAScript 5.1 support, HTTP support (not very different from Node.js, just much less of it), and that it is supposed to be the fastest non-compiled JavaScript engine.

smart.js
smart.js seems to be the IoT platform, while V7 is just the JavaScript engine. smart.js had some peculiar properties:

  • The source zip I downloaded from GitHub failed to compile because of a missing .git directory. Using git clone solved this.
  • The binary always reads and executes smart.js from the same folder. If you give more command line arguments, it executes those after smart.js.
  • I found no way to pass command line arguments to the JavaScript code.

It comes with a little set of IoT-example files, and I found this a little confusing.

v7
I found out it was possible to use V7 standalone. It is a single, very nice v7.c file that is simply compiled:

$ gcc -O3 -DV7_EXE -o v7 v7.c -lm

This is very promising – a lot easier than building Node.

I found that neither console.log nor process.argv exists, and I found ways to work around this. My plan was to build a little benchmark suite and test different things. I did not get that far: the program below just creates 100 random objects (TestBit) 10 times and stores them in arrays.

if ( 'undefined' === typeof process ) {
  process = {
    exit : exit,
    argv : [ 'v7', 'v7bench.c' ]  // faking command line arguments
  };
}

if ( 'undefined' === typeof console ) {
  console = {
    log : print
  }
}

var TestBit = function() {
  this.n  = Math.random();
  this.s  = '' + this.n;
  this.s1 = this.s.substr(2,1);
  this.s2 = this.s.substr(3,2);
  this.s3 = this.s.substr(5,3);
  this.b  = this.s > 0.5;
};

var generateTestBits = function(n) {
  var i;
  var r = new Array(n);
  for ( i=0 ; i<n ; i++ ) {
    r[i] = new TestBit();
  }
  return r;
}

var Timer = function() {
  this.start = Date.now();
  this.last  = this.start;

  this.split = function() {
    var r;
    var n = Date.now();
    r = n - this.last;
    this.last = n;
    return r;
  }
}

var timer = new Timer();

var i = 0;
var tests = [];
var tmp;

console.log('Start');

while(i<10) {
  tests.push(generateTestBits(100));
  i++;
  console.log('' + ( i*100) + ':' + timer.split());
}

This simple program unfortunately proved that performance is much worse than I could have expected. Below are timings for generating the 1000 objects, in steps of 100 at a time. Benchmarks were run on an RPi2 with a 900MHz ARMv7 CPU.

There is a flag (-vo), described as “object arena size”, that I decided to play with (values 1, 10 and 100 gave very similar results to 1000).

       default    -vo 1000    -vo 5000    -vo 10000
 100      0.2s       0.12s        6.8s          28s                         
 200      0.4s       0.16s        8.6s
 300      1.3s       0.23s       12  s
 400      7.4s       0.24s        2.6s
 500     13  s       0.37s       15  s
 600     35  s       0.93s       32  s
 700     47  s       2.9 s       48  s
 800     79  s       4.6 s       45  s
 900    160  s       5.9 s       36  s
1000    160  s       8.0 s       37  s

Well, generating 100 random objects on an RPi2 using a simple interpreter can take 0.2 seconds (the top left value): quite reasonable, I guess. As the number of generated objects grows, performance is completely ruined. With the default parameters, a full 1.6 seconds is spent generating one single TestBit. Memory usage of the v7 process is insignificant.

I don't know what the “object arena size” setting does, but it obviously changes something. The 400-value in the -vo 5000 series is actually reproducible.

Even for an IoT-platform I find this performance (and the lack of predictability) unacceptable. V7 officially aims to be the fastest interpreted JavaScript engine out there: it must be capable of handling what are actually small arrays (10×100 elements) of small objects.

JavaScript: switch options

Is the nicest solution also the fastest?

Here is a little thing I ran into that I found interesting enough to test. In JavaScript, you get a parameter (from a user, perhaps via a web service), and depending on the parameter value you call a particular function.

The first solution that comes to my mind is a switch:

function test_switch(code) {
  switch ( code ) {
  case 'Alfa':
    call_alfa();
    break;
  ...
  case 'Mike':
    call_mike();
    break;
  default:
    call_default();
  }
}

That is good if you know all the labels when you write the code. A more compact solution that allows you to dynamically add functions is to let the functions just be properties of an object:

var x1 = {
  Alfa:call_alfa,
  Bravo:call_bravo,
  Charlie:call_charlie,
...
  Mike:call_mike
};

function test_prop(code) {
  var f = x1[code];
  if ( f ) f();
  else call_default();
}
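
Since the point of this construction is that it is dynamic, handlers can also be registered at runtime (call_november here is a hypothetical handler, not one of the 13 benchmark functions):

x1['November'] = call_november;  // register a new handler at runtime (hypothetical)
test_prop('November');           // now dispatches to call_november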

And as a variant – not really making sense in this simple example but anyway – you could loop over the properties (functions) until you find the right one:

function test_prop_loop(code) {
  var p;
  for ( p in x1 ) {
    if ( p === code ) {
      x1[p]();
      return;
    }
  }
  call_default();
}

And, since we are into loops, here is an array variant; it does not make much sense in this simple example either, but anyway:

var x2 = [
  { code:'Alfa'     ,func:call_alfa    },
  { code:'Bravo'    ,func:call_bravo   },
  { code:'Charlie'  ,func:call_charlie },
...
  { code:'Mike'     ,func:call_mike    }
];

function test_array_loop(code) {
  var i, o;
  for ( i=0 ; i<x2.length ; i++ ) {
    o = x2[i];
    if ( o.code === code ) {
      o.func();
      return;
    }
  }
  call_default();
}

Alfa, Bravo…, Mike and default
I created exactly 13 options, and labeled them Alfa, Bravo, … Mike. All the test functions accept an invalid code and fall back to a default function.

The loops should clearly get worse with more options. However, it is not obvious what the cost of more options is in the switch case.

I will make three test runs: 5 options (Alfa to Echo), 13 options (Alfa to Mike) and 14 options (Alfa to November) where the last one ends up in default. For each run, each of the 5/13/14 options will be equally frequent.
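
As a reference for how such numbers can be produced, here is a minimal, self-contained sketch of a benchmark loop (my reconstruction, not the original harness – the call_* functions are dummies and only three options are shown):

var hits = 0;
var call_alfa    = function() { hits += 1; };
var call_bravo   = function() { hits += 2; };
var call_charlie = function() { hits += 3; };
var call_default = function() { hits -= 1; };

var x1 = { Alfa:call_alfa, Bravo:call_bravo, Charlie:call_charlie };

function test_prop(code) {
  var f = x1[code];
  if ( f ) f();
  else call_default();
}

function benchmark(testFunc, codes, iterations) {
  var start = Date.now();
  for ( var i=0 ; i<iterations ; i++ ) {
    testFunc(codes[i % codes.length]);  // each code equally frequent
  }
  return Date.now() - start;            // elapsed milliseconds
}

console.log('prop: ' + benchmark(test_prop, ['Alfa','Bravo','Charlie'], 1000000) + 'ms');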

Benchmark Results
I am benchmarking using Node.js 0.12.2 on a Raspberry Pi 1. The startup time for Node.js is 2.35 seconds, and I have subtracted that from all benchmark times. I also ran the benchmarks on a MacBook Air with Node.js 0.10.35. All benchmarks were repeated three times and the median has been used. Iteration count: 1000000.
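
One rough way to estimate the startup overhead (not necessarily how the 2.35 seconds above were measured) is to time an empty script:

$ time node -e ''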

(ms)       ======== RPi ========     ==== MacBook Air ====
              5      13      14         5      13      14
============================================================
switch     1650    1890    1930        21      28      30
prop       2240    2330    2890        22      23      37
proploop   2740    3300    3490        31      37      38
loop       2740    4740    4750        23      34      36

Conclusions
Well, most notably (and again), the ARMv6 RPi is not fast at running Node.js!

Using the simple property construction seems to make sense from a performance perspective, although the good old switch is also fast. The loops have no advantages. Also, the penalty for the default case is quite heavy in the simple property case; if you know the “code” is valid, the property lookup scales very nicely.

It is, however, a little interesting that on the ARM the loop over properties is faster than the indexed loop over the array. On the x64 it is the other way around.

Variants of Simple Property Case
The following are essentially equally fast (call_x plays the role of call_default above):

function test_prop(code) {
  var f = x1[code];
  if ( f ) f();
  else call_x();
}

function test_prop(code) {
  var f = x1[code];
  if ( 'function' === typeof f ) f();
  else call_x();
}

function test_prop(code) {
  x1[code]();
}

So, it does not cost much to have a safety test and a default case (just in case), but it is expensive to actually use it. This one, however:

function test_prop(code) {
  try {
    x1[code]();
  } catch(e) {
    call_x();
  }
}

comes at a cost of 5ms on the MacBook when the catch is never used. If the catch is used (1 out of 14 calls) the run takes a full second instead of 37ms!

Faking a good goto in JavaScript

There are cases where gotos are good (most possible uses of gotos are not). I needed to write JavaScript functions (for running in NodeJS) where I wanted to call the callback function just once, at the end (to make things as clear as possible). In C that would be (this is a simplified example):

void withgoto(int x, void(*callback)(int) ) {
  int r;

  if ( (r = test1(x)) )
    goto done;

  if ( (r = test2(x)) )
    goto done;
 
  if ( (r = test3(x)) )
    goto done;
 
  r = 0;
done:
  (*callback)(r);
}

I think that looks nice! I mean the way goto controls the flow, not the syntax for function pointers.

JavaScript: multiple callbacks
The most obvious way to me to write this in JavaScript was:

var with4callbacks = function(x, callback) {
  var r

  if ( r = test1(x) ) {
    callback(r)
    return
  }

  if ( r = test2(x) ) {
    callback(r)
    return
  }

  if ( r = test3(x) ) {
    callback(r)
    return
  }
  r = 0
  callback(r)
}

This works perfectly, of course. But it is not nice to have callback in several places. It is annoying (bloated) to always write return after callback. And in other cases it can be a little unclear whether callback is called zero times, or more than one time… which is basically catastrophic. What options are there?

JavaScript: abusing exceptions
My first idea was to abuse the throw/catch-construction:

var withexceptions = function(x, callback) {
  var r

  try {
    if ( r = test1(x) )
      throw null

    if ( r = test2(x) )
      throw null

    if ( r = test3(x) )
      throw null

    r = 0
  } catch(e) {
  }
  callback(r)
}

This works just perfectly. In a more real-world case you would probably put some code in the catch block. Is it good style? Maybe not.

JavaScript: an internal function
With an internal (nested) function, a return does the job:

var withinternalfunc = function(x, callback) {
  var r
  var f

  f = function() {
    if ( r = test1(x) )
      return

    if ( r = test2(x) )
      return

    if ( r = test3(x) )
      return

    r = 0
  }
  f()
  callback(r)
}

Well, this looks like JavaScript, but it is not super clear.

JavaScript: an external function
You can also do it with an external function (risking that you need to pass plenty of parameters to it, but in my simple example that is not an issue):

var externalfunc = function(x) {
  var r
  if ( r = test1(x) )
    return r

  if ( r = test2(x) )
    return r

  if ( r = test3(x) )
    return r

  return 0
}

var withexternalfunc = function(x, callback) {
  callback(externalfunc(x))
}

Do you think the readability is improved compared to the goto code? I don’t think so.

JavaScript: Break out of Block
Finally (and I got help coming up with this one), it is possible to do:

var withbreakblock = function(x, callback) {
  var r
  var f

myblock:
  {
    if ( r = test1(x) )
      break myblock

    if ( r = test2(x) )
      break myblock

    if ( r = test3(x) )
      break myblock

    r = 0
  }
  callback(r)
}

Well, that is as close to the goto construction as I get with JavaScript. Pretty nice!

JavaScript: Multiple if(done)
Using a done-variable and multiple if statements is also possible:

var with3ifs = function(x, callback) {
  var r
  var done = false

  if ( r = test1(x) )
    done = true

  if ( !done ) {
    if ( r = test2(x) )
      done = true
  }

  if ( !done ) {
    if ( r = test3(x) )
      done = true
  }

  if ( !done ) {
    r = 0
  } 
  callback(r)
}

Hardly pretty, I think. The longer the code gets (the more sequential ifs there are), the higher the penalty for the ifs will be.

Performance
Which one I choose may depend on performance, if the difference is big. They should all be fast, but:

  • It is quite unclear what the cost of throwing (an exception) is
  • Is the internal function recompiled for each call, and what is the cost?

I measured performance as (millions of) calls to the function per second. The test functions are rather cheap, and x is an integer in this case.
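
The test functions themselves are not shown in this post; here is a hedged sketch of what cheap tests with a controllable fallthrough rate could look like (the function bodies are my guesses, not the original code):

function test1(x) { return ( x % 13 ) === 1 ? 1 : 0; }
function test2(x) { return ( x % 13 ) === 2 ? 2 : 0; }
function test3(x) { return ( x % 13 ) === 3 ? 3 : 0; }
// With x drawn uniformly from 0..12, ten of thirteen calls fall
// through (r = 0); restricting the range of x changes the rate.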

I did three test runs:

  1. The fall through case (r=0) is relatively rare (~8%)
  2. The fall through case is very common (~92%)
  3. The fall through case is extremely common (>99.99%)

In real applications, a high fallthrough rate may well be the most common situation, since most input data contains no errors. The benchmark environment is:

  • MacBook Air, Core i5 @ 1.4GHz
  • C Compiler: Apple LLVM version 6.1.0 (clang-602.0.49) (based on LLVM 3.6.0svn)
  • C Flags: -O2
  • Node version: v0.10.35 (installed from pkgsrc.org, x86_64 version)

Performance was rather consistent over several runs (1000000 calls each):

Fallthrough Rate           ~8%      ~92%     >99.99%
---------------------------------------------------------
     C: withgoto           66.7     76.9     83.3  Mops
NodeJS: with4callbacks     14.9     14.7     16.4  Mops
NodeJS: withexceptions      3.67     8.77    10.3  Mops
NodeJS: withinternalfunc    8.33     8.54     9.09 Mops
NodeJS: withexternalfunc   14.5     14.9     15.6  Mops
NodeJS: withbreakblock     14.9     15.4     17.5  Mops
NodeJS: with3ifs           15.2     15.6     16.9  Mops

The C code was translated row by row into the JavaScript code. The performance difference is between C/Clang and NodeJS, not thanks to the goto construction itself, of course.

On Recursion
In JavaScript it is quite natural to use recursion when you deal with callbacks. So I decided to run the same benchmarks using recursion instead of a loop. Each recursion step involves three calls ( function() -> callback() -> next() ). With this setup the maximum recursion depth was about 3×5300 (perhaps close to 16535?). That may sound like a lot, but it is not enough to produce any meaningful benchmarks. Do I need to mention that C delivered 1000000 recursive calls at exactly the same performance as the loop?
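
A minimal sketch of such a recursive driver (runRecursive and the trivial driven function are my own names; the original benchmark code is not shown here):

var runRecursive = function(func, x, n, done) {
  var callback = function(r) {   // second call in each step
    next();                      // third call in each step
  };
  var next = function() {
    if ( n-- <= 0 ) return done();
    func(x, callback);           // first call in each step
  };
  next();
};

// Drive a trivial function 5000 times without ever returning,
// so the stack grows by three frames per step:
runRecursive(function(x, cb) { cb(0); }, 1, 5000, function() {
  console.log('done');
});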

Conclusion
For real code, 3.7 million exceptions per second sounds pretty far-fetched. Unless you are in a tight loop (which you probably are not when you deal with callbacks), all solutions will perform well. However, breaking out of a block is clearly the most elegant way and also the most efficient, second only to the real goto, of course. I suspect the generally higher performance in the (very) high fallthrough case is because branch prediction gets more successful.

Any better ideas?