Tag Archives: JavaScript

Arrow functions in JavaScript: A strategy

Arrow functions have been a part of JavaScript since ES6. They are typically supported where you run JavaScript, except in Internet Explorer. To be clear, arrow functions are:

(a,b) => a+b

instead of

function(a,b) { return a+b }

I like to make things simple, and

  1. my code sometimes runs on Internet Explorer
  2. arrow functions offer shorter, simplified syntax in some cases, but fundamentally you can write the same code with function
  3. I prefer not to have a build step (Babel, webpack and friends) for a language that really does not, and should not, need one

so, until now I have simply avoided them (and kind of banned them, along with other ES6 features) in code and software I am responsible for.

However

  1. arrow functions (as part of ES6) are here to stay
  2. they offer some advantages
  3. Internet Explorer will go away.

so, it makes sense to have a strategy for when to use arrow functions.

What I find on the Internet
The Internet is full of sources telling you how you can use arrow functions, how to write them, what the pros, cons and pitfalls are, and what they cannot do.

  • The key difference is how arrow functions work with this.
  • The syntax is shorter, especially for single-argument (no parentheses needed), single-statement (no return needed) functions.
  • Arrow functions don’t work well with object-oriented constructs (such as constructors and prototype functions).

In short, there are some cases where you can’t use arrow functions, some cases where they offer some real advantages, but in most cases it makes little real difference.
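To illustrate the first point, here is a minimal sketch (the timer object and its fields are made up) of the difference: an arrow function captures this from the surrounding scope, while a regular function gets its own this depending on how it is called.

const timer = {
  seconds: 0,
  startArrow: function() {
    // Arrow function: this is captured from startArrow(), so this.seconds is the timer
    setInterval(() => { this.seconds++; }, 1000);
  },
  startFunction: function() {
    // Regular function: this is NOT the timer here (undefined in strict mode,
    // or the global object), so this.seconds does not do what you might expect
    setInterval(function() { this.seconds++; }, 1000);
  }
};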

Arrow functions allow you to chain sort().filter().map() in very compact ways. With simple single-statement arrow functions it is quite nice, but if the arrow functions grow to multiple lines I think it is poor programming.
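A made-up example of that compact chaining style (users, active, age and name are just assumed here):

const names = users
  .filter(u => u.active)
  .sort((a, b) => a.age - b.age)
  .map(u => u.name);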

What I don’t really find on the Internet
I don’t really find good advice on when to use arrow functions and when not to use arrow functions. I mean, when I program, I make decisions all the time:

  • Should I break this code out into a function?
  • Should this be an object (prototype style) or just data?
  • Should I break this code into its own module?
  • Should I write tests for this?
  • Should I settle for a simple, slower algorithm, or should I add effort and complexity to make my code faster?
  • What should be the scope of these variables?
  • Should this be a parameter or can it be hard coded?
  • Can I make good use of map/reduce/every and friends, or is it better to just use a loop?
  • Naming everything…
  • …and so on…

Using, or not using, an arrow function is also a choice. How do I make that choice to ensure my code is good? I don’t really find very clear guidelines or style guides on this.

Lambda functions in other languages
Other languages have lambda functions. Those are special case anonymous functions. The thing I find peculiar about the use of arrow functions in JavaScript is that they are often used instead of function, when a standard function – not a lambda – would have been the obvious choice in other languages.

Intention
For practical purposes, function and () => {} are most often interchangeable. And I guess you could write any JavaScript program using only arrow functions.

When you write code, it mostly does not matter what you use.
When you read code, it comes down to understanding the intention of the writer.

So I think good use of arrow functions is a way that makes the intention of the code as clear as possible. I want clear and consistent guidelines.

Using arrow functions in well defined cases shows more intention and contributes to more clear code than never using them.

I tend to read arrow functions as a strong marker for functional programming. I find it confusing when arrow functions are used in code that breaks other core principles of functional programming.

The strongest cases
The strongest cases for arrow functions I can see:

Minimal syntax (no () or {} required), for a function that is never worth breaking out:

names = stuffs.map(stuff => stuff.name);

Callback: the arguments (error, data) are already given by openFile and the callback function cannot have a meaningful this. Also, for most practical purposes, the callback needs a closure to access data in the parent scope, so it cannot be a named function declared elsewhere.

openFile('myFile', (error, data) => {
  ... implementation
});

When it makes little difference
For a regular function it makes no difference:

const swapNames = (a,b) => {
  let tmp = a.name;
  a.name = b.name;
  b.name = tmp;
}

The function alternative would be:

function swapNames(a,b) {

and is actually shorter. However, with the arrow version I appreciate that it is completely clear from the beginning that no binding of this can ever happen, that the function cannot be used as a constructor, and that there can be no hidden arguments (accessed via arguments).

Confused with comparison
There are cases when arrow functions can be confused with comparison.

// The intent is not clear
var x = a => 1 ? 2 : 3;
// Did the author mean this
var x = function (a) { return 1 ? 2 : 3 };
// Or this
var x = a <= 1 ? 2 : 3;

Obfuscate with higher order functions
Higher order functions (map, reduce, filter, sort) are nice and can improve your code. But, carelessly used they can be confusing and obfuscating.

This is not the fault of () => {} in itself, but a consequence of arrow functions making higher order functions too popular.

I have, for example, seen things like:

myArray.map(x => x.print())

map() should not have side effects. It is outright obfuscating to feed a function that has a side effect into map(). And side effects do not belong in functional programming in the first place.

I have also seen reduce() and filter() being used when every(), some() or find() would have been the right choice. It is obfuscating, it is expensive, and it produces more code than necessary.
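For example (illustrative only, the users array and its fields are made up), answering a yes/no question with filter() scans the whole array and builds a new one, while some() or find() stop at the first match:

// Obfuscating and expensive: builds an array just to check existence
const hasAdmin = users.filter(u => u.role === 'admin').length > 0;

// Clearer and cheaper: stops at the first match
const hasAdmin2 = users.some(u => u.role === 'admin');

// If the element itself is needed
const admin = users.find(u => u.role === 'admin');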

The use of arrow functions with higher order functions is only appropriate when the correct higher order function is used.

The abusive cases
Non-trivial anonymous functions that could clearly be named, reused (and tested) are clearly bad code:

myStuff.sort((a,b) => {
  if ( a.name < b.name ) return -1;
  if ( a.name > b.name ) return  1;
  if ( a.id   < b.id   ) return -1;
  if ( a.id   > b.id   ) return  1;
  return 0;
});

especially when the code is duplicated or the parent function is large.

An arrow-friendly policy
Admittedly, after doing my research I feel happier with arrow functions than I thought I would.

I suggest (as long as your runtime supports them) using arrow functions as the default. The reason for this is that they do less: I think the standard behavior of arguments, this and the OOP concepts (prototype and constructors) should be optional and require explicit use of function.

Just as one-line if-statements and if-statements without {} should be used carefully (I tend to abuse them myself), I think the same applies to arrow functions.

I think this is excellent:

names = stuffs.map(stuff => stuff.name);

but apart from those common simple cases I think the full syntax should be used for clarity:

const compareItems = (a,b) => {
  if ( a.name < b.name ) return -1;
  if ( a.name > b.name ) return  1;
  if ( a.id   < b.id   ) return -1;
  if ( a.id   > b.id   ) return  1;
  return 0;
};

(don’t try to be clever by omitting (), {}, or return).

The use of function should be reserved for

  • constructors
  • prototype functions
  • functions that need the standard behavior of this
  • functions that do things with arguments
  • source files where function is already used exclusively

Basic good functional programming practices should be especially respected when using arrow functions:

  • Don’t duplicate code: break out anonymous functions to named functions when appropriate
  • Don’t write long functions: break out anonymous functions to named functions when appropriate
  • Avoid side effects and global variables
  • Use the correct higher order function for the job

Also, obviously, take advantage of OOP and function when appropriate!

Callback functions
I think anonymous callback functions should generally be kept short.

const doStuff = () => {
  readFile('myFile', (error, data) => {
    if ( error )
      console.log('readFile failed: ' + error);
    else
      doStuffWithData(data);
  });
};

const doStuffWithData = (data) => {
  ...
};

Performance
In principle, I see no reason why arrow functions should not be at least as fast as regular function. In practice, the current state of JavaScript engines could be disappointing - I don't know.

However, a named static function is typically faster than an anonymous inline function. The JIT can typically optimize a function better the more it is run, so named, reusable functions are preferred.

I have made no benchmarks on arrow functions.

Feedback
I will start using arrow functions when I write new code and I feel enthusiastic about it. I will probably come across things I have not thought about. Do you have any thoughts on this? Let me know!

Vue.js: loading template html files

Update 2018-05-27: A few months have passed since I wrote this post. I have used my solution/library for several real applications and it has worked very well. So everything looks exactly as it did when I posted v0.1 and that is a good thing. There are obviously improvement opportunities and probably limitations/bugs. But for my purposes I have not encountered any problems to fix. And nobody has notified me of needed fixes.

You may want to code your Vue.js application in such a way that your html templates are in separate html files, but you still do not want a build/compile step. Well, the people writing Vue don’t want you to do this, but it can easily be done.

All you need is to download this single js file and include it in your Vue.js web page. All instructions and documentation required are found in the js file.

VueWithHtmlLoader-library
I wrote a little library that simply does what is required in a rather simple way. I will not hold you back and I will show you by example immediately:

  • A Rock-paper-scissors Vue-app, all in 1 file: link
  • A Rock-paper-scissors Vue-app, modularised with separate html/js files: link
  • Source of VueWithHtmlLoader library: link

These are the code changes needed to use VueWithHtmlLoader:

 * 1) After including "vue.js", and
 *    before including your component javascript files,
 *    include "vuewithhtmlloader.js"
 *
 * 2) In your component javascript files
 *    replace: Vue.component(
 *       with: VueWithHtmlLoader.component(
 *
 *    replace: template: '...'
 *       with: templateurl: 'component-template.html' (replace with your url)
 *
 * 3) The call to "new Vue()" needs to be delayed, like:
 *    replace: var myVue = new Vue(...);
 *       with: var myVue;          
 *             function initVue() {
 *               myVue = new Vue(...);
 *             }
 *             VueWithHtmlLoader.done(initVue);

My intention is that the very simple Rock-paper-scissors-app shall work as an example.

Disclaimer: the library is just written and tested only with this application. The application is written primarily to demonstrate the library. The focus has been clarity and simplicity. Please feel free to suggest improvements to the library or the application, but keep in mind that it was never my intention to follow all best practices. The purpose of the library is to break a Vue best practice.

What the library does:

  1. It creates a global object: VueWithHtmlLoader
  2. It provides a function: VueWithHtmlLoader.component() that you shall use instead of Vue.component() (there may be unsupported/untested cases)
  3. When using VueWithHtmlLoader.component(), you can provide templateurl: 'mytemplate.html' instead of template: 'whatever Vue normally supports'
  4. The Vue()-constructor must be called after all templateurls have been downloaded. To facilitate this, place the code that calls new Vue() inside a function, and pass that function to VueWithHtmlLoader.done()
  5. The library will now load all templateurls. When an html template is successfully downloaded over the network Vue.component() is called normally.
  6. When all components are initiated, new Vue() is called via the provided function

Apart from this, you can and should use the global Vue object normally for all other purposes. There may be more things that you want to happen after new Vue() has been called.

The library has no dependencies (it uses XMLHttpRequest directly).

Background
Obviously there are people (like me) with an AngularJS (that is v1) background who are used to ng-include and like it. We see Vue as a better, smaller AngularJS for the future, but we want to keep our templates in separate files without a build step.

I also expect many developers with various backgrounds to try out Vue.js. They may also benefit from a simple way to keep templates in separate files without worrying about a build tool.

As I see it, there are different sizes of applications (and sizes of team and support around them).

  1. Small single-file applications: I think it is great that Vue supports simple single-file applications (with x-template if you want), implemented like my game above. This has a niche!
  2. Applications that clearly require modularization, but optimizing loading times is not an issue, and you want to use the simplest tools available (keep html/js separate to allow standard editor support and not require a build step). AngularJS (v1) did this nicely. I intend Vue to do it nicely too with this library.
  3. Applications built by people or organizations that already use Webpack and such tools, or applications that are so demanding that such tools are required.

I fully respect and understand the Vue project does not want to support case 2 out of the box and that they prefer to keep the Vue framework small (and as fast as possible).

But I sense some kind of arrogance with articles like 7 Ways To Define A Component Template in Vue.js. I mean, options 1 and 2 are only useful for very small components. Option 3 is only useful for minimal applications that don’t require modularization. Option 4 has very narrow use cases. Option 5 is insane for normal development (however, I can see cases where you want to output/generate it). And options 6 and 7 require a build step.

8. Put the damn HTML in an HTML-file and include it? Nowhere to be seen.

The official objection to 8 is obviously performance. I understand that pre-compiling your html instead of serving html that the client will compile is faster. But compared to everything else this overhead may be negligible. And that is what performance is all about: focusing on what is critical and keeping everything else simple. My experience is that loading data into my applications takes much more time than loading the application itself.

The Illusion of Simplicity
AngularJS (v1) gave the illusion of simplicity. You just wrote JavaScript-files and (almost) HTML-files, the browser loaded everything and it just worked. I know this is just an illusion and a lot happens behind the scenes. But my experience is that this illusion works well, and it does not leak too much. Vue.js is so much simpler than AngularJS in so many ways. I think my library can keep my illusion alive.

Other options
There is a thread on Stack Overflow about this and there are obviously other solutions. If you want to write .vue-files and load them there is already a library for that. For my solution I was inspired by the simple jQuery example, but: 1) it is nice to not have a jQuery dependency, 2) it is nice to keep the async stuff in one place, 3) the delayed call of new Vue() seems forgotten.

Feedback, limitations, bugs…
If you have suggestions for improvements or fixes of my library, please let me know! I am happy to make it better and I intend to use it for real applications.

I think this library suits some but not all (or even most) Vue.js applications. Let’s not expect it to serve very complex requirements or applications that would actually benefit more from a Webpack treatment.

TODO and DONE

  • A minified version – I have not really decided on ambition/obfuscation level
  • Perhaps change loglevel if minified Vue is used? or not.
  • I had some problems with comments in html-files, but I failed to reproduce them. I think <!-- comments --> should definitely be supported.

JavaScript: Sets, Objects and Arrays

JavaScript has a new (well, well) fancy Set data structure (that does not come with functions for union, intersection and the like, but whatever). A little while ago I tested Binary Search (also not in the standard library) and I was quite impressed with the performance.

When I code JavaScript I often hesitate about using an Array or an Object. And I have not started using Set much.

I decided to make some tests. Let’s say we have pseudo-random natural numbers (like 10000 of them). We then want to check if a number is among the 10000 numbers or not (if it is a member of the set). A JavaScript Set does exactly that. A JavaScript Object just requires you to do set[314] = true and you are basically done (the key gets converted to a string, though). For an Array you just push(314), sort the array, and then use binary search to see if the value is there.
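A minimal sketch of the three membership strategies (the full test program below uses lodash.sortedIndex for the binary search, so I use it here too):

var lodash = require('lodash');

// Object as a set (the key is really the string '314')
var objSet = {};
objSet[314] = true;
var inObj = !!objSet[314];

// Set
var set = new Set();
set.add(314);
var inSet = set.has(314);

// Sorted Array + binary search
var arr = [];
arr.push(314);
arr.sort(function (a, b) { return a - b; });
var inArr = arr[lodash.sortedIndex(arr, 314)] === 314;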

Obviously, if you often add or remove values, (re)sorting the Array will be annoying and costly. But quite often this is not the case.

The test
My test consists of generating N=10000 random unique numbers (with distance 1 or 2 between them). I then insert them (in a kind of pseudo-random order) into an Array (and sort it), into an Object, and into a Set. I measure this time as an initiation time (for each data structure).

I repeat. So now I have 2xArrays, 2xObjects, 2xSets.

This way I can test both iterating and searching with all combinations of data structures (and check that the results are the same and thus correct).

Output of a single run: 100 iterations, N=10000, on a Linux Intel i5 and Node.js 8.9.1 looks like this:

                         ====== Search Structure ======
(ms)                        Array     Object      Set
     Initiate                1338        192      282
===== Iterate =====    
        Array                 800         39       93
       Object                 853        122      170
          Set                1147         82      131

By comparing columns you can compare the cost of searching (and initiating the structure before searching it). By comparing rows you can compare the cost of iterating over the different data structures (for example, iterating over Set while searching Array took 1147ms).

These results are quite consistent on this machine.

Findings
Some findings are very clear (I guess they are quite consistent across systems):

  • Putting values in an Array, to sort it, and the search it, is much slower and makes little sense compared to using an Object (or a Set)
  • Iterating an Array is a bit faster than iterating an Object or Set, so if you are never going to search, an Array is faster
  • The newer and more specialized Set offers little advantage over good old Objects

What is more unclear is why iterating over Objects is faster when searching Arrays, but iterating over Sets is faster when searching Objects or Sets. What I find is:

  • Sets seem to perform comparably to Objects on Raspberry Pi, ARMv7.
  • Sets seem to underperform more on Mac OS X

Obviously, all this is very unclear and can vary depending on CPU cache, Node version, OS and other factors.

Smaller and Larger sets
These findings hold quite well for smaller N=100 and larger N=1000000. The Array, despite being O(n log n), does not get much worse for N=1000000 than it already was for N=10000.

Conclusions and Recommendation
I think the conservative choice is to use Arrays when order is important or you know you will not look for a member based on its unique id. If members have unique IDs and are not ordered, use Object. I see no reason to use Set, especially if you target browsers (support in IE is still limited in early 2018).

The Code
Here follows the source code. Output is not quite as pretty as the table above.

var lodash = require('lodash');

function randomarray(size) {
  var a = new Array(size);
  var x = 0;
  var i, r;
  var j = 0;
  var prime = 3;

  if ( 50   < size ) prime = 31;
  if ( 500  < size ) prime = 313;
  if ( 5000 < size ) prime = 3109;

  for ( i=0 ; i<size ; i++ ) {
    r = 1 + Math.floor(2 * Math.random());
    x += r;
    a[j] = '' + x;
    j += prime;
    if ( size <= j ) j-=size;
  }
  return a;
}

var times = {
  arr : {
    make : 0,
    arr  : 0,
    obj  : 0,
    set  : 0
  },
  obj : {
    make : 0,
    arr  : 0,
    obj  : 0,
    set  : 0
  },
  set : {
    make : 0,
    arr  : 0,
    obj  : 0,
    set  : 0
  }
}

function make_array(a) {
  times.arr.make -= Date.now();
  var i;
  var r = new Array(a.length);
  for ( i=a.length-1 ; 0<=i ; i-- ) {
    r[i] = a[i];
  }
  r.sort();
  times.arr.make += Date.now();
  return r;
}

function make_object(a) {
  times.obj.make -= Date.now();
  var i;
  var r = {};
  for ( i=a.length-1 ; 0<=i ; i-- ) {
    r[a[i]] = true;
  }
  times.obj.make += Date.now();
  return r;
}

function make_set(a) {
  times.set.make -= Date.now();
  var i;
  var r = new Set();
  for ( i=a.length-1 ; 0<=i ; i-- ) {
    r.add(a[i]);
  }
  times.set.make += Date.now();
  return r;
}

function make_triplet(n) {
  var r = randomarray(n);
  return {
    arr : make_array(r),
    obj : make_object(r),
    set : make_set(r)
  };
}

function match_triplets(t1,t2) {
  var i;
  var m = [];
  m.push(match_array_array(t1.arr , t2.arr));
  m.push(match_array_object(t1.arr , t2.obj));
  m.push(match_array_set(t1.arr , t2.set));
  m.push(match_object_array(t1.obj , t2.arr));
  m.push(match_object_object(t1.obj , t2.obj));
  m.push(match_object_set(t1.obj , t2.set));
  m.push(match_set_array(t1.set , t2.arr));
  m.push(match_set_object(t1.set , t2.obj));
  m.push(match_set_set(t1.set , t2.set));
  for ( i=1 ; i<m.length ; i++ ) {
    if ( m[0] !== m[i] ) {
      console.log('m[0]=' + m[0] + ' != m[' + i + ']=' + m[i]);
    }
  }
}

function match_array_array(a1,a2) {
  times.arr.arr -= Date.now();
  var r = 0;
  var i, v;
  for ( i=a1.length-1 ; 0<=i ; i-- ) {
    v = a1[i];
    if ( v === a2[lodash.sortedIndex(a2,v)] ) r++;
  }
  times.arr.arr += Date.now();
  return r;
}

function match_array_object(a1,o2) {
  times.arr.obj -= Date.now();
  var r = 0;
  var i;
  for ( i=a1.length-1 ; 0<=i ; i-- ) {
    if ( o2[a1[i]] ) r++;
  }
  times.arr.obj += Date.now();
  return r;
}

function match_array_set(a1,s2) {
  times.arr.set -= Date.now();
  var r = 0;
  var i;
  for ( i=a1.length-1 ; 0<=i ; i-- ) {
    if ( s2.has(a1[i]) ) r++;
  }
  times.arr.set += Date.now();
  return r;
}

function match_object_array(o1,a2) {
  times.obj.arr -= Date.now();
  var r = 0;
  var v;
  for ( v in o1 ) {
    if ( v === a2[lodash.sortedIndex(a2,v)] ) r++;
  }
  times.obj.arr += Date.now();
  return r;
}

function match_object_object(o1,o2) {
  times.obj.obj -= Date.now();
  var r = 0;
  var v;
  for ( v in o1 ) {
    if ( o2[v] ) r++;
  }
  times.obj.obj += Date.now();
  return r;
}

function match_object_set(o1,s2) {
  times.obj.set -= Date.now();
  var r = 0;
  var v;
  for ( v in o1 ) {
    if ( s2.has(v) ) r++;
  }
  times.obj.set += Date.now();
  return r;
}

function match_set_array(s1,a2) {
  times.set.arr -= Date.now();
  var r = 0;
  var v;
  var iter = s1[Symbol.iterator]();
  while ( ( v = iter.next().value ) ) {
    if ( v === a2[lodash.sortedIndex(a2,v)] ) r++;
  }
  times.set.arr += Date.now();
  return r;
}

function match_set_object(s1,o2) {
  times.set.obj -= Date.now();
  var r = 0;
  var v;
  var iter = s1[Symbol.iterator]();
  while ( ( v = iter.next().value ) ) {
    if ( o2[v] ) r++;
  }
  times.set.obj += Date.now();
  return r;
}

function match_set_set(s1,s2) {
  times.set.set -= Date.now();
  var r = 0;
  var v;
  var iter = s1[Symbol.iterator]();
  while ( ( v = iter.next().value ) ) {
    if ( s2.has(v) ) r++;
  }
  times.set.set += Date.now();
  return r;
}

function main() {
  var i;
  var t1;
  var t2;

  for ( i=0 ; i<100 ; i++ ) {
    t1 = make_triplet(10000);
    t2 = make_triplet(10000);
    match_triplets(t1,t2);
    match_triplets(t2,t1);
  }

  console.log('TIME=' + JSON.stringify(times,null,4));
}

main();

When to (not) use Web Workers?

Web Workers are a mature, simple, standardised and widely compatible technology that allows multithreaded JavaScript applications in the web browser.

I am not going to write about how to use Web Worker (check the excellent MDN article). I am going to write a little about when and why to (not) use Web Worker.

First, Web Workers are about performance. And performance is typically not the best thing to think about first when you code something.

Second, when you have performance problems and you throw more cores at the problem your best speedup is x2, x4 or xN. In 2018 it is quite common with 4 cores and that means in the optimal case you can make your program 4 times faster by using Web Workers. Unfortunately, if it was not fast enough from the beginning chances are a 4x speedup is not going to help much. And the cost of a 4x speedup is that 4 times more heat is produced, the battery will drain faster, and perhaps other applications will suffer. A more efficient algorithm can often produce a 10-100 times speedup without making the maintainability of the program suffer too much (and there are very many ways to make a non-optimised program faster).

Let us say we have a web application. The user clicks “Show report”, the GUI locks/blocks for 10s and then the report displays. The user might accept that the GUI locks, if just for 1-2 seconds. Or the user might accept that the report takes 10s to compute, if it shows up little by little and the program does not appear hung. The way we could deal with this in JavaScript (which is single-threaded and asynchronous) is to break the 10s report calculation into small pieces (say 100 pieces each taking 100ms) and after calculating each piece call window.setTimeout, which allows the UI to update (among other things) before calculating another piece of the report. Perhaps a more common and practical approach is to divide the 10s job into logical parts: fetch data, make calculations, make report. But this would not much improve the locked GUI situation, since some (or all) parts still take significant (blocking) time.
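A rough sketch of that setTimeout approach (the pieces array and computePiece are made up; each computePiece call is assumed to take about 100ms):

function computeReportInPieces(pieces, done) {
  var results = [];
  var i = 0;
  function step() {
    if ( i < pieces.length ) {
      results.push(computePiece(pieces[i]));  // one ~100ms piece of the report
      i++;
      window.setTimeout(step, 0);             // let the UI update before the next piece
    } else {
      done(results);
    }
  }
  step();
}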

If we could send the entire 10s job to a Web Worker our program GUI would be completely responsive while the report is generated. Now the key limitation of a web worker (which is also what allows it to be simple and safe):

Data is copied to the Worker before it starts, and copied from the Worker when it has completed (rather than being passed by reference).

This means that if you already have a lot of data, it might be quite expensive to copy that data to the web worker, and it might actually be cheaper to just do the job where the data already is. In the same way, since there is some overhead in calling the Web Worker, you can’t send too many too small pieces of work to it, because you will occupy yourself with sending and receiving messages rather than just doing the job right away.
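To make the copying concrete, here is a minimal sketch (not a tutorial, see the MDN article; bigArray is assumed to exist) where the whole array is copied to the worker and the result is copied back:

// main.js
var worker = new Worker('worker.js');
worker.onmessage = function (e) {
  console.log('median = ' + e.data);   // the result, copied back from the worker
};
worker.postMessage(bigArray);          // the entire array is copied to the worker

// worker.js
onmessage = function (e) {
  var a = e.data.slice().sort(function (x, y) { return x - y; });
  postMessage(a[Math.floor(a.length / 2)]);
};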

This leaves us with obvious candidates for web workers (you can use Google):

  • Expensive searches (like chess moves or travelling salesman solutions)
  • Encryption (but chances are you should not do it in JavaScript in the first place, for security reasons)
  • Spell and grammar checker (I don’t know much about this).
  • Background network jobs

This is not too useful in most cases. What would be useful would be to send packages of work (arrays), like streams in a functional programming way: map(), reduce(), sort(), filter().

I decided to write some Web Worker tests based on sort(). Since I can not (easily, and there are probably good reasons) write JavaScript in WordPress I wrote a separate page with the application. Check it out now:

So, for 5 seconds I try to do the following job as many times as I can, while I keep track of how much the GUI is suffering:

  1. create an array of 10001 random numbers: O(n)
  2. sort it: O(n log n)
  3. get the median (array[5000]): O(1)

The expensive part is step 2, the sort (well, I actually have not measured 1 vs 2). If the ratio of work done per byte sent is high enough, it can be worth sending the job to a Web Worker.

If you run the tests yourself I think you will see that the first Web Worker tests, which outsource all of 1-2-3, are quite ok. But this basically means giving the web worker no data at all and, when it has done a significant amount of work, receiving just a few numbers back. This is more Web Worker friendly than chess, where at least the board would need to be sent.

If you then run the tests that outsource just sort() you see significantly lower throughput. How suitable is sort()? Well, sorting 10k ~ 2^13 elements should require each element to be compared (accessed) about 13 times. And there is no data sent that is not needed by the Web Worker. Just as a counter example: if you send an order to get back the sum of the lines, most of the order data is ignored by the Web Worker, and it just needs to access each line value once; much, much less suitable than sort().

Findings from tests
I find that sort(), being O(n log n), on an array of numbers is far too fast to be outsourced to a Web Worker. You need to find a much more “dense” problem to benefit from a Web Worker.

Islands of data
If you can design your application in such a way that one Web Worker maintains its own full state and just shares small selected parts occasionally, that could work. The good thing is that this would also be clean encapsulation of data and separation of responsibilities. The bad thing is that you probably need to design with the Web Worker in mind quite early, and this kind of premature optimization is often a bad idea.

This could be letting a Web Worker do all your I/O. But if most data that you receive is needed in your application, and most data you send comes straight from your application, the benefit is very questionable. And if most data you receive is not needed in your application, perhaps you should not receive so much data in the first place. Even if you process your incoming data quite a lot (validating, integrating with current state, precalculating), I would not expect it to come very close to the computational intensity of my sort().

Conclusions
The simplicity and safety of Web Workers is unfortunately also their biggest limitation. The primary reason for using a Web Worker should be performance, and even for artificial problems it is hard to get any benefit.

Note to self: never try-catch more than necessary!

I wrote a function, and then a unit test, and the unit test was good.
Then I called the function from my real project, and it failed!

I isolated the problem and thought I had found a bug in V8 (except after many years as a programmer I have learnt it is never the compiler's fault).

This was my output:

$ node bug.js 
Test good
main: err=Not JSON

This is my simplified (faulty) code:

function callSomething(callback) {
  var rawdata = '{ "a":"1" }';
  var jsondata; 

  try {
    jsondata = JSON.parse(rawdata);
    callback(null,jsondata);
  } catch (e) {
    callback('Not JSON', null);
  }
}

function test() {
  callSomething(function(err,data) {
    if ( err ) console.log('Test bad: ' + err);
    console.log('Test good');
  });
}

function main() {
  var result = {
    outdata : {}
  };

  callSomething(function(err,data) {
    if ( err ) {
      console.log('main: err=' + err);
    } else {
      result.outata.json = data;
      console.log('main: json=' + JSON.stringify(result.outdata.json));
    }
  });
}

test();
main();

How can the test not fail when main fails?

Well, here is the correct output

$ node nodebug.js 
Test good
main: json={"a":"1"}

of the corrected main() function:

function main() {
  var result = {
    outdata : {}
  };

  callSomething(function(err,data) {
    if ( err ) {
      console.log('main: err=' + err);
    } else {
//    result.outata.json = data;
      result.outdata.json = data;
      console.log('main: json=' + JSON.stringify(result.outdata.json));
    }
  });
}

The misnamed property caused an Error which was (unintentionally) caught, causing the anonymous callback function to be called once more, this time with err set, but to the wrong error.

It would have been better to write:

function callSomething(callback) {
  var rawdata = '{ "a":"1" }';
  var jsondata; 

  try {
    jsondata = JSON.parse(rawdata);
  } catch (e) {
    callback('Not JSON', null);
    return;
  }
  callback(null,jsondata);
}

and the misnamed property error would have crashed the program in the right place.

Conclusion
Don’t ever try more things than necessary. And if you need to try several lines, consider a separate try for each.
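A small sketch of what that can look like (reading and parsing a config file; the file name is made up): each statement that can throw gets its own try, so an unrelated error elsewhere is not caught by mistake.

var nodeFs = require('fs');

function loadConfig(callback) {
  var raw, parsed;

  try {
    raw = nodeFs.readFileSync('./config.json');   // may fail: missing file
  } catch (e) {
    return callback('Cannot read file', null);
  }

  try {
    parsed = JSON.parse(raw);                     // may fail: invalid JSON
  } catch (e) {
    return callback('Not JSON', null);
  }

  callback(null, parsed);
}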

All JavaScript objects are not equally fast

One thing I like with JavaScript and NodeJS is to have JSON in the entire stack. I store JSON on disk, process JSON data server side, send JSON over HTTP, process JSON data client side, and the web GUI can easily present JSON (I work with Angular).

As a result of this, not all objects are created the same. Let’s say I keep track of Entries: I have an Entry constructor that initiates new objects with all fields (no more, no less). At the same time I receive Entry objects as JSON data over the network.

A strategy is needed:

  1. Have a mix of raw JSON-Entries and Objects that are instanceof Entry
  2. Create real Entry-objects from all JSON-data
  3. Only work with raw JSON-Entries

Note that if you don’t go with (2) you can’t use prototype, expect objects to have functions or use instanceof to identify objects.

Another perhaps not obvious aspect is that performance is not the same. When you create a JavaScript object using new, the runtime actually creates a class with fast-to-access properties (see the sketch after the list below). Such object properties are faster than those of

  • an empty object {} with properties set afterwards
  • an object created with JSON.parse()
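A minimal sketch of strategy (2) above (the Entry fields are made up, and jsonData is assumed to be the raw JSON text from the network): every Entry gets the same shape because the constructor sets all fields.

function Entry(raw) {
  raw = raw || {};
  this.id   = raw.id   || 0;
  this.name = raw.name || '';
  this.date = raw.date || null;
}

Entry.prototype.isValid = function () {
  return 0 < this.id && 0 < this.name.length;
};

// Turn raw JSON objects into real Entry objects
var entries = JSON.parse(jsonData).map(function (raw) { return new Entry(raw); });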

I wrote a program to test this. The simplified explanation is that I obtained an array of objects that I then sorted/calculated a few (6) times. For a particular computer and problem size I got these results:

TIME   PARAMETER   DESCRIPTION
3.3s       R       Produce random objects using "new"
4.4s       L       Load objects from json-file using JSON.parse()
3.0s       L2      json-file, JSON.parse(), send raw objects to constructor
3.2s       L3      load objects using require() from a js-file

I will be honest and say that the implementation of the compare-function sent to sort() matters. Some compare functions suffered more or less from different object origins. Some compare functions are more JIT-optimised and faster the second run. However, the consistent finding is that raw JSON-objects are about 50% slower than objects created with new and a constructor function.

What is not presented above is the cost of parsing and creating objects.

My conclusion from this is that unless you have very strict performance requirements you can use the raw JSON-objects you get over the network.

Below is the source code (for Node.js). Apart from the parameters R, L, L2 and L3 there is also an S(tore) parameter. It creates the json- and js-files used by the Load options. So typically run the program with the S option first, and then the other options. A typical run looks like this:

$ node ./obj-perf.js S
Random: 492ms
Store: 1122ms

$ node ./obj-perf.js R
Random: 486ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3350ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3361ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3346ms

$ node ./obj-perf.js L
Load: 376ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 4382ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 4408ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 4453ms

$ node ./obj-perf.js L2
Load: 654ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3018ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 2974ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 2890ms

$ node ./obj-perf.js L3
Load: 1957ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3436ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3264ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3199ms

The columns with numbers (110511) are checksums calculated between the sorts. They should all be equal; apart from that the values don’t matter.

const nodeFs = require('fs');

function Random(seed) {
  this._seed = seed % 2147483647;
  if (this._seed <= 0) this._seed += 2147483646;
}

Random.prototype.next = function () {
  return this._seed = this._seed * 16807 % 2147483647;
};

function Timer() {
  this.time = Date.now();
}

Timer.prototype.split = function() {
  var now = Date.now();
  var ret = now - this.time;
  this.time = now;
  return ret;
};

function Point() {
  this.a = -1;
  this.b = -1;
  this.c = -1;
  this.d = -1;
  this.e = -1;
  this.f = -1;
  this.x =  0;
}

function pointInit(point, rand) {
  var p;
  for ( p in point ) {
    point[p] = rand.next() % 100000;
  }
}

function pointLoad(json) {
  var p;
  var point = new Point();
  for ( p in point ) {
    point[p] = json[p];
  }
  return point;
}

function pointCmp(a,b) {
  return pointCmpX[a.x](a,b,a.x);
}

function pointCmpA(a,b) {
  if ( a.a !== b.a ) return a.a - b.a;
  return pointCmpB(a,b);
}

function pointCmpB(a,b) {
  if ( a.b !== b.b ) return a.b - b.b;
  return pointCmpC(a,b);
}

function pointCmpC(a,b) {
  if ( a.c !== b.c ) return a.c - b.c;
  return pointCmpD(a,b);
}

function pointCmpD(a,b) {
  if ( a.d !== b.d ) return a.d - b.d;
  return pointCmpE(a,b);
}

function pointCmpE(a,b) {
  if ( a.e !== b.e ) return a.e - b.e;
  return pointCmpF(a,b);
}

function pointCmpF(a,b) {
  if ( a.f !== b.f ) return a.f - b.f;
  return pointCmpA(a,b);
}

var pointCmpX = [pointCmpA,pointCmpB,pointCmpC,pointCmpD,pointCmpE,pointCmpF];

function pointDist(a,b) {
  return Math.min(
    (a.a-b.a)*(a.a-b.a),
    (a.b-b.b)*(a.b-b.b),
    (a.c-b.c)*(a.c-b.c),
    (a.d-b.d)*(a.d-b.d),
    (a.e-b.e)*(a.e-b.e),
    (a.f-b.f)*(a.f-b.f)
  );
}

function getRandom(N) {
  var i;
  var points = new Array(N);
  var rand   = new Random(14);

  for ( i=0 ; i<N ; i++ ) {
    points[i] = new Point();
    pointInit(points[i], rand);
  }
  return points;
}

function test(points) {
  var i,j;
  var dist;
  var dists = [];

  for ( i=0 ; i<6 ; i++ ) {
    dist = 0;
    for ( j=0 ; j<points.length ; j++ ) {
      points[j].x = i;
    }
    points.sort(pointCmp);
    for ( j=1 ; j<points.length ; j++ ) {
      dist += pointDist(points[j-1],points[j]);
    }
    dists.push(dist);
  }
  return 'DISTS=' + dists.join(', ');
}

function main_store(N) {
  var timer = new Timer();
  var points = getRandom(N);
  console.log('Random: ' + timer.split() + 'ms');
  nodeFs.writeFileSync('./points.json', JSON.stringify(points));
  nodeFs.writeFileSync('./points.js', 'exports.points=' +
                                      JSON.stringify(points) + ';');
  console.log('Store: ' + timer.split() + 'ms');
}

function main_test(points, timer) {
  var i, r;
  for ( i=0 ; i<3 ; i++ ) {
    r = test(points);
    console.log(r + ' : ' + timer.split() + 'ms');
  }
}

function main_random(N) {
  var timer = new Timer();
  var points = getRandom(N);
  console.log('Random: ' + timer.split() + 'ms');
  main_test(points, timer);
}

function main_load() {
  var timer = new Timer();
  var points = JSON.parse(nodeFs.readFileSync('./points.json'));
  console.log('Load: ' + timer.split() + 'ms');
  main_test(points, timer);
}

function main_load2() {
  var timer = new Timer();
  var points = JSON.parse(nodeFs.readFileSync('./points.json')).map(pointLoad);
  console.log('Load: ' + timer.split() + 'ms');
  main_test(points, timer);
}

function main_load3() {
  var timer = new Timer();
  var points = require('./points.js').points;
  console.log('Load: ' + timer.split() + 'ms');
  main_test(points, timer);
}

function main() {
  var N = 300000;
  switch ( process.argv[2] ) {
  case 'R':
    main_random(N);
    break;
  case 'S':
    main_store(N);
    break;
  case 'L':
    main_load();
    break;
  case 'L2':
    main_load2();
    break;
  case 'L3':
    main_load3();
    break;
  default:
    console.log('Unknown mode=' + process.argv[2]);
    break;
  }
}

main();

JavaScript: await async

With Node.js version 8 there is finally a truly attractive alternative to good old callbacks.

I was never a fan of promises, and implementing await-async as a library is not pretty. Now when await and async are keywords in JavaScript things change.

The below program demonstrates a simple async function doing IO: ascertainDir. It creates a directory, but if it already exists no error is thrown (if there is already a file with the same name, no error is thrown, and that is a bug but it will do for the purpose of this article).

There are four modes of the program: CB (callback), PROMISE, AWAIT-LIB and AWAIT-NATIVE. Creating a folder (x) should work. Creating a folder in a nonexisting folder (x/x/x) should fail. Below is the output of the program and as you see the end result is the same for the different asynchronous strategies.

$ node ./await-async.js CB a
Done: a
$ node ./await-async.js CB a/a/a
Done: Error: ENOENT: no such file or directory, mkdir 'a/a/a'

$ node ./await-async.js PROMISE b
Done: b
$ node ./await-async.js PROMISE b/b/b
Done: Error: ENOENT: no such file or directory, mkdir 'b/b/b'

$ node ./await-async.js AWAIT-LIB c
Done: c
$ node ./await-async.js AWAIT-LIB c/c/c
Done: Error: ENOENT: no such file or directory, mkdir 'c/c/c'

$ node ./await-async.js AWAIT-NATIVE d
Done: d
$ node ./await-async.js AWAIT-NATIVE d/d/d
Done: Error: ENOENT: no such file or directory, mkdir 'd/d/d'

The program itself follows:

     1	var nodefs = require('fs')
     2	var async = require('asyncawait/async')
     3	var await = require('asyncawait/await')
     4	
     5	
     6	function ascertainDirCallback(path, callback) {
     7	  if ( 'string' === typeof path ) {
     8	    nodefs.mkdir(path, function(err) {
     9	      if (!err) callback(null, path)
    10	      else if ('EEXIST' === err.code) callback(null, path)
    11	      else callback(err, null)
    12	    })
    13	  } else {
    14	    callback('mkdir: invalid path argument')
    15	  }
    16	};
    17	
    18	
    19	function ascertainDirPromise(path) {
    20	  return new Promise(function(fullfill,reject) {
    21	    if ( 'string' === typeof path ) {
    22	      nodefs.mkdir(path, function(err) {
    23	        if (!err) fullfill(path)
    24	        else if ('EEXIST' === err.code) fullfill(path)
    25	        else reject(err)
    26	      })
    27	    } else {
    28	      reject('mkdir: invalid path argument')
    29	    }
    30	  });
    31	}
    32	
    33	
    34	function main() {
    35	  var method = 0
    36	  var dir    = 0
    37	  var res    = null
    38	
    39	  function usage() {
    40	    console.log('await-async.js CB/PROMISE/AWAIT-LIB/AWAIT-NATIVE directory')
    41	    process.exit(1)
    42	  }
    43	
    44	  switch ( process.argv[2] ) {
    45	  case 'CB':
    46	  case 'PROMISE':
    47	  case 'AWAIT-LIB':
    48	  case 'AWAIT-NATIVE':
    49	    method = process.argv[2]
    50	    break
    51	  default:
    52	    usage();
    53	  }
    54	
    55	  dir = process.argv[3]
    56	
    57	  if ( process.argv[4] ) usage()
    58	
    59	  switch ( method ) {
    60	  case 'CB':
    61	    ascertainDirCallback(dir, function(err, path) {
    62	      console.log('Done: ' + (err ? err : path))
    63	    })
    64	    break
    65	  case 'PROMISE':
    66	    res = ascertainDirPromise(dir)
    67	    res.then(function(path) {
    68	      console.log('Done: ' + path)
    69	    },function(err) {
    70	      console.log('Done: ' + err)
    71	    });
    72	    break
    73	  case 'AWAIT-LIB':
    74	    (async(function() {
    75	      try {
    76	        res = await(ascertainDirPromise(dir))
    77	        console.log('Done: ' + res)
    78	      } catch(e) {
    79	        console.log('Done: ' + e)
    80	      }
    81	    })());
    82	    break
    83	  case 'AWAIT-NATIVE':
    84	    (async function() {
    85	      try {
    86	        res = await ascertainDirPromise(dir)
    87	        console.log('Done: ' + res)
    88	      } catch(e) {
    89	        console.log('Done: ' + e)
    90	      }
    91	    })();
    92	    break
    93	  }
    94	}
    95	
    96	main()

Please note:

  1. The anonymous function on line 74 would not be needed if main() itself was async()
  2. The anonymous function on line 84 would not be needed if main() itself was async
  3. A function that returns a Promise() (line 19) works as an async function without the async keyword.

Callback
Callback is the old simple method of dealing with asynchronous things in JavaScript. A major complaint has been “callback hell”: if you call several functions in sequence it can get rather messy. I can agree with that, BUT I think each asynchronous call deserves its own error handling anyway (and with proper error handling other options tend to be equally tedious).

Promise
I don’t think using a promise (66-71) is very nice. It is of course a matter of habit. One thing is that not all requests in the success-path are actually successes in real life, and not all errors are errors (like in ascertainDir). Very commonly you make an http-request which itself is good, but the data you receive is not good, so you want to proceed with error handling. This means that the fulfill case needs to execute the same code as the reject case, for some “successful” replies. Promises can be chained, but that typically results in ignoring proper error handling.
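An illustrative (entirely made-up) example of that point: the request fulfills, but the reply still needs to be routed to the error handling.

fetchOrder(id).then(function (reply) {
  if ( reply.status !== 'OK' ) {
    // The promise fulfilled, but for our purposes this is an error,
    // so the same handling as the reject path is needed here
    return handleError(reply.status);
  }
  return handleOrder(reply.order);
}, function (err) {
  return handleError(err);
});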

asyncawait library
I think the syntax of the asyncawait library is rather horrible, but it works as a proof-of-concept for the real thing.

async await native keywords
With the async/await keywords in JavaScript, suddenly asynchronous code can be handled just like in Java or C#. Since it is familiar it is appealing! No doubt it is clean and practical. I would hesitate to mix it with Callbacks or Promises, and would rather wait until I can do a complete rewrite.

Common sources of bugs in JavaScript are people trying to return from within (callback/promise) functions, people not realising the rest of the code continues to run after the asynchronous call, or things related to variable scopes. I guess in most cases await/async makes these things cleaner and easier, but I would expect problems where it causes unexpected effects when not properly used.

Finally, if you start using the async/await keywords there is no polyfill or fallback for older browsers (maybe Babel can do that for you). As usual, IE seems to lag behind, and you can forget about Node v6 (or earlier). Depending on your situation, this could be a show stopper or no issue at all.

Watch something?
For more details, I can recommend this video on 5 architectures of asynchronous JavaScript.

Lodash Performance Sucks!

To continue my Functional Programming Sucks series of posts I will have a closer look at reduce().

I complained about Lodash (and Underscore) for different reasons. One complaint was performance, but I just read the code and presumed it was going to be slow, without measuring. Then I complained about the performance of Functional Programming in general.

I thought it would be interesting to “improve” the Functional code with Lodash functions, and to my surprise (I admit I was both wrong and surprised) I found Lodash made it faster! After reading a little more about it I discovered this is a well known fact.

So, here are four different implementations of a function that checks if the elements (numbers) in an array are ordered (cnt is incremented if the array is sorted, such was the original problem).

// Standard reduce()
    this.test = function(p) {
        if ( false !== p.reduce(function(acc,val) {
            if ( false === acc || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

// Lodash reduce(), and some other Lodash waste
    this.test = function(p) {
        if ( false !== LO.reduce(p,function(acc,val) {
            if ( false === acc || val < acc ) return false;
    //      if ( !LO.isNumber(acc) || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

// My own simpleReduce(), 4 minutes to implement, see below
    this.test = function(p) {
        if ( false !== simpleReduce(p,function(acc,val) {
            if ( false === acc || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

// A simple imperative version
    this.test = function(p) {
        var i;
        for ( i=1 ; i < p.length ; i++ ) {
            if ( p[i] < p[i-1] ) return;
        }
        cnt++;
    };

// my own implementation reduce()
    function simpleReduce(array, func, initval) {
         var i;
         var v = initval;
         for ( i=0 ; i<array.length ; i++ ) {
             v = func(v, array[i]);
         }
         return v;
    }

The interesting thing here is that the standard library reduce() is the slowest.
However, my simpleReduce is faster than Lodash reduce().

                                ========== reduce() ==========
(seconds)                       Std Lib   Lodash   Simple   Imperative
Raspberry Pi v1 (ARMv6 @ 700)        21       13      9.3          4.8
MacBook Air (Core i5 @ 1400)       0.46     0.23     0.19         0.16

Conclusion
The conclusion is that from a performance perspective Functional Programming sucks. Lodash sucks too, but a little bit less so than the standard library (however, if you decorate all your code with isEmpty, isString, isNumber and that crap it will get worse).

That said, the generic nature of Lodash comes at a cost. The simplest reduce() imaginable (my simpleReduce) outperforms Lodash. As I see it, this leaves Lodash in a pretty bad (or small) place:

  • Compared to the standard library it is an extra dependency with limited performance benefits
  • The generic nature of Lodash comes at both a performance cost and it allows for sloppy coding
  • A hand written reduce() outperforms Lodash and is a good exercise for anyone to write. I expect this is quite true also for other functions like take or takeRight.
  • For best performance, avoid Functional Programming (and in this case the imperative version is arguably more readable than the FP reduce() versions)

What’s up with the Standard Library???
JavaScript is a scripted language (interpreted with a JIT compiler) that has a standard library written in C++. How can anything written in JavaScript execute faster than anything in the standard library that does the same thing?

First, kudos to the JIT designers! Amazing job! Perhaps the standard library people can learn from you?

I can imagine the standard library functions are doing some tests or validations that are somehow required by the standard, and that a faster and less strict version of reduce() would possibly break existing code (although this sounds far fetched).

I can (almost not) imagine that there is a cost of going from JS to native and back to JS: that function calls to native code come with overhead. Like going from user space to kernel space. It sounds strange.

I have read that there are optimizations techniques applied to Lodash (like lazy evaluation), but I certainly didn’t do anything like that in my simpleReduce().

For Node.js, optimizing the standard library truly would make sense. In the native code of the standard library of a single-threaded server application, every cycle counts.

UPDATE: I tried replacing parts of the above code: 1) the lambda function that is passed to reduce(), 2) the imperative version, with native code. That is, I wrote C++ code for V8 and used it instead of JavaScript code. In both cases this was slower! Obviously there is some overhead in going between native code and the JavaScript JIT, and for rather small functions this overhead makes C++ “slower” than JavaScript. My idea was to write a C++ reduce() function but I think the two functions I wrote are enough to show what is happening here. Conclusion: don’t write small native C++ functions for performance, and for maximum performance it can be worth rewriting the standard library in JavaScript (although this is insane to do)!

All FP-sucks related articles
Functional Programming Sucks
Underscore.js sucks! Lodash sucks!
Functional Programming Sucks! (it is slow)
Lodash Performance Sucks! (this one)

Functional Programming Sucks! (it is slow)

Update 2017-12-05: I added a new test in the end that came from real code.
It is both true that functional code is slower and that Node.js v8 is tightening the gap.

Update 2017-07-17: Below I present numbers showing that functional code is slower than imperative code. It seems this has changed with newer versions of Node.js: functional code has not become faster, but imperative code has become slower. You can read a little more about it in the comments. I will look more into this. Keep in mind that the findings below may be more accurate for Node.js v4-6 than for v8.

Functional programming is very popular with contemporary JavaScript programmers. As I have written before, Functional programming sucks and functional libraries for JavaScript also suck.

In this post I will explain more why Functional Programming sucks. I will start with the conclusion. Read on as long as you want more details.

Functional Programming practices are bad for performance
It is very popular to feed lambda functions to map(), reduce(), filter() and others. If you do this carelessly the performance loss is significant.

It is also popular to work with immutable data. That is, you avoid functions that change (mutate) current state (side effects) and instead you produce a new state (a pure function). This puts a lot of pressure on the garbage collector and it can destroy performance.
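A small made-up example of the difference: the immutable style allocates a new object for every update (work for the garbage collector), while the mutating style reuses the object it already has.

// Immutable style: every call allocates a new object
function withScore(player, score) {
  return Object.assign({}, player, { score: score });
}

// Mutating style: no allocation, the existing object is updated in place
function setScore(player, score) {
  player.score = score;
  return player;
}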

The Benchmark Problem
Sometimes I entertain myself solving problems on Hackerrank.com. I particularly like the mathematical challenges in the Project Euler section (Project Euler is also an independent organisation – HackerRank uses its challenges to create programming challenges).

This article refers to Project Euler 32. I will not go into details, but the solution is basically:

  1. Generate all permutations of the numbers 1,2,3,4,5,6,7,8,9 (there are 9! of them)
  2. For each permutation, check if it is “good” (very few are)
  3. Print the sum of the good instances

The first two steps give good benchmark problems. I have made different implementations of (1) and (2) and then compared the results.

Benchmark Results
I have three different permutation generators (all recursive functions):

  1. Pure function, immutable data (it may not be strictly pure)
  2. Function that mutates its own internal state, but not its input
  3. Function that mutates shared data (no allocation/garbage collection)

I also have three different test functions:

  1. Tests the original Project Euler problem
  2. Simplified test using reduce() and a lambda function
  3. Simplified test implemented as a standard loop

I benchmarked on two different systems using Node.js version 6. I have written elsewhere that Node.js performance on Raspberry Pi sucks.

                               == Project Euler Test ==    ==== Simplified Test ====
(seconds)
Test Function:                                              Functional   Imperative
Permutation Gen:                Pure    Semi   Shared         Shared    Shared   Pure
Raspberry Pi v1 (ARMv6 @ 700)     69      23      7.4             21       3.7     62
MacBook Air (Core i5 @ 1400)    0.77    0.29     0.13           0.40      0.11   0.74

Comparing columns 1-2-3 shows the performance of different generators (for Project Euler test)
Comparing columns 4-5 shows the performance of two different test functions (using fast generator)
Comparing columns 5-6 shows the performance of two different generators (for fast simple test)

This shows that the benefit of using shared/mutable data (not running the garbage collector) instead of immutable data is 5x performance on the Intel CPU and even more on the ARM. Also, the cost of using reduce() with a lambda function is more than 3x overall performance on the Intel CPU, and even more on the ARM.

For both the test function and the permutation generator, choosing the slow functional version significantly slows down the entire program.

The conclusion of this is that unless you are quite sure your code will never be performance critical you should avoid functional programming practices. It is a lot easier to write imperative code than to later scale out your architecture when your code does not perform.

However, the pure immutable implementation of the permutation generator is arguably much simpler than the iterative (faster) counterpart. When you look at the code you may decide that the performance penalty is acceptable to you. When it comes to the reduce() with a lambda function, I think the imperative version is easier to read (and much faster).

Please notice that if your code consists of nice, testable, replaceable parts without side effects you can optimize later on. The functional principles are more valuable at a higher level. If you define your functions so that they behave like nice FP functions, it does not matter (for performance) that they are implemented using imperative principles inside.
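A minimal sketch of what I mean (the function and its name are made up): it has a pure, FP-friendly interface – it does not touch its argument and communicates only through its return value – but internally it uses a plain loop and local mutation for speed.

// Pure from the outside: no side effects on the argument, result via return value
function doubledValues(list) {
    var i;
    var result = new Array(list.length);
    for ( i=0 ; i<list.length ; i++ ) {
        result[i] = 2 * list[i];    // mutation of a local array only
    }
    return result;
}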

Generating Permutations
I used the following simple method for generating permutations. I start with two arrays and I send them to my permute-function:

  head = [];
  tail = [1,2,3,4];

  permute(head,tail);

My permute function first checks if tail is empty; if so, it tests/evaluates head.
Otherwise it generates 4 new sets of head and tail (one for each element in tail):

  permute( [1] , [2,3,4] )
  permute( [2] , [1,3,4] )
  permute( [3] , [1,2,4] )
  permute( [4] , [1,2,3] )

The difference in implementation is:

  • Pure version generates all the above 8 arrays as new arrays using standard array functions
  • Semi pure version generates its own 2 arrays (head and tail) and then uses a standard loop to change the values of the arrays between the (recursive) calls to permute.
  • Shared version simply creates a single head-array and 9 tail-arrays (one for each recursion step) up front. It then reuses these arrays throughout the 9! iterations. (These are not global variables; they are hidden and private to the permutation generator.)

The simplified test
The simplified test checks if the array is sorted: [1,2,3,4]. Of all permutations, there is always exactly one that is sorted. It is a simple test to implement (especially with a loop).

// These functions are part of a "test-class" starting like:
function testorder1() {
    var cnt = 0;

// Functional test
    this.test = function(p) {
        if ( false !== p.reduce(function(acc,val) {
            if ( false === acc || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

// Iterative test (much faster)
    this.test = function(p) {
        var i;
        for ( i=1 ; i<p.length ; i++ ) {
            if ( p[i] < p[i-1] ) return;
        }
        cnt++;
    };

I tried to optimise the functional reduce() version by breaking out a named function. That did not help. I also tried to let the function always return the same type (now it returns false OR a number) but that also made no difference at all.
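For reference, the named-function variant I tried looked roughly like this (same logic as the anonymous callback above, inside the same test-class, so it is not surprising that it performed the same):

// Named reducer, broken out of the reduce() call
function orderedSoFar(acc, val) {
    if ( false === acc || val < acc ) return false;
    return val;
}

this.test = function(p) {
    if ( false !== p.reduce(orderedSoFar, -1) ) cnt++;
};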

All the code
For those who want to run this themselves or compare the permutation functions here is the entire program.

As mentioned above, the slowest (immutable data) permutation function is a lot smaller and easier to understand than the fastest (shared data) implementation.


'use strict';

// UTILITIES

function arrayToNum(p, s, e) {
    var r = 0;
    var m = 1;
    var i;
    for ( i=e-1 ; s<=i ; i-- ) {
        r += m * p[i];
        m *= 10;
    }
    return r;
}

function arrayWithZeros(n) {
    var i;
    var a = new Array(n);
    for ( i=0 ; i<a.length ; i++ ) a[i] = 0;
    return a;
}


// PERMUTATION ENGINES

function permutations0(n, callback) {
    // intentionally empty: generates no permutations (baseline)
}

// IMMUTABLE (SLOWEST)

function permutations1(n, callback) {
    var i;
    var numbers = [];
    for ( i=1 ; i<=n ; i++ ) numbers.push(i);
    permute1([],numbers,callback);
}

function permute1(head, tail, callback) {
    if ( 0 === tail.length ) {
        callback(head);
        return;
    }

    tail.forEach(function(t, i, a) {
        permute1( [t].concat(head),
                  a.slice(0,i).concat(a.slice(i+1)),
                  callback);

    });
}

// MUTATES ITS OWN DATA, BUT NOT ITS ARGUMENTS

function permutations2(n, callback) {
    var i;
    var numbers = [];
    for ( i=1 ; i<=n ; i++ ) numbers.push(i);
    permute2([],numbers,callback);
}

function permute2(head, tail, callback) {
    if ( 0 === tail.length ) {
        callback(head);
        return;
    }
    var h2 = [tail[0]].concat(head);
    var t2 = tail.slice(1);
    var i  = 0;
    var tmp;
    
    while (true) {
        permute2(h2, t2, callback);
        if ( i === t2.length ) return;
        tmp   = h2[0];
        h2[0] = t2[i];
        t2[i] = tmp;
        i++;
    }
}

// MUTATES ALL DATA (INTERNALLY) (FASTEST)

function permutations3(n, callback) {
    var i;
    var head  = arrayWithZeros(n);
    var tails = new Array(n+1);

    for ( i=1 ; i<=n ; i++ ) {
        tails[i] = arrayWithZeros(i);
    }

    for ( i=1 ; i<=n ; i++ ) {
        tails[n][i-1] = i;
    }

    function permute3(x) {
        var j;
        var tail_this;
        var tail_next;
        var tmp;
        if ( 0 === x ) {
            callback(head);
            return;
        }
        tail_this = tails[x];
        tail_next = tails[x-1];

        for ( j=1 ; j<x ; j++ ) {
            tail_next[j-1] = tail_this[j];
        }

        j=0;
        while ( true ) {
            head[x-1] = tail_this[j];
            permute3(x-1);
             
            j++;
            if ( j === x ) return;

            tmp            = head[x-1];
            head[x-1]      = tail_next[j-1];
            tail_next[j-1] = tmp;
        }
    }

    permute3(n);
}

// TEST FUNCTIONS

function testprint() {
    this.test = function(p) {
        console.log(JSON.stringify(p));
    };

    this.done = function() {
        return 'Done';
    };
}

// CHECKS IF PERMUTATION IS ORDERED - FUNCTIONAL (SLOWEST)

function testorder1() {
    var cnt = 0;

    this.test = function(p) {
        if ( false !== p.reduce(function(acc,val) {
            if ( false === acc || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

    this.done = function() {
        return cnt;
    };
}

// CHECKS IF PERMUTATION IS ORDERED - IMPERATIVE (FASTEST)

function testorder2() {
    var cnt = 0;

    this.test = function(p) {
        var i;
        for ( i=1 ; i<p.length ; i++ ) {
            if ( p[i] < p[i-1] ) return;
        }
        cnt++;
    };

    this.done = function() {
        return cnt;
    };
}

// TEST FUNCTION FOR PROJECT EULER 32

function testeuler() {
    var sums = {};

    this.test = function(p) {
        var w1, w2, w;
        var m1, m2, mx;
        w =  Math.floor(p.length/2);
        w1 = 1;
        w2 = p.length - w - w1;
    
        while ( w1 <= w2 ) {
            m1 = arrayToNum(p,     0, w1      );
            m2 = arrayToNum(p,    w1, w1+w2   );
            mx = arrayToNum(p, w1+w2, p.length);
        
            if ( m1 < m2 && m1 * m2 === mx ) {
                sums['' + mx] = true;
            }
        
            w1++;
            w2--;
        }
    };

    this.done = function() {
        var i;
        var r = 0;
        for ( i in sums ) {
            r += +i;
        }
        return r;
    };
}

// MAIN PROGRAM BELOW

function processData(input, parg, targ) {
    var r;

    var test = null;
    var perm = null;

    switch ( parg ) {
    case '0':
        perm = permutations0;
        break;
    case '1':
        perm = permutations1;
        break;
    case '2':
        perm = permutations2;
        break;
    case '3':
        perm = permutations3;
        break;
    }

    switch ( targ ) {
    case 'E':
        test = new testeuler;
        break;
    case 'O1':
        test = new testorder1;
        break;
    case 'O2':
        test = new testorder2;
        break;
    case 'P':
        test = new testprint();
        break;
    }


    r = perm(+input, test.test);
    console.log(test.done());
} 

function main() {
    var input = '';
    var parg = '1';
    var targ = 'E';
    var i;

    for ( i=2 ; i<process.argv.length ; i++ ) {
        switch ( process.argv[i] ) {
        case '0':
        case '1':
        case '2':
        case '3':
            parg = process.argv[i];
            break;
        case 'E':
        case 'O1':
        case 'O2':
        case 'P':
            targ = process.argv[i];
            break;
        }
    }
    

    process.stdin.resume();
    process.stdin.setEncoding('ascii');
    process.stdin.on('data', function (s) {
        input += s;
    });

    process.stdin.on('end', function () {
       processData(input, parg, targ);
    });
}

main();

This is how I run the code (use a lower value than 9 to have fewer than 9! permutations)

### Project Euler Test: 3 different permutation generators ###
$ echo 9 | time node projecteuler32.js 3 E
45228
8.95user ...
$ echo 9 | time node projecteuler32.js 2 E
45228
25.03user ...
$ echo 9 | time node projecteuler32.js 1 E
45228
70.34user ...

### Simple check-order test, two different versions. Fastest permutations.
$ echo 9 | time node projecteuler32.js 3 O1
1
23.71user ...
$ echo 9 | time node projecteuler32.js 3 O2
1
4.72user ...

(the timings here may not exactly match the above figures)

Update 2017-12-05
Admittedly, I sometimes find map() and filter() handy and I try to use them when they make the code clearer. I came to a situation where I wanted to split a list into two lists (one with valid objects and one with invalid). That is a simple if/else with a push() in each branch, or two calls to filter(). Then it turned out that I wanted to split the valid objects further into two lists: good and ugly. The slightly simplified code is:

function goodBadUgly_1(list) {
  var i, c;
  var ret = {
    good : [],
    bad  : [],
    ugly : []
  }
  for ( i=0 ; i<list.length ; i++ ) {
    c = list[i];
    if ( !validateItem(c) )
      ret.bad.push(c);
    else if ( uglyItem(c) )
      ret.ugly.push(c);
    else
      ret.good.push(c);
  }
  return ret;
}

function goodBadUgly_2(list) {
  return {
    good : list.filter(function(c) {
                         return validateItem(c) && !uglyItem(c);
                      }),
    bad  : list.filter(function(c) {
                         return !validateItem(c);
                      }),
    ugly : list.filter(function(c) {
                         return  validateItem(c) && uglyItem(c);
                      })
  };
}

On my not too powerful x64 CPU, with a list of about 1000 items, the non-FP version took 6 ms and the FP version took 16 ms (second run, to allow the JIT to do its job). This was with Node 8.9.1. With Node 6.11.3 the FP version was even slower, while the non-FP version ran at almost the same speed (quite consistent with my comment in the beginning from 2017-07-17).
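If you want to try this yourself, a rough timing sketch could look like the following (you supply your own list; I am not claiming the figures above were produced with exactly this code):

// Rough micro-benchmark sketch: run each version twice so the JIT gets a warm-up
function timeIt(label, fn, list) {
  var t0 = Date.now();
  fn(list);
  console.log(label + ': ' + (Date.now() - t0) + ' ms');
}

timeIt('non-FP (warm up)', goodBadUgly_1, list);
timeIt('FP (warm up)', goodBadUgly_2, list);
timeIt('non-FP', goodBadUgly_1, list);
timeIt('FP', goodBadUgly_2, list);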

You may think that of course the FP code is slower: it calls validateItem three times for every item (once per filter) and uglyItem twice for every valid item. Yes, that is true, and that is also my point! When you do FP you avoid (storing intermediate results in) variables. This results in extra work being done a lot of the time. How would YOU implement this in FP style?
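One answer is a single pass with reduce() – a sketch below, still assuming the same validateItem and uglyItem helpers. It calls each helper only once per item, but the accumulator object is mutated inside the callback, so it is hardly more "functional" than the plain loop – which is rather my point.

function goodBadUgly_3(list) {
  return list.reduce(function(acc, c) {
    if ( !validateItem(c) )
      acc.bad.push(c);
    else if ( uglyItem(c) )
      acc.ugly.push(c);
    else
      acc.good.push(c);
    return acc;
  }, { good : [], bad : [], ugly : [] });
}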

This is 10 ms: does it matter? Well, first it is only 1000 objects.

If you do this in a Web GUI when a user clicks a button, the user will wait 10ms longer for everything to be updated. 10ms is not a lot. But if this multiplies (because you have a longer list) or adds up (because you are doing other things in a slower-than-necessary way) the UX will suffer.

If you do this server side, 10 ms is a lot. In Node.js you have just one thread, so this overhead is 1% of all the processing capacity available each second. If you get 10 such requests per second, 10% of the CPU is wasted only because you prefer FP style.

This is one of those cases where FP has the same computational complexity but is a constant factor slower. Sometimes it can be even worse.

All FP-sucks related articles
Functional Programming Sucks!
Underscore.js sucks! Lodash sucks!
Functional Programming Sucks! (it is slow) (this one)
Lodash Performance Sucks!

Underscore.js sucks! Lodash sucks!

In a world of functional programming hype there are two very popular JavaScript frameworks: Underscore.js and Lodash. Don't use them! It is a horrible idea. They suck just like functional programming sucks!

They make claims like:

  • Lodash: A modern JavaScript utility library delivering […] performance
  • Underscore: JavaScript library that provides a whole mess of useful functional programming helpers.

The road to hell is paved with good intentions. This is how it goes.

1. Sloppy data types
There are many good things about JavaScript. The sloppy dynamic typing is perhaps not one of them. The following are for example true:

  • '27' == 27
  • undefined == null
  • 0 == ''
  • 'object' === typeof null
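If you run these in a console you can see the coercion rules at work; using === avoids most of the surprises (a small illustration, nothing more):

console.log('27' == 27);          // true  (string coerced to number)
console.log(undefined == null);   // true
console.log(0 == '');             // true  (empty string coerced to 0)
console.log(typeof null);         // 'object'
console.log('27' === 27);         // false (no coercion with ===)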

Now that I consider myself an experienced programmer I find it quite convenient not to have to be explicit about data types. But I don't mix and change types! If a variable is a number from the beginning it stays a number. I carefully pick types – Object, Array, Number, String – and stick to that type (for a variable or property). Occasionally – mostly for return values – I break the rule.

Lodash and Underscore are about allowing yourself to be sloppy:

  • Don't know if it's an object or an array: use map, forEach, filter, reduce and many more
  • Don't know if it is empty (or what being empty even means): use isEmpty
  • Don't know if it is a String Object or a String primitive or something else: use isString

If you don't know what it is, you already have a much bigger problem than how to do something with it.
If you mix String Objects and String primitives AND other things, and you want to know if it is any kind of string, you are doing something wrong.

So Step 1 with Lodash and Underscore is that you

  1. Add a dependency
  2. Allow sloppy and inconsistent typing
  3. No one can presume anything about your types anymore
  4. Your code is now impossible to maintain or extend without lodash or underscore

2. Performance!
My experience after many years in software development is that when an application is not well received by its users it is very often because of (bad) performance. And bad performance causes weird, hard-to-reproduce bugs and instability as well.

An important type of optimization that the JIT can do relies on the runtime generating classes with strict types for your objects (it guesses and learns the types as the program runs). If you allow a property to assume values of different types you are likely to destroy this optimization.
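A minimal sketch of what that means in practice (the objects and property names are made up):

// Consistent: every item has the same properties with the same types,
// so the engine can generate one internal "class" (shape) for all of them
var a = { id : 1, name : 'one', price : 10 };
var b = { id : 2, name : 'two', price : 20 };

// Inconsistent: price is sometimes a number, sometimes a string, sometimes missing,
// which forces the engine back to slower, generic property access
var c = { id : 3, name : 'three', price : '30' };
var d = { id : 4, name : 'four' };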

Let's look at the little function isEmpty.

/** Underscore **/
_.isEmpty = function(obj) {
    if (obj == null) return true;
    if (isArrayLike(obj) && (_.isArray(obj) || _.isString(obj) || _.isArguments(obj))) return obj.length === 0;
    return _.keys(obj).length === 0;
};

/** Lodash **/
function isEmpty(value) {
    if (isArrayLike(value) &&
        (isArray(value) || isString(value) ||
         isFunction(value.splice) || isArguments(value))) {
        return !value.length;
    }
    return !nativeKeys(value).length;
}

If you KNOW the datatype you can just do:

/** String **/
if ( 0 === s.length )

/** String that may be null **/
if ( null === s || 0 === s.length )

/** Array **/
if ( 0 === a.length )

/** Object **/
function objectIsEmpty(o) {
    for ( var x in o ) return false;
    return true;
}

(you may want to check o.hasOwnProperty(x) depending on what you actually want – but if you don't know what you want, Lodash or Underscore will produce results just as unexpected as my code)

The worst thing about the Underscore and Lodash implementations is the last line of each:

    return _.keys(obj).length === 0;
    return !nativeKeys(value).length;

Unless the JIT compiler and runtime are very smart, those two lines will allocate an entire array of key strings on the heap – just to check whether its length is 0! Even though in most practical cases this has an acceptable overhead, it can get very expensive.

This IS the beauty of FP generic programming. Focus on WHAT, not HOW, and any little innocent check (like isEmpty) can turn horribly expensive.

However, to be honest, I took some plain functional code and replaced plain JavaScript with Lodash functions (forEach, reduce, isNumber, isEmpty and a few more) and the code actually got faster! Still much slower than imperative code, but slightly faster than not using Lodash.

3. Complexity and Optimization
Now that you have added an extra dependency, made your data objects sloppy and made your application harder for the JIT to optimize, perhaps your application is not as quick as you need it to be. If you are writing a front end you are probably quite fine. But if you are coding a Node.js backend, performance matters a lot more and waste is less acceptable. If you are really unlucky, these sloppy types give you hard-to-find bugs and your backend is not completely stable and reliable.

What do you do? Common practices in the business could be things like:

  • Minification/uglification
  • Scale out service architecture
  • Failover technology
  • Spend time optimizing code and modules

This is how a little sloppiness, laziness and convenience in your initial architecture decisions can later cause you huge problems and costs.

Of course, just including lodash and using isEmpty() in a few places is not going to do much (if any) harm.
But considering Lodash or Underscore generally preferable to not using them is the kind of low-quality thinking that makes software bad.

Conclusion
Be explicit, careful, consistent and smart about the data types you use.
Be restrictive with libraries and frameworks that introduce overhead and hide relevant details.

Use the standard library. That said, you may find that, for example, the Array functions of Lodash outperform their standard library equivalents. That may not be true in the future (and I really wonder how it is possible at all).

All FP-sucks related articles
Functional Programming Sucks!
Underscore.js sucks! Lodash sucks! (this one)
Functional Programming Sucks! (it is slow)
Lodash Performance Sucks!