Category Archives: Programming

Developer lost in Windows

Admittedly, I am not a Microsoft fan, but Windows is a quite fine operating system nowadays. This article is not about complaining with Windows.

This article is also not about native Windows development for Windows. That is, if you use Windows, Visual Studio and C# (or other languages native to Windows development) to produce software targeting Windows, this article is not for you.

I and other programmers use Linux, OS X or possibly BSD to develop software meant to be OS independent. Our core tools are perhaps bash and a terminal emulator. We use tools like grep, head, tail, curl, iconv, sed, bc, emacs, vim, ssh, nc, git or svn on a daily basis (without them we are in a foreign land not understanding the customs). Depending on programming language we use compilers and interpreters like gcc, python, perl, php, nodejs and sbcl. In Linux, OS X or BSD this all come very naturally but sometimes we find ourself using a Windows computer:

  1. We just happen to have a second Windows computer that we want to use
  2. Company policy requires us to use Windows
  3. We want to be capable of working in Windows
  4. We want our project to work fine in Windows
  5. Perhaps you are a Windows developer/user who need/want to do unix-like development without getting a separate computer.

This article is for you!

The good news is that basically everything you wish to do can be done with free software (also on Windows). What is also good is that you have plenty of choices of good stable software to choose from.

The bad news is that finding just what is right for you and making it feel as simple and smooth as you are used to can be time consuming, difficult and frustrating.

Embracing Windows
To be productive in Windows you should (at least to some degree) embrace Windows. One approach is to do as the Windows developers. This would mean using cmd.exe (the old DOS-shell) or Powershell, and learn/embrace what comes with it. You would perhaps install only node/php/python native for Windows and git. I encourage you to try this, but I will not write more about it.

This might be the best solution if you develop in JavaScript or Java AND basically all tools you use are part of a JavaScript/Java ecosystem. Perhaps git is the only thing you need apart from what you install with npm.

Avoiding Windows
One approach is to just use Windows to access (and possibly run a virtual) Linux (system). It can work better or worse depending on your situation. A full screen VM might be the best solution if Linux GUI tools are central to your development, or if you are anyway very used to working with VMs. SSH to a remote system might be the best solution if mobility is not a problem. However, some development gets more complicated when different things are on different IP addresses, and security becomes more relevant when everything is not on 127.0.0.1. I will not write more about this.

Make Windows Unix-like
So, our gool is to feel reasonably at home in Windows by installing what is missing. It is always tricky to divide things in clear categories, but I would say you will want a stack of four layers:

  1. A Terminal Emulator: Windows already comes with one, but chances are you will not find it good enough. I have very modest demands but I expect copy-paste to work nicely and I expect multiple tabs. This works perfectly in any default Linux desktop and Mac OS X, not so in Windows.
  2. Editor: Unless you prefer to use vim or emacs directly in the terminal you want to install a decent or familiar text editor (Nodepad or Wordpad are not suitable for serious programming).
  3. The standard UNIX/GNU tools: Windows does not come with bash, head, tail, grep, sed, vim, bc and most other tools you take for granted. The old (DOS) equivalents are simply inferior. The new (Powershell) equivalents are… well… lets just say it is a steep learning curve and your bash scripts will not run.
  4. Interpreter/Compiler: Windows does not come with gcc, python, perl, nodejs or php. However, they are most often available as a separate download for Windows. Such a native Windows version may be slightly different from the Unix-version you are used too (command line options, especially related to paths may be different).

The short version is that you can install things like Cygwin or Windows Subsystem for Linux and you might just be fine with it! Or not. If you are bored of reading, try it out now.

To make a more informed choice you need to consider what types of binaries you want to run (and perhaps produce).

Native Windows Binaries
A Native Windows binary can (with some luck) be copied from a Windows computer to another and run. To interact with Windows (and you) it uses the APIs of Windows (typically Win32). It probably expects paths to be formed like in Windows (c:\tmp rather than /tmp).

Native Linux Binaries
A Native Linux binary was typically built on a Linux system to run on that Linux system (like Debian). Until recently it would not run on Windows, however Microsoft put a massive effort into Windows Subsystem for Linux, which allows you to run Linux programs (including bash) directly in Windows 10 (only). This is not perfect though. A Linux filesystem is quite different from a Windows filesystem so access between the two filesystems is limited and may require some thinking. This is perhaps the best approach if you are targetting Linux (like development for Docker) but it is obviously a bad approach if you want your program to work on a Windows server or other Windows computers.

Hybrid / Cygwin
There is an old (as in proven and reliable) project called Cygwin. It is basically a DLL-file that translates all (most) Linux system calls into Native Windows calls. This means that the unmodified source code of (most) programs written for Unix can be compiled with Cygwin, for Cygwin, and run on a Windows computer that has the Cygwin DLL installed. There are some drawbacks. First, performance suffers (from nothing to quite much depending on what you do). Second, for more advanced software, especially with GUI or heavy on network (like Apache), the hybrid solution can feel like the worst of two worlds. Third, access to the entire filesystem is smooth, but when it comes to access rights it sometimes does not work perfectly (files created by Cygwin get weird, even broken, permissions).

Now, to complicate things, there is a project called MSYS2 that maintains a fork of Cygwin, very similar to Cygwin. Cygwin or MSYS2 can be included/embedded in other projects (such as cmder). If you install multiple unix-compability-suites on your system it can get confusing.

Choosing binary type
At first glance Windows Subsystem for Linux, or Cygwin, seem very attractive. But lets assume that we do web development in Python. If you go with Windows Subsystem for Linux you will need to run a webserver (apache, lighttpd) inside that subsystem. To me, configuring, starting and stopping services inside this subsystem is not attractive. What could possibly go wrong? Well, a lot of things. With Cygwin you can probably make Windows IIS invoke Cygwin Python (if you really dont care about performance), because running Cygwin Apache sounds creepy (it can be done though). If, on the other hand you install Python built for Windows you get the real thing. All Windows/Python documentation and forum information suddenly applies to you. But then you end up with a Cygwin shell where everything is just like in Linux, except Python is not (except Cygwin will come with Python too, so you can end up with two versions of Python, with different features).

I would hesitate to run apache inside Cygwin and even more so inside Windows Subsystem for Linux. But I always also hesitate to do anything with IIS. Perhaps the best thing is to install Apache and Python for Windows (not depending on Cygwin) and just find the tools you need to edit your files?

The same reasoning can apply to PHP, nodejs or whatever you do.

Most configurations can probably be made to work. But you just want your Windows computer as simple and standard as possible, and you want your usual Linux (or OS X or BSD) stuff just to work the way they usually do. This is really not a matter of right or wrong, it is more a matter of taste (what kind of worthless errors do you want to solve just because you are a foreigner in Windows doing stuff not the way it was really meant to be done in Windows?).

Maintainability and Management
One thing is to make it work. Another thing is that it works week after week, month after month. Also another thing is to keep your software up to date and being able to add new tools along the way. You will find that some of the software I mention in this post does not come with the Windows installer/uninstaller that you are used to.

There is something called Chocolatey for Windows, which is a package manager dealing with installing, upgrading and uninstalling software on Windows in a uniform way. I don’t know much about it, and I will not write more about it.

While unix programs typically have a .file with configuration (there are a few places to look though) Windows programs typically use the registry. When it comes to unix software adapted for Windows you never really know: registry, config file, both or… depends on? And if a config file, where is it?

While unix programs can usually be installed in the home directory of an unpriviliged user, or in /opt, programs in Windows often require administrative priviliges to spread their files a little bit everywhere.

The more stuff you try and throw out, the more garbage you have left on your computer, which could possibly interfere with a newer version of something you install in the future. Keep this in mind. One day you will install some exotic software that does not work as expected, and you dont know if some old garbage on your computer caused it.

Git
Git is a very popular version control system. Originally designed for Linux it is today (?) the officially preferred system also for native Windows development. Git is so popular (or demanding?) that you can get it bundled for Windows together with some of your favourite tools. This might be just enough for you!

  • posh-git : git for powershell
  • git for Windows : based on Git for Windows SDK
  • cmder : based on MSYS2
  • …more?

cmder
I have tried cmder and I dont like it. It is an ugly install of just unpacking a huge zip file. MSYS2 itself is hidden inside the cmder-folder, so I dont feel comfortable managing it on my own. There seems to be no upgrade strategy (except throwing it all away and downloading the latest version). Git is run from one shell (a traditional Windows shell with a lambda prompt) but a msys2/bash (identical to Cygwin) shell is started separately. I dont want to change console to run git: I run git all the time. But it might be perfect for you (many people like cmder).

Cygwin
Cygwin is nice becuase it comes with an installer (setup.exe) that is also a package manager. It has a lot of packages, and it is capable of installing things like apache as windows services. My experience is that I am too lazy downloading the latest setup.exe, and I am too lazy running setup.exe regularly. Sometimes you end up with old versions and upgrade problems.

My disappointment with Cygwin is that it comes with its own (compared to standard Windows) terminal Mintty that still does not have tabs. I also do nodejs development and nodejs is not a Cygwin package, so I need to use standard Windows node. This sucks a little because node in Windows behaves slightly different from node in Linux/OS X (particularly when it comes to where packages go) so the Cygwin experience is a bit broken when you start using Windows node and (perhaps particularly) npm.

Also, I like bash scripts and they tend to run significantly slower in Cygwin than in Linux (process forking is extremely cheap in Linux and rather expensive in Windows, so with the Cygwin overhead it can get rather slow for heavy scripts).

As I now try to update and configure my Cygwin environment for my Node.js project I find:

  • I use Cygwin wget to download setup.exe (so I get it where I want it, rather than Downloads) to update Cygwin. When I double click it (to run it) permissions are wrong and I cant execute it. It is an easy fix, but compared to OS X / Linux this is awkward.
  • I run node from Cygwin. I get no prompt (>). It turns out node.exe does not recognize Cygwin/bash as a terminal and I need to run node.exe -i.
  • Symlinks keep being a mystery. There are some kind of symlinks in Windows now, Cygwin seems to try to use them, but the result is not consistent.

For a Terminal with tabs, check Fatty below.

MinGW & MSYS
While the idea of Cygwin is to provide a Posix compliant environment on Windows the MinGW/MSYS project was about porting unix tools (perhaps particularly a gcc-based C/C++ build environment) to run natively on Windows. According to the Wikipedia page of MinGW this is pretty much abandoned.

Gow: GNU on Windows
Gnu on Windows is a lightweight alternative to Cygwin. It appears to not have been updated since early 2014 (when 0.8.0) came out (and the Windows Subsystem for Linux seems to be one reason it is less relevant). I will not write more about it.

MSYS2
MSYS2 is the successor to MSYS, and (surprise) it is based on Cygwin. I tried it quickly and I find that:

  • It seems safe to install side by side with Cygwin
  • Using the MSYS2 is very similar to Cygwin
  • Instead of the Cygwin GUI package installer, MSYS uses pacman from Arch (if you much prefer that, go with MSYS2)
  • MSYS2 has some emphasis on MinGW32 and MinGW64. As I understand it this is about being able to use MSYS2 to build native Windows software from C/C++ code (if you do this in Cygwin, you end up with a Cygwin dll dependency)

So, for my purposes MSYS2 seems to be quite equivalent to Cygwin. Expect the same annoyances as I mentioned for Cygwin above.

Windows Subsystem for Linux
If you try to run bash from a Windows 10 command line you will probably get something like:

-- Beta feature --
This will install Ubuntu on Windows, distributed by Canonical
and licensed under its terms available here:
https://aka.ms/uowterms
In order to use this feature you must have Developer Mode enabled.
Press any key to continue...

Note that this can be quite confusing if you have installed some other bash.exe on your system. If you unexpectedly get the above message, check your PATH and make sure you invoke the right bash executable.

Installation is very easy (activate Developer mode and run bash), after giving username+password you are actually good to go! If you are used to Debian/Ubuntu you will feel surpisingly at home.

I find my Windows files in /mnt/c (not too surprising).
I find my Linux home files in c:\Users\zo0ok\AppData\Local\lxss\home\zo0ok.
(copying files there from Windows did not make them appear in Linux though)

So, if you want to edit files using a Windows GUI editor, they need to be in Windows-land, and that is obviously not the optimal environment for you project.

In general it works very well though. My node services had no problems listening to localhost:8080 and accepting incoming http requests from a Windows web browser.

If you are not happy with Ubuntu or you want more control of your Linux environment you will need to do further research. Ideally, Windows Subsystem for Linux has most of the advantages of a virtual machine, but none of the drawbacks. However, depending on what you really do and need, it can turn out to have most of the drawbacks and few of the advantages instead.

Fatty
The Mintty terminal that comes with Cygwin is ok, but it does not support tabs. There are different alternatives, and a simple one is Fatty (it is really Mintty with tabs). Installing Fatty requires doing a git clone and compiling it yourself. If you are brave you can download fatty-1.6.exe from me.

The web page for fatty tells you how to make a desktop shortcut but it did not work for me. What works for me is to set Shortcut target: “C:\cygwin64\bin\fatty.exe -“. Simple as that. I think I will be quite fine with fatty, actually.

Making Fatty run Windows Subsystem for Linux was trickier (as in no success) though.

ConEmu
ConEmu seems to be the ultra powerful flexible console. After 5 minutes I have still not found out how to change the font size.

ConsoleZ
ConsoleZ is good. Under Edit->Settings>Tabs you can add your own shell types.

Cygwin:                       Shell = C:\cygwin64\bin\bash.exe --login
Windows Subsystem for Linux:  Shell = bash.exe

Apart from that, ConzoleZ is reasonably easy to configure and it stays out of your way (I hide toolbar, status bar, search bar).

Editors
I am fine with vim in the console. However, there are many fine editors for Windows:

  • Visual Studio Code (at no cost)
  • Atom
  • Notepad++

Conclusions
I have had Windows as a 2nd/3rd platform for many years and I can see that the game has changed a bit. Microsoft has started supporting Ubuntu on Windows and at the same time the native options (like MSYS) are fading away. There are reasons to think general development is getting more and more based on Unix.

I would say:

  • Posh-git : Powershell is your thing, and you don’t care about unix tools
  • Git for Windows : You want Git, but you don’t care much about other unix tools
  • Cygwin : You want plenty of choices of unix tools in Windows
  • MSYS2 : You like pacman (Arch) or you want to build native Windows C/C++ binaries using free software
  • Windows Subsystem for Linux : You have a Windows 10 computer but you want to keep your Linux development separate from Windows

If you use Cygwin and just want tabs, get Fatty. Otherwise ConsoleZ is good. Chocolatey is more a Windows power user tool than something you need to provide unix capabilities.

In the past I have mostly been using Cygwin (with mixed feelings). Lately, when I have heard about the options (cmder, poshgit, Git for Windows, MSYS2) I have got a feeling that it is rather hard to configure an optimal environment. Now however I have come to realize that the differences are not very big. Most options are hybrids based on Cygwin and/or with Cygwin embedded (perhaps in the name of MSYS2). For Windows developers not used to Unix it is good with things like Git for Windows that just come with the basic Unix tools with no need to think about it. For developers with a Unix background it makes more sense to run Cygwin and MSYS2 (or Windows Subsystem for Linux). The days of unix tools built natively for Windows are over, and it is probably a good thing.

What you need to think about is your compiler, interpreter and/or web server.

Peculiar Compiler Optimizations

My teacher in a High Performance Computing class once told me not to confuse the compiler. This was in the late 90s, and SGI C and Fortran compilers were supposed to replace entire blocks of code with highly optimised implementations. As long as the compiler understood your intentions.

I have never discovered this, but yesterday perhaps! Read on.

I have been playing around with LISP, solving Project Euler challenges on Hackerrank. For problems 44 and 45 I decided to do Binary Search (which afterwards turned out not to be so smart, but that is another story) and took the implementation from Rosetta Code (the iterative one).

Binary search is about finding an element in a sorted array by starting in the middle, and jumping left or right, cutting the remaning array in half each time.

In my case I decided just a part of the array was worth searching so instead of searching the entire array [0..length] I wanted to search up to the Nth element [0..N]. Searching fewer elements should be faster, so I improved the binary search function to take an additional argument: hi. For SBCL (Steel Bank Common Lisp), this surprisingly had horrible effect on performance.

Benchmark Results
The results for different algorithms, different machines and different LISP implementations follow. The RPi V1 runs Arch Linux, Clisp comes with Arch, SBCL is downloaded from SBCL webpage. The RPi V2 runs Raspbian and SBCL that comes with the distribution. The Celeron runs Ubuntu that comes with SBCL. The MacBook Air runs OS X and SBCL is downloaded separately.

 (times in seconds)             Array Size Standard Optimized Recursive    C -O2
================================================================================
RPi V1  ARMv6 700MHz  Clisp 2.49      5000    640       633       720
                      SBCL  1.3.12    5000     15.6      27        34       0.95
RPi V2  ARMv7 900MHz  SBCL  1.2.4     5000      6.2      16        17       0.31
                                     20000    110       293       300       5.8
NUC Celeron J3455     Clisp 2.49     20000    765       762       720   
                      SBCL  1.3.3    20000      8.3      16.7      18.0     1.0
MacBook Air i5        SBCL  1.2.11   20000      4.0      11.5      12.3     0.75

A very slight “optimization” turns out to have very negative impact on performance for the quite fast (compiled) SBCL. I can’t imagine any other explanation than SBCL replaces the standard binary search with optimized code rather than executing my program. For Clisp the optimization actually works quite as would be expected and the recursive code is actually the fastest. On the Celeron, Clisp and SBCL behaves completely opposite.

Comparing to C
The other week I had the feeling (SBCL) LISP was fast and decided to compare LISP to C. This time I had the feeling that LISP was rather slow so I ported my test program to C (basically line by line). Well, I found that SBCL is actually pretty fast (especially on x86/x64), and C was faster only thanks to -O2 on some systems. -O2 actually made the C-program more than 5 times faster: perhaps also the C-compiler replace the entire binary search?

The Test Program
The code follows. The only difference between Standard and Optimized is the single line that is commented out (with ; in LISP) selecting which binary search to run (the length of the function name does not explain the performance difference).

The program creates an array of length N and populates it with values by the formula n(n+1)/2. This takes little time. It then checks for values 10,20,30… if the values are found in the array (using binary search). In this program the entire array is always searched, not taking advantage of the extra parameter (although the optimized version does not need to find the length of the array every time called).

(defun binary-search (value array)                       ; Standard 2 lines
    (let ((low 0) (high (1- (length array))))            ;
        (do () ((< high low) nil)
            (let ((middle (floor (+ low high) 2)))
                (cond ((> (aref array middle) value)
                       (setf high (1- middle)))
                      ((< (aref array middle) value)
                       (setf low (1+ middle)))
                      (t (return middle)))))))

(defun binary-search-optimized (value array hi)          ; Optimized 2 lines
    (let ((low 0) (high hi))                             ;
        (do () ((< high low) nil)
            (let ((middle (floor (+ low high) 2)))
                (cond ((> (aref array middle) value)
                       (setf high (1- middle)))
                      ((< (aref array middle) value)
                       (setf low (1+ middle)))
                      (t (return middle)))))))

(defun binary-search-r (value
                        array
                        &optional (low 0)
                        (high (1- (length array))))
  (if (< high low)
      nil
      (let ((middle (floor (+ low high) 2)))
        (cond ((> (aref array middle) value)
               (binary-search-r value array low (1- middle)))
              ((< (aref array middle) value)
               (binary-search-r value array (1+ middle) high))
              (t middle)))))

(defun formula (n)
    (/ (* n (+ n 1)) 2))

(defun init-array (n)
    (let ((arr (make-array n)))
        (loop for i from 0 to (1- n) do
            (setf (aref arr i) (formula (1- i))))
        arr))

(defun solve (arr n max)
    (let ((ret 0))
        (loop for i from 10 to max by 10 do
            (if (binary-search i arr)                     ; Toggle code used
;           (if (binary-search-r i arr)                   ;
;           (if (binary-search-optimized i arr n)         ;
                (incf ret)
                Nil))
        ret))
            
(defun main ()
    (let ((n (read)))
        (let ((arr (init-array n)))
            (format T "~D~%" (solve arr (1- n) (aref arr (1- n)))))))

(main)

Since I am a very novice LISP programmer I appreciate any feedback. The code above does not solve Project Euler 44 or 45, it is much simplified to test binary search. Initially I wrote code that relied on recursion rather than loops but I exceeded the stack size and ended up with loops (according to what I read, loops rather than recursion is the preferred style of Common Lisp).

Conclusion
Well... optimization is hard, and dont make any assumptions. As I have found many times before, what makes code faster on some platforms can make it slower on others. When it comes to optimizing SBCL and compiled LISP much experience is required, and dont forget to measure!

Playing with LISP and LISP vs C

Lisp is fun! Well, since I first knew about Lisp I was fascinated, but I have found it hard to learn Lisp and to play with it in a meaningful way. A few years ago I wrote about it here and here. As usual, the first steps of learning something new can be the hardest.

Occationally I use Hackerrank to find programming challanges to solve for fun. I particularly like the Project Euler competition. I find it particularly good for trying out new languages: you get a “meaningful” challenge, a simple environment prepared in your web browser, and automated test cases. So, this time I didn’t waste my time trying to find the right Lisp implementation for me, I just started hacking on Project Euler 38 and 39 on Hackerrank.

Problem 38 was quite simple, but 39 was more interesting. When I had solved it, I found my implementation was not at all fast enough, so I started experimenting locally (the Hackerrank environment is not optimal for tweaking, optimization and debugging).

Choosing a (Common) Lisp implementation
There are quite many Common Lisp implementations out there. The one Hackerrank uses is SBCL. That is clearly the Common Lisp implementation I would recommend (based on my little experience) if it is available for your platform.

I installed SBCL with apt-get in Ubuntu. I also downloaded binaries directly for my Mac OS X computer and my Raspberry Pi (v1) running Arch linux. Installation is a bit non-standard, but you can actually run it without installing (just execute run-sbcl.sh in downloaded folder).

I also tried clisp and ecl, none of these could deal with the memory usage (stack size) of my program. For clisp I found no way to manipulate stack sizes at all. For ecl I made some progress but I could not make it run my program.

SBCL is a Lisp compiler, and it produces fast and efficient code. I later compared it to C.

Project Euler 39
Project Euler 39 is basically about finding integer solutions to Pythagoras theorem. For a given, large, perimeter, how many right triangles are there? For example:

300000^2 + 400000^2 = 500000^2

This triangle has a perimeter of 300000+400000+500000=1200000. What other values for a and b so that

a + b = 700000
a^2 + b^2 = 500000^2

are there? The Hackerrank challenge requires you to work with perimeters up to 5000000. If you implement a solution, a few things to immediately note:

  • The squares wont fit in a 32bit integer. They will fit with no loss of precision in the 53 bits of a 64 bit double and they will also fit in a 64 bit integer. (This matters not for Common Lisp)
  • If you want to do recursion (and of course you want when you code Lisp) it will be millions of recursion steps, which will be a challenge to the stack size. (This also turned out not to matter for SBCL)

The Lisp implementation
It turned out that the SBCL compiler optimized the recursion is such a way that the memory usage was quite low. SBCL successfully runs my program on RPi/Arch, Intel/Ubuntu and Intel/OSX with quite reasonable memory usage.

Since this is about learing Lisp I wanted a 100% functional programming implementation. Only pure functions. A lot of my code is about generating, modifying and testing triangles. A triangle (a b c) can obviously be represented as a Lisp list (a b c) and this was my first implementation. Then if you want to read a, b or c from a list abc, or create the list from a, b and c, you can do:

  a: (car abc)
  b: (car (cdr abc))
  c: (car (cdr (cdr abc)))

abc: (list a b c)

I found this cumbersome. It became a lot of list-to-variables and variables-to-list overhead (I didnt care so much about performance, more about my code readability). I learnt that Lisp functions can return multiple values using value and that you can bind them with multiple-value-bind and use them as arguments to a function using multiple-value-call. This felt functional and pure enough, and it made my code 25% faster than the car/cdr pattern above.:

; a (stupid) function returning a triangle as three values
(defun get-345-triangle ()
  (values 3 4 5))

; a function calculating the perimeter of triangle (from a function)
(defun triangle-perimeter-1 (tri-func)
  (multiple-value-bind (a b c) (funcall tri-func)
    (+ a b c)))

; and in this case you dont need to bind, you can use + directly
(defun triangle-perimeter-2 (tri-func)
  (multiple-value-call #'+ (funcall tri-func)))

; now this works
(triangle-perimeter-1 #'get-345-triangle)
(triangle-perimeter-2 #'get-345-triangle)

Since I am a very inexperienced Lisp programmer I appreciate suggestions for improvement.

Performance of Lisp
My final Hackerrank submission of Lisp code executes in about 4.5 seconds on my Intel i5/Ubuntu. It takes about the same time on the Hackerrank web page, which is fast enough to pass all tests. On the Raspberry Pi v1 (ARMv6 @700 MHz) it takes more than 700 seconds. My intuition told me that 4.5 seconds was very good. This made me ask two questions. How would Lisp compare to C? And why is the ARM more than 100 times slower, how would that compare in C?

The C implementation
My ambition was to rewrite Lisp to C line by line. So my C-program has exactly the same functions which take almost exactly the same arguments. All calculations are identical and performed in exactly the same order. The C-program relies entirely on recursion instead of loops (just like the Lisp program). However…

Functions in C can not return multiple variables. While Lisp had values I decided to use a reference to a struct in C:

(defun get-a-triangle()
  (values x y z))

void get_a_triangle(struct triangle *t) {
  t->a = x;
  t->b = y;
  t->c = z;
}

If the C-triangle struct is a local variable on the callers stack the difference is quite small (from a practical point of view, from a theoretic strict functional programming perspective its a different story).

Numbers in Lisp have arbitrary precision integers and floats make no difference. So, when porting to C, I had to pick numeric types. For most purposes, int32_t was good enough. But, for the purpose of calculating Pythagoras theorem higher precision was needed (as I wrote above, the 53 bits of double, or 64 bits of int64_t are good). So I ended up with 5 versions of the C-program (to compare performance):

  1. All 64-bit integers
  2. 32-bit integers, 64-bit for “triangles”
  3. 32-bit integers, double for “triangles”
  4. 32-bit integers, 64-bit only for pythagoras calc
  5. 32-bit integers, double only for pythagoras calc

(In cases 2,3 the struct triangle has int64_t/doubles properties, and all manipulations and calculations on triangles use these datatypes. In cases 4,5 everything is int32_t, except the internals of a single function, which casts to higher precision before doing its calculations.)

The C-program requires a significant stack size. The stack size can be obtain and changed like (numbers in kb, all values given with ulimit -a):

$ ulimit -s
8192

$ ulimit -s 100000

For my program, a stack size much higher than 8192 is needed (see below). It seems impossible to get large stack than 64Mb in Mac OS X, so my C program could never run there.

Benchmark findings
All C-programs are compiled with gcc -O2.

 CPU            MHZ      SBCL        64     32/64  32/double   32(64)  32(double)
==================================================================================
Time (s)
 i5-4250U 1300-2600       4.5      1.52      1.52      1.60      1,54      1.58
 ARMv6          700      ~715        85        83        45        42        39
 ARMv7          900       357        23        21        13        12        10

Max Res (MB)
 i5-4250U                  41       103       103       103       103       103
 ARMv6                     50       220       210        79       110        76
 ARMv7                     57       180       160        87        97        62

This is not too easy to interpret! The ironic thing is that the fastest thing on the x64-cpu (64-bit integers everywhere) is the slowest on the ARMv6. However, the fastest option on the ARMv6 (32-bit everywhere, and when absolutely needed, use double) is almost the worst on the i5 CPU.

When it comes to the 64-bit i5, it basically does not matter what datatypes you use.

When it comes to the ARMv6, the most important thing is to not store the triangles as int64_t. The strange thing here is the stack sizes. Why does it double (compared to x64) when triangles are stored as int64_t? And the doubles, why do they reduce stack size so much (where are all these doubles actually stored)?

The time command gives max resident memory usage. If I set ulimit -s 128 the first two programs fail (with Segmentation fault 11), and the last three ones succeed, on the ARMv6.

I have found before that the performance of the ARMv6 suffers because of its slow memory and small cache. It is quite possible that the poor performance of the ARMv6 compared to the i5 is related to its slow memory, and the recursion (and stack memory) heavy algorithm.

Finally, SBCL in x64 has very good performance even compared to C (however, an iterative C-implementation, fitting completely in cache, would probably be faster). Note that I am a novice Lisp programmer and this is a math heavy program where the generic number type of Lisp will come at a cost. On the ARMv6, Lisp performance suffers much more.

Windows stack size limit
For Windows, stack size limit is set in the binary, not in the shell. With Cygwin/GCC use the flag -Wl,–stack,1000000 for one million bytes. Note that these are options passed on to the linker.

Future investigations
And I am curious about how much faster a minimal-memory-footprint loop-based C-program would perform.

The source code
Since this code solves a problem in Hackerrank I hesitate to publish it. If you want it for any other reason than just running it on Hackerrank let me know.

All JavaScript objects are not equally fast

One thing I like with JavaScript and NodeJS is to have JSON in the entire stack. I store JSON on disk, process JSON data server side, send JSON over HTTP, process JSON data client side, and the web GUI can easily present JSON (I work with Angular).

As a result of this, all objects are not created the same. Lets say I keep track of Entries, I have an Entry-constructor that initiates new objects with all fields (no more no less). At the same time I receive Entry-objects as JSON-data over the network.

A strategy is needed:

  1. Have mix of raw JSON-Entries and Objects that are instanceof Entry
  2. Create real Entry-objects from all JSON-data
  3. Only work with raw JSON-Entries

Note that if you don’t go with (2) you can’t use prototype, expect objects to have functions or use instanceof to identify objects.

Another perhaps not obvious aspect is that performance is not the same. When you create a JavaScript object using new the runtime actually creates a class with fast to access properties. Such object properties are faster than

  • an empty object {} with properties set afterwards
  • an object created with JSON.parse()

I wrote a program to test this. The simplified explanation is that I obtained an array of objects that I then sorted/calculated a few (6) times. For a particular computer and problem size I got these results:

TIME   PARAMETER   DESCRIPTION
3.3s       R       Produce random objects using "new"
4.4s       L       Load objects from json-file using JSON.parse()
3.0s       L2      json-file, JSON.parse(), send raw objects to constructor
3.2s       L3      load objects using require() from a js-file

I will be honests and say that the implementation of the compare-function sent to sort() matters. Some compare functions suffered more or less from different object origins. Some compare functions are more JIT-optimised and faster the second run. However, the consistent finding is that raw JSON-objects are about 50% slower than objects created with new and a constructor function.

What is not presented above is the cost of parsing and creating objects.

My conclusion from this is that unless you have very strict performance requirements you can use the raw JSON-objects you get over the network.

Below is the source code (for Node.js). Apart from the parameters R, L, L2 and L3 there is also a S(tore) parameter. It creates the json- and js-files used by the Load options. So typically run the program with the S option first, and then the other options. A typicall run looks like this:

$ node ./obj-perf.js S
Random: 492ms
Store: 1122ms

$ node ./obj-perf.js R
Random: 486ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3350ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3361ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3346ms

$ node ./obj-perf.js L
Load: 376ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 4382ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 4408ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 4453ms

$ node ./obj-perf.js L2
Load: 654ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3018ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 2974ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 2890ms

$ node ./obj-perf.js L3
Load: 1957ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3436ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3264ms
DISTS=110463, 110621, 110511, 110523, 110591, 110515 : 3199ms

The colums with numbers (110511) are checksums calculated between the sorts. They should be equal, otherwise they dont matter.

const nodeFs = require('fs');

function Random(seed) {
  this._seed = seed % 2147483647;
  if (this._seed <= 0) this._seed += 2147483646;
}

Random.prototype.next = function () {
  return this._seed = this._seed * 16807 % 2147483647;
};

function Timer() {
  this.time = Date.now();
}

Timer.prototype.split = function() {
  var now = Date.now();
  var ret = now - this.time;
  this.time = now;
  return ret;
};

function Point() {
  this.a = -1;
  this.b = -1;
  this.c = -1;
  this.d = -1;
  this.e = -1;
  this.f = -1;
  this.x =  0;
}

function pointInit(point, rand) {
  var p;
  for ( p in point ) {
    point[p] = rand.next() % 100000;
  }
}

function pointLoad(json) {
  var p;
  var point = new Point();
  for ( p in point ) {
    point[p] = json[p];
  }
  return point;
}

function pointCmp(a,b) {
  return pointCmpX[a.x](a,b,a.x);
}

function pointCmpA(a,b) {
  if ( a.a !== b.a ) return a.a - b.a;
  return pointCmpB(a,b);
}

function pointCmpB(a,b) {
  if ( a.b !== b.b ) return a.b - b.b;
  return pointCmpC(a,b);
}

function pointCmpC(a,b) {
  if ( a.c !== b.c ) return a.c - b.c;
  return pointCmpD(a,b);
}

function pointCmpD(a,b) {
  if ( a.d !== b.d ) return a.d - b.d;
  return pointCmpE(a,b);
}

function pointCmpE(a,b) {
  if ( a.e !== b.e ) return a.e - b.e;
  return pointCmpF(a,b);
}

function pointCmpF(a,b) {
  if ( a.f !== b.f ) return a.f - b.f;
  return pointCmpA(a,b);
}

var pointCmpX = [pointCmpA,pointCmpB,pointCmpC,pointCmpD,pointCmpE,pointCmpF];

function pointDist(a,b) {
  return Math.min(
    (a.a-b.a)*(a.a-b.a),
    (a.b-b.b)*(a.b-b.b),
    (a.c-b.c)*(a.c-b.c),
    (a.d-b.d)*(a.d-b.d),
    (a.e-b.e)*(a.e-b.e),
    (a.f-b.f)*(a.f-b.f)
  );
}

function getRandom(N) {
  var i;
  var points = new Array(N);
  var rand   = new Random(14);

  for ( i=0 ; i<N ; i++ ) {
    points[i] = new Point();
    n = pointInit(points[i], rand);
  }
  return points;
}

function test(points) {
  var i,j;
  var dist;
  var dists = [];

  for ( i=0 ; i<6 ; i++ ) {
    dist = 0;
    for ( j=0 ; j<points.length ; j++ ) {
      points[j].x = i;
    }
    points.sort(pointCmp);
    for ( j=1 ; j<points.length ; j++ ) {
      dist += pointDist(points[j-1],points[j]);
    }
    dists.push(dist);
  }
  return 'DISTS=' + dists.join(', ');
}

function main_store(N) {
  var timer = new Timer();
  points = getRandom(N);
  console.log('Random: ' + timer.split() + 'ms');
  nodeFs.writeFileSync('./points.json', JSON.stringify(points));
  nodeFs.writeFileSync('./points.js', 'exports.points=' +
                                      JSON.stringify(points) + ';');
  console.log('Store: ' + timer.split() + 'ms');
}

function main_test(points, timer) {
  var i, r;
  for ( i=0 ; i<3 ; i++ ) {
    r = test(points);
    console.log(r + ' : ' + timer.split() + 'ms');
  }
}

function main_random(N) {
  var timer = new Timer();
  var points = getRandom(N);
  console.log('Random: ' + timer.split() + 'ms');
  main_test(points, timer);
}

function main_load() {
  var timer = new Timer();
  var points = JSON.parse(nodeFs.readFileSync('./points.json'));
  console.log('Load: ' + timer.split() + 'ms');
  main_test(points, timer);
}

function main_load2() {
  var timer = new Timer();
  var points = JSON.parse(nodeFs.readFileSync('./points.json')).map(pointLoad);
  console.log('Load: ' + timer.split() + 'ms');
  main_test(points, timer);
}

function main_load3() {
  var timer = new Timer();
  var points = require('./points.js').points;
  console.log('Load: ' + timer.split() + 'ms');
  main_test(points, timer);
}

function main() {
  var N = 300000;
  switch ( process.argv[2] ) {
  case 'R':
    main_random(N);
    break;
  case 'S':
    main_store(N);
    break;
  case 'L':
    main_load();
    break;
  case 'L2':
    main_load2();
    break;
  case 'L3':
    main_load3();
    break;
  default:
    console.log('Unknown mode=' + process.argv[2]);
    break;
  }
}

main();

JavaScript: await async

With Node.js version 8 there is finally a truly attractive alternative to good old callbacks.

I was never a fan of promises, and implementing await-async as a library is not pretty. Now when await and async are keywords in JavaScript things change.

The below program demonstrates a simple async function doing IO: ascertainDir. It creates a directory, but if it already exists no error is thrown (if there is already a file with the same name, no error is thrown, and that is a bug but it will do for the purpose of this article).

There are four modes of the program: CALLBACK, PROMISE, AWAIT-LIB and AWAIT-NATIVE. Creating a folder (x) should work. Creating a folder in a nonexisting folder (x/x/x) should fail. Below is the output of the program and as you see the end result is the same for the different asyncronous strategies.

$ node ./await-async.js CB a
Done: a
$ node ./await-async.js CB a/a/a
Done: Error: ENOENT: no such file or directory, mkdir 'a/a/a'

$ node ./await-async.js PROMISE b
Done: b
$ node ./await-async.js PROMISE b/b/b
Done: Error: ENOENT: no such file or directory, mkdir 'b/b/b'

$ node ./await-async.js AWAIT-LIB c
Done: c
$ node ./await-async.js AWAIT-LIB c/c/c
Done: Error: ENOENT: no such file or directory, mkdir 'c/c/c'

$ node ./await-async.js AWAIT-NATIVE d
Done: d
$ node ./await-async.js AWAIT-NATIVE d/d/d
Done: Error: ENOENT: no such file or directory, mkdir 'd/d/d'

The program itself follows:

     1	var nodefs = require('fs')
     2	var async = require('asyncawait/async')
     3	var await = require('asyncawait/await')
     4	
     5	
     6	function ascertainDirCallback(path, callback) {
     7	  if ( 'string' === typeof path ) {
     8	    nodefs.mkdir(path, function(err) {
     9	      if (!err) callback(null, path)
    10	      else if ('EEXIST' === err.code) callback(null, path)
    11	      else callback(err, null)
    12	    })
    13	  } else {
    14	    callback('mkdir: invalid path argument')
    15	  }
    16	};
    17	
    18	
    19	function ascertainDirPromise(path) {
    20	  return new Promise(function(fullfill,reject) {
    21	    if ( 'string' === typeof path ) {
    22	      nodefs.mkdir(path, function(err) {
    23	        if (!err) fullfill(path)
    24	        else if ('EEXIST' === err.code) fullfill(path)
    25	        else reject(err)
    26	      })
    27	    } else {
    28	      reject('mkdir: invalid path argument')
    29	    }
    30	  });
    31	}
    32	
    33	
    34	function main() {
    35	  var method = 0
    36	  var dir    = 0
    37	  var res    = null
    38	
    39	  function usage() {
    40	    console.log('await-async.js CB/PROMISE/AWAIT-LIB/AWAIT-NATIVE directory')
    41	    process.exit(1)
    42	  }
    43	
    44	  switch ( process.argv[2] ) {
    45	  case 'CB':
    46	  case 'PROMISE':
    47	  case 'AWAIT-LIB':
    48	  case 'AWAIT-NATIVE':
    49	    method = process.argv[2]
    50	    break
    51	  default:
    52	    usage();
    53	  }
    54	
    55	  dir = process.argv[3]
    56	
    57	  if ( process.argv[4] ) usage()
    58	
    59	  switch ( method ) {
    60	  case 'CB':
    61	    ascertainDirCallback(dir, function(err, path) {
    62	      console.log('Done: ' + (err ? err : path))
    63	    })
    64	    break
    65	  case 'PROMISE':
    66	    res = ascertainDirPromise(dir)
    67	    res.then(function(path) {
    68	      console.log('Done: ' + path)
    69	    },function(err) {
    70	      console.log('Done: ' + err)
    71	    });
    72	    break
    73	  case 'AWAIT-LIB':
    74	    (async(function() {
    75	      try {
    76	        res = await(ascertainDirPromise(dir))
    77	        console.log('Done: ' + res)
    78	      } catch(e) {
    79	        console.log('Done: ' + e)
    80	      }
    81	    })());
    82	    break
    83	  case 'AWAIT-NATIVE':
    84	    (async function() {
    85	      try {
    86	        res = await ascertainDirPromise(dir)
    87	        console.log('Done: ' + res)
    88	      } catch(e) {
    89	        console.log('Done: ' + e)
    90	      }
    91	    })();
    92	    break
    93	  }
    94	}
    95	
    96	main()

Please note:

  1. The anonymous function on line 74 would not be needed if main() itself was async()
  2. The anonymous function on line 84 would not be needed if main() itself was async
  3. A function that returns a Promise() (line 19) works as a async function without the async keyword.

Callback
Callback is the old simple method of dealing with asyncrounous things in JavaScript. A major complaint has been “callback hell”: if you call several functions in sequence it can get rather messy. I can agree with that, BUT I think each asyncrounous call deserves its own error handling anyway (and with proper error handling other options tend to be equally tedious).

Promise
I dont think using a promise (66-71) is very nice. It is of course a matter of habit. One thing is that not all requests in the success-path are actually success in real life, or not all errors are errors (like in ascertainDir). Very commonly you make a http-request which itself is good, but the data you receive is not good so you want to proceed with error handling. This means that the fulfill case needs to execute the same code as the reject case, for some “successful” replies. Promises can be chained, but it typically results in ignoring proper error handling.

awaitasync library
I think the syntax of the asyncawait library is rather horrible, but it works as a proof-of-concept for the real thing.

async await native keywords
With the async/await keywords in JavaScript, suddenly asyncrounous code can be handled just like in Java or C#. Since it is familiar it is appealing! No doubt it is clean and practical. I would hesitate to mix it with Callbacks or Promises, and would rather wait until I can do a complete rewrite.

Common sources of bugs in JavaScript are people trying to return from within (callback/promises) functions, people not realising the rest of the code continues to run after the asyncrous call, or things related to variable scopes. I guess in most cases the await/async makes these things cleaner and easier, but I would expect problems where it causes unexpected effects when not properly used.

Finally, if you start using async/await keywords there is no polyfill or fallback for older browser (maybe Babel can do that for you). As usual, IE seems to lag behind, and you can forget about Node v6 (or earlier). Depending on your situation, this could be a show stopper or no issue at all.

Watch something?
For more details, I can recommend this video on 5 architectures of asynchronous JavaScript.

Thinking small and big (in programming)

When programming, thinking small often allows for a quick start but after a while your project slows down as it grows with pain. Based on every such painful experience there are countless of good practices in programming for thinking big. However, thinking big is difficult and comes with overhead, and if you think too big there is a risk you will only think big, and not think about your problem and your actual code very much.

I will give a number of examples of smaller and bigger thinking (don’t get stuck reading the list).

Memory usage
Small thinking: everything fits in RAM
– smaller: everything fits in CPU cache, or CPU registries
Big thinking: using external storage, streaming, compression
– bigger: scaling out

Parallelism
Small thinking: single process and thread
Big thinking: multi threading, sending work to other processes
– bigger: scaling out

Portability
Thinking small: portable by standard compliance
– smaller: single platform
Thinking big: target specific tweaks, build and configuration options
– bigger: target specific dependencies (Mysql for Linux, MS SQL for windows)

Source management
Small thinking: versioned tarballs
– smaller: just a single local file
Big thinking: a git/svn repository
– bigger:several repositories, bug tracker, access rights

Building
Small thinking: single standard compile command
– smaller: no building required in the first place
Big thinking: make
– bigger: autoconf, tools and configuration required (Babel)
– even bigger: build a build-and-config-system (like menuconfig for Linux kernel)

Testing
Small thinking: assert()
Big thinking: unit tests
– bigger: test driven development, test coverage analysis
– even bigger: continuous integration

Configuration
Small thinking: command line options
– smaller: hard coded
Big thinking: configuration file
– bigger: configuration (G)UI
– even bigger: download configuration, find out configuration itself, selection of different configurations (like XML-file, JSON-file or database)

Error handling
Small thinking: crash with error message
Big thinking: log file(s), verbose levels
– bigger: error recovery, using system logs (like Windows event log)
– even bigger: monitoring, choice of different external log systems

UI
Thinking small: single CLI or GUI
Thinking big: build a backend library or server that allows for different UIs

Dependencies (code)
Thinking small: only standard library
Thinking big: require libraries (external and own code broken out to libraries)
– bigger: optional dependencies, supporting different libraries that do the same thing
– even bigger: dependencies can be loaded dynamically during run time

Dependencies (databases, services)
Thinking small: no dependencies
Thinking big: external storage
– bigger: allow multiple clients against common storage
– even bigger: distributed, scaled out, storage

State
Small thinking: functions and data
– smaller: rely on global data
Big thinking: encapsulation (OO-style), immutable data (FP-style)

Generics
Small thinking: functions are specific for data
Big thinking: generic functions by templates, interfaces, generators, iterators

Performance
Small thinking: code is fast enough
Big thinking: architecture allows scaling out for more performance as required

Deployment
Small thinking: manual copy-replace is good enough
Big thinking: testing, continuous integration, rollback, zero-downtime

Automation
Thinking small: automation not needed since all tasks are simple enough
Thinking big: automation makes complex tasks fast and easy

One size does not fit all
The important thing to understand is that there is no silver bullet. Each program, problem or project has its own requirements and its own sweet spot of small-vs-big. But this my change over time; even if you need to be a bit big later on, it may not help in the beginning.

Perhaps obvious; it is not meaningful to say that a full CI-environment is better than assert() (and you may argue that they are entirely different things). Having global data is not (for all problems) worse than having completely immutable state. And so on.

Your need for big varies within a project: you may need a very big and configurable build process to build something that has very small requirements when it comes to scaling out.

There are no safe default choices
You need to make qualified small-vs-big choices. If you are an experienced programmer you often don’t need to think much about it. If you work within an environment where you already master tools that are available (perhaps even mandatory) you can use them with little overhead. However, if you take your environment (perhaps just an IDE, or more) for granted and you rely on it, it may not be so easy for someone else to pick up where you left.

Just as you can fall behind others who use better tools you can grow fat and fall behind those who use fewer tools (and stay smaller).

If in doubt: start small
Going small in every aspect is often not good enough (except for very isolated problems). But it can be a good start and if (or when) it fails you will learn from your mistake. You will understand how to architect your software better and you will understand why some (big) tools and practices really exist. That is good wisdom!

Going big in every aspect is most definitely going to make you fail. You may need to do this for building systems like SAP or Windows (but such large project often do fail). If you fail with something far too big it is hard to learn from it. Chances are you never really got down to the requirements and chances are much energy was spent integrating tools and frameworks into a development and operation environment that worked at all.

Small sometimes goes a long way
There are often theoretical discussions about small-vs-big. Big often looks attractive and powerful. However, some problems are just hard regardless how you solve them, and a small solution is often more right on target.

There was a macro-kernel vs micro-kernel discussion. A micro kernel is a big solution: more encapsulation, more isolation, less global data, more dynamic loading and so on. Linux is obviously more successful than HURD (the GNU micro kernel), mostly because it actually works.

Agile and Refactoring
Agile and refactoring are about encouraging you to start small, make things that are good enough for now, and fix them later on (if ever needed). Often the problem down the road is not what you expected when you started.

Architecture, Microservices, UNIX
The UNIX principle is that everything is a program that does one thing well.
Microservices is much the same thing except it spans over several networked services.

This works. Because most of the time, for most purposes, the developers (of UNIX and Microservices) can think small. Most programs in a UNIX system, like most services in a Microservice architecture, are for most practical purposes small programs or small services.

UNIX: some programs need to be highly secure, some need an interactive UI, some need to log, some have high performance requirements, some have dynamic dependencies and some are better not written in C. This is why you should build a microservice architecture (not a monolith) and this is how you should build it (unless you are as good as Torvalds and you can land a Monolith in C – but that works thanks to very good architecture and practices – and Linux is still just the Kernel in a much bigger system).

Limited time
All software projects have limited time available. Time is spent on:

  1. Understanding requirements
  2. Producing code that correctly and efficiently matches the requirements
  3. Test and deployment
  4. Solution architecture
  5. Tools and frameworks: understanding and integration

#1 deliver value even on its own (sometimes a technical solution is not even required).
#4 and #5 only deliver value if the they save total time (by lowering #2 and #3).
#2 sometimes is just not possible without #5, then please go ahead with #5.

But if #2 takes one week if you use Notepad to code a single index.html file containing HTML+CSS+JavaScript (and this solves the requirements), then there must be a good case for spending time on #4 and #5 (going big) instead of just solving the problem (staying small).

#4 and #5 produce what I call invented problems; problems that you did not have in the first place, that are not related to your requirements but comes with your tools. The most obvious example is licencing issues. If you go multithreading and/or use an external database you suddenly have deadlocks, race conditions, transactions and semaphores to worry about: is that price worth it for what you get from the database or multithreading? Deployment (and server configuration) is absolutely necessary, often rather complicated, and delivers no value to the customer what so ever.

Always ask yourself: how hard would it be to solve this problem using the smallest reasonable set of tools?

Maintain vs Replace
Many big practices are about producing maintainable code. Often this never pays off:

  • There is no need for the code anymore
  • The code does what it needs to do, and no change is required
  • Even though the code itself is maintainable, no one understands the problem and the solution well enough to actually improve (or even change) it

When (if) the moment of change actually comes, often a fresh start is anyway the best solution. If programs are made small and they do one thing well (so it is quite easy to test that another program can replace it) replacing them is not a big deal.

This means that ugliness (global variables, lack of encapsulation, hard coded limitations, lack of proper test coverage, inability to scale, and so on) often is not a problem. On the other hand, a (big) program that is not fit for purpose (not correct and efficient) never produce much value in the fist place.

Performance (and Scaling)
Golden rule of optimization:

  1. Don’t
  2. Experts only: see (1)

This is not entirely true but most of your code is not performance critical. In computing, there are two ways you can get faster:

  1. Go small: find ways to make your code require less resources
  2. Go big: assign more resources to run your code

The truth is that modern hardware is extremely powerful. Even a Raspberry Pi V1 (with 700MHz CPU and 512MB RAM) can serve enormous amounts of network requests or crunch amazingly many numbers. If a Raspberry Pi is not enough for you, you either have

  1. very many users
  2. a very complicated/large/heavy problem
  3. coded a solution that mostly wastes resources

If you know that #1 (only) is your case, go ahead and scale out big. Be sure to know your bottlenecks and seriously consider your storage model.

If #2 is your case you need to sit down and think.

If #3 is your case, you should have stayed small from the beginning. It is probably cheaper to rewrite significant parts of your solution in C (or another language that uses minimal resources) and keeping all data in RAM, than it is to scale your code out.

Availability (and Redundancy)
You may need high availability: downtime, unexpected or not, is expensive.

The big solution is to go for redundancy: if one goes down the other takes over. This can be the right thing to do – when everything else has already been tried. Sometimes the cure is worse than the disease.

The small solution is to keep your program simple and when something unexpected happens, let it crash. This way you will quite soon (pre production) have nailed out the critical errors. And if you can not make it stable no redundancy or fault-tolerant environment will really save you.

Conclusion
When going big understand the cost. The road to hell is sided by good intentions. Beware of grande architectures.

Lodash Performance Sucks!

To continue my Functional Programming Sucks series of posts I will have a closer look at reduce().

I complained with Lodash (and Underscore) for different reasons. One complaint was performance, but I just read the code and presumed it was going to be slow without measuring. Then I complained with the performance of Functional Programming in general.

I thought it would be interesting to “improve” the Functional code with Lodash functions, and to my surprise (I admit I was both wrong and surprised) I found Lodash made it faster! After reading a little more about it I discovered this is a well known fact.

So, here are four different implementations of a function that checks if the elements (numbers) in an array are ordered (cnt is incremented if the array is sorted, such was the original problem).

// Standard reduce()
    this.test = function(p) {
        if ( false !== p.reduce(function(acc,val) {
            if ( false === acc || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

// Lodash reduce(), and some other Lodash waste
    this.test = function(p) {
        if ( false !== LO.reduce(p,function(acc,val) {
            if ( false === acc || val < acc ) return false;
    //      if ( !LO.isNumber(acc) || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

// My own 4 minute to implement simpleReduce(), see below
    this.test = function(p) {
        if ( false !== simpleReduce(p,function(acc,val) {
            if ( false === acc || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

// A simple imperative version
    this.test = function(p) {
        var i;
        for ( i=1 ; i < p.length ; i++ ) {
            if ( p[i] < p[i-1] ) return;
        }
        cnt++;
    };

// my own implementation reduce()
    function simpleReduce(array, func, initval) {
         var i;
         var v = initval;
         for ( i=0 ; i<array.length ; i++ ) {
             v = func(v, array[i]);
         }
         return v;
    }

The interesting thing here is that the standard library reduce() is the slowest.
However, my simpleReduce is faster than Lodash reduce().

(seconds) reduce()
Std Lib Lodash Simple Imperative
Raspberry Pi v1 (ARMv6 @ 700) 21 13 9.3 4.8
MacBook Air (Core i5 @ 1400) 0.46 0.23 0.19 0.16

Conclusion
The conclusion is that from a performance perspective Functional Programming sucks. Lodash sucks too, but a little bit less so than the standard library (however, if you decorate all your code with isEmpty, isString, isNumber and that crap it will get worse).

That said, the generic nature of Lodash comes at a cost. The most simpleReduce() imaginable outperforms Lodash. As I see it, this leaves Lodash in a pretty bad (or small) place:

  • Compared to the standard library it is an extra dependency with limited performance benefits
  • The generic nature of Lodash comes at both a performance cost and it allows for sloppy coding
  • A hand written reduce() outperforms Lodash and is a good excercise for anyone to write. I expect this is quite true also for other functions like take or takeRight.
  • For best performance, avoid Functional Programming (and in this case the imperative version is arguably more readable than the FP reduce() versions)

Whats up with the Standard Library???
JavaScript is a scripted language (interpreted with a JIT compiler) that has a standard library written in C++. How can anything written in JavaScript execute faster than anything in the standard library that does the same thing?

First, kudos to the JIT designers! Amazing job! Perhaps the standard library people can learn from you?

I can imagine the standard library functions are doing some tests or validations that are somehow required by the standard, and that a faster and less strict version of reduce() would possibly break existing code (although this sounds far fetched).

I can (almost not) imagine that there is a cost of going from JS to Native and back to JS: that function calls to native code comes with overhead. Like going from user space to kernel space. It sounds strange.

I have read that there are optimizations techniques applied to Lodash (like lazy evaluation), but I certainly didn’t do anything like that in my simpleReduce().

For Node.js optimizing the standard library truly would make sense. In the standard library native code of a single-threaded server application every cycle counts.

UPDATE: I tried replacing parts of the above code: 1) the lambda function that is passed to reduce(), 2) the imperative version, with native code. That is, I wrote C++ code for V8 and used it instead of JavaScript code. In both cases this was slower! Obviously there is some overhead in going between native and JavaScript JIT, and for rather small functions this overhead makes C++ “slower” than JavaScript. My idea was to write a C++ reduce() function but I think the two functions I wrote are enough to show what is happening here. Conclusion: don’t write small native C++ functions for performance, and for maximum performance it can be worth to rewrite the standard library in JavaScript (although this is insane to do)!

All FP-sucks related articles
Functional Programming Sucks)
Underscore.js sucks! Lodash sucks!
Functional Programming Sucks! (it is slow)
Lodash Performance Sucks! (this one)

Functional Programming Sucks! (it is slow)

Update 2017-07-17: Below i present numbers showing that functional code is slower than imperative code. It seems this has changed with newer versions of Node.js: functional code has not turned faster but imperative code has become slower. You can read a little more about it in the comments. I will look more into this. Keep in mind that the below findings may be more accurate for Node.js v4-6 than for v8.

Functional programming is very popular with contemporary JavaScript programmers. As I have written before, Functional programming sucks and functional libraries for JavaScript also suck.

In this post I will explain more why Functional Programming sucks. I will start with the conclusion. Read on as long as you want more details.

Functional Programming practices are bad for performance
It is very popular to feed lamda-functions to map(), reduce(), filter() and others. If you do this carelessly the performance loss is significant.

It is also popular to work with immutable data. That is, you avoid functions that change (mutate) current state (side effects) and instead you produce a new state (a pure function). This puts a lot of pressure on the garbage collector and it can destroy performance.

The Benchmark Problem
Sometimes I entertain myself solving problems on Hackerrank.com. I particularly like the mathematical challenges in the Project Euler section (the Project Euler is also an independent organisation – HackerRank uses the challenges in Project Euler to create programming challenges).

This article refers to Project Euler 32. I will not go into details, but the solution is basically:

  1. Generate all permutations of the numbers 1,2,3,4,5,6,7,8,9 (there are 9! of them)
  2. For each permutation, check if it is “good” (very few are)
  3. Print the sum of the good instances

The first two steps give good benchmark problems. I have made different implementations of (1) and (2) and then compared the results.

Benchmark Results
I have three different permutation generators (all recursive functions):

  1. Pure function, immutable data (it may not be strictly pure)
  2. Function that mutates its own internal state, but not its input
  3. Function that mutates shared data (no allocation/garbace collection)

I also have three different test functions:

  1. Tests the orginal Project Euler problem
  2. Simplified test using reduce() and lamda function
  3. Simplified test implemented a standard loop

I benchmarked on two different systems using Node.js version 6. I have written elsewhere that Node.js performance on Raspberry Pi sucks.

(seconds) Project Euler Test Simplified Test
Test Function: Functional Imperative
Permutation Gen: Pure Semi Shared Shared Shared Pure
Raspberry Pi v1 (ARMv6 @ 700) 69 23 7.4 21 3.7 62
MacBook Air (Core i5 @ 1400) 0.77 0.29 0.13 0.40 0.11 0.74

Comparing columns 1-2-3 shows the performance of different generators (for Project Euler test)
Comparing columns 4-5 shows the performance of two different test functions (using fast generator)
Comparing columns 5-6 shows the performance of two different generators (for fast simple test)

This shows that the benefit of using shared/mutable data (not running the garbage collector) instead of immutable data is 5x performance on the Intel CPU and even more on the ARM. Also, the cost of using reduce() with a lamda function is more than 3x overall performance on the Intel CPU, and even more on the ARM.

For both the test function and permutation generation, making any of them functional-slow significantly slows down the entire program.

The conclusion of this is that unless you are quite sure your code will never be performance critical you should avoid functional programming practices. It is a lot easier to write imperative code than to later scale out your architecture when your code does not perform.

However, the pure immutable implementation of the permutation generator is arguably much simpler than the iterative (faster) counterpart. When you look at the code you may decide that the performance penalty is acceptable to you. When it comes to the reduce() with a lamda function, I think the imperative version is easier to read (and much faster).

Please notice that if your code consists of nice testable, replaceble parts without side effects you can optimize later on. The functional principles are more valuable at a higher level. If you define your functions in a way that they behave like nice FP functions it does not matter if they are implemented using imperative principles (for performance).

Generating Permutations
I used the following simple method for generating permutations. I start with two arrays and I send them to my permute-function:

  head = [];
  tail = [1,2,3,4];

  permute(head,tail);

My permute-function checks if tail is empty, and then: test/evalute head.
Otherwise it generates 4 (one for each element in tail) new sets of head and tail:

  permute( [1] , [2,3,4] )
  permute( [2] , [1,3,4] )
  permute( [3] , [1,2,4] )
  permute( [4] , [1,2,3] )

The difference in implementation is:

  • Pure version generates all the above 8 arrays as new arrays using standard array functions
  • Semi pure version generates its own 2 arrays (head and tail) and then uses a standard loop to change the values of the arrays between the (recursive) calls to permute.
  • Shared version simply creates a single head-array and 9 tail-arrays (one for each recursion step) up front. It then reuses these arrays throughout the 9! iterations. (It is not global variables, they are hidden and private to the permutation generator)

The simplified test
The simplified test checks if the array is sorted: [1,2,3,4]. Of all permutations, there is always exactly one that is sorted. It is a simple test to implement (especially with a loop).

// These functions are part of a "test-class" starting like:
function testorder1() {
    var cnt = 0;

// Functional test
    this.test = function(p) {
        if ( false !== p.reduce(function(acc,val) {
            if ( false === acc || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

// Iterative test (much faster)
    this.test = function(p) {
        var i;
        for ( i=1 ; i<p.length ; i++ ) {
            if ( p[i] < p[i-1] ) return;
        }
        cnt++;
    };

I tried to optimise the functional reduce() version by breaking out a named function. That did not help. I also tried to let the function always return the same type (now it returns false OR a number) but that also made no difference at all.

All the code
For those who want to run this themselves or compare the permutation functions here is the entire program.

As mentioned above, the slowest (immutable data) permutation function is a lot smaller and easier to understand then the fastest (shared data) implementation.


'use strict';

// UTILITIES

function arrayToNum(p, s, e) {
    var r = 0;
    var m = 1;
    var i;
    for ( i=e-1 ; s<=i ; i-- ) {
        r += m * p[i];
        m *= 10;
    }
    return r;
}

function arrayWithZeros(n) {
    var i;
    var a = new Array(n);
    for ( i=0 ; i<a.length ; i++ ) a[i] = 0;
    return a;
}


// PERMUTATION ENGINES

function permutations0(n, callback) {
}

// IMMUTABLE (SLOWEST)

function permutations1(n, callback) {
    var i;
    var numbers = [];
    for ( i=1 ; i<=n ; i++ ) numbers.push(i);
    permute1([],numbers,callback);
}

function permute1(head, tail, callback) {
    if ( 0 === tail.length ) {
        callback(head);
        return;
    }

    tail.forEach(function(t, i, a) {
        permute1( [t].concat(head),
                  a.slice(0,i).concat(a.slice(i+1)),
                  callback);

    });
}

// MUTATES ITS OWN DATA, BUT NOT ITS ARGUMENTS

function permutations2(n, callback) {
    var i;
    var numbers = [];
    for ( i=1 ; i<=n ; i++ ) numbers.push(i);
    permute2([],numbers,callback);
}

function permute2(head, tail, callback) {
    if ( 0 === tail.length ) {
        callback(head);
        return;
    }
    var h2 = [tail[0]].concat(head);
    var t2 = tail.slice(1);
    var i  = 0;
    var tmp;
    
    while (true) {
        permute2(h2, t2, callback);
        if ( i === t2.length ) return;
        tmp   = h2[0];
        h2[0] = t2[i];
        t2[i] = tmp;
        i++;
    }
}

// MUTATES ALL DATA (INTERNALLY) (FASTEST)

function permutations3(n, callback) {
    var i;
    var head  = arrayWithZeros(n);
    var tails = new Array(n+1);

    for ( i=1 ; i<=n ; i++ ) {
        tails[i] = arrayWithZeros(i);
    }

    for ( i=1 ; i<=n ; i++ ) {
        tails[n][i-1] = i;
    }

    function permute3(x) {
        var j;
        var tail_this;
        var tail_next;
        var tmp;
        if ( 0 === x ) {
            callback(head);
            return;
        }
        tail_this = tails[x];
        tail_next = tails[x-1];

        for ( j=1 ; j<x ; j++ ) {
            tail_next[j-1] = tail_this[j];
        }

        j=0;
        while ( true ) {
            head[x-1] = tail_this[j];
            permute3(x-1);
             
            j++;
            if ( j === x ) return;

            tmp            = head[x-1];
            head[x-1]      = tail_next[j-1];
            tail_next[j-1] = tmp;
        }
    }

    permute3(n);
}

// TEST FUNCTIONS

function testprint() {
    this.test = function(p) {
        console.log(JSON.stringify(p));
    };

    this.done = function() {
        return 'Done';
    };
}

// CHECKS IF PERMUTATION IS ORDERED - FUNCTIONAL (SLOWEST)

function testorder1() {
    var cnt = 0;

    this.test = function(p) {
        if ( false !== p.reduce(function(acc,val) {
            if ( false === acc || val < acc ) return false;
            return val;
        }, -1)) cnt++;
    };

    this.done = function() {
        return cnt;
    };
}

// CHECKS IF PERMUTATION IS ORDERED - IMPERATIVE (FASTEST)

function testorder2() {
    var cnt = 0;

    this.test = function(p) {
        var i;
        for ( i=1 ; i<p.length ; i++ ) {
            if ( p[i] < p[i-1] ) return;
        }
        cnt++;
    };

    this.done = function() {
        return cnt;
    };
}

// TEST FUNCTION FOR PROJECT EULER 32

function testeuler() {
    var sums = {};

    this.test = function(p) {
        var w1, w2, w;
        var m1, m2, mx;
        w =  Math.floor(p.length/2);
        w1 = 1;
        w2 = p.length - w - w1;
    
        while ( w1 <= w2 ) {
            m1 = arrayToNum(p,     0, w1      );
            m2 = arrayToNum(p,    w1, w1+w2   );
            mx = arrayToNum(p, w1+w2, p.length);
        
            if ( m1 < m2 && m1 * m2 === mx ) {
                sums['' + mx] = true;
            }
        
            w1++;
            w2--;
        }
    };

    this.done = function() {
        var i;
        var r = 0;
        for ( i in sums ) {
            r += +i;
        }
        return r;
    };
}

// MAIN PROGRAM BELOW

function processData(input, parg, targ) {
    var r;

    var test = null;
    var perm = null;

    switch ( parg ) {
    case '0':
        perm = permutations0;
        break;
    case '1':
        perm = permutations1;
        break;
    case '2':
        perm = permutations2;
        break;
    case '3':
        perm = permutations3;
        break;
    }

    switch ( targ ) {
    case 'E':
        test = new testeuler;
        break;
    case 'O1':
        test = new testorder1;
        break;
    case 'O2':
        test = new testorder2;
        break;
    case 'P':
        test = new testprint();
        break;
    }


    r = perm(+input, test.test);
    console.log(test.done());
} 

function main() {
    var input = '';
    var parg = '1';
    var targ = 'E';
    var i;

    for ( i=2 ; i<process.argv.length ; i++ ) {
        switch ( process.argv[i] ) {
        case '0':
        case '1':
        case '2':
        case '3':
            parg = process.argv[i];
            break;
        case 'E':
        case 'O1':
        case 'O2':
        case 'P':
            targ = process.argv[i];
            break;
        }
    }
    

    process.stdin.resume();
    process.stdin.setEncoding('ascii');
    process.stdin.on('data', function (s) {
        input += s;
    });

    process.stdin.on('end', function () {
       processData(input, parg, targ);
    });
}

main();

This is how I run the code (use a lower value than 9 to have fewer than 9! permutations)

### Project Euler Test: 3 different permutation generators ###
$ echo 9 | time node projecteuler32.js 3 E
45228
8.95user ...
b$ echo 9 | time node projecteuler32.js 2 E
45228
25.03user ...
$ echo 9 | time node projecteuler32.js 1 E
45228
70.34user ...

### Simple check-order test, two different versions. Fastest permutations.
b$ echo 9 | time node projecteuler32.js 3 O1
1
23.71user ...
$ echo 9 | time node projecteuler32.js 3 O2
1
4.72user ...

(the timings here may not exactly match the above figures)

All FP-sucks related articles
Functional Programming Sucks)
Underscore.js sucks! Lodash sucks!
Functional Programming Sucks! (it is slow) (this one)
Lodash Performance Sucks!

Underscore.js sucks! Lodash sucks!

In a world of functional programming hype there are two very popular JavaScript frameworks: underscore.js and Lodash. Dont use them! It is a horrible idea. They suck just like functional programming sucks!

They make claims like:

  • Lodash: A modern JavaScript utility library delivering […] performance
  • Underscore: JavaScript library that provides a whole mess of useful functional programming helpers.

The road to hell is sided by good intentions. This is how it goes.

1. Sloppy data types
There are many good things about JavaScript. The sloppy dynamic typing is perhaps not one of them. The following are for example true:

  • ’27’ == 27
  • undefined == null
  • 0 == ”
  • ‘object’ === typeof null

Now that I consider myself an experienced programmer I find it quite convenient to not need to be explicit about data types. But I dont mix and change types! If a variable is a number from the beginning it keeps being a number. I carefully pick types: Objects, Arrays, Number, String and stick to that type (for a variable or property). Occationally – mostly for return variables – I break the rule.

Lodash and Underscore is about allowing yourself to be sloppy:

  • Dont know if its an object or an array: use map, foreach, filter, reduce and many more
  • Dont know if it is empty (or what being empty even means): use isEmpty
  • Dont know if it is String Object or a String primitive or something else: use isString

If you dont know what it is is you already have a much bigger problem than how to do something with it.
If you mix String Objects and String primitives AND other things, and you want to know if it is any kind of string you are doing something wrong.

So Step 1 with Lodash and Underscore is that you

  1. Add a depenceny
  2. Allow sloppy and inconsistent typing
  3. No one can now presume anything about your types anymore
  4. Your code is now impossible to maintain or extend without lodash or underscore

2. Performance!
My experience after many years in software development is that when an application is not well received by the users it is very often because of (bad) performance. And bad performance causes weird, hard to reproduce, bugs and instability as well.

An important type of optimization that the JIT can do relies on the runtime generating classes with strict types for your objects (it guesses and learns the types as the program runs). If you allow a property to assume values of different types you are likely to destroy this optimization.

Lets look at the little function isEmpty.

/** Underscore **/
_.isEmpty = function(obj) {
    if (obj == null) return true;
    if (isArrayLike(obj) && (_.isArray(obj) || _.isString(obj) || _.isArguments(obj))) return obj.length === 0;
    return _.keys(obj).length === 0;
};

/** Lodash **/
function isEmpty(value) {
    if (isArrayLike(value) &&
        (isArray(value) || isString(value) ||
         isFunction(value.splice) || isArguments(value))) {
        return !value.length;
    }
    return !nativeKeys(value).length;
}

If you KNOW the datatype you can just do:

/** String **/
if ( 0 === s.length )

/** String that may be null **/
if ( null === s || 0 === s.length )

/** Array **/
if ( 0 === a.length )

/** Object **/
function objectIsEmpty(o) {
    for ( x in o ) return false;
    return true;
}

(you may want to check o.hasOwnProperty(x) depending on what you actually want – but if you dont know what you want using Lodash or Underscore will produce equally unexpected results as my code)

The worst thing with the Underscore and Loadash implementations are the last lines:

    return _.keys(obj).length === 0;
    return !nativeKeys(value).length;

Unless the JIT compiler and runtime is very smart those two will produce an entire array of strings on the heap – just to check if the array length is 0! Even though this in most practical cases will have an acceptable overhead it can get very expensive.

This IS the beauty of FP generic programming. Focus on WHAT, not HOW, and any little innocent check (like isEmpty) can turn horribly expensive.

However, to be honest, I took some plain functional code and replaced plain JavaScript with Lodash functions (forEach, reduce, isNumber, isEmpty and a few more) and the code actually got faster! Still much slower than imperative code, but slightly faster than not using Lodash.

3. Complexity and Optimization
Now that you have added an extra dependency, made your data objects sloppy, made your application harder for the JIT to optimize, perhaps your application is not as quick as you need it to be. If you are writing a front end you are probably quite fine. But if you are coding a Node.js backend, performance matters a lot more and waste is more unacceptable. If you are really unlucky these sloppy types give you hard to find bugs and your backend is not completely stable and reliable.

What do you do? Common practices in the business could be things like:

  • Minification/uglification
  • Scale out service architecture
  • Failover technology
  • Spend time optimizing code and modules

This is how little sloppyness, laziness and convenience when making initial decisions about your architecture later can cause you huge problems and costs.

Of course, just including lodash and using isEmpty() in a few places is not going to do much (if any) harm.
But finding lodash or underscore generally preferable to not using them is one kind of low quality thinking that causes software to be bad.

Conclusion
Be explicit, careful, consistent and smart about the data types you use.
Be restrictive with libraries and frameworks that introduce overhead and hide relevant details.

Use the standard library. However, you can find that for example the Array functions of Lodash outperform the standard library equivalents. That may not be true in the future (and I really wonder how it can happen at all).

All FP-sucks related articles
Functional Programming Sucks)
Underscore.js sucks! Lodash sucks! (this one)
Functional Programming Sucks! (it is slow)
Lodash Performance Sucks!

Functional Programming Sucks*

(*like everything else that is mindlessly applied the wrong way to solve the wrong problems)

Functional Programming Rocks!
You can do amazing things with Haskell. The historical importance of Lisp is huge. Math, computer science, algorithms and engineering can come together beautifully with functional programming. Writing small reusable, testable functions with no side effects is a great thing to do in any language – but more than anywhere else those virtues are emphasized in functional programming. I love first class functions!

But…

The unfortunate JavaScript Hype
As JavaScript made map, filter and reduce part of the standard those have become the foremost frontier of functional programming. Being quite serious, many people argue that if you replace your for-loops with map, filter, reduce and forEach you have achieved something significant when it comes to readability. What do you think?

// non-functional smelly loop
redFruits = [];
for ( f in fruits ) {
  if ( 'red' === fruits[f].color ) redFruits.push(fruits[f]);
}

// fantastic functional alternative
greenFruits = fruits.filter(function(f) {
  return 'green' === f.color;
});

As a bonus with functional, you can use lamda-functions… that is functions, with no name, that can not be reused. You can even use arrow-functions if you think this is too easy a concept and you think the code gets more readable by omitting the confusing word function.

While there are a lot of balanced articles about using functional ideas in JavaScript, for a lot of people eliminating for-loops at any cost seems to be functional enough.

The argument here is that your code should be declarative, it should express intention rather than instructions. In the case above with the fruit-filter it does make some sense. But if that argument should make any sense, it requires that the programmer does not abuse map, filter, forEach or reduce to eliminate a for-loop just for the sake of it. I have seen things like:

// functional
applePrice = Object.keys(fruits).filter(function(f) {
  return 'apple' === f.name;
})[0].price;

// when this would work
for ( f in fruits ) {
  if ( 'apple' === fruits[f].name ) {
    applePrice = fruits[f].price;
    break;
  }
}

I have several objections with the functional version. The Object.keys thing is hardly pleasant to the eye. It is false marketing to use filter: the intention is find or search, not filter. So you still need to read the details (just as with the for-loop), you are just first fooled into thinking its a filter (and chaining is very popular, so then you have no function names and no variable names). But perhaps the worst problem is the lack of error handling. Functional code is not meant to have side effects, and error handling is exactly that. If ‘apple’ is not found you get an ugly Error. You can of course try/catch, or make temporary array variable apples and check that its length is one, but people who prefer functional style usually dont do it.

I understand your objection: people can write crappy code with any language an paradigm and just becuase I have seen bad applications of filter does not mean there is anything wrong with filter or FP. Of course. The bad thing is recommending people to use it like a silver bullet. The bad thing is that good FP is actually difficult and junior programmers will get it wrong trying to be fashionable.

Here is another example to show I am not inventing this rant.

Functional is preferable-Hype
Another favorite example of this functional hype is from rosettacode. Look at this amazing collection of implementations of a simple algorithm: the Luhn Algorithm. I suggest:

  1. Read the introduction so you get some idea
  2. Look at the C-implementation: imperative and simple. Testable and no side effects.
  3. Look at the functional implementations: C++11, Common Lisp, Haskell, JavaScript (ES5.1 version), PicoLisp, Python, Rust, Scheme

Look at Scala: there are two versions, a Functional Style (recommended) and an Imperative style. The people att IOCCC would be jealous with this shit!

I can only come to one conclusion: the emperor is naked.

I mean, if you code in PicoLisp, for fun or for a good reason, and you have to implement the Luhn algorith, please do so! But to say “recommended” about the functional Scala code or to think that the C++11 code is in anyway more reasonable than the original C-code… it makes no sense. No real systems programmers will choose Rust over C based on this. And Python – a language known for its clarity… such a sad example.

Trendy fashionable “functional programmers” suck
So, the problem here is not primarily with Function Programming itself or the code that guru coders write with Functional Programming. The problem is that people with far too little theoretical background, training and experience follow the hype and the result is ugly.

Functional Programming Sucks (at)
There are some things that Functional Programming is not very good at: State and Side Effects. This is by design. From a FP perspective (and a general perspective as well) State and Side Effects are nasty and problematic: they should be avoided and when they cant be avoided they need special attention (like Monads).

The problem here is that JavaScript is mostly a programming language for the web. A simple (modern, single page) web application:

  1. Loads data from a server
  2. Presents data to the user
  3. Lets the user update the data
  4. Sends the data back to the server

This is (almost) all about state and side effects! You may have some functions for validation, filtering, sorting and calculations on the client, and those benefit from Functional Programming ideas. But your core enterprise is about state, side effects and error handling! Functional Programming is so unsuitable for this that a simple web page just can’t be written functionally, unless you start breaking rules. And this is of course what programmers end up doing because in the end of the day they have a real application to build for real users.

The difficult thing is to break the right rules in the right way at the right place for the right reason. This is architecture – mixing concepts, and designing how your application lives, its breaths and heartbeats. Architecture is difficult, and it is not made easier relying on unsuitable silver bullets.

Functional Reactive Programming
There is one thing that is even more hyped and even more theoretic than Functional Programming, and that is Functional Reactive Programming.

But it makes sense (as I understand it). It is not about taking Functional Programming one step further but about putting it in context. You have data that changes (signals, events, behaviours, whatever). That data can be pipelined with high quality through FP logic to produce the content of your GUI or the requests to the server.

Keep the functional parts functional, and let other parts of your application just deal with I/O, GUI and state. Dividing your code into separate modules with clear responsibilites and clear interactions have always been a good idea.

Performance
My experience is that when a real world application is not well received by the users a lot of the time its because performance sucks. When performance is bad usability is also bad, and stability gets bad (especially on the server side).

Functional programming is typically bad for performance.

  • anonymous functions can often not be JIT compiled and will never be optimized
  • recursion is nice, but rarely faster than a loop
  • the chaining concept creates arrays and arrays and arrays just for throwing away, which is fun for the garbage collector, but it is not fast
  • the immutable object concept is about generating new RO copies of object instead of changing objects, which is expensive and wasteful

Perhaps worse, since functional programming and proper use of and map(), filter(), reduce() and friends is hard (easy things get difficult) not so experienced programmers end up writing implementations with unnecessary computational complexity ( O(n) turns into O(n^2) ). It is not funny – you cant afford that for most anything that goes to production.

Agile, refactoring and generic code
It is HARD to design code properly from the beginning! Good objects, classes, functions, modules, packages, namings, dependency trees and architecture dont come for free! Agile and Refactoring are about accepting that the design will not be optimal the first time, thus we should not bother too much about it, but rather fix the code when we have learnt more about the problem and our code starts getting (too) ugly.

A strong argument for FP is that it is highly generic – which is true. But until a programmer has spent much time with her domain and problem she will not know what things can and should be made generic. Making things too generic is called overengineering, and it is perhaps the worst sickness in our industry.

I usually:

  • start with one source file rather than many
  • allow myself some copy-paste until I see what code really gets repeated
  • make my code as specific as possible, unless I see an obvious generalisation or the actual need for generalisation emerges
  • dont worry too much about global variables in the beginning (after a while there will be a natural place for them or for what the represent)
  • allow quite long functions until I see what parts of them actually do something meaningful on its own
  • code quite defensively with lots of error handling (it usually pays of quite quickly)

This works for getting quick practical results. Later, during refactoring, when the code base has grown, and when I have learnt more about the domain, I can break out pieces of critical code that creates nice generic functions. But thinking FP first – no way!

All FP-sucks related articles
Functional Programming Sucks (this one)
Underscore.js sucks! Lodash sucks!
Functional Programming Sucks! (it is slow)
Lodash Performance Sucks!