Category Archives: Programming

Arrow functions in JavaScript: A strategy

Arrow functions have been a part of JavaScript since ES6. They are supported virtually everywhere you run JavaScript, except in Internet Explorer. To be clear, arrow functions are:

(a,b) => a+b

instead of

function(a,b) { return a+b }

I like to make things simple, and

  1. my code sometimes runs on Internet Explorer
  2. arrow functions offer shorter and simplified syntax in some cases, but fundamentally you can write the same code with function
  3. I like to not have a build step (babel, webpack and friends) for a language that really does not and should not need one

so, until now I have simply avoided them (and kind of banned them, along with other ES6 features) in code and software I am responsible for.

However

  1. arrow functions (as part of ES6) are here to stay
  2. they offer some advantages
  3. Internet Explorer will go away.

so, it makes sense to have a strategy for when to use arrow functions.

What I find on the Internet
The Internet is full of sources telling you how you can use arrow functions, how to write them, what the pros, cons and pitfalls are, and what they cannot do.

  • The key difference is how arrow functions work with this.
  • The syntax is shorter, especially for single-argument (no parentheses needed), single-statement (no return needed) functions.
  • Arrow functions don’t work well with object-oriented things (such as constructors and prototype functions)

In short, there are some cases where you can’t use arrow functions, some cases where they offer some real advantages, but in most cases it makes little real difference.

Arrow functions allow you to chain sort().filter().map() in very compact ways. With simple single-statement arrow functions it is quite nice, as in the sketch below. But if the arrow functions become multiple lines I think it is poor programming.
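
For illustration, a hypothetical chain over an invented users array, where each arrow function stays a single statement:

const topNames = users
  .filter(u => u.active)                 // keep only active users
  .sort((a, b) => b.score - a.score)     // highest score first
  .map(u => u.name);                     // extract the names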

What I don’t really find on the Internet
I don’t really find good advice on when to use arrow functions and when not to use arrow functions. I mean, when I program, I make decisions all the time:

  • Should I break this code out into a function?
  • Should this be an object (prototype style) or just data?
  • Should I break this code into its own module?
  • Should I write tests for this?
  • Should I allow a simple, slower algorithm, or should I add effort and complexity to write my code faster?
  • What should be the scope of these variables?
  • Should this be a parameter or can it be hard coded?
  • Can I make good use of map/reduce/every and friends, or is it better I just use a loop?
  • Naming everything…
  • …and so on…

Using, or not using, an arrow function is also a choice. How do I make that choice to ensure my code is good? I don’t really find very clear guidelines or style guides on this.

Lambda functions in other languages
Other languages have lambda functions. Those are special case anonymous functions. The thing I find peculiar about the use of arrow functions in JavaScript is that they are often used instead of function, when a standard function – not a lambda – would have been the obvious choice in other languages.

Intention
For practical purposes most often function and () => {} are interchangeable. And I guess you can write any JavaScript program using only arrow functions.

When you write code, it mostly does not matter what you use.
When you read code, it comes down to understanding the intention of the writer.

So I think good use of arrow functions is a way that makes the intention of the code as clear as possible. I want clear and consistent guidelines.

Using arrow functions in well defined cases shows more intention and contributes to more clear code than never using them.

I tend to read arrow functions as a strong marker for functional programming. I find it confusing when arrow functions are used in code that breaks other good core principles of functional programming.

The strongest cases
The strongest cases for arrow functions I can see:

Minimal syntax (no () or {} required), where it is never worth breaking such a function out:

names = stuffs.map(stuff => stuff.name);

Callback: the arguments (error, data) are already given by openFile and the callback function cannot have a meaningful this. Also, for most practical purposes, the callback needs closure access to data in the parent scope, so it cannot be a named function declared elsewhere.

openFile('myFile', (error, data) => {
  ... implementation
});

When it makes little difference
For a regular function it makes no difference:

const swapNames = (a,b) => {
  let tmp = a.name;
  a.name = b.name;
  b.name = tmp;
}

The function alternative would be:

function swapNames(a,b) {

and is actually shorter. However, I can appreciate that with the arrow version it is completely clear from the beginning that a binding of this can never happen, that it cannot be used as a constructor and that there can be no hidden arguments (accessed via arguments).
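
To illustrate the last point, a small invented example: with function, inputs can arrive implicitly via arguments, while an arrow function forces them to be explicit.

function sum() {
  let total = 0;
  for (const n of arguments) total += n;  // implicit input via 'arguments'
  return total;
}

const sumArrow = (...args) => args.reduce((a, b) => a + b, 0);  // explicit rest parameter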

Confused with comparison
There are cases when arrow functions can be confused with comparison.

// The intent is not clear
var x = a => 1 ? 2 : 3;
// Did the author mean this
var x = function (a) { return 1 ? 2 : 3 };
// Or this
var x = a <= 1 ? 2 : 3;

Obfuscate with higher order functions
Higher order functions (map, reduce, filter, sort) are nice and can improve your code. But, carelessly used they can be confusing and obfuscating.

This is not the fault of () => {} in itself. But it is a consequence of arrow functions making higher order functions a little too popular.

I have seen for example (things like):

myArray.map(x => x.print())

map() should not have a side effect. It is outright obfuscating to feed a function that has a side effect into map(). And side effects have no place in functional programming in the first place.

I have also seen reduce() and filter() being used when every(), some() or find() would have been the right choice. It is obfuscating, it is expensive, and it produces more code than necessary.
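
A hypothetical example of the filter() variant of this mistake, with an invented users array:

// Obfuscating: builds a whole new array just to test for existence
const hasAdmin = users.filter(u => u.role === 'admin').length > 0;

// Clear: some() states the intent and can stop at the first match
const hasAdmin2 = users.some(u => u.role === 'admin');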

The use of arrow functions with higher order functions is only appropriate when the correct higher order function is used.

The abusive cases
Anonymous functions that are non-trivial and could clearly be named and reused (and tested) are clearly bad code:

myStuff.sort((a,b) => {
  if ( a.name < b.name ) return -1;
  if ( a.name > b.name ) return  1;
  if ( a.id   < b.id   ) return -1;
  if ( a.id   > b.id   ) return  1;
  return 0;
});

especially when the code is duplicated or the parent function is large.

An arrow-friendly policy
Admittedly, after doing my research I feel happier with arrow functions than I thought I would.

I suggest (as long as your runtime supports it) using arrow functions as the default function type. The reason for this is that they do less. I think the standard behavior of arguments, this and the OOP concepts (prototype and constructors) should be optional and require explicit use (of function).

Just as one-line if-statements and if-statements without {} should be used carefully (I tend to abuse them myself), I think the same applies to arrow functions.

I think this is excellent:

names = stuffs.map(stuff => stuff.name);

but apart from those common simple cases I think the full syntax should be used for clarity:

const compareItems = (a,b) => {
  if ( a.name < b.name ) return -1;
  if ( a.name > b.name ) return  1;
  if ( a.id   < b.id   ) return -1;
  if ( a.id   > b.id   ) return  1;
  return 0;
};

(don’t try to be clever by omitting (), {}, or return).

The use of function should be reserved for

  • constructors
  • prototype functions
  • functions that need the standard behavior of this
  • functions that do things with arguments
  • source files where function is already used exclusively

Basic good functional programming practices should be especially respected when using arrow functions:

  • Don’t duplicate code: break out anonymous functions into named functions when appropriate
  • Don’t write long functions: break out anonymous functions into named functions when appropriate
  • Avoid side effects and global variables
  • Use the correct higher order function for the job

Also, obviously, take advantage of OOP and function when appropriate!

Callback functions
I think anonymous callback functions should generally be kept short.

const doStuff = () => {
  readFile('myFile', (error, data) => {
    if ( error )
      console.log('readFile failed: ' + error);
    else
      doStuffWithData(data);
  });
};

const doStuffWithData = (data) => {
  ...
};

Performance
In principle, I see no reason why arrow functions should not be at least as fast as regular functions. In practice, the current state of JavaScript engines could be disappointing - I don't know.

However, a named static function is typically faster than an anonymous inline function. The JIT can optimize a function better the more it is run, so named and reusable functions are preferred.

I have made no benchmarks on arrow functions.

Feedback
I will start using arrow functions when I write new code and I feel enthusiastic about it. I will probably come across things I have not thought about. Do you have any thoughts on this? Let me know!

Want to be a programmer! Where to start?

Quite often I hear (read) someone who wants to become a programmer and asks where to start. Often, not always, they ask what programming language they should learn first. Sometimes they have decided for a language and they ask what operating system, tools and perhaps online services they should use. Sometimes the understanding of programming in particular and computers in general is vague.

The fascinating thing is that such questions can receive very different answers. Different working programmers have completely different ideas on how to become a programmer. Completely.

The most important thing
If you find a way to work with computers and code that keeps you entertained and thrilled, and you spend hours and days feeling curious and enthusiastic, this is a good way for you to learn! A way of learning that works perfectly for someone else, but does not make you enthusiastic at all, will probably not work well for you. Hard work and difficult things are a lot easier if it is fun and it makes sense to you!

A reading advice
When you read the rest of this text, don’t stop if there is a word you don’t understand! For a text written for beginners, it is full of words (interpreter, service, syntax and so on) that you may not be familiar with – at least not in this context. Ignore them and just read on. You can later look up the meaning (in the context of computers and programming) of those words on Wikipedia.

The programming ecosystem
Let’s say that programming is the act of making computers do stuff for people.

There is a stack of expertise involved in delivering a service or a product:

  1. Computer science: data structures, algorithms, information theory
  2. Coding: reading and writing code, thinking like a computer, getting it right
  3. Programming language: syntax, keywords and tools specific to a programming language
  4. Libraries and frameworks: code you can reuse to do more with writing less code
  5. The Internet: networking, protocols, formats, security, how it all works
  6. Development environment: your computer, its OS, and the tools you use to code
  7. Production environment: where your code runs, if it is some kind of service
  8. Deployment, test, lifecycle: how to continuously release new versions
  9. Data modelling: how to turn real world information into processable computer data
  10. Requirement analysis: understanding your customer and the market
  11. Team work: different people have different skill sets and work together

Obviously you are not going to have the same high expertise in each of the above areas. Perhaps you have a lot of passion for some things while you are completely uninterested in other things. That is fine.

There is a bit of a catch-22 here. When you already have knowledge you can get involved in a project or company, and work with just a few (or all) of the things above. But when you are a beginner, all those fields of knowledge are quite abstract and useless on their own. So to produce anything that is fun or slightly meaningful you want to work with the entire list, which is obviously kind of impossible (as a beginner). So it helps to be persistent, to like reading and details, and to have quite low expectations of what is fun and meaningful!

Programming is enormously rewarding for the brain. You set out to create something, you work on it, and it works. You get dopamine! You need to find a way to work and learn so you get rewarded often. It depends on your grit, but you should usually feel rewarded and experience success several times per day, both when learning and working.

So when learning to code, you need to find small contained projects that are simple and interesting enough to allow you to succeed and feel successful.

Common advice from programmers is to try different programming languages. I am not so sure about that. I think it is also very important to iterate often and fast from idea to “product”. With time, ideas will become bigger and more complex. To handle that, it makes sense to master a language, the tools and the ecosystem, rather than just learning more of them.

I will discuss a few platforms from a beginner’s perspective.

Arduino
You can buy an Arduino starter kit. It comes with everything you need (except a computer; Mac, Windows or Linux does not matter). It comes with a book of projects that take a few hours to complete. No previous knowledge is required; the Arduino is designed for non-programmers (children, artists) to create stuff. When you have completed the projects you can modify them and experiment. As you do this you will learn to write the code needed to achieve what you want.

The Arduino is a very self-contained ecosystem where you can iterate quickly. The code you will write is very basic C-code (actually C++). But you don’t need to know that or think about it.

Later when you want to write other code, not just for Arduino, most everything you have learnt on the Arduino is useful. But more complex ecosystems have many more aspects to consider.

Hackerrank.com
There are many such sites, but Hackerrank is the one I have experience with.

Hackerrank offers a wide range of “problems” to solve online in (almost) any programming language you want. It is free, requires nothing to be installed or configured on your computer, and you get (for training purposes) relevant, well defined problems and a contained environment to work with them.

Hackerrank is great for learning new languages, data structures and algorithms. You will need a reference or language tutorial elsewhere (but for relevant languages you can find one online). There are things you will not learn on Hackerrank: how to configure your own system, more advanced tools, code that interacts with the user, filesystem or network, and error handling. But it is quite fine to master a language and algorithms first.

iOS
I have no experience with iOS (or macOS) development. But if you have a Mac, an iOS device (iPhone or iPad) and you get a beginners book, you have everything you need to make real iOS apps that you can sell for money.

Apple also has a Swift Playgrounds app for iPad (Swift is the preferred programming language of iOS).

It seems like a good idea to me to learn to iterate from idea to working App in such a contained (and, for good and bad: walled, protected, restricted and designed) ecosystem.

Swift may not be the most useful language outside the Apple world. But it is a modern language that has much in common with other common languages (such as Java, C#, Rust, Python).

Automation with the shell
If your objective is to automate server configuration/operation/maintenance, look at bash for Linux and PowerShell for Windows. Don’t expect to become a “real” programmer, but it is the way to get your problems solved. Be sure to be aware of the commands/utilities available in your environment (use sort/grep instead of implementing similar functionality in bash).

Python
Python is a very good language to learn. It is a simple, clean, well documented, widely used language that works equally well in macOS, Linux and Windows.

Python is suitable for simple and advanced mathematical applications and simulations. It is suitable for parsing, processing and outputting data and to interface with databases: automation and integration.

Web
The web is generally a difficult ecosystem for a beginner. The problem is that many things come into play. Let’s say we want to write a simple shopping list. Typically you need to deal with a database to store data, backend code for APIs (with authentication/security), http for transporting data and html+css+javascript for the application itself. Also, you need to think about hosting and domain registration. You end up with several programming languages (for example SQL, PHP and JavaScript) even for a simple application. Not only is the web browser (http+html+css+javascript) a quite cumbersome programming environment, you also need to consider different web browsers.

Nevertheless, the web is probably the most relevant ecosystem to develop applications for! But perhaps you should not learn programming by coding for the web.

Web: WordPress
If you need to deliver websites in the form of a blog (like a little newspaper) or perhaps a little webshop, WordPress can be amazing!

Note that WordPress is based on LAMP (Linux, Apache, MySQL and PHP) which is a rather complex mess. But if you can ignore that (find a hosted solution, or just follow instructions without thinking and questioning too much) WordPress can be very productive. You will learn PHP and JavaScript as you need to do more advanced things. These are perhaps the worst two languages out there for the purpose of learning programming, but perhaps the most productive languages when it comes to delivering content and features.

Web: Node.js
You can build web applications with Node.js. The advantage is that you can use JavaScript both on the server and in the web browser and keep your toolbox smaller. However, it is very possible to grow your toolbox enormously with npm (the package manager for Node.js). I don’t think Node.js-based web applications, or JavaScript, are suitable for beginners. But if you are a beginner and you want to program web applications, it is probably your best choice.

Desktop
Perhaps 20 years ago, programming was much about building desktop applications (programs with a graphic user interface running in Windows, macOS or Linux). This is, I would say, quite a niche field in programming nowadays (more commonly, programmers develop applications for iOS/Android, for the web, or server code for internal use).

Games are obviously a significant part of Desktop programs.

Desktop is quite complex and qualified programming. If you want to do it for macOS only, get a Mac with Xcode and get a beginners book. If you want to write platform-independent desktop applications (Linux + macOS + Windows) have a look at Qt (which is, kind of, C++, and very nice). For Windows only, ask someone else.

If this is what you want to do, look at Hackerrank above, and stick to C, Swift (for macOS) or C# (for Windows) to first learn the fundamentals of programming. When you know more, go on experimenting with the desktop.

Android
I don’t know what is the best way for a beginner to program for Android. I would say, start coding for iOS to learn “mobile” (and you reach an equally big audience/market with iOS). When you are a proficient iOS developer, I think picking up Android is no big deal.

Very simple games
If you want to develop very simple (retro) games, have a look at PICO-8. It is a (non-free) programming environment for building simple games for a virtual game console. These games can be deployed to and played in a web browser or on most computers.

The language is Lua – a very simple language that is useful for other purposes as well.

Deep knowledge – computers and systems
If you want to understand computers, operating systems, security and the internet: learn C (not C++, not C#, not Objective-C, just C). To learn, I suggest you get some tutorial (like the book: Learn C the hard way). Make sure to know C99 (C is standardised – learn that and use it consistently)! I suggest you start with exercises or problems on Hackerrank (or a similar site or tutorial) until you get rather comfortable writing C.

All major operating systems are written in C, as is much of the infrastructure that operates the internet. C is “unsafe” and the cause of many security issues in computer systems. This means that to understand the nature of these problems it really helps to know C. Many other languages (or strictly, their runtimes/interpreters) are themselves written in C. They need to be, to talk to the operating system (which they need to do for most anything). C is not going away and it is fundamental to most every computer we see.

C++ is technically a superset of C (that is, C with more features – and a few exceptions). So it can appear that C++ is better. But they are two languages with very different “styles”, and you should solve problems very differently in them. C++ has merits of its own, but for the purpose of deep understanding of computers and operating systems, go for C. C# is a language that mostly resembles Java. Objective-C is also technically a superset of C, but it is a rarely used language that you most likely can ignore.

To go even deeper you can learn Assembly language. Most likely it makes no sense for you to do it. At least not in the beginning of your learning.

Deep knowledge – math and computer science
If you are fascinated with math and you like an academic approach to things you can look into functional programming. This is where programming gets beautiful – if you have sensitivity for that kind of aesthetics. But it is not where you solve most practical problems.

Haskell is for purists. LISP mixes pragmatism with myth. But many modern programming languages (for example Java, Swift, Rust, JavaScript, C++, C#) incorporate practical aspects of functional programming.

LISP (Common LISP to be precise) has very capable built in support for math (fractions, complex numbers, arbitrary large numbers). If you are a mathematician you may find most other languages unsatisfying.

While C more than anything else focuses on making a computer do exactly WHAT you instruct (program) it to do, functional languages are more like programming with mathematical definitions (functions).

Conclusions
When you know programming in general, you understand how the internet and a computer works, you are familiar with established standards and you know a few programming languages, it is pretty easy to learn new languages and tools.

So what language you learn first matters not so much. What matters is that you learn to go from idea to product, and that you know how to do things properly (write clean, efficient, effective, secure and correct code).

To do that, you more than anything else need to work with things that you find challenging, interesting and fun.

Programming is so much more than programming languages: it is about attention to details, understanding the real world, understanding people, making beautiful things, keeping things simple and trying often and failing fast.

Minimalistic Services and Applications

Question: There are plenty of documentation, patterns, architectures and practices for scaling up your cloud Services and Applications solution. But how do I scale it down?

In 2015 I set up a minimalistic architecture for delivering Services and Web Applications. It was based on 15 years of experience (not only positive) of constructing and operating applications, services, servers and integrations. Now in 2018 I can say that my architecture is doing very fine. I have continuously been delivering business value for 3 years. I will share some principles and technical details.

Limitations and reservations
Not all solutions are good for everyone. Neither is mine. If you know that you want worldwide internet scalability my architecture is not for you. But often you are building applications that are internal for your organisation. Or you have a local/regional physical business that you need to support with services and applications. Then you know that there is a practical upper limit of users that is not very high.

While modern cloud services are supposed to scale more or less without limit, this does not come for free. It comes with complexity and drawbacks that you may not want to pay for, since you are anyway not aiming for those gigantic volumes of users and data.

The architecture I am presenting is designed both to perform and to scale. But within limits. Know your limits.

Microservices
Microservices is about many things. I have made a practical interpretation.

My… delivery platform… consists of microservices that communicate with each other. They should have limited responsibilities and they should not grow too big. Each service should store its own data, really. Two different services should share data via their (public) APIs, never by using a shared storage.

I ended up with a separate Authentication Service (knowing about users and credentials) and Roles Service (knowing about roles/privileges granted to a user). In hindsight perhaps this could, or should, have been just one service. On the other hand, if I want to store something like personal Settings/Preferences for each user, perhaps it is good that it does not go into a common single User service that grows more complex than necessary.

As you may know, there is another Microservice principle about each service being able to run in multiple instances and that (via Event Sourcing and CQRS) state is not immediately consistent, but eventually consistent. I normally outright break this principle saying that a single service has a single instance holding the single truth. I feel ok doing this since I know that each service is not too big, and can be optimized/rewritten if needed. I also feel ok doing this because I know I save a lot of complexity and my approach opens up for some nice optimizations (see below).

It is all about HTTP APIs
My microservices talk to each other over HTTP in the simplest possible way. The important thing is that your web applications, native (mobile) applications, external partners and IoT-devices use the same APIs.

I want it to be trivial to connect to a service using wget/curl, an Arduino, or any left behind environments my clients may be using. I also want any server platform to be capable of exposing APIs in a conforming way.

What I basically allow is:

http://host:port/ServiceName/Target/Action?token={token}&...your own parameters

Your service needs to have a name and it is in the URL. Target is something like Order or Customer. Action is something like Update or Cancel. token is something you need to obtain from the Authentication Service before making any calls. You can have extra parameters, but for more data it is preferable to POST a JSON object.
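
For example, a call to a hypothetical Order service could look like this (all names and the token are invented):

http://localhost:8080/Order/Customer/Update?token=abc123&customerId=42

with a JSON body POSTed for anything more than a few parameters:

{ "name": "New Name", "email": "new@example.com" }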

I don’t want any extra headers (for authentication, cookies or whatever), but I respect Content-Type and it should be correct. Absolutely no non-standard or proprietary headers.

I only use GET and POST. It just doesn’t get clear and obvious enough if you try to be smart with PUT and DELETE.

For things like encryption (HTTPS) and compression (gz) I rely on nginx.

Reference Implementation
The above principles constitute the Architecture of a number of services together making up a virtual application and service platform. As you can see you can build this with almost any technology stack that you want. That is the entire point!

  • You may want to make API calls from known and unknown devices and systems in the future
  • You may want some legacy system to be part of this virtual delivery platform
  • You may want to build some specific service with some very specific technology (like a .NET service talking to your Active Directory)
  • You may find a better technology choice in the future and migrate some of your services from current technology

But for most purposes you can build most services and applications using a few, simple, free and powerful tools. More important than the tools themselves are established standards (HTTP, HTML, JavaScript and CSS) and principles about simplicity and minimalism.

JSON and JavaScript
After working for years with integrations, web services (SOAP), XML, SQL databases and .NET I can say that the following type of technology stack is common:

  1. Web application is written in JavaScript, works with JSON
  2. Web application communicates with server using XML
  3. Server processes data using .NET/C#
  4. Data is persisted using SQL queries and a relational database

This means that a single business object (such as an Order) has data representations in SQL, C#, XML and JSON. This means that you have several mappings or transitions in both directions. You also cannot reuse business logic written in SQL, C# or JavaScript in another layer of your application.

With Node.js you have the opportunity to do:

  1. Web application is written in JavaScript, works with JSON
  2. Web application communicates with server using JSON
  3. Server processes data using JavaScript and JSON
  4. Data is persisted in JSON format (either in files or a database like MongoDB)

This is simply superior. A lot of problems just disappear. You can argue about Java vs C#, HTTP1 vs HTTP2, Angular vs React and things like that. But you just can’t argue about this (the fundamental advantage a pure JS stack gives you – not because JavaScript is a superior language but because it is the de facto language of the web).

So my reference platform is based on Node.js and I store my data in JSON.

Binary formats
Binary formats have their advantages. It is about power and efficiency. But the differences are rarely significant. Base64-encoding is 33% more expensive than the original binary. Compiled languages are somewhat faster and use less memory. But humans can’t read binary. The compilation (or transpilation) into binary (or machine-generated code) is not only an extra step requiring extra tools. It also creates a longer distance between the programmer and source code on one hand, and the execution and its error messages on the other. Source maps are a remedy to a problem that can be avoided altogether.

I was once responsible for a .NET solution (with Reporting Services) that was so hard to change and deploy that we eventually refused to even try. I realised that if the system had been coded in the worst imaginable PHP I could have made a copy of the source (in production), modified the system (in production) and restored the system if my changes were not good.

Similar problems can appear with databases. Yes, you can make a backup and restore a database. But how confident do you feel that you can just restore the database and the system will be happy? What is IN the database backup/restore, and what is configuration outside that database that might not be trivial or obvious to restore (access rights, collation settings, indices and stored procedures, id-counters, logging settings and so on)?

So my reference platform minimises the use of binary formats, build steps and databases. I code plain JavaScript and I preferably store data in regular files. Obviously I use native file formats for things like images and fonts.

Storage
Some applications have more live data than others. I have rarely come across very large amounts of transaction or record data. I have very often come across applications with little data (less than 100MB) and a truly complex relational database. I have also seen relational databases with not too much data (as in 1GB) with severe performance problems.

So, before architecting your solution for 10-100GB+ data, ask yourself if it can happen. And perhaps, if it eventually happens it is better to deal with it then?

Before constructing a relational data model with SQL, ask yourself if it is really worth it.

Since we are using a microservice strategy and since services share data via their APIs, two things happen:

  1. Most services might get away with very little (or no) data at all (while some have much data)
  2. A service that later turns out to need to deal with more data than it was first built for can be refactored/rebuilt without affecting the other services

So I suggest, if in doubt, start small. What I do (somewhat simplified, and sketched in code after the list) is:

  1. Start up Node.js service
  2. Load data from local files into RAM
  3. All RO-access is RAM only
  4. When data is updated, I write back to file within 10s (typically all of it every time, but I keep different kinds of data in different files).
  5. Flush data before shutting down
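
A minimal sketch of steps 1-5 for a hypothetical service (the file path and the Order data are invented):

const fs = require('fs');

const FILE = 'dev.data/Order.localstorage/orders.json';  // invented path

// 1-2. Start up and load data from local file into RAM
let orders = fs.existsSync(FILE)
  ? JSON.parse(fs.readFileSync(FILE, 'utf8'))
  : [];

// 3. All RO-access goes straight to the in-memory array
const getOrder = id => orders.find(o => o.id === id);

// 4. On update, mark data dirty; a timer writes everything back within 10s
let dirty = false;
const addOrder = order => { orders.push(order); dirty = true; };
setInterval(() => {
  if (!dirty) return;
  fs.writeFileSync(FILE, JSON.stringify(orders));
  dirty = false;
}, 10000);

// 5. Flush data before shutting down
process.on('SIGINT', () => {
  if (dirty) fs.writeFileSync(FILE, JSON.stringify(orders));
  process.exit(0);
});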

This has an advantage which is not obvious. JavaScript is single-threaded (but Node.js has more threads) so a single request is guaranteed to finish completely before the next request starts (unless you make some async callback for waiting or I/O). This means that you have no transaction issues to deal with – for free – which significantly simplifies a lot of your request handling code and error handling.

Another advantage is that RAM is extremely fast. It will often be faster and cheaper to “just access all the data in RAM” than to fetch a subset of the data from a database and process it.

This may sound like “reinventing the wheel”. But the truth is that the above 1-5 are very few lines of quite simple code. You can use functions like map(), reduce() and filter() directly on your data without fetching it (async) first. That will save you lines of code.

Again, this may not work for all your services for all future, but it is surprisingly easy and efficient.

Code, Storage, Configuration and installation
When I check out my (single) git repository I get something like:

packages/                         -- all my source code and dependencies
tools/                            -- scripts to control my platform,
                                     and a few other things

I then copy the environment template file and run install (to make node_modules from packages):

$ cp tools/env-template.json dev.json
$ ./tools/install.sh

This config file can be edited to replace “localhost” with something better and to decide which services should run on this machine (here, in this directory) and where other services run if I use different machines. Now I start the system, and now I have:

$ node tools/run dev.json ALL     -- use dev.json, start ALL services

dev.data/                         -- all data/state
dev.json                          -- all environment configuration
node_modules/
packages/                         -- all code
tools/

I can now browse the services on localhost:8080, but before logging in I need to create an Admin user using a script in tools (that just calls an API function).

Notice how easy it is to start a new environment. There are no dependencies outside packages. You may create a dev-2.json, which will then live in dev-2.data side by side with dev. To back up your state you can simply back up dev.data and move it to any other machine.

Let’s have a look at dev.data (the files for one service):

Authentication.localstorage/     -- all data for one service
Authentication.log/              -- a log file for one service (kept short)

In packages you find:

common/                          -- JavaScript packages that can be used
                                    on Node as well as web
node/                            -- Node-only-packages
services/                        -- Node-packages containing services
web/                             -- JavaScript packages that can be used
                                    on the web only

You should include tests on different levels (unit, integration) in a way that suits you. The above is somewhat simplified, but on the other hand, in hindsight I would have preferred some things to be simpler than I actually implemented them.

Notice that there are no build scripts and no packaging required. All node code is executed in place and web applications load and execute files directly from packages/.

Serving files, input validation, proxy and nginx
Node.js is very capable of serving files (and APIs) just as it is. I have written a custom Node.js package that services use to handle HTTP requests. It does:

  • Validation of URLs (that they conform to my standards)
  • Authentication/authorization
  • Deciding whether the request is for a file or an API call
  • Files: serve index.js from common/ and web/, and www/ with all contents from all packages
  • APIs: validate target and action (so they exist), validate all URL parameters (dates, numbers, mandatory input, and so on)

This may seem odd but there are a few good reasons for doing exactly this.

  1. Service APIs and policies are metadata-driven
  2. Consistent good logging and error messages
  3. Consistent authorization and 401 for everything questionable (both for files and APIs)
  4. The same service serves both the API and the www-files, which eliminates all need to deal with cross-site issues (about the least value-adding activity imaginable)
  5. Consistent input validation (if there is anything I don’t trust people to get right every time they write a new service, this is it; a sketch follows below)
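
A minimal sketch of metadata-driven input validation (points 1 and 5 above); all names are invented:

// Invented metadata describing one service's API
const apiMeta = {
  order: {
    update: { id: 'number', name: 'string' },
    cancel: { id: 'number' }
  }
};

// Validate target, action and URL parameters before dispatching
function validateCall(target, action, params) {
  const meta = apiMeta[target] && apiMeta[target][action];
  if ( !meta ) throw new Error('Unknown API: ' + target + '/' + action);
  for ( const key of Object.keys(meta) ) {
    const value = params[key];
    if ( undefined === value )
      throw new Error('Missing mandatory parameter: ' + key);
    if ( 'number' === meta[key] && isNaN(+value) )
      throw new Error('Parameter ' + key + ' must be a number');
  }
}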

You can probably do this on top of Express, or with Express, if you prefer not to use Node.js standard functionality.

At this point, each service listens at localhost:12345 (different ports) so you need a proxy (nginx) that listens to 80 and forwards to each service (remember the service name is always in the URL).

I prefer each service to handle all its API calls. Quite often it just forwards them to another service to do the actual job (let’s say a user action of the Order service should create an entry in the Log service: the Order web UI calls Order/log/logline, which in turn calls the Log service). This can be achieved very easily: after authentication/authorization you just send the request through (standard Node.js does this easily).
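
A sketch of such pass-through forwarding using only standard Node.js (the service names, port and path rewriting are invented):

const http = require('http');

// Forward an already authorized request from the Order service
// to the Log service, streaming the body through untouched.
function forwardToLog(req, res) {
  const proxyReq = http.request({
    host: 'localhost',
    port: 12346,  // the Log service's port (invented)
    path: req.url.replace(/^\/Order\/log/, '/Log/log'),
    method: req.method,
    headers: { 'Content-Type': req.headers['content-type'] || 'application/json' }
  }, proxyRes => {
    res.writeHead(proxyRes.statusCode, proxyRes.headers);
    proxyRes.pipe(res);
  });
  req.pipe(proxyReq);
}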

Dependencies
The web has more npm packages than anyone can possibly want. Use them when you need (if you want, read Generic vs Specific Code, Lodash and Underscore Sucks, …).

My biggest fear (really) is to one day check out the source code on a new machine and not be able to install dependencies, build it, test it, run it and deploy it. So I think you should get rid of dependencies and build steps, and rather focus on testing, running and deployment.

I think, when you include a dependency, place it in packages/ and push it to your repository. Then you are in control of updating the dependency when it suits you. New dev/test/prod machines will get your proven and tested versions from packages/, regardless what the author did to the package.

This approach has both advantages and disadvantages. It is more predictable than the alternatives and I like that more than anything else.

Error handling
I take error handling seriously. Things can get strange in JavaScript. You should take the differences between numbers and strings, objects and arrays seriously (that’s why you should not use Lodash/Underscore). There are no enums to safely use with switch-statements. I often add throw new Error(…) to code paths that should not happen or when data is not what I expect.
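
For example, a hypothetical guard on a switch-statement, in the absence of safe enums:

// Invented example: fail loudly on states that "cannot happen"
function statusLabel(status) {
  switch (status) {
    case 'open':   return 'Open';
    case 'closed': return 'Closed';
    default:
      throw new Error('statusLabel: unexpected status: ' + status);
  }
}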

On the (Node.js) server I don’t have a big try-catch around everything to make sure the server does not crash. I also don’t restart services automatically when they fail. I write out a stack-trace and let the server exit. This way I always work with a consistent, correct state. Critical errors need to be fixed, not ignored. This is the Toyota way – everyone has a red button to stop production if they see anything fishy. In effect my production system is among the most stable systems I have ever operated.

Validation, models and objects
Data validation is important. Mostly, the server needs to validate all data sent to it. But good UX requires continuous validation of input as well.

I put effort into defining models (basically a class in an OO language). But since my data objects are regularly sent over the network or fetched from disk I don’t want to rely on prototypes and member functions. I call each object type a model, and early on I write a quite ambitious validation function for each model.
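
A minimal sketch of such a validation function, for an invented Order model:

// An Order is plain data: validated, not instantiated
function orderValidate(x) {
  if ( 'object' !== typeof x || null === x ) throw new Error('order: not an object');
  if ( 'string' !== typeof x.id )            throw new Error('order: id must be a string');
  if ( 'number' !== typeof x.amount )        throw new Error('order: amount must be a number');
  if ( !Array.isArray(x.lines) )             throw new Error('order: lines must be an array');
  return x;
}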

Sharing code between Node.js, web (and AngularJS)
I want my code (when relevant) to be usable both on Node.js and the Web. The Web used to mean AngularJS, but I have started moving away from it.

This is what I do:

 /*
  * myPackage : does something
  *
  * depends on myUtil.
  */
(function() {
  'use strict';

  function myFactory(myUtil) {

    function doSomething(str) {
      ...
    }

    return {
      doSomething : doSomething
    };
  }

  if ('undefined' !== typeof angular) { // angular
    angular.module('mainApplication').factory('myPackage',
                  ['myUtil',
          function( myUtil ) {
      return myFactory(myUtil);
    }]);
  } else if ( 'undefined' !== typeof MYORG ) { // general web
    MYORG.myPackage = myFactory(MYORG.util);
  } else if ( 'undefined' === typeof window ) { // nodejs (probably)
    module.exports = myFactory( require('common/util') );
  } else {
    throw new Error('Neither angular, node or general web');
  }
})();

This way exactly the same source code can be used both on the web and in Node.js. It requires no build step. The “general web” approach relies on a global object (call it what you want) and you may prefer to do something else. You just need to make sure you can serve common/util/index.js and common/mypackage/index.js to the web.

Scaling and cloud technology
For a simple development system, perhaps for a test system or even for a production system, everything can live in a single folder. If you need more power or separation you can put each service in a Docker container. You can also run different (groups of) services as different users on different machines.

So, the minimalistic architecture easily scales to one service per machine. In practice you can run a heavy service on a single machine with 16GB RAM (or more), which will allow for quite a lot of RW-data. 16GB or more of RAM is quite cheap compared to everything else.

Scaling and more data storage
There are many other possible strategies for a service that needs more storage than easily fits in RAM (or can be justified in RAM).

Some services (like a log) are almost exclusively in write mode. You can keep just the last day (or hour) in RAM and just add a new file for every day. It is still quite easy and fast to query several days of logs when needed.

Some services (like a customer statistics portal) have mostly RO-data that is not regularly accessed, and that lives in “islands”. Then you can have (load from other systems) a JSON file for each customer. When the customer logs in you load that file into memory, and later you can simply release that memory. Such a service can also be divided into several services: 1 main RW, 1 RO (A-L), 1 RO (M-Z).

Some services will do expensive processing or perhaps expensive communication/integration with other systems. Such processing or integration can be outsourced to a dedicated service, freeing up resources in the main service. If you for example generate a PDF, make sure you do it in a process outside Node.js.

In the same way a service can offload storage to another service (which could possibly be a MongoDB).

Web files (html, css, images, js) can be cached by nginx (if you accept to serve them without authentication) and served virtually for free even if your service has full control.

Things like logging can also be outsourced to a dedicated and enterprise class logging software. Nevertheless, it is good to have a simple reference Node.js logging service that can be used for development purposes locally.

Finally, GDPR indicates that you should throw away data. You can also move data from a live system to a BI-system or some Big Data tool. Perhaps your architecture does not need to support data growth for 10+ years – perhaps it is better it does not.

Scaling – conclusion
These scaling strategies may not sound too convincing. But the truth is that building your entire system as a single very powerful monolith is probably going to be less scalable. And building everything super scalable from the beginning is not easy or cheap (but if that’s what you really need to do, go ahead).

Integration testing
Notice how integration testing can be achieved locally, automated, with virtually no side effects:

  1. Generate an integration-env.json
  2. Start up services (as usual)
  3. Run tests to inject data into the services (through standard APIs)
  4. Run tests to read data, query data
  5. Shut down services
  6. Remove integration-env.json and integration-env.data/

Source control and repositories
For now, I have all code in a single git repository. It would be easy to use multiple repositories if that simplifies things (when developing multiple independent services at the same time). Linux is in a single git repository so I think my services and applications can be too.

Tooling
All developers prefer different tools and I think this should be respected. I also think coding style does not need to be completely consistent across services (although single files should be kept consistent).

But just as developers should be allowed their own tools, the artifacts of those tools should not make the repository dirty. And the next developer should not need to use the same tools as the previous to be able to keep working on the code.

Web frameworks
If I mastered direct DOM-manipulation I would probably suggest that you should too (and not use any web frameworks). However I have been productive using AngularJS (v1) for years. Since AngularJS is inevitably getting old I have started using Vue.js instead (which I think is actually a better choice than Angular, however check my post about loading vue templates).

React is also a fine framework but it requires a build process. For my minimalistic approach that is a very high and unnecessary addition of complexity. I don’t see any indications that React is fundamentally more productive or competent than Vue.js, so I think you are fine with Vue.js (or jQuery or Vanilla.js if you prefer).

Performance
I have, to be honest, not had the opportunity to add very many simultaneous users to my system. On the other hand I have used it for rather mission critical services for 3 years with very few issues. So this architecture has served me well – it may or may not serve you well.

My production environment consists of a single VPS with 1 core, 2GB RAM and 20GB storage. Performance is excellent and system load minimal.

Missing Details
Obviously there are a lot of details left out of this post. You don’t have to do things exactly the way I did. I just want to outline an architecture based on minimalistic principles. The details of users, authentication, logging and naming conventions are of course up to you to decide.

Feel free to ask though! I am open to discuss.

Conclusion and final words
I wrote this post quickly and I will probably add more content in the future (and correct/clarify things that could be improved).


Code Reuse: Generic vs Specific code

Code reuse is probably the most important key to productivity and quality when programming. Node.js is very successful, partly because NPM makes it simple to reuse vast amounts of packages (of varying quality).

However, a project can suffer from too many (external) dependencies which can cause problems with quality, complexity, reliability, maintainability and flexibility: the problems the external code was supposed to solve, not cause.

Generic code
Being generic is generally good when it comes to code. A successful (as in many downloads) NPM library is probably quite generic (like Express, Moment or Bootstrap), otherwise it would not appeal to a large number of developers. One problem with generic functionality is that when used, it is often not the easiest and most compact way to do things.

Specific code
Being specific means that the code (especially its API) is shaped exactly for what is needed in the current project. It may not be very useful outside this project (or organisation), or most other organisations would choose to do things differently and would not agree with it. However, for the project it is being used in, it maximizes reuse.

Example: Bootstrap
Imagine you use Bootstrap and you have a lot of buttons that look the same. You use standard Bootstrap (not to confuse other people or reinvent the wheel) and may end up with exactly the following in many places:

  <button type="button" class="btn btn-outline-danger btn-sm">...

This is 60 characters, repeated exactly all over the application. How about:

  <button class="button-red">...

Now, the first is the result of using Generic and widely reusable classes. The second is the result of deciding that in this application there are a few button types only, each with a specific custom class.

The more specific way is more compact, less prone to typing mistakes and clearer. It is simply better because it maximizes reuse.

Example: Moment
Imagine in your application that in many places you produce date strings of the form “2018-12-24”. Not wanting to use the annoying (ES) standard library you use Moment.js.

  var datestr = moment(somedate).format('YYYY-MM-DD');

This is admittedly as good as a generic library gets, but if you do this in multiple places you can be more specific:

  var datestr = dateToStr(somedate);

It is clearly more compact and consistent and less prone to error. It is simply a better way to write code. The authors of Moment can’t make such a specific dateToStr function because it makes no sense for a lot of use cases. But in your project it makes a lot of sense (to effectively restrict date strings to a single format).
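
A sketch of such a function, assuming Moment is still used underneath:

  // Project-specific: there is exactly one date string format in this project
  function dateToStr(somedate) {
    return moment(somedate).format('YYYY-MM-DD');
  }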

Making your specific choices
My point is that in your project you make specific choices and the code you write should conform to these choices automatically (by reusing your own function) not by passing (the same) arguments to functions too generic for your needs.

Wrapping or writing your own
What I find sometimes happens when I use generic code is something like:

  1. I start using it naively
  2. I get too much redundancy and my code is not as compact as I want
  3. I start wrapping the generic code with my own interfaces
  4. The generic code is not capable of what I want
  5. I extend my own interfaces to do things on their own
  6. I realise I don’t need the generic code at all

I wrote a wrapper around Angular (v1) $http because a lot of my parameters were always the same. Later I wanted to use my wrapper in an application that did not use Angular at all, so I reimplemented my wrapper using XMLHttpRequest (the JS standard library). In the end the code was just marginally larger and more complex than it was when it depended on $http and it had 0 dependencies. So now I have my own http-request-library, but it is arguably too specific and narrow for use in other places.

So one problem with using generic code (first) is that you don’t know how much value it creates. I thought $http did great stuff for me, when in fact it was easily removed and I might as well have been using XMLHttpRequest from the beginning.

How much do you save?
If you want to travel from Paris to San Francisco, would you:

  • Use a free train ticket Paris-London?
  • Use a free flight ticket to Dallas?
  • Use a 25% discount coupon with Icelandic Air (flying via Reykjavik)?
  • Rather be in Bogotá in the first place?

The fact is that if Air France sells direct tickets Paris-SF, none of the above offers might really be worth it. I think it is the same with programming. Using 10 dependencies, each to save a few lines of code, in a rather simple project, is in a way the equivalent of flying from Paris to San Francisco with 10 connections for $1.

Please notice that I don’t intend to be ridiculous about this. You should use standard browsers available, web servers, compilers, linters to check your code, an editor that makes you productive, appropriate testing tools and so on. I am not saying write everything in C or asm.js from scratch! I am saying, understand that the standard tools (browsers, Node.js) and the real published standards (ES, HTTP, HTML, CSS) offer (most) everything you actually really need. They are the equivalent of airlines, airplanes, passports and open borders: the things that make it possible for most people in the West to safely cross the Atlantic ocean for fun, something unimaginable 200 years ago. It is not the bonus programs, airport lounges or in-flight entertainment that make a significant difference when travelling between continents.

A simple test case
I decided to write a very simple responsive web page. It allows you to enter a date and obtain the weekday. It is not a datepicker test. I use:

  • Moment to validate input date YYYY-MM-DD
  • Moment to calculate week day
  • Bootstrap to make the application responsive (it looks different on a mobile phone in portrait mode)
  • Vue.js for DOM manipulation

I then removed Bootstrap and Moment and used CSS and standard library Date instead.

When you try them I think you will find that they are essentially equivalent. Both are crude and require more work before the UX is perfect for all screen sizes. I did not intend the non-Bootstrap version to be an exact copy of the Bootstrap version (they both have their pros and cons).

Bootstrap: benefits and implications
For this simple application all I needed was the minimised Bootstrap CSS (144k, larger than Vue, Moment and my own code together); no JavaScript code.

With Bootstrap my CSS was 10 lines. Even for this trivial application Bootstrap did not satisfy my requirements (this may be due to my ignorance when it comes to Bootstrap). The HTML code was 37 lines.

Without Bootstrap my CSS was 39 lines and the HTML code was 27 lines. This basically means that writing my own (more specific, reusable) CSS allows for more compact and consistent HTML.

I am not saying this is real evidence for anything or that the solutions are perfectly equivalent. I am just saying that Bootstrap is not like getting from Paris to San Francisco for free. It is more like starting in London instead of Paris. And you will hardly write more compact HTML with Bootstrap than you would with a well crafted project-specific CSS.

Moment.js: benefits and implications
I will just show the code. With Moment:

      function dateValidate(d) {
        return d === moment(d, 'YYYY-MM-DD').format('YYYY-MM-DD');
      }
      function dateToWeekday(d) {
        return moment(d, 'YYYY-MM-DD').format('dddd');
      }

As you can see, I wrapped Moment to make the component code cleaner. Using standard library Date instead of Moment:

      function zPad(x,l) {
        var r = '' + x;
        while ( r.length < l ) r = '0' + r;
        return r;
      }
      function jsDateToStr(d) {
        return zPad(d.getFullYear(),4) + '-'
             + zPad(d.getMonth()+1 ,2) + '-'
             + zPad(d.getDate()    ,2);
      }
      function strToJsDate(s) {
        return new Date( +(s.substr(0,4))   ,
                         +(s.substr(5,2))-1 ,
                         +(s.substr(8,2)) );
      }
      function dateValidate(d) {
        return d === jsDateToStr(strToJsDate(d));
      }
      function dateToWeekday(s) {
        return ['Sunday','Monday','Tuesday','Wednesday','Thursday',
                'Friday','Saturday'][strToJsDate(s).getDay()];
      }

In a real project you would probably have a pad function available, but here I wrote one. And perhaps there are better ways to use the standard library.

My point here is that if you don’t do much Date-manipulation, using Moment eliminates 10 lines of code (at the cost of an external dependency of 51k minimized). These 10 lines of code are something every programmer should be very capable of writing, they are easily testable, and highly reusable.

Vue.js
How about eliminating Vue.js as well? The ironic thing is that I don’t know how to do direct DOM manipulation (with or without jQuery). So I just presume that Vue.js is worth it at 87k minimized.

If I did know how to eliminate Vue.js I would at least have a clue about what its value actually is. Now I just trust I am better off with it than without it. And that is what I encourage you not to do about every external dependency that could be used in your project.

More Reading
I found this blog post interesting.

BTW: Bootstrap sucks
I am not claiming that my no-Bootstrap version is perfect. But I think Bootstrap gets responsive UX wrong. My little application looks ridiculous on a really wide screen. I don’t need room for more than 10 characters, and the buttons don’t need to be spread out. A good responsive design reorganises blocks of content without changing the size of elements such as buttons.

BTW: Moment is slower
I made some benchmarks of my date validation functions based on Moment and based on the standard library. Using Moment roughly doubled execution times. On my Chromebook I could do about 10000 validations per second, which is not a huge number. There might be better ways to check if a date is valid (both with Moment and with the standard library), so perhaps this is nonsense.
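A benchmark of this kind can be as simple as the following sketch (absolute numbers obviously depend on machine, engine and which dateValidate implementation is loaded):

      var t0 = Date.now();
      var i, valid = 0;
      for ( i=0 ; i<100000 ; i++ ) {
        if ( dateValidate('2018-02-2' + (i % 10)) ) valid++;  // 02-20 .. 02-29
      }
      console.log( Math.round( 100000 / ((Date.now()-t0) / 1000) ) + ' validations/s' );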

On Grit and becoming a better programmer

I have read the book Grit by Angela Duckworth. It brought some obvious (or not) things to my attention:

1. To really get better at something you need to challenge yourself, try more difficult things, not just repeatedly do what you are already capable of. (my words, not a quote from the book)

If you think of an athlete, a high jumper, this seems very obvious (you are never going to jump 2.00m if you keep practicing 1.50m every day).

2. Mastering something is about allowing yourself to dig deeper, getting more depth and seeing more details (than a novice).

If you think of a sports commentator (making remarks about subtle technical details in figure skating or gymnastics), this also seems fairly obvious.

What are programmers told to learn
I often hear advice to programmers about how to learn and work. I would say it is mostly about trying new things:

  • Learn new programming languages
  • Learn new tools and libraries (that simplify things)

While these are obviously good things to learn, it is neither very challenging nor does it take you much deeper. And when it comes to tools and libraries that simplify things, you perhaps trade deep understanding for easy productivity.

It is not very often I hear the advice to try to solve a hard and challenging problem using tools you already know very well. And it is also not very common that I hear the advice to go into very intricate details about anything.

Programmers seem to value, or be valued for, knowledge of things that allow them to find shortcuts or ready building blocks:

  • Libraries and frameworks – to write no code, less code or avoid writing difficult code
  • Different programming languages – to pick a language that makes it easy
  • Patterns and methodology – to avoid difficult technical analysis and design
  • …and soft skills, of course

All these skills are quite easily described in a CV. But none of them is particularly difficult or challenging.

What is truly hard about programming
To implement a correct solution a programmer needs to:

  • Understand the problem or problem domain correctly (and perhaps completely)
  • Come up with a solution and software architecture that can actually solve the problem
  • Go through the labour of correctly crafting the required code
  • …and this is of course done iteratively, learning and adapting along the way (because getting everything right from the beginning is often impossibly hard), so you need to make decisions that are incorrect or insufficient for now, yet lead you in the right direction

This can perhaps be called problem solving or seniority in a CV, but problem solving is a rather abstract cliché and seniority is often measured in years more than anything else. It can also appear to be covered by things like requirements analysis, patterns, TDD and agile. But these things are about how to plan, facilitate and manage the difficult things. You can know a lot about TDD without being able to write test cases that describe the problem domain correctly, and without being able to implement an algorithm that solves the problem sufficiently well.

A balanced training
Back to athletes. Golfers (let's call them athletes) used to not be very fit. Then came Tiger Woods. Since then all (top) golfers go to the gym (to be able to compete). To me, this is like being a good programmer: if you don't know git you are simply not very competitive.

But golfers spend most of their time mastering their swing (or in the gym, or with a shrink). They don't also do horseback riding, pole vaulting and marathon running. Or if they do, they at least don't think it is key to becoming a better golfer. But when it comes to programmers this is often what we do: learn a new language, a new framework or a new service. As if it would make us significantly better (even though it is no challenge at all, just some time spent).

No similes (or metaphors) are perfect. Golf is not programming. Most programmers don’t aspire to be among the best in the world. But I think the question is worth asking:

Do I, as a programmer, have the right mix of hard/challenging practice and trying/learning new stuff?

Learning in the workplace
In our workplaces they don't want us to work with things that are so challenging that we might very well fail. That is a project risk. And IT projects fail far too often for anyone to be satisfied. It is not at all strange that organisations want us to work with things that we know, and otherwise mitigate the risk by making things easier. But do we learn this way? And do we, 5-10 years down the road, reach our potential and develop the capabilities that would benefit us the most?

Is there a genuine conflict between making things as easy and productive as possible on one hand, and improving skills on the other?

For whom do we learn?
I don’t know if programmers who challenge themselves and become masters with deep knowledge are rewarded for it. I don’t know if most organisations even want such programmers. I already hear the complaints:

  • No one else understands her code (but if the problem was THAT hard, what was the poor master going to do?)
  • She is just inventing things instead of using standard tools
  • She is not following best practices

Also, who will ever know what a great job such a programmer does? It is like:

  1. This integration routine is unstable and too slow, can you try to fix it? (let's say it is very hard)
  2. Master fixes it in 4 days
  3. Some suspicious comments: yeah sure?!
  4. A week later no one remembers, and it's just taken for granted that it works the way it always should have

Don’t do as they say they do, do as they do!
I can’t back this up, but I have the feeling that the best programmers we know are people who challenged themselves with insane projects. But what we hear is that programmers are valued by the number of technologies they know.

I would think that smart organisations know how to identify and appreciate a master. And I think master programmers eventually find themselves in those smart organisations. But I think it happens mostly below the radar.

Example: git
Before git there was Subversion (svn, which improved on CVS) and a number of commercial version control systems. These were, like all tools, both appreciated and hated, and using them was best practice.

Now Master Torvalds was not happy. He challenged existing technologies and designed and wrote his own system: git.

However, what I find fascinating here is that he wrote git in C. People complained about it of course. But git was fine because Torvalds

  1. deeply knew the problem domain,
  2. designed git very well,
  3. implemented it in a language he mastered.

On paper you can hardly argue for implementing a system like git in C, but in the end git could not have been better (smaller, faster, more portable) had it been implemented in any other language.

I guess for the rest of us the question is always:

  1. should we use a proven solution and take the easy path?
  2. should we invent our own solution possibly using the crude tools we master?

But if we are never arrogant enough to go for #2, how will we ever grow to be able to go for #2 when it is really needed of us?

The Hard Way
There is a (somewhat infamous) book series and online courses about learning to code the hard way. Many programmers like C/C++, perhaps partly because the fact that they are difficult and even a bit unsafe is fun. I think JavaScript somehow has the same appeal in a different way.

Many hackers seem to be struggling with the impossible even though it is hardly worth it from a rational perspective.

I sometimes entertain myself with Hackerrank.com (especially Project Euler). Some challenges are truly hard (so I have not solved them). Some I have solved after weeks of struggle (often using Lisp or C). I used to judge myself, thinking it was an absolute waste of time. On top of everything it made me a bit stressed and gave me occasional sleeping problems because I could not stop thinking about a problem. I am about to reconsider it. And perhaps it is the really hard challenges, that I fail to solve properly, that I should focus on.

Conclusion
I have left a number of unanswered questions in this post. I don’t have answers. But I think it is worth considering the question: do I reach my potential as a programmer the way I try to learn?

Update: Cynefin
A colleague made me aware of the Cynefin Framework. I imagine many of us encounter problems of all levels: Simple, Complicated, Complex, Chaotic. When we encounter a more difficult (Chaotic) problem we need to identify it as such and feel comfortable working with such problems. That requires training and experience, not just with Simple and Complicated problems, but with really hard problems.

VIM: Disable autoindent

More and more often I find that Vim comes with auto-indentation enabled. I don't want that.

Perhaps the best way to fix this annoyance is to add the following to your .vimrc file.

" Switch off all auto-indenting
set nocindent
set nosmartindent
set noautoindent
set indentexpr=
filetype indent off
filetype plugin indent off

I found these exact lines here.

Acer Chromebook R13: 3. As a Linux development workstation

I have got an Acer Chromebook R13 and I will write about it from my perspective.

1. Background
2. As a casual computer
3. As a Linux development workstation (this post)

As a Linux development workstation
I switched my Chromebook to Development mode and everything that follows depends on that.

In ChromeOS you can hit CTRL-ALT-T to get a crosh shell. If you are in Development mode you can run shell to get a regular “unix” shell. You now have access to all of ChromeOS. It looks like this:

crosh> shell
chronos@localhost / $ ls /
bin     dev  home  lost+found  mnt  postinst  root  sbin  tmp  var
debugd  etc  lib   media       opt  proc      run   sys   usr
chronos@localhost / $ ls ~
'Affiliation Database'          login-times
'Affiliation Database-journal'  logout-times
Bookmarks                       'Media Cache'
Cache                           'Network Action Predictor'
Cookies                         'Network Action Predictor-journal'
Cookies-journal                 'Network Persistent State'
'Current Session'               'Origin Bound Certs'
'Current Tabs'                  'Origin Bound Certs-journal'
databases                       'Platform Notifications'
data_reduction_proxy_leveldb    Preferences
DownloadMetadata                previews_opt_out.db
Downloads                       previews_opt_out.db-journal
'Download Service'              QuotaManager
'Extension Rules'               QuotaManager-journal
Extensions                      README
'Extension State'               'RLZ Data'
Favicons                        'RLZ Data.lock'
Favicons-journal                'Service Worker'
'File System'                   'Session Storage'
GCache                          Shortcuts
'GCM Store'                     Shortcuts-journal
GPUCache                        Storage
History                         'Sync App Settings'
History-journal                 'Sync Data'
'History Provider Cache'        'Sync Extension Settings'
IndexedDB                       'Sync FileSystem'
'Last Session'                  Thumbnails
'Last Tabs'                     'Top Sites'
local                           'Top Sites-journal'
'Local App Settings'            'Translate Ranker Model'
'Local Extension Settings'      TransportSecurity
'Local Storage'                 'Visited Links'
log                             'Web Data'
'Login Data'                    'Web Data-journal'
'Login Data-journal'
chronos@localhost / $ uname -a
Linux localhost 3.18.0-16387-g09d1f8eebf5f-dirty #1 SMP PREEMPT Sat Feb 24 13:27:17 PST 2018 aarch64 ARMv8 Processor rev 2 (v8l) GNU/Linux
chronos@localhost / $ df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/root                1.6G  1.4G  248M  85% /
devtmpfs                 2.0G     0  2.0G   0% /dev
tmp                      2.0G  248K  2.0G   1% /tmp
run                      2.0G  456K  2.0G   1% /run
shmfs                    2.0G   24M  1.9G   2% /dev/shm
/dev/mmcblk0p1            53G  1.3G   49G   3% /mnt/stateful_partition
/dev/mmcblk0p8            12M   28K   12M   1% /usr/share/oem
/dev/mapper/encstateful   16G   48M   16G   1% /mnt/stateful_partition/encrypted
media                    2.0G     0  2.0G   0% /media
none                     2.0G     0  2.0G   0% /sys/fs/cgroup
tmpfs                    128K   12K  116K  10% /run/crw

This is quite good! But we all know that starting to install things and modifying such a system can cause trouble.

Now, there is a tool called Crouton that allows us to install a Linux system (Debian or Ubuntu) into a chroot. We can even run X if we want. So, I would say that for doing development work on your Chromebook you have (at least) 5 options:

  1. Install things directly in ChromeOS
  2. Crouton: command line tools only
  3. Crouton: xiwi – run X and (for example) XFCE inside a ChromeOS window
  4. Crouton: X – run X side by side with ChromeOS
  5. Get rid of ChromeOS and install (for example) Arch instead

I will explore some of the options.

#2. Crouton command line tools only
For the time being, I don’t really need X and a Window Manager. I am fine (I think) with the ChromeOS UI and UX. After downloading crouton I ran:

sudo sh ./crouton -n deb-cli -r stretch -t cli-extra

This gave me a Debian Stretch system without X, named deb-cli (in case I want to have other chroots in the future). Installation took a few minutes.

To access Debian I now need to

  1. CTRL-ALT-T : to get a crosh shell
  2. crosh> shell : to get a ChromeOS unix shell
  3. $ sudo startcli : to get a shell in my Debian Stretch system

This is clearly a sub-optimal way to get a shell tab (and closing the shell takes 3x exit). However, it works very well. I installed Node.js (for ARMv8) and in a few minutes I had cloned my Node.js git project, installed npm packages, run everything and even pushed some code. I ran a web server on 127.0.0.1 and I could access it from the browser just as expected (so this is much smoother than a virtual machine).

For my purposes I think this is good enough. I am not very tempted to get X up and running side by side with ChromeOS. However, I obviously would like things like shortcuts and virtual desktops.

Actually, I think a chroot is quite good. It does not modify the base system the way package managers for OS X tend to do. I don't need to mess with PATH and other variables. And I get a more complete Debian system than a package manager alone would give me. And it is actually the real Debian packages I install.

I installed Secure Shell and Crosh Window, allowing me to change some default parameters of the terminal (by hitting CTRL-SHIFT-P), so at least I don't need to adjust the font size for every terminal.

#4. Crouton with XFCE
Well, this is going so well that I decided to try XFCE as well.

sudo sh ./crouton -n deb-xfce -r stretch -t xfce,extensions

It takes a while to install, but when done just run:

sudo startxfce4

The result is actually pretty nice. You switch between ChromeOS and XFCE with CTRL-ALT-SHIFT-BACK/FORWARD (the buttons next to ESC). The switching is a little slow, but it gives you a (quite needed) virtual desktop. Install crouton extensions in ChromeOS to allow copy-paste. A good thing is that I can run:

sudo enter-chroot -n deb-xfce

to enter my xfce-chroot without starting X and XFCE. So, for practical purposes I can have an X chroot but I don't need to start X if I don't want to.

screen
After a while I uninstalled XFCE and now I only use crouton with cli. The terminal (part of the Chrome browser) is a bit sub-optimal. My idea is to learn to master screen. However:

$ screen
Cannot make directory '/run/screen': Permission denied

This is easily fixed though (link):

mkdir ~/.screen
chmod 700 ~/.screen

# add to .bashrc
export SCREENDIR=$HOME/.screen

# and a vim "alias" I found handy
svim () { screen -t $1 vim $1; }

I found that I get problems when I edit UTF-8 files in VIM in screen in crouton in a crosh shell. Without screen there are also issues, but slightly less so. It seems to be a good idea to add the following line to .vimrc:

set encoding=utf8

It improves the situation, but there are still a few glitches.

Now at least screen works. It remains to be seen if I can master it.

lighttpd
I installed lighttpd just the normal Debian way. It does not start automatically, but the normal way works:

$ sudo service lighttpd start

If you close your last crouton-session without stopping lighttpd you get:

$ exit
logout
Unmounting /mnt/stateful_partition/crouton/chroots/deb-cli...
Sending SIGTERM to processes under /mnt/stateful_partition/crouton/chroots/deb-cli...

That stopped lighttpd after a few seconds, but I guess a manual stop is preferred.

Performance
I have written about NUC vs RPi before, and to be honest I was worried that my ARM Chromebook would be closer to the poor performance of the RPi than to the decent performance of the NUC. I would say this is not a problem; the Acer R13 is generally fast enough.

After a few Node.js tests, it seems the Acer Chromebook R13 is about 5-6 times faster than an RPi V2.

A C-program (some use of 64-bit double floats, little memory footprint) puts it side-by-side with my Celeron/NUC:

               time (s)
RPi V1          142
RPi V2           74
Acer R13         12.5
Celeron J3455    13.0
i5-4250U          7.5

Benchmarks are always tricky, but I think this gives an indication.

Vue.js: loading template html files

Update 2018-05-27: A few months have passed since I wrote this post. I have used my solution/library for several real applications and it has worked very well. So everything looks exactly as it did when I posted v0.1, and that is a good thing. There are obviously improvement opportunities and probably limitations/bugs. But for my purposes I have not encountered any problems to fix. And nobody has notified me of needed fixes.

You may want to code your Vue.js application in such a way that your html templates are in separate html files, but you still do not want a build/compile step. Well, the people writing Vue don't want you to do this, but it can easily be done.

All you need is to download this single js file and include it in your Vue.js web page. All instructions and documentation required are found in the js file.

VueWithHtmlLoader-library
I wrote a little library that simply does what is required in a rather simple way. I will not hold you back and I will show you by example immediately:

  • A Rock-paper-scissors Vue-app, all in 1 file: link
  • A Rock-paper-scissors Vue-app, modularised with separate html/js files: link
  • Source of VueWithHtmlLoader library: link

These are the code changes needed to use VueWithHtmlLoader:

 * 1) After including "vue.js", and
 *    before including your component javascript files,
 *    include "vuewithhtmlloader.js"
 *
 * 2) In your component javascript files
 *    replace: Vue.component(
 *       with: VueWithHtmlLoader.component(
 *
 *    replace: template: '...'
 *       with: templateurl: 'component-template.html' (replace with your url)
 *
 * 3) The call to "new Vue()" needs to be delayed, like:
 *    replace: var myVue = new Vue(...);
 *       with: var myVue;          
 *             function initVue() {
 *               myVue = new Vue(...);
 *             }
 *             VueWithHtmlLoader.done(initVue);

My intention is that the very simple Rock-paper-scissors-app shall work as an example.

Disclaimer: the library was just written and has been tested only with this application. The application is written primarily to demonstrate the library. The focus has been clarity and simplicity. Please feel free to suggest improvements to the library or the application, but keep in mind that it was never my intention to follow all best practices. The purpose of the library is to break a Vue best practice.

What the library does:

  1. It creates a global object: VueWithHtmlLoader
  2. It provides a function: VueWithHtmlLoader.component() that you shall use instead of Vue.component() (there may be unsupported/untested cases)
  3. When using VueWithHtmlLoader.component(), you can provide templateurl:'mytemplate.html' instead of template:'whatever Vue normally supports'
  4. The Vue()-constructor must be called after all templateurls have been downloaded. To facilitate this, place the code that calls new Vue() inside a function, and pass that function to VueWithHtmlLoader.done()
  5. The library will now load all templateurls. When an html template is successfully downloaded over the network Vue.component() is called normally.
  6. When all components are initiated, new Vue() is called via the provided function

Apart from this, you can and should use the global Vue object normally for all other purposes. There may be more things that you want to happen after new Vue() has been called.

The library has no dependencies (it uses XMLHttpRequest directly).
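To give an idea of what happens inside, the core mechanism can be sketched like this (simplified, and not the actual library code; componentFromUrl and done stand in for the library functions):

var pending = 0;         // templates still being downloaded
var userCallback = null; // the function passed to done()

function componentFromUrl(name, options) {  // sketch of VueWithHtmlLoader.component
  pending++;
  var xhr = new XMLHttpRequest();
  xhr.onreadystatechange = function() {
    if ( xhr.readyState !== 4 ) return;     // (error handling omitted)
    options.template = xhr.responseText;    // turn templateurl into a normal template
    Vue.component(name, options);           // register the component as usual
    if ( 0 === --pending && userCallback ) userCallback();
  };
  xhr.open('GET', options.templateurl);
  xhr.send();
}

function done(f) {                          // sketch of VueWithHtmlLoader.done
  userCallback = f;
  if ( 0 === pending ) f();                 // otherwise called after the last template
}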

Background
Obviously there are people (like me) with an AngularJS (that is v1) background who are used to ng-include and like it. We see Vue as a better, smaller AngularJS for the future, but we want to keep our templates in separate files without a build step.

I also expect many developers with various backgrounds to try out Vue.js. They may also benefit from a simple way to keep templates in separate files without worrying about a build tool.

As I see it, there are different sizes of applications (and sizes of team and support around them).

  1. Small single-file applications: I think it is great that Vue supports simple single-file applications (with x-template if you want), implemented like my game above. This has a niche!
  2. Applications that clearly require modularization, but where optimizing loading times is not an issue, and where you want to use the simplest tools available (keep html/js separate to allow standard editor support and not require a build step). AngularJS (v1) did this nicely. I intend Vue to do it nicely too with this library.
  3. Applications built by people or organizations that already use Webpack and such tools, or applications that are so demanding that such tools are required.

I fully respect and understand that the Vue project does not want to support case 2 out of the box and that they prefer to keep the Vue framework small (and as fast as possible).

But I sense some kind of arrogance in articles like 7 Ways To Define A Component Template in Vue.js. I mean, ways 1 and 2 are only useful for very small components. 3 is only useful for minimal applications that don't require modularization. 4 has very narrow use cases. 5 is insane for normal development (however, I can see cases where you want to output/generate it). And 6 and 7 require a build step.

8. Put the damn HTML in an HTML-file and include it? Nowhere to be seen.

The official objection to 8 is obviously performance. I understand that pre-compiling your html instead of serving html that the client will compile is faster. But compared to everything else this overhead may be negligible. And that is what performance is all about: focusing on what is critical and keeping everything else simple. My experience is that loading data into my applications takes much more time than loading the application itself.

The Illusion of Simplicity
AngularJS (v1) gave the illusion of simplicity. You just wrote JavaScript-files and (almost) HTML-files, the browser loaded everything and it just worked. I know this is just an illusion and a lot happens behind the scenes. But my experience is that this illusion works well, and it does not leak too much. Vue.js is so much simpler than AngularJS in so many ways. I think my library can keep my illusion alive.

Other options
There is a thread on Stack Overflow about this, and there are obviously other solutions. If you want to write .vue-files and load them there is already a library for that. For my solution I was inspired by the simple jQuery example, but: 1) it is nice not to have a jQuery dependency, 2) it is nice to keep the async stuff in one place, 3) the delayed call of new Vue() seems to have been forgotten.

Feedback, limitations, bugs…
If you have suggestions for improvements or fixes of my library, please let me know! I am happy to make it better and I intend to use it for real applications.

I think this library suits some but not all (or even most) Vue.js applications. Let's not expect it to serve very complex requirements, or applications that would actually benefit more from a Webpack treatment.

TODO and DONE

  • A minified version – I have not really decided on ambition/obfuscation level
  • Perhaps change loglevel if minified Vue is used? or not.
  • I had some problems with comments in html-files, but I failed to reproduce them. I think <!-- comments --> should definitely be supported.

JavaScript: Sets, Objects and Arrays

JavaScript has a new (well well) fancy Set data structure (that does not come with functions for union, intersection and the like, but whatever). A little while ago I tested Binary Search (also not in the standard library) and I was quite impressed with the performance.

When I code JavaScript I often hesitate about using an Array or an Object. And I have not started using Set much.

I decided to make some tests. Let's say we have pseudo-random natural numbers (like 10000 of them). We then want to check if a number is among the 10000 numbers or not (if it is a member of the set). A JavaScript Set does exactly that. A JavaScript Object just requires you to do obj[314] = true and you are basically done (the key gets converted to a string, though). For an Array you just push(314), sort the array, and then use binary search to see if the value is there.
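To make the three approaches concrete, membership testing looks like this (a sketch, using the same lodash binary search as the test code below; string values just like in the test):

var lodash = require('lodash');

var arr = ['19','314','42'].sort();              // Array: keep it sorted
var obj = { '19':true, '314':true, '42':true };  // Object: members are keys
var set = new Set(['19','314','42']);            // Set: made for this

var v = '314';
console.log( arr[lodash.sortedIndex(arr,v)] === v );  // binary search
console.log( obj[v] === true );                       // key lookup
console.log( set.has(v) );                            // Set lookup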

Obviously, if you often add or remove values, (re)sorting the Array will be annoying and costly. But quite often this is not the case.

The test
My test consists of generating N=10000 random unique numbers (with distance 1 or 2 between them). I then insert them (in a kind of pseudo-random order) into an Array (and sort it), into an Object, and into a Set. I measure this time as an initiation time (for each data structure).

I repeat. So now I have 2xArrays, 2xObjects, 2xSets.

This way I can test both iterating and searching with all combinations of data structures (and check that the results are the same and thus correct).

Output of a single run: 100 iterations, N=10000, on a Linux Intel i5 and Node.js 8.9.1 looks like this:

                         ====== Search Structure ======
(ms)                        Array     Object      Set
     Initiate                1338        192      282
===== Iterate =====    
        Array                 800         39       93
       Object                 853        122      170
          Set                1147         82      131

By comparing columns you can compare the cost of searching (and initiating the structure before searching it). By comparing rows you can compare the cost of iterating over the different data structures (for example, iterating over Set while searching Array took 1147ms).

These results are quite consistent on this machine.

Findings
Some findings are very clear (I guess they are quite consistent across systems):

  • Putting values in an Array, sorting it, and then searching it, is much slower and makes little sense compared to using an Object (or a Set)
  • Iterating an Array is a bit faster than iterating an Object or a Set, so if you are never going to search, an Array is faster
  • The newer and more specialized Set offers little advantage over good old Objects

What is more unclear is why iterating over Objects is faster when searching Arrays, but iterating over Sets is faster when searching Objects or Sets. What I find is:

  • Sets seem to perform comparably to Objects on Raspberry Pi, ARMv7.
  • Sets seem to underperform more on Mac OS X

Obviously, all this is very unclear and can vary depending on CPU cache, Node version, OS and other factors.

Smaller and Larger sets
These findings hold quite well for smaller N=100 and larger N=1000000. The Array, despite its O(n log n) sort, does not get much worse for N=1000000 than it already was for N=10000.

Conclusions and Recommendation
I think the conservative choice is to use Arrays when order is important or you know you will not look for a member based on its unique id. If members have unique IDs and are not ordered, use Object. I see no reason to use Set, especially if you target browsers (support in IE is still limited in early 2018).

The Code
Here follows the source code. Output is not quite as pretty as the table above.

var lodash = require('lodash');

// generate size unique increasing numbers (as strings), written to the
// array with a prime stride so they end up in a pseudo-random order
function randomarray(size) {
  var a = new Array(size);
  var x = 0;
  var i, r;
  var j = 0;
  var prime = 3;

  if ( 50   < size ) prime = 31;
  if ( 500  < size ) prime = 313;
  if ( 5000 < size ) prime = 3109;

  for ( i=0 ; i<size ; i++ ) {
    r = 1 + Math.floor(2 * Math.random());
    x += r;
    a[j] = '' + x;
    j += prime;
    if ( size <= j ) j-=size;
  }
  return a;
}

var times = {
  arr : {
    make : 0,
    arr  : 0,
    obj  : 0,
    set  : 0
  },
  obj : {
    make : 0,
    arr  : 0,
    obj  : 0,
    set  : 0
  },
  set : {
    make : 0,
    arr  : 0,
    obj  : 0,
    set  : 0
  }
}

function make_array(a) {
  times.arr.make -= Date.now();
  var i;
  var r = new Array(a.length);
  for ( i=a.length-1 ; 0<=i ; i-- ) {
    r[i] = a[i];
  }
  r.sort();
  times.arr.make += Date.now();
  return r;
}

function make_object(a) {
  times.obj.make -= Date.now();
  var i;
  var r = {};
  for ( i=a.length-1 ; 0<=i ; i-- ) {
    r[a[i]] = true;
  }
  times.obj.make += Date.now();
  return r;
}

function make_set(a) {
  times.set.make -= Date.now();
  var i;
  var r = new Set();
  for ( i=a.length-1 ; 0<=i ; i-- ) {
    r.add(a[i]);
  }
  times.set.make += Date.now();
  return r;
}

function make_triplet(n) {
  var r = randomarray(n);
  return {
    arr : make_array(r),
    obj : make_object(r),
    set : make_set(r)
  };
}

function match_triplets(t1,t2) {
  var i;
  var m = [];
  m.push(match_array_array(t1.arr , t2.arr));
  m.push(match_array_object(t1.arr , t2.obj));
  m.push(match_array_set(t1.arr , t2.set));
  m.push(match_object_array(t1.obj , t2.arr));
  m.push(match_object_object(t1.obj , t2.obj));
  m.push(match_object_set(t1.obj , t2.set));
  m.push(match_set_array(t1.set , t2.arr));
  m.push(match_set_object(t1.set , t2.obj));
  m.push(match_set_set(t1.set , t2.set));
  for ( i=1 ; i<m.length ; i++ ) {
    if ( m[0] !== m[i] ) {
      console.log('m[0]=' + m[0] + ' != m[' + i + ']=' + m[i]);
    }
  }
}

function match_array_array(a1,a2) {
  times.arr.arr -= Date.now();
  var r = 0;
  var i, v;
  for ( i=a1.length-1 ; 0<=i ; i-- ) {
    v = a1[i];
    if ( v === a2[lodash.sortedIndex(a2,v)] ) r++;
  }
  times.arr.arr += Date.now();
  return r;
}

function match_array_object(a1,o2) {
  times.arr.obj -= Date.now();
  var r = 0;
  var i;
  for ( i=a1.length-1 ; 0<=i ; i-- ) {
    if ( o2[a1[i]] ) r++;
  }
  times.arr.obj += Date.now();
  return r;
}

function match_array_set(a1,s2) {
  times.arr.set -= Date.now();
  var r = 0;
  var i;
  for ( i=a1.length-1 ; 0<=i ; i-- ) {
    if ( s2.has(a1[i]) ) r++;
  }
  times.arr.set += Date.now();
  return r;
}

function match_object_array(o1,a2) {
  times.obj.arr -= Date.now();
  var r = 0;
  var v;
  for ( v in o1 ) {
    if ( v === a2[lodash.sortedIndex(a2,v)] ) r++;
  }
  times.obj.arr += Date.now();
  return r;
}

function match_object_object(o1,o2) {
  times.obj.obj -= Date.now();
  var r = 0;
  var v;
  for ( v in o1 ) {
    if ( o2[v] ) r++;
  }
  times.obj.obj += Date.now();
  return r;
}

function match_object_set(o1,s2) {
  times.obj.set -= Date.now();
  var r = 0;
  var v;
  for ( v in o1 ) {
    if ( s2.has(v) ) r++;
  }
  times.obj.set += Date.now();
  return r;
}

function match_set_array(s1,a2) {
  times.set.arr -= Date.now();
  var r = 0;
  var v;
  var iter = s1[Symbol.iterator]();
  while ( ( v = iter.next().value ) ) {
    if ( v === a2[lodash.sortedIndex(a2,v)] ) r++;
  }
  times.set.arr += Date.now();
  return r;
}

function match_set_object(s1,o2) {
  times.set.obj -= Date.now();
  var r = 0;
  var v;
  var iter = s1[Symbol.iterator]();
  while ( ( v = iter.next().value ) ) {
    if ( o2[v] ) r++;
  }
  times.set.obj += Date.now();
  return r;
}

function match_set_set(s1,s2) {
  times.set.set -= Date.now();
  var r = 0;
  var v;
  var iter = s1[Symbol.iterator]();
  while ( ( v = iter.next().value ) ) {
    if ( s2.has(v) ) r++;
  }
  times.set.set += Date.now();
  return r;
}

function main() {
  var i;
  var t1;
  var t2;

  for ( i=0 ; i<100 ; i++ ) {
    t1 = make_triplet(10000);
    t2 = make_triplet(10000);
    match_triplets(t1,t2);
    match_triplets(t2,t1);
  }

  console.log('TIME=' + JSON.stringify(times,null,4));
}

main();

When to (not) use Web Workers?

Web Workers are a mature, simple, standardised and widely compatible technology that allows multithreaded JavaScript applications in the web browser.

I am not going to write about how to use Web Worker (check the excellent MDN article). I am going to write a little about when and why to (not) use Web Worker.

First, Web Workers are about performance. And performance is typically not the best thing to think about first when you code something.

Second, when you have performance problems and throw more cores at the problem, your best possible speedup is x2, x4 or xN. In 2018 four cores are quite common, which means that in the optimal case you can make your program 4 times faster by using Web Workers. Unfortunately, if it was not fast enough to begin with, chances are a 4x speedup is not going to help much. And the cost of a 4x speedup is that 4 times more heat is produced, the battery drains faster, and perhaps other applications suffer. A more efficient algorithm can often produce a 10-100x speedup without hurting the maintainability of the program too much (and there are very many ways to make a non-optimised program faster).

Let us say we have a web application. The user clicks “Show report”, the GUI locks/blocks for 10s and then the report displays. The user might accept that the GUI locks, if just for 1-2 seconds. Or the user might accept that the report takes 10s to compute, if it shows up little by little and the program does not appear hung. The way we could deal with this in JavaScript (which is single-threaded and asynchronous) is to break the 10s report calculation into small pieces (say 100 pieces each taking 100ms) and, after calculating each piece, call window.setTimeout, which allows the UI to update (among other things) before calculating the next piece of the report. Perhaps a more common and practical approach is to divide the 10s job into logical parts: fetch data, make calculations, make report. But this would not do much to improve the locked GUI, since some (or all) parts still take significant (blocking) time.
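A minimal sketch of the setTimeout piece-by-piece approach (calculateReportPiece and showReport are hypothetical application functions):

var piece = 0;
var pieces = 100;

function nextPiece() {
  calculateReportPiece(piece++);      // roughly 100ms of real work
  if ( piece < pieces ) {
    window.setTimeout(nextPiece, 0);  // give the UI a chance to update
  } else {
    showReport();
  }
}
nextPiece();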

If we could send the entire 10s job to a Web Worker our program GUI would be completely responsive while the report is generated. Now to the key limitation of a Web Worker (which is also what allows it to be simple and safe):

Data is copied to the Worker before it starts, and copied from the Worker when it has completed (rather than being passed by reference).

This means that if you already have a lot of data, it might be quite expensive to copy that data to the web worker, and it might actually be cheaper to just do the job where the data already is. In the same way, since there is some overhead in calling the Web Worker, you can’t send too many too small pieces of work to it, because you will occupy yourself with sending and receiving messages rather than just doing the job right away.

This leaves us with obvious candidates for web workers (you can use Google):

  • Expensive searches (like chess moves or travelling salesman solutions)
  • Encryption (but chances are you should not do it in JavaScript in the first place, for security reasons)
  • Spell and grammar checker (I don’t know much about this).
  • Background network jobs

This is not too useful in most cases. What would be useful would be to send packages of work (arrays), like streams in a functional programming way: map(), reduce(), sort(), filter().

I decided to write some Web Worker tests based on sort(). Since I can not (easily, and there are probably good reasons for that) write JavaScript in WordPress, I wrote a separate page with the application. Check it out now (link).

So, for 5 seconds I try to do the following job as many times as I can, while I keep track of how much the GUI is suffering:

  1. create an array of 10001 random numbers: O(n)
  2. sort it: O(n log n)
  3. get the median (array[5000]): O(1)

The expensive part is step 2, the sort (well, I actually have not measured 1 vs 2). If the ratio of work done per byte sent is high enough, then it can be worth it to send the job to a Web Worker.

If you run the tests yourself I think you will see that the first Web Worker tests, which outsource all of 1-2-3, are quite ok. But this basically means giving the web worker no data at all and, when it has done a significant amount of work, receiving just a few numbers. This is more Web Worker friendly than chess, where at least the board would need to be sent.
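For illustration, outsourcing all of 1-2-3 can look roughly like this (a sketch; worker.js is a hypothetical file name):

// main.js
var worker = new Worker('worker.js');
worker.onmessage = function(e) {
  console.log('median = ' + e.data);  // a single number is copied back
};
worker.postMessage(10001);            // a single number is copied in

// worker.js
onmessage = function(e) {
  var n = e.data;
  var a = new Array(n);
  var i;
  for ( i=0 ; i<n ; i++ ) a[i] = Math.random();  // 1. create: O(n)
  a.sort(function(x,y) { return x - y; });       // 2. sort: O(n log n)
  postMessage(a[(n-1)/2]);                       // 3. median (a[5000]): O(1)
};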

If you then run the tests that outsource just sort() you see significantly lower throughput. How suitable is sort()? Well, sorting 10k ~ 2^13 elements should require each element to be compared (accessed) about 13 times. And no data is sent that is not needed by the Web Worker. Just as a counterexample: if you send an order to get back the sum of its lines, most of the order data is ignored by the Web Worker, and it needs to access each line value only once; much, much less suitable than sort().

Findings from tests
I find that sort(), being O(n log n), on an array of numbers is far too fast to be worth outsourcing to a Web Worker. You need to find a much more “dense” problem to benefit from a Web Worker.

Islands of data
If you can design your application in such a way that one Web Worker maintains its own full state and just shares small selected parts occasionally, that could work. The good thing is that this would also be clean encapsulation of data and separation of responsibilities. The bad thing is that you probably need to design with the Web Worker in mind quite early, and this kind of premature optimization is often a bad idea.

This could be letting a Web Worker do all your I/O. But if most data that you receive is needed in your application, and most data you send comes straight from your application, the benefit is very questionable. And if most data you receive is not needed in your application, perhaps you should not receive so much data in the first place. Even if you process your incoming data quite heavily (validating, integrating with current state, precalculating), I would not expect it to come very close to the computational intensity of my sort().

Conclusions
Unfortunately, the simplicity and safety of Web Workers is also their biggest limitation. The primary reason for using a Web Worker should be performance, and even for artificial problems it is hard to get any benefit.