Category: JavaScript

Infinite sequences in JavaScript

Infinite sequences

They’re kind of cool in Haskell and allow you to concisely express and manipulate unbounded series of data.

naturalNumbers = [1..]

Haskell’s lazy nature suits the creation of these kind of data structures. More complex infinite sequences can be expressed recursively.

fibonacci = 0 : 1 : zipWith (+) fibonacci (tail fibonacci)

This data can then be processed through a pipeline of operations, typically taken from the standard fare of functional programming: map, filter and their relatives. As long as the final result is bounded (for instance, through inclusion of a take operation), the calculation doesn’t blow up.

take 5 fibonacci -- [0, 1, 1, 2, 3]

Expressing data in terms of transformations over infinite series can lead to very concise descriptions, and, with laziness, efficient construction algorithms.

How can we apply these ideas in JavaScript?

Several libraries have popped up over the years to support functional-style programming in JavaScript. The most ubiquitous is, of course, underscore, closely followed by its performant clone, lodash.

Can we get near to supporting infinite series through the APIs presented by these libraries?

Underscore

Underscore does have an extension library, underscore-contrib, that adds additional support for operations useful in functional programming.

The function _.iterateUntil can be used to generate a sequence. It takes:

  1. a function to generate the next value in the sequence value given the previous one,
  2. a guard function that checks if the sequence value is within range, otherwise returns false,
  3. a starting value.

This is not lazy though – so the computation below hangs and can’t be chained.

// var naturalNumbers = _.iterateUntil(_.inc, _.constant(true), 0);

Underscore-contrib does support iterators, which allow you to generate indefinately-sized sequences of values generated by the repeated execution of an iterator function.
From their website examples:

var by5 = _.iterators.range(5, Infinity, 5);
by5(); // => 5
by5(); // => 10

In particular, _.iterators.unfoldWithReturn provides a generalized interface that can be used to generate successive values in an infinite series calculation.

var nextFibonacci = _.iterators.unfoldWithReturn([0, 1], function (pair) {
	// the first element in the list is fed to the next iteration of the
	// generator, and the second element is output
	return [[pair[1], pair[0] + pair[1]], pair[0]];
});

nextFibonacci(); // => 0
nextFibonacci(); // => 1
nextFibonacci(); // => 1
nextFibonacci(); // => 2
nextFibonacci(); // => 3
nextFibonacci(); // => 5

However, we still need to manually generate this series, and stick it in an array, before we can chain it and feed it through a data-processing pipeline.

Underscore’s chain interface doesn’t support construction with a generator, so we can’t immediately use the plethora of functional utilities it provides against this data.

_.chain(nextFibonacci).filter(function(x) {
	return x % 2 === 0;
}).take(3).value() // => [0,2,8]? sadly not...

There’s probably a good reason for this, as underscore’s functional operations are optimized against standard JavaScript arrays. Extending the API to accept other forms of sequence description, like generators, may impact on performance for a perceived niche benefit.

Lodash

Lodash supports almost an identical API to underscore (including its own lodash-contrib clone.)

The main premise for its construction was to create a performant version of underscore (which it was very successful in doing). About a year ago, Lodash started using lazy evaluation to dramatically speed up its support for chained operations.

This is great! But how do we feed our lazy creation mechanic into such a pipeline. It feels like it should be possible, but I haven’t yet seen how…

Although lodash’s performance advantage over underscore seems to have somewhat diminished over time, it still provides value with useful additions to its API, such as deep extend, _.merge. In the past, I’ve felt that this, in itself, has made the move to lodash worthwhile, particularly if you’re coding in a functional style with immutable data.

Immutable.js

I love programming with JavaScript’s arrays and objects. In particular, their literal syntax is beautifully concise.

You can use libraries like underscore and lodash to work with these structures in an immutable way. However, if a team decides to work like this, it might be worth using a library like Facebook’s Immutable.js, to enforce that data is not inadvertently mutated. Such inadvertent mutations can introduce unforeseen bugs, particularly when a team is not actively looking for this class of errors anymore.

I particularly like Immutable’s fromJS and toJS functions which allow you to still work predominantly with native JS literal syntax.

What else can Immutable.js give us?

Well, it actually has lazy sequences that support chained operations.

var naturalNumbers = Immutable.Range(1);

naturalNumbers.map(function(x) {
   return x + 1;
}).take(3).toJS(); // => [2,3,4]

Perfect!

Except that at this point, the API kind of dries up. There doesn’t seem to be any support for unfold style operations where I can iterate the generated series.

The two interfaces to infinite series generation seem to be Range and Repeat which pass an incrementing value, or a constant value, to the generator function respectively. No option to pass the result of a generation to the next iteration…

So no Fibonacci here. 😦

Lazy.js

Lazy.js is a library that exploded onto the scene a couple of years ago with some eye-watering performance figures relative to underscore and lodash.

I think these figures are liable to some misinterpretation. Certainly, if you’re just applying some map operation over an array, there is no advantage in converting your native JS array into a lazy array, applying the map and converting back to your JS array. The speed advantage comes from keeping your data in its lazy form, and applying successive processing operations on it.

So, with laziness clearly marked on the tin, I was interested to see how this library could help supporting creation and processing of infinite series data.

Sadly, I got no further than with Immutable. Immutable’s Range and Repeat are mapped to Lazy’s generate and repeat (strangely, Lazy’s range is strictly bounded.) No unfold, or the like, means we can’t go any further.

lazy-array

lazy-array is a neat little library from Oliver Caldwell. It is based on Clojure’s implementation of lazy sequences.

lazy-array does exactly what I want – allows a generator to describe construction of an infinite series, and supports chaining of functional operations to process the data.

The generator is quite explicit in how the lazy list is being constructed. I’m not sure if I’d like some higher abstraction over this to hide the details, but, what the heck, it works!

var integers = function(n) {
	return larr.create(function() {
		return larr.cons(n, integers(n + 1));
	});
};
larr.all(larr.take(5, larr.map(function(n) { 
	return n + 1; 
}, integers(1)))); // => [2, 3, 4, 5, 6]

var fibonacci = function(x, y) {
    return larr.create(function () {
        return larr.cons(x, fibonacci(y, x + y));
    });
};
larr.all(larr.take(7, fibonacci(0, 1))); // [0, 1, 1, 2, 3, 5, 8]

The syntax of the library deliberately follows closely the syntax of Clojure’s Seq API. Although I’m a big fan of what Clojure brings to the table, to me, the syntax feels a little clunky in JavaScript.

In particular, we can see those Lisp-style parentheses starting to build up – the chaining syntax we’re used to from underscore (reminiscent of Haskell’s do notation) would make this feel more idiomatic to JavaScript.

It’s the nearest we’ve got to the promised land, but let’s keep looking…

wu.js

I remember wu.js from a few years back, when I started to get interested in functional-style programming in JavaScript. It was the go-to library for doing work with lazy sequences. I personally had problems getting it to work, so largely stuck with underscore.

Recently, wu seems to have been reincarnated with support for lazy functional programming using ECMAScript 6 iterators.

It’s JavaScript, Jim, but not as we know it.

I find it difficult to get excited about JavaScript-next features and don’t like to use them in code I write. Well, actually, I do get excited about them, but I personally feel you lose a lot from adding a compilation step in your workflow, particularly during development.

I want to debug directly the code I’m writing, not have to peer at transpiled source, or deal with source-maps.

I’m a big Lisp fan, and have, in the past, gone for the source-map route to support scripting a non-native language in the Browser. (I actually use TypeScript at work so am well used to the compile-to-JS workflow and issues surrounding it.)

Compiling one JavaScript to another though… This feels slightly creepy to me. I’ve used ES6 proxies in the past. They’re really great and allow you to implement cool features like universal object stubbing; stuff that you get straight out of the box with Python’s standard Mock library.

In this case, my tests were restricted to running in FireFox which supports the ES6 extensions that I was using. But, unit tests should execute the same in any JS environment right… So it was a compromise worth taking. (I eventually abandoned it since using the headless browser PhantomJS is so much more flexible, but that’s a different story.)

I’m sure wu.js will do what I want. But it’s ES3 all the way for me.

So, with regret, I’m going to close my eyes and move swiftly on.

Highland.js

Caolan McMahon was author of the amazingly impressive and popular async.js library. With Highland.js he presents a unified interface for passing all manner of different data sources through a functional processing pipeline. In this way, it seems very reminiscent of Functional Reactive libraries like RxJS and Bacon/Kefir.

I’m a massive fan of these libraries.

The advantage of having the unified interface is that processing JavaScript arrays, NodeJS streams, generators, Browser events and all manner of other data sources, is done in exactly the same way.

The downside, from the point of view of this exercise, is that, having to support synchronous and asynchronous data through the same interface means you have to use the lowest common denominator when it comes to actually retrieving your output. You have to use some method of deferred retrieval, like promises, or, as with Highland.js, callbacks.

var naturalNumbers = _(function () {
	var value = 1;
	return function (push, next) {
		push(value);
		value += 1;
		next();
	};
}());

naturalNumbers.take(3).toArray(function(output) {
	output; // =>  [1, 2, 3]
});

Ugh.

I’m completely sold on the value of the push-based model for application development that these libraries promote. But in this case, I just want to calculate my data synchronously, so jumping through the hoop of a callback feels like unnecessary overhead.

Transducers

What on Earth are transducers?

Yes.
Good question.
Don’t know.

At least, not fully, though Rich Hickey, who introduced transducers into Clojure last year, gave a couple of interesting talks on them. Conceptually, they’re the generalization of a fold operation over any kind of data series. A more abstract fold, that explicitly separates out the algorithmic processing related to the fold from the mechanic of how the output data is constructed.

Sounds like gobbledygook?

Let’s step back a bit.

Fold, or, more commonly in JavaScript-speak, reduce, is the granddaddy of many functional operations. It’s the standard algorithm by which we transform one data structure into another.

Here’s how we can perform array summation as a reduce operation, taking a function called with an accumulator and each value in the series to build up the output, along with a starting value for the accumulator.

[1, 2, 3].reduce(function(acc, x) { return acc + x; }, 0); // => 6

Of the three most archetypal functional programming operations, map, filter and reduce, map and filter can both be rewritten in terms of the more fundamental, reduce.

var map = function(array, fn) {
	return array.reduce(function(acc, x) {
		acc.push(fn(x));
		return acc;
	}, []);
};
map([1, 2, 3], function(x) { return x * x; }); // => [1, 4, 9]

var filter = function(array, fn) {
	return array.reduce(function(acc, x) {
		if (fn(x)) acc.push(x);
		return acc;
	}, []);
};
filter([1, 2, 3], function (x) { return x % 2 === 1; }); // => [1, 3]

So we really only need to worry about reduce.
How then does transduce fit in?

Well, if you look above, you can see some particular ugliness related to JavaScript’s inbuilt array construction mechanic. push is a mutating operation, and furthermore, push returns the value added rather than the constructed array, so the concatenation and return has to be two steps.

The key point is that these details, of how to construct the output, are nothing to do with the fundamentals of the reduce operation. I can’t use this mapping fold, or filtering fold with any data structure other than a JS array.

Might I even want to?

Sure. How about if we want to apply this same kind of processing to a tree structure, or even to some asynchronous data series like an RxJS observable, or NodeJS stream. Do I really need to re-implement the same algorithm for each of these data structures?

With transducers, no. One fold to rule them all.

A transducer applies a reduction, but we now pass in additional details that describe how we construct the output from the values generated.

OK. But why do we care?

As we now separate out the transformation algorithm from its output building algorithm, we can do some neat things to generate infinite JS arrays. And, the properties of transducers means that we can progressively feed generated values through a pipeline of chained operations.

It’s not laziness, but it does result in similar properties in that we don’t waste resources on constructing intermediary data structures at each step in the pipeline. And, because we consume values in sequence through the entire pipeline, we can cope with infinite series quite happily.

Two transducer libraries popped up shortly after Rich Hickey had released the idea in Clojure, named distinctly, transducers-js and transducers.js.

For chained operations, transducer-based libraries put forward spectacular performance figures, similar to Lazy.js.

We’re there.

But…

Sometimes, it’s nice to have a one-stop shop for everything in your functional programming arsenal. Do I want to go to a dedicated library for handling infinite series? If I want just to perform a single map over a JS array, is the syntax as easy as established tools like underscore?

I have to say, both transducer libraries seem have quite pleasant APIs. They are, however, developed by relatively small communities and so don’t have a level of documentation that makes me confident in buying into them fully.

Ramda

So, finally to Ramda.

The Ramda API gets it right where underscore, and the numerous flavoured clones that followed it, got it wrong.

It supports data-last, automatically curried functions. It follows ideas in common with Lee Crossley’s functional.js, and Lodash’s own lodash-fp offshoot.

Why this is important is covered elsewhere, but suffice to say, it turns verbose functional code written in underscore into extremely concise code reminiscent of Haskell.

When I first discovered it, it felt like the heavens had opened and angels had sang out.

It is just, Right.

It also has a lot of documentation and a relatively large community behind it.

The transducer libraries seem to support similar, function first, syntax in producing their pipelined operations. But, I did see transduce being supported in the Ramda documentation, so thought I would check it out.

The documentation gives an example with a JS array. If you can’t feed a generator into the transduce function then it’s just not going to work.

I really wanted it to work, so I spent a bit of time poking around the source and test code.

Bingo. You can do it. Here’s how.

// description of how to build my result
var listxf = {
	'@@transducer/step': function (acc, x) { return acc.concat([x]); },
	'@@transducer/init': function () { return []; },
	'@@transducer/result': function (x) { return x; }
};

var naturalNumbers = function () {
	var value = 0;
	var result = {
		value: value,
		next: function () {
			value += 1;
			return {
				value: value,
				next: result.next,
				done: false
			};
		},
		done: false
	};
	return result;
};
R.transduce(R.compose(R.map(R.add(1)), R.take(2)), 
	listxf, [], naturalNumbers()); // => [2, 3]

var fibonacci = function () {
	var pair = [1, 0];
	var result = {
		next: function () {
			pair = [pair[1], pair[0] + pair[1]];
			return {
				value: pair[0],
				next: result.next,
				done: false
			};
		},
		done: false
	};
	return result;
};
R.transduce(R.compose(R.map(R.add(2)), R.map(R.multiply(3)), R.take(4)), 
	listxf, [], fibonacci()); // => [6, 9, 9, 12]
R.transduce(R.compose(R.take(4), R.map(R.multiply(3)), R.map(R.add(2))), 
	listxf, [], fibonacci()); // => [2, 5, 5, 8]

My goodness. Quite a lot more verbose than the equivalent would be in Haskell.

Note that the operations are composed in the reverse order to how you would normally compose them if applying directly to a data series. I don’t know why. Someone does.

The story with Ramda is not all rosy. Their performance figures seem to lag some way behind dedicated transducer libraries and Lazy.js, though it looks like they are on the case to improve the situation before v1.0.

I’m quite happy and relieved to end the journey here.

Final words

This article describes my search for a way to perform chained transformations on infinite series of data.

I am certain that I’m using many of these libraries in ways that they were not designed for; my intention is not to offer some direct comparison between them.

It may also be the case that you can perform these kind of calculations using these, or other, libraries, in ways that I was not aware. Let me know!

…Doh!

Here’s some love from Lazy.js.

var fibonacci = Lazy.generate(function() {
	var pair = [0, 1];
	return function() {
		var value = pair[0];
		pair = [pair[1], pair[0] + pair[1]]
		return value;
	};
}());

fibonacci.filter(function(x) { 
	return x % 2 === 0; 
}).take(3).toArray(); // => [0, 2, 8]