Distributed Functional Programming and Foreign Language Integration

Cuneiform combines the strong points of functional programming languages, distributed databases, and workflow management systems. It enables you to run your data-intensive analysis workloads with minimal effort and no lock-in. Here is why:

Small Language

Cuneiform has only a small set of syntactic concepts: essentially, all you can do is define and call tasks.

[Figure: addition in Perl]

Computation, I/O, and communication happen only in integrated foreign code. This keeps Cuneiform predictable at the organizational level while letting you stick with your battle-tested methods at the computational level.

Automatic Parallelism

Developing operators in an application domain is often hard enough. Having to micro-manage potential parallelism on top of that is distracting and unnecessary. Let Cuneiform do the parallelization for you: independent operations are executed in parallel without additional syntactic clutter. Just define your tasks and call them; the rest is figured out automatically. Here is an example:

[Figure: parallel unzipping]

Since all three files can be unzipped independently, Cuneiform will try to run these three operations in parallel.
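The same idea can be approximated outside Cuneiform. The following Python sketch is not Cuneiform code and says nothing about how Cuneiform schedules work internally; it only illustrates that operations with no data dependencies between them can safely run side by side. The names `gunzip`, `gunzip_all`, and the three in-memory archives are hypothetical and chosen purely for illustration.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor
from typing import List


def gunzip(blob: bytes) -> bytes:
    """Decompress one gzip payload; it depends on no other payload."""
    return gzip.decompress(blob)


def gunzip_all(blobs: List[bytes]) -> List[bytes]:
    """Decompress independent payloads concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(blobs))) as pool:
        return list(pool.map(gunzip, blobs))


# Three hypothetical archives, built in memory so the sketch is self-contained.
archives = [gzip.compress(text.encode()) for text in ("alpha", "beta", "gamma")]
results = gunzip_all(archives)
```

Here the parallelism has to be requested explicitly through a thread pool. The point of the section above is that in Cuneiform this bookkeeping is left to the language.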

Recursion

Cuneiform is more than yet another distributed make clone or database query language. It gives you many important features from functional programming, e.g., general recursion. While your application scenario might not need this feature right away, it may prove worthwhile once your project has matured. For example, say your workflow includes a costly operator that iteratively improves on an initial state and terminates when a convergence criterion is met. Keeping the operation within one task is fine until data sets grow to the point where the operation becomes a major bottleneck for the whole workflow. With Cuneiform, you can then unlock the parallelization potential of the operator by moving the iteration to the workflow level, so that Cuneiform can parallelize the individual steps the operator comprises. The following example shows how the k-means clustering algorithm can be expressed in Cuneiform:

[Figure: recursion in k-means]
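For readers who want to see the shape of such a recursion without the Cuneiform listing, here is a minimal Python sketch. It is not Cuneiform code and is not the project's k-means example. It assumes one-dimensional points and splits each iteration into two separate steps, `assign` and `update` (both hypothetical names), so that the loop lives in the recursive function `kmeans` rather than inside a single monolithic operator.

```python
from typing import List


def assign(points: List[float], centroids: List[float]) -> List[List[float]]:
    """One step: group each point with its nearest centroid."""
    clusters: List[List[float]] = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    return clusters


def update(clusters: List[List[float]], centroids: List[float]) -> List[float]:
    """Another step: move each centroid to the mean of its cluster.

    An empty cluster keeps its previous centroid.
    """
    return [sum(c) / len(c) if c else old for c, old in zip(clusters, centroids)]


def kmeans(points: List[float], centroids: List[float], tol: float = 1e-9) -> List[float]:
    """Recurse until no centroid moves by more than tol (convergence criterion)."""
    new_centroids = update(assign(points, centroids), centroids)
    shift = max(abs(a - b) for a, b in zip(new_centroids, centroids))
    if shift <= tol:
        return new_centroids
    return kmeans(points, new_centroids, tol)


final = kmeans([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], [0.0, 5.0])
```

Once the iteration is expressed as a recursion over small steps, each step becomes a unit that a workflow engine could schedule on its own. Plain Python still runs them sequentially.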

Foreign Language Integration

Adopting Cuneiform lets you reuse the software you already have with minimal effort. Say you have a custom visualization library written in R that converts a table to an image, or an ETL operator written in Java with a command line interface. There is no need to alter these operators in any way to use them with Cuneiform. All that is needed is a lightweight wrapper in the form of a task definition.

[Figure: ETL operator from command line and image conversion in R]
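The wrapper idea itself is language-independent. The Python sketch below is not a Cuneiform task definition; it only shows how an existing command line tool can be used unchanged behind a thin function. The helper `run_foreign`, the wrapper `to_upper`, and the stand-in tool (a Python one-liner that upper-cases its standard input) are all hypothetical and take the place of whatever operator you already have.

```python
import subprocess
import sys
from typing import List


def run_foreign(command: List[str], stdin_text: str) -> str:
    """Run an unmodified external tool, feed it text, and return its stdout."""
    result = subprocess.run(
        command,
        input=stdin_text,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the tool exits non-zero
    )
    return result.stdout


# Stand-in for an existing operator: upper-cases whatever arrives on stdin.
FOREIGN_TOOL = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]


def to_upper(table_text: str) -> str:
    """Thin wrapper: callers see a function, the tool itself stays untouched."""
    return run_foreign(FOREIGN_TOOL, table_text)


output = to_upper("id,name\n1,ada\n")
```

The wrapped tool is unaware that it is being called from a larger program, which is the property the section above relies on.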

This is also why Cuneiform comes with no batteries included. There is neither a standard library nor any operators for arithmetic or boolean logic, and there do not need to be. Any functionality you would expect from a general-purpose programming language is still provided by your favorite general-purpose programming language. This holds for basic operations like arithmetic as well as for sophisticated libraries and tools. The following foreign languages are currently supported:

  • Bash
  • Perl
  • Python
  • R

Adaptable Software Stack

If you are running Linux or macOS, chances are that you do not have to alter your software stack at all. All Cuneiform needs is Erlang, which can be installed via the default package manager on most Linux distributions. You can test small workflows on your laptop and run medium-sized workflows on a workstation with this basic setup. When your use case exceeds a single workstation, you can scale out with a setup using distributed Erlang and HDFS. If your use case exceeds a hundred workstations, you can scale out further with a setup using Erlang, Hadoop, and HDFS.

Research-Driven Dev Community

Cuneiform is a language rooted in research. It is maintained by a small number of dedicated developers whose major goal is to generate relevant publications on this topic. This means that features leading to new publications may be given higher priority, while obvious usability improvements may take longer to finish. Furthermore, while some workflow languages bundle package managers (and more), Cuneiform is limited to a narrow set of features. This approach ensures that we can keep up our long-term commitment to supporting Cuneiform.

