Distributed Functional Programming and Foreign Language Integration
Cuneiform combines the strong points of functional programming languages, distributed databases, and workflow management systems. It enables you to run your data-intensive analysis workloads with minimal effort and no lock-in. Here is why:
Cuneiform has only a small set of syntactic concepts: essentially, all you can do is define and call tasks.
Computation, I/O, and communication are performed only by integrating foreign code. This makes Cuneiform predictable at the organizational level while letting you stick with your battle-tested methods at the computational level.
Developing operators in an application domain is often hard enough. Having to micro-manage potential parallelism on top of that is distracting and unnecessary. Let Cuneiform do the parallelization for you. Independent operations are executed in parallel without additional syntactic clutter. Just define your tasks and call them. The rest is figured out automatically. For example, consider a workflow that unzips three files.
Since all three files can be unzipped independently, Cuneiform will try to run these three operations in parallel.
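The same idea can be sketched outside of Cuneiform. The following Python analogy is not Cuneiform syntax, and it makes the parallelism explicit with a thread pool, which Cuneiform does not require. The names `unzip`, `unzip_all`, and `make_demo_archives` are hypothetical and exist only for this illustration.

```python
import tempfile
import zipfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import List


def unzip(archive: Path) -> List[Path]:
    """Task: extract one archive into a sibling directory and list its files."""
    target = archive.with_suffix("")  # e.g. a.zip -> a/
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    return sorted(p for p in target.rglob("*") if p.is_file())


def unzip_all(archives: List[Path]) -> List[List[Path]]:
    """Run the independent unzip calls concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(archives))) as pool:
        return list(pool.map(unzip, archives))


def make_demo_archives(workdir: Path) -> List[Path]:
    """Create three tiny archives so that the sketch runs on its own."""
    archives = []
    for name in ("a", "b", "c"):
        archive = workdir / f"{name}.zip"
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr(f"{name}.txt", f"contents of {name}\n")
        archives.append(archive)
    return archives


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for files in unzip_all(make_demo_archives(Path(tmp))):
            print([f.name for f in files])
```

None of the three calls depends on the result of another, which is the property that allows a runtime to schedule them side by side.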
Cuneiform is more than yet another distributed make-clone or a database query language. It gives you many important features from functional programming, e.g., general recursion. While your application scenario might not need this feature right away, it may prove worthwhile once your project has matured. For example, suppose your workflow includes a costly operator that iteratively improves on an initial state and terminates when a convergence criterion is met. Keeping the operation within one task is fine until data sets grow to the point where the operation becomes a major bottleneck for the whole workflow. With Cuneiform, you can then unlock the parallelization potential of the operator by moving the iteration to the workflow level, so that Cuneiform can parallelize the individual steps that make up the operator. The k-means clustering algorithm is one example of an operator that can be expressed this way in Cuneiform.
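To make the idea of moving the iteration to the workflow level concrete, here is a minimal Python sketch of k-means written as a recursive function over two small steps. It is an analogy, not Cuneiform code: the function names, the 2D point representation, and the `max_iter` safeguard are assumptions made for this illustration, and the thread pool only marks where the independent per-point work could be distributed.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Task: index of the centroid closest to one point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: (point[0] - centroids[i][0]) ** 2 + (point[1] - centroids[i][1]) ** 2,
    )


def assign(points: Sequence[Point], centroids: Sequence[Point]) -> List[int]:
    """Step 1: per-point assignments are independent, so they are mapped over a pool."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda p: nearest(p, centroids), points))


def update(points: Sequence[Point], labels: Sequence[int], centroids: Sequence[Point]) -> List[Point]:
    """Step 2: move each centroid to the mean of its members; empty clusters stay put."""
    moved = []
    for i, old in enumerate(centroids):
        members = [p for p, label in zip(points, labels) if label == i]
        if members:
            n = len(members)
            moved.append((sum(x for x, _ in members) / n, sum(y for _, y in members) / n))
        else:
            moved.append(old)
    return moved


def kmeans(points: Sequence[Point], centroids: Sequence[Point], max_iter: int = 100):
    """Workflow-level recursion: one call per iteration, stopping once centroids stop moving.

    If max_iter runs out first, the returned labels stem from the last assignment step.
    """
    labels = assign(points, centroids)
    moved = update(points, labels, centroids)
    if moved == list(centroids) or max_iter <= 1:
        return moved, labels
    return kmeans(points, moved, max_iter - 1)


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(kmeans(data, [(0.0, 0.0), (10.0, 10.0)]))
```

Because each iteration is now an ordinary function call rather than a loop hidden inside a single operator, the independent work within an iteration (here, the per-point assignment) becomes visible to whatever schedules the calls.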
Foreign Language Integration
Adopting Cuneiform lets you reuse the software you already have with minimal effort. Say you have a custom visualization library written in R that converts a table to an image, or an ETL operator written in Java with a command line interface. There is no need to alter these operators in any way to use them with Cuneiform. All that is needed is a lightweight wrapper in the form of a task definition.
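The shape of such a wrapper can be illustrated in Python, with the caveat that this is not a Cuneiform task definition. The sketch assumes an existing command-line operator that reads from standard input and writes to standard output; `FOREIGN_TOOL` is a hypothetical stand-in (a one-line script that uppercases its input), and `run_operator` is a name chosen for this example.

```python
import subprocess
import sys
from typing import Sequence

# Hypothetical stand-in for an existing command-line operator: it reads text
# from stdin and writes the uppercased text to stdout. The tool itself is
# left untouched; only the wrapper below knows how to call it.
FOREIGN_TOOL: Sequence[str] = (
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
)


def run_operator(text: str, command: Sequence[str] = FOREIGN_TOOL) -> str:
    """Thin wrapper: pass input on stdin, return stdout, raise on a non-zero exit."""
    result = subprocess.run(
        list(command),
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_operator("cuneiform"))
```

The wrapper only declares what goes in and what comes out. The operator keeps its own language, libraries, and command line interface.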
This is also the reason Cuneiform comes with no batteries included. There is neither a standard library nor are there operators for arithmetic or Boolean logic. There do not need to be. Any functionality you would expect from a general-purpose programming language is still provided by your favorite general-purpose programming language. This holds for basic operations like arithmetic as well as for sophisticated libraries and tools. The following foreign languages are currently supported:
Adaptable Software Stack
If you are running Linux or Mac OS, chances are that you do not have to alter your software stack at all. All Cuneiform needs is Erlang, which can be installed via the default package manager on most Linux distributions. With this basic setup, you can test small workflows on your laptop and run medium-sized workflows on a workstation. When your use case exceeds a single workstation, you can scale out with a setup using distributed Erlang and HDFS. If your use case exceeds a hundred workstations, you can scale out further with a setup using Erlang, Hadoop, and HDFS.
Research-Driven Dev Community
Cuneiform is a language rooted in research. It is maintained by a small number of dedicated developers whose main goal is to produce relevant publications on this topic. This means that features leading to new publications may be given higher priority, while obvious improvements in usability may take longer to finish. Furthermore, while some workflow languages bundle package managers (and more), Cuneiform is limited to a narrow set of features. This approach ensures that we can keep up our long-term commitment to supporting Cuneiform.
News
- Dec 22, 2016 Hi-WAY: Execution of Scientific Workflows on Hadoop YARN accepted at EDBT 2017
- Nov 24, 2016 Scalable Multi-Language Data Analysis on Beam presented at Erlang Factory Lite 2016 in Berlin
- Oct 28, 2016 Cuneiform poster presented at BSR Winterschool 2016 in Ede
- Oct 17, 2016 Release: Cuneiform 2.2.1
- Sep 8, 2016 Scalable Multi-Language Data Analysis on Beam presented at Erlang User Conference 2016 in Stockholm
- Jul 19, 2016 Functional Programming and Foreign Language Interfaces presented at Curry On 2016 in Rome
- Jun 9, 2016 Talk Scalable Multi-Language Data Analysis on Beam: The Cuneiform Experience accepted at Erlang User Conference 2016 in Stockholm
- Apr 23, 2016 Talk Functional Programming and Foreign Language Interfaces accepted at Curry On 2016 in Rome