Deep Learning Frameworks Overview

I have some experience with Caffe, which was my main tool for research in the area of Music Information Retrieval. However, Deep Learning is not limited to Convolutional Neural Networks, and Caffe is not well suited to fast prototyping. So I faced the question: what is the best Deep Learning framework?

Before we google it, let’s quora it. We can easily find a related question: Which is the best deep learning framework: Theano, Torch7 or Caffe? I recommend reading the whole thread, but here I copy some interesting parts:

If one wants to code up the entire algorithm for specific problem Theano is the quickest to get started with. It gives a comprehensive control over Neural Network formation. The reason we use Theano at ParallelDots is that the Neural Networks we make had no standard implementations and hence Theano was the best way to prototype them.

if you want to do more fundamental work like changing loss function or introducing some optimization constraint, you have to go to Theano (…) but I would like to warn you about complexities of Theano. It might happen that you waste 3 months just to understand the nity gritty of codes, by the time research has moved ahead.

Theano is very easy and quick to build back propagation. Torch7 is more transparent.

However, neither the question nor the answers mention Google’s piece of the cake: TensorFlow. So let’s quora it again. Now we get queries like: Is TensorFlow better than other leading libraries such as Torch/Theano? What is unique about Tensorflow from the other existing Deep Learning Libraries?

TensorFlow is both an R&D and deployment framework. It can be deployed on phones too. For rest of the features, it is more or less like Theano.  So yes, it more or less subsumes Theano and Torch’s features.
For new projects I can see a rapid adoption of TensorFlow somewhat beating others. Old practitioners who have been working on Theano/Torch will continue to use these frameworks. At least for me, Theano fulfils pretty much all requirements and I dont have anything which I need and I already know how to program it.

 TensorFlow performs non trivially worse than its competitors, in both speed and memory usage, Google is working on fixing this. I wouldn’t be surprised at performance parity in a couple of releases.
Benchmark TensorFlow · Issue #66 · soumith/convnet-benchmarks

The Hacker’s Machine Intelligence Platform just trolls Tensorflow.

I think there are two main differences at the moment, comparing it to the more mainstream libraries:
1. The visualization module (TensorBoard): One of the main lacking areas of almost all open source Machine Learning packages, was the ability to visually model and follow the computation pipeline.
2. The all-in-one hardware implementation approach: The libraries can be deployed in all kinds of hardware, from mobile devices to more powerful heterogeneous computing setups.

Lack of Symbolic loops (“scan” in Theano). Googles white paper mentions several control flow operations, but they are not ready yet.
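
To make the idea of a symbolic loop concrete, here is a minimal pure-Python sketch of what a scan-style operator computes: it threads a state through a sequence and collects every intermediate result. The name `scan` and its arguments are illustrative assumptions and do not reproduce the actual `theano.scan` signature; a real framework would also record the loop as a node in the graph rather than run it eagerly as this sketch does.

```python
def scan(step, sequence, initial):
    """Hypothetical helper: apply step(state, x) over sequence, keeping every state."""
    state = initial
    outputs = []
    for x in sequence:
        state = step(state, x)
        outputs.append(state)
    return outputs


# Running sum: the same kind of recurrence that an RNN hidden state follows.
cumulative = scan(lambda acc, x: acc + x, [1, 2, 3, 4], 0)
print(cumulative)  # [1, 3, 6, 10]
```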

Subgraph Execution is awesome. Being able to introduce and retrieve the results of discretionary data on any edge of the graph introduces considerable debugging potential into TensorFlow. I truthfully cannot undersell how useful this is, I can see on the fly execution of sub components making its way into my workflow nicely.
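
The subgraph execution praised above can be illustrated with a toy graph in pure Python. This is only a sketch of the idea, not TensorFlow code: the `Node` class and the `run` function are hypothetical names invented for this example. It shows how fetching an intermediate node evaluates only what that node depends on, and how feeding a value on an inner edge bypasses everything upstream of it.

```python
class Node:
    """Hypothetical graph node: an operation plus the nodes it reads from."""

    def __init__(self, name, op=None, inputs=()):
        self.name = name
        self.op = op  # None marks a placeholder that must be fed
        self.inputs = inputs


def run(fetch, feed):
    """Evaluate only the nodes that `fetch` depends on.

    `feed` maps nodes to values. A fed node is used as-is, so a value can be
    injected on any edge of the graph, not only at the placeholders.
    """
    if fetch in feed:
        return feed[fetch]
    if fetch.op is None:
        raise ValueError("placeholder %r was not fed" % fetch.name)
    return fetch.op(*(run(node, feed) for node in fetch.inputs))


# Build the graph first; nothing is computed at this point.
x = Node("x")
y = Node("y")
total = Node("total", lambda a, b: a + b, (x, y))
squared = Node("squared", lambda a: a * a, (total,))

print(run(squared, {x: 2, y: 3}))  # whole graph: 25
print(run(total, {x: 2, y: 3}))    # intermediate node only: 5
print(run(squared, {total: 10}))   # value fed on an inner edge: 100
```

Separating graph construction from execution is what makes this kind of inspection possible: the same graph can be queried at any node without touching the code that built it.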

In summary:

  • TensorFlow is not the fastest framework.
  • TensorFlow is not only an R&D framework but also a deployment platform.
  • It has a great visualization module, TensorBoard.

Similar observations can be made by watching Justin Johnson’s lecture:

My own impressions of Justin’s lecture:

1. Justin does not mention a prosaic TensorFlow constraint: if you do not have a good enough GPU, you will not be able to install TensorFlow (my GeForce GT 755M is not enough).
2. I am disappointed by Torch. Lua appears to be evil (I do not know which is worse, global variables or indexing from 1!?). Torch is also not recommended for Recurrent Neural Networks.
3. Justin shows that debugging is cumbersome; however:
a) the debugging options of Theano are not shown,
b) I believe Theano’s debugging problems stem from the computation-graph paradigm, which it shares with TensorFlow (‘graph-code independence’).
4. At the end (the batch-norm use case), Justin recommends Torch when we want an efficient backprop, and Theano or TensorFlow when we do not want to derive the equations analytically. However, Matrix factorization with Theano shows that:

there doesn’t seems to be any gain in analytically deriving a function with respect to using the automatic derivation capabilities of Theano.
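
To show what "analytic" versus "automatic" derivation means here, below is a small self-contained sketch. It is not Theano code: it uses forward-mode dual numbers, a simpler technique than Theano’s symbolic graph differentiation, and the `Dual` class and the scalar `loss` function are assumptions made for this example. The quoted claim is about speed, which this sketch does not measure; it only checks that both routes give the same gradient for a one-dimensional factorization r ≈ p * q.

```python
class Dual:
    """Number carrying a value and its derivative (forward-mode autodiff)."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def lift(x):
        # Treat plain numbers as constants with zero derivative.
        return x if isinstance(x, Dual) else Dual(x)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def loss(p, q, r):
    """Squared error of a one-dimensional factorization r ~ p * q."""
    err = r - p * q
    return err * err


p, q, r = 1.5, 2.0, 4.0

# Automatic: seed dp/dp = 1 and read the derivative off the result.
auto_grad = loss(Dual(p, 1.0), q, r).deriv

# Analytic: d/dp (r - p*q)^2 = -2 * q * (r - p*q), derived by hand.
analytic_grad = -2.0 * q * (r - p * q)

print(auto_grad, analytic_grad)  # -4.0 -4.0
```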

Nevertheless, I like this lecture very much. Here you can find the slides; slide 145 has a table with an overview.

Coming back to Theano, there are many Theano-based libraries. Probably the easiest way to start training a CNN or RNN is to use Keras (which can also use TensorFlow as its backend!). One of the main features of Keras is abstraction: it hides the backend (Theano/TensorFlow). However, sometimes we want to get our hands dirty and have more direct access to Theano. In that case, Lasagne is probably a better choice.

So after all, my choice of Deep Learning framework (after a longer journey with Caffe and a shorter one with Keras) will probably be Theano + Lasagne + nolearn (helper functions around Lasagne). It is very likely that I will later switch to TensorFlow, which is really tempting even today, but the Theano-based toolbox is convincing for the following 3 reasons:

  1. Theano has been public since 2010 (so googling it is more fruitful)
  2. I am also interested in the Bayesian approach and pymc3, which is built on Theano
  3. I believe that the ‘graph computation paradigm’ is something more than code, so switching to TensorFlow is easier for a Theano expert (and I do not rule out such a switch).

Finally, if you are interested in Theano/Lasagne, here are links to some interesting educational materials:
Theano, a short practical guide, presentation made by Emmanuel Bengio
From multiplication to convolutional networks (Theano presentation, codes)
Using convolutional neural nets to detect facial keypoints tutorial (blog post of Daniel Nouri, Lasagne+nolearn)
Recurrent Neural Networks Tutorial (great series about making RNN in numpy and Theano by Denny Britz)
Neural networks with Theano and Lasagne (Theano/Lasagne tutorial by Eben Olson)

Have fun!

Edit: A really good answer has just appeared on Quora (6.05.2016), so I feel obliged to link it here: Is TensorFlow better than other leading libraries such as Torch/Theano?