I mentioned in my initial post “Deep Learning Frameworks Overview” that my Deep Learning library of choice is (at least for now) the Theano and Lasagne combination. However, in that whole post I did not use one of the most important words: experiment. So, let’s assume that you have some idea and want to test it quickly. For example, what if we take a standard CNN (maxpooling omitted for clarity) and add some extra “convolutional branch” that is concatenated with the last-but-one layer:
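Schematically, the modification looks roughly like this (a rough text sketch only; it mirrors the build_modified_cnn code below, and the standard path omits maxpooling as mentioned above):

            /-> conv(5x5) -> conv(5x5) -> dense(256) -------\
input image                                                  concat -> dense(10, softmax)
            \-> conv(10x10) -> maxpool(5x5) -> flatten -----/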
This experiment is really easy to do in Lasagne (which is built on Theano). I just added a build_modified_cnn function to the MNIST example; the layers marked below as the extra branch form my “convolutional branch”, and the rest is the same as in the standard build_cnn function:
def build_modified_cnn(input_var=None):
    l_in = lasagne.layers.InputLayer(shape=(None, 1, 28, 28),
                                     input_var=input_var)
    l_conv1 = lasagne.layers.Conv2DLayer(
        l_in, num_filters=32, filter_size=(5, 5),
        nonlinearity=lasagne.nonlinearities.rectify,
        W=lasagne.init.GlorotUniform())
    # --- extra convolutional branch ---
    l_conv1A = lasagne.layers.Conv2DLayer(
        l_in, num_filters=32, filter_size=(10, 10),
        nonlinearity=lasagne.nonlinearities.rectify,
        W=lasagne.init.GlorotUniform())
    l_maxpool1A = lasagne.layers.MaxPool2DLayer(l_conv1A, pool_size=(5, 5))
    l_dense1A = lasagne.layers.FlattenLayer(l_maxpool1A)
    # --- end of extra branch ---
    l_maxpool1 = lasagne.layers.MaxPool2DLayer(l_conv1, pool_size=(2, 2))
    l_conv2 = lasagne.layers.Conv2DLayer(
        l_maxpool1, num_filters=32, filter_size=(5, 5),
        nonlinearity=lasagne.nonlinearities.rectify)
    l_maxpool2 = lasagne.layers.MaxPool2DLayer(l_conv2, pool_size=(2, 2))
    l_dense1 = lasagne.layers.DenseLayer(
        lasagne.layers.dropout(l_maxpool2, p=.5),
        num_units=256,
        nonlinearity=lasagne.nonlinearities.rectify)
    # the extra branch is concatenated with the last-but-one layer
    l_concat = lasagne.layers.ConcatLayer([l_dense1, l_dense1A])
    l_dense2 = lasagne.layers.DenseLayer(
        lasagne.layers.dropout(l_concat, p=.5),
        num_units=10,
        nonlinearity=lasagne.nonlinearities.softmax)
    return l_dense2
Here we see some standard layers such as Conv2DLayer and MaxPool2DLayer, as well as less common but self-explanatory ones: FlattenLayer and ConcatLayer. Some results on the MNIST dataset:
Type | Epoch | Accuracy | Time |
Standard CNN | 1 | 56.75 % | 6.767s |
Standard CNN | 30 | 96.76 % | 6.923s |
Modified CNN | 1 | 72.51 % | 9.552s |
Modified CNN | 30 | 96.03 % | 9.592s |
So, did the modification win? Not this time. It gives a clear advantage after the first epoch, but over the whole run it learns more slowly than the standard network. To find that out, however, I needed only a few minutes of coding, and that is the true power of Theano and Theano-based libraries!
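For reference, build_modified_cnn plugs into the training code in exactly the same way as build_cnn in the Lasagne MNIST example. Here is a condensed sketch of that wiring (the evaluation pass and the mini-batch loop are omitted, and the variable names follow the example but are not quoted verbatim):

import theano
import theano.tensor as T
import lasagne

# Symbolic inputs: a batch of images and the corresponding labels.
input_var = T.tensor4('inputs')
target_var = T.ivector('targets')

# Build the modified network and define the training loss.
network = build_modified_cnn(input_var)
prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.categorical_crossentropy(prediction, target_var).mean()

# Update all trainable parameters with Nesterov momentum.
params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.nesterov_momentum(loss, params,
                                            learning_rate=0.01, momentum=0.9)

# Compile the training function; one call processes one mini-batch.
train_fn = theano.function([input_var, target_var], loss, updates=updates)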
Could you clarify the “time” column? It’s the elapsed real time per epoch, not the total training time, I assume?
Yup, you are right 🙂
Reblogged this on mugedblog and commented:
Grzesiek’s post on how easy it is to experiment in Lasagne!
I wonder, if the network *starts* with 2 branches, like this:
input_A -> [model_path_A] -----\
                                |--[Linear]--> output_y
input_B -> [model_path_B] -----/
and the input is a tuple (input_A, input_B), how do you set the inputs and train?
Can you give a new example?
Thanks!
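A minimal sketch of one way to wire such a two-input network in Lasagne (the names and sizes below, e.g. build_two_branch and the layer widths, are illustrative only, not from the post): each branch gets its own InputLayer bound to its own Theano variable, and the compiled training function simply takes both inputs alongside the targets.

import theano
import theano.tensor as T
import lasagne

def build_two_branch(input_var_A=None, input_var_B=None):
    # Each branch has its own input layer bound to its own symbolic variable.
    l_in_A = lasagne.layers.InputLayer(shape=(None, 100), input_var=input_var_A)
    l_in_B = lasagne.layers.InputLayer(shape=(None, 50), input_var=input_var_B)
    l_hid_A = lasagne.layers.DenseLayer(
        l_in_A, num_units=64, nonlinearity=lasagne.nonlinearities.rectify)
    l_hid_B = lasagne.layers.DenseLayer(
        l_in_B, num_units=64, nonlinearity=lasagne.nonlinearities.rectify)
    # Merge the two paths and map them to the output.
    l_concat = lasagne.layers.ConcatLayer([l_hid_A, l_hid_B])
    return lasagne.layers.DenseLayer(
        l_concat, num_units=10, nonlinearity=lasagne.nonlinearities.softmax)

input_var_A = T.matrix('inputs_A')
input_var_B = T.matrix('inputs_B')
target_var = T.ivector('targets')

network = build_two_branch(input_var_A, input_var_B)
prediction = lasagne.layers.get_output(network)
loss = lasagne.objectives.categorical_crossentropy(prediction, target_var).mean()
params = lasagne.layers.get_all_params(network, trainable=True)
updates = lasagne.updates.nesterov_momentum(loss, params,
                                            learning_rate=0.01, momentum=0.9)

# The training function takes both elements of the input tuple plus the targets.
train_fn = theano.function([input_var_A, input_var_B, target_var],
                           loss, updates=updates)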