
Making Keras flexible without too many specific edge cases #883

Closed
EderSantana opened this issue Oct 23, 2015 · 12 comments

@EderSantana
Contributor

As we have passed 3,000 stars and attract more and more users, more and more edge cases are being added to the main repo. While this is awesome for growing the library fast and helping as many people as possible, it makes me wonder whether we can continue to scale at that speed.

To make sure Keras can keep growing fast and helping as many people as possible without breaking things, I'd like to propose some functional options. The first two things I thought about were a LambdaLayer and a LambdaDataset. I'm already using LambdaLayer in Seya pretty successfully, and it looks like this:

# These imports assume the Keras 0.x layout this issue was written against.
from keras.layers.core import MaskedLayer
from keras.utils.theano_utils import ndim_tensor


class Lambda(MaskedLayer):
    def __init__(self, func, output_shape, ndim=2):
        super(Lambda, self).__init__()
        self.input = ndim_tensor(ndim)   # symbolic input placeholder of the given rank
        self.func = func                 # arbitrary function applied to the input tensor
        self._output_shape = output_shape

    def get_output(self, train=False):
        X = self.get_input(train)
        return self.func(X)

    @property
    def output_shape(self):
        return self._output_shape

As an example of how powerful this is, we can bootstrap all Merge modes with a combination of merge_mode="join" and a LambdaLayer. For example, say we want to merge things with a kernel projection:

import theano.tensor as T

def kproj(x):
    # x is the joined list: x[0] is layer1's output, x[1] is layer2's output
    return (T.dot(x[0], x[1].T) + 1) ** 3   # degree-3 polynomial kernel projection

# 'kproj_merge' is just an example name; Graph.add_node expects one.
model.add_node(Lambda(kproj, output_shape=(None, out_dim)), name='kproj_merge',
               inputs=['layer1', 'layer2'], merge_mode='join')

We did it pretty quickly and nothing new had to be added to the main repo, and yet we are now merging layer1 and layer2 in a very high-dimensional space and making Jordan happy.

The same flexibility could be added to datasets and I'm sure you guys have ideas of how to do that elsewhere. This is the same philosophy as the one behind our beloved Callbacks. Let me know what you think.

@fchollet
Collaborator

I think lambda layers are a great idea, in fact it's surprising we don't have them already. They're in the spirit of our existing "freeform" elements (activations, losses, callbacks).

They can definitely help reduce the number of layers in Keras, which might be starting to reach its limits. By the way, one way to control that number is to start deleting unnecessary layers, like WordContextProduct (which should be implementable with a Dot layer) and a few others. Striving for simplicity also means deleting that which is not essential.

Only technical issue I could see with lambda layers is the output shape computation. Here's a suggestion: the output_shape constructor argument should be polymorphic and should take 3 types:

  • None (default): we default to returning the input shape
  • a tuple (not including the samples dimension (first dimension), for consistency with the input_shape layer API): we return (input_shape[0],) + tuple.
  • a lambda taking a tuple and returning a tuple. Maps input shape (including samples dimension) to output shape.

About samples dimension: in the general case it only contains None because the number of samples in a batch is variable. But it may not always be the case. If you are doing exotic reshapes / dimshuffles (or setups like word embeddings where you need either 2 or 3 samples per batch), you might want to fix the batch size.
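
A minimal sketch of how those three forms could be resolved (this is not Keras source; resolve_output_shape and its arguments are hypothetical names):

def resolve_output_shape(output_shape, input_shape):
    # input_shape includes the samples dimension, e.g. (None, 32)
    if output_shape is None:
        return input_shape                              # default: same as the input shape
    if isinstance(output_shape, (tuple, list)):
        return (input_shape[0],) + tuple(output_shape)  # prepend the samples dimension
    return output_shape(input_shape)                    # callable: input shape -> output shape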

@fchollet
Collaborator

Note that one issue that would be raised with lambda layers is serialization. How do you serialize/deserialize the func argument and the output_shape argument (when that one is a lambda too)?

@jfsantos
Contributor

It would not be possible to serialize lambdas, as @fchollet mentioned, at least when using the HDF5 or JSON serialization backends. We could keep a stub for lambda layers in the serialization format and have the user fill in the func and output_shape fields. Anything else would look like an ugly hack :)
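
For illustration, such a stub might look something like this in a JSON-style config (the field names are hypothetical, not an actual Keras format):

lambda_stub = {
    "class_name": "Lambda",
    "config": {
        "func": None,          # the user re-attaches the function after loading
        "output_shape": None,  # likewise, when it was given as a lambda
    },
}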

@nouiz
Contributor

nouiz commented Oct 26, 2015

If the user uses a Python function instead of a lambda it will work, but the pickled file will depend on his code being present.
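
For context, a small sketch of that difference (plain Python, not Keras code):

import pickle

def double(x):        # a named, module-level function
    return 2 * x

# pickle stores only a reference (module name + function name), so this works,
# but unpickling later requires the module that defines `double` to be importable.
blob = pickle.dumps(double)

# A lambda has no importable name, so pickling it fails.
try:
    pickle.dumps(lambda x: 2 * x)
except Exception as err:
    print(type(err).__name__, err)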

@sergeyf

sergeyf commented Nov 4, 2015

@EderSantana Can you help me understand the difference between merge_modes join and concat?

@EderSantana
Contributor Author

concat creates a new tensor. Say you have tensors with dimensions [100, 5] and [100, 4]; the result is a tensor with dimensions [100, 9]. join simply creates a list with both tensors and passes it to the next layer.
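
A plain NumPy illustration of the shapes involved (not the Keras merge code itself):

import numpy as np

a = np.zeros((100, 5))
b = np.zeros((100, 4))

merged = np.concatenate([a, b], axis=1)   # concat: one new array, shape (100, 9)
joined = [a, b]                           # join: no new array, the next layer receives the list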

@sergeyf

sergeyf commented Nov 4, 2015

Thanks!


@rpinsler
Contributor

Is it possible to pass a function that takes additional parameters? I tried to implement a custom activation function with a Lambda layer but couldn't really figure out how to do it properly (see #1061). Are there any alternatives?
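
One possible approach (just a sketch, reusing the Lambda and out_dim placeholders from above) is to fix the extra parameters with a closure or functools.partial, so the layer only ever receives a single-argument function:

from functools import partial

def scaled_activation(x, alpha):
    return alpha * x

# partial (or a closure) fixes alpha up front; the resulting callable takes only x
scaled_by_point1 = partial(scaled_activation, alpha=0.1)
model.add(Lambda(scaled_by_point1, output_shape=(None, out_dim)))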

@joetigger

Can someone please add a backend function to get a tensor's shape? Otherwise Lambda still can't handle tricks such as a custom loss function.

@fchollet
Collaborator

There is:
K.shape -> symbolic shape
K.int_shape -> static shape
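
For example (a quick sketch against the Keras backend API):

from keras import backend as K
from keras.layers import Input

x = Input(shape=(32, 10))
print(K.int_shape(x))   # (None, 32, 10) -- a static Python tuple, may contain None
print(K.shape(x))       # a symbolic tensor; its value is only known at run time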

@joetigger

Hmm, I tried but got an error:

def get_lossfunc(y_true, y_pred):
  output_shape = K.init_shape(y_pred)[:-1] + (1,)
  mu = Lambda(lambda x:x[:,0], output_shape=output_shape)(y_pred)
  sigma = Lambda(lambda x:x[:,1], output_shape=output_shape)(y_pred)
  p = K.exp(-K.square((y_true - mu)/sigma)*0.5) / sigma * PI_half + K.epsilon()
  return K.mean(-K.log(p))

K.init_shape:

output_shape = K.init_shape(y_pred)[:-1] + (1,)
AttributeError: 'module' object has no attribute 'init_shape'

K.shape

raise TypeError('In Lambda, output_shape '
TypeError: In Lambda, output_shape must be a list, a tuple, or a function.

@ParikhKadam

@joetigger it's K.int_shape() and not K.init_shape().
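
For reference, a corrected sketch of the loss above using K.int_shape-free tensor slicing instead of Lambda layers (PI_half is assumed to be 1/sqrt(2*pi), and y_pred is assumed to pack [mu, sigma] per sample):

import numpy as np
from keras import backend as K

PI_half = 1.0 / np.sqrt(2.0 * np.pi)   # assumed normalisation constant

def get_lossfunc(y_true, y_pred):
    mu = y_pred[:, 0:1]      # slicing keeps a trailing axis so it broadcasts with y_true
    sigma = y_pred[:, 1:2]
    p = K.exp(-K.square((y_true - mu) / sigma) * 0.5) / sigma * PI_half + K.epsilon()
    return K.mean(-K.log(p))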
