How to use a custom objective function for a model? #369

Closed · log0 opened this issue Jul 9, 2015 · 34 comments
@log0

log0 commented Jul 9, 2015

I'm new to Keras and DL, so I may be asking really basic questions... I'd appreciate it if someone could explain in simple terms. Thanks!

How to use a custom objective function for a model? Any sample code I could look into to reference from?

It seems models can only accept a string of the pre-defined objective functions here : https://github.com/fchollet/keras/blob/master/keras/objectives.py

However, the doc (http://keras.io/models/) reads as if I could just plug in a custom function like score(true, pred):
loss: str (name of objective function) or objective function.

The source code also seems to say it accepts only a string: https://github.com/fchollet/keras/blob/master/keras/models.py#L240

@mthrok
Contributor

mthrok commented Jul 9, 2015

You've almost answered the question yourself. Create your objective function like the ones in https://github.com/fchollet/keras/blob/master/keras/objectives.py, for example:

import theano
import theano.tensor as T

epsilon = 1.0e-9
def custom_objective(y_true, y_pred):
    '''Just another crossentropy'''
    y_pred = T.clip(y_pred, epsilon, 1.0 - epsilon)
    y_pred /= y_pred.sum(axis=-1, keepdims=True)
    cce = T.nnet.categorical_crossentropy(y_pred, y_true)
    return cce

and pass it as the loss argument to compile:

model.compile(loss=custom_objective, optimizer='adadelta')

@M-Taha

M-Taha commented Oct 11, 2015

I am trying to write a new objective function like the following:

def expert_loss(y_true, y_pred):
    # y_pred is n-dimensional, y_true is n+1 dimensional.
    return T.sqr(T.dot(y_true[:-1].T, y_pred) - y_true[-1]).mean(axis=-1)

In this objective function the dimensions of y_true and y_pred are different. Keras gives a shape-mismatch error. Is this a bug? Is there a way around it?

@BrianMiner

@mthrok, this may be a naive question... but all you need to do is create a loss function (although it looks like that is not simple if you are not familiar with Theano)? Isn't the gradient needed for backprop?

@RagMeh11

Hi, can you tell me how to check the dimensions of y_true and y_pred? I want to define an objective function that depends on the Dice coefficient instead of accuracy, since we are using it for segmentation.

@rgalhama

rgalhama commented Mar 8, 2016

Hey, I also need to create a custom objective function, but in this case it should use the weights of the layer (instead of the output and prediction). Is there any workaround to access the weights?
Thanks!

@vadirajmkulkarni

@rgalhama To access the weights of any layer after compilation of the entire model you can use
weights = model.layers[id].get_weights() # list of numpy arrays

Not sure whether this will work at run time (inside the loss graph), though.
If your goal is to add a loss component based on the weights of the network, you can also look at the regularization options.
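
For the run-time case, a closure over the layer's symbolic weight tensors should work; here is a minimal sketch (make_weight_aware_loss is a made-up name, and it assumes layer.weights exposes the symbolic variables, as in recent Keras versions):

from keras import backend as K

def make_weight_aware_loss(layer, penalty=0.01):
    w = layer.weights[0]  # symbolic kernel tensor, usable inside the loss graph
    def loss(y_true, y_pred):
        # ordinary MSE plus a term computed from the layer's weights
        return K.mean(K.square(y_pred - y_true), axis=-1) + penalty * K.sum(K.square(w))
    return loss

model.compile(loss=make_weight_aware_loss(model.layers[1]), optimizer='adadelta')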

@gideonite

Is there a way of accessing the rest of the batch? I'm doing unsupervised learning and have a custom objective function which is "batch-wise": it depends on every example in the batch.
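
Worth noting: the loss function already receives the whole batch as a single tensor, so batch-wise terms can be written with reductions over axis 0. A minimal sketch (the objective itself is only an illustration):

from keras import backend as K

def batchwise_objective(y_true, y_pred):
    # y_pred holds every example in the batch, so batch statistics are available
    batch_mean = K.mean(y_pred, axis=0, keepdims=True)
    # e.g. penalize each example's squared distance from the batch mean
    return K.mean(K.square(y_pred - batch_mean), axis=-1)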

@wddabc

wddabc commented May 4, 2016

Hi,
It is great that the objective function is abstracted to take (y_true, y_pred) as input. I'm wondering whether there is a clean way to define the objective as a whole? It feels a little awkward to explicitly decompose my objective into y_pred and y_true to fit the objective interface. In my case, I only need to pass an X into my model and directly optimize the objective, without considering y. A workaround is to define a function that ignores y_true and to pass a dummy vector of ones during training, but I'm wondering whether there is a better way.
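
The dummy-target workaround would look roughly like this sketch (X and output_dim are placeholders for your data and the model's output width):

import numpy as np
from keras import backend as K

def unsupervised_objective(y_true, y_pred):
    # y_true is ignored entirely; the objective depends only on the model output
    return K.mean(K.square(y_pred), axis=-1)

model.compile(loss=unsupervised_objective, optimizer='adadelta')
dummy_y = np.ones((len(X), output_dim))  # shape must match the model output
model.fit(X, dummy_y)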

@LeZhengThu

@mthrok If I want to write a custom objective function myself, where should I put the function: in keras's objectives.py, or directly in my Python file? I tried your example code and put it in my Python file, but an error occurred: 'Exception: Invalid objective: custom_objective'

@LeZhengThu

@mthrok I used compile(loss='custom_objective', ....).
And if I use compile(loss=custom_objective, ...) just as in your example, Keras throws an error: 'NameError: global name 'custom_objective' is not defined'

@luthfianto
Contributor

luthfianto commented Jun 23, 2016

@LeZhengThu loss=custom_objective should be okay. Are you sure you aren't misplacing your custom objective function or making a typo in its name?
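
For reference, the function has to be in scope (defined or imported in the file that calls compile) and passed as a Python object, not a string; a minimal sketch:

from keras import backend as K

def custom_objective(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)

model.compile(loss=custom_objective, optimizer='adadelta')  # the function object, no quotes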

@hadi-ds

hadi-ds commented Jul 11, 2016

This is slightly off topic, but how can I get the shape of the Theano tensor variables that go through a cost function?

I am trying to write a custom cost function for an auto-encoder I built. It is basically a generalization of 'mean_squared_error' to the case where for each input vector the output is also a vector rather than a scalar. Mathematically, it is the following:

cost = sum_(i=1,...,N) {||x_i - x'_i||^2} / N, where x_i is a vector. I modified mean_squared_error as follows:

def vector_mse(X_true, X_pred):
    from keras import backend as K
    n_features = K.shape(X_true)[1]
    # 'axis=None' makes 'mean' run over all elements of the tensor,
    # so I multiply the result by n_features to get the mean as defined in the equation above.
    return n_features * K.mean(K.square(X_pred - X_true), axis=None)

It fails on 'n_features = K.shape(X_true)[1]' because K.shape(X_true) returns 'Shape.0' instead of a tuple with the number of rows/columns.

I am not very familiar with the Theano backend, but I tried evaluating it as 'K.eval(K.shape(X_true))' and that doesn't work.
Another thing I tried was to convert the tensors to numpy arrays,
'X_true_numpy = K.eval(X_true)', and go from there, but that also fails with the following error:
raise MissingInputError("Undeclared input", variable) theano.gof.fg.MissingInputError: ('Undeclared input', sequential_2_target), as if X_true is not really assigned.

I appreciate any ideas on how to make this work.

@shamidreza

shamidreza commented Nov 19, 2016

I want to design a customized loss function that uses layer outputs in the loss calculation.
For a hypothetical example, let's consider a 3-layered DNN: x->h_1->h_2->y
Let's say that in addition to minimizing mse(y, y_pred) we want to minimize mse(h_1, h_2) (a crazy hypothetical).
In Theano it is straightforward: cost = mse(y, y_pred) + mse(h_1, h_2)
But with Keras, how should I access h_1/h_2 so I can work with them?

One solution that comes to mind is to create a new input/output pair, call them X/Y: define X=[y, h_1] and Y=[y_pred, h_2] by merging (concatenating) them, then build a new cost function that splits the merged symbols apart and computes mse on each pair. Is this something that might work?
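
In functional-API terms (concatenate arrived with Keras 2; in Keras 1 the Merge layer plays the same role), the idea would look roughly like this sketch, with input_dim, output_dim and the layer widths as placeholders:

from keras.layers import Input, Dense, concatenate
from keras.models import Model
from keras import backend as K

inp = Input(shape=(input_dim,))
h1 = Dense(64, activation='relu')(inp)
h2 = Dense(64, activation='relu')(h1)
y = Dense(output_dim)(h2)

packed = concatenate([y, h1, h2])  # pack everything the loss needs into one tensor
model = Model(inp, packed)

def combined_loss(y_true, y_pred):
    y_p = y_pred[:, :output_dim]
    h1_p = y_pred[:, output_dim:output_dim + 64]
    h2_p = y_pred[:, output_dim + 64:]
    y_t = y_true[:, :output_dim]  # the rest of y_true is padding
    return K.mean(K.square(y_p - y_t), axis=-1) + K.mean(K.square(h1_p - h2_p), axis=-1)
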
Thanks,
Hamid

@17shasvatj

17shasvatj commented Dec 21, 2016

@M-Taha Did you ever resolve this issue involving shape mismatch? I have the same problem.

@curiale

curiale commented Feb 27, 2017

@shamidreza Have you found a cool way to do it? (I mean, avoiding the merge option)

@shamidreza

@curiale Unfortunately, no. I had to go back to my beloved theano for that.

@curiale

curiale commented Feb 27, 2017

@shamidreza Thanks. I think that you should open a new issue about this.

@kobeee

kobeee commented Mar 14, 2017

@hadi-ds I've met the same problem as you. Have you solved it?

@hadi-ds

hadi-ds commented Mar 15, 2017

@kobeee I ended up writing that objective function in theano as follows:

def vector_mse(y_true, y_pred):
    from theano import tensor as T
    diff2 = (y_true - y_pred)**2
    return T.mean(T.sum(diff2, axis = -1))

hope this helps.
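
The backend-agnostic equivalent, written only with keras.backend ops, is essentially the same sketch:

from keras import backend as K

def vector_mse(y_true, y_pred):
    return K.mean(K.sum(K.square(y_true - y_pred), axis=-1))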

@kobeee

kobeee commented Mar 16, 2017

@hadi-ds Thank you! I have solved the problem.

@joetigger

joetigger commented Apr 17, 2017

I have a similar problem to @hadi-ds's, except that mine couldn't easily be solved with tensor functions.
In my problem, the model output y_pred has shape (num_samples, 2), where y_pred[:,0] is the mean and y_pred[:,1] is the sigma. Since Keras doesn't have a split layer, I followed the suggestion to write my objective function using Lambda:

def myloss(y_true, y_pred):
    mu = Lambda(lambda x: x[:, 0], output_shape=input_shape[:-1] + (1,))(y_pred)
    sigma = Lambda(lambda x: x[:, 1], output_shape=input_shape[:-1] + (1,))(y_pred)
    p = K.exp(-K.square((y_true - mu) / sigma) * 0.5) / sigma / np.sqrt(2 * np.pi) + K.epsilon()
    return K.mean(-K.log(p))

However, unlike a Layer, a custom loss function doesn't know the shape of its tensors, and based on issue #2801 it seems that Keras doesn't support getting a tensor's shape or splitting a tensor, so how can I implement my objective function?
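
Edit: plain tensor indexing does work inside a loss function, so a sketch of the Gaussian negative log-likelihood above without Lambda layers might be (assuming y_true has shape (num_samples, 1)):

import numpy as np
from keras import backend as K

def myloss(y_true, y_pred):
    mu = y_pred[:, 0]      # standard slicing works on tensors inside a loss
    sigma = y_pred[:, 1]
    p = K.exp(-K.square((y_true[:, 0] - mu) / sigma) * 0.5) / sigma / np.sqrt(2 * np.pi) + K.epsilon()
    return K.mean(-K.log(p))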

@hkmztrk

hkmztrk commented Apr 21, 2017

Hello,

I'm trying to define my own metric,

def cindex_score(y_true, y_pred):
    sum = 0
    pair = 0
    for i in range(1, len(y_true)):
        for j in range(0, i):
            if i != j:
                if y_true[i] > y_true[j]:
                    pair += 1
                    sum += 1 * (y_pred[i] > y_pred[j]) + 0.5 * (y_pred[i] == y_pred[j])
    if pair != 0:
        return sum / pair
    else:
        return 0

But I get the following error: "object of type 'Tensor' has no len()". I know the Tensor object does not have a len attribute, but the shape attribute does not work either.

For instance, y_true is represented as Tensor("dense_4_target:0", shape=(?, ?), dtype=float32),
and its shape is Tensor("strided_slice:0", shape=(), dtype=int32).

Could you please help me turn the code above into a runnable form?

@danmoller

danmoller commented Apr 26, 2017

@joetigger, I managed to work around your problem (though I'm not sure if your Lambdas will work OK; my case was solved with this). It's probably not the most expected solution, but it's the only thing I could do so far, and it works great :D

I needed the loss function to carry the result of some Keras layers, so I created those layers as an independent model and appended them to the end of the original model. The idea is to train the model on the already-processed output instead of the original output:

Creating the model for calculating the loss:

#it will also be used for processing your training and validation outputs

def createLossModel():

    inLay = Input(shape=original_output_shape) #the shape of your original output

    #use your lambdas here.....
    lay = AnyKerasLayer(...)(inLay)
    lay = AnyOtherKerasLayer(...)(lay)
    #...
    output = OneMoreLayer(...)(lay)

    m = Model(inLay, output)

    #it's important to make this model not trainable if it has weights
    #(you should probably set these weights manually if that's the case)
    m.trainable = False
    for l in m.layers: l.trainable = False

    return m

Now, let's create a function to join this model to the original model:

def appendLossModel(lossModel, appendToModel, loss=None):

    #create an input layer that matches your original model
    inLay = Input(shape=original_input_shape) #the shape of your original input
    origOut = appendToModel(inLay)
    lossOut = lossModel(origOut)

    m = Model(inLay, lossOut)

    #this is the model you're going to train, so it needs to be compiled
    if loss is not None:
        m.compile(optimizer='adam', loss=loss)

    return m

Now let's prepare our models for training:

originalModel = createYourOriginalModelHere() #not necessary to compile
lossModel = createLossModel() #not necessary to compile    

#joining models:
#here you can use the rest of your "myloss", the part with K.exp(-K.square((y_true-m.....
trainingModel = appendLossModel(lossModel, originalModel, loss=myloss) 
    #compiled inside the append function    

Training. Important: you're now training against different targets, so process your training Y:

newY = lossModel.predict(originalY)
newValidationY = lossModel.predict(originalValidationY)

#there we go:
trainingModel.fit(originalX, newY, ......... [originalValidationX,newValidationY] .......)

And finally, for the results:

results =  originalModel.predict(originalX)

I hope it helps :D

@joetigger

@danmoller Thanks for the tip! Yes, your solution would help solve my problem. 👍

@InderpreetSinghChhabra01

InderpreetSinghChhabra01 commented Jul 18, 2017

Hi, I need to define my own loss function. I am using a GAN model, and my loss will include both an adversarial loss and an L1 loss between the true and generated images. I tried to write a function but got the following error:

ValueError: ('Could not interpret loss function identifier:', Elemwise{add,no_inplace}.0)

My loss function is:

def loss_function(y_true, y_pred, y_true1, y_pred1):
    bce = 0
    batch_size = 64

    for i in range(64):
        a = y_pred1[i]
        b = y_true1[i]
        x = K.log(a)
        bce = bce - x
    bce /= 64
    print('bce = ', bce)

    for i in zip(y_pred, y_true):
        img = i[0]
        image = np.zeros((64, 64), dtype=y_pred.dtype)
        image = img[0, :, :]
        image = image * 127.5 + 127.5
        imgfinal = Image.fromarray(image.astype(np.uint8))

        img1 = i[1]
        image1 = np.zeros((64, 64), dtype=y_true.dtype)
        image1 = img1[0, :, :]
        image1 = image1 * 127.5 + 127.5
        imgfinal1 = Image.fromarray(image1.astype(np.uint8))

        diff = ImageChops.difference(imgfinal, imgfinal1)

        h = diff.histogram()
        sq = (value * ((idx % 256) ** 2) for idx, value in enumerate(h))
        #sq = (value * (idx ** 2) for idx, value in enumerate(h))
        sum_of_squares = sum(sq)
        lossr = math.sqrt(sum_of_squares / float(im1.size[0] * im1.size[1]))
        loss = loss + lossr

    loss /= (64 * 127)
    print('loss = ', loss)

    return x + loss

Thanks in advance

@gaopinghai

@hkmztrk Have you solved this problem? It bothers me too!

@hkmztrk

hkmztrk commented Jul 31, 2017

Hey @PingHGao, sorry for my late reply! Yeah, check this out! https://stackoverflow.com/questions/43576922/keras-custom-metric-iteration
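
The gist of that answer is to replace the Python loops with broadcasting, so every (i, j) pair is compared at once; a rough sketch along those lines (TensorFlow backend assumed, and it assumes at least one valid pair exists):

import tensorflow as tf

def cindex_score(y_true, y_pred):
    # g[i, j] scores each prediction pair; f[i, j] marks pairs where y_true[i] > y_true[j]
    g = tf.subtract(tf.expand_dims(y_pred, -1), y_pred)
    g = tf.cast(tf.equal(g, 0.0), tf.float32) * 0.5 + tf.cast(tf.greater(g, 0.0), tf.float32)
    f = tf.subtract(tf.expand_dims(y_true, -1), y_true)
    f = tf.cast(tf.greater(f, 0.0), tf.float32)
    return tf.reduce_sum(g * f) / tf.reduce_sum(f)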

@sajjo79

sajjo79 commented Nov 29, 2017

Hi,
I am trying to write a loss function that includes statements like this:

tmp = tf.zeros([20, 5, 240, 240])
ind = tf.where(y_true == 1)  # y_true is a tensor with the same shape as tmp
tmp = tf.assign(tmp[ind], 1)

but it gives me an error. How can I do this?

@hgaiser
Contributor

hgaiser commented Nov 29, 2017

tf.equal
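
That is, build the indicator tensor functionally instead of mutating one with tf.assign; a minimal sketch:

import tensorflow as tf

def masked_loss(y_true, y_pred):
    # 1.0 where y_true == 1, else 0.0 -- no assignment needed
    mask = tf.cast(tf.equal(y_true, 1.0), tf.float32)
    return tf.reduce_mean(mask * tf.square(y_true - y_pred))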

@mverzett

mverzett commented Mar 29, 2018

Dear experts,

I've been trying to find the answer online without success: what is the expected output shape of the tensor returned by a custom loss? Is it a single scalar value or a vector?

Thank you

Edit:
I also noticed the following inconsistency in Keras 1 (I know, an old set-up):

>>> import numpy as np
>>> from keras import losses
>>> import tensorflow as tf
>>> from keras import backend as K
>>> s = tf.Session()
>>> x = np.random.rand(10,1)
>>> y = (np.random.rand(10,1) > 0.5).astype(float)
>>> losses.binary_crossentropy(tf.convert_to_tensor(y), tf.convert_to_tensor(x)).eval(session=s)
array([ 2.78712478,  1.3038832 ,  1.34750736,  1.20378335,  0.8925575 ,
        1.65690233,  0.54149211,  1.28658053,  0.30892665,  0.68163989])
>>> K.binary_crossentropy(tf.convert_to_tensor(y), tf.convert_to_tensor(x)).eval(session=s)
array([[ 15.1252521 ],
       [ 11.7424268 ],
       [ 11.92920796],
       [ 11.28175078],
       [  9.51601343],
       [ 13.04390931],
       [  6.7393083 ],
       [ 11.66605725],
       [  4.28363182],
       [  7.96577442]])

Can somebody explain the difference in behaviour and, most importantly, the reason for the different values?

Again, thanks!

@immuno121

@mverzett, I think there is an implementation-level difference. If we look at the source of both:

  1. losses.binary_crossentropy(y_true, y_pred) calls the same function,
    i.e. K.binary_crossentropy, but takes the mean of the output along the last axis:
    https://github.com/keras-team/keras/blob/master/keras/losses.py#L76
    Here is the link to K.binary_crossentropy:
    https://github.com/tensorflow/tensorflow/blob/r1.7/tensorflow/python/keras/_impl/keras/backend.py#L3413

I do not understand why this should make a difference in your case though, as the last axis has only a single dimension.

@emnajaoua

I am having trouble converting this function to keras in order to calculate a custom loss.
def detect_blocks(x):
    outputs = []
    for i, row in enumerate(x):
        last_ele = row[0]
        for j, val in enumerate(row[1:]):
            if val == last_ele:
                continue
            outputs.append([i, j, last_ele])
            last_ele = val
        outputs.append([i, len(row)-1, last_ele])
    return outputs
and this function as well:

def calculate_accuracy(l1, l2):
    # should be rescaled !!!
    acc = 0
    cmp = 0
    j = 0
    i = 0
    len_l1 = len(l1)
    len_l2 = len(l2)
    initial_length = len_l1
    while i < len_l1:
        while j < len_l2:
            if np.array_equal(l1[i], l2[j]):
                cmp += 1
                l1.remove(l1[i])
                l2.remove(l2[j])
                len_l1 = len(l1)
                len_l2 = len(l2)
            elif abs(l1[i][2] - l2[j][2]) < neighborhood_constant:
                if (l1[i][0] == l2[j][0]) and (l1[i][1] == l2[j][1]):
                    cmp += 1
                    l1.remove(l1[i])
                    l2.remove(l2[j])
                    len_l1 = len(l1)
                    len_l2 = len(l2)
                    j = 0
            j += 1
        i += 1
    acc = cmp / initial_length
    return acc
and I want to call those functions to calculate the loss using this code

#additional loss

x_test_normalized = tf.round(36 * inputs)
x_decoded_normalized = tf.round(36 * outputs)
acc_rectangles = 0
#elem_1 = tf.map_fn(lambda x: (x), x_test_normalized, dtype=(tf.float32))
#elem_2 = tf.map_fn(lambda x: (x), x_decoded_normalized, dtype=(tf.float32))
rectanglesL1 = detect_blocks(tf.reshape(x_test_normalized, [6, 6]))
rectanglesL2 = detect_blocks(tf.reshape(x_decoded_normalized, [6, 6]))
acc = calculate_accuracy(rectanglesL1, rectanglesL2)
acc_rectangles += acc
additional_loss = acc_rectangles / len(inputs)

So basically, I am using a VAE and I want to integrate this additional loss as well, but I get this error:
---> 14 rectanglesL1 = detect_blocks(tf.reshape(x_test_normalized, [6,6]))
     15 rectanglesL2 = detect_blocks(tf.reshape(x_decoded_normalized,[6,6]))
     16 acc = calculate_accuracy(rectanglesL1,rectanglesL2)

in detect_blocks(x)
      1 def detect_blocks(x):
      2     outputs=[]
----> 3     for i, row in enumerate(x):
      4         last_ele=row[0]
      5         for j, val in enumerate(row[1:]):

/opt/aiml4it/anaconda/3-5.2.0-generic/lib/python3.6/site-packages/tensorflow/python/framework/ops.py in __iter__(self)
    434     if not context.executing_eagerly():
    435         raise TypeError(
--> 436             "Tensor objects are not iterable when eager execution is not "
    437             "enabled. To iterate over this tensor use tf.map_fn.")
    438     shape = self._shape_tuple()

TypeError: Tensor objects are not iterable when eager execution is not enabled. To iterate over this tensor use tf.map_fn.

@MingleiLI

MingleiLI commented Nov 4, 2018

(quoting @mthrok's custom_objective answer above)

@mthrok What is the requirement on the inputs y_pred and y_true? Should they come from logits, or from a softmax so that they behave as probabilities, as in https://github.com/tensorflow/tensorflow/blob/r1.11/tensorflow/python/keras/backend.py?

@MatthiasWinkelmann

@emnajaoua

Your question (and, tbh, all the others here) is better suited for Stack Overflow, considering it isn't a bug in Keras but a question about its use.

You are also more likely to get useful answers by investing a modicum of time in properly formatting your post. It is currently lacking line breaks and respect for the reader.

That being said, I'll give you the hint that Tensorflow was giving you a hint:

TypeError: Tensor objects are not iterable when eager execution is not enabled. To iterate over this tensor use tf.map_fn.`

That's a pretty useful error message. Did you try tf.map_fn? Because it would do what you're trying to do.

The larger problem, however, is that you haven't grokked how tensorflow actually works. https://www.tensorflow.org/guide/graphs might be a good introduction. Specifically, getting a tensor's length does not make much sense because a tensor's length/shape is fixed at compile time.

The process of implementing a custom loss isn't fundamentally different from doing so in standard Python. It's just that you cannot manipulate tensors with the Python stdlib you are used to. See the TensorFlow documentation for the operations that are defined on tensors (such as tf.reduce_mean()). That's the toolbox you are working with. It also helps to read some of the Keras source code to get a feel for how such things are done.
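
For example, a minimal tf.map_fn sketch for the (6, 6) tensor from the post above (reusing its x_test_normalized):

import tensorflow as tf

x = tf.reshape(x_test_normalized, [6, 6])
# apply a per-row computation without a Python for-loop
row_sums = tf.map_fn(lambda row: tf.reduce_sum(row), x)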
