Add weighting option to CopyNet #258

Merged
epwalsh merged 9 commits into main from weighted-copynet on May 4, 2021
Conversation

@epwalsh (Member) commented May 3, 2021

This is just a minor, general improvement to CopyNet. It adds the option to weight the loss contributions from individual instances when calculating the batch loss.
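
Roughly, the intended usage looks like this (a sketch: `reader` is assumed to be an already-constructed CopyNet dataset reader, and the source_string/target_string argument names are only for illustration; the weight argument is what this PR adds):

# Hypothetical usage sketch: attach a per-instance weight when building an
# instance; the model then scales that instance's loss contribution by it.
instance = reader.text_to_instance(
    source_string="a source sentence",
    target_string="a target sentence",
    weight=0.5,  # contributes half as much to the batch loss as weight=1.0
)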

@epwalsh epwalsh requested a review from dirkgr May 4, 2021 16:32
@@ -194,6 +202,9 @@ def text_to_instance(

fields_dict["metadata"] = MetadataField(meta_fields)

if weight is not None:
    fields_dict["weight"] = ArrayField(np.array([float(weight)], dtype=np.single))
Member

Just make it a TensorField, and use Torch methods to create the tensors. ArrayField is the old way.
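
For reference, a sketch of what that alternative would look like (assuming TensorField can simply wrap a torch tensor here):

from allennlp.data.fields import TensorField
import torch

if weight is not None:
    # Same field as above, but built with a torch tensor instead of a numpy array.
    fields_dict["weight"] = TensorField(torch.tensor([float(weight)]))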

# shape: (batch_size,)
if len(weight.shape) > 1:
    weight = weight.squeeze()
loss = -(weight * log_likelihood).sum() / batch_size
Member

Shouldn't this be / weight.sum()?

Member Author

I've gone back and forth on this.

If you use weight.sum() to normalize, then the weighting is only relative to each batch, which is probably not what you want.

For example, let's say we normalize by weight.sum() and your weights range from 0.5 to 1.0. If you have a batch that contains only instances with weights of 0.5, then this will give you the same result as if they all had weights of 1.0.
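
To put hypothetical numbers on that (a sketch, not from the PR):

import torch

log_likelihood = torch.tensor([-2.0, -2.0])
low = torch.tensor([0.5, 0.5])   # a batch where every weight is 0.5
high = torch.tensor([1.0, 1.0])  # a batch where every weight is 1.0

# Normalizing by weight.sum() erases the difference between the two batches:
print(-(low * log_likelihood).sum() / low.sum())    # tensor(2.)
print(-(high * log_likelihood).sum() / high.sum())  # tensor(2.)

# Normalizing by batch_size keeps the absolute meaning of the weights:
print(-(low * log_likelihood).sum() / 2)   # tensor(1.)
print(-(high * log_likelihood).sum() / 2)  # tensor(2.)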

Member

Hmm, fair enough. I would expect that setting all the weights to 1000 should be the same as setting them all to 0.001. But to get that behavior, and also the behavior you want, we would have to sum up all the weights in the dataset before processing a single batch, and then scale each batch accordingly. That's not practical. So let's leave it like this then.
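
If that scale invariance does matter for a given dataset, one workaround outside this PR is to pre-normalize the weights before building instances so they average to 1 (a sketch with made-up numbers):

# Hypothetical preprocessing step, not part of the change: rescale the raw
# weights by their dataset-wide mean, so that "1000 everywhere" and
# "0.001 everywhere" both collapse to a per-instance weight of 1.0.
raw_weights = [500.0, 1000.0, 1500.0]
mean_weight = sum(raw_weights) / len(raw_weights)
normalized_weights = [w / mean_weight for w in raw_weights]  # [0.5, 1.0, 1.5]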

@epwalsh epwalsh merged commit 77315fc into main May 4, 2021
@epwalsh epwalsh deleted the weighted-copynet branch May 4, 2021 17:35