Hi,
Your curves are not incorrect; you have just pushed the training too far! 😄 If you stop the training at the 13th epoch ($\approx 200k$ steps), you will obtain the same model as the one in the project.
The real-time and on-device constraints forced us to use far fewer generator parameters than discriminator parameters (1.9M vs 27.8M). From that point, it was hard to reach a Nash equilibrium during training, BUT this does not prevent obtaining a performing generator.
If you want to go further than a simple reproduction of the results, we tried two interesting techniques from the Encodec paper that helped stabilize training and improve the results:

We did not include those techniques, as they were not part of our original paper.
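The "stop at the 13th epoch" advice can be sketched as a training loop with a fixed step budget instead of running until a (possibly elusive) Nash equilibrium. This is only an illustrative sketch: `train_step`, `STEPS_PER_EPOCH`, and the checkpointing logic are hypothetical stand-ins, not the project's actual API or epoch size.

```python
# Sketch: cap GAN training at a fixed step budget and keep the
# generator checkpoint from the target epoch. All names are illustrative.

MAX_STEPS = 200_000       # roughly the ~200k steps mentioned above
STEPS_PER_EPOCH = 15_000  # hypothetical epoch size (13 epochs <= 200k steps)

def train_step(step):
    """Placeholder for one generator/discriminator update."""
    return {"g_loss": 1.0 / (1 + step)}  # dummy metric, no real training

def train(max_steps=MAX_STEPS):
    checkpoints = []
    step = 0
    while step < max_steps:
        train_step(step)
        step += 1
        if step % STEPS_PER_EPOCH == 0:
            # Save a generator checkpoint at each epoch boundary;
            # with these numbers the last one is the 13th epoch.
            checkpoints.append((step // STEPS_PER_EPOCH, step))
    return checkpoints

epochs_saved = train()
```

The point is simply that the stopping criterion is a step count chosen from the curves, not a convergence test on the adversarial losses.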