
Is the Gumbel-Softmax formulation accurate? #10

Open
atiorh opened this issue May 12, 2020 · 0 comments

Comments


atiorh commented May 12, 2020

Thanks for releasing the code!

I have been reviewing how the Gumbel-Softmax [1] trick was used. Both the paper and the code suggest that the "relevance scores are interpreted as log probabilities" [2], but how can the output of a convolutional layer be interpreted as a strictly negative quantity? (This is unlikely to break training, but it may silently yield suboptimal performance due to inaccurate approximate sampling from the discrete distribution.)

Please let me know if there is a subtle intuition or training dynamic at play here that I am missing. Thanks!

[1] https://arxiv.org/pdf/1611.01144.pdf (Equation 1)
[2] https://arxiv.org/pdf/1711.11503.pdf (Section 3.3, page 5)
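For context, a minimal numpy sketch of the Gumbel-Max sampling underlying Equation 1 of [1], fed with unnormalized scores (as raw conv outputs would be) rather than true log-probabilities. The `scores` values here are made up for illustration; this is not the repository's code. One relevant property it demonstrates: because the argmax is shift-invariant, adding Gumbel noise to arbitrary (even positive) scores still draws exact samples from `softmax(scores)`:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unnormalized "relevance scores" (e.g. raw conv outputs).
# Note they are not strictly negative, so they are not valid log-probabilities.
scores = np.array([2.0, 0.5, 1.0])

def softmax(x):
    e = np.exp(x - x.max())  # subtract max for numerical stability
    return e / e.sum()

# Gumbel-Max trick: argmax(scores + Gumbel noise) samples from softmax(scores).
n = 200_000
gumbel = -np.log(-np.log(rng.random((n, scores.size))))  # standard Gumbel(0, 1)
samples = np.argmax(scores + gumbel, axis=1)
freq = np.bincount(samples, minlength=scores.size) / n

print(freq)             # empirical sampling frequencies
print(softmax(scores))  # matches softmax of the unnormalized scores
```

So the hard (argmax) sampling itself is insensitive to whether the scores are normalized log-probabilities; whether the same holds for the training dynamics of the softmax relaxation at a given temperature is the open question above.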
