
add detailed documents for beam search implementation #535

Closed
shawnthu opened this issue Feb 27, 2019 · 9 comments

@shawnthu

The SequenceGenerator class is very hard to understand. Can someone provide detailed documentation?
e.g.

        # get the top beam_size active hypotheses, which are just the hypos
        # with the smallest values in active_mask
        active_hypos, _ignore = buffer('active_hypos'), buffer('_ignore')  # [b, k]
        torch.topk(
            active_mask, k=beam_size, dim=1, largest=False,
            out=(_ignore, active_hypos)
        )

        active_bbsz_idx = buffer('active_bbsz_idx')
        torch.gather(
            cand_bbsz_idx, dim=1, index=active_hypos,
            out=active_bbsz_idx,
        )
        active_scores = torch.gather(
            cand_scores, dim=1, index=active_hypos,
            out=scores[:, step].view(bsz, beam_size),
        )

        active_bbsz_idx = active_bbsz_idx.view(-1)
        active_scores = active_scores.view(-1)

        # copy tokens and scores for active hypotheses
        torch.index_select(
            tokens[:, :step + 1], dim=0, index=active_bbsz_idx,
            out=tokens_buf[:, :step + 1],
        )
        torch.gather(
            cand_indices, dim=1, index=active_hypos,
            out=tokens_buf.view(bsz, beam_size, -1)[:, :, step + 1],
        )
        if step > 0:
            torch.index_select(
                scores[:, :step], dim=0, index=active_bbsz_idx,
                out=scores_buf[:, :step],
            )
        torch.gather(
            cand_scores, dim=1, index=active_hypos,
            out=scores_buf.view(bsz, beam_size, -1)[:, :, step],
        )

The above vectorized code almost drives me crazy. I know it speeds up computation, but at the cost of readability.
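For readers hitting the same wall, here is a hedged toy reproduction of the selection step quoted above. The tensors and the simplified `active_mask` construction are hypothetical (fairseq builds `active_mask` from candidate offsets and an eos mask in a slightly different way), but the `topk`/`gather` mechanics are the same:

```python
import torch

# Toy setup: bsz=2 sentences, beam_size=2, so 2*beam_size candidates each.
bsz, beam_size = 2, 2
cand_scores = torch.tensor([[0.9, 0.8, 0.7, 0.6],
                            [0.5, 0.4, 0.3, 0.2]])
# which candidates ended in eos (hypothetical mask for illustration)
eos_mask = torch.tensor([[True, False, False, False],
                         [False, False, True, False]])

# Simplified active_mask: the candidate's rank, plus a large penalty for
# eos-finished candidates so that topk(..., largest=False) skips them.
active_mask = torch.arange(2 * beam_size).repeat(bsz, 1) \
    + eos_mask.long() * (2 * beam_size)

# The beam_size candidates with the smallest mask values are exactly the
# highest-ranked candidates that did NOT end in eos.
_, active_hypos = torch.topk(active_mask, k=beam_size, dim=1, largest=False)
active_scores = torch.gather(cand_scores, dim=1, index=active_hypos)

print(active_hypos.tolist())   # [[1, 2], [0, 1]]
```

In the first sentence, candidate 0 ended in eos, so candidates 1 and 2 are selected; in the second sentence, candidate 2 ended in eos, so candidates 0 and 1 survive.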

@myleott

myleott commented Feb 28, 2019

We plan to release a simpler (albeit much slower) implementation as an alternative soon. If you want to modify the search code, the simplified version may be a bit easier to work with.

Or, are you trying to understand the fast version? If so, I highly recommend putting a break point (pdb) and stepping through it line by line with a batch of data to see what’s happening. The code you referenced is simply selecting the topk hypotheses that don’t end in eos, and adding the selected tokens/scores to the final tokens/scores tensors.
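The unvectorized logic described here can be sketched as a plain per-step loop. This is a hedged, minimal sketch (the `next_log_probs(prefix)` callable is a hypothetical interface, not fairseq's API), showing how the top-k non-eos hypotheses stay in the beam while eos-terminated ones move to a finished pool:

```python
def beam_search(next_log_probs, beam_size, eos, max_len):
    """Minimal unbatched beam search sketch. next_log_probs(prefix)
    returns a dict {token: log_prob} for the next token (hypothetical
    interface, not fairseq's)."""
    beams = [((), 0.0)]          # (token prefix, cumulative log-prob)
    finished = []
    for _ in range(max_len):
        # expand every live hypothesis by every possible next token
        candidates = [(prefix + (tok,), score + lp)
                      for prefix, score in beams
                      for tok, lp in next_log_probs(prefix).items()]
        candidates.sort(key=lambda c: c[1], reverse=True)
        # keep the top beam_size hypotheses that don't end in eos;
        # eos-terminated candidates go to the finished pool instead
        beams = []
        for prefix, score in candidates:
            if prefix[-1] == eos:
                finished.append((prefix, score))
            elif len(beams) < beam_size:
                beams.append((prefix, score))
        if not beams:
            break
    return max(finished + beams, key=lambda c: c[1])

def toy(prefix):
    # deterministic toy "model": token 1 is likely first, then eos
    if not prefix:
        return {1: -0.1, 2: -2.0, 0: -3.0}
    return {0: -0.1, 1: -1.0, 2: -1.0}

tokens, score = beam_search(toy, beam_size=2, eos=0, max_len=3)
# tokens == (1, 0): emit token 1, then eos, with score -0.2
```

The vectorized fairseq code does the same selection, but across the whole batch at once with `topk`/`gather` instead of a Python loop.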

@shawnthu

shawnthu commented Mar 5, 2019

> We plan to release a simpler (albeit much slower) implementation as an alternative soon. If you want to modify the search code, the simplified version may be a bit easier to work with.
>
> Or, are you trying to understand the fast version? If so, I highly recommend putting a break point (pdb) and stepping through it line by line with a batch of data to see what’s happening. The code you referenced is simply selecting the topk hypotheses that don’t end in eos, and adding the selected tokens/scores to the final tokens/scores tensors.

thx O(∩_∩)O

@ashutoshsaboo

Hi, has there been any further update on the more understandable beam search than the one from SequenceGenerator?
Also, what is SequenceScorer doing exactly? From the code, if I'm not wrong, is it just a normal forward-pass inference, equivalent to beam_size=1?
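For what it's worth, "scoring" a fixed target sequence generally means a single teacher-forced forward pass that sums the log-probability the model assigns to each reference token, rather than a search. A minimal sketch (the `score_sequence` helper is hypothetical, not fairseq's actual SequenceScorer interface):

```python
import math
import torch
import torch.nn.functional as F

def score_sequence(logits, target):
    """Sum of the model's log-probabilities for the reference tokens.
    logits: [seq_len, vocab] from one teacher-forced pass;
    target: [seq_len] reference token ids.
    Hypothetical helper, not fairseq's SequenceScorer API."""
    lprobs = F.log_softmax(logits, dim=-1)
    token_lprobs = lprobs.gather(1, target.unsqueeze(1)).squeeze(1)
    return token_lprobs.sum()

# uniform logits over a 4-token vocab -> each token scores log(1/4)
logits = torch.zeros(3, 4)
target = torch.tensor([0, 1, 2])
total = score_sequence(logits, target)   # ≈ 3 * log(1/4)
```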

@myleott

myleott commented Apr 23, 2019

Someone wrote a really nice tutorial about the beam search implementation in fairseq: http://www.telesens.co/2019/04/21/understanding-incremental-decoding-in-fairseq/

We have a slightly simpler beam search implementation, but we'd like to simplify it even further (by removing all batching) before releasing it. In any case this will be much slower than the current vectorized implementation.

@villmow

villmow commented May 29, 2019

Any updates on the release of the simpler beam search implementation?

@myleott

myleott commented May 30, 2019

@villmow, take a look here: 20bbbdc

It should be a drop-in replacement for SequenceGenerator. It's still batched, but removes a lot of the other complexity. Happy to take a PR if you want to take a stab at integrating this more cleanly or simplifying it further.

@hokkaido

I echo @shawnthu's sentiment: the current SequenceGenerator class, and especially its generate method, is very hard to dissect for an outsider (like me).

@stale

stale bot commented Apr 17, 2022

This issue has been automatically marked as stale. If this issue is still affecting you, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize it yet. If you have any new additional information, please include it with your comment!

@stale stale bot added the stale label Apr 17, 2022
@stale

stale bot commented Apr 27, 2022

Closing this issue after a prolonged period of inactivity. If this issue is still present in the latest release, please create a new issue with up-to-date information. Thank you!

@stale stale bot closed this as completed Apr 27, 2022