Title: Fix: Shape Mismatch during Left Padding Adjustment in compute_metrics (Generated by Ana - AI SDE) #89

ana-ai-sde
Contributor

Description:

Fix for Issue 88

This pull request addresses a RuntimeError caused by a shape mismatch during the left padding adjustment in the compute_metrics function of examples/loreft/compute_metrics.py. The issue arises when the left_padding tensor is empty, leading to an incompatible broadcasting operation with the intervention_locations tensor.
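To illustrate why an empty left_padding breaks the adjustment, here is a minimal sketch of the standard broadcasting rule in plain Python, with no PyTorch dependency. The concrete shapes (intervention_locations as [interventions, batch, positions], left_padding reshaped to [1, batch, 1]) and the assumption that the adjustment is an in-place addition are illustrative guesses, not taken from the patch; the helper names are hypothetical.

```python
from itertools import zip_longest
from typing import Optional, Sequence, Tuple


def broadcast_shape(a: Sequence[int], b: Sequence[int]) -> Optional[Tuple[int, ...]]:
    """Return the broadcast shape of a and b, or None if they are incompatible.

    Dimensions are compared right to left; two sizes are compatible when they
    are equal or when one of them is 1 (the size-1 side is stretched).
    """
    result = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x == y or y == 1:
            result.append(x)
        elif x == 1:
            result.append(y)
        else:
            return None
    return tuple(reversed(result))


def can_add_in_place(target: Sequence[int], other: Sequence[int]) -> bool:
    """An in-place add cannot resize its target, so the broadcast shape
    must be exactly the target's shape."""
    return broadcast_shape(target, other) == tuple(target)


# Assumed shapes: 2 interventions, batch size 1, 3 positions.
locations_shape = (2, 1, 3)

# One padding offset per example: the in-place add is fine.
print(can_add_in_place(locations_shape, (1, 1, 1)))

# Empty left_padding reshaped to (1, 0, 1): the size-1 batch dimension
# broadcasts to 0, so the result no longer matches the target's shape.
print(broadcast_shape(locations_shape, (1, 0, 1)))
print(can_add_in_place(locations_shape, (1, 0, 1)))
```

Under these assumptions, the empty tensor does not fail the pairwise compatibility check; it fails because the broadcast result has a zero-sized batch dimension that the in-place target cannot take on.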

This patch was generated by Ana - AI SDE, an AI-powered software development assistant.

The fix introduces a check for the presence of elements in left_padding. If left_padding is empty, a warning message is printed, and the adjustment is skipped. This ensures the compatibility of tensor shapes and prevents the RuntimeError.
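The guard described above can be sketched roughly as follows. This is a plain-Python stand-in that uses nested lists in place of tensors, not the code in the patch: the function name adjust_for_left_padding, the [intervention][batch][position] layout, and the warning text are all assumptions made for illustration.

```python
from typing import List

# Hypothetical layout: locations[intervention][batch][position]
Locations = List[List[List[int]]]


def adjust_for_left_padding(locations: Locations, left_padding: List[int]) -> Locations:
    """Shift each example's intervention positions by its left-padding offset.

    If left_padding is empty, print a warning and return the locations
    unchanged instead of attempting the adjustment.
    """
    if len(left_padding) == 0:
        # Assumed warning text; the patch's actual message may differ.
        print("Warning: left_padding is empty, skipping left padding adjustment.")
        return locations

    batch_size = len(locations[0]) if locations else 0
    if len(left_padding) != batch_size:
        raise ValueError(
            f"expected {batch_size} padding offsets, got {len(left_padding)}"
        )

    # Add the per-example offset to every position of that example.
    return [
        [
            [pos + left_padding[b] for pos in example]
            for b, example in enumerate(per_intervention)
        ]
        for per_intervention in locations
    ]


# Two interventions, batch size 1, three positions each.
locations = [[[0, 1, 2]], [[0, 1, 2]]]
shifted = adjust_for_left_padding(locations, [3])    # offsets applied
unchanged = adjust_for_left_padding(locations, [])   # warning, no adjustment
```

Skipping the adjustment keeps the run alive, but the locations are then left unshifted, so whether that is the right fallback depends on why left_padding came back empty.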

This patch improves the robustness of the compute_metrics function by handling edge cases related to left padding.

@frankaging
Collaborator

@ana-ai-sde hey Ana, thanks for the PR! Have you run any tests on this change? Thanks!

@d4rk-lucif3r

d4rk-lucif3r commented May 21, 2024

Hi @frankaging,

I am the author of Ana - AI SDE.

Yes, we validated the change. The test results weren't included because we are still working on the pull request template.

We ran the same command as mentioned in Issue 88.

Command:

python train.py -task gsm8k -model /home/Meta-Llama-3-8B-Instruct-function-calling-json-mode -seed 42 -l all -r 4 -p f7+l7 -e 12 -lr 9e-4 -type NodireftIntervention -gradient_accumulation_steps 4 -batch_size 1 -eval_batch_size 1 --dropout 0.05 --test_split validation --use_normalized_template --greedy_decoding --warmup_ratio 0.00 --weight_decay 0.06

Output Without Fix:

image

Output With Fix:

image

With the fix, the RuntimeError no longer occurs, the run continues to the subsequent steps, and training completes.

If you have any doubts or questions, feel free to ask.

Thanks,
Arsh Anwar

@aryamanarora
Collaborator

LGTM!

@aryamanarora aryamanarora merged commit 7dfd496 into stanfordnlp:main May 21, 2024