Track gradients #15
Thanks @MatteoRobbiati.
The black line corresponds to the end of the first VQE training, correct?
Perhaps instead of that, when you draw the line connecting the two VQE runs you could use a different color/style and state in the legend that you are just connecting the points while performing DBI.
You mean, taking the figure as reference, the line which connects the jump of the gradient value from 1e-3 to 1e-1?
Yes, just to make clear that the gradient is not increasing by itself but that the jump is due to the DBI.
Yep, makes sense. It was my idea!
Regarding GPU, I might need to check whether the code works properly, but I think so. Let me know if you see any errors.
@MatteoRobbiati I would wait until this PR is merged to start running multiple jobs for the BP.
Thanks @MatteoRobbiati
Zoe:
Test initialization BP:
When you're stuck are the parameters jumping around or just making a small wiggle?
Test landscape after training #1:
Test landscape after training #2:
Q: What happens if one does a quantum natural gradient step instead of a DBI step? BPs show up when sampling with shots, but for these system sizes there should always be enough visibility to still take a gradient step (even if it's a small update). This means we only have a reduction of the gradients, but they are not super small yet. Quantum imaginary evolution: what is the relation to this? See lower bounds.
Running this with shot noise can be interesting to showcase how the method works. It can happen that VQE is really hard to optimize when the cost function is estimated from measurements (evaluating which direction improves becomes noisy). In this case the advantage of adding DBI might be amplified.
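To illustrate the shot-noise remark above, here is a minimal toy sketch. It is not code from this repository: it assumes a single-qubit circuit RY(θ)|0⟩ with cost ⟨Z⟩ = cos(θ), and the function names `estimate_z` and `parameter_shift_gradient` are hypothetical. It only shows how a gradient estimated from a finite number of shots becomes noisy, so that the sign of a small gradient can be misjudged.

```python
import numpy as np


def estimate_z(theta, shots, rng):
    """Estimate <Z> = cos(theta) for RY(theta)|0> from a finite number of shots."""
    p_plus = 0.5 * (1.0 + np.cos(theta))    # probability of measuring +1
    n_plus = rng.binomial(shots, p_plus)    # number of +1 outcomes
    return (2.0 * n_plus - shots) / shots   # sample mean of the +/-1 outcomes


def parameter_shift_gradient(theta, shots, rng):
    """Parameter-shift estimate of d<Z>/dtheta (exact value: -sin(theta))."""
    plus = estimate_z(theta + np.pi / 2, shots, rng)
    minus = estimate_z(theta - np.pi / 2, shots, rng)
    return 0.5 * (plus - minus)


if __name__ == "__main__":
    rng = np.random.default_rng(seed=0)
    theta = 0.01  # small exact gradient: -sin(0.01) is about -0.01
    for shots in (100, 10_000, 1_000_000):
        grads = [parameter_shift_gradient(theta, shots, rng) for _ in range(200)]
        wrong_sign = np.mean(np.array(grads) > 0)
        print(f"shots={shots}: std={np.std(grads):.4f}, wrong-sign fraction={wrong_sign:.2f}")
```

The spread of the estimate scales roughly as one over the square root of the number of shots, so once the exact gradient is smaller than that spread the estimated update direction is close to random.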
With this PR we enable the tracking of the gradients of the loss function $L$ w.r.t. the trainable parameters $\vec{\theta}$ during the training process (added to `callbacks`). This feature can be used to detect the Barren Plateau regime, or to trigger the DBI if some average magnitude threshold value is detected.
In the function `plotscripts.plot_gradients` the values $$g = \frac{1}{N_{\rm params}}\sum_{i=1}^{N_{\rm params}}|\partial_{\theta_i} L|$$ are plotted as a function of the optimization iteration.
An example of output follows (10 qubits, 1 layer, BFGS).
Loss:
Gradients:
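The averaged gradient magnitude $g$ from the description above, and a threshold check that could be used to trigger the DBI, can be sketched as follows. This is only an illustration under assumptions: the names `mean_abs_gradient` and `below_threshold`, and the default threshold of `1e-3`, are hypothetical and are not taken from the code in this PR.

```python
import numpy as np


def mean_abs_gradient(grads):
    """Return g = (1 / N_params) * sum_i |dL/dtheta_i| for one optimization step."""
    grads = np.asarray(grads, dtype=float)
    return float(np.mean(np.abs(grads)))


def below_threshold(grad_history, threshold=1e-3):
    """Hypothetical trigger: True if g at the latest iteration is below `threshold`."""
    return mean_abs_gradient(grad_history[-1]) < threshold


if __name__ == "__main__":
    # Toy gradient history: one row of partial derivatives per iteration.
    history = [
        [0.20, -0.10, 0.05],
        [0.02, -0.01, 0.004],
        [2e-4, -5e-4, 1e-4],
    ]
    for step, grads in enumerate(history):
        print(step, mean_abs_gradient(grads))
    print("trigger DBI:", below_threshold(history))
```

Averaging absolute values rather than signed gradients avoids cancellations between components, which would otherwise make a large but mixed-sign gradient look like a plateau.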