-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented Getting Zenodo citation #10
Conversation
Codecov Report
@@ Coverage Diff @@
## master #10 +/- ##
==========================================
+ Coverage 90.00% 91.59% +1.59%
==========================================
Files 3 3
Lines 90 119 +29
==========================================
+ Hits 81 109 +28
- Misses 9 10 +1
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution!
I left some comments and suggestions on the specific parts of the code. It looks good, but I think it would be better to refactor the code a bit to return a merged citation dictionary from get_citation
and maybe have a function get_badge_citations
that would call get_citation_zenodo
. This way it will be easier to extend for JOSS (and other types) of badges. I also think that we should default to getting all the citation data.
I'm wondering if we could just setup a regex for DOIs in badges.
Note that some of your tests were already failing, and they are not related to the new functions.
Can you provide some information about the failing tests? The CI doesn't show failing package tests and I also checked locally and I didn't encounter any problems.
The errors in CI like
Error: Can't open display: (null)
are due to the clipboard integration tests because I haven't managed to install xclip
properly (#11).
src/citations.jl
Outdated
@@ -5,32 +5,43 @@ function citation_path(pkg) | |||
end | |||
end | |||
|
|||
function get_citation(pkg) | |||
|
|||
# Added `zenodo` flag to avoid breaking current tests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to default to including the citations from badges and change / fix the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but I thought this should be done in another PR, cause it would be too much change.
src/citations.jl
Outdated
elseif !isnothing(urlzenodo) | ||
get_citation_zenodo(urlzenodo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to merge
the citations from all sources. It would also be nice to have a duplication check in case package authors have the citation info also in CITATION.bib and use that instead (I'm not sure if it's worth it to make this configurable. Preferring the CITATION.bib over the badge seems reasonable to me.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way it's now, it just get's the zenodo citation if no CITATION.bib is not present. Isn't this your suggestion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that there might be packages that have both a badge and a CITATION.bib file, but if we prefer the CITATION.bib anyway, then it would be simpler to go with the current implementation. I was thinking that there might be some edge cases when we need to merge badge citations with CITATION.bib citations, but it would be better to just document that we will prefer the .bib file over badges.
src/citations.jl
Outdated
readme_path = joinpath(pkg.source, "README.md") | ||
if isfile(readme_path) | ||
readme = open(f->read(f, String), readme_path); | ||
index_init = findlast("https://zenodo.org/badge", readme) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if there are packages with more than one zenodo badge. We can leave it like this and update in the future if it's necessary.
src/PkgCite.jl
Outdated
@@ -2,13 +2,15 @@ module PkgCite | |||
|
|||
using Pkg: include | |||
using DataStructures: getkey, values | |||
using Base: String | |||
using Base: String, indent_width |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is indent_width
used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to remove it. It's not used.
src/citations.jl
Outdated
function get_citation(pkg) | ||
|
||
# Added `zenodo` flag to avoid breaking current tests | ||
function get_citation(pkg; zenodo=false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that it might be better to have a more general flag like badge
(or include_badges
or something like that).
What exactly do you mean by using "regex for DOI"? Do you mean, search for the "DOI" term, and then extract the badge link? I guess this could be done, although I'm not very good with regex. Also, can you give an example of what do you mean by the " but I think it would be better to refactor the code a bit to return a merged citation dictionary from |
Co-authored-by: Sebastian Micluța-Câmpeanu <[email protected]>
Co-authored-by: Sebastian Micluța-Câmpeanu <[email protected]>
Co-authored-by: Sebastian Micluța-Câmpeanu <[email protected]>
Co-authored-by: Sebastian Micluța-Câmpeanu <[email protected]>
By the way, here are the test failures I get when I just run the forked repo without my changes: [0c5d862f] Symbolics v0.1.27
┌ Warning: There was an error reading the CITATION.bib file for Distributions
│ exception = KeyError: key "july" not found
└ @ PkgCite ~/MEGA/EMAp/PkgCite.jl/src/citations.jl:15
┌ Warning: There was an error reading the CITATION.bib file for Distributions
│ exception = KeyError: key "july" not found
└ @ PkgCite ~/MEGA/EMAp/PkgCite.jl/src/citations.jl:15
┌ Warning: Overwriting julia_citations.bib
└ @ PkgCite ~/MEGA/EMAp/PkgCite.jl/src/citations.jl:53
[ Info: A julia_citations.bib file with the citations for LabelledArrays, QuadGK, Symbolics, RecursiveArrayTools, ArrayInterface and AbstractAlgebra was generated in the current working directory (/home/davibarreira/MEGA/EMAp/PkgCite.jl/test).
┌ Warning: There was an error reading the CITATION.bib file for Distributions
│ exception = KeyError: key "july" not found
└ @ PkgCite ~/MEGA/EMAp/PkgCite.jl/src/citations.jl:15
┌ Warning: There was an error reading the CITATION.bib file for Distributions
│ exception = KeyError: key "july" not found
└ @ PkgCite ~/MEGA/EMAp/PkgCite.jl/src/citations.jl:15
[ Info: The above sentence was copied to your clipboard.
┌ Warning: Overwriting julia_citations.bib
└ @ PkgCite ~/MEGA/EMAp/PkgCite.jl/src/citations.jl:53
[ Info: A julia_citations.bib file with the citations for LabelledArrays, QuadGK, Symbolics, RecursiveArrayTools, Julia, ArrayInterface and AbstractAlgebra was generated in the current working directory (/home/davibarreira/MEGA/EMAp/PkgCite.jl/test).
Clipboard: Error During Test at /home/davibarreira/MEGA/EMAp/PkgCite.jl/test/runtests.jl:81
Unexpected Pass
Expression: clipboard() == CITE_STR_JL
Got correct result, please change to @test if no longer broken.
┌ Warning: There was an error reading the CITATION.bib file for Distributions
│ exception = KeyError: key "july" not found
└ @ PkgCite ~/MEGA/EMAp/PkgCite.jl/src/citations.jl:15
[ Info: The above sentence was copied to your clipboard.
┌ Warning: Overwriting julia_citations.bib
└ @ PkgCite ~/MEGA/EMAp/PkgCite.jl/src/citations.jl:53
[ Info: A julia_citations.bib file with the citations for LabelledArrays, QuadGK, Symbolics, RecursiveArrayTools, Julia, ArrayInterface and AbstractAlgebra was generated in the current working directory (/home/davibarreira/MEGA/EMAp/PkgCite.jl/test).
Clipboard: Error During Test at /home/davibarreira/MEGA/EMAp/PkgCite.jl/test/runtests.jl:89
Unexpected Pass
Expression: clipboard() == CITE_STR
Got correct result, please change to @test if no longer broken.
Test Summary: | Pass Error Broken Total
PkgCite.jl | 24 2 1 27
Empty env | 1 1
Packages | 6 1 7
Citations | 4 4
Citations in .bib | 5 5
Only direct dependencies | 3 3
PkgCite sentence | 5 2 7
Clipboard | 2 2 4
ERROR: LoadError: Some tests did not pass: 24 passed, 0 failed, 2 errored, 1 broken.
in expression starting at /home/davibarreira/MEGA/EMAp/PkgCite.jl/test/runtests.jl:10
ERROR: Package PkgCite errored during testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, can you give an example of what do you mean by the " but I think it would be better to refactor the code a bit to return a merged citation dictionary from get_citation and maybe have a function get_badge_citations that would call get_citation_zenodo"? I don't fully understand what exactly would be this merged citation dictionary.
I was thinking if there would be an edge case where we need to merge citations form the badge with the ones in the CITATION.bib file, but as I've said it the response comment, it would be better to just document that we prefer the CITATION.bib file over the badges and leave the implementation as is.
What exactly do you mean by using "regex for DOI"? Do you mean, search for the "DOI" term, and then extract the badge link? I guess this could be done, although I'm not very good with regex.
I was referring to parsing the README for badges and then look for DOI-like strings (things like "10.5281/ZENODO.4692172") insde them since the badge links contain the DOI information, but I see now that this will not work for Zenodo since there can be links with latestdoi
that need to be followed for the DOI. I thought that the DOI info is present in the link like in the badge for Plots or DynamicalSystems.
Looking for the "DOI" string is indeed a better solution since it seems to be used in all the badges I've encountered.
src/citations.jl
Outdated
elseif !isnothing(urlzenodo) | ||
get_citation_zenodo(urlzenodo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking that there might be packages that have both a badge and a CITATION.bib file, but if we prefer the CITATION.bib anyway, then it would be simpler to go with the current implementation. I was thinking that there might be some edge cases when we need to merge badge citations with CITATION.bib citations, but it would be better to just document that we will prefer the .bib file over badges.
Yeah, there are some cases where the DOI is in the badge, but for Zenodo, at least, this is not the correct way to setup. The reason is that for every new package version a new DOI is generated. So using "latestdoi" is the correct setup, hence, why I wrote that "warning". Perhaps, one can check if the badge has a DOI, and use that instead. |
Regarding the failing tests, I marked the |
It looks like looking for a badge that starts with "DOI" will cover all cases (and also work for JOSS badges). We could issue a warning in the case of Zenodo if we detect a badge that doesn't use |
I'll do the following. I'll create a function that searchers for a badge with DOI, and get's the link. Now, if the link if from zenodo, then uses my code to get the bibtex. If it's a doi link, then I'll create another function to generate the bibtex from it. |
Made all the suggested changes. The function |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much for your contribution!
Not a problem! Thanks for the pkg by the way :D |
I've implemented two new functions in
citations.jl
:get_zenodo_badge
- This function reads theREADME.md
file and extracts the link from the zenodo's badge;get_citation_zenodo
- This functions takes the badge link and obtains the bibtex.I've modified the
get_citations
function to search for the badge in case noCitation.bib
file is found. I've also added an argumentzenodo=false
toget_citation
,get_tool_citations
andcollect_citations
. The default isfalse
so that your previous tests do not break.I've also implemented a testset for the zenodo functions. Note that some of your tests were already failing, and they are not related to the new functions.
I hope this PR is helpful.