A Python script to check which (Google Scholar) papers and citations are missing from Scopus.
If you have ever had to cross-check which papers and citations Scopus forgot to index, you know it is not fun. This script may provide a little help with that.
Please note that this is a rather unreliable tool and should only be used as a rough first screening. I take no responsibility for any missed citations or documents. See LICENSE for more information.
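To give a feel for what the script does, here is a rough, heavily simplified sketch of this kind of cross-check, built on scholarly (which the script relies on) and Python's csv module. It is illustrative only and is not the actual implementation; the file name `documents.csv` and the `Title` column name are assumptions about the Scopus export described further down.

```python
# Illustrative sketch only -- NOT the actual implementation of scopus-checker.py.
# Assumes a Scopus CSV export with a 'Title' column (see the setup steps below).
import csv

from scholarly import scholarly


def scopus_titles(csv_path):
    """Collect the lowercased titles found in a Scopus CSV export."""
    with open(csv_path, newline='', encoding='utf-8-sig') as f:
        return {row['Title'].strip().lower()
                for row in csv.DictReader(f) if row.get('Title')}


def missing_from_scopus(author_name, scopus_csv):
    """Return Google Scholar titles with no exact match in the Scopus export."""
    # Without a proxy, Google Scholar is likely to block these requests
    # fairly quickly (see the Usage notes below).
    author = next(scholarly.search_author(author_name))          # first profile hit
    author = scholarly.fill(author, sections=['publications'])   # fetch publication list
    indexed = scopus_titles(scopus_csv)
    return [pub['bib']['title']
            for pub in author['publications']
            if pub['bib']['title'].strip().lower() not in indexed]


if __name__ == '__main__':
    for title in missing_from_scopus('Enzo De Sena', 'documents.csv'):
        print('Possibly missing from Scopus:', title)
```

The real script also handles citations and fuzzier matching; this sketch only shows the general idea of comparing a Scholar profile against a Scopus export.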
To get a local copy up and running, follow these steps.
You are going to need:
- pip: https://pip.pypa.io/en/stable/installation/
- virtualenv: https://virtualenv.pypa.io/en/latest/installation.html
- a Scopus subscription (your university will probably provide that)
- Clone the repo and go into the directory
  ```sh
  git clone [email protected]:enzodesena/scopus-checker.git
  cd scopus-checker
  ```
- Create a Python environment and activate it
  ```sh
  virtualenv -p /usr/bin/python3 env
  source env/bin/activate
  ```
- Install pip packages
  ```sh
  pip install -r requirements.txt
  ```
- (Optional, but not so optional) Get a free API key at scraperapi.com and copy it somewhere safe; you will pass it to the script later with the `-k` option. A quick way to check that the key works is sketched below.
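If you want to make sure the key works before going any further, a quick check along these lines should do. It assumes the `requests` package is available (`pip install requests` if it is not); ScraperAPI simply fetches a page on your behalf, so a 200 response means the key is accepted.

```python
# Quick sanity check for a ScraperAPI key: ask ScraperAPI to fetch a page
# on your behalf and look at the response status.
import requests

SCRAPER_API_KEY = 'paste-your-key-here'  # placeholder -- use your own key

response = requests.get(
    'http://api.scraperapi.com',
    params={'api_key': SCRAPER_API_KEY, 'url': 'https://httpbin.org/ip'},
    timeout=60,
)
print(response.status_code, response.text[:200])
```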
Before you can use the tool, you need to download some information from Scopus:
- Go to scopus.com and look up your own profile. The free search is not enough for this.
- Go to `Documents` -> `Export all`, select `CSV`, toggle all options under `Citation information` and nothing else, click on `Export`, and move the file to your repository directory; we will call this the 'document file'.
- Go to `Cited by XXX Documents` -> `Export all`, select `CSV`, toggle all options under `Citation information` as well as `Include references`, click on `Export`, and move the file to your repository directory; we will call this the 'citations file'.
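If you want to double-check that the two exports look sane before running the script, a small sketch like the following can help. The file names are placeholders, and the column names shown ('Title', 'Year', 'Cited by') are the usual Scopus 'Citation information' headers, so verify them against your own files.

```python
# Optional sanity check of the two Scopus exports before running the script.
import csv


def peek(csv_path, label):
    """Print row count and column headers of a Scopus CSV export."""
    # Scopus exports are typically UTF-8 with a BOM, hence 'utf-8-sig'.
    with open(csv_path, newline='', encoding='utf-8-sig') as f:
        reader = csv.DictReader(f)
        rows = list(reader)
    print(f'{label}: {len(rows)} rows, columns: {reader.fieldnames}')


peek('document file.csv', 'Documents')    # placeholder file names -- use your own
peek('citations file.csv', 'Citations')
```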
Now you are ready to run the script. Go into your repository directory and run the Python script (if you haven't done so already, activate the virtual environment with `source env/bin/activate`):
```sh
cd scopus-checker
python scopus-checker.py -d <document file> -c <citations file> -a '<your name and surname>' -p scraperapi -k <your own scraper api key>
```
In this final step, notice how we used the `scraperapi` proxy option. Given how strict Google Scholar has become over the years, this is pretty much the only option that will make this script work. You can also run the script without proxies, but it is unlikely to work for very long (see the scholarly documentation on proxies for more information).
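For reference, this is roughly how scholarly's proxy support is switched on. The script already does this for you through the `-p`/`-k` options, so the snippet is only useful if you want to experiment with scholarly directly; the key string is a placeholder.

```python
# How scholarly's proxy support is typically enabled (for experimentation only;
# scopus-checker.py wires this up for you via -p scraperapi -k <key>).
from scholarly import ProxyGenerator, scholarly

pg = ProxyGenerator()
# Use ScraperAPI; pg.FreeProxies() is a free (but flaky) alternative.
success = pg.ScraperAPI('paste-your-scraperapi-key-here')
if success:
    scholarly.use_proxy(pg)
    author = next(scholarly.search_author('Enzo De Sena'))
    print(author['name'], '-', author.get('affiliation', ''))
else:
    print('Proxy setup failed; check your ScraperAPI key.')
```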
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
Enzo De Sena: desena.org @enzoresearch
This script uses scholarly.