docx+citations from zotero: different ids used for in-text and CSL-YAML #10366

Closed
iandol opened this issue Nov 8, 2024 · 8 comments
@iandol
Contributor

iandol commented Nov 8, 2024

A collaborator wrote a Word docx using Zotero. Toggling the field codes and copy-pasting one of the embedded references gives, for example:

{
	"id": "Ml6QQcFl/5Udmir6l",
	"uris": [
		"http://zotero.org/users/9456003/items/M6XJVYGG"
	],
	"itemData": {
		"id": 2102,
		"type": "article-journal",
		"abstract": "...",
		"container-title": "NeuroImage",
		"DOI": "10.1016/j.neuroimage.2012.01.078",
		"ISSN": "1095-9572",
		"issue": "2",
		"journalAbbreviation": "Neuroimage",
		"language": "eng",
		"note": "PMID: 22285220",
		"page": "1307-1315",
		"source": "PubMed",
		"title": "Abnormal cortical processing of pattern motion in amblyopia: evidence from fMRI",
		"title-short": "Abnormal cortical processing of pattern motion in amblyopia",
		"volume": "60",
		"author": [
			{
				"family": "Thompson",
				"given": "B."
			},
			{
				"family": "Villeneuve",
				"given": "M. Y."
			},
			{
				"family": "Casanova",
				"given": "C."
			},
			{
				"family": "Hess",
				"given": "R. F."
			}
		],
		"issued": {
			"date-parts": [
				[
					"2012",
					4,
					2
				]
			]
		}
	}
}

The important part is that there are two ids: a top-level "id": "Ml6QQcFl/5Udmir6l" and a nested "itemData": { "id": 2102. Unfortunately, pandoc uses the first one for the in-text citation:

However, with further research, fMRI studies have revealed that amblyopic patients exhibit not only functional abnormalities in V1 but also in other regions, such as V2, V3, V4, V5, and higher-order
areas like MT+ [@Ml6QQcFl/5Udmir6l].

but the YAML uses the other id:

- abstract: ...
  author:
  - family: Thompson
    given: B.
  - family: Villeneuve
    given: M. Y.
  - family: Casanova
    given: C.
  - family: Hess
    given: R. F.
  container-title: NeuroImage
  container-title-short: Neuroimage
  DOI: 10.1016/j.neuroimage.2012.01.078
  id: 2102
  ISSN: 1095-9572
  issue: 2
  issued: 2012-04-02
  language: eng
  page: 1307-1315
  PMID: 22285220
  source: PubMed
  title: "Abnormal cortical processing of pattern motion in amblyopia:
    evidence from fMRI"
  title-short: Abnormal cortical processing of pattern motion in
    amblyopia
  type: article-journal
  volume: 60

As this docx comes from a collaborator, I don't have his database, and I don't know why the Zotero data looks like this (most references do). But this is in the wild, and I'd hope that consistent id selection by pandoc would be easy to implement?
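To check how widespread the mismatch is in a given document, a small script can compare the two ids of each citation item. This is only a sketch: the function name `find_id_mismatches` is made up for illustration, and it assumes the field codes have already been extracted from the docx and parsed into dicts shaped like the example above.

```python
def find_id_mismatches(citation_items):
    """Return (outer_id, item_data_id) pairs where the two ids differ.

    `citation_items` is assumed to be a list of dicts shaped like the
    Zotero field-code example: a top-level "id" plus a nested "itemData".
    """
    mismatches = []
    for item in citation_items:
        outer_id = item.get("id")
        inner_id = item.get("itemData", {}).get("id")
        # Compare as strings: the outer id is a string, the inner one may be a number.
        if inner_id is not None and str(outer_id) != str(inner_id):
            mismatches.append((outer_id, inner_id))
    return mismatches


# The item from the example, trimmed to the fields that matter here.
example_item = {
    "id": "Ml6QQcFl/5Udmir6l",
    "uris": ["http://zotero.org/users/9456003/items/M6XJVYGG"],
    "itemData": {"id": 2102, "type": "article-journal"},
}

print(find_id_mismatches([example_item]))
# → [('Ml6QQcFl/5Udmir6l', 2102)]
```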

@iandol iandol added the bug label Nov 8, 2024
@iandol
Contributor Author

iandol commented Nov 8, 2024

Test.docx

Minimal test docx

pandoc -s --extract-media=./ -f docx+citations Test.docx -o Test.md
pandoc --version
pandoc 3.5
Features: +server +lua
Scripting engine: Lua 5.4
User data directory: /Users/ian/.local/share/pandoc
Copyright (C) 2006-2024 John MacFarlane. Web: https://pandoc.org
This is free software; see the source for copying conditions. There is no
warranty, not even for merchantability or fitness for a particular purpose.

@jgm
Owner

jgm commented Nov 8, 2024

Here is the citation embedded in Test-2.docx. You can see that the id of the first citationItem is indeed Ml6QQcFl/5Udmir6l. There is then an itemData that embeds the bibliographic information, and it uses a different id, "2102". I'm not sure how it's supposed to work in this case (if it's not just a mistake), but perhaps we're meant to use the id Ml6QQcFl/5Udmir6l in both places? @bdarcus do you know?

{
  "citationID": "NLAuDP0i",
  "properties": {
    "formattedCitation": "(Thompson et al., 2012)",
    "plainCitation": "(Thompson et al., 2012)",
    "noteIndex": 0
  },
  "citationItems": [
    {
      "id": "Ml6QQcFl/5Udmir6l",
      "uris": [
        "http://zotero.org/users/9456003/items/M6XJVYGG"
      ],
      "itemData": {
        "id": 2102,
        "type": "article-journal",
        "abstract": "Converging evidence from human psychophysics and animal neurophysiology indicates that amblyopia is associated with abnormal function of area MT, a motion sensitive region of the extrastriate visual cortex. In this context, the recent finding that amblyopic eyes mediate normal perception of dynamic plaid stimuli was surprising, as neural processing and perception of plaids has been closely linked to MT function. One intriguing potential explanation for this discrepancy is that the amblyopic eye recruits alternative visual brain areas to support plaid perception. This is the hypothesis that we tested. We used functional magnetic resonance imaging (fMRI) to measure the response of the amblyopic visual cortex and thalamus to incoherent and coherent motion of plaid stimuli that were perceived normally by the amblyopic eye. We found a different pattern of responses within the visual cortex when plaids were viewed by amblyopic as opposed to non-amblyopic eyes. The non-amblyopic eyes of amblyopes and control eyes differentially activated the hMT+ complex when viewing incoherent vs. coherent plaid motion, consistent with the notion that this region is centrally involved in plaid perception. However, for amblyopic eye viewing, hMT+ activation did not vary reliably with motion type. In a sub-set of our participants with amblyopia we were able to localize MT and MST within the larger hMT+ complex and found a lack of plaid motion selectivity in both sub-regions. The response of the pulvinar and ventral V3 to plaid stimuli also differed under amblyopic vs. non-amblyopic eye viewing conditions, however the response of these areas did vary according to motion type. These results indicate that while the perception of the plaid stimuli was constant for both amblyopic and non-amblyopic viewing, the network of neural areas that supported this perception was different.",
        "container-title": "NeuroImage",
        "DOI": "10.1016/j.neuroimage.2012.01.078",
        "ISSN": "1095-9572",
        "issue": "2",
        "journalAbbreviation": "Neuroimage",
        "language": "eng",
        "note": "PMID: 22285220",
        "page": "1307-1315",
        "source": "PubMed",
        "title": "Abnormal cortical processing of pattern motion in amblyopia: evidence from fMRI",
        "title-short": "Abnormal cortical processing of pattern motion in amblyopia",
        "volume": "60",
        "author": [
          {
            "family": "Thompson",
            "given": "B."
          },
          {
            "family": "Villeneuve",
            "given": "M. Y."
          },
          {
            "family": "Casanova",
            "given": "C."
          },
          {
            "family": "Hess",
            "given": "R. F."
          }
        ],
        "issued": {
          "date-parts": [
            [
              "2012",
              4,
              2
            ]
          ]
        }
      }
    }
  ],
  "schema": "https://github.com/citation-style-language/schema/raw/master/csl-citation.json"
}

@jgm jgm closed this as completed in b088a55 Nov 8, 2024
@jgm
Owner

jgm commented Nov 8, 2024

I just pushed a fix that will use the citationItem id in the bibliography, even if the itemData contains a different reference id. If that's wrong, we can change it.
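For readers who want to see the described behavior concretely, here is a rough Python illustration of the rule "the citationItem id wins over the itemData id". It is not pandoc's actual code (pandoc is written in Haskell), and the function name `references_from_citation` is hypothetical.

```python
def references_from_citation(csl_citation):
    """Build bibliography entries from a csl-citation object.

    Each entry is a copy of the embedded itemData whose "id" is replaced
    by the id of the enclosing citationItem, so that the in-text key and
    the bibliography key agree.
    """
    references = []
    for item in csl_citation.get("citationItems", []):
        item_data = item.get("itemData")
        if item_data is None:
            continue  # no embedded bibliographic data for this item
        reference = dict(item_data)  # shallow copy; the input stays untouched
        if "id" in item:
            reference["id"] = item["id"]
        references.append(reference)
    return references


citation = {
    "citationID": "NLAuDP0i",
    "citationItems": [
        {
            "id": "Ml6QQcFl/5Udmir6l",
            "itemData": {"id": 2102, "type": "article-journal", "volume": "60"},
        }
    ],
}

for ref in references_from_citation(citation):
    print(ref["id"], ref["type"])
```

The copy step means the parsed field code still holds the original `2102`, which is handy when comparing before and after.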

@iandol
Contributor Author

iandol commented Nov 9, 2024

Thanks @jgm -- what happens when there is a citation-key, for example, this ref:

{
	"id": "uh2vLrAB/XwGHp8PL",
	"uris": [
		"http://zotero.org/users/1940082/items/IYHGI6A3"
	],
	"itemData": {
		"id": 15691,
		"type": "book",
		"note": "Citation Key: dowling2017\npage: 136",
		"publisher": "International Retinal Research Foundation",
		"title": "Amblyopia: Challenges and opportunities",
		"volume": "The Lasker/IRRF Initiative for Innovation in Vision Science",
		"author": [
			{
				"family": "Dowling",
				"given": "John E."
			}
		],
		"editor": [
			{
				"family": "Dowling",
				"given": "John E."
			}
		],
		"issued": {
			"date-parts": [
				[
					"2017"
				]
			]
		},
		"citation-key": "dowling2017"
	}
}

...has a main id, an itemData id, and an itemData citation-key. The citation-key comes from BetterBibTeX and will be used by Zotero to output BibTeX, so I wonder whether the order should be: itemData citation-key > itemData id > id? I normally use the Bookends reference manager, so my knowledge of Zotero is very limited...
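The proposed precedence can be written down as a small fallback chain. This is a sketch of the order suggested in this comment only, not of what pandoc does; `choose_id` and `PROPOSED_ORDER` are names invented for the example.

```python
# Precedence suggested above: citation-key, then itemData id, then the outer id.
PROPOSED_ORDER = ("citation-key", "itemData.id", "id")


def choose_id(item, order=PROPOSED_ORDER):
    """Return the first non-empty candidate id for a citation item, as a string."""
    item_data = item.get("itemData", {})
    candidates = {
        "citation-key": item_data.get("citation-key"),
        "itemData.id": item_data.get("id"),
        "id": item.get("id"),
    }
    for name in order:
        value = candidates.get(name)
        if value is not None and value != "":
            return str(value)
    return None


dowling = {
    "id": "uh2vLrAB/XwGHp8PL",
    "itemData": {"id": 15691, "type": "book", "citation-key": "dowling2017"},
}

print(choose_id(dowling))                # → dowling2017
print(choose_id(dowling, order=("id",)))  # → uh2vLrAB/XwGHp8PL
```

Passing a different `order` makes it easy to compare the alternatives on a real document before settling on one.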

@jgm
Owner

jgm commented Nov 9, 2024

We use id. If you wanted the other behavior you could use a filter to overwrite id with citation-key (which isn't even an official CSL JSON field, I believe).
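The core of such a replacement is simple. The sketch below is not a pandoc filter (those are usually written in Lua and operate on the document AST); it only shows the overwrite step on a plain list of CSL JSON reference dicts, and the name `rekey_references` is hypothetical. It also returns the old-to-new mapping, since in-text citations would need the same renaming.

```python
def rekey_references(references):
    """Overwrite each reference's "id" with its "citation-key", when present.

    Returns (new_references, id_map), where id_map maps each replaced id
    (as a string) to the citation-key that took its place.
    """
    rekeyed = []
    id_map = {}
    for reference in references:
        new_reference = dict(reference)  # leave the caller's data untouched
        key = reference.get("citation-key")
        old_id = reference.get("id")
        if key:
            new_reference["id"] = key
            if old_id is not None:
                id_map[str(old_id)] = key
        rekeyed.append(new_reference)
    return rekeyed, id_map


references = [
    {"id": "uh2vLrAB/XwGHp8PL", "type": "book", "citation-key": "dowling2017"},
    {"id": "Ml6QQcFl/5Udmir6l", "type": "article-journal"},  # no citation-key
]

new_references, id_map = rekey_references(references)
print([ref["id"] for ref in new_references])
print(id_map)
```

References without a citation-key keep their original id, so a mixed bibliography stays usable.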

@iandol
Contributor Author

iandol commented Nov 10, 2024

Thank you as always!

@iandol
Contributor Author

iandol commented Nov 10, 2024

Just FYI, I just checked the schema for csl-data (which is what itemData is, IIUC) and there is a citation-key field:

https://github.com/citation-style-language/schema/blob/master/schemas/input/csl-data.json#L62

  "citation-key": {
        "type": "string"
      },

So this would be where the BibTeX key, if present, should be stored. Let me see if I can make a filter to make this replacement, as a workflow where the BibTeX key is used as an id is more flexible overall...
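One piece of that workflow is renaming the in-text markers in the converted Markdown. The following is a minimal sketch under stated assumptions: it works on the Markdown text with a regular expression rather than on pandoc's AST, it treats a following word character or `/` as "the key continues" (a simplification of pandoc's citation-key syntax), and `rewrite_citations` is a made-up name. The mapping uses the ids from the Dowling example above.

```python
import re


def rewrite_citations(markdown, id_map):
    """Replace @old-id citation markers with @new-id according to id_map."""
    if not id_map:
        return markdown
    # Try longer ids first so an id that is a prefix of another cannot match early.
    alternatives = "|".join(
        re.escape(old_id) for old_id in sorted(id_map, key=len, reverse=True)
    )
    # The lookahead refuses a match when the key obviously continues.
    pattern = re.compile(r"@(" + alternatives + r")(?![\w/])")
    return pattern.sub(lambda match: "@" + id_map[match.group(1)], markdown)


id_map = {"uh2vLrAB/XwGHp8PL": "dowling2017"}
text = "See [@uh2vLrAB/XwGHp8PL, p. 136]."

print(rewrite_citations(text, id_map))
# → See [@dowling2017, p. 136].
```

A text-level rewrite like this can also touch things that merely look like citations (for example inside code spans), so an AST-based filter would be the more robust choice for real documents.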

@jgm
Owner

jgm commented Nov 11, 2024

It's not documented for the released version; perhaps it was added later.
