CSL? #32
Comments
Perhaps TOML (or a similarly simple language, like JSON, as mentioned above) would also be a nice alternative to YAML for this use case, since YAML is known to be rather counter-intuitive in too many cases (see https://noyaml.com/ for example). |
Using a well-defined format like CSL JSON, you'd easily be able to create a data structure that can be serialized/deserialized from any input format. This is possible already with the I agree with you @bdarcus, @reknih, you should definitely hit up @cormacrelf and try to leverage his work wherever possible. The Rust ecosystem would definitely benefit from a proper citation processing library---I know that at least me and likely the Tectonic guys are interested! |
I dunno; I think TOML would be a poor format for this purpose, and YAML is actually pretty good. Among other things, you can validate its files using JSON schemas. |
citeproc-rs is possibly dead; readme calls it WIP, maintainer seems to have been active in the last couple of months but no commits to master since 2021. |
Not very active, no, but it's still maintained, as evidenced by the PR opened 2 months ago that fixes a bug introduced by Rust 1.67. Also, since it is part of the Zotero org, one would think it has some official weight. That being said, Hayagriva is way more elegant IMO, and it actually exists on crates.io. Adapting it to support CSL JSON shouldn't be that difficult (I am considering opening a PR for it). CSL styles are another beast, though, but one that this library should definitely aim to support! |
edited for clarity
A group of other CSL developers and contributors and I talked about status and strategy in general with the Zotero folks last summer (summary and further discussion here), and what I gather from that and from other discussions is:
I don't understand the last point (cc @dstillman), in part because I don't do Rust, and so can't really assess the codebase. Cormac did mention during that meeting that he has maybe been held back a bit by perfectionism, but I'm not sure if it's that, or some technical issue(s) they've run into. My impression is they're also skeptical they'll get much in the way of quality PRs (the code isn't suitable, for example, for many amateur programmers), and that there's not likely a market for this among other developers. I'm more optimistic about the prospects for a community-developed Rust-based open source CSL processor :-) It would help for the Zotero folks to communicate more clearly about this:
Absent answers, or of course if they simply say "sorry, this was an experiment, and it won't work for us", maybe some dedicated Rust developer(s) should just fork it?
In what way(s)?
Right.
I will say in general:
Final, much more speculative, point: I created CSL around 2005, while writing my first book. I think it reflects sound decisions based on the technology landscape at that time: the decision to use XML and RELAX NG, to insist on output format independence and being language-agnostic, to make it suitable for hand-editing in schema-aware XML editors, and also subtle things like designing it in such a way that one could switch among radically different citation styles without editing document source. Now, close to 20 years later, I am big on the idea of using things like machine learning to simply create language-independent styles from formatted output examples, so users don't have to edit styles at all. I could imagine that, if that could be perfected, it would open the door to different sorts of output options: CSL XML initially, certainly, but also maybe formats better optimized for machine processing. Alas, I have neither the time nor the skill to explore that idea! |
@bdarcus From what I can glean: cleaner API, smaller and easier to understand codebase, all-in-all looks more elegant. This makes sense, as Hayagriva has a narrower audience of library consumers (essentially just themselves) and is newer. |
News on Basically they're stalled, with labor and technical hurdles, and need help to get the code in shape and released. Another third-party developer is going to spend some time trying to figure if and how to do that. |
Hi folks! I have already considered adding CSL, it's definitely on the roadmap! It would be nice if I did not have to reimplement a Rust parser for CSL, is |
IDK; "csl" is one of only two crates he actually released. Parsing is easy; it's just XML after all. It's the processing that's difficult. |
Dan posted another more detailed follow-up on the technical status. There's also the excellent Haskell based version I mentioned, which can effectively act like a JSON server. https://github.com/jgm/citeproc/blob/master/man/citeproc.1.md |
FWIW, I've been working on an experimental evolution of CSL in a typescript model; a commented YAML file of the current state.
Late-May update: I've made quite a bit of progress on this, and realized in the process that the TypeScript Style model can be auto-converted to Rust code to serialize and deserialize a style. Here's a little demo repo: https://github.com/bdarcus/csln-rs EDIT: in looking at your YAML format now, I'm seeing you're defining authors as a list of people? And assuming string parsing on those to get the components? If yes, that seems to leave out org authors. |
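To make the org-author concern concrete, here is a minimal sketch. Every name in it (`Contributor`, `parse_naive`, `display_name`) is hypothetical and is not taken from Hayagriva or csln. It shows how a purely string-based name parser can mangle an organization, and how an explicit variant avoids the problem.

```rust
/// Hypothetical contributor model: a person with name parts,
/// or an organization whose name must be kept verbatim.
#[derive(Debug, Clone, PartialEq)]
pub enum Contributor {
    Person { family: String, given: String },
    Organization(String),
}

impl Contributor {
    /// "Family, Given" for people; the literal name for organizations.
    pub fn display_name(&self) -> String {
        match self {
            Contributor::Person { family, given } => format!("{}, {}", family, given),
            Contributor::Organization(name) => name.clone(),
        }
    }
}

/// Deliberately naive parser (an assumption for illustration only):
/// "Family, Given" if there is a comma, otherwise the last word is the family name.
pub fn parse_naive(s: &str) -> Contributor {
    let s = s.trim();
    if let Some((family, given)) = s.split_once(',') {
        return Contributor::Person {
            family: family.trim().to_string(),
            given: given.trim().to_string(),
        };
    }
    match s.rsplit_once(' ') {
        Some((given, family)) => Contributor::Person {
            family: family.to_string(),
            given: given.to_string(),
        },
        // A single word cannot be split, so treat it as a literal name.
        None => Contributor::Organization(s.to_string()),
    }
}
```

With this sketch, `parse_naive("World Health Organization")` yields a person whose family name is "Organization", whereas constructing `Contributor::Organization` directly keeps the name intact. That is the kind of case a people-only list with string parsing would have trouble expressing.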
@reknih when you and/or your other developers have a bit of free time, can you take a look at this? https://github.com/bdarcus/csln It's a reimplementation of the csl-next draft model in pure Rust, with very tight coupling (thanks to serde) between the JSON schema input and internal model. I'm pretty confident in that model, though it would need more review, testing, and iteration for me to be fully happy with it. I'm much less confident in my programming skills, given that I'm a complete Rust newbie. But I'm absolutely serious about building this out. I just need some help. It should compile fine using cargo, and I have it licensed under the Mozilla Public License 2.0, which I think should be compatible with your Apache option; probably not MIT. But my view on licenses is that of a practical open source advocate. I chose the license simply because it's the same as citeproc-rs. It's not quite on par with the TypeScript processor; here's an example of where I'm at:
❯ target/debug/csln processor/examples/style.csl.yaml processor/examples/ex1.bib.yaml
Example result:
{
"smith1": {
"disamb-condition": false,
"group-index": 1,
"group-length": 1,
"group-key": "Smith, Sam:2023-10"
},
So the core of the processor at this point is a sorted bibliography vector, and this HashMap. The next step is a function to iterate through the former and template, and use the latter to generate the pre-rendered AST. |
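The two structures described above (a sorted bibliography vector plus a HashMap of per-reference hints) can be sketched with the standard library alone. This is not the csln implementation: the `Reference` and `ProcHints` types, the author-plus-date grouping key, and the rule that `disamb_condition` is true only when a group has more than one member are all assumptions inferred from the sample output.

```rust
use std::collections::HashMap;

/// Hypothetical, heavily simplified reference record.
#[derive(Debug, Clone)]
pub struct Reference {
    pub id: String,
    pub author: String, // e.g. "Smith, Sam"
    pub issued: String, // e.g. "2023-10"
}

/// Mirrors the fields in the sample output above.
#[derive(Debug, Clone, PartialEq)]
pub struct ProcHints {
    pub disamb_condition: bool,
    pub group_index: usize,
    pub group_length: usize,
    pub group_key: String,
}

/// Assumed grouping key: "author:issued".
fn group_key(r: &Reference) -> String {
    format!("{}:{}", r.author, r.issued)
}

/// Sort by author, then date, then id (id only to make the order deterministic).
pub fn sort_bibliography(mut refs: Vec<Reference>) -> Vec<Reference> {
    refs.sort_by(|a, b| (&a.author, &a.issued, &a.id).cmp(&(&b.author, &b.issued, &b.id)));
    refs
}

/// Build the id -> hints map from an already sorted bibliography.
pub fn proc_hints(sorted: &[Reference]) -> HashMap<String, ProcHints> {
    // First pass: count how many references share each group key.
    let mut lengths: HashMap<String, usize> = HashMap::new();
    for r in sorted {
        *lengths.entry(group_key(r)).or_insert(0) += 1;
    }
    // Second pass: assign each reference its 1-based position within its group.
    let mut seen: HashMap<String, usize> = HashMap::new();
    let mut hints = HashMap::new();
    for r in sorted {
        let key = group_key(r);
        let length = lengths[&key];
        let index = seen.entry(key.clone()).or_insert(0);
        *index += 1;
        hints.insert(
            r.id.clone(),
            ProcHints {
                // Assumption: disambiguation is needed only when the group is shared.
                disamb_condition: length > 1,
                group_index: *index,
                group_length: length,
                group_key: key,
            },
        );
    }
    hints
}
```

A later rendering step could then walk the sorted vector and look up each reference's hints by id, for example to decide whether a year suffix is needed.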
Hey @bdarcus, I recently started a CSL 1.0.2 XML parser and processor with typst/citationberg. Good to know that you are working on something for the next generation of CSL! What kind of help are you looking for? |
Oh, cool; didn't know! How are you finding working with the XML?
I hadn't gotten around yet (since this is newer) to sketching out milestones, but the ones for the TypeScript project more-or-less apply. https://github.com/bdarcus/csl-next/milestones?direction=asc&sort=due_date There's still a lot of work to do on the processor, for example, and we need to figure out a way to convert 1.0 styles, which may or may not be a big task. It may be useful for you to review the model now, and think about whether there's promise in using that, and simply converting 1.0 styles to it, if we can do it fairly losslessly? I imagine your model would help a lot with that? And perhaps there's a way to share code between the two projects? This is admittedly not fully developed at this point, but I think I've thought through enough details that it should work out as I intend. EDIT: I did try to sketch out where I see this going in some of the crate READMEs (for example, for cli). On a more mundane level, since I'm a mediocre amateur programmer and Rust newbie (though sometimes I think this gives me certain advantages compared to trained programmers), reviews of existing code and PRs to improve it would be welcome :-) |
Any news on that? Still needs help? What kind of processing is needed? What makes this task difficult? (Rust newbie) |
Thank you for asking. The following tasks are still open:
|
What will be the relationship between hayagriva and citationberg? |
Citationberg parses CSL but makes no assumptions about how variables and data types are expressed within the consumer. Hayagriva will have a frontend to enter bibliographic information and be the CSL processor. |
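One way to picture that split is a style model that only names variables, with the consumer deciding how they are stored and looked up. The sketch below is purely illustrative: `Element`, `Entry`, and `render` are hypothetical names and do not reflect citationberg's or Hayagriva's actual API.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a parsed style element.
/// It refers to variables by name only and knows nothing about their storage.
pub enum Element {
    Text { variable: String },
    Literal(String),
}

/// Implemented by the consumer (the processor side) for its own entry type.
pub trait Entry {
    fn resolve(&self, variable: &str) -> Option<String>;
}

/// Example consumer: a plain string map acting as a bibliography entry.
impl Entry for HashMap<String, String> {
    fn resolve(&self, variable: &str) -> Option<String> {
        self.get(variable).cloned()
    }
}

/// Render the elements against any entry type; missing variables are skipped.
pub fn render<E: Entry>(elements: &[Element], entry: &E) -> String {
    let mut out = String::new();
    for element in elements {
        match element {
            Element::Text { variable } => {
                if let Some(value) = entry.resolve(variable) {
                    out.push_str(&value);
                }
            }
            Element::Literal(text) => out.push_str(text),
        }
    }
    out
}
```

Under this assumed design, the parsing crate could be reused by any processor that implements the lookup trait for its own data model.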
This has been shipped with 0.4.0 |
I'm admittedly biased, given I created it, but have you considered (also) supporting CSL citation styles and JSON input format?
I'm sure there are performance and other advantages to the Rust-based styles, but there are thousands of CSL styles, as well as citeproc-rs.