
[Feature Request] Support CSL-JSON Format

Posted: Mon Feb 10, 2025 6:58 am
by iandol
For markdown workflows the recommendation is to use CSL-JSON instead of BibTeX, e.g.:

https://retorque.re/zotero-better-bibte ... ith-pandoc

I currently run a Launchd script to convert the BibTeX that Bookends generates using "Sync Linked BibTeX File...", but it would be nice if we could choose to output CSL-JSON directly without needing secondary conversion.

Has anyone tried to make a JSON format using "Format Manager"? JSON is a bit fussy about commas etc., but perhaps FM can do this already? If not, I think Bookends would need lower-level support.

Here is an example CSL-JSON reference, converted from BibTeX output from Bookends using pandoc:

Code: Select all

[
  {
    "DOI": "10.1038/nrn3950",
    "author": [
      {
        "family": "Barrett",
        "given": "LF"
      },
      {
        "family": "Simmons",
        "given": "WK"
      }
    ],
    "container-title": "Nature Reviews Neuroscience",
    "id": "barrett2015",
    "issue": "7",
    "issued": {
      "date-parts": [
        [
          2015
        ]
      ]
    },
    "page": "419-429",
    "title": "Interoceptive predictions in the brain",
    "type": "article-journal",
    "volume": "16"
  }
]
The formal spec is here: https://citeproc-js.readthedocs.io/en/l ... arkup.html
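
On the "fussy commas" point: a minimal sketch, not how Bookends does or would do it, showing that building the record as plain data and handing it to a JSON serialiser sidesteps hand-placed commas entirely. The helper name to_csl_json is hypothetical; the field values are copied from the example above.

```python
import json

# Reference data mirroring the example above; keys are CSL-JSON variable names.
reference = {
    "id": "barrett2015",
    "type": "article-journal",
    "title": "Interoceptive predictions in the brain",
    "author": [
        {"family": "Barrett", "given": "LF"},
        {"family": "Simmons", "given": "WK"},
    ],
    "container-title": "Nature Reviews Neuroscience",
    "issued": {"date-parts": [[2015]]},
    "volume": "16",
    "issue": "7",
    "page": "419-429",
    "DOI": "10.1038/nrn3950",
}


def to_csl_json(references):
    """Serialise a list of reference dicts as a CSL-JSON array (hypothetical helper)."""
    # The serialiser places every comma and bracket, so no trailing comma can slip in.
    return json.dumps(references, indent=2, sort_keys=True, ensure_ascii=False)


csl_text = to_csl_json([reference])
print(csl_text)
```

A strict parser rejects a trailing comma after the last array element, which is the kind of thing a template-based format would have to special-case for the final reference.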

Re: [Feature Request] Support CSL-JSON Format

Posted: Mon Feb 10, 2025 8:43 am
by Jon
Would this be an option for Create/Sync Linked BibTeX File? Or something more general?

Jon
Sonny Software

Re: [Feature Request] Support CSL-JSON Format

Posted: Tue Feb 11, 2025 6:25 am
by iandol
My personal use would be for Create/Sync Linked BibTeX File, but I can imagine other people may want to use CSL-JSON manually. This is why I asked more generally whether anyone had tried to see if Format Manager could do this already; I haven't had time to try building JSON from FM myself yet, and I know that JSON can be fussy. As we can handle this conversion manually using pandoc, I would say it is low priority in terms of your development time: this feature simplifies a workflow, but its absence does not block it.

Re: [Feature Request] Support CSL-JSON Format

Posted: Tue Feb 11, 2025 9:12 am
by Jon
Please send a small (5-10 reference) library to support@sonnysoftware.com so we can experiment with possible solutions for this.

Jon
Sonny Software

Re: [Feature Request] Support CSL-JSON Format

Posted: Tue Feb 11, 2025 1:01 pm
by iandol
Sent, thanks for the consideration!

Re: [Feature Request] Support CSL-JSON Format

Posted: Mon Mar 10, 2025 12:06 pm
by somelinguist
Thanks for adding this in 15.1.2

Is it only possible to use one synced file?

I've been using the synced BibTeX file, and the new option for CSL-JSON is grayed out. Do I need to delete the synced BibTeX file in order to sync CSL-JSON instead?

I actually would love to be able to produce both for use in different projects (some using Pandoc, others with LaTeX).

Currently, I have an automated script to create a matching CSL-JSON file from the synced BibTeX file using Pandoc. It mostly works well, but I remember there being issues with a small subset of references due to Pandoc's conversion. I wonder if Bookends would produce better output.

Re: [Feature Request] Support CSL-JSON Format

Posted: Mon Mar 10, 2025 12:26 pm
by Jon
Yes, only one or the other. If you move the text file to the Trash (but DON'T empty the trash), Bookends will let you create a new file, which can be either BibTeX or CSL-JSON. Then you can compare the two and see which one works better for you.

Jon
Sonny Software

Re: [Feature Request] Support CSL-JSON Format

Posted: Mon Mar 10, 2025 9:32 pm
by iandol
Thank you Jon for the implementation!!!

For Pandoc users, you may find you have some issues with dates. Many of my refs have dates like "2021 Feb 12", and when synced to BibTeX only the year is exported (so Pandoc renders just the year in in-text citations and the bibliography, as expected). But when exported as CSL-JSON, the date is exported as a raw string: "2021 Feb 12". Pandoc treats raw strings as-is, so you end up with in-text citations like: (Shipp et al., 2021 Feb 12).

Zotero parses raw dates, but Pandoc doesn't; thus you'll see a difference between Zotero and Bookends with the same reference and the same CSL-JSON output.

There is no clear best solution here. CSL-JSON has made new recommendations, but legacy use means it is hard to make big changes. More details are available here: https://github.com/jgm/citeproc/issues/149

I will probably make an AppleScript to regularise my Bookends database, where dates like "2021 Feb 12" would be converted to ISO8601/EDTF <https://en.wikipedia.org/wiki/ISO_8601#EDTF> dates like "2021-02-12", which pandoc can parse in the raw field... But I'm short on time; I asked ChatGPT to make a robust date parser to embed in a script and it failed spectacularly :P .
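
For illustration only, here is a rough sketch of the conversion described above, written in Python rather than AppleScript. It only handles the "YYYY", "YYYY Mon" and "YYYY Mon DD" shapes (English month names); anything else is returned untouched. The function name to_iso and the pattern are my own assumptions, not part of Bookends or pandoc.

```python
import datetime
import re

# English month abbreviations mapped to month numbers (avoids locale-dependent parsing).
MONTHS = {
    name: number
    for number, name in enumerate(
        ["jan", "feb", "mar", "apr", "may", "jun",
         "jul", "aug", "sep", "oct", "nov", "dec"],
        start=1,
    )
}

# Matches "2021", "2021 Feb" or "2021 Feb 12" (full month names are accepted too).
RAW_DATE = re.compile(r"^\s*(\d{4})(?:\s+([A-Za-z]{3,9})\.?(?:\s+(\d{1,2}))?)?\s*$")


def to_iso(raw):
    """Return an ISO 8601 style date for a simple raw date, else the input unchanged."""
    match = RAW_DATE.match(raw)
    if not match:
        return raw  # unknown shape, e.g. "in press": leave it alone
    year, month_name, day = match.groups()
    if month_name is None:
        return year
    month = MONTHS.get(month_name[:3].lower())
    if month is None:
        return raw  # not a recognisable month name
    if day is None:
        return f"{year}-{month:02d}"
    try:
        # Reject impossible dates such as "2021 Feb 31".
        return datetime.date(int(year), month, int(day)).isoformat()
    except ValueError:
        return raw


print(to_iso("2021 Feb 12"))
```

Leaving unrecognised strings unchanged is deliberate: a partial parser that guesses would silently corrupt the odd cases in a large library.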

In the meantime, a script to parse the CSL-JSON dates is the easiest route...

Here is a regex pattern to replace dates with just the year in the JSON (it doesn't handle lots of variants, just the basic): https://regex101.com/r/qmxmYv

Re: [Feature Request] Support CSL-JSON Format

Posted: Tue Mar 11, 2025 10:25 am
by somelinguist
Thanks, Jon and iandol for your help!

I remember now that the dates were the issue with Pandoc for me too. I'll see if I can get something working based on your regex.

Re: [Feature Request] Support CSL-JSON Format

Posted: Wed Mar 12, 2025 4:12 am
by iandol
This is my interim zsh script to fix dates that runs via launchctl:

https://codeberg.org/iandol/dotfiles/sr ... eakJSON.sh

It first uses jq to remove abstracts, addresses, etc., which cuts the file size down (my JSON is 18MB before and 7MB after, so pandoc can load and parse it more quickly), and then uses Ruby to fix the dates with a regex. Ruby is a bit faster than sed for this task, but either will work:

Code: Select all

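# Option 1 (sed): reduce each "raw" date string to a four-digit year; assumes one "raw" field per line
sed 's/"raw": *".*\([0-9]\{4\}\)[^"]*"/"raw": "\1"/g' temp.json > "$outfile"
# Option 2 (Ruby): same idea, keeping the first four-digit run in each "raw" value
ruby -e 'puts File.read(ARGV[0]).gsub(/"raw":\s*".*?(\d{4})[^"]*"/, "\"raw\": \"\\1\"")' temp.json > "$outfile"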
That script also fixes the case for some words in my field using a second script (https://codeberg.org/iandol/dotfiles/sr ... fixCase.rb) but you can leave that out.
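
If you would rather not chain jq and a regex over the raw text, the same two steps can be sketched on the parsed JSON. This is an approximation under assumptions, not a port of the linked script: the list of dropped fields is a guess at "abstracts, addresses etc", the date is assumed to sit in issued.raw, and the names slim_reference and slim_library and the sample record are hypothetical.

```python
import json
import re

# Fields to drop to shrink the file (assumed names; adjust to what your export contains).
DROP_FIELDS = ("abstract", "publisher-place")

# First run of four digits in the raw date is taken to be the year.
YEAR = re.compile(r"\d{4}")


def slim_reference(ref):
    """Return a copy of one reference without bulky fields and with a year-only raw date."""
    slim = {key: value for key, value in ref.items() if key not in DROP_FIELDS}
    issued = slim.get("issued")
    if isinstance(issued, dict) and isinstance(issued.get("raw"), str):
        match = YEAR.search(issued["raw"])
        if match:
            slim["issued"] = {**issued, "raw": match.group(0)}
    return slim


def slim_library(text):
    """Apply slim_reference to every entry of a CSL-JSON array given as text."""
    references = json.loads(text)
    return json.dumps([slim_reference(r) for r in references], indent=2, ensure_ascii=False)


sample = '[{"id": "shipp2021", "abstract": "long text", "issued": {"raw": "2021 Feb 12"}}]'
print(slim_library(sample))
```

Unlike the regex one-liners, this only touches issued.raw and leaves any other "raw" fields (for example an accessed date) as they are.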

Re: [Feature Request] Support CSL-JSON Format

Posted: Fri Mar 14, 2025 12:01 am
by iandol
For anyone thinking about switching to JSON over BIB here is a performance comparison:

Code: Select all

✦ ε ➪ benchmark { pandoc -t plain --csl apa.csl --citeproc --bibliography Core.json test.md > /dev/null }
337.467316ms ± 15.798866ms (min 322.477834ms, max 366.256375ms, 5 runs)

✦ ε ➪ benchmark { pandoc -t plain --csl apa.csl --citeproc --bibliography Core.bib test.md > /dev/null }
1.747915883s ± 19.313042ms (min 1.732968833s, max 1.785758333s, 5 runs)
Around 5X faster. With a bigger document and/or regular compilation (for example, using Marked2 for live preview of your paper), this does make a difference...