[Feature Request] Post processing Imports for DOIs

iandol
Posts: 465
Joined: Fri Jan 25, 2008 2:31 pm

Post by iandol »

DOIs have become one of the most important elements of a modern reference. Annoyingly, when importing RIS (and possibly other formats), different sources use different fields to specify the DOI. This often causes Bookends to import invalid DOIs. A clear example is the Nature/Springer group:

http://www.nature.com/nrn/journal/v18/n ... 017.15.ris

That file uses L3, not M3, for the DOI. Other journals seem to use other fields, and I've seen at least 4 different potential locations (including the URL) as well as malformed DOIs. One could create a bunch of customised RIS import formats, select the correct one on each import, and then manually fix any of the issues described below, but if you have a bunch of RIS files you may not remember the origin of each one anyway.

A potential solution would be some sort of post-processing. DOIs start with "10." so you can regex match the correct data. In the Nature example above, the field Bookends imports by default (M3) contains the string "Article", which is clearly not a DOI. L3 does contain a correct DOI, but if you modify the import filter to import both M3 and L3 you end up with "Article\n10.1038/nrn.2017.15", which breaks the DOI. Other journals correctly use M3 but write "doi: 10.xxxx", which again creates a broken DOI in BE. I've also seen "M3 - http://dx.doi.org/10.xxx", where the DOI is expressed as a URL. In all these cases, if Bookends validated the DOI to be of the form "10.xxx", these problems would be solved without any manual intervention.
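
The validation idea can be sketched in Python. This is only an illustration: the function name `extract_doi` and the regex are assumptions (a DOI is taken to be "10.", a 4 to 9 digit registrant code, a slash, then any run of non-whitespace), and it says nothing about how Bookends works internally.

```python
import re

# Assumed pattern: "10." + 4-9 digit registrant code + "/" + non-whitespace suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")

def extract_doi(raw):
    """Return the first DOI-like token in raw field text, or None."""
    match = DOI_PATTERN.search(raw)
    if match is None:
        return None
    # Drop trailing punctuation that often follows a DOI in running text.
    return match.group(0).rstrip(".,;")

# The three problem cases described above, plus a value with no DOI at all.
for raw in ("Article\n10.1038/nrn.2017.15",
            "doi: 10.1177/1745691616659391",
            "http://dx.doi.org/10.1177/1745691616659391",
            "Article"):
    print(repr(raw), "->", extract_doi(raw))
```

Stripping trailing punctuation is a judgment call: it helps with DOIs lifted from citation strings, but would damage a DOI that really does end in a full stop.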
ozean
Posts: 461
Joined: Fri Mar 04, 2005 11:53 am
Location: Norway

Re: [Feature Request] Post processing Imports for DOIs

Post by ozean »

Not sure about the technical side, but some kind of automatic solution would be welcome. Recently, importing SAGE’s .ris files started to populate the DOI field with two entries separated by a line break – I always forget to correct this when importing the .ris and then have to go back and fix it when I start using the reference…
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: [Feature Request] Post processing Imports for DOIs

Post by Jon »

What was the .ris file content?

Jon
Sonny Software
iandol
Posts: 465
Joined: Fri Jan 25, 2008 2:31 pm

Re: [Feature Request] Post processing Imports for DOIs

Post by iandol »

Here is a Sage example. The DOI is available in 4 different fields (DO, N1, M3 and UR), and Bookends by default combines M3 and DO:

TY  - JOUR
T1  - Microaggressions
AU  - Lilienfeld, Scott O.
Y1  - 2017/01/01
PY  - 2017
DA  - 2017/01/01
N1  - doi: 10.1177/1745691616659391
DO  - 10.1177/1745691616659391
T2  - Perspectives on Psychological Science
JF  - Perspectives on Psychological Science
JO  - Perspectives on Psychological Science
SP  - 138
EP  - 169
VL  - 12
IS  - 1
PB  - SAGE Publications
N2  - The microaggression concept has recently galvanized public discussion and spread to numerous college campuses and businesses. ...
AB  - The microaggression concept has recently galvanized public discussion and spread to numerous college campuses and businesses. ...
SN  - 1745-6916
M3  - doi: 10.1177/1745691616659391
UR  - http://dx.doi.org/10.1177/1745691616659391
Y2  - 2017/03/19
ER  - 
For me, the DOI in Bookends gets set to the following. Depending on your UI font it can look normal in Bookends, because the second line is not clearly visible:

10.1177/1745691616659391
doi: 10.1177/1745691616659391
This is just one example. The Nature journals example I linked uses M3 as "M3 - Review" and puts the DOI in L3 and UR. Oxford University Press uses M3 but with "doi: " prepended, and N1 with the DOI correctly formatted. And I've recently seen what looks like a link-like identifier end up as a DOI too (Cell Calcium):

S0143-4160(16)30215-9
10.1016/j.ceca.2016.12.005
Last edited by iandol on Mon Mar 20, 2017 9:28 am, edited 1 time in total.
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: [Feature Request] Post processing Imports for DOIs

Post by Jon »

The obvious solution available right now is to duplicate this filter, rename as Sage, and remove the N1 tag from the filter (or have it redirected to, say, Notes).

For Bookends to allow two tags and handle the resolution automatically, it would have to

1. Recognize that the data is headed for the DOI field.
2. Remove any text preceding the 10.
3. When the second tag is encountered, trim it (if necessary), and check to see if it's already in the DOI field. If so, ignore it.

This would be specific to this case, but not hard to do (I think).
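
The three steps above could be sketched roughly as follows in Python. This is not Bookends code: `add_to_doi_field` and the regex are hypothetical, step 1 is assumed to be handled by the caller (the function is only called for tags the filter maps to the DOI field), and dropping values with no "10." in them at all is an extra assumption beyond the steps listed.

```python
import re

DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")  # assumed shape of a DOI

def add_to_doi_field(current, incoming):
    """Merge one imported tag value into the DOI field text."""
    # Step 2: discard anything before the "10." (e.g. "doi: " or a resolver URL).
    match = DOI_PATTERN.search(incoming)
    if match is None:
        return current  # nothing DOI-like, e.g. "JOUR" or "Article"
    doi = match.group(0)
    # Step 3: ignore the value if the field already holds it.
    # DOIs are case-insensitive, so compare in lower case.
    existing = current.splitlines()
    if doi.lower() in (line.lower() for line in existing):
        return current
    return "\n".join(existing + [doi])

# The Sage case: DO arrives first, then M3 with a "doi: " prefix.
field = ""
for value in ("10.1177/1745691616659391", "doi: 10.1177/1745691616659391"):
    field = add_to_doi_field(field, value)
print(field)
```

With the Sage values the second tag is recognised as a duplicate, so the field ends up holding a single line.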

Jon
Sonny Software
iandol
Posts: 465
Joined: Fri Jan 25, 2008 2:31 pm

Re: [Feature Request] Post processing Imports for DOIs

Post by iandol »

That sounds like it would solve most cases. I think this is specific to DOIs, as they are recent enough that there seems to be no clear standard on which field to put them in or how to format them.
iandol
Posts: 465
Joined: Fri Jan 25, 2008 2:31 pm

Re: [Feature Request] Post processing Imports for DOIs

Post by iandol »

The Cell Calcium example I gave was actually the PubMed .nbib export from https://www.ncbi.nlm.nih.gov/pubmed/280 ... t=Citation. On import Bookends uses the AID tag, which has two entries, and the second one is the DOI:

PMID- 28027798
OWN - NLM
STAT- Publisher
DA  - 20161228
LR  - 20161229
IS  - 1532-1991 (Electronic)
IS  - 0143-4160 (Linking)
DP  - 2016 Dec 21
TI  - Improved deep two-photon calcium imaging in vivo.
LID - S0143-4160(16)30215-9 [pii]
LID - 10.1016/j.ceca.2016.12.005 [doi]
AB  - Two-photon laser scanning calcium imaging has emerged as a useful method for the 
      exploration of neural function and structure at the cellular and subcellular
      level in vivo. The applications range from imaging of subcellular compartments
      such as dendrites, spines and axonal boutons up to the functional analysis of
      large neuronal or glial populations. However, the depth penetration is often
      limited to a few hundred micrometers, corresponding, for example, to the upper
      cortical layers of the mouse brain. Light scattering and aberrations originating 
      from refractive index inhomogeneties of the tissue are the reasons for these
      limitations. The depth penetration of two-photon imaging can be enhanced through 
      various approaches, such as the implementation of adaptive optics, the use of
      three-photon excitation and/or labeling cells with red-shifted genetically
      encoded fluorescent sensors. However, most of the approaches used so far require 
      the implementation of new instrumentation and/or time consuming staining
      protocols. Here we present a simple approach that can be readily implemented in
      combination with standard two-photon microscopes. The method involves an
      optimized protocol for depth-restricted labeling with the red-shifted fluorescent
      calcium indicator Cal-590 and benefits from the use of ultra-short laser pulses. 
      The approach allows in vivo functional imaging of neuronal populations with
      single cell resolution in all six layers of the mouse cortex. We demonstrate that
      stable recordings in deep cortical layers are not restricted to anesthetized
      animals but are well feasible in awake, behaving mice. We anticipate that the
      improved depth penetration will be beneficial for two-photon functional imaging
      in larger species, such as non-human primates.
CI  - Copyright (c) 2016. Published by Elsevier Ltd.
FAU - Birkner, Antje
AU  - Birkner A
AD  - Institute of Neuroscience, Technical University of Munich, Munich, Germany;
      Munich Cluster for Systems Neurology (SyNergy) and Center for Integrated Protein 
      Sciences (CIPSM), Munich, Germany. Electronic address: antje.birkner@tum.de.
FAU - Tischbirek, Carsten H
AU  - Tischbirek CH
AD  - Institute of Neuroscience, Technical University of Munich, Munich, Germany;
      Munich Cluster for Systems Neurology (SyNergy) and Center for Integrated Protein 
      Sciences (CIPSM), Munich, Germany.
FAU - Konnerth, Arthur
AU  - Konnerth A
AD  - Institute of Neuroscience, Technical University of Munich, Munich, Germany;
      Munich Cluster for Systems Neurology (SyNergy) and Center for Integrated Protein 
      Sciences (CIPSM), Munich, Germany. Electronic address: arthur.konnerth@tum.de.
LA  - eng
PT  - Review
PT  - Journal Article
DEP - 20161221
PL  - Netherlands
TA  - Cell Calcium
JT  - Cell calcium
JID - 8006226
OTO - NOTNLM
OT  - Calcium signaling
OT  - Mouse visual cortex
OT  - Multi-photon microscopy
OT  - Neuronal activity
EDAT- 2016/12/29 06:00
MHDA- 2016/12/29 06:00
CRDT- 2016/12/29 06:00
PHST- 2016/12/05 [received]
PHST- 2016/12/19 [accepted]
AID - S0143-4160(16)30215-9 [pii]
AID - 10.1016/j.ceca.2016.12.005 [doi]
PST - aheadofprint
SO  - Cell Calcium. 2016 Dec 21. pii: S0143-4160(16)30215-9. doi:
      10.1016/j.ceca.2016.12.005.
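
One way to pick the right entry is to use the bracketed qualifier rather than the position. A minimal Python sketch, assuming each identifier line has the shape "AID - value [qualifier]" as in the export above (the function name `doi_from_nbib` and the regex are illustrative, not how Bookends parses the file):

```python
import re

# Assumed line shape: "AID - <value> [<qualifier>]" (LID lines look the same).
ID_LINE = re.compile(r"^(?:AID|LID)\s*-\s*(?P<value>.+?)\s*\[(?P<kind>\w+)\]\s*$")

def doi_from_nbib(text):
    """Return the first AID/LID value qualified as [doi], or None."""
    for line in text.splitlines():
        match = ID_LINE.match(line)
        if match and match.group("kind").lower() == "doi":
            # The capture already excludes the " [doi]" qualifier.
            return match.group("value")
    return None

record = """\
LID - S0143-4160(16)30215-9 [pii]
LID - 10.1016/j.ceca.2016.12.005 [doi]
AID - S0143-4160(16)30215-9 [pii]
AID - 10.1016/j.ceca.2016.12.005 [doi]
"""
print(doi_from_nbib(record))
```

Because the qualifier is matched as a whole, the [pii] entries are skipped and no stray bracket text is left on the value.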
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: [Feature Request] Post processing Imports for DOIs

Post by Jon »

OK.

Jon
Sonny Software
iandol
Posts: 465
Joined: Fri Jan 25, 2008 2:31 pm

Re: [Feature Request] Post processing Imports for DOIs

Post by iandol »

An example where the DOI [DO] is correctly formatted but ends up with JOUR [M3] appended in the Bookends field:

10.7554/eLife.22749
JOUR

TY  - JOUR
AU  - Gordon, Noam
AU  - Koenig-Robert, Roger
AU  - Tsuchiya, Naotsugu
AU  - van Boxtel, Jeroen JA
AU  - Hohwy, Jakob
A2  - Stephan, Klaas Enno
TI  - Neural markers of predictive coding under perceptual uncertainty revealed with Hierarchical Frequency Tagging
PY  - 2017
DA  - 2017/02/28
JF  - eLife
SN  - 2050-084X
PB  - eLife Sciences Publications, Ltd
DO  - 10.7554/eLife.22749
VL  - 6
UR  - https://dx.doi.org/10.7554/eLife.22749
M3  - JOUR
C1  - eLife 2017;6:e22749
SP  - e22749
KW  - Hierarchical Frequency Tagging
KW  - predictive coding
KW  - Semantic Wavelet-Induced Frequency Tagging
KW  - intermodulation
AB  - There is ...
ER  - 
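
For RIS files like this one, an alternative to combining tags is to scan the candidate tags in a fixed order and keep only the first DOI-like value. The Python sketch below is an assumption-laden illustration: the tag priority (DO, then L3, M3, N1, UR), the regexes, and the name `doi_from_ris` are all made up for the example.

```python
import re

DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")          # assumed shape of a DOI
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")  # "TAG  - value"
# Assumed priority: the dedicated DO tag first, then the fallbacks seen in this thread.
CANDIDATE_TAGS = ("DO", "L3", "M3", "N1", "UR")

def doi_from_ris(record):
    """Return the first DOI found in the candidate tags, in priority order."""
    values = {}
    for line in record.splitlines():
        match = RIS_LINE.match(line)
        if match:
            values.setdefault(match.group(1), []).append(match.group(2))
    for tag in CANDIDATE_TAGS:
        for value in values.get(tag, []):
            found = DOI_PATTERN.search(value)
            if found:
                return found.group(0)
    return None

# A subset of the eLife record above.
record = """\
TY  - JOUR
DO  - 10.7554/eLife.22749
UR  - https://dx.doi.org/10.7554/eLife.22749
M3  - JOUR
ER  -
"""
print(doi_from_ris(record))
```

Here "M3 - JOUR" is never appended, because DO already supplied a DOI and M3 contains nothing that looks like one.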
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: [Feature Request] Post processing Imports for DOIs

Post by Jon »

By OK I meant that the fix was made; it will be in the next update.

Jon
Sonny Software
iandol
Posts: 465
Joined: Fri Jan 25, 2008 2:31 pm

Re: [Feature Request] Post processing Imports for DOIs

Post by iandol »

There are still some problems with PubMed DOIs in 12.8.1, e.g. this citation (imported via the PubMed browser or citation download):

https://www.ncbi.nlm.nih.gov/pubmed/?term=28467827

imports this DOI: 10.1038/nature22073 [doi

Note the [doi remaining on the DOI; only the closing ] is removed. Here is a subset of the fields from the nbib file:

TI  - Thalamic amplification of cortical connectivity sustains attentional control.
LID - 10.1038/nature22073 [doi]
AID - nature22073 [pii]
AID - 10.1038/nature22073 [doi]
PST - ppublish
SO  - Nature. 2017 May 11;545(7653):219-223. doi: 10.1038/nature22073. Epub 2017 May 3.
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: [Feature Request] Post processing Imports for DOIs

Post by Jon »

Download 12.8.1 again.

Jon
Sonny Software
iandol
Posts: 465
Joined: Fri Jan 25, 2008 2:31 pm

Re: [Feature Request] Post processing Imports for DOIs

Post by iandol »

After redownloading, I get a reproducible hang (spinning beachball, requires a force quit) when trying to import the citation.nbib from that reference. I've sent a crash report but can do a spindump as well if it helps...

Here is the text of that ref:

PMID- 28467827
OWN - NLM
STAT- In-Data-Review
DA  - 20170503
LR  - 20170511
IS  - 1476-4687 (Electronic)
IS  - 0028-0836 (Linking)
VI  - 545
IP  - 7653
DP  - 2017 May 11
TI  - Thalamic amplification of cortical connectivity sustains attentional control.
PG  - 219-223
LID - 10.1038/nature22073 [doi]
AB  - Although interactions between the thalamus and cortex are critical for cognitive 
      function, the exact contribution of the thalamus to these interactions remains
      unclear. Recent studies have shown diverse connectivity patterns across the
      thalamus, but whether this diversity translates to thalamic functions beyond
      relaying information to or between cortical regions is unknown. Here we show, by 
      investigating the representation of two rules used to guide attention in the
      mouse prefrontal cortex (PFC), that the mediodorsal thalamus sustains these
      representations without relaying categorical information. Specifically,
      mediodorsal input amplifies local PFC connectivity, enabling rule-specific neural
      sequences to emerge and thereby maintain rule representations. Consistent with
      this notion, broadly enhancing PFC excitability diminishes rule specificity and
      behavioural performance, whereas enhancing mediodorsal excitability improves
      both. Overall, our results define a previously unknown principle in neuroscience;
      thalamic control of functional cortical connectivity. This function, which is
      dissociable from categorical information relay, indicates that the thalamus has a
      much broader role in cognition than previously thought.
FAU - Schmitt, L Ian
AU  - Schmitt LI
AD  - NYU Neuroscience Institute, Department of Neuroscience and Physiology, NYU
      Langone Medical Center, New York, New York 10016, USA.
FAU - Wimmer, Ralf D
AU  - Wimmer RD
AD  - NYU Neuroscience Institute, Department of Neuroscience and Physiology, NYU
      Langone Medical Center, New York, New York 10016, USA.
FAU - Nakajima, Miho
AU  - Nakajima M
AD  - NYU Neuroscience Institute, Department of Neuroscience and Physiology, NYU
      Langone Medical Center, New York, New York 10016, USA.
FAU - Happ, Michael
AU  - Happ M
AD  - NYU Neuroscience Institute, Department of Neuroscience and Physiology, NYU
      Langone Medical Center, New York, New York 10016, USA.
FAU - Mofakham, Sima
AU  - Mofakham S
AD  - NYU Neuroscience Institute, Department of Neuroscience and Physiology, NYU
      Langone Medical Center, New York, New York 10016, USA.
FAU - Halassa, Michael M
AU  - Halassa MM
AD  - NYU Neuroscience Institute, Department of Neuroscience and Physiology, NYU
      Langone Medical Center, New York, New York 10016, USA.
AD  - Center for Neural Science, New York University, New York, New York 10016, USA.
LA  - eng
PT  - Journal Article
DEP - 20170503
PL  - England
TA  - Nature
JT  - Nature
JID - 0410462
EDAT- 2017/05/04 06:00
MHDA- 2017/05/04 06:00
CRDT- 2017/05/04 06:00
PHST- 2016/11/14 [received]
PHST- 2017/03/15 [accepted]
AID - nature22073 [pii]
AID - 10.1038/nature22073 [doi]
PST - ppublish
SO  - Nature. 2017 May 11;545(7653):219-223. doi: 10.1038/nature22073. Epub 2017 May 3.

Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: [Feature Request] Post processing Imports for DOIs

Post by Jon »

I see it. Please download 12.8.1 again, it should work now.

Jon
Sonny Software