
author parser: multiple Author separators

Posted: Wed Nov 10, 2021 6:59 am
by Dellu
File:
Sandra A. Thompson, Robert E. Longacre, And Shin Ja J. Hwang
Imported:
Hwang, Sandra A. Thompson, Robert E. Longacre, And Shin Ja J.
I used the "comma" separator. But the file also contains an "and" separator (putting "and" before the last author is a common way of writing author lists).
The parser also has another issue: I don't know why it moved the last author's surname to the front.


Screenshot of the setting: https://take.ms/tKxVZa

Re: author parser: multiple Author separators

Posted: Thu Nov 11, 2021 8:15 am
by Dellu
The author parser in BE is not flexible enough to handle different patterns of input.

I went with Keyboard Maestro. For anyone interested in parsing authors in different formats, here is a wonderful KM macro:
https://forum.keyboardmaestro.com/t/par ... at/24718/7

Re: author parser: multiple Author separators

Posted: Thu Nov 11, 2021 7:02 pm
by Jon
What import filter were you using? If it's BibTeX, it expects all authors to be separated with the word " and ".

Jon
Sonny Software

Re: author parser: multiple Author separators

Posted: Sat Nov 13, 2021 2:57 am
by Dellu
Jon wrote: Thu Nov 11, 2021 7:02 pm What import filter were you using? If it's BibTeX, it expects all authors to be separated with the word " and ".

Jon
Sonny Software
Yes, I am using the BibTeX filter. It has an option in the manager to change the separator to a comma as well, doesn't it?

But the problem is that I have mixed separators, such as a comma and "and", as in the example I showed above. In that case, BE fails to parse the list correctly. For now, I am using KM to do the parsing (I disabled the author parser in BE).
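To make the mixed-separator case concrete, here is a minimal Python sketch that splits an author string on commas, on "and", or on ", and". It is not how Bookends or the linked KM macro works; `split_authors` and the regular expression are hypothetical names chosen for illustration. It assumes names are written "First Last", so a "Last, First" list would be split incorrectly.

```python
import re
from typing import List

# Hypothetical separator pattern (an assumption, not Bookends' rule):
# a comma optionally followed by "and", or a standalone "and" between spaces.
_SEPARATOR = re.compile(r"\s*,\s*(?:and\s+)?|\s+and\s+", flags=re.IGNORECASE)


def split_authors(raw: str) -> List[str]:
    """Split an author string on ",", "and", or ", and" (case-insensitive)."""
    parts = _SEPARATOR.split(raw.strip())
    # Drop empty pieces left by leading or trailing separators.
    return [part for part in parts if part]


authors = split_authors("Sandra A. Thompson, Robert E. Longacre, And Shin Ja J. Hwang")
print(authors)
# ['Sandra A. Thompson', 'Robert E. Longacre', 'Shin Ja J. Hwang']

# Re-join with " and ", the separator a BibTeX author field expects.
print(" and ".join(authors))
```

Because "and" only counts as a separator when it is surrounded by whitespace (or follows a comma), names such as "Sandra" or "Anderson" are left intact.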

But in the future, it would be nice if you could consider making the parser a bit more powerful and flexible.

Re: author parser: multiple Author separators

Posted: Sat Nov 13, 2021 7:37 am
by Jon
It has the option, but it's ignored internally for BibTeX. BibTeX is a special case and follows special rules. It expects " and " between authors. I suppose I could disable that field (punct between authors) if the format specifies BibTeX, but it's not something I anticipated anyone changing. I'll do that in the next update.
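As a rough illustration of the " and " rule, the Python sketch below shows one possible reason the example from the first post came out with "Hwang" at the front. This is only a guess, not a description of Bookends' internals: it assumes a case-sensitive split on the literal " and " and a naive "last word is the surname" rule, and both helper names are hypothetical. It also ignores BibTeX brace protection.

```python
def split_on_and(field):
    # Assumption: authors are separated only by the literal, lowercase " and ".
    return [name.strip() for name in field.split(" and ") if name.strip()]


def surname_first(name):
    # Naive rule (an assumption): the last whitespace-separated token is the surname.
    given, _, surname = name.rpartition(" ")
    return f"{surname}, {given}" if given else surname


raw = "Sandra A. Thompson, Robert E. Longacre, And Shin Ja J. Hwang"
print([surname_first(n) for n in split_on_and(raw)])
# ['Hwang, Sandra A. Thompson, Robert E. Longacre, And Shin Ja J.']

fixed = "Sandra A. Thompson and Robert E. Longacre and Shin Ja J. Hwang"
print([surname_first(n) for n in split_on_and(fixed)])
# ['Thompson, Sandra A.', 'Longacre, Robert E.', 'Hwang, Shin Ja J.']
```

Under these assumptions, the comma-and-"And" string contains no " and " separator, so the whole line is treated as a single name.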

The parser is meant for tagged and regular output, like RIS, Endnote Refer, etc., which have well-defined structures, not for irregular inputs. Creating a free-format parser (an AI, basically) would be quite complicated for users, and is beyond the scope of an app like Bookends. We do offer the ability to import unstructured references if you need it (File -> Import From Existing Bibliography), which harnesses an AI created by Crossref. But it requires that the references have DOIs, which I know many of yours do not.

Jon
Sonny Software

Re: author parser: multiple Author separators

Posted: Sun Nov 14, 2021 3:22 am
by Dellu
Jon wrote: Sat Nov 13, 2021 7:37 am It has the option, but it's ignored internally for BibTeX. BibTeX is a special case and follows special rules. It expects " and " between authors. I suppose I could disable that field (punct between authors) if the format specifies BibTeX, but it's not something I anticipated anyone changing. I'll do that in the next update.

The parser is meant for tagged and regular output, like RIS, Endnote Refer, etc., which have well-defined structures, not for irregular inputs. Creating a free-format parser (an AI, basically) would be quite complicated for users, and is beyond the scope of an app like Bookends. We do offer the ability to import unstructured references if you need it (File -> Import From Existing Bibliography), which harnesses an AI created by Crossref. But it requires that the references have DOIs, which I know many of yours do not.

Jon
Sonny Software
OK, I understand. It is a complicated process, as author names can come in various formats (not to mention the various naming conventions, such as Dutch, Norwegian, and Ethiopian names).

I don't think you need to change the BibTeX parser per se. I was wondering whether the whole parsing could happen in BE. Indeed, "and" appears to be the standard connector for BibTeX, so you can leave it as it is now.

Thanks