BibTeX import, characters and unknown fields

teoric
Posts: 10
Joined: Mon Dec 17, 2007 4:28 pm

Post by teoric »

Hi!

I'm thinking about trying a bibliography manager that integrates with WYSIWYG programmes. I've come across BookEnds and read nice comments about it, so I downloaded the Demo version.

Now I tried to import my BIBTeX database, and it seems that BookEnds does not work at all. It may be my fault, but I can't find anything in the on-line help or on the web (which might be my fault again). Below are four sample BIBTeX entries, in which
* unknown fields end up in neighbouring fields instead of being thrown out,
* dashes are not correctly recognised ("--" becomes "ñ" with MacRoman), or accented characters aren't ({\'e} becomes something unpastable with Latin-1), or both (with UTF-8), and "{\c c}" does not become "ç" but remains "{\c c}".

Is it possible to fix any of these problems, and maybe get the information in the entries (location on the disk, cross-references to other entries) into the BookEnds database? Or would I have to write a preprocessor that throws out what BE cannot handle and, before that, export my database in a way that removes the cross-references? (That would be a show-stopper for me, probably, so it would be nice to know before I start liking the programme.)
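
A preprocessor of the kind mentioned above could be sketched in Python roughly as follows. This is only an illustration and not something Bookends provides: the name `drop_unknown_fields` is hypothetical, the kept fields are the standard BibTeX field names, and the code assumes one field per line, as in the sample entries in this post.

```python
import re

# Standard BibTeX field names; anything else is treated as "unknown".
CORE_FIELDS = {
    "address", "annote", "author", "booktitle", "chapter", "crossref",
    "edition", "editor", "howpublished", "institution", "journal", "key",
    "month", "note", "number", "organization", "pages", "publisher",
    "school", "series", "title", "type", "volume", "year",
}

# A field line looks like "Name = ..."; the "@type{key," line does not match.
FIELD_LINE = re.compile(r"^\s*([A-Za-z][\w-]*)\s*=")


def drop_unknown_fields(bibtex: str, keep=CORE_FIELDS) -> str:
    """Remove one-line fields whose name is not in `keep` (assumption: one field per line)."""
    out = []
    for line in bibtex.splitlines():
        m = FIELD_LINE.match(line)
        if m and m.group(1).lower() not in keep:
            # If the dropped field also closed the entry, keep the closing brace.
            if line.rstrip().endswith("}}"):
                out.append("}")
            continue
        out.append(line)
    return "\n".join(out)
```

To keep a non-standard field such as Language, pass a larger set, for example `drop_unknown_fields(text, keep=CORE_FIELDS | {"language"})`.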

And would it be possible to get the information about the language of an entry into BE? It's really useful for spell-checking and especially for line-breaking.

(By the way, these entries were created using BibDesk and JabRef.)

Grateful for any help
and sorry for the verbosity,
Bernhard


@book{abeille:nouvelle,
Author = {Abeill{\'e}, Anne},
Date-Modified = {2007-02-02 14:51:18 +0100},
Keywords = {Syntax},
Language = {french},
Publisher = {Armand Collin},
Title = {Les nouvelles syntaxes: Grammaires d'unification et analyse du Fran{\c c}ais},
Year = {1993}}


@incollection{baeu-zim:frage,
Author = {B{\"a}uerle, Rainer and Zimmermann, Thomas Ede},
Crossref = {hsk6},
Language = {german},
Pages = {333--348},
Title = {Frages{\"a}tze}}


@article{artstein:sub,
Author = {Artstein, Ron},
Date-Modified = {2007-02-02 14:51:18 +0100},
File = {:/Lit/FOCUS/focus-below-word-level.pdf:PDF},
Journal = {Natural Language Semantics},
Keywords = {Fokus; Phonologie; Semantik},
Language = {english},
Number = {1},
Pages = {1--22},
Title = {Focus below the word level},
Url = {http://privatewww.essex.ac.uk/~artstein ... /focus.pdf},
Volume = {12},
Year = {2004}}



@misc{safar:sem-into,
Author = {{\v{S}}af{\'a}\v{r}ov{\'a}, Marie},
Comment = {Zum Workshop: http://www.ling.helsinki.fi/kielitiede/ ... l_Methods/},
Date-Modified = {2007-02-02 14:51:18 +0100},
Howpublished = {paper for the Workshop on Experimental Methods in Semantics, Helsinki, January 7--9, 2004},
Keywords = {Fokus; Topic; Information Structure; Semantik; Pragmatik},
Language = {english},
Title = {Analyzing the Semantics of Intonation},
Url = {http://staff.science.uva.nl/~mnilseno/helsinki.pdf},
Year = {2004}}
Jon
Site Admin
Posts: 10293
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD
Contact:

Post by Jon »

Hi,

There are two issues here:

1. The BibTeX filter only handles tags it knows about, which are the core BibTeX tags. You have to add other tags to the filter (either to Ignore them, or to put them where you want them). For example, to get the Language tag you might enter

Language -> User7 (or Notes, or User18, for example)

2. Incorrect import of TeX characters. Make sure you have Convert from TeX checked in Preferences. I see the problem with the double dash and I'll fix that for the next update.

Jon
Sonny Software
Last edited by Jon on Mon Dec 17, 2007 6:12 pm, edited 1 time in total.

Post by teoric »

Hi,

that was a fast reply, thanks!
Jon wrote:Hi,

There are two issues here:

1. The BibTeX filter only handles tags it knows about, which are the core BibTeX tags. You have to add other tags to the filter (either to Ignore them, or to put them where you want them). For example, to get the Language tag you might enter

Language -> User18 (or Notes, or User18, for example)
ok, thanks, I'll try that. Importing the language data to user18 would allow me to use this for export, but BookEnds does not make any use of it, does it? (E.g. by not Inadequately Capitalising Foreign-Language Titles.)
Jon wrote: 2. Incorrect import of TeX characters. Make sure you have Convert from TeX checked in Preferences. I see the problem with the double dash and I'll fix that for the next update.
OK, so I am supposed to use MacRoman, obviously. The problem of Mrs. Šafářová's diacritics persists; and the given name and last name seem to be reversed if they were given in the unambiguous order "<last>, <first>" in the bibtex file. Is there any way around that? I tried changing the format of the authors in the Format Manager, but that does not work.

Thanks for your helpful comments,
Bernhard

Post by Jon »

teoric wrote:
ok, thanks, I'll try that. Importing the language data to user18 would allow me to use this for export, but BookEnds does not make any use of it, does it? (E.g. by not Inadequately Capitalising Foreign-Language Titles.)
I'm not quite sure what you mean here. You can specify that user18 be used in any format, if that's what you want.
teoric wrote: OK, so I am supposed to use MacRoman, obviously. The Problem of Mrs. Šafářová's diacritics persists;
I've already fixed the á. But not the "che" (is that what that is?). Bookends BibTeX support for Slavic (again, I think that's it, please excuse my ignorance) accents is poor.
teoric wrote: and the given name and last name seem to be reversed if they were given in the unambiguous order "<last>, <first>" in the bibtex file. Is there any way around that? I tried changing the format of the authors in the Format Manager, but that does not work.
You have to tell the BibTeX import filter what order the names are in (surname first or last). In this case it is of course surname first.

Jon
Sonny Software

Post by teoric »

Hi Jon,

I meant: If I have a title of a publication that is not English, it should not be Put InTo Title Case even if the style demands that English be typeset in Title Case. So it would be nice if BookEnds used the language information for such decisions regarding formatted copy/paste.

As for the characters: It's supposed to be a "sh" (but it is a Czech name, so you were close :)), but never mind. I thought that you could e.g. treat TeX accents generically instead of case by case, mapping them to the Unicode combining diacritic marks (of course, you'd have to put these after the character they modify)? (If you then normalise to NFKC so that all the characters that have a single character representation in Unicode are indeed expressed using that character, that would be great.) It could be done in a preprocessor, of course, but having it in BE would be very handy, I think (and I get lazy if I think about buying something :oops:).
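
The generic approach suggested above (map each TeX accent command to a Unicode combining mark placed after the base letter, then normalise) might look like this in Python. It is a minimal sketch, not how Bookends converts TeX: the function name `tex_to_unicode` and the partial accent table are assumptions made for illustration, and only single ASCII base letters are handled. The default form is NFC, which is enough to compose the characters; the NFKC form mentioned in the post can be passed instead.

```python
import re
import unicodedata

# TeX accent command -> Unicode combining mark (partial table, for illustration)
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "^": "\u0302",  # circumflex
    '"': "\u0308",  # diaeresis
    "~": "\u0303",  # tilde
    "v": "\u030C",  # caron (hacek)
    "c": "\u0327",  # cedilla
}

# Matches \'e, \'{e}, \v{S} and \c c. Letter commands (\v, \c) must be
# followed by whitespace or "{" so that e.g. \cite is left alone.
ACCENT = re.compile(
    r"\\(?:(['`^\"~])\s*|([vc])(?:\s+|(?=\{)))"
    r"(?:\{([A-Za-z])\}|([A-Za-z]))"
)

# Braces that wrap exactly one converted character, e.g. {e + combining acute}
WRAPPED = re.compile(r"\{([A-Za-z][\u0300-\u036f]+)\}")


def tex_to_unicode(text: str, form: str = "NFC") -> str:
    def repl(m):
        cmd = m.group(1) or m.group(2)
        base = m.group(3) or m.group(4)
        # The combining mark goes after the character it modifies.
        return base + COMBINING[cmd]

    text = ACCENT.sub(repl, text)
    text = WRAPPED.sub(r"\1", text)
    # Normalising composes base + mark into a single code point where one exists.
    return unicodedata.normalize(form, text)
```

With this sketch, `tex_to_unicode("{\\v{S}}af{\\'a}\\v{r}ov{\\'a}")` gives "Šafářová" and `tex_to_unicode("Fran{\\c c}ais")` gives "Français". New accents only need a new table entry rather than one rule per accented letter.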
Jon wrote: You have to tell the BibTeX import filter what order the names are in (surname first or last). In this case it is of course surname first.
Yes, I figured I was in the wrong Format Manager, and the comma does not seem to derange BE, so that problem is solved.

Thanks again,
Bernhard

Post by Jon »

Hi,

Hm, I'm afraid that, as you say, Bookends will capitalize the contents of the title field regardless of language (if you tell it to in the format). A workaround would be to put the title in a user-defined field. But that has its own set of issues...

As for the TeX conversion, yes, it's on a case-by-case basis. The idea of using the accents themselves to create the Unicode (decomposed?) and then normalize is interesting, but I'd have to look into that a lot more.

Jon
Sonny Software

Post by teoric »

Hi,

thanks for the info. This is a minor issue, of course, but it can mean a lot of “little work”

Post by Jon »

I appreciate the posts. :-)

I do add more TeX conversions periodically (and yes, generally when other development efforts aren't pressing), and I'll take a look at these.

Thanks for your interesting feedback.

Jon
Sonny Software

Post by teoric »

Jon wrote:I appreciate the posts. :-)
Thanks, that's too kind.
Jon wrote:I do add more TeX conversions periodically (and yes, generally when other development efforts aren't pressing), and I'll take a look at these.

Thanks for your interesting feedback.
Thanks for ‘listening’. So I'll wait for the things to come and will see if I can use BE as a replacement for or as a complement to BibDesk.

Bernhard

PS: If I buy a student version now, do I have to upgrade once I finish in summer? :roll:

Post by Jon »

teoric wrote:If I buy a student version now, do I have to upgrade once I finish in summer? :roll:
Hi,

No. Historically it's been 2-3 years between charged-for upgrades, and Bookends 10 was released in July of this year.

Jon
Sonny Software

Post by teoric »

Jon wrote: No. Historically it's been 2-3 years between charged-for upgrades, and Bookends 10 was released in July of this year.
Thanks, I was rather wondering about the ‘legal’ issue (no longer a student => upgrade to ‘real license’) than about the update period. I got a license now (the €–$ rate easing the decision).

BIBTeX import. I did the conversion from my BIBTeX database the following way: export to UTF-8 via BibDesk, then treat special characters ("--","---", \l, \v, \c...) and import as UTF-8 rather than Mac-Roman. I discovered four issues that might also afflict others, so I mention them here.
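
The "treat special characters" step could be sketched as below for the dashes and \l. This is an assumed reconstruction for illustration, not the script actually used for the conversion; the function name `convert_dashes_and_l` is hypothetical. Note that it also rewrites any "--" inside URLs or file paths, so those fields may need to be excluded.

```python
import re


def convert_dashes_and_l(text: str) -> str:
    # Replace the longer sequence first so "---" is not split into "--" + "-".
    text = text.replace("---", "\u2014")  # em dash
    text = text.replace("--", "\u2013")   # en dash
    # {\l} or \l followed by a non-letter becomes the Polish l with stroke;
    # commands that merely start with "\l" (e.g. \label) are left alone.
    text = re.sub(r"\{\\l\}|\\l(?![A-Za-z])\s*", "\u0142", text)
    return text
```

Applied to the sample entries, `Pages = {333--348}` becomes `Pages = {333–348}`.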

BibTeX key. The BibTeX key field (user1) is used by some formats (e.g. MLA, DIN 1505) for something else; so when I now use my freshly imported references with these formats, I get the BIBTeX key in my formatted reference. Is there any way to easily change this interaction? I see two ways: (a) change the BIBTeX importer and exporter (or at least the latter) and move all BIBTeX keys to e.g. user16 or (b) change each of the user1 bibliography formats to use some other user field in the references. This of course means I should not share references with others and have to mirror any format changes e.g. in Bookends DIN 1505 format to my personal variant, doesn't it?

This issue seems to have come up earlier (http://www.sonnysoftware.com/phpBB2/viewtopic.php?t=26), but it hasn't been solved in a simple way, has it?


Book Chapters. Many of my BIBTeX references were @incollection; many of these show up as journal articles in Bookends. I can change this, but it's surprising.


Grey Literature. It seems that there is no format corresponding to BIBTeX's @Unpublished or @Manuscript, so I would have to implement these via an unused... field, wouldn't I? (I noticed someone else asked for it – isn't grey literature so common that it might become a standard type in Bookends some day?)


URLs. There seems to be no uniform way to ‘enable URL support’ in Bookends, as you can do in BIBTeX/custombib (e.g. for referencing on-line versions/preprints of Journal articles). That means one has to add this to every bibliography format, doesn't it?

Grateful for any help,
and merry Christmas / season's holidays,
Bernhard

Post by Jon »

teoric wrote: Thanks, I was rather wondering about the ‘legal’ issue (no longer a student => upgrade to ‘real license’) than about the update period.
If you buy a license while you are a student you do not need to pay anything additional when you graduate.

teoric wrote: The BibTeX key field (user1) is used by some formats (e.g. MLA, DIN 1505) for something else; so when I now use my freshly imported references with these formats, I get the BIBTeX key in my formatted reference. Is there any way to easily change this interaction?
None of the formats we create use User1, just for this reason. This includes MLA (if you see differently, please let me know). DIN 1505 was contributed by a user. If you, or any other German-speaking user, would like to change this format by substituting a more appropriate field for "u1" in the format, please do so and send it to me. I'll include it in the next update.

teoric wrote: Many of my BIBTeX references were @incollection; many of these show up as journal articles in Bookends. I can change this, but it's surprising.
Easily fixed. Open the BibTeX import filter, click Edit Type Definitions, and enter ",Incollection" after "Inbook" (it will now be imported as a Book Chapter). I'll include this as the default in the next update.

teoric wrote: It seems that there is no format corresponding to BIBTeX's @Unpublished or @Manuscript, so I would have to implement these via an unused... field, wouldn't I? (I noticed someone else asked for it – isn't grey literature so common that it might become a standard type in Bookends some day?)
Bookends doesn't have an Unpublished Type. You can add your own, and map the BibTeX types there. We can of course add more Types, but I'd like to add several at once. I'd welcome feedback from people on what new Types we should add.

teoric wrote: There seems to be no uniform way to ‘enable URL support’ in Bookends, as you can do in BIBTeX/custombib (e.g. for referencing on-line versions/preprints of Journal articles). That means one has to add this to every bibliography format, doesn't it?
I'm not sure what you mean here. If you want the url output in a format, then yes, you'll have to specify that in the format itself.

Jon
Sonny Software

Post by teoric »

Jon wrote: None of the formats we create use User1, just for this reason. This includes MLA (if you see differently, please let me know). DIN 1505 was contributed by a user. If you, or any other German-speaking user, would like to change this format by substituting a more appropriate field for "u1" in the format, please do so and send it to me. I'll include it in the next update.
As for MLA, the Conference Proceedings reference definition is:

a. “t.”

Post by Jon »

Hi,

The MLA conference proceedings indeed has a typo, thanks for pointing it out. I've changed it to this

a. “t.”

Post by teoric »

Thanks for the fix! And thanks for looking into the commas (no idea who wants to see 6 editors, but …)

Cheers,
Bernhard