Scanning a LaTeX file and utf-8 unicode problems

talazem
Posts: 71
Joined: Sat Jan 14, 2006 5:18 pm

Post by talazem »

First of all, thanks for the Universal update, Jon!

I'm trying to use Bookends with LaTeX (I'm using TeXShop, and all my files are encoded in utf-8, because I'm using XeLaTeX). I have two questions here, one about usage theory, and one to solve a practical problem I'm having:

1. usage theory: when should one just export the Bookends file as .bib, place it in the .tex file's folder, and do the bibliography from within LaTeX; and when should one use the Bookends "Scan Document" feature? What is the difference between the two as far as real output, or process? (What crossed my mind was that I would be able to use Bookends' built-in formats, e.g. Chicago, while if I ran it off as a .bib file inside LaTeX, I'd have to define the citation format...correct?)

2. practical problem: I was experimenting with Scan Document (so that I could just answer #1 by myself ;) ), but I keep running into a problem that has to do with encodings, I assume. My trial LaTeX document has four citations. The LaTeX document is encoded as utf-8, and three of the four citations have unicode utf-8 characters in them (diacritics, to be exact). So, I go to Biblio > Scan a Document..., and I choose my file, and then I choose the following settings:
* Generate a bibliography after scan [Scan Using ... APA/5th Edition]
* Send bibliography to (Bibliography window)
* Generate bibliography as (BibTeX)
etc....

Now, the problem is: no matter what settings I choose in that pane above, when it comes to run the scan, it finds my four citations from the LaTeX file just fine, BUT it refuses to read the unicode characters correctly: instead, it replaces them with other characters. So, instead of "Bājūrī", I get "BƒÅj≈´rƒ´"; instead of "Dhahabī", I get "Dhahabƒ´".
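
The "ƒ´" that replaces "ī" is what you would expect if the UTF-8 bytes of each accented letter were being decoded as Mac OS Roman somewhere during the scan. That is only a guess at the cause; nothing in this thread confirms what Bookends does internally. A minimal Python sketch of the effect (the function names `garble` and `ungarble` are made up for illustration):

```python
def garble(text: str) -> str:
    """Encode as UTF-8, then (wrongly) decode the bytes as Mac OS Roman."""
    return text.encode("utf-8").decode("mac_roman")


def ungarble(text: str) -> str:
    """Reverse the mistake: re-encode as Mac OS Roman, decode as UTF-8."""
    return text.encode("mac_roman").decode("utf-8")


dhahabi = "Dhahab\u012b"           # "Dhahabī"
bajuri = "B\u0101j\u016br\u012b"   # "Bājūrī"

print(garble(dhahabi))                     # Dhahabƒ´
print(garble(bajuri))                      # BƒÅj≈´rƒ´
print(ungarble(garble(bajuri)) == bajuri)  # True
```

Because every byte value has a Mac OS Roman character in Python's `mac_roman` codec, the round trip loses nothing here, so text garbled this way can in principle be recovered.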

I even went to the preferences, and played with all the BibTeX conversion settings, but I couldn't get it to work. I am sure that my .tex file is utf-8, and obviously my Bookends database is utf-8. Somewhere in the middle something is getting translated and lost.
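
One quick way to double-check the "my .tex file is utf-8" part is to try decoding the file's raw bytes strictly. A small Python sketch; the helper name `is_valid_utf8` and the file name `trial.tex` are placeholders, not anything from Bookends or TeXShop:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")  # strict by default: raises on invalid sequences
        return True
    except UnicodeDecodeError:
        return False


# Demonstration on in-memory bytes rather than a real file.
good = "Dhahab\u012b".encode("utf-8")     # "Dhahabī" stored as UTF-8
bad = "Bj\u00f6rklund".encode("latin-1")  # "Björklund" stored as Latin-1

print(is_valid_utf8(good))  # True
print(is_valid_utf8(bad))   # False

# For a real file (hypothetical name):
# with open("trial.tex", "rb") as f:
#     print(is_valid_utf8(f.read()))
```

A True result only shows that the bytes are consistent with UTF-8; it says nothing about how another program will choose to read them.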

Help?
talazem
Posts: 71
Joined: Sat Jan 14, 2006 5:18 pm

Post by talazem »

Just wondering: is there a reason why in the [Bibliography Formatter], we can choose utf-8 for export, but in [Scan A Document > Generate a Bibliography], utf-8 is not an option?

The reason I'm asking is that I'm noticing another negative (though minor) side effect of my troubles mentioned above; namely, that after trying (and failing) to export as described above, the [Bibliography Formatter] automatically switches to BibTeX from my setting of "utf-8". As such, even the "View Formatted" preview pane on my main Bookends window comes up garbled, with all the unicode lost, until I go back to the [Bibliography Formatter] and manually reset it to utf-8.

In short, I'm even more confused as to what's going on.
Jon
Site Admin
Posts: 10292
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: Scanning a LaTeX file and utf-8 unicode problems

Post by Jon »

talazem wrote: 1. usage theory: when should one just export the Bookends file as .bib, place it in the .tex file's folder, and do the bibliography from within LaTeX; and when should one use the Bookends "Scan Document" feature? What is the difference between the two as far as real output, or process? (What crossed my mind was that I would be able to use Bookends' built-in formats, e.g. Chicago, while if I ran it off as a .bib file inside LaTeX, I'd have to define the citation format...correct?)
Honestly, I'm not the guy to ask (I don't use LaTeX/BibTeX or any other TeX). Maybe someone else on the forum can help here.
talazem wrote: Now, the problem is: no matter what settings I choose in that pane above, when it comes to run the scan, it finds my four citations from the LaTeX file just fine, BUT it refuses to read the unicode characters correctly: instead, it replaces them with other characters. So, instead of "Bājūrī", I get "BƒÅj≈´rƒ´"; instead of "Dhahabī", I get "Dhahabƒ´".
Did you try

Biblio -> Bibliography Formatter -> output as UTF-8?

Jon
Sonny Software
talazem
Posts: 71
Joined: Sat Jan 14, 2006 5:18 pm

Post by talazem »

Yes, as I mentioned above. But there is something going on within Bookends itself, because, as I also mentioned, every time I try to export like that (and it doesn't work), after closing that screen, my Bookends database's View Formatted pane is no longer utf-8; I have to go back into Biblio > Bibliography Formatter, and change the "as" back to utf-8 again.

In other words, something's happening within the BE system.
Jon
Site Admin
Posts: 10292
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Post by Jon »

Please contact me directly and we'll try to figure out what the problem is.

Jon
Sonny Software
talazem
Posts: 71
Joined: Sat Jan 14, 2006 5:18 pm

Post by talazem »

Will do...

...but I'd still love to hear from anyone else about question #1...when do you export a .bib file, and when do you scan your .tex from within BE? Does anyone do that?
boris
Posts: 12
Joined: Wed Oct 10, 2007 1:41 am

Post by boris »

I use Mellel+Bookends for WYSIWYG and TeXShop+BibDesk for LaTeX. I bought Bookends recently so I might be wrong, but it looks as if import and export work. Some of the BibTeX fields don't show up in Bookends, but that is not surprising. Before you ask why I use two environments: Mellel is good and fast, but LaTeX is great and not so fast. Cross-referencing is one of the reasons I prepare final manuscripts using LaTeX. In Mellel I have to write "...Figure 3.5 on page 47..." and then everything is wrong after one or two weeks. There are several other features in LaTeX which make it a must for demanding work.
Boris
------------------------
Quod non est in actis non est in mundo
boris
Posts: 12
Joined: Wed Oct 10, 2007 1:41 am

Post by boris »

I forgot to mention that as I prepare final versions in LaTeX, I use the TeX convention for non-ASCII characters (example: Bj{\"o}rklund). I have not tested Bookends' ability to deal with non-ASCII character conversion between LaTeX and utf-8. I am slightly pessimistic. As soon as I have time I will test it and make a feature request.
Boris
------------------------
Quod non est in actis non est in mundo
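
The TeX convention boris describes (writing ö as {\"o}) can be produced mechanically from utf-8 text. A rough Python sketch follows; it is not a description of what Bookends or BibDesk actually do. The `to_tex` function and its small `ACCENTS` table are made up for illustration and cover only a few combining marks:

```python
import unicodedata

# Combining mark -> TeX accent command (small, deliberately incomplete table).
ACCENTS = {
    "\u0308": '"',  # diaeresis, as in ö
    "\u0301": "'",  # acute, as in é
    "\u0300": "`",  # grave, as in è
    "\u0304": "=",  # macron, as in ā
}


def to_tex(text: str) -> str:
    """Rewrite letters carrying one known accent as {\\<accent><letter>}."""
    out = []
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        if len(decomposed) == 2 and decomposed[1] in ACCENTS:
            base, mark = decomposed
            out.append("{\\" + ACCENTS[mark] + base + "}")
        else:
            out.append(ch)  # plain characters and unknown accents pass through
    return "".join(out)


print(to_tex("Bj\u00f6rklund"))  # Bj{\"o}rklund
print(to_tex("Dhahab\u012b"))    # Dhahab{\=i}
```

One caveat: traditional LaTeX setups expect a dotless i (\i) under an accent, so the second output may need to be {\=\i} depending on the font encoding in use. The sketch does not handle that case.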