To make this a killer app: convert PDF to reference

Jon
Site Admin
Posts: 10084
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: autolinking pdfs and BE would be great

Post by Jon »

Harry Lime wrote: Having a way to automagically link one's large library of pdfs with the Bookends database would be great. I am sure that most of the pdfs I want linked are already present in my database, just unlinked. So how about taking the abstract field from Bookends and searching (w/ Spotlight?) the folder of pdfs and looking for a match?
Stay tuned (in Vienna as well as elsewhere).
I would like some help with a related issue. If I have pdfs attached and I use Bookends on my desktop and notebook computers, and synchronize the database and a folder full of pdfs, the links get messed up since the files do not have the exact same path.
I suggest using the Bookends default folder as the place for your attachments. In that case, it won't make any difference what computer you are on (it will be in ~/Library/Application Support/Bookends/Attachments).

Jon
Sonny Software
Harry Lime
Posts: 11
Joined: Sun Sep 24, 2006 10:21 pm

nice added feature - a few comments

Post by Harry Lime »

When you say stay tuned, you mean it!

This is a great added feature. I tried it briefly and may not fully understand how it is working but have a bit of initial feedback.

1) It might work faster if one could limit the searching to a particular directory of pdfs (as an option).

2) More importantly, in at least one case, a record found one pdf and matched it "automagically" but it turned out to be a review that contained the full title for the reference. I do not know what you are using to search with, but this was one reason I suggested using text from the abstract, though I had read somewhere that Spotlight won't do full content searching (i.e., exact matching of long strings). Is that true?

Re: nice added feature - a few comments

Post by Jon »

Harry Lime wrote: 1) It might work faster if one could limit the searching to a particular directory of pdfs (as an option).
You can't limit Spotlight searches to a directory (or at least I haven't seen a way to).
2) More importantly, in at least one case, a record found one pdf and matched it "automagically" but it turned out to be a review that contained the full title for the reference. I do not know what you are using to search with, but this was one reason I suggested using text from the abstract, though I had read somewhere that Spotlight won't do full content searching (i.e., exact matching of long strings). Is that true?
Bookends does use information in the Abstract in the search. Does the reference you used contain one? And yes, Spotlight has problems with long strings, which is the major reason for failure to find a correct match in Bookends.

Jon
Sonny Software

Post by Harry Lime »

Jon wrote: You can't limit Spotlight searches to a directory (or at least I haven't seen a way to).
There must be a way to do this, as it is possible in the Finder to focus the search to a directory, and it is also implemented in the simple Spotlight frontend called "notlight".
Jon wrote: Bookends does use information in the Abstract in the search. Does the reference you used contain one? And yes, Spotlight has problems with long strings, which is the major reason for failure to find a correct match in Bookends.
This may have been my error. I may have unknowingly hit return to select the first hit. I seem to be getting quite a few hits for some papers and none for others. Focusing to a directory would help some. My sense is that the Spotlight search criteria could be tweaked a bit, since I can find pdfs for some articles with Spotlight that aren't found directly from Bookends (albeit from a relatively long list of hits).

Post by Jon »

Well, if you see how to do this, let me know. My guess is that the searches are actually system-wide, but the utility filters out any except the ones in the folder you specified. But perhaps there really is something in the API I missed.
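
The guess above (search system-wide, then discard hits outside the chosen folder) can be sketched roughly as follows. This is only an illustration in Python, not how notlight or Bookends is actually written; the helper name `filter_hits_to_folder` and the sample paths are made up.

```python
from pathlib import PurePosixPath


def filter_hits_to_folder(hits, folder):
    """Keep only the hit paths that lie inside `folder` (hypothetical helper)."""
    root = PurePosixPath(folder)
    kept = []
    for hit in hits:
        path = PurePosixPath(hit)
        # A hit qualifies if the folder is one of its ancestor directories.
        if root in path.parents:
            kept.append(hit)
    return kept


# Made-up system-wide results, for illustration only.
system_wide_hits = [
    "/Users/harry/Papers/smith2005.pdf",
    "/Users/harry/Desktop/review.pdf",
    "/Users/harry/Papers/sub/jones2006.pdf",
]
papers_only = filter_hits_to_folder(system_wide_hits, "/Users/harry/Papers")
print(papers_only)
```

Filtering after the fact like this narrows the list of hits but would not make the underlying search any faster, since the whole index is still queried.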

As for the "best" search, I'm convinced there isn't any. I spent a lot of time with this, and there is always a trade-off. Some searches are more selective but miss the correct hit. Some are broader but get too many incorrect hits. Spotlight is far from perfect, but I think the way it's implemented in Bookends does a pretty good job, and for cases that miss you can do it the old-fashioned way -- find the pdf and attach it yourself.

Jon
Sonny Software

Post by Harry Lime »

Hi Jon,
I ain't a programmer, but I was challenged...

Here is some of the output from "man mdfind":

NAME
     mdfind -- finds files matching a given query

SYNOPSIS
     mdfind [-live] [-onlyin directory] query

DESCRIPTION
     The mdfind command consults the central metadata store and returns a list of files that match the given metadata query. The query can be a string or a query expression.

     -onlyin dir
             Limit the scope of the search to the directory specified.
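
For illustration, the -onlyin option from the man page excerpt could be driven from a script along these lines. This is a minimal Python sketch, not anything Bookends does; the helper names, the folder, and the query string are all assumptions made up for the example.

```python
import os
import shutil
import subprocess


def build_mdfind_command(query, directory):
    """Return the argument list for an mdfind search limited to `directory`."""
    # -onlyin is the option quoted from the man page.
    return ["mdfind", "-onlyin", directory, query]


def search_folder(query, directory):
    """Run the search and return matching paths (empty if mdfind is absent)."""
    if shutil.which("mdfind") is None:
        return []  # not on a Mac, or the Spotlight tools are unavailable
    result = subprocess.run(
        build_mdfind_command(query, directory),
        capture_output=True,
        text=True,
        check=False,
    )
    return [line for line in result.stdout.splitlines() if line]


# Hypothetical folder and query, for illustration only.
pdf_folder = os.path.expanduser("~/Documents/PDFs")
print(build_mdfind_command("mitochondrial fusion", pdf_folder))
```

Calling `search_folder("mitochondrial fusion", pdf_folder)` on a Mac would return whatever paths Spotlight reports under that folder.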

Post by Jon »

Those are Terminal commands? If so, that's not using the Spotlight API.

But it makes little difference with regard to speed in any case -- once a single search has been done, subsequent searches are very fast (caching).

Jon
Sonny Software

Post by Harry Lime »

A few more comments on the pdf-linking.

1) I don't fully understand the caching you refer to. If I search for paper A, does the cache generated speed up the search for paper B?

2) I have had a few cases where the routine finds a single hit and assigns it and it may not be correct. It would be good to allow the user to approve/disapprove.

3) Some feedback (Growl notification?) would be good to note that no matches were found.

Concerning focusing the search with the -onlyin option, this may be the API equivalent, which I found in this document:
http://developer.apple.com/documentatio ... archScopes:



setSearchScopes:
Restrict the search scope of the receiver.
-(void)setSearchScopes:(NSArray *)scopes
Parameters
scopes
Array of NSString or NSURL objects that specify file system directories. You can also include the predefined search scopes specified in “Constants”.

Post by Jon »

Harry Lime wrote: A few more comments on the pdf-linking.

1) I don't fully understand the caching you refer to. If I search for paper A, does the cache generated speed up the search for paper B?
It's not my code, of course (it's Apple's). But try it yourself. In my experience, a second Spotlight search for a new set of parameters is much faster than the first search.

2) I have had a few cases where the routine finds a single hit and assigns it and it may not be correct. It would be good to allow the user to approve/disapprove.
I think that's too intrusive. You have that already (in a more passive way). Keep the Attachment Inspector open and you'll see what was attached immediately. If you don't like it, use the Action menu to move it to the Trash.

Bookends does ask you to approve/disapprove/select if there is more than one possible match...
3) Some feedback (growl notification?) would be good to note that no matches were found.
The feedback is that the paper clip icon won't show up.

setSearchScopes:
Yes, I believe that's it, thanks.

Jon
Sonny Software