Retrieving Call Num at LoC

Users asking other users for AppleScripts that work with Bookends.
Fritz
Posts: 4
Joined: Thu Jul 05, 2012 5:12 pm

Retrieving Call Num at LoC

Post by Fritz »

I wanted to update the "Call Num" field (user5) in my database with the Library of Congress Classification Number (LCCN). Sorting on "Call Num" would then put my database into subject heading order.

I first built a new format in the Format Manager to output just the ISBN. I created a text file (ISBN.txt) that held these numbers, one per line.

I then built a Perl script that used the Z39.50 gateway to interrogate the Library of Congress (LoC) and return a MARC record, which was parsed to extract the LCCN. The ISBN and the LCCN were combined as output to an SQL file. The appropriate SQL command is: <UPDATE thereferences SET user5 = 'LCCN' WHERE user6 = 'ISBN' ; >. This SQL file can be used by Valentina-DB (http://www.valentina-db.com/) which is the underlying data engine to Bookends. There is one line in the SQL file for each successful retrieval.
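For anyone who would rather generate the SQL file outside Perl or AppleScript, here is a minimal Python sketch of building that UPDATE line. The table and column names (thereferences, user5, user6) come from the command above. The function name make_update is hypothetical, and doubling embedded single quotes is the standard SQL convention, so whether Valentina accepts it is an assumption worth checking on a copy of the database.

```python
def make_update(lccn: str, isbn: str) -> str:
    """Build one UPDATE line: store lccn in user5 where user6 holds isbn."""

    def quote(value: str) -> str:
        # Assumption: standard SQL escaping (double any single quote).
        return "'" + value.replace("'", "''") + "'"

    return f"UPDATE thereferences SET user5 = {quote(lccn)} WHERE user6 = {quote(isbn)} ;"


if __name__ == "__main__":
    # Example values only; any call number / ISBN pair works the same way.
    print(make_update("PT2635.E68", "0449202496"))
```

Writing one such line per successful lookup gives a file of the same shape as the SQL file described above.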

The Perl script was very difficult to write, the Z39.50 module was difficult to install, and the script ran slowly (over 4 hours for 3221 ISBNs, with 2359 LCCNs successfully retrieved). The success rate (about 73%) suggested that I needed to look elsewhere for a solution. [4.5 hrs / 3221 records = 5 seconds/query]

My neighbor, who is a librarian, suggested I research WorldCat. I found an experimental API (http://oclc.org/developer/documentation ... /using-api).

I built an AppleScript that steps through the ISBNs (just like before) and queries the OCLC URL. WorldCat returns XML that can be parsed to extract the LCCN. I built another set of ISBNs (most of which did not have records at the LoC). My AppleScript processed these 828 ISBNs and, after each successful retrieval, built the appropriate SQL command (just like above). The AppleScript ran in 20 minutes. [20 min / 828 records = about 1.5 seconds/query]
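The parsing step can also be sketched in Python with only the standard library. This is an illustration, not the script used for the run above. The element and attribute names (response/code, recommendations, lcc, mostPopular, sfa) are the ones the AppleScript in this post reads. The sample document, its namespace and its holdings value are assumptions made up for the example, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample response. Only the element/attribute names are taken
# from the AppleScript; the namespace and values are illustrative.
SAMPLE = """<classify xmlns="http://classify.oclc.org">
  <response code="2"/>
  <recommendations>
    <lcc>
      <mostPopular holdings="100" sfa="PT2635.E68"/>
    </lcc>
  </recommendations>
</classify>"""


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def find_child(parent, name):
    """Return the first direct child with the given local name, or None."""
    if parent is None:
        return None
    for child in parent:
        if local_name(child.tag) == name:
            return child
    return None


def extract_lcc(xml_text: str):
    """Return the most popular LCC from a Classify-style response, or None."""
    root = ET.fromstring(xml_text)
    response = find_child(root, "response")
    if response is None or int(response.get("code", "999")) > 2:
        return None  # same cut-off as the AppleScript: codes above 2 are skipped
    lcc = find_child(find_child(root, "recommendations"), "lcc")
    most_popular = find_child(lcc, "mostPopular")
    return most_popular.get("sfa") if most_popular is not None else None


if __name__ == "__main__":
    print(extract_lcc(SAMPLE))
```

Matching on the local tag name keeps the sketch working whether or not the response declares a namespace.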

The AppleScript routine is easy to understand and modify; the underlying XML modules are freely available on the web; and it's so much faster!

The SQL files run in less than 1 second! The Valentina-DB engine is just awesome!

There are several drawbacks: (1) you need to buy Valentina-DB ($199), (2) you need to find version 4.9.1 to work with Bookends 11 (easy to do), and (3) you need to work with a copy of your database just in case something doesn't work (cumbersome but necessary).

Here is the AppleScript itself (the comments should clarify the installation):

# This script needs XML Tools.osax installed in ~/Library/ScriptingAdditions.
# It is freely available from: http://www.latenightsw.com/freeware/XML ... index.html.

# the following routines were cut and pasted from:
# http://www.latenightsw.com/freeware/XML ... ities.html

on getAnElement(theXML, theElementName)
    -- find and return a particular element (this presumes there is only one instance of the element)

    repeat with anElement in XML contents of theXML
        if class of anElement is XML element and ¬
            XML tag of anElement is theElementName then
            return contents of anElement
        end if
    end repeat

    return missing value
end getAnElement

on getElementFromPath(theXML, theElementPath)
    if theElementPath is {} then
        return theXML
    else
        local foundElement

        set foundElement to getAnElement(theXML, item 1 of theElementPath)
        if foundElement is not missing value and ¬
            class of foundElement is XML element then
            return getElementFromPath(foundElement, rest of theElementPath)
        else
            return missing value
        end if
    end if
end getElementFromPath

on getElementValue(theXML)
    if theXML is missing value or theXML is {} then
        return ""
    else if class of theXML is string then
        return theXML
    else
        try
            return item 1 of XML contents of theXML
        on error number -1728
            return ""
        end try
    end if
end getElementValue

# This AppleScript assumes that all the files are on the Desktop
# I like the Desktop because it encourages you to trash files that are no longer being used
# You will need to change 'Fritz' (my account name) to your account name
# ISBN.txt should already have been created by Bookends
# Work with a small number of ISBNs at first, say 10-20
# so that you know this routine will work for you
# Temporary.xml is deleted at the end of each iteration, so nothing is left behind after the last ISBN
# SQLquery.sql will be the input file for 'Load Dump' in Valentina

set input_id to open for access ¬
    file "Macintosh HD:Users:Fritz:Desktop:ISBN.txt"
set output_id to open for access ¬
    file "Macintosh HD:Users:Fritz:Desktop:SQLquery.sql" with write permission
set eof output_id to 0

set fileContents to paragraphs of (read input_id) # read whole input file into a list with a final empty paragraph
set ISBN_count to length of fileContents


repeat with i from 1 to ISBN_count - 1 #.........start major loop
    set ISBN to item i of fileContents

    if i mod 10 = 0 then # show progress every 10th record
        display alert "Working on ISBN record #" & i buttons {} giving up after 1 # very cheap progress meter
    end if

    # See e.g. http://oclc.org/developer/documentation ... /using-api
    set theURL to "http://classify.oclc.org/classify2/Classify?isbn=" & ISBN & "&summary=true"

    # curl -s (silent, no progress meter) -L (follow redirects) -o (output file)
    # 'quoted form of' keeps the shell from treating the & inside the URL as a command separator
    # the output file is removed after each use so a failed request cannot leave the previous response behind
    do shell script "curl -s -L -o /Users/Fritz/Desktop/Temporary.xml " & quoted form of theURL

    set theXML to parse XML alias "Macintosh HD:Users:Fritz:Desktop:Temporary.xml"
    set responseCode to code of XML attributes of getAnElement(theXML, "response")

    if responseCode as integer ≤ 2 then # good XML response back; otherwise skip
        set elementXML to getAnElement(getAnElement(theXML, "recommendations"), "lcc")
        if elementXML is not missing value then # skip records that have no LCC element

            set elementXML to getAnElement(elementXML, "mostPopular")
            set LCCN to sfa of XML attributes of elementXML
            set LCCN to LCCN & " OCLC" # add OCLC suffix to identify source
            set output_string to "UPDATE thereferences SET user5 = '" & LCCN & "' where user6 = '" & ISBN & "' ;"

            write output_string & return to output_id
        end if
    end if
    do shell script "rm /Users/Fritz/Desktop/Temporary.xml" # clean up for next iteration
end repeat #..........................................end major loop

close access input_id
close access output_id
display alert "Successful Completion!"
Farcas
Posts: 14
Joined: Sat Feb 15, 2014 10:20 pm

Re: Retrieving Call Num at LoC

Post by Farcas »

This project sounds awesome! Thanks for posting. I would love your help in working this into my data. The problem for me is that I don't have ISBNs for all of my data.

Would you mind breaking this down to help me get this project started? Where do I begin?

(If this doesn't belong on the forum, please email me.)


Thanks!
Fritz
Posts: 4
Joined: Thu Jul 05, 2012 5:12 pm

Re: Retrieving Call Num at LoC

Post by Fritz »

This is an update on my previous posting (4 Apr 2013).

I wanted to update the "Call Num" field (user5) in my database with the Library of Congress Call Number (let’s call it LCCN). The LCCN has this format (http://www.usg.edu/galileo/skills/unit0 ... 3_04.phtml) but I am only interested in the subject heading. Sorting on “Call Num” would then put my database into subject heading order.

I used the old format that I had previously built in the Format Manager to output just an ISBN. The Field Order holds ‘u6.’ I sorted my Bookends database on Call Num so that the blanks rose to the top. Selecting all the blanks and then, with the Shift key held down, clicking on one of the hit boxes makes a hit list of all the blanks.

Using Biblio > Bibliography with my ISBN format, I created a text file called “ISBN.txt”.

The XMLTools2.9.4.dmg can now be downloaded from http://latenightsw.com/freeware/xml-tools/. You need to follow their installation guide and create a new folder ~/Library/ScriptingAdditions and copy XMLTools.osax therein.

With ISBN.txt on your Desktop the AppleScript from my previous posting is ready to fly.

Initially I had 1146 hits. Because I use Amazon to “Autofill from the Internet”, many of my ISBNs are only for Amazon lookups (and therefore will not work with OCLC). Manually deleting those entries left me with 971 usable ISBNs. The AppleScript ran in 12 minutes on a quiet Super Bowl Sunday (5 Feb 2017). I had 734 entries written to my SQL file (SQLquery.sql).
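Part of that manual weeding could be automated by dropping any line of ISBN.txt that is not a well-formed ISBN. The sketch below is a Python illustration, not something used for the run above. It assumes that the Amazon-only entries fail the standard ISBN-10/ISBN-13 check-digit test (for example because they are Amazon identifiers rather than ISBNs), which may not hold for every database. The function names and the "B000000000" placeholder are hypothetical.

```python
def is_valid_isbn10(s: str) -> bool:
    """ISBN-10 check: weights 10..1, total divisible by 11; 'X' means 10 in the last place."""
    if len(s) != 10 or not s.isascii():
        return False
    body, check = s[:9], s[9].upper()
    if not body.isdigit() or not (check.isdigit() or check == "X"):
        return False
    digits = [int(c) for c in body] + [10 if check == "X" else int(check)]
    return sum(w * d for w, d in zip(range(10, 0, -1), digits)) % 11 == 0


def is_valid_isbn13(s: str) -> bool:
    """ISBN-13 check: alternating weights 1 and 3, total divisible by 10."""
    if len(s) != 13 or not s.isascii() or not s.isdigit():
        return False
    total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(s))
    return total % 10 == 0


def usable_isbns(lines):
    """Keep only the lines that pass an ISBN check, with hyphens removed."""
    keep = []
    for line in lines:
        candidate = line.strip().replace("-", "")
        if is_valid_isbn10(candidate) or is_valid_isbn13(candidate):
            keep.append(candidate)
    return keep


if __name__ == "__main__":
    # "B000000000" is a made-up stand-in for a non-ISBN entry.
    print(usable_isbns(["0449202496", "B000000000", "978-0-306-40615-7"]))
```

A passing check digit only shows that an entry is a well-formed ISBN; it does not guarantee that OCLC holds a record for it.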

The AppleScript is not sophisticated. It skips errors from faulty ISBNs, errors from ISBNs that refer to multiple print editions, in fact just any kind of error. I think those errors can only be addressed manually.

I downloaded the 32-bit version (5.8.8) of Valentina Studio (vstudio_5__mac) from http://valentina-db.com/download/prev_r ... .8/mac_32/. Fortunately, you can now obtain this for FREE with an easy registration at http://www.valentina-db.com. Valentina version 5.8.8 is what underlies Bookends 12.7.8. You MUST use the exact Valentina version that underlies Bookends!

As before I used a duplicate DB into which to load my new LCCNs (just in case something went south). The actual command in Valentina Studio is File > Load Dump and then specify SQLquery.sql on the desktop which has been created by the AppleScript run.
Fritz
Posts: 4
Joined: Thu Jul 05, 2012 5:12 pm

Re: Retrieving Call Num at LoC

Post by Fritz »

My previous post indicated that some ISBNs would need to have their corresponding LCCNs found manually. Here’s my suggestion:

I use a 13” MacBook for my Bookends database, so often the available screen real estate is much too small. (I dislike needing to open and hide windows on a repetitive basis.) I opted to use Duet (www.duetdisplay.com), which allows an iPad to double as a video display (2nd screen). You need to download the Mac app (free) and buy the iOS app for the iPad ($9.99). With your iPad physically connected to your MacBook's USB port, Duet installed on your MacBook and Duet running on your iPad, you have a very nice 2nd screen.

On the iPad screen I have my browser pointed to http://classify.oclc.org/classify2/. The browser window indicates where I can input an ISBN.

My Bookends Library Window (running on my main screen) has a Reference Panel on the right hand side. The Reference Panel is open to “Additional Fields” so that Call Num and ISBN are at the top of the Panel.

Now it’s just copy and paste. Highlight and copy the ISBN from Bookends and paste it into the right spot in your browser on your 2nd screen. Click “Search” and then deal with the holdings from OCLC.

For example, ISBN 0449202496 (All Quiet on the Western Front) returns 4 holdings at OCLC. I arbitrarily click on the first entry which shows an LCC (or as I call it, LCCN) of PT2635.E68. This is highlighted and pasted into the Bookends database.

I have just 400 more entries to do so it might take some time!

Any suggestions will definitely be appreciated.