Create new Internet record and Set Fields

Users asking other users for AppleScripts that work with Bookends.
jpottsx1
Posts: 4
Joined: Sun Sep 29, 2024 9:56 am

Create new Internet record and Set Fields

Post by jpottsx1 »

I am an admitted newbie with AppleScript, but I have cobbled together a script to retrieve from a website the basic information to create an Internet reference. How do I insert the data into the fields of a new blank record?

Code: Select all

tell application "Safari"
	activate
	open location "https://theconversation.com/future-evolution-from-looks-to-brains-and-personality-how-will-humans-change-in-the-next-10-000-years-176997"
	delay 5 -- Wait for the page to load
	
	tell front document
		-- Retrieve the page title
		set pageTitle to do JavaScript "document.title"
		if pageTitle is missing value or pageTitle is "" then
			set pageTitle to "Title not available"
		end if
		
		-- Retrieve the author's name
		try
			set authorName to do JavaScript "document.querySelector('meta[name=\"author\"]').content"
			if authorName is missing value or authorName is "" then error
		on error
			set authorName to "Author not available"
		end try
		
		-- Retrieve the writer's name (if available)
		try
			set writerName to do JavaScript "document.querySelector('meta[name=\"writer\"]').content"
			if writerName is missing value or writerName is "" then error
		on error
			set writerName to "Writer not available"
		end try
		
		-- Retrieve the publication year
		try
			set publicationYear to do JavaScript "document.querySelector('meta[property=\"article:published_time\"]').content.split('-')[0]"
			if publicationYear is missing value or publicationYear is "" then error
		on error
			set publicationYear to "Year not available"
		end try
		
		-- Retrieve the full publication date
		try
			set publicationDate to do JavaScript "document.querySelector('meta[property=\"article:published_time\"]').content"
			if publicationDate is missing value or publicationDate is "" then error
		on error
			set publicationDate to "Date not available"
		end try
		
		-- Retrieve the description
		try
			set pageDescription to do JavaScript "document.querySelector('meta[name=\"description\"]').content"
			if pageDescription is missing value or pageDescription is "" then error
		on error
			set pageDescription to "Description not available"
		end try
		
		-- Retrieve the keywords
		try
			set pageKeywords to do JavaScript "document.querySelector('meta[name=\"keywords\"]').content"
			if pageKeywords is missing value or pageKeywords is "" then error
		on error
			set pageKeywords to "Keywords not available"
		end try
		
		-- Retrieve the publisher's name
		try
			set publisherName to do JavaScript "document.querySelector('meta[property=\"og:site_name\"]').content"
			if publisherName is missing value or publisherName is "" then error
		on error
			try
				-- Fallback: Check for publisher in another meta tag
				set publisherName to do JavaScript "document.querySelector('meta[name=\"publisher\"]').content"
				if publisherName is missing value or publisherName is "" then error
			on error
				set publisherName to "Publisher not available"
			end try
		end try
	end tell
end tell

-- Retrieve the access date (current date)
set accessDate to (current date) as string

-- Output the results
-- `result` is a predefined AppleScript property, so use a separate variable name
set summaryText to "Title: " & pageTitle & return & "Author: " & authorName & return & "Writer: " & writerName & return & "Year: " & publicationYear & return & "Publication Date: " & publicationDate & return & "Description: " & pageDescription & return & "Keywords: " & pageKeywords & return & "Publisher: " & publisherName & return & "Access Date: " & accessDate
display dialog summaryText
Jon
Site Admin
Posts: 10200
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: Create new Internet record and Set Fields

Post by Jon »

I assume you've read the documentation on using AppleScript with Bookends in the User Guide, but for others I'll mention that it's in the section "Scripting Bookends with AppleEvents", starting on page 445 of the current release.

Here's an example of creating a new reference and filling in fields (p. 455):

tell application "Bookends"

tell front library window

make new publication item with properties {title:"My great new paper", publication date string:"2019/01/31"}

end tell

end tell

The field names to use are those in the database, not the human-readable labels, which are configurable. For example, if you want to put text in the "Access Date" field of an Internet reference, you'd refer to it as user3. If you were adding a Journal Article, user3 would be used to hold the name of the translator.

There are several examples of how field labels map to database fields in the User Guide, including the section beginning on p. 70.

Jon
Sonny Software
msteffens
Posts: 39
Joined: Thu Jul 19, 2007 10:04 am
Location: Germany

Re: Create new Internet record and Set Fields

Post by msteffens »

The script below extracts metadata from the frontmost Safari page and creates a new Bookends publication from that metadata.

The createBookendsPublication() handler shows how a Bookends publication can be created & populated from AppleScript.

The script should work well with journal article pages from publishers like PLoS, Nature, BioMed Central, PubMed, PubMed Central, etc., and reasonably well with news sites like the New York Times. However, since every site offers different metadata and value formats, the script would benefit from a lot more testing and tweaking. The JavaScript query patterns will likely need to be adapted to your most-used sources.
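
One way to try a new list of patterns before editing the script is to run the same first-match logic directly in JavaScript, for example in Safari's Web Inspector console. The sketch below is only an illustration: `firstMetaContent`, `publisherSelectors` and `stubPage` are made-up names, and the stub's site name is invented test data. Unlike the patterns in the script, which rely on AppleScript's `try` block to catch a missing tag, the sketch checks for a missing tag itself.

```javascript
// Sketch: try several <meta> selectors in order and return the first non-empty content.
// `firstMetaContent` is a hypothetical helper, not part of the script in this thread.
function firstMetaContent(doc, selectors) {
  for (const selector of selectors) {
    const element = doc.querySelector(selector);
    const content =
      element && typeof element.content === "string" ? element.content.trim() : "";
    if (content !== "") return content;
  }
  return ""; // nothing matched, same convention as executeJavascript()
}

// Selectors taken from the script's `publisherQuery` property, in the same order.
const publisherSelectors = [
  'meta[name="dc.publisher"]',
  'meta[name="DC.Publisher"]',
  'meta[name="citation_publisher"]',
  'meta[property="og:site_name"]',
  'meta[name="publisher"]',
];

// Minimal stand-in for a page that only offers og:site_name (invented test data),
// so that the sketch also runs outside a browser.
const stubPage = {
  querySelector(selector) {
    const tags = { 'meta[property="og:site_name"]': { content: " Example News Site " } };
    return tags[selector] || null;
  },
};

const stubPublisher = firstMetaContent(stubPage, publisherSelectors);
console.log(stubPublisher);

// In Safari's Web Inspector console, the real page can be passed instead:
if (typeof document !== "undefined") {
  console.log(firstMetaContent(document, publisherSelectors));
}
```

Once a selector order gives good results for a site, the corresponding entries can be added to the matching `...Query` property in the script.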

As an alternative approach, if you're able to extract a common bibliographic identifier (like a DOI, PMID, etc) from your web site, then it may be better to use Bookends' own "Quick Add" feature which can also be scripted.
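
If you go the identifier route, the value found in a meta tag often carries a prefix (the script's `doiQuery` patterns already strip a leading `doi:`). The following sketch, which assumes a deliberately loose DOI pattern and a hypothetical helper name `bareDOI`, shows one way to reduce such a value to a bare DOI. It is not a full DOI validator, and it does not show the Quick Add call itself; see the User Guide for that.

```javascript
// Hypothetical helper: reduce a raw meta-tag value to a bare DOI,
// or return "" if nothing DOI-like is found.
// The pattern is an assumption: "10.", 4 to 9 digits, "/", then any non-space characters.
function bareDOI(rawValue) {
  if (typeof rawValue !== "string") return "";
  const match = rawValue.trim().match(/10\.\d{4,9}\/\S+/);
  return match ? match[0] : "";
}

// Example inputs; the DOI is the one from the BioMed Central URL in the script's TODO list.
const examples = [
  "doi:10.1186/s13059-024-03351-2",
  "https://doi.org/10.1186/s13059-024-03351-2",
  "10.1186/s13059-024-03351-2",
  "not a DOI",
];

const results = examples.map(bareDOI);
console.log(results);
```

The first three inputs all reduce to the same bare DOI, and the last one yields an empty string, which a calling script can treat as "no identifier found" and fall back to the metadata approach.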

Code: Select all

-- Extracts page metadata for the frontmost web page that's currently displayed in Safari and
-- creates a new Bookends "Internet" or "Journal article" publication with these metadata.

-- by Matthias Steffens, keypoints.app

-- TODO: better extraction of multiple authors & keywords
-- TODO: create publication types other than "Internet" or "Journal article" based on the given metadata
-- TODO: support more metadata, e.g. from <https://genomebiology.biomedcentral.com/articles/10.1186/s13059-024-03351-2>

-- Defines the JavaScript query patterns for website metadata to be extracted.
-- Each list may contain several patterns which will be executed first to last until
-- there's a pattern that returns content.
property pageTitleQuery : {"document.querySelector('meta[name=\"dc.title\"]').content", "document.querySelector('meta[name=\"citation_title\"]').content", "document.querySelector('meta[property=\"og:title\"]').content", "document.title"}
property authorQuery : {"document.querySelector('meta[name=\"citation_authors\"]').content.replace(/( [A-Z]+)\\b/g, ',$1').split(';').join('\\n').trim()", "document.querySelector('meta[name=\"author\"]').content", "document.querySelector('meta[name=\"dc.creator\"]').content", "document.querySelector('meta[name=\"citation_author\"]').content", "document.querySelector('meta[name=\"dc.contributor\"]').content"} -- TODO: multiple authors must be on separate lines & ideally formatted as: Surname, First name(s) or Initials
property writerQuery : {"document.querySelector('meta[name=\"writer\"]').content"} -- TODO: currently not used by `createBookendsPublication()`
property institutionQuery : {"document.querySelector('meta[name=\"institution\"]').content", "document.querySelector('meta[name=\"citation_author_institution\"]').content"} -- TODO: currently not used by `createBookendsPublication()`
property publicationQuery : {"document.querySelector('meta[name=\"prism.publicationName\"]').content", "document.querySelector('meta[name=\"citation_journal_title\"]').content"}
property publicationYearQuery : {"document.querySelector('meta[name=\"dc.date\"]').content.match(/^[0-9]{4}/)[0]", "document.querySelector('meta[property=\"article:published_time\"]').content.split('-')[0]", "document.querySelector('meta[name=\"pubdate\"]').content.match(/^[0-9]{4}/)[0]", "document.querySelector('meta[name=\"citation_publication_date\"]').content.match(/^[0-9]{4}/)[0]"}
property publicationDateQuery : {"document.querySelector('meta[name=\"dc.date\"]').content", "document.querySelector('meta[property=\"article:published_time\"]').content.match(/^[0-9]{4}-[0-9]{2}-[0-9]{2}/)[0]", "document.querySelector('meta[name=\"pubdate\"]').content.replace(/^([0-9]{4})([0-9]{2})([0-9]{2})/, '$1-$2-$3')", "document.querySelector('meta[name=\"citation_date\"]').content"}
property publicationVolumeQuery : {"document.querySelector('meta[name=\"prism.volume\"]').content", "document.querySelector('meta[name=\"citation_volume\"]').content"}
property publicationIssueQuery : {"document.querySelector('meta[name=\"prism.number\"]').content", "document.querySelector('meta[name=\"citation_issue\"]').content"}
property publicationFirstPageQuery : {"document.querySelector('meta[name=\"prism.startingPage\"]').content", "document.querySelector('meta[name=\"citation_firstpage\"]').content"}
property publicationLastPageQuery : {"document.querySelector('meta[name=\"prism.endingPage\"]').content", "document.querySelector('meta[name=\"citation_lastpage\"]').content"}
property pageDescriptionQuery : {"document.querySelector('meta[name=\"dc.description\"]').content", "document.querySelector('meta[name=\"description\"]').content", "document.querySelector('meta[property=\"og:description\"]').content"}
property pageKeywordsQuery : {"document.querySelector('meta[name=\"keywords\"]').content.split(/[,;] */).map(s => s.trim()).join('\\n')", "document.querySelector('meta[name=\"news_keywords\"]').content.split(/[,;] */).map(s => s.trim()).join('\\n')", "document.querySelector('meta[name=\"dc.subject\"]').content"}
property publisherQuery : {"document.querySelector('meta[name=\"dc.publisher\"]').content", "document.querySelector('meta[name=\"DC.Publisher\"]').content", "document.querySelector('meta[name=\"citation_publisher\"]').content", "document.querySelector('meta[property=\"og:site_name\"]').content", "document.querySelector('meta[name=\"publisher\"]').content"}
property issnQuery : {"document.querySelector('meta[name=\"prism.issn\"]').content", "document.querySelector('meta[name=\"citation_issn\"]').content"}
property doiQuery : {"document.querySelector('meta[name=\"DOI\"]').content", "document.querySelector('meta[name=\"citation_doi\"]').content", "document.querySelector('meta[name=\"prism.doi\"]').content.replace(/^doi:(.+)/, '$1')", "document.querySelector('meta[name=\"dc.identifier\"]').content.replace(/^doi:(.+)/, '$1')"}
property pmidQuery : {"document.querySelector('meta[name=\"citation_pmid\"]').content"}

-- These two lists map the metadata keys to their corresponding JavaScript query patterns, i.e.,
-- the first item in `keysList` defines the metadata key name for the first item in `queriesList` etc.
-- NOTES:
-- - Both lists must have an equal item count.
-- - If you add more keys & patterns, you also need to add support for these in `createBookendsPublication()`
property keysList : {"pageTitle", "author", "writer", "institution", "publication", "publicationYear", "publicationDate", "publicationVolume", "publicationIssue", "publicationFirstPage", "publicationLastPage", "pageDescription", "pageKeywords", "publisher", "issn", "doi", "pmid"}
property queriesList : {pageTitleQuery, authorQuery, writerQuery, institutionQuery, publicationQuery, publicationYearQuery, publicationDateQuery, publicationVolumeQuery, publicationIssueQuery, publicationFirstPageQuery, publicationLastPageQuery, pageDescriptionQuery, pageKeywordsQuery, publisherQuery, issnQuery, doiQuery, pmidQuery}

use framework "Foundation"
use scripting additions


on run
	if (count of keysList) ≠ (count of queriesList) then
		display alert "Incorrect metadata <-> query mapping" message "Please open this script and edit the properties `keysList` and `queriesList` so that they have matching elements." as critical buttons {"OK"} default button "OK" giving up after 10
		return
	end if
	
	set pageMetadata to my pageMetadataFromSafari()
	
	if pageMetadata is not {} then
		set bookendsPublication to my createBookendsPublication(pageMetadata)
	end if
end run


-- Extracts page metadata for the frontmost web page that's currently displayed in Safari.
on pageMetadataFromSafari()
	set accessDate to my formattedDateString(current date)
	set pageMetadata to {accessDate:accessDate}
	
	tell application "Safari"
		set pageURL to front document's URL
		if pageURL is missing value then
			display alert "Missing Safari content" message "Please open a website in Safari and run this script again." as critical buttons {"OK"} default button "OK" giving up after 10
			return {}
		end if
		
		set pageMetadata to pageMetadata & {pageURL:pageURL}
		
		set metadataValues to {}
		repeat with theQueries in queriesList
			set theResult to my executeJavascript(theQueries)
			copy theResult to end of metadataValues
		end repeat
		
		set pageMetadata to pageMetadata & (my recordFromKeys:keysList andValues:metadataValues)
	end tell
	
	return pageMetadata
end pageMetadataFromSafari


-- Creates a new Bookends "Internet" or "Journal article" publication with the given metadata.
on createBookendsPublication(pubData)
	tell application "Bookends"
		tell front library window
			set aPub to make new publication item with properties {type:16, user3:pubData's accessDate, url:pubData's pageURL}
		end tell
		
		set pubTitle to my valueForKey:"pageTitle" inRecord:pubData
		if pubTitle is not missing value and pubTitle is not "" then set aPub's title to pubTitle
		
		set author to my valueForKey:"author" inRecord:pubData
		if author is not missing value and author is not "" then set aPub's authors to author
		
		set pubDate to my valueForKey:"publicationDate" inRecord:pubData
		if pubDate is missing value or pubDate is "" then set pubDate to my valueForKey:"publicationYear" inRecord:pubData
		if pubDate is not missing value and pubDate is not "" then set aPub's publication date string to pubDate
		
		set pubJournal to my valueForKey:"publication" inRecord:pubData
		if pubJournal is not missing value and pubJournal is not "" then
			set aPub's journal to pubJournal
			set aPub's type to 9
		end if
		
		set pubVolume to my valueForKey:"publicationVolume" inRecord:pubData
		set pubIssue to my valueForKey:"publicationIssue" inRecord:pubData
		if pubVolume is not missing value and pubVolume is not "" then
			if pubIssue is not missing value and pubIssue is not "" then set pubVolume to pubVolume & "(" & pubIssue & ")"
			set aPub's volume to pubVolume
		end if
		
		set publicationPages to my valueForKey:"publicationFirstPage" inRecord:pubData
		set publicationLastPage to my valueForKey:"publicationLastPage" inRecord:pubData
		if publicationPages is not missing value and publicationPages is not "" then
			if publicationLastPage is not missing value and publicationLastPage is not "" then set publicationPages to publicationPages & "-" & publicationLastPage
			set aPub's pages to publicationPages
		end if
		
		set pubAbstract to my valueForKey:"pageDescription" inRecord:pubData
		if pubAbstract is not missing value and pubAbstract is not "" then set aPub's abstract to pubAbstract
		
		set pubKeywords to my valueForKey:"pageKeywords" inRecord:pubData
		if pubKeywords is not missing value and pubKeywords is not "" then set aPub's keywords to pubKeywords
		
		set pubPublisher to my valueForKey:"publisher" inRecord:pubData
		if pubPublisher is not missing value and pubPublisher is not "" then set aPub's publisher to pubPublisher
		
		set pubISSN to my valueForKey:"issn" inRecord:pubData
		if pubISSN is not missing value and pubISSN is not "" then set aPub's user6 to pubISSN
		
		set pubDOI to my valueForKey:"doi" inRecord:pubData
		if pubDOI is not missing value and pubDOI is not "" then set aPub's doi to pubDOI
		
		set pubPMID to my valueForKey:"pmid" inRecord:pubData
		if pubPMID is not missing value and pubPMID is not "" then set aPub's user18 to pubPMID
	end tell
end createBookendsPublication


-- Executes the given JavaScript snippet(s) in the frontmost Safari document and returns
-- the first result. Returns an empty string if the executed JavaScript didn't return anything.
on executeJavascript(theQueries)
	if theQueries is {} then return ""
	
	repeat with theQuery in theQueries
		if theQuery is not "" then
			tell application "Safari"
				tell front document
					try
						set theResult to do JavaScript theQuery
						if theResult is missing value then error
					on error
						set theResult to ""
					end try
					if theResult is not "" then return theResult
				end tell
			end tell
		end if
	end repeat
	
	return ""
end executeJavascript


-- Returns the given date as a string formatted as "YYYY-MM-DD".
on formattedDateString(theDate)
	if theDate is missing value then set theDate to current date
	
	set accessYear to year of theDate
	
	set accessMonth to (month of theDate as integer)
	if length of (accessMonth as string) is 1 then set accessMonth to "0" & accessMonth
	
	set accessDay to day of theDate
	if length of (accessDay as string) is 1 then set accessDay to "0" & accessDay
	
	return "" & accessYear & "-" & accessMonth & "-" & accessDay
end formattedDateString


-- Creates a Cocoa dictionary using the given lists of keys and values
-- and returns the resulting dictionary as an AppleScript record.
on recordFromKeys:keys andValues:values
	set theResult to current application's NSDictionary's dictionaryWithObjects:values forKeys:keys
	return theResult as record
end recordFromKeys:andValues:


-- Returns the value of the given key in the given record, or `missing value` if the key was not found.
-- NOTE: This currently only works with text (NSString) values.
on valueForKey:theKey inRecord:theRecord
	set theDict to current application's NSDictionary's dictionaryWithDictionary:theRecord
	set theResult to theDict's valueForKey:theKey
	if theResult = missing value then
		return missing value
	else
		return theResult as text
	end if
end valueForKey:inRecord:
jpottsx1
Posts: 4
Joined: Sun Sep 29, 2024 9:56 am

Re: Create new Internet record and Set Fields

Post by jpottsx1 »

Thank you so much for the script. It is eminently more sophisticated and thorough than the one I was trying to cobble together.

I look forward to seeing your "Keypoints" app. I will add my unsolicited feature request: the ability to also access epubs annotated in Apple Books. Works such as historical non-fiction are being published only as epubs in many cases these days.

Thanks once again for the script
Jeff
Jon
Site Admin
Posts: 10200
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: Create new Internet record and Set Fields

Post by Jon »

If that feature request is for Bookends (not Keypoints), please note that you can attach an epub to a reference, just like a PDF, and from Bookends open it in your preferred epub reader (right-click) or the Finder's default reader (double-click). There are also some adjustments in the next update that will improve compatibility with more epub formats.

Jon
Sonny Software
msteffens
Posts: 39
Joined: Thu Jul 19, 2007 10:04 am
Location: Germany

Re: Create new Internet record and Set Fields

Post by msteffens »

jpottsx1 wrote: Wed Oct 02, 2024 6:19 pm Thank you so much for the script.
You're welcome!

In the case of Keypoints, I agree that support for annotations made in eBooks would indeed be very nice to have. Unfortunately, this isn't straightforward to implement: unlike PDF annotations, which are supported via the system's PDFKit framework, there's no macOS system support for this.