Batch editing PDF metadata

taja
Posts: 55
Joined: Sun Feb 12, 2012 10:39 pm

Batch editing PDF metadata

Post by taja »

Is this something that others would benefit from, and that could easily be implemented in Bookends? I find having at least the author field in the metadata for PDFs filled extremely helpful for narrowing down text content searches in apps such as Devonthink and Foxtrot. But I have only entered this data for a few PDFs because it's so time-consuming - even when using a paid app with batch PDF metadata editing. Given that Bookends has the relevant data (author especially) I think it would be brilliant to have a Bookends field > PDF metadata field function, if that were possible. But perhaps it's not easy to implement.
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: Batch editing PDF metadata

Post by Jon »

I don't know how hard that would be to do, but it strikes me as a laser-focused feature (meaning very narrow in scope and probably of interest to few) that also requires adding another menu item to the UI, which is already pretty dense, no? I think this is something that might be better addressed with an AppleScript. Furthermore, this actually strikes me as something DEVONthink should (might) do, because it's a document manager. They can access the author metadata for the reference, either by importing it or getting it ad hoc via an AppleScript, and modify the PDF accordingly.

Jon
Sonny Software
taja
Posts: 55
Joined: Sun Feb 12, 2012 10:39 pm

Re: Batch editing PDF metadata

Post by taja »

You're right, I doubt it would be of interest to many, and an AppleScript would be an ideal solution. I have no skills in that direction, but I can see that's not a reason to include it in the UI. FWIW I do use Bookends as a document manager (for PDFs and EPUBs at least).
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: Batch editing PDF metadata

Post by Jon »

We have an AppleScript forum and there are a number of very knowledgeable people who post there who might be able to help.

Jon
Sonny Software
joao
Posts: 58
Joined: Fri Jun 17, 2016 4:23 am

Re: Batch editing PDF metadata

Post by joao »

I actually have an AppleScript that changes the author and title metadata on the PDF according to the Bookends references. I use it for my e-reader.
I would recommend just changing the file rename format instead, though.

I'll see if I can post the script later this week (busy right now).
joao
Posts: 58
Joined: Fri Jun 17, 2016 4:23 am

Re: Batch editing PDF metadata

Post by joao »

Jon wrote: Fri Jan 14, 2022 7:28 am Furthermore, this actually strikes me as something DEVONthink should (might) do, because it's a document manager.
Actually, my script is based on one posted on the DEVONthink forums.
Dellu
Posts: 268
Joined: Sun Mar 27, 2016 5:30 am

Re: Batch editing PDF metadata

Post by Dellu »

Writing metadata to the PDF files of the selected references (you need to install exiftool):

Code:

tell application "Bookends"
	-- Unique IDs of the selected references, one per line
	set theIDs to «event ToySRUID» "Selection"
	repeat with theID in paragraphs of theIDs
		tell front library window
			try
				set myRefs to (publication items whose id is theID)
				set myItem to first item of myRefs
				set thePath to path of attachment items of myItem
				-- Skip references that have no attachments
				if thePath is not {} then
					set {theKey, thePath, theAuthor, theEditor, theTitle, theWords, theYear} to {citekey, path of attachment items, authors, editors, title, keywords, publication date string} of myItem
					-- Fall back to the editors when there are no authors
					if theAuthor = "" then set theAuthor to theEditor
					-- CreateDate = publication date string plus the current month, day and time
					set theMonth to (do shell script "date '+:%m:%d %H:%M:%S'")
					set theDate to (theYear & theMonth)
					-- Split the attachment paths (one per line), then restore the delimiters right away
					set otid to AppleScript's text item delimiters
					set AppleScript's text item delimiters to linefeed
					set thePath to text items of thePath
					set AppleScript's text item delimiters to otid
					repeat with i in thePath
						set thisPath to i as string
						-- Raises an error (caught by the outer try) if the file does not exist
						tell application "Finder" to set theName to name of (POSIX file thisPath as alias)
						
						try
							-- Adjust the path if exiftool lives elsewhere, e.g. /opt/homebrew/bin/exiftool with Homebrew on Apple Silicon
							do shell script "/usr/local/bin/exiftool -title=" & quoted form of theTitle & " -author=" & quoted form of theAuthor & " -keywords=" & quoted form of theWords & " -CreateDate=" & quoted form of theDate & " -overwrite_original " & quoted form of thisPath
						end try
						
					end repeat
					
				end if
				
			on error errorMessage
				-- Keep going with the next reference, but leave a trace in the log
				log errorMessage
			end try
			
		end tell
	end repeat
end tell
For references in a specific folder (the "Newcomer" folder in this case):

Code:

tell application "Bookends"
	tell front library window
		-- IDs of every reference in the group named "Newcomer"
		set theIDs to get id of publication items of group item "Newcomer"
		repeat with theID in theIDs
			try
				set myRefs to (publication items whose id is theID)
				set myItem to first item of myRefs
				set {theKey, thePath, theAuthor, theEditor, theTitle, theWords, theYear} to {citekey, path of attachment items, authors, editors, title, keywords, publication date string} of myItem
				-- Fall back to the editors when there are no authors
				if theAuthor = "" then set theAuthor to theEditor
				-- CreateDate = publication date string plus the current month, day and time
				set theMonth to (do shell script "date '+:%m:%d %H:%M:%S'")
				set theDate to (theYear & theMonth)
				-- Split the attachment paths (one per line), then restore the delimiters right away
				set otid to AppleScript's text item delimiters
				set AppleScript's text item delimiters to linefeed
				set thePath to text items of thePath
				set AppleScript's text item delimiters to otid
				repeat with i in thePath
					set thisPath to i as string
					-- Raises an error (caught by the outer try) if the file does not exist
					tell application "Finder" to set theName to name of (POSIX file thisPath as alias)
					
					try
						-- Adjust the path if exiftool lives elsewhere, e.g. /opt/homebrew/bin/exiftool with Homebrew on Apple Silicon
						do shell script "/usr/local/bin/exiftool -title=" & quoted form of theTitle & " -author=" & quoted form of theAuthor & " -keywords=" & quoted form of theWords & " -CreateDate=" & quoted form of theDate & " -overwrite_original " & quoted form of thisPath
					end try
					
				end repeat
				
			on error errorMessage
				-- Keep going with the next reference, but leave a trace in the log
				log errorMessage
			end try
		end repeat
	end tell
end tell
But if you have a very large collection of PDF files to write XMP data into, the fastest approach is to use JabRef 3.8.

If you go with JabRef, the steps are as follows:
- Export the references from Bookends in bib format
- Open the bib file in JabRef
- Open Library properties in JabRef and tell it where the attachment folder sits (the attachment folder of BE).
JabRef 3.8 will find the PDF files (later versions won't). You can then tell JabRef to write the XMP.
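
Both scripts above hard-code /usr/local/bin/exiftool. A minimal sketch of how the path could be looked up instead, assuming exiftool sits in one of two common locations (the handler name findExiftool and the candidate list are illustrative and not part of the original scripts):

```applescript
-- Hypothetical helper: returns the first exiftool path that exists and is
-- executable, or raises an error if none of the candidates qualifies.
on findExiftool()
	-- Assumed install locations: Homebrew on Apple Silicon, then /usr/local/bin
	set candidates to {"/opt/homebrew/bin/exiftool", "/usr/local/bin/exiftool"}
	repeat with candidate in candidates
		set candidatePath to contents of candidate
		try
			-- "test -x" exits non-zero when the file is missing or not executable,
			-- which makes do shell script raise an error that the try swallows
			do shell script "test -x " & quoted form of candidatePath
			return candidatePath
		end try
	end repeat
	error "exiftool was not found in any of the expected locations"
end findExiftool

set exiftoolPath to findExiftool()
```

When calling the handler from inside a tell application "Bookends" block, it has to be written as my findExiftool(). The returned path can then be used in place of the hard-coded one, as quoted form of exiftoolPath.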
joao
Posts: 58
Joined: Fri Jun 17, 2016 4:23 am

Re: Batch editing PDF metadata

Post by joao »

Dellu wrote: Tue Feb 15, 2022 3:13 pm Writing metadata to the PDF files of the selected references (you need to install exiftool):
Thanks Dellu. I haven't had the time to look into my script, sorry.
As Dellu says, unfortunately you need to install exiftool (you can do this through Homebrew).
I also update a custom field in Bookends when the script runs, so that it does not overwrite attachments that have already had their metadata updated. You could have it look for a specific metadata field in the PDF instead, I suppose, but I never did find out which fields are available beyond author and title.

Had no idea JabRef did that. Interesting.....
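
On the idea of checking the PDF itself rather than a custom Bookends field: a rough sketch, assuming exiftool is installed and its path is passed in. The handler names hasAuthor and listTags are hypothetical. It relies on exiftool's -s3 option (print only the tag value) to read the Author tag, and on -a -G1 -s to list every tag in the file together with its group, which is one way to see which fields are available.

```applescript
-- Hypothetical helper: true if the PDF already carries a non-empty Author tag.
-- exiftool prints nothing for a missing tag, so an empty result means "not set".
-- If the file cannot be read, do shell script raises an error for the caller to handle.
on hasAuthor(pdfPath, exiftoolPath)
	set existingAuthor to do shell script (quoted form of exiftoolPath & " -s3 -Author " & quoted form of pdfPath)
	return existingAuthor is not ""
end hasAuthor

-- Hypothetical helper: returns all tags found in the file as text, one per line.
-- -a keeps duplicate tags, -G1 shows the group of each tag, -s prints short tag names.
on listTags(pdfPath, exiftoolPath)
	return do shell script (quoted form of exiftoolPath & " -a -G1 -s " & quoted form of pdfPath)
end listTags
```

In Dellu's scripts this could sit just before the exiftool call, for example if not (my hasAuthor(thisPath, "/usr/local/bin/exiftool")) then ... end if, so that files which already have an author are left untouched.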