Suggestion: "Clean-up" of Attachments folder
Jon,
I have quite a few orphaned PDFs in my Attachments folder, from refs that I've deleted but whose PDFs were left behind.
Is it possible for Bookends to scan the folder and delete those PDFs that are not linked to any of the refs within Bookends?
Let me know what you think of this idea.
I suppose this could be dealt with by including a warning in a dialog box, shown before any deletions occur, that reminds the user that other databases may be linking to the presumed orphaned PDFs.
For those with only one database, this would be a useful feature. Those with several would probably never use it, but that must be true of several features in any software program, no?
Thanks for considering it!
-
- Posts: 11
- Joined: Mon Jun 12, 2006 2:48 am
- Location: SoCal
- Contact:
I was about to post with this exact question when I found this old one....
So here is my situation, maybe i'm doing things inefficiently (??). I have a database called PDF_library that contains all references for which I have actual papers on disk. I consider this to be my "master database". I also create other databases for individual projects, and drag any desired refs from PDF_library into the project database.
When doing Pubmed searches, I generally have an "Inbox" database. Although some refs that i grab from Pubmed come with pdfs, others don't. After I grab all the refs from Pubmed and copy them to Inbox, I'll copy them all to the project database, and then copy all that have attachments to my PDF_library database.
Doing it this way, I often end up with duplicates from both my PDF_library and project database, which I then remove from both databases. However, now i've got tons of duplicate pdfs in my attachments folder. I use Skim to mark them up, so it can be tricky to tell which copies to throw away (I generally want to keep the oldest / original one that has all my notes etc)
Since this is kind of a pain, and since nobody else but me and Prop seem to have this problem, there must be a better way to deal with it? any ideas?
Jon wrote:
Hi,
It's not a bad idea, but of course you can have many databases. If the attachment isn't found in the db you are using to search, it may belong to another.
Jon
Sonny Software
When I have Bookends download a PDF, it puts it in the attachments folder and gives it a label, say "Hartman et al 2002 12451108.pdf". If, sometime later, I have another database open, do a search, and download it again (often happens during "mass" downloads), I end up with multiple duplicate files in the attachments folder:
Hartman et al 2002 12451108.pdf
Hartman et al 2002 12451108 838.pdf
Hartman et al 2002 12451108 3317.pdf
Hartman et al 2002 12451108 4252.pdf
Is there a way to tell Bookends to just not download a pdf that is already in the attachments folder? And what is the number AFTER the PMID in those PDFs?
Jon wrote:
No, Bookends has no idea if the pdf already exists or not. You have to tell Bookends not to download the pdf again by unchecking Get PDF.
The number after the PMID is a random number to distinguish the pdfs.
Jon
Sonny Software
is that random number assigned by bookends? if so, then bookends "knows" that that file is already there, right? maybe i'm misunderstanding?
i guess the bottom line is that managing a large folder of downloaded pdfs will take a considerable amount of manual effort, unless I'm missing something obvious...
Bookends assigns the random number, and that is how it finds the file when you open an attachment.
You seem to think that Bookends "knows" that 2 pdfs are the same. It does not. All it knows are the names, which can be assigned arbitrarily. If you don't want duplicate pdfs, don't download them twice. And if you do, it does no harm.
Jon
Sonny Software
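For readers who want to check outside Bookends whether two differently named PDFs are in fact the same file, here is a minimal sketch in Python. It is not a Bookends feature; the function names `file_digest` and `identical_files` are made up for this illustration. It only detects byte-for-byte identical copies, so a copy that has been edited after download will no longer match its original.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def identical_files(folder):
    """Group the PDFs in `folder` whose contents are byte-for-byte identical.

    Returns a list of lists of file names; each inner list has 2+ names.
    """
    by_digest = defaultdict(list)
    for p in sorted(Path(folder).glob("*.pdf")):
        by_digest[file_digest(p)].append(p.name)
    return [names for names in by_digest.values() if len(names) > 1]
```

Calling `identical_files` on a copy of the attachments folder only lists candidate groups; nothing is deleted, so the decision about which copy to keep stays manual.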
ok - i guess that's how it has to be.... but it either 1. causes a huge buildup of duplicate pdfs in my attachments folder, or 2. substantially increases my work effort by not allowing me to simply grab all the references (with pdfs) that pubmed pulls down and then use "remove duplicates" to get rid of any dups from the database. it seems like if there is a PMID in the filename that it should be able to be uniquely identified, but i guess that's why i'm a biologist and not a computer programmer.
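The idea of using the PMID in the filename can be tried outside Bookends. The sketch below is an illustration only: it assumes file names follow the pattern seen in this thread ("authors year PMID", optionally followed by the random number), and the names `NAME_RE` and `group_by_pmid` are hypothetical. Names that do not fit the pattern are simply skipped.

```python
import re
from collections import defaultdict

# Assumed naming pattern, inferred from the examples in this thread:
#   "<authors> <4-digit year> <PMID>[ <random number>].pdf"
NAME_RE = re.compile(
    r"^(?P<label>.+ \d{4}) (?P<pmid>\d+)(?: \d+)?\.pdf$", re.IGNORECASE
)


def group_by_pmid(names):
    """Map each PMID to the file names that carry it, keeping only PMIDs
    that occur in more than one file name."""
    groups = defaultdict(list)
    for name in names:
        m = NAME_RE.match(name)
        if m:
            groups[m.group("pmid")].append(name)
    return {pmid: files for pmid, files in groups.items() if len(files) > 1}
```

Feeding it the names in the attachments folder (for example `[p.name for p in Path(folder).iterdir()]`, after `from pathlib import Path`) gives one group per repeated PMID. Which copy in a group carries the notes still has to be checked by hand, for instance by looking at the file dates.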
Re:
Jon wrote:
Hi,
It's not a bad idea, but of course you can have many databases. If the attachment isn't found in the db you are using to search, it may belong to another.
Jon
Sonny Software
Well, BE would just need to know which databases I am using to figure out if a given pdf can be deleted. Hard to imagine someone needing 1000s of dbs. Any news on the request of "cleaning up" attachment folders?
Another suggestion would be to let BE modify the "last accessed" date of pdf (touch) files for a given DB, let the user do that for relevant DBs and then use the finder to delete PDFs that have older "last accessed" date.
What do you think ?!
ps. A quick hack that seems to do it : from BE export all attachments to a new folder (repeat for all DBs), close BE and replace the attachment folder with the new one.
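A more cautious variant of this hack is to list the orphans first instead of replacing the folder right away. The Python sketch below is only an illustration: it assumes the exported copies keep the same file names as the originals in the attachments folder (not verified here), and `find_orphans` is a made-up name.

```python
from pathlib import Path


def find_orphans(attachments_dir, exported_dirs):
    """Return the names of files in `attachments_dir` that appear in none
    of the folders the attachments were exported to (one per database)."""
    exported = set()
    for folder in exported_dirs:
        exported.update(p.name for p in Path(folder).iterdir() if p.is_file())
    return sorted(
        p.name
        for p in Path(attachments_dir).iterdir()
        # Skip hidden files such as .DS_Store
        if p.is_file() and not p.name.startswith(".") and p.name not in exported
    )
```

The function only reports names. Reviewing the list, and moving those files to the Trash by hand, leaves room to catch a database that was forgotten during the export.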
Re: Suggestion: "Clean-up" of Attachments folder
Since that request was initially made, we added an option to the confirmation dialog shown when deleting references: move attachments to the Trash.
As for Bookends setting the last modified date of a pdf you read to the date read, that doesn't seem like a good idea. The last modified date has a purpose outside of Bookends.
Your workaround is a good one, BTW.
Jon
Sonny Software