Overview: Complex Text Searches in Bookends

DrJJWMac
Posts: 348
Joined: Sat Jun 22, 2019 8:04 am
Location: Alabama USA


Post by DrJJWMac »

I am posting this with the thought that it might help users who are interested in doing complex text searches on their databases. It is a summary of my initial findings as I learn to work within the search approaches used by Bookends.

First, even the best text search criteria will work poorly, if at all, when the fields to be searched are missing, incorrect, or incomplete. The main panel shows cases where the title has issues. How can one see the mistakes in other fields? Here are three searches that I use to address typical problems in other key search fields.

* No Abstract -> SQL with abstract IS NULL
* No Keywords -> SQL with keywords IS NULL

I prefer to do each separately, because I want to fix a missing abstract first. I consider keywords only as an extra step.

* Funky Stuff -> SQL with abstract REGEX '…' OR title REGEX '…' OR journal REGEX '…'

Sometimes my references are imported or added with key fields containing "..." (three dots) preceded or followed by short clips of phrases. This means the fields may not be correct or complete.
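The three checks above can be tried outside Bookends on a toy table. The sketch below is only an illustration under stated assumptions: it uses Python's sqlite3 module with a made-up table named refs, and SQLite's REGEXP operator (backed by a small Python function) stands in for the REGEX keyword used in Bookends. The sample records and the pattern for three dots are invented for the example and are not taken from Bookends.

```python
import re
import sqlite3

# Hypothetical stand-in for a reference library; Bookends' real schema is not shown in this post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE refs (id INTEGER PRIMARY KEY, title TEXT, abstract TEXT, keywords TEXT)"
)
conn.executemany(
    "INSERT INTO refs (id, title, abstract, keywords) VALUES (?, ?, ?, ?)",
    [
        (1, "Complete record", "A full abstract.", "search; regex"),
        (2, "No abstract here", None, "search"),
        (3, "No keywords here", "Another full abstract.", None),
        (4, "Clipped record", "... results show that the", "import"),
    ],
)

# SQLite has no built-in regex matcher; "X REGEXP P" calls regexp(P, X).
def regexp(pattern, value):
    return int(value is not None and re.search(pattern, value) is not None)

conn.create_function("REGEXP", 2, regexp)

def ids(where, params=()):
    """Return the ids of all rows matching a WHERE clause."""
    rows = conn.execute(f"SELECT id FROM refs WHERE {where} ORDER BY id", params)
    return [row[0] for row in rows]

no_abstract = ids("abstract IS NULL")
no_keywords = ids("keywords IS NULL")

# Assumed pattern: three literal dots, or a single ellipsis character.
dots = r"\.\.\.|…"
funky = ids("abstract REGEXP ? OR title REGEXP ?", (dots, dots))

print(no_abstract, no_keywords, funky)
```

Note that the dots are escaped in the pattern. An unescaped '...' in a regular expression matches any three characters, so it would flag nearly every record.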

Second, I have yet to appreciate the full differences among the results I get from three types of searches. I outline them here as a reminder that the results are not always the same, especially for complex search criteria.

Type 1 - Text Search in Fields
You create this search using a Smart Group. You fill in the various settings for Text with words or characters. You can use quotes around phrases ("some phrase") and the wild card ("red car*" for "red car" or "red cars"). You select which FIELD should have the text. Finally, you use AND or OR criteria to build a multiple-setting search.

This type of search does not allow you to build strict patterns with Boolean search criteria, and it does not search the text of the attachments.
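To see why the lack of grouping matters, consider four criteria joined as A OR B AND C OR D with no parentheses. This post does not establish how Bookends evaluates such a list, so the Python sketch below simply compares three possible readings and lists the truth assignments where they disagree. The function names are hypothetical labels for the readings, not Bookends behavior.

```python
from itertools import product

def left_to_right(a, b, c, d):
    # Rows applied one after another: ((A OR B) AND C) OR D
    return ((a or b) and c) or d

def sql_precedence(a, b, c, d):
    # AND binds tighter than OR, as in standard SQL: A OR (B AND C) OR D
    return a or (b and c) or d

def explicit_groups(a, b, c, d):
    # Parenthesized intent: (A OR B) AND (C OR D)
    return (a or b) and (c or d)

# Collect every truth assignment on which the three readings do not all agree.
differing = [
    combo
    for combo in product([False, True], repeat=4)
    if len({left_to_right(*combo), sql_precedence(*combo), explicit_groups(*combo)}) > 1
]

for combo in differing:
    print(combo, left_to_right(*combo), sql_precedence(*combo), explicit_groups(*combo))
```

Any reference whose fields fall into one of the printed cases would be returned under one reading and dropped under another, which is why explicit parentheses are valuable.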

Type 2 - Text Search in Attachments
You create this search using a Smart Group. You fill in various text terms. You can use AND or OR designations to narrow the search with relationships among your search terms.

This type of search does not allow you to build strict patterns with Boolean search criteria, and it will fail when the attachment cannot be indexed with Spotlight.

Type 3 - SQL Search in Fields
You create this search using a Smart Group (SQL). The two strongest advantages of this type of search are 1) the ability to build complex Boolean criteria and 2) the ability to preface search terms with REGEX and pattern-matching phrases to build flexibility into the search terms themselves. The one disadvantage of this type of search is that it searches only the fields in Bookends; it does not search within the text of the attachments.

The point to remember is that, when you design a complex pattern in the Type 1 search, you have no guarantee that you can replicate it in a Type 2 or Type 3 search. For example, to the best of my findings, these two search groups return different sets of results:

Type 1
--> Text characters red* All Fields
OR Text whole words "some car*" in All Fields
AND Text words begin with Ford* in Title
OR Text words begin with Ford* in Abstract

Type 3
(allFields REGEX '(?i)red*' OR allFields REGEX '(?i)some car*') AND (title REGEX 'Ford*' OR abstract REGEX 'Ford*')
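One possible contributor to the difference is that `*` means different things in the two search types. In Type 1 it is a wild card, but in a regular expression it is a quantifier meaning "zero or more of the preceding character". The sketch below uses Python's re module as a stand-in. I am assuming the REGEX engine in Bookends follows the same conventional rules, which this post does not confirm, and the sample strings are invented.

```python
import re

# Type 3 patterns copied from the post; "*" is a regex quantifier here, not a wild card.
red = re.compile(r"(?i)red*")   # "re" followed by zero or more "d", any case
ford = re.compile(r"Ford*")     # "For" followed by zero or more "d", case-sensitive

samples = [
    "a red car",
    "Recent results",
    "The Ford plant",
    "Formal methods",
    "ford trucks",
]

for text in samples:
    print(f"{text!r}: red*={bool(red.search(text))}, Ford*={bool(ford.search(text))}")

# If the intent is "words beginning with", a word boundary plus optional
# word characters is closer (assuming the engine supports \b and \w).
red_prefix = re.compile(r"(?i)\bred\w*")
ford_prefix = re.compile(r"(?i)\bFord\w*")

for text in samples:
    print(
        f"{text!r}: red prefix={bool(red_prefix.search(text))}, "
        f"Ford prefix={bool(ford_prefix.search(text))}"
    )
```

Under these rules the pattern 'red*' matches "Recent" and 'Ford*' matches "Formal". The two Ford patterns also lack the (?i) flag, so they would miss lowercase "ford".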

The return sets will mostly overlap. Some references from the second set will not appear in the first set, and some references from the first set will not appear in the second set. I may have some insights later as to how to find the references that appear in only one of the two sets (that is, the set of returns that are NOT in BOTH the Type 1 and the Type 3 results).
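The set of references found by exactly one of the two searches is the symmetric difference of the two result sets. As a minimal sketch, assuming the results of each search have been exported as reference IDs (the IDs below are made up), Python's set operators give each piece directly:

```python
# Hypothetical reference IDs returned by each search; real values would come from Bookends.
type1_hits = {101, 102, 103, 104}
type3_hits = {103, 104, 105}

only_type1 = type1_hits - type3_hits        # found by Type 1 but not Type 3
only_type3 = type3_hits - type1_hits        # found by Type 3 but not Type 1
in_exactly_one = type1_hits ^ type3_hits    # symmetric difference of the two sets

print(sorted(only_type1), sorted(only_type3), sorted(in_exactly_one))
```

Inspecting the references in each one-sided set is a practical way to work out which criterion behaves differently between the two search types.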

I hope this gives a starting point for folks who are new to Bookends and/or who are struggling with certain aspects of complex text searches. I welcome corrections and comments to this overview.

--
JJW
Last edited by DrJJWMac on Sun Mar 22, 2020 4:56 pm, edited 1 time in total.
Jon
Site Admin
Posts: 10066
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: Overview: Complex Text Searches in Bookends

Post by Jon »

Thanks for this summary. A few small additions:

1. Searching attached PDFs can also be done with the Find command (Command-F), not just smart groups. And it does allow boolean searching using the Spotlight rules (covered in the User Guide, p. 75).

2. There's a fourth search: for PDF tags. Many people who curate PDFs in this manner find this useful.

Jon
Sonny Software