Add Replace feature to the REGEX search

Dellu
Posts: 268
Joined: Sun Mar 27, 2016 5:30 am

Add Replace feature to the REGEX search

Post by Dellu »

It would be great if we had a replace feature as well for the great REGEX search.
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD
Contact:

Re: Add Replace feature to the REGEX search

Post by Jon »

That is not allowed by design. It's too easy to make blunders that can destroy data, especially in the hands of inexperienced users. Bookends already offers several replace features via the UI. What do you want to do that can't be accomplished with them? And remember, you can combine operations, meaning you could do a complicated SQL search first and then a Global Change that applies only to the Hits.

Jon
Sonny Software
Dellu
Posts: 268
Joined: Sun Mar 27, 2016 5:30 am

Re: Add Replace feature to the REGEX search

Post by Dellu »

Yes, the Global Change commands are what give BE its incredible power. I love each and every one of the commands there.

But I have had moments where I wished for an advanced (regex) replace command.
Jon wrote: Fri Aug 12, 2022 8:04 am What do you want to do that can't be accomplished with them?
I have had many cases that required an advanced replace. The one I remember from yesterday is the discovery that many references have a repeated date:

Reference 1: 2012 2012
Reference 2: 1971 1971

There were a large number of references with this kind of repeated date (I don't know the cause, but I believe importing via EndNote XML is the culprit). I wanted to remove one of the date entries. The regex search was able to collect those references, but the removal required a lot of manual work. I was able to reduce it by batching the references by year, but it was still a lot of work.

If we had regex replace, the task would have been much simpler:

Code: Select all

Find: (\d{4}) (\d{4})
Replace: $1
But, yah, if you think that is too much or too dangerous, I understand.
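
For illustration only, here is a minimal Python sketch of the find/replace described above, applied to plain strings (for example, exported text) and not to a Bookends database. The function name is hypothetical. One deliberate change from the pattern in the post: the second group is replaced by a `\1` backreference, so only a year followed by the same year is collapsed, and a value such as "1971 1972" is left alone. Python writes the replacement as `\1` where the post uses `$1`.

```python
import re

# A four-digit year, whitespace, then the SAME year again.
# The \1 backreference is an assumption added here; the pattern in the
# post, (\d{4}) (\d{4}), would also match two different years.
REPEATED_YEAR = re.compile(r"\b(\d{4})\s+\1\b")


def collapse_repeated_year(date_field: str) -> str:
    """Return the field with any 'YYYY YYYY' duplicate reduced to 'YYYY'."""
    return REPEATED_YEAR.sub(r"\1", date_field)


if __name__ == "__main__":
    for value in ["2012 2012", "1971 1971", "1971 1972"]:
        print(repr(value), "->", repr(collapse_repeated_year(value)))
```

Running it on a copy of the data first makes it easy to compare before and after and confirm nothing else was touched.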
iandol
Posts: 465
Joined: Fri Jan 25, 2008 2:31 pm

Re: Add Replace feature to the REGEX search

Post by iandol »

I would also like a regex replace. As an example, JATS XML litters my abstracts, and while Zotero seems to be able to strip this XML out, Bookends doesn't, so I have a bunch of <jats:p> / </jats:p / <jats:title> etc. in abstracts. I could go through each tag and manually find/replace, but regex's superpowers and group backreferences would make this maintenance work much quicker.

But I do understand Jon's reluctance, and perhaps a solution is to have an extra Are you REALLY REALLY sure? dialog if regex is enabled?
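
As a rough sketch of the kind of cleanup meant here, the following Python function strips `jats:` tags from a plain-text abstract. It is an illustration under assumptions, not how Bookends or Zotero do it: the function name is hypothetical, and the pattern assumes that only a tag at the very end of the text can be missing its closing `>`.

```python
import re

# Opening or closing jats: tag, with optional attributes. The tag must end
# in ">" unless it sits at the very end of the text, where the final
# bracket is allowed to be missing (an assumption made for this sketch).
JATS_TAG = re.compile(r"</?jats:[\w-]+(?:\s[^<>]*)?(?:>|$)")


def strip_jats_tags(abstract: str) -> str:
    """Remove jats: tags and collapse the whitespace left behind."""
    without_tags = JATS_TAG.sub(" ", abstract)
    return " ".join(without_tags.split())


if __name__ == "__main__":
    sample = "<jats:title>Abstract</jats:title><jats:p>Some text.</jats:p"
    print(strip_jats_tags(sample))
```

Each tag is replaced with a space and the whitespace is then normalized, so adjacent elements such as a title and a paragraph do not run together.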
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD
Contact:

Re: Add Replace feature to the REGEX search

Post by Jon »

There is another, fundamental, reason you can't use raw SQL to find/replace in Bookends. The main fields (Authors through User4) allow styled text. This means that Bookends maintains two versions of the data. You can search and replace the plain text, but the styled information would be unchanged. This would cause no end of problems for you (and in fact, you'd still see the old text in the UI). Bookends handles all of this for you when you use the Global Change operations. This wouldn't be a problem with fields where the text doesn't have styles or User5-User20, but in that case it's a pretty limited "feature".

There are other considerations as well, details that are too technical to go into, about the database design that would cause problems if you were allowed to manipulate the data directly. It is simply a bad idea.

See my next post if you still want to bypass the UI.

What is jats XML and where do you obtain it from? Bookends doesn't import just any old XML, just EndNote XML and Sente XML.

Jon
Sonny Software
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD
Contact:

Re: Add Replace feature to the REGEX search

Post by Jon »

If you'd really like to modify a Bookends database you can avoid the UI altogether by using Valentina Studio from Paradigma Software (who produce the incredibly powerful database engine that Bookends uses). You can download it here:

https://www.valentina-db.com/en/all-downloads/current

Basic features are free, and include the ability to run queries and apply regex.

Remember my admonition in my last post about modifying fields that have styled text, and anticipate other unexpected problems. I'd use it with copies of your database until you are comfortable with the consequences of your actions.

Jon
Sonny Software
iandol
Posts: 465
Joined: Fri Jan 25, 2008 2:31 pm

Re: Add Replace feature to the REGEX search

Post by iandol »

Hi Jon, <jats> tags are sometimes present when I use Quick add... which uses Crossref, for example for this DOI [10.1126/sciadv.abm2219] Bookends calls crossref and you can see from the API you use the abstract is wrapped in <jats:p> tags (I use the jq command to show just the JSON abstract field):

Code: Select all

▶︎ curl 'https://api.crossref.org/works/10.1126/sciadv.abm2219?mailto=support@sonnysoftware.com' | jq -r '.message.abstract'

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  9970    0  9970    0     0   8271      0 --:--:--  0:00:01 --:--:--  8329

<jats:p>Functional correspondences between deep convolutional neural networks (DCNNs) and the mammalian visual system support a hierarchical account in which successive stages of processing contain ever higher-level information. However, these correspondences between brain and model activity involve shared, not task-relevant, variance. We propose a stricter account of correspondence: If a DCNN layer corresponds to a brain region, then replacing model activity with brain activity should successfully drive the DCNN’s object recognition decision. Using this approach on three datasets, we found that all regions along the ventral visual stream best corresponded with later model layers, indicating that all stages of processing contained higher-level information about object category. Time course analyses suggest that long-range recurrent connections transmit object class information from late to early visual areas.</jats:p>
This markup in the abstract field depends on the journal I think. Bookends does not strip these tags, although it does remove the final angle bracket >
Screenshot 2022-08-13 at 15.43.47 copy.png
It isn't a major deal.
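
For anyone who wants to repeat the check above without jq, here is a small Python sketch that reads the same field. It assumes only what the jq filter implies, namely that the abstract sits at `message.abstract` in the JSON response; the function names are hypothetical and error handling is left out.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import urlopen

CROSSREF_WORKS = "https://api.crossref.org/works/"


def crossref_url(doi: str) -> str:
    """Build the works URL for a DOI (slashes in the DOI are kept as-is)."""
    return CROSSREF_WORKS + quote(doi, safe="/")


def abstract_from_response(body: str) -> Optional[str]:
    """Return message.abstract from a response body, or None if absent."""
    return json.loads(body).get("message", {}).get("abstract")


def fetch_abstract(doi: str) -> Optional[str]:
    """Fetch the record over the network and return its raw abstract."""
    with urlopen(crossref_url(doi)) as response:
        return abstract_from_response(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Needs network access; prints the abstract exactly as Crossref returns it.
    print(fetch_abstract("10.1126/sciadv.abm2219"))
```

The abstract is returned untouched, so any `<jats:...>` markup in the record stays visible, which is the point of the check.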

-----

Regarding regex search / replace -- fair enough. I never use styles myself, and globally remove styled text every few weeks, but I understand how this may leave the database in a conflicted state, and regex on the (I assume) RTF styling may do weird things that a user may not understand. Valentina seems a fair workaround for more technical users who should understand these problems...
DrJJWMac
Posts: 345
Joined: Sat Jun 22, 2019 8:04 am
Location: Alabama USA

Re: Add Replace feature to the REGEX search

Post by DrJJWMac »

> But I do understand Jon's reluctance, and perhaps a solution is to have an extra Are you REALLY REALLY sure? dialog if regex is enabled?

I would embrace this or an equivalent as a warning about the experience level needed to proceed. I would not support it as a way to allow Bookends to bypass its inherent limitations and do damage of its own accord.
--
JJW
Dellu
Posts: 268
Joined: Sun Mar 27, 2016 5:30 am

Re: Add Replace feature to the REGEX search

Post by Dellu »

I am also getting these pesky <jats:p> / </jats:p / <jats:title> in my abstracts.

But I am puzzled by Jon's comment on Valentina Studio. Are you telling me that I can just open the bdb file and manipulate it there?
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD
Contact:

Re: Add Replace feature to the REGEX search

Post by Jon »

Yes. Valentina Studio can open and manipulate the Bookends database (we use the Valentina database engine). But be careful. Bookends hides a lot of complexity from you. But you can play with copies of your library, and if you're happy with the result use it on your real library (after backing it up, of course).

We aren't going to support Valentina Studio, of course, but the Valentina database engine is fully documented by Paradigma Software if you want to try.

Jon
Sonny Software
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD
Contact:

Re: Add Replace feature to the REGEX search

Post by Jon »

iandol wrote: Sat Aug 13, 2022 10:51 am Hi Jon, <jats> tags are sometimes present when I use Quick add... which uses Crossref, for example for this DOI [10.1126/sciadv.abm2219] Bookends calls crossref and you can see from the API you use the abstract is wrapped in <jats:p> tags (I use the jq command to show just the JSON abstract field)
I did a Quick Add with that DOI and don't see any XML tags on import. I get this:

Functional correspondences between deep convolutional neural networks (DCNNs) and the mammalian visual system support a hierarchical account in which successive stages of processing contain ever higher-level information. However, these correspondences between brain and model activity involve shared, not task-relevant, variance. We propose a stricter account of correspondence: If a DCNN layer corresponds to a brain region, then replacing model activity with brain activity should successfully drive the DCNN's object recognition decision. Using this approach on three datasets, we found that all regions along the ventral visual stream best corresponded with later model layers, indicating that all stages of processing contained higher-level information about object category. Time course analyses suggest that long-range recurrent connections transmit object class information from late to early visual areas.

So it's either inconsistent (maybe fixed by Crossref?) or there is something obscure going on.


Jon
Sonny Software
Dellu
Posts: 268
Joined: Sun Mar 27, 2016 5:30 am

Re: Add Replace feature to the REGEX search

Post by Dellu »

Yes, that one is coming out fine for me too. Can you check this one (10.1163/18776930-01202001)?
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD
Contact:

Re: Add Replace feature to the REGEX search

Post by Jon »

I see the tags with that DOI. I'll take a look.

Jon
Sonny Software
Jon
Site Admin
Posts: 10048
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD
Contact:

Re: Add Replace feature to the REGEX search

Post by Jon »

Bookends will remove those HTML entities from the abstract in the next update.

Jon
Sonny Software
iandol
Posts: 465
Joined: Fri Jan 25, 2008 2:31 pm

Re: Add Replace feature to the REGEX search

Post by iandol »

Thanks Jon; interestingly I still see the jats tags with the DOI I sent, some sort of regional server difference or something...