Format question

A place for users to ask each other questions, make suggestions, and discuss Bookends.
cboulanger
Posts: 59
Joined: Mon Jan 28, 2008 6:18 pm

Format question

Post by cboulanger »

I am developing a format that conforms to the JSON data specification (http://www.json.org). At the moment, it looks like this:

`{ "id":`@`,"type":"`y`","author":"`a`","title":"`t`","editor":"`e`","journal":"`j`","volume":"`v`","pages":"`p-`","date":"`d`","publisher":"`u`","location":"`l`","url":"`z`","title":"`s`","user1":"`w`","user2":"`x`","user3":"`c`","user4":"`g`","abstract":"`b`","keywords":"`k`","notes":"`n`","user5":"`u5`","user6":"`u6`","user7":"`u7`","user8":"`u8 `","user9":"`u9`","user10":"`u10`","user11":"`u11`","user12":"`u12`","user13":"`u13`","user14":"`u14`","user15":"`u15`","user16":"`u16`","attachments":"`h`","groups":"`g`"},`
The problem I have is that if a field is empty, the output drops the colon and the value and leaves only the bare key. See the example below, e.g. the "volume", "abstract" or user fields:

{ "id":100850,"type":"Conference proceedings","author":"Boulanger, Christian","title":"Europeanisation Through Judicial Activism? The Hungarian Constitutional Court's Legitimacy and Hungary's 'Return to Europe'","editor","journal":"Contours of Legitimacy","volume","pages","date":"2002","publisher","location","url","title":"Europeanisation Through Judicial Activism? The Hungarian Constitutional Court's Legitimacy and Hungary's 'Return to Europe'","user1":"Boulanger-2002-Europeanization","user2","user3","user4":"COMP Judicial Activism","abstract","keywords","notes","user5","user6","user7","user8","user9","user10","user11","user12","user13","user14","user15","user16","attachments","groups":"COMP Judicial Activism"},
What do I have to change to get correct output (for example, "keywords":"")?

Thanks,
Christian
Jon
Site Admin
Posts: 10296
Joined: Tue Jul 13, 2004 6:27 pm
Location: Bethesda, MD

Re: Format question

Post by Jon »

There are several ways to do this (see the RIS format, for example). Also, look in the User Guide for "Conditional groups" in the section on formatting. It lets you have multiple options for what to do when one or more fields are empty, and is pretty easy to understand:

{$"keywords"$ : "k"^$"keywords"$ : ""}

The first option is output if there are keywords, the second if there are no keywords. Of course, you could just not output keywords in JSON if there are none:

{$"keywords"$ : "k"^}
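
The two conditional groups above correspond to two ways of handling an empty field: emit the key with an empty string, or leave the key out. As a rough sketch of that choice in JavaScript (the helper name `jsonMember` and its `omitEmpty` flag are made up for illustration and are not part of Bookends):

```javascript
// Hypothetical helper, not part of Bookends: mimics the two conditional
// groups above for a single field.
function jsonMember(key, value, omitEmpty) {
    const isEmpty = value === undefined || value === null || value === "";
    if (isEmpty) {
        // Fallback branch of the group: nothing at all, or the key with "".
        return omitEmpty ? "" : JSON.stringify(key) + ' : ""';
    }
    // First branch: key and value; JSON.stringify adds the quotes and escaping.
    return JSON.stringify(key) + " : " + JSON.stringify(value);
}

console.log(jsonMember("keywords", "courts; Hungary", false)); // "keywords" : "courts; Hungary"
console.log(jsonMember("keywords", "", false));                // "keywords" : ""
console.log(jsonMember("keywords", "", true));                 // (empty string)
```

Omitting empty fields keeps the output smaller, while emitting `""` keeps every record with the same set of keys; which one is preferable depends on what consumes the JSON.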

Jon
Sonny Software
cboulanger
Posts: 59
Joined: Mon Jan 28, 2008 6:18 pm

Re: Format question

Post by cboulanger »

Very cool, thanks! I am glad I asked. The formatter syntax is very powerful, but I don't consider it "easy", actually. I am not a total dummy when it comes to computer languages, but that syntax does have a learning curve. ;-)

Here's the working result:

`{`{$"id"$:@^}{$,"type"$:"y"^}{$,"author"$:"a"^}{$,"title"$:"t"^}{$,"editor"$:"e"^}{$,"journal"$:"j"^}{$,"volume"$:"v"^}{$,"pages"$:"p-"^}{$,"date"$:"d"^}{$,"publisher"$:"u"^}{$,"location"$:"l"^}{$,"url"$:"z"^}{$,"title"$:"s",`{$"user1"$:"w"^},{$"user2"$:"x"^}`,"user3"$:"c"^}{$,"user4"$:"g"^}{$,"abstract"$:"b"^}{$,"keywords"$:"k"^}{$,"notes"$:"n"^}{$,"user5"$:"u5"^}{$,"user6"$:"u6"^}{$,"user7"$:"u7"^}{$,"user8"$:"u8 ",`{$"user9"$:"u9"^}`,"user10"$:"u10"^}{$,"user11"$:"u11"^}{$,"user12"$:"u12"^}{$,"user13"$:"u13"^}{$,"user14"$:"u14"^}{$,"user15"$:"u15"^}{$,"user16"$:"u16"^}{$,"attachments"$:"h"^}{$,"groups"$:"g"}`},`
What's especially good about it is that empty fields are no longer included in the result, which saves a lot of bandwidth.

Now I have an almost working JSON output generator. Of course, there are still some issues, such as having to wrap the result in array brackets and to strip line breaks and other offending characters, but that can be done easily with some post-processing. Of course, it would be even better to have a native JSON output! But in the meantime, this will do.

Thanks again,
Christian
cboulanger
Posts: 59
Joined: Mon Jan 28, 2008 6:18 pm

Re: Format question

Post by cboulanger »

The format still wasn't right: some of the quotation marks were transformed into the numeric XML entity &#034;, there were lots of other little bugs, and fields were missing.

Here's the corrected version:

`{`{$"id"$:@^}{$,"type":"$y$"$^}{$,"authors":"$a$"$^}{$,"title":"$t$"$^}{$,"editors":"$e$"$^}{$,"journal":"$j$"$^}{$,"volume":"$v$"$^}{$,"[pages]":"$p-$"$^}{$,"thedate":"$d$"$^}{$,"publisher":"$u$"$^}{$,"location":"$l$"$^}{$,"url":"$z$"$^}{$,"abstract":"$b$"$^}{$,"keywords":"$k$"$^}{$,"notes":"$n$"$^}{$,"user1":"$u1$"$^}{$,"user2":"$u2$"$^}{$,"user3":"$u3$"$^}{$,"user4":"$u4$"$^}{$,"user5":"$u5$"$^}{$,"user6":"$u6$"$^}{$,"user7":"$u7$"$^}{$,"user8":"$u8$"$^}{$,"user9":"$u9$"$^}{$,"user10":"$u10$"$^}{$,"user11":"$u11$"$^}{$,"user12":"$u12$"$^}{$,"user13":"$u13$"$^}{$,"user14":"$u14$"$^}{$,"user15":"$u15$"$^}{$,"user16":"$u16$"$^}{$,"user17":"$u17$"$^}{$,"user18":"$u18$"$^}{$,"user19":"$u19$"$^}{$,"user20":"$u20$"$^}{$,"attachments":"$h$"$^}{$,"groups":"$g$"$^}`},`
The following post-processing is necessary to get valid JSON:

    // Convert the raw formatter output into a JSON array string.
    function cleanup ( json )
    {
        json = json
            .replace(/(\r\n|\r|\n)/g,"")    // strip linebreaks
            .replace(/\\/g,"")              // strip all backslashes
            .replace(/<HTML><HEAD>.*<\/HEAD><BODY>/i,"") // strip HTML markup
            .replace(/<BR \/><\/BODY><\/HTML>/i,"")
            .replace(/\}\,<BR \/>/gi,"},")  // strip BR tag following a reference
            .replace(/<BR \/>/gi,"\\n")     // convert all others into escaped linebreaks
            .replace(/[\x00-\x1f]/g,"")     // strip all control characters
            .replace(/&#034;/g,'\\"')       // escape quotation mark
            ;

        json = json.replace(/,\s*$/,"");    // remove trailing comma, if present
        return "[" + json + "]";
    }
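
One way to check that the cleaned-up string really is valid JSON is to hand it to `JSON.parse`. The following is only a sketch: `parseReferences` is a hypothetical helper, and the sample string is made up in the shape the format above is meant to produce, not actual Bookends output.

```javascript
// Hypothetical check, not part of the original post: returns the parsed
// array of references, or throws an Error carrying the parser's message.
function parseReferences(cleaned) {
    let refs;
    try {
        refs = JSON.parse(cleaned);
    } catch (err) {
        throw new Error("Invalid JSON: " + err.message);
    }
    if (!Array.isArray(refs)) {
        throw new Error("Expected a JSON array of references");
    }
    return refs;
}

// Made-up sample in the shape the cleanup step is meant to return.
const sample = '[{"id":100850,"type":"Conference proceedings","thedate":"2002"}]';
const refs = parseReferences(sample);
console.log(refs.length, refs[0].thedate); // 1 2002
```

If parsing fails, the parser's error message usually points at the offending character, which helps in finding fields that still contain unescaped quotation marks.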