InDesign Tip 3. Find/Grep after a Find/Grep (Relative searching) in JavaScript

My mission over the past few days was to take a row of tabular data in our catalog of the form:

[year]\t[part#]\t[description]\t[uom]\t[price]

then retrieve the part # and price. My eventual goal was to run an external script that checks and updates the price. My first inclination was to do a GREP search for the part # (which is of the form “[0-9]{5}”) and then, from that point in the story/text frame, search for the price. I found a related discussion in the Adobe forums, “Find text from current insertion point – indesign cs3 javascript”.

As a baseline, I am using the following code for my initial find/grep:

var myDoc = app.activeDocument;
app.findGrepPreferences = NothingEnum.nothing; 
app.changeGrepPreferences = NothingEnum.nothing;
 
//Set the find options. 
app.findChangeGrepOptions.includeFootnotes = false; 
app.findChangeGrepOptions.includeHiddenLayers = false; 
app.findChangeGrepOptions.includeLockedLayersForFind = false; 
app.findChangeGrepOptions.includeLockedStoriesForFind = false; 
app.findChangeGrepOptions.includeMasterPages = false;
 
app.findGrepPreferences.findWhat =  "[0-9]{5}";
app.findGrepPreferences.appliedParagraphStyle = myDoc.paragraphStyles.item("MyParts");
var myResults = myDoc.findGrep();
for (var i = 0; i < myResults.length; i++)
{
    var result= myResults[i];
}

The discussion proposes adding an inner find/grep inside the for loop:

var myDoc = app.activeDocument;
app.findGrepPreferences = NothingEnum.nothing; 
app.changeGrepPreferences = NothingEnum.nothing;
 
//Set the find options. 
app.findChangeGrepOptions.includeFootnotes = false; 
app.findChangeGrepOptions.includeHiddenLayers = false; 
app.findChangeGrepOptions.includeLockedLayersForFind = false; 
app.findChangeGrepOptions.includeLockedStoriesForFind = false; 
app.findChangeGrepOptions.includeMasterPages = false;
 
app.findGrepPreferences.findWhat =  "[0-9]{5}";
app.findGrepPreferences.appliedParagraphStyle = myDoc.paragraphStyles.item("MyParts");
var myResults = myDoc.findGrep();
for (var i = 0; i < myResults.length; i++)
{
    var result= myResults[i];
    app.findGrepPreferences = NothingEnum.nothing; 
    app.changeGrepPreferences = NothingEnum.nothing;
 
    app.findGrepPreferences.findWhat = "[0-9]+\\.[0-9]{2}"; // escaped dot: match a literal decimal point
    app.findGrepPreferences.appliedParagraphStyle = myDoc.paragraphStyles.item("MyParts");
    var myResults2 = myDoc.findGrep();
}

The problem with this is that a find/grep executed from JavaScript always runs absolutely over the document, meaning from the beginning and not relative to the cursor or selection. So the inner find/grep starts again from the top of the document. One suggestion in the discussion is to ignore every result of the second find/grep that occurs before our current index. The value we are looking for would probably be the match with the smallest index greater than our current insertion point.
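
As a rough sketch of that filtering idea, the helper below picks the first match that starts after a given position. The function name `firstMatchAfter` is hypothetical, the plain objects only stand in for InDesign results, and the code assumes each result exposes an `index` within the same story and that the list is already sorted by that index.

```javascript
// Hypothetical helper (not part of the InDesign API).
// Assumes `matches` is sorted by ascending index within one story.
function firstMatchAfter(matches, afterIndex) {
    for (var i = 0; i < matches.length; i++) {
        if (matches[i].index > afterIndex) {
            return matches[i]; // first match past the current position
        }
    }
    return null; // nothing found after the given position
}

// Made-up sample data standing in for the results of the price search.
var prices = [
    { index: 12, contents: "4.50" },
    { index: 40, contents: "19.99" },
    { index: 75, contents: "3.25" }
];

// Pretend the part # was found at index 20.
var nearest = firstMatchAfter(prices, 20);
```

Calling this once per part # is what makes the approach quadratic in the worst case, since every call may scan the whole price list.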

Sounds reasonable in theory, but that is a lot of computation: O(n * n), since every outer match triggers another full search whose results must be scanned. So I figured there had to be an easier way. Keep in mind I have only been programming in InDesign for about a week, and all of my Google searches are coming up sparse and dry. Given the format of my data,

[year]\t[part#]\t[description]\t[uom]\t[price]

it would be ideal if I could simply parse the entire row. My initial dilemma concerned the cases where the description could be multiline (with the price on a later line). These cases can be rather convoluted so I was hoping that a double find/grep would take me where I needed to go.

The key to my solution, I realized, is to have a single grep that gets me everything I need. If the result contains all the data, there is no need for a nested find/grep. To achieve this, I decided to split my find/grep searches by the number of lines an entry occupies. For example, a single-line entry has the form:

[year]\t[part#]\t[description]\t[uom]\t[price]

while a multiline would be:

[year]\t[part#]\t[description]
\t\t[description]\t\t[price]

where [description] is basically the regex [^\r\t]*. With this crucial insight in hand, my solution simplifies to the following code:

var myDoc = app.activeDocument;
app.findGrepPreferences = NothingEnum.nothing; 
app.changeGrepPreferences = NothingEnum.nothing;
//Set the find options. 
app.findChangeGrepOptions.includeFootnotes = false; 
app.findChangeGrepOptions.includeHiddenLayers = false; 
app.findChangeGrepOptions.includeLockedLayersForFind = false; 
app.findChangeGrepOptions.includeLockedStoriesForFind = false; 
app.findChangeGrepOptions.includeMasterPages = false;
 
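// "\\." passes an escaped dot to GREP; a bare "\." in a JavaScript string is just "." (any character).
app.findGrepPreferences.findWhat =  ".*\t[0-9]{5}\t[^\t\r]*\t[^\t\r]*\t[0-9]+\\.([0-9]{2})";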
app.findGrepPreferences.appliedParagraphStyle = myDoc.paragraphStyles.item("MyParts");
var myResults = myDoc.findGrep();
for (var i = 0; i < myResults.length; i++)
{
    var myResult = myResults[i];
}

At this point I’m still working out the correct way to retrieve the (part #, price) pairs. Since I have not yet fully understood the object model and how to use insertion points, I am converting the data to plain text and extracting the values I need.

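   // Inside the loop above. The parentheses capture the five-digit part # as group 1;
   // without them, [1] would be undefined.
   var num = myResults[i].contents.match(/[^\t\r]*\t([0-9]{5})\t/)[1];
   var len = myResults[i].words.length;
   var priceWord = myResults[i].words[len-1]; // the match ends with the price, so take its last word
   var price = priceWord.contents;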
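
For the plain-text side of this, here is a minimal sketch that pulls both values out of one row with a single regular expression and two capture groups. The helper name `parseRow` and the sample row are made up for illustration; inside InDesign the input would be something like `myResults[i].contents`.

```javascript
// Hypothetical helper: extract the part # and price from one single-line row.
// Expected layout: year \t part# \t description \t uom \t price
function parseRow(rowText) {
    var m = rowText.match(/^[^\t\r]*\t([0-9]{5})\t[^\t\r]*\t[^\t\r]*\t([0-9]+\.[0-9]{2})/);
    if (m === null) {
        return null; // the text does not look like a single-line row
    }
    return { part: m[1], price: m[2] };
}

// Made-up sample row, not real catalog data.
var row = parseRow("2012\t12345\tWidget, blue\tEA\t19.99");
```

Returning `null` for a non-matching row avoids the error that indexing the result of a failed `match` would otherwise raise.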

My goal is to perform manipulations on the price (priceWord). With this workaround I have the algorithm template for exactly what I am looking to do. Of course I have to repeat the find/grep for the multiline cases, but that follows the same pattern.
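
A sketch of what the two-line case might look like on the plain-text side is below. It assumes the two lines are separated by a paragraph return (`\r`) and follow the multiline layout shown earlier; the helper name `parseTwoLineRow` and the sample text are hypothetical.

```javascript
// Hypothetical helper for a two-line entry of the form:
//   year \t part# \t description
//   \t \t description \t \t price
// Assumes the lines are separated by a paragraph return (\r).
function parseTwoLineRow(rowText) {
    var m = rowText.match(/^[^\t\r]*\t([0-9]{5})\t([^\t\r]*)\r\t\t([^\t\r]*)\t\t([0-9]+\.[0-9]{2})/);
    if (m === null) {
        return null; // not a two-line entry
    }
    return {
        part: m[1],
        description: m[2] + " " + m[3], // join both description fragments
        price: m[4]
    };
}

// Made-up sample entry, not real catalog data.
var entry = parseTwoLineRow("2012\t12345\tWidget, blue,\r\t\twith mounting kit\t\t19.99");
```

If the same pattern is reused as a `findWhat` string, the decimal point's backslash needs doubling (`"\\."`) so that it survives JavaScript string parsing.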

Posted in InDesign. Tagged with find/grep, grep, relative searching.

By james

September 5, 2012
