My mission over the past few days was to take a row of tabular data in our catalog of the form:
[year]\t[part#]\t[description]\t[uom]\t[price]
then retrieve the part# and price. My eventual goal was to run an external script that checks/updates the price. My first inclination was to do a GREP search for the part # (which is of the form “[0-9]{5}”) and then, from that point in the story/text frame, search for the price. I found a discussion on the Adobe forums related to this issue: “Find text from current insertion point – indesign cs3 javascript”.
As a baseline, I am using the following code for my initial find/grep:
```javascript
var myDoc = app.activeDocument;
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;
//Set the find options.
app.findChangeGrepOptions.includeFootnotes = false;
app.findChangeGrepOptions.includeHiddenLayers = false;
app.findChangeGrepOptions.includeLockedLayersForFind = false;
app.findChangeGrepOptions.includeLockedStoriesForFind = false;
app.findChangeGrepOptions.includeMasterPages = false;
app.findGrepPreferences.findWhat = "[0-9]{5}";
app.findGrepPreferences.appliedParagraphStyle = myDoc.paragraphStyles.item("MyParts");
var myResults = myDoc.findGrep();
for (var i = 0; i < myResults.length; i++) {
    var result = myResults[i];
}
```
The discussion proposes an inner find/grep inside the for loop:
```javascript
var myDoc = app.activeDocument;
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;
//Set the find options.
app.findChangeGrepOptions.includeFootnotes = false;
app.findChangeGrepOptions.includeHiddenLayers = false;
app.findChangeGrepOptions.includeLockedLayersForFind = false;
app.findChangeGrepOptions.includeLockedStoriesForFind = false;
app.findChangeGrepOptions.includeMasterPages = false;
app.findGrepPreferences.findWhat = "[0-9]{5}";
app.findGrepPreferences.appliedParagraphStyle = myDoc.paragraphStyles.item("MyParts");
var myResults = myDoc.findGrep();
for (var i = 0; i < myResults.length; i++) {
    var result = myResults[i];
    //Inner search for the price; "\\." keeps the dot literal in the GREP pattern.
    app.findGrepPreferences = NothingEnum.nothing;
    app.changeGrepPreferences = NothingEnum.nothing;
    app.findGrepPreferences.findWhat = "[0-9]+\\.[0-9]{2}";
    app.findGrepPreferences.appliedParagraphStyle = myDoc.paragraphStyles.item("MyParts");
    var myResults2 = myDoc.findGrep();
}
```
The problem with this is that find/grep, when executed from JavaScript, always runs absolutely in the document, meaning from the top/beginning and not relative to the cursor or selection. So the inner find/grep will start from the top of the document. One suggestion in the discussion is to ignore all results of the second find/grep that occur before our current index. The value we are looking for would probably be the smallest match with an index greater than our current insertion point.
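To make that suggestion concrete, here is a minimal sketch of the filtering step on plain arrays of character offsets. It assumes the offsets would come from something like the `index` property of each find/grep result, and that all matches live in the same story. The helper name `pairWithNextPrice` is hypothetical and not part of the InDesign API.

```javascript
// Hypothetical helper: for each part-number offset, pick the smallest
// price offset that is greater than it. Returns -1 when no price follows.
function pairWithNextPrice(partOffsets, priceOffsets) {
    var pairs = [];
    for (var i = 0; i < partOffsets.length; i++) {
        var best = -1;
        // Scan every price match; this inner loop is what makes it O(n * n).
        for (var j = 0; j < priceOffsets.length; j++) {
            var candidate = priceOffsets[j];
            if (candidate > partOffsets[i] && (best === -1 || candidate < best)) {
                best = candidate;
            }
        }
        pairs.push([partOffsets[i], best]);
    }
    return pairs;
}

// Made-up offsets, deliberately unsorted on the price side.
var pairs = pairWithNextPrice([5, 40, 90], [120, 30, 70]);
```

The helper does not rely on the order of the price offsets, so it should not matter in which order the results come back from the search.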
Sounds reasonable in theory, but that is a lot of computation: O(n * n), since every part number triggers another pass over all of the price matches. So I figured there had to be an easier way. Keep in mind I have only been programming in InDesign for about a week, and all of my Google searches are coming up sparse and dry. Given the format of my data,
[year]\t[part#]\t[description]\t[uom]\t[price]
it would be ideal if I could simply parse the entire row. My initial dilemma concerned the cases where the description could be multiline (with the price on a later line). These cases can be rather convoluted so I was hoping that a double find/grep would take me where I needed to go.
The key to my solution, I realized, is to have a single grep that gets me everything I need. If my result contains all the data I need, there is no need for a nested find/grep. To achieve this, I decided to split my find/grep searches by the number of lines an entry occupies. For example, a single-line entry has the form:
[year]\t[part#]\t[description]\t[uom]\t[price]
while a multiline entry would be:
[year]\t[part#]\t[description]
\t\t[description]\t\t[price]
where [description] is basically the regex [^\r\t]*. With this crucial insight in hand, my solution simplifies to the following code:
```javascript
var myDoc = app.activeDocument;
app.findGrepPreferences = NothingEnum.nothing;
app.changeGrepPreferences = NothingEnum.nothing;
//Set the find options.
app.findChangeGrepOptions.includeFootnotes = false;
app.findChangeGrepOptions.includeHiddenLayers = false;
app.findChangeGrepOptions.includeLockedLayersForFind = false;
app.findChangeGrepOptions.includeLockedStoriesForFind = false;
app.findChangeGrepOptions.includeMasterPages = false;
//Match a whole single-line row; "\\." keeps the decimal point literal.
app.findGrepPreferences.findWhat = ".*\t[0-9]{5}\t[^\t\r]*\t[^\t\r]*\t[0-9]+\\.([0-9]{2})";
app.findGrepPreferences.appliedParagraphStyle = myDoc.paragraphStyles.item("MyParts");
var myResults = myDoc.findGrep();
for (var i = 0; i < myResults.length; i++) {
    var myResult = myResults[i];
}
```
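One detail worth checking in patterns like this is how the decimal point is escaped. In a JavaScript string literal, `"\."` collapses to a plain `"."`, so the pattern the search receives would match any character between the digits. The sketch below shows the difference using JavaScript's own `RegExp`. InDesign's GREP engine is a different implementation, so this only illustrates the string-escaping part, and the variable names are made up for the example.

```javascript
// "\." in a string literal is just "."; the backslash is dropped.
var singleEscaped = "[0-9]+\.[0-9]{2}";
// "\\." produces a backslash followed by a dot, i.e. a literal dot in the pattern.
var doubleEscaped = "[0-9]+\\.[0-9]{2}";

// Build anchored regular expressions from both strings to compare them.
var loose = new RegExp("^" + singleEscaped + "$");
var strict = new RegExp("^" + doubleEscaped + "$");

var looseMatchesBadPrice = loose.test("12x34");   // "." accepts the "x"
var strictMatchesBadPrice = strict.test("12x34"); // literal dot rejects it
var strictMatchesPrice = strict.test("12.34");
```

In a catalog where prices are the only thing that looks like digits around a separator, the loose version may never misfire, but the doubled backslash makes the intent explicit.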
At this point I’m still working out the correct way to retrieve the (part #, price) pairs. Since I have not yet completely understood the object model and how to use insertion points, I am converting the data to plain text and extracting the values I need.
```javascript
// Capture the five-digit part number from the matched row's plain text.
var num = myResults[i].contents.match(/[^\t\r]*\t([0-9]{5})\t/)[1];
// The price is the last word of the match.
var len = myResults[i].words.length;
var priceWord = myResults[i].words[len - 1];
var price = priceWord.contents;
```
My goal is to perform manipulations on the price (priceWord). So with this workaround I have the algorithm template for exactly what I am looking to do. Of course I have to repeat the find/grep for the multiline cases, but that is more of the same.
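As a rough illustration of the plain-text extraction, here is a sketch that pulls the part number and price out of a row's contents with capture groups. It runs on ordinary strings rather than InDesign objects. The function `parseRow`, the two pattern names, and the sample rows are all hypothetical, and the two-line pattern assumes the continuation line starts after a `\r`.

```javascript
// Hypothetical patterns mirroring the single-line and two-line row layouts.
// Capture group 1 is the part number, group 2 is the price.
var SINGLE_LINE = /^[^\t\r]*\t([0-9]{5})\t[^\t\r]*\t[^\t\r]*\t([0-9]+\.[0-9]{2})$/;
var TWO_LINE = /^[^\t\r]*\t([0-9]{5})\t[^\t\r]*\r\t\t[^\t\r]*\t\t([0-9]+\.[0-9]{2})$/;

// Returns { part, price } for a recognised row, or null otherwise.
function parseRow(text) {
    var m = SINGLE_LINE.exec(text) || TWO_LINE.exec(text);
    if (!m) {
        return null;
    }
    return { part: m[1], price: parseFloat(m[2]) };
}

// Made-up sample rows for illustration only.
var single = parseRow("2009\t12345\tWidget, left-handed\tEA\t19.99");
var multi = parseRow("2009\t54321\tLong description that\r\t\tcontinues here\t\t5.00");
```

In a script, the input would presumably be the `contents` of each find/grep result. This only reads the values; the word-based lookup is still needed to get an object whose text can be changed.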