Thursday 3 February 2011

JAPE grammar - performance bottleneck

The ({Token})* pattern in a JAPE grammar can severely hurt performance, especially when text mining a large piece of text. As advised in the GATE user guide, if you can predict that you won't need to recognise a string of Tokens longer than x, you can use a bounded range instead:
({Token})[0,x]
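
To show where the bounded range sits in a full rule, here is a minimal sketch. The phase name, the rule name, the Lookup majorType values ("start_marker", "end_marker"), the output annotation type Span and the bound of 10 are all made up for illustration; only the ({Token})[0,10] range itself comes from the advice above.

```jape
Phase: BoundedTokens
Input: Token Lookup
Options: control = appelt

// Hypothetical rule: match a start marker, at most 10 Tokens, then an end marker.
// The bounded range replaces the unbounded ({Token})* so the matcher
// cannot keep extending a candidate match across the whole document.
Rule: BoundedSpan
(
  {Lookup.majorType == "start_marker"}
  ({Token})[0,10]
  {Lookup.majorType == "end_marker"}
):span
-->
:span.Span = {rule = "BoundedSpan"}
```

The bound is a trade-off: any span longer than the chosen limit is simply not matched, so it should be picked from what the corpus actually contains.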


However, when it is not possible to predict the length, is there a workaround? I am trying to figure one out.
Can anyone suggest one?
