TimeoutMatcher - apache/ctakes GitHub Wiki

public class TimeoutMatcher implements Closeable

Class for finding text spans with regular expressions. It runs Matcher.find() in a separate thread so that the search can be interrupted when a timeout elapses. This prevents the infinite-loop problems that poorly built expressions or unexpected text contents can cause. The timeout is specified in milliseconds and must be between 100 and 10,000; the default is 1000 milliseconds. Large timeouts are not advised: if a large amount of text needs to be parsed, it is better to split the text logically and use smaller timeouts. Extending Matcher would be better, but it is final.
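
The idea of running a find on a worker thread and abandoning it after a timeout can be sketched with the standard java.util.concurrent classes. This is a minimal illustration under stated assumptions, not the cTAKES source: the names TimeoutFindSketch, InterruptibleText and findSpans are hypothetical. The interrupt-aware CharSequence wrapper is also an assumption of this sketch; it is included because Matcher.find() does not check the thread's interrupt flag on its own.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the class and method names here are not part of cTAKES.
public class TimeoutFindSketch {

   // CharSequence wrapper that aborts regex evaluation once the worker thread is interrupted.
   static final class InterruptibleText implements CharSequence {
      private final CharSequence _text;
      InterruptibleText( final CharSequence text ) { _text = text; }
      @Override public char charAt( final int index ) {
         if ( Thread.currentThread().isInterrupted() ) {
            throw new IllegalStateException( "Regex evaluation interrupted" );
         }
         return _text.charAt( index );
      }
      @Override public int length() { return _text.length(); }
      @Override public CharSequence subSequence( final int start, final int end ) {
         return new InterruptibleText( _text.subSequence( start, end ) );
      }
      @Override public String toString() { return _text.toString(); }
   }

   // Collects "start-end" offsets of each match; stops early if one find() exceeds the timeout.
   public static List<String> findSpans( final String regex, final String text, final int timeoutMillis ) {
      final List<String> spans = new ArrayList<>();
      final Matcher matcher = Pattern.compile( regex ).matcher( new InterruptibleText( text ) );
      final Callable<Boolean> findTask = matcher::find;
      final ExecutorService executor = Executors.newSingleThreadExecutor();
      try {
         while ( true ) {
            // Each find() runs on the worker thread; the caller waits at most timeoutMillis.
            final Future<Boolean> future = executor.submit( findTask );
            try {
               if ( !future.get( timeoutMillis, TimeUnit.MILLISECONDS ) ) {
                  break;   // no more matches
               }
            } catch ( TimeoutException | ExecutionException e ) {
               future.cancel( true );   // interrupt the worker thread
               break;
            } catch ( InterruptedException e ) {
               future.cancel( true );
               Thread.currentThread().interrupt();
               break;
            }
            spans.add( matcher.start() + "-" + matcher.end() );
         }
      } finally {
         executor.shutdownNow();
      }
      return spans;
   }

   public static void main( final String[] args ) {
      // prints [5-6, 11-12]
      System.out.println( findSpans( "\\s+", "Hello World !", 1000 ) );
   }
}
```

The sketch creates and shuts down an executor per call for brevity; a class that is Closeable, like the one documented here, can keep one executor for its lifetime instead.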

Proper usage is:

try ( TimeoutMatcher finder = new TimeoutMatcher( "\\s+", "Hello World !" ) ) {
   Matcher matcher = finder.nextMatch();
   while ( matcher != null ) {
      ... <do something with the match> ...
      matcher = finder.nextMatch();
   }
} catch ( IllegalArgumentException iaE ) {
   ...  <do something with the exception> ...
}
  • Author: SPF, chip-nlp
  • Since: 11/5/2016
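
The class description recommends splitting large text logically and using smaller timeouts. One way to do that is to match line by line and shift each span by the line's offset in the full text. The sketch below shows only that offset bookkeeping with a plain Matcher so that it runs without cTAKES; ChunkedMatchSketch and findSpansByLine are hypothetical names. With cTAKES available, a TimeoutMatcher with a small timeout could be created for each line at the marked point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of chunked matching; names are illustrative, not part of cTAKES.
public class ChunkedMatchSketch {

   // Finds matches line by line and reports "start-end" offsets relative to the full text.
   public static List<String> findSpansByLine( final Pattern pattern, final String text ) {
      final List<String> spans = new ArrayList<>();
      int lineStart = 0;
      while ( lineStart <= text.length() ) {
         int lineEnd = text.indexOf( '\n', lineStart );
         if ( lineEnd < 0 ) {
            lineEnd = text.length();
         }
         final String line = text.substring( lineStart, lineEnd );
         // Assumption: a per-line TimeoutMatcher with a small timeout would replace this plain Matcher.
         final Matcher matcher = pattern.matcher( line );
         while ( matcher.find() ) {
            // Shift line-relative offsets back into full-text coordinates.
            spans.add( ( lineStart + matcher.start() ) + "-" + ( lineStart + matcher.end() ) );
         }
         lineStart = lineEnd + 1;
      }
      return spans;
   }

   public static void main( final String[] args ) {
      // prints [2-4, 8-11]
      System.out.println( findSpansByLine( Pattern.compile( "\\d+" ), "a 12\nbb 345\nc" ) );
   }
}
```

Splitting by line means an expression can no longer match across a line break, so the split points should follow boundaries that the expression is never expected to span.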

public TimeoutMatcher( final String regex, final String text ) throws IllegalArgumentException

Uses the default timeout of 1000 milliseconds

  • Parameters:
    • regex regular expression
    • text text to parse
  • Exceptions:
    • IllegalArgumentException if the regular expression is null or malformed

public TimeoutMatcher( final String regex, final String text, final int timeoutMillis ) throws IllegalArgumentException

  • Parameters:
    • regex regular expression
    • text text to parse
    • timeoutMillis milliseconds at which the regex match should abort, between 100 and 10000
  • Exceptions:
    • IllegalArgumentException if the regular expression is null or malformed

public TimeoutMatcher( final Pattern pattern, final String text ) throws IllegalArgumentException

Uses the default timeout of 1000 milliseconds

  • Parameters:
    • pattern Pattern compiled from a regular expression
    • text text to parse
  • Exceptions:
    • IllegalArgumentException if the pattern is null or malformed

public TimeoutMatcher( final Pattern pattern, final String text, final int timeoutMillis ) throws IllegalArgumentException

Uses the specified timeout instead of the default

  • Parameters:
    • pattern Pattern compiled from a regular expression
    • text text to parse
    • timeoutMillis milliseconds at which the regex match should abort, between 100 and 10000
  • Exceptions:
    • IllegalArgumentException if the pattern is null or malformed

public Matcher nextMatch()

  • Returns: a Matcher positioned at the next match found by Matcher.find(), or null if there is no further match

@Override public void close()

Shuts down the executor.
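
Because close() shuts down the executor, the try-with-resources form shown in the usage example releases the worker thread automatically. A minimal sketch of that pattern follows; ExecutorHolderSketch and isShutdown are hypothetical names, and the use of shutdownNow() rather than shutdown() is an assumption of this sketch, not a statement about the cTAKES implementation.

```java
import java.io.Closeable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of a Closeable that owns an executor; not the cTAKES source.
public class ExecutorHolderSketch implements Closeable {
   private final ExecutorService _executor = Executors.newSingleThreadExecutor();

   // Exposed only so the effect of close() can be observed.
   public boolean isShutdown() {
      return _executor.isShutdown();
   }

   // Stops the executor so that its worker thread does not keep the JVM alive.
   @Override
   public void close() {
      _executor.shutdownNow();
   }

   public static void main( final String[] args ) {
      final ExecutorHolderSketch holder = new ExecutorHolderSketch();
      try ( ExecutorHolderSketch h = holder ) {
         System.out.println( h.isShutdown() );   // prints false
      }
      System.out.println( holder.isShutdown() );   // prints true
   }
}
```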

private final class RegexCallable implements Callable<Matcher>

Simple Callable that runs a Matcher on text

@Override public Matcher call()

  • Returns: matcher if there is another find, else null
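
The contract described for call(), returning the matcher when another find succeeds and null otherwise, can be sketched as a small standalone Callable. FindCallableSketch is a hypothetical stand-in for illustration; the real RegexCallable is a private inner class whose body is not shown on this page.

```java
import java.util.concurrent.Callable;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for a Callable that advances a Matcher; not the cTAKES source.
public class FindCallableSketch implements Callable<Matcher> {
   private final Matcher _matcher;

   public FindCallableSketch( final Matcher matcher ) {
      _matcher = matcher;
   }

   // Returns the matcher positioned on the next match, or null when there are no more matches.
   @Override
   public Matcher call() {
      return _matcher.find() ? _matcher : null;
   }

   public static void main( final String[] args ) {
      final Matcher matcher = Pattern.compile( "\\d+" ).matcher( "a1b22" );
      final FindCallableSketch task = new FindCallableSketch( matcher );
      Matcher m = task.call();
      while ( m != null ) {
         // prints "1 at 1" and then "22 at 3"
         System.out.println( m.group() + " at " + m.start() );
         m = task.call();
      }
   }
}
```

Returning null to signal the end of the matches is what allows the while ( matcher != null ) loop in the usage example to terminate.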