taskspaces.rna
Class BM

java.lang.Object
  |
  +--taskspaces.rna.BM

public class BM
extends java.lang.Object


Field Summary
private  int[] d
          Internal BM table.
private static int MAXCHAR
          Maximum chars in character set.
private  int partial
          Bytes of a partial match found at the end of a text buffer.
private  byte[] pat
          Byte representation of pattern.
private  int patLen
          Length of pattern.
private  int[] skip
          Internal BM table.
 
Constructor Summary
(package private) BM()
          Boyer-Moore text search
 
Method Summary
 void compile(java.lang.String pattern)
          Compiles the text pattern for searching.
 int partialMatch()
          Returns the position at the end of the text buffer where a partial match was found.
 int search(byte[] text, int start, int length)
          Search for the compiled pattern in the given text.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, registerNatives, toString, wait, wait, wait
 

Field Detail

MAXCHAR

private static final int MAXCHAR
Maximum chars in character set.

pat

private byte[] pat
Byte representation of pattern.

patLen

private int patLen
Length of pattern.

partial

private int partial
Bytes of a partial match found at the end of a text buffer.

skip

private int[] skip
Internal BM table.

d

private int[] d
Internal BM table.
Constructor Detail

BM

BM()
Boyer-Moore text search

Scans text left to right, using what it knows of the pattern to quickly determine whether a match has been made in the text. In addition, it knows how much of the text to skip if a match fails. This considerably reduces the number of comparisons between the pattern and the text relative to a pure brute-force search. This has some advantages over the Knuth-Morris-Pratt text search.

The particular version used here is from "Handbook of Algorithms and Data Structures", G.H. Gonnet & R. Baeza-Yates. Example of use:

 String pattern = "and ";

 // f is assumed to be an open java.io.InputStream and b a byte[] read
 // buffer; neither is declared in the original example.
 BM bm = new BM();
 bm.compile(pattern);
 int bcount;
 int search;
 while ((bcount = f.read(b)) >= 0) {
     System.out.println("New Block:");
     search = 0;
     // Report every full match in this block.
     while ((search = bm.search(b, search, bcount - search)) >= 0) {
         System.out.println("full pattern found at " + search);
         search += pattern.length();
     }
     // Check for a partial match at the end of the block.
     if ((search = bm.partialMatch()) >= 0) {
         System.out.println("Partial pattern found at " + search);
     }
 }
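
The skipping behaviour described in the constructor notes can be illustrated with a minimal sketch. It uses the Horspool simplification of Boyer-Moore (a single bad-character table), which is not necessarily the Gonnet and Baeza-Yates version this class is based on. The class name HorspoolSketch, the buildSkip helper, and the 256-entry table size are assumptions made for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class HorspoolSketch {
    // Assumed table size: one slot per possible byte value.
    private static final int MAXCHAR = 256;

    /** Builds the bad-character shift table for the pattern. */
    static int[] buildSkip(byte[] pat) {
        int[] skip = new int[MAXCHAR];
        // A byte that does not occur in the pattern lets us shift by the full length.
        Arrays.fill(skip, pat.length);
        // The last pattern byte is excluded so every shift is at least 1.
        for (int i = 0; i < pat.length - 1; i++) {
            skip[pat[i] & 0xFF] = pat.length - 1 - i;
        }
        return skip;
    }

    /** Returns the index of the first match in text[start, start + length), or -1. */
    static int search(byte[] text, int start, int length, byte[] pat, int[] skip) {
        int m = pat.length;
        int end = start + length;
        int pos = start;
        while (pos + m <= end) {
            // Compare the pattern against the current window, right to left.
            int j = m - 1;
            while (j >= 0 && text[pos + j] == pat[j]) {
                j--;
            }
            if (j < 0) {
                return pos;
            }
            // Shift by the table entry for the last byte of the window.
            pos += skip[text[pos + m - 1] & 0xFF];
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] pat = "and ".getBytes(StandardCharsets.US_ASCII);
        byte[] text = "salt and pepper and more".getBytes(StandardCharsets.US_ASCII);
        int[] skip = buildSkip(pat);
        int pos = 0;
        while ((pos = search(text, pos, text.length - pos, pat, skip)) >= 0) {
            System.out.println("full pattern found at " + pos);
            pos += pat.length;
        }
    }
}
```

The table lets the search jump several bytes at once after a mismatch, which is where the saving over a brute-force comparison comes from.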
Method Detail

compile

public void compile(java.lang.String pattern)
Compiles the text pattern for searching.
Parameters:
pattern - What we're looking for.

search

public int search(byte[] text,
                  int start,
                  int length)
Search for the compiled pattern in the given text. A side effect of the search is the notion of a partial match at the end of the searched buffer. This partial match is helpful in searching text files when the entire file doesn't fit into memory.
Parameters:
text - Buffer containing the text
start - Start position for search
length - Length of text in the buffer to be searched.
Returns:
Position in the buffer where the pattern was found. The example of use in the constructor detail treats a negative return value as "no match".

partialMatch

public int partialMatch()
Returns the position at the end of the text buffer where a partial match was found.

In many cases a full text search over a large amount of data cannot access the entire file or stream at once, so the search algorithm notes where the final partial match occurs. After an entire buffer has been searched for full matches, calling this method reveals whether a potential match appeared at the end. This information can be used to patch together the partial match with the next buffer of data to determine whether a real match occurred.

Returns:
the number of bytes that formed a partial match, or -1 if there was no partial match
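
One way to patch a partial match together with the next buffer is sketched below. It assumes the return value is the count of trailing bytes that matched, as the Returns line states. The class StitchSketch and its stitch and indexOf helpers are hypothetical and not part of BM. The brute-force indexOf merely stands in for a real search call.

```java
import java.nio.charset.StandardCharsets;

public class StitchSketch {
    /**
     * Joins the last `partial` bytes of the previous buffer with the first
     * bytes of the next buffer, so a pattern split across the boundary
     * becomes contiguous. At most patLen - 1 bytes are taken from next,
     * which is enough to complete any match that starts in the tail.
     */
    static byte[] stitch(byte[] prev, int prevLen, int partial,
                         byte[] next, int nextLen, int patLen) {
        int head = Math.min(nextLen, patLen - 1);
        byte[] joined = new byte[partial + head];
        System.arraycopy(prev, prevLen - partial, joined, 0, partial);
        System.arraycopy(next, 0, joined, partial, head);
        return joined;
    }

    /** Plain brute-force search, standing in for BM.search in this sketch. */
    static int indexOf(byte[] text, byte[] pat) {
        for (int i = 0; i + pat.length <= text.length; i++) {
            int j = 0;
            while (j < pat.length && text[i + j] == pat[j]) {
                j++;
            }
            if (j == pat.length) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        byte[] pat = "and ".getBytes(StandardCharsets.US_ASCII);
        byte[] prev = "salt an".getBytes(StandardCharsets.US_ASCII);  // ends with the partial match "an"
        byte[] next = "d pepper".getBytes(StandardCharsets.US_ASCII); // begins with the rest of the pattern
        int partial = 2; // assumed to be the value reported by partialMatch()

        byte[] joined = stitch(prev, prev.length, partial, next, next.length, pat.length);
        int hit = indexOf(joined, pat);
        System.out.println(hit >= 0
                ? "match spans the buffer boundary"
                : "no match across the boundary");
    }
}
```

Searching the joined bytes, rather than only checking that the next buffer continues the longest partial match, also catches a match that begins at a later byte of the tail.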
