Indexed Pattern Matching Tool (ipmt) is a tool for indexing a text file, storing the compressed index, and searching it for patterns.
HOW TO USE
Open your favorite console application and navigate to the project folder. Run
    $ make
to compile. By default, the GCC toolchain is used. If you prefer the LLVM compiler, run
    $ make CXX=clang++
instead. After compilation, the executable is available inside the bin folder. Run
    $ ./bin/ipmt --help
for help.
SYNOPSIS
    $ ./bin/ipmt [OPTIONS] index FILE
    $ ./bin/ipmt [OPTIONS] search PATTERN INDEX_FILE
DESCRIPTION
The ipmt utility was built for off-line search of text files. First, create a compressed index file from the input file. The tool can then search that index for one or more patterns in time that grows logarithmically with the size of the text. To search for more than one pattern, use a pattern file (see the -p option). Each line containing a match for any of the patterns is printed to standard output.
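The index-then-search workflow described above can be sketched with a suffix array, the index structure this tool is based on. The Python code below is only an illustration under simplifying assumptions: the function names (build_suffix_array, find_occurrences, matching_lines) are hypothetical, the construction is a naive sort rather than an efficient algorithm, and nothing here reflects ipmt's actual source code or index file format.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive construction by sorting the suffixes directly; fine for a demo,
    far too slow for large files.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return the sorted start positions of pattern in text.

    Two binary searches over the suffix array locate the contiguous block
    of suffixes that begin with pattern.
    """
    m = len(pattern)

    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose first m characters are > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[first:lo])


def matching_lines(text, positions):
    """Return each line that contains at least one match start, in file order.

    Assumes positions is sorted and that the pattern contains no newline.
    """
    lines = []
    last_start = -1
    for pos in positions:
        start = text.rfind("\n", 0, pos) + 1
        if start == last_start:
            continue  # this line was already reported
        end = text.find("\n", pos)
        if end == -1:
            end = len(text)
        lines.append(text[start:end])
        last_start = start
    return lines
```

For the text "banana", build_suffix_array returns [5, 3, 1, 0, 4, 2], and find_occurrences for the pattern "ana" returns [1, 3]. Each binary search performs a logarithmic number of comparisons in the text length, and each comparison inspects at most as many characters as the pattern has.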
The index is a suffix array, a data structure introduced by Udi Manber and Gene Myers (1989). Three algorithms are available for compression: LZW, by Terry Welch (1984); LZ77, by Abraham Lempel and Jacob Ziv (1977); and HUFFMAN (Huffman coding), by David Huffman (1952).
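To give a feel for what one of these compressors does, here is a minimal textbook sketch of LZW in Python. It is an assumption-laden illustration: the function names lzw_compress and lzw_decompress are hypothetical, codes are kept as a plain list of integers instead of being packed into bits, and the code is not taken from ipmt's implementation.

```python
def lzw_compress(data):
    """Encode a bytes object as a list of integer LZW codes."""
    # The dictionary starts with every single-byte string.
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep extending the current phrase while it is known.
            current = candidate
        else:
            # Emit the longest known phrase and learn the new one.
            codes.append(table[current])
            table[candidate] = len(table)
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes):
    """Rebuild the original bytes from a list of LZW codes."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # The code refers to the phrase being defined in this very step.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        # Mirror the entry the compressor added at this point.
        table[len(table)] = previous + entry[:1]
        previous = entry
    return b"".join(out)
```

Repetitive input is where the dictionary pays off: the 24-byte string b"TOBEORNOTTOBEORTOBEORNOT" is encoded as 16 codes, and lzw_decompress restores the original bytes exactly.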
OPTIONS
Generic Program Information
    -h, --help
        Display a help menu with all options.
    -v, --version
        Display version information and exit.
Matcher Selection
    -p, --pattern PATTERN_FILE
        Extract search patterns from PATTERN_FILE, separated by newlines.
Output Control
    -c, --count
        Only a count of selected lines is written to standard output.
Compression Algorithm Selection
    --compression=ALGORITHM
        Specify the algorithm to be used for compression. The possible values are HUFFMAN (default), LZW and LZ77.
EXAMPLES
To index a file for later search:
    $ ./bin/ipmt index [path to text file]
To print all occurrences of a pattern in a file:
    $ ./bin/ipmt search [pattern] [path to index file]
To find the number of occurrences of a pattern in a file:
    $ ./bin/ipmt -c search [pattern] [path to index file]
To use the LZ77 algorithm to index and search a file:
    $ ./bin/ipmt --compression=LZ77 index [path to text file]
    $ ./bin/ipmt --compression=LZ77 search [pattern] [path to index file]
DOCUMENTATION
For more details about the specification, read specification.pdf inside the doc folder. The project report can be found in the doc folder as well.
NOTES
ipmt was built by CIn/UFPE students Miguel Araújo and Paulo Lieuthier as an assignment for the 2015.2 String Processing course (Processamento de Cadeias de Caracteres, in Portuguese), taught by Professor Paulo Gustavo.
LICENSE
See the LICENSE file.