Boyer-Moore Algorithm

Algorithms

The Boyer-Moore algorithm is an efficient string matching algorithm that scans the pattern from right to left and uses two heuristics (bad character rule and good suffix rule) to skip characters.

Algorithm

The algorithm uses the bad character rule to shift the pattern when a mismatch occurs, allowing it to skip many characters in the text.

Time Complexity

The best case time complexity is , and the worst case is , where is the length of the text and is the length of the pattern.

Implementation

def bad_character_rule(pattern):
    """Preprocess the pattern to create the bad character table."""
    m = len(pattern)
    bad_char = {{chr(i): -1 for i in range(256)}}  # Initialize for all ASCII characters

    for i in range(m):
        bad_char[pattern[i]] = i
    
    return bad_char

def boyer_moore_search(text, pattern):
    """Perform Boyer-Moore search for the pattern in the text using the bad character rule."""
    n = len(text)
    m = len(pattern)
    
    bad_char = bad_character_rule(pattern)
    
    s = 0  # Shift of the pattern
    while s <= n - m:
        j = m - 1
        
        while j >= 0 and pattern[j] == text[s + j]:
            j -= 1
        
        if j < 0:
            print("Pattern found at index {}".format(s))
            s += (m - bad_char[text[s + m]] if s + m < n else 1)
        else:
            s += max(1, j - bad_char[text[s + j]])
            
# Example usage:
text = "ABAAABCD"
pattern = "ABCD"
boyer_moore_search(text, pattern)  # Output: Pattern found at index 4