# Rabin Karp Algorithm – Character Match

When we have a string pattern and want to match it with the substring of a given string, what can we do?

The tuition way is that we could use substring() to match in a for loop. But the Time Complexity would be O(N* L), N refer to the string length, L refer to the pattern length.

But in Rabin Karp way, the Time Complexity would be O(N). We convert String into Number in this case, and use a sliding window to care only the right digit and left digit in this window. In details, we calculate the hash value for pattern, and check if the hash value same for both pattern and each substring we get.

#### 187. Repeated DNA Sequences

The DNA sequence is composed of a series of nucleotides abbreviated as `'A'``'C'``'G'`, and `'T'`.

• For example, `"ACGAATTCCG"` is a DNA sequence.

When studying DNA, it is useful to identify repeated sequences within the DNA.

Given a string `s` that represents a DNA sequence, return all the `10`-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.

Example 1:

```Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC","CCCCCAAAAA"]
```

Example 2:

```Input: s = "AAAAAAAAAAAAA"
Output: ["AAAAAAAAAA"]```
##### enumeration
``````List<String> findRepeatedDnaSequences(String s) {
int n = s.length();

HashSet<String> seen = new HashSet(); // to store every DNA sequence we have seen
HashSet<String> res = new HashSet<>(); // to store every DNA sqeuenct existed in seen

for (int i = 0; i + 10 <= n; i++) {
String subStr = s.substring(i, i + 10); // get the substring as 10 character long
if (seen.contains(subStr)){
} else {
}
}
}
``````
##### Rabin karp
``````List<String> findRepeatedDnaSequences(String s) {
// We only care 4 Characters, so we assign integer onto them
int[] nums = new int[s.length()];
for (int i = 0; i < nums.length; i++) {
switch (s.charAt(i)) {
case 'A':
nums[i] = 0;
break;
case 'G':
nums[i] = 1;
break;
case 'C':
nums[i] = 2;
break;
case 'T':
nums[i] = 3;
break;
}
}

HashSet<Integer> seen = new HashSet<>(); // the interger for character sequence we have seen
HashSet<String> res = new HashSet<>(); // the duplicate character sequence

int L = 10;
// length of sequence
int R = 4; // quaternary

// R^(L - 1)
int RL = (int) Math.pow(R, L - 1);

int windowHash = 0;

int left = 0, right = 0;
while (right < nums.length) {
// add new number as the lowest digit
windowHash = R * windowHash + nums[right];
right++;

// when sequence length is 10
if (right - left == L) {
// check if seen
if (seen.contains(windowHash)) {

} else {

}
// remove the digit in the highest digit
windowHash = windowHash - nums[left] * RL;
left++;
}
}
}
``````

#### 28. Find the Index of the First Occurrence in a String

Given two strings `needle` and `haystack`, return the index of the first occurrence of `needle` in `haystack`, or `-1` if `needle` is not part of `haystack`.

Example 1:

```Input: haystack = "sadbutsad", needle = "sad"
Output: 0
Explanation: "sad" occurs at index 0 and 6.
The first occurrence is at index 0, so we return 0.
```

Example 2:

```Input: haystack = "leetcode", needle = "leeto"
Output: -1
Explanation: "leeto" did not occur in "leetcode", so we return -1.```
##### BRUTE FORCE
``````
int rabinKarp(String txt, String pat) {

int L = pat.length();

int R = 256;

// for module calculate
long Q = 1658598167;

// RL = Math.pow(R, L - 1);
long RL = 1;
for (int i = 1; i <= L - 1; i++) {

RL = (RL * R) % Q;
}

long patHash = 0;
for (int i = 0; i < pat.length(); i++) {
patHash = (R * patHash + pat.charAt(i)) % Q;
}

long windowHash = 0;

int left = 0, right = 0;
while (right < txt.length()) {
windowHash = ((R * windowHash) % Q + txt.charAt(right)) % Q;
right++;

if (right - left == L) {
if (windowHash == patHash) {

// check hash collision
if (pat.equals(txt.substring(left, right))) {
return left;
}
}

// + Q to avoid negative integer
windowHash = (windowHash - (txt.charAt(left) * RL) % Q + Q) % Q;

left++;
}
}

return -1;
}
``````