Longest Common Subsequence using Dynamic Programming

Yasir
3 min readJan 17, 2021

In this article , we will look on how we can use dynamic programming to come up with a solution of polynomial instead of exponential .

Let’s DO That !

This is the one of the best example to implement DP . And it is quite famous .
Lets begin

Problem Statement :

Given two strings: string X of length m [X(1..m)], and string Y of length n [Y(1..n)], find the longest common subsequence: the longest sequence of characters that appear left-to-right (but not necessarily in a contiguous block) in both strings. For example, if X = “ABCBDAB” and Y = “BDCABA”, the LCS(X, Y) = {“BCBA”, “BDAB”, “BCAB”}. We can see there are several optimal solutions.

Brute Force Approach

One simple idea is to check every subsequence of X[1.. m] (m is the length of sequence X) to see if it is also a subsequence of Y[1..n] (n is the length of sequence Y).

It will takes O(n) time, and there are 2^m subsequences of X. The running time thus is exponential O(n. 2^m) and is not good for large sequences.

Recursive Solution

Before going to DP solution, let us form the recursive solution for this and later we can add memoization to reduce the complexity.

Let’s start with some simple observations about the LCS problem. If we have two strings, say “ABCBDAB” and “BDCABA”, and if we draw lines from the letters in the first string to the corresponding letters in the second, no two lines cross:

Note : From the above observation, we can see that the current characters of X and Y may or may not match. That means, suppose that the two first characters differ. Then it is not possible for both of them to be part of a common subsequence — one or the other (or maybe both) will have to be removed. Finally, observe that once we have decided what to do with the first characters of the strings, the remaining sub problem is again a LCS problem, on two shorter strings. Therefore we can solve it recursively.

Based on the above discussion, here we get the possibilities as described below:

  1. If X[i] == Y[j] : 1 + LCS(i + 1,j + 1)
  2. If X[i] ≠ Y[j] : LCS(i,j + 1) // skipping j th character of Y
  3. If X[i] ≠ Y[j] : LCS(i + 1,j) // skipping i th character of X
def lcs(X, Y, m, n):
if m == 0 or n == 0:
return 0;
elif X[m-1] == Y[n-1]:
return 1 + lcs(X, Y, m-1, n-1);
else:
return max(lcs(X, Y, m, n-1), lcs(X, Y, m-1, n));

This is a correct solution but it is very time consuming. For example, if the two strings have no matching characters, the last line always gets executed which gives (if m == n) close to O(2^n ).

DP = Recursion + Memoization

DP Solution : Adding Memoization

To get it done in an efficient code , we will memoize and use dynamic programming :

def lcs(X, Y):    m = len(X)
n = len(Y)
# Here we are creating a 2d Matrix
L = [[None]*(n + 1) for i in range(m + 1)]
for i in range(m + 1):
for j in range(n + 1):
if i == 0 or j == 0 :
L[i][j] = 0
elif X[i-1] == Y[j-1]:
L[i][j] = L[i-1][j-1]+1
else:
L[i][j] = max(L[i-1][j], L[i][j-1])
return L[m][n]

To get a better understanding about how the above code works , I will recommend you to watch this video : http://youtube.com/watch?v=sSno9rV8Rhg

Where to Go From Here

My GitHub: https://github.com/myawesomehub

My Writing : Link

Keep following to get more useful articles 👍
Thanks You !

--

--

Yasir

Exploring the Swift, SwiftUI, and Apple universe.