[Leetcode] [Python] Edit Distance

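Problem

Find the minimum edit distance between two strings, i.e. the minimum number of operations needed to turn the source string into the target string. The allowed operations are deleting a character, inserting a character, and replacing a character.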

Approach

Dynamic programming; this is a classic problem.

Reference:
http://bangbingsyb.blogspot.com/2014/11/leetcode-edit-distance.html

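State:
DP[i+1][j+1]: the edit distance from word1[0:i] to word2[0:j]. Both ranges are inclusive here, so word1[0:i] means the first i+1 characters of word1 (unlike a Python slice).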

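Recurrence:
Consider the last edit in word1[0:i] -> word2[0:j]. It must be one of the three operations given in the problem:

a) Insert a character: transform word1[0:i] -> word2[0:j-1], then insert word2[j] after word1[0:i]
DP[i+1][j+1] = DP[i+1][j] + 1

b) Delete a character: transform word1[0:i-1] -> word2[0:j], then delete word1[i]
DP[i+1][j+1] = DP[i][j+1] + 1

c) Replace a character: transform word1[0:i-1] -> word2[0:j-1], then
if word1[i] != word2[j], replace word1[i] with word2[j]: DP[i+1][j+1] = DP[i][j] + 1
if word1[i] == word2[j], no edit is needed: DP[i+1][j+1] = DP[i][j]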

So the minimum edit distance is:
DP[i+1][j+1] = min(DP[i][j] + k, DP[i+1][j] + 1, DP[i][j+1] + 1)
where k = 0 if word1[i] == word2[j], and k = 1 otherwise.
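The recurrence can also be evaluated top-down. The following is a minimal sketch, not the post's own solution: the function name `edit_distance_topdown` and the helper `dist` are hypothetical, and the state is indexed by prefix lengths (`dist(i, j)` covers the first i characters of word1 and the first j characters of word2) rather than by the inclusive indices used above.

```python
from functools import lru_cache


def edit_distance_topdown(word1, word2):
    """Edit distance via the recurrence, evaluated top-down with memoization."""

    @lru_cache(maxsize=None)
    def dist(i, j):
        # dist(i, j): edit distance between the first i characters of word1
        # and the first j characters of word2 (prefix lengths, an assumption
        # of this sketch).
        if i == 0:
            return j  # word1 prefix is empty: insert j characters
        if j == 0:
            return i  # word2 prefix is empty: delete i characters
        k = 0 if word1[i - 1] == word2[j - 1] else 1
        return min(
            dist(i - 1, j - 1) + k,  # replace (or keep when k == 0)
            dist(i, j - 1) + 1,      # insert word2[j - 1]
            dist(i - 1, j) + 1,      # delete word1[i - 1]
        )

    return dist(len(word1), len(word2))


print(edit_distance_topdown("horse", "ros"))  # 3
```

For long inputs the recursion depth grows with len(word1) + len(word2), so the bottom-up table is usually the safer choice in Python.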

Order of computation:
replace (i, j)      delete (i, j+1)
insert  (i+1, j)           (i+1, j+1)

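So to compute DP[i+1][j+1] we need three values in the 2-D table: the cells to the upper left, directly above, and to the left. Once row 0 and column 0 are filled in, the rest can be computed top to bottom, left to right.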
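Because each cell depends only on its upper-left, upper, and left neighbours, only the previous row and the current row need to be kept in memory. A minimal sketch under that observation (the function name `edit_distance_two_rows` is hypothetical, and rows are taken to index prefixes of word1):

```python
def edit_distance_two_rows(word1, word2):
    """Edit distance keeping only the previous and current rows of the table."""
    m, n = len(word1), len(word2)
    prev = list(range(n + 1))  # row 0: empty word1 prefix -> word2 prefixes
    for i in range(1, m + 1):
        curr = [i] + [0] * n  # column 0: delete all i characters
        for j in range(1, n + 1):
            k = 0 if word1[i - 1] == word2[j - 1] else 1
            curr[j] = min(
                prev[j - 1] + k,  # upper-left: replace or keep
                curr[j - 1] + 1,  # left: insert
                prev[j] + 1,      # above: delete
            )
        prev = curr
    return prev[n]


print(edit_distance_two_rows("horse", "ros"))  # 3
```

This reduces the extra space from O(m * n) to O(n) while the running time stays O(m * n).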

Initial and boundary values:
DP[0][i] = i: word1 is empty, so reaching word2[0:i-1] takes i insertions.
DP[i][0] = i: word2 is empty, so turning word1[0:i-1] into the empty string takes i deletions.
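As an illustration of these boundary values, the sketch below fills only row 0 and column 0 and leaves the interior at zero. The helper name `init_table` is hypothetical, and rows are assumed to index prefixes of word1.

```python
def init_table(word1, word2):
    """Build the (len(word1)+1) x (len(word2)+1) table with only the borders filled."""
    m, n = len(word1), len(word2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        dp[0][j] = j  # empty word1 -> first j characters of word2: j insertions
    for i in range(m + 1):
        dp[i][0] = i  # first i characters of word1 -> empty word2: i deletions
    return dp


for row in init_table("ab", "xyz"):
    print(row)
# [0, 1, 2, 3]
# [1, 0, 0, 0]
# [2, 0, 0, 0]
```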


Code

class Solution(object):
    def minDistance(self, word1, word2):
        """
        :type word1: str
        :type word2: str
        :rtype: int
        """
        m = len(word1)
        n = len(word2)
        # Rows index prefixes of word2 and columns index prefixes of word1,
        # i.e. the transpose of the table described above. The answer is the
        # same because the edit distance is symmetric.
        dp = [[0 for __ in range(m + 1)] for __ in range(n + 1)]
        # Boundary values: converting to or from the empty string.
        for j in range(m + 1):
            dp[0][j] = j
        for i in range(n + 1):
            dp[i][0] = i
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                # Cost of the diagonal step: 0 if the characters match, else 1.
                onemore = 1 if word1[j - 1] != word2[i - 1] else 0
                dp[i][j] = min(dp[i - 1][j] + 1, dp[i][j - 1] + 1, dp[i - 1][j - 1] + onemore)
        return dp[n][m]
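The filled table also encodes one optimal sequence of edits, which can be read off by walking back from the bottom-right corner. This is a hedged extension, not part of the original solution: the function name `edit_script` and the tuple format of the operations are assumptions made for illustration, and rows here index prefixes of word1.

```python
def edit_script(word1, word2):
    """Return (distance, operations) by filling the table and walking it back."""
    m, n = len(word1), len(word2)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(m + 1):
        dp[i][0] = i
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            k = 0 if word1[i - 1] == word2[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + k, dp[i][j - 1] + 1, dp[i - 1][j] + 1)

    # Walk back from the bottom-right corner, taking any predecessor that
    # explains the current cell's value (diagonal first, then up, then left).
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            k = 0 if word1[i - 1] == word2[j - 1] else 1
            if dp[i][j] == dp[i - 1][j - 1] + k:
                if k:
                    ops.append(("replace", word1[i - 1], word2[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            ops.append(("delete", word1[i - 1]))
            i -= 1
        else:
            ops.append(("insert", word2[j - 1]))
            j -= 1
    ops.reverse()
    return dp[m][n], ops


distance, ops = edit_script("horse", "ros")
print(distance, len(ops))  # 3 3
```

Matching characters add no entry to the list, so the number of recorded operations always equals the distance. Several optimal scripts may exist; this sketch returns whichever one the tie-breaking order finds.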

Summary