Baozi Training Blog (包子IT面试培训博客): How to calculate “hard” runtime complexity in recursive solution?

包子IT面试培训助你拿到理想的offer!

在技术面试中，准确说出一个解法的runtime complexity（算法时间复杂度）是一个非常重要的点。考虑到对于算法时间复杂度的理解是CS领域的基础，因此这类问题，回答对了往往不加分，但是回答错误往往是致命的，因此大家不能掉以轻心。

Note: 本篇只讨论算法的时间复杂度，不涉及算法的空间复杂度。:)

对于基本的算法复杂度分析，Big O notation是必须要掌握的，详情请看wikipedia相关资料：http://en.wikipedia.org/wiki/Big_O_notation 。简单的说，Big O描述了当输入(input)的复杂度线性增长时，一个算法的计算时间复杂度的增长变化。举个例子，如果我们说一个算法的是时间复杂度是O(n)，那么意味着当该算法的输入线形增长时，其计算时间也成线性增长。

『简单的时间复杂度计算』

大部分非递归算法的时间复杂度计算比较简单，一句话简单表述，就是数有多少层嵌套循环即可，（每个循环的次数和算法的输入n相关）。比如一个问题描述『求n个int数中的最大数』，那么解法中需要用一次循环来遍历所有input（即n个数），然后返回结果。由于只需要一层循环，那么该算法时间复杂度为O(n).

『相对复杂的时间复杂度计算』

当一个算法中既包含循环(loop)又包涵递归(recursive)的时候，算时间复杂度可能需要动动脑筋。这部分也是本章重点。:)

用最近的一道高频题做例子：

Problem:

Given a string s and a dictionary of words dict, add spaces in s to construct a sentence where each word is a valid dictionary word. Return one such possible sentences.

s = "catsanddog", dict = ["cat", "cats", "and", "dog"], return “cats and dog"；

(Leetcode word break II 简化版)

Clarification

Q: What to return if s can’t be splitted into words in dict? A: Return NULL;

Q: What if there are multiple ways of splitting s? A: Return one split solution is fine;

Q: What if S is null/empty string? A: Return NULL;

So on and so forth..

Code in first pass:

public String wordBreak(String s, Set<String> dict) {

if (s == null || s.isEmpty()) {

return s;

}

List<String> solution = new ArrayList<String>();

boolean ret = wordBreakHelper(s, dict, solution);

if (!ret) {

return null;

}

//from apache commons lang

return StringUtils.join(solution, " ");

}

private boolean wordBreakHelper(String s, Set<String> dict, List<String> solution) {

if (dict.contains(s)) {

solution.add(s);

return true;

}

for(int index = 1; index < s.length(); ++index) {

String left = s.substring(0, index);

if (dict.contains(left)) {

solution.add(left);

boolean ret = wordBreakHelper(s.substring(index), dict, solution);

if (ret) {

return ret;

} else {

solution.remove(solution.size() - 1);

}

return false;

}

这样的解法看起来一目了然，对于每一个字符串，分成左右子字符串，查左子串、递归右子串（想想为什么不需要递归左子串）。

那么这个算法的时间复杂度是多少呢？

这个复杂度的计算并非那么容易一眼看出，因为解法中的循环内有递归，而每一个递归中又存在循环。所以数嵌套循环数的方法并不好用。因此我们拿一个具体的例子来看：

假设字符串s是abcde; 那么算法变化如下：

当index = 1, 字符串分为左串a和右串bcde，如果a不在字典中，那么这个拆分不成立，情况简化；如果a恰好在字典中，那么算法有两种选择，一种选择是将a放入结果集，然后递归bcde，另一种是不选择a作为完整单词，而是作为单词的一部分去试探下一个index。

当index = 2, 由于index = 1时产生了两种情况：a bcde 和 abcde。

对于a bcde的情况，由于算法递归bcde，和index = 1的情况类似，如果b在字典中，那么会演化出两种情况：a b cde; a bcde;

对于abcde的情况，如果ab在字典中，那么会演化出两种情况：ab cde; abcde;

写到这里，大家应该看得出来，对于每一个index上的字母，是否选择该字母作为单词结束的决定会产生两种可能，而且是层层叠加。因此该算法的复杂度就是2的n次方。

Code in second pass:

上面的写法虽然思路简单，但是不可避免的进行了很多重复运算。比如cde这个单词至少会被a b cde 和ab cde这两种拆分方法共同计算，如果我们能利用一个额外的集合记录之前的计算结果，那么算法的复杂度会大大下降。

修改代码：

private boolean wordBreakHelper(String s, Set<String> dict, Set<String> unsplitableSet, List<String> solution) {

if (dict.contains(s)) {

solution.add(s);

return true;

}

if (unsplitableSet.contains(s)) {

return false;

}

for(int index = 1; index < s.length(); ++index) {

String left = s.substring(0, index);

if (dict.contains(left)) {

solution.add(left);

boolean ret = wordBreakHelper(s.substring(index), dict, unsplitableSet, solution);

if (ret) {

return ret;

} else {

solution.remove(solution.size() - 1);

}

unsplitableSet.add(s);

return false;

}

既然上面的算法避免了相同子串的重复计算，那么以上这个算法的时间复杂度是多少呢？

考虑到所有重复计算已经被排除，那么我们不防穷举所有可能的计算尝试，同样以abcde为例：

以a开头的尝试：a, ab, abc, abcd, abcde,

以b开头的尝试：b, bc, bcd, bcde,
以c开头的尝试：c, cd, cde,

…

显然所有的计算尝试是n平方的复杂度（n位字符串长度），考虑到所有的尝试只计算一次（算法中的unsplitableSet起到的作用），因此上面的时间复杂度位n的2次方。

Baozi Training Blog (包子IT面试培训博客)

Saturday, August 9, 2014

How to calculate “hard” runtime complexity in recursive solution?

包子IT面试培训助你拿到理想的offer!

No comments:

Post a Comment

Saturday, August 9, 2014

How to calculate “hard” runtime complexity in recursive solution?

包子IT面试培训 助你拿到理想的offer!

No comments:

Post a Comment

包子IT面试培训助你拿到理想的offer!