Profiling

Analyze the bottlenecks

1

Before running the program, what I focus on to optimize is codes related to String, malloc, and high time complexity. I notice the index operation in the program and my guess proves right.

A bottleneck can be definitely detected in CPU usage when the program runs without any optimizations. The System.String.IndexOf takes a large number of CPU time because the time complexity is O(nm), m means the number of replaced words, which is not a linear time complexity. Thus, the key is just to traverse file only once. It’s easier and cheaper than use IndexOf each time to replace words.

Before:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
String^ string_subst(String ^data, String ^pattern, String ^replacement) {
try {
int loc;
// find every occurrence of pattern:
for (loc = data->IndexOf(pattern, 0); loc >= 0;
loc = data->IndexOf(pattern, 0)) {
// replace word
data = replace_word(data, loc, pattern->Length, replacement);

}
//O(n^2)
return data;
} catch (Exception ^e) {
Console::WriteLine("Error in substitute ");
Console::WriteLine(e->ToString());
return data;
}
}

After:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
String^ string_subst(String ^data, String ^pattern, String ^replacement) {
try {
String^ temp;
do {
temp = data;
data = data->Replace(pattern, replacement);
} while (!data->Equals(temp));
return data;
} catch (Exception ^e) {
Console::WriteLine("Error in substitute ");
Console::WriteLine(e->ToString());
return data;
}
}

I use System.String.Replace to do that thing. It performs more than ten times faster than the original one.

优化后

I have read a funny blog on Microsoft Developer. It shows a small Lab to test three of case-insensitive methods, String.Replace, RegEx.Replace and StringBuilder.Replace. Which one is the fastest to replace words. It proves String.Replace is the best one. Thus, I choose to use that one not RegEx.Replace or StringBuilder.Replace.

String.Replace:

String.replace

StringBuilder.Replace:

StringBuilder

The result sounds quite good. The program performs 906/70=12.94 times faster than before.


本博客所有文章除特别声明外,均采用 CC BY-SA 4.0 协议 ,转载请注明出处!