Regex speed in Java


Some example wall-clock times for a large number of strings:

    .split("[^a-zA-Z]");   // .44 seconds
    .split("[^a-zA-Z]+");  // .47 seconds
    .split("\\b+");        // 2 seconds

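Are there any explanations for the dramatic increase? I can imagine the [^a-zA-Z] pattern being done in the processor as a set of four compare operations, all four of which happen only in the true case. But what about \b? Does anybody have anything to weigh in on that?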
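For anyone who wants to reproduce timings like the ones above, here is a rough sketch of a harness. The class name, the helper `timeSplit`, and the sample strings are all made up for illustration, since the question does not show its inputs or how it measured. It precompiles each pattern and does a warm-up pass, but it is still a naive wall-clock measurement, so treat the numbers as indicative only.

```java
import java.util.regex.Pattern;

class SplitTiming {
    // Splits every input with one precompiled pattern and returns the elapsed
    // nanoseconds. The token count is accumulated and checked so the work
    // cannot be optimized away.
    static long timeSplit(String regex, String[] inputs, int rounds) {
        Pattern pattern = Pattern.compile(regex);
        long tokens = 0;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (String input : inputs) {
                tokens += pattern.split(input).length;
            }
        }
        long elapsed = System.nanoTime() - start;
        if (tokens < 0) {
            throw new IllegalStateException("token count overflowed");
        }
        return elapsed;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; the original inputs are not shown.
        String[] inputs = {
            "The quick brown fox jumps over the lazy dog",
            "Regex speed in Java, measured 3 different ways!",
            "split on non-letters, or on word boundaries?"
        };
        String[] regexes = { "[^a-zA-Z]", "[^a-zA-Z]+", "\\b+" };
        for (String regex : regexes) {
            timeSplit(regex, inputs, 10_000);               // warm-up pass
            long nanos = timeSplit(regex, inputs, 100_000); // measured pass
            System.out.printf("%-12s %8.1f ms%n", regex, nanos / 1e6);
        }
    }
}
```

Absolute numbers will differ by machine and JVM version; only the relative ordering of the three patterns is of interest.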

First, it makes no sense to split on one or more zero-width assertions! Java's regex engine is not very clever (and I'm being charitable) about sane optimizations.
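To illustrate why the quantifier is pointless: a zero-width assertion consumes no characters, so repeating it matches at exactly the same positions. The sketch below (the class and method names are invented for this example) splits the same string with `\b` and `\b+` and compares the results.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class BoundarySplit {
    // Compiled once and reused. String.split(regex) compiles the regex on
    // each call, apart from a fast path for some single-character separators.
    static final Pattern BOUNDARY = Pattern.compile("\\b");
    static final Pattern BOUNDARY_PLUS = Pattern.compile("\\b+");

    static String[] splitOn(Pattern pattern, String input) {
        return pattern.split(input);
    }

    public static void main(String[] args) {
        String input = "foo bar, baz";
        String[] single = splitOn(BOUNDARY, input);
        String[] repeated = splitOn(BOUNDARY_PLUS, input);
        System.out.println(Arrays.toString(single));
        // Same tokens either way; the "+" only adds work for the engine.
        System.out.println(Arrays.equals(single, repeated));
    }
}
```

If the two arrays are equal, as expected, then `\b+` buys nothing over `\b`, and any extra time is spent by the engine looping over an assertion that cannot advance.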

Second, never use \b in Java: it is messed up, and out of sync with \w.

For a more complete explanation of this, including how to make it work with Unicode, see this answer.
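One way to sidestep the mismatch is to spell out the boundary in terms of \w itself, using lookarounds, so the two cannot disagree. The sketch below is a simplified illustration of that idea, not the pattern from the linked answer (which is not reproduced here). The class name and helper are hypothetical, and it assumes Java 7 or later for `Pattern.UNICODE_CHARACTER_CLASS`.

```java
import java.util.regex.Pattern;

class WordBoundary {
    // A boundary expressed through \w: a position where exactly one side
    // is a word character. Because it reuses \w, it stays in sync with it.
    static final String BOUNDARY = "(?:(?<=\\w)(?!\\w)|(?<!\\w)(?=\\w))";

    // UNICODE_CHARACTER_CLASS makes \w follow Unicode rather than ASCII only.
    static final Pattern UNICODE_BOUNDARY =
        Pattern.compile(BOUNDARY, Pattern.UNICODE_CHARACTER_CLASS);

    static String[] splitWords(String input) {
        return UNICODE_BOUNDARY.split(input);
    }

    public static void main(String[] args) {
        // Accented letters written as escapes to avoid source-encoding issues.
        for (String piece : splitWords("na\u00EFve caf\u00E9, r\u00E9sum\u00E9")) {
            System.out.println("[" + piece + "]");
        }
    }
}
```

With the flag set, accented letters count as word characters, so words such as "café" should come out as single tokens instead of being cut at the accented letter.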

