Regex speed in Java


Some example wall-clock times for a large number of strings:

    .split("[^a-zA-Z]");   // 0.44 seconds
    .split("[^a-zA-Z]+");  // 0.47 seconds
    .split("\\b+");        // 2 seconds
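For anyone who wants to reproduce numbers like these, here is a minimal sketch of a timing loop. The class name `SplitTiming`, the helper `timeSplit`, the `sink` field, and the sample sentences are all made up for illustration; the question does not say how its timings were taken, and a proper benchmark harness would give more trustworthy figures.

```java
import java.util.ArrayList;
import java.util.List;

class SplitTiming {
    // Holds the last token count so the JIT cannot discard the loop as dead code.
    static volatile long sink;

    // Hypothetical helper: splits every input with the given regex and
    // returns the elapsed wall-clock time in nanoseconds.
    static long timeSplit(String regex, List<String> inputs) {
        long tokens = 0;
        long start = System.nanoTime();
        for (String s : inputs) {
            tokens += s.split(regex).length;
        }
        long elapsed = System.nanoTime() - start;
        sink = tokens;
        return elapsed;
    }

    public static void main(String[] args) {
        // Made-up sample data; the original input strings are not known.
        List<String> inputs = new ArrayList<>();
        for (int i = 0; i < 200_000; i++) {
            inputs.add("The quick brown fox, number " + i + ", jumps over the lazy dog.");
        }
        String[] regexes = {"[^a-zA-Z]", "[^a-zA-Z]+", "\\b+"};
        for (String regex : regexes) {
            timeSplit(regex, inputs);               // warm-up pass
            long nanos = timeSplit(regex, inputs);  // measured pass
            System.out.printf("%-12s %.3f s%n", regex, nanos / 1e9);
        }
    }
}
```

The absolute numbers will depend on the machine, the JVM, and the input, so only the relative differences between patterns are worth comparing.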

Any explanation for the dramatic increase? I can imagine the [^a-zA-Z] pattern being done in the processor as a set of 4 compare operations, all 4 of which happen only in the true case. What about \b? Does anybody have anything to weigh in on that?

First, it makes no sense to split on 1 or more zero-width assertions! Java’s regex engine is not very clever (and I’m being charitable) about sane optimizations.
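To see why the quantifier adds nothing, here is a small sketch comparing the patterns on a single string. The class name `BoundarySplitDemo`, the `split` helper, and the sample input are made up for illustration. A zero-width assertion consumes no characters, so repeating it one or more times should match at exactly the same positions as matching it once.

```java
import java.util.Arrays;

class BoundarySplitDemo {
    // Thin wrapper around String.split so the patterns can be compared side by side.
    static String[] split(String regex, String input) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String input = "hello, world";  // made-up ASCII sample
        String[] regexes = {"[^a-zA-Z]", "[^a-zA-Z]+", "\\b", "\\b+"};
        for (String regex : regexes) {
            System.out.println(regex + " -> " + Arrays.toString(split(regex, input)));
        }
    }
}
```

Note also that the \b versions are not doing the same job as the character-class versions: splitting on a boundary consumes nothing, so the punctuation and whitespace between words come back as tokens of their own instead of being discarded.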

Second, never use \b in Java: it is messed up, and out of sync with \w.

For a more complete explanation of this, and how to make it work with Unicode, see this answer.
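The linked answer is not reproduced here, so the following is only a sketch of one option that exists in the standard library, not necessarily what that answer recommends. By default \w in Java matches only ASCII letters, digits, and underscore; compiling with the `Pattern.UNICODE_CHARACTER_CLASS` flag (available since Java 7) makes \w and \b both use Unicode definitions. The class name `UnicodeWordDemo` and the `isWord` helper are made up for illustration.

```java
import java.util.regex.Pattern;

class UnicodeWordDemo {
    // A French word with two accented letters, written with Unicode escapes
    // so the source file's encoding does not matter.
    static final String WORD = "\u00e9l\u00e8ve";

    // Returns true if the whole string is a single \w+ run delimited by word boundaries.
    static boolean isWord(String s, int flags) {
        return Pattern.compile("\\b\\w+\\b", flags).matcher(s).matches();
    }

    public static void main(String[] args) {
        // Default flags: \w is ASCII-only, so the accented letters are not word characters.
        System.out.println("default: " + isWord(WORD, 0));
        // With the flag, \w and \b agree on what counts as a word character.
        System.out.println("unicode: " + isWord(WORD, Pattern.UNICODE_CHARACTER_CLASS));
    }
}
```

The same flag can be switched on inside the pattern itself with the embedded `(?U)` expression, which is handy when the regex is passed as a plain string to a method such as `String.split`.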

