Faster ways of finding a character inside a string, in PureBasic.

Posted on March 3, 2008

For those of you who prototype or work with PureBasic on a daily basis: if you have ever found yourself looking for faster ways to manipulate strings while still using the string system this language provides, be glad, for I’ll be posting a few of my solutions to speed things up.

My first entry is the FindChar() routine. Unlike the official FindString(), this one only searches for a single character. In situations where you are searching for a single character rather than a multi-character string, this routine will give you up to a 2x speed increase in both Ascii and Unicode modes. In the worst case, it runs at the same speed as FindString().

The FindChar() code can be found here.

Because Unicode characters are 2 bytes long, we must use SizeOf() for certain displacement operations. However, this implies at least one division to get our final result, which slows things down in the long run. So instead we hack our way into a nastier method and thank the FPU for it: we multiply by 0.5, since we know the character size is constant at 2. As nasty as it is, this lets us outperform the official routine by almost 2x, even in Unicode.
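To make the trade-off concrete, here is a small sketch of the conversion from a byte offset to a character index. It is not taken from the linked routine; the variable names and the sample offset are made up for illustration, and the bit shift is an extra alternative that the routine itself is not claimed to use.

```purebasic
; Sketch only: turning a byte offset into a character index.
; ByteOffset and the other names are hypothetical.
Define ByteOffset.i = 12   ; distance from the start of the string, in bytes

; Generic form: works in Ascii and Unicode, but costs a division.
Define ByDivision.i = ByteOffset / SizeOf(Character)
Debug ByDivision

CompilerIf #PB_Compiler_Unicode
  ; Unicode only: the character size is fixed at 2, so halve the offset.
  Define ByMultiply.i = ByteOffset * 0.5   ; the multiply-by-0.5 trick
  Define ByShift.i = ByteOffset >> 1       ; integer-only alternative (assumption)
  Debug ByMultiply   ; 6
  Debug ByShift      ; 6
CompilerEndIf
```

Byte offsets inside a Unicode string are always even, so all three forms give the same index there.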

There are certain things to take care of before doing anything with my routines: make sure you pass a valid string pointer, never pass a negative displacement, and never displace past the end of the string. Any self-respecting coder will check this before calling any routine, though. If you want an extra push of speed, get rid of the main IF statement in the routine.
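For readers who want to see the general shape of such a routine, here is a minimal sketch. It is not the linked FindChar() source: the name FindCharSketch, its parameter list, and the choice to have the caller pass the length are all assumptions made for illustration.

```purebasic
; Hypothetical sketch, NOT the linked FindChar() source.
; *Text    : pointer to a null-terminated string (for example @MyString$)
; Length   : string length in characters, supplied by the caller (assumption)
; StartPos : 0-based displacement at which the search begins
; Returns the 1-based position of Char, or 0 if not found, like FindString().
Procedure.i FindCharSketch(*Text.Character, Length.i, Char.c, StartPos.i = 0)
  Protected *Cursor.Character

  ; The "main IF": valid pointer, no negative displacement, no overrun.
  If *Text And StartPos >= 0 And StartPos < Length
    *Cursor = *Text + StartPos * SizeOf(Character)
    While *Cursor\c
      If *Cursor\c = Char
        ProcedureReturn (*Cursor - *Text) / SizeOf(Character) + 1
      EndIf
      *Cursor + SizeOf(Character)
    Wend
  EndIf

  ProcedureReturn 0
EndProcedure

Define Sample$ = "PureBasic"
Debug FindCharSketch(@Sample$, Len(Sample$), 'B')      ; 5
Debug FindCharSketch(@Sample$, Len(Sample$), 'B', 5)   ; 0, the search starts after the "B"
Debug FindCharSketch(@Sample$, Len(Sample$), 'x')      ; 0
```

Dropping the guard removes three comparisons per call, but then a null pointer or an out-of-range StartPos will read memory that does not belong to the string.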

The ideal solution for this language would be to trash the current string library and replace it with a proper one, but I fear this is a task for the author, not us. I also fear we won’t be seeing this any time soon. So we might as well keep on sharing our solutions and “hope” the author reads a few books on proper software development, and that he understands them.

For those of you using Unicode, please be aware that this routine has not been extensively tested at production level. Even though it behaves the same as FindString(), I recommend that you run at least a few tests before plugging it straight into your main code. It’s worth noting, though, that most of my string routines take 0-based displacements but return the same results as the official PB ones.

One of these days I’ll probably end up ranting quite badly about the fact that this sort of language lacks standard support for some really important routines… On the bright side, such languages provide a quick means of prototyping small pieces of code before getting into production with your main language. Even so, there’s no escaping the fact that the author lacks the real-world development experience to provide better solutions for his own product.


That’s it for now. I’ll be sharing more as I find time to post here. Most of the routines are being used in my in-house XML parser, just so you know. All benchmarks were done on multiple processors, and results were averaged over a set of 20 tries per test.
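If you want to run a comparable measurement yourself, a harness along these lines is one way to do it. This is only a sketch and not the setup used for the numbers above: the iteration count and sample text are arbitrary assumptions, and it times the official FindString() so that it compiles on its own. Swap in the routine you want to compare, and run it with the debugger disabled.

```purebasic
; Sketch of a timing harness: 20 tries, averaged.
; #Iterations and Text$ are arbitrary choices for illustration.
#Tries = 20
#Iterations = 1000000

Define Text$ = "the quick brown fox jumps over the lazy dog"
Define Try.i, i.i, Found.i, Started.i, Total.i

For Try = 1 To #Tries
  Started = ElapsedMilliseconds()
  For i = 1 To #Iterations
    Found = FindString(Text$, "z", 1)   ; replace with the routine under test
  Next
  Total + (ElapsedMilliseconds() - Started)
Next

MessageRequester("FindString()", "Average per try: " + Str(Total / #Tries) + " ms")
```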

