Fast way for extracting tokens from a string.


Re: Fast way for extracting tokens from a string. Nicholas Paldino [.NET/C# MVP]
3/31/2005 2:57:44 PM
Jensen,

I would just use the Split method on the string itself, passing a
semicolon. It's probably going to give you the fastest performance.
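
As a rough illustration of the Split suggestion, here is a minimal sketch. The class and method names (SplitTokens, Parse) are made up for this example, and it assumes every token is a plain non-negative integer that fits in an int.

```csharp
using System;

static class SplitTokens
{
    // Splits "1;32;100;" style input on ';' and parses each piece as an int.
    // Assumption: every token is a valid integer; int.Parse throws otherwise.
    public static int[] Parse(string s)
    {
        // RemoveEmptyEntries drops the empty string produced by the trailing ';'.
        string[] parts = s.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
        int[] values = new int[parts.Length];
        for (int i = 0; i < parts.Length; i++)
        {
            values[i] = int.Parse(parts[i]); // leading zeros are accepted: "09" gives 9
        }
        return values;
    }

    static void Main()
    {
        int[] values = Parse("1;32;100;32;09;76;");
        Console.WriteLine(string.Join(",", values)); // prints 1,32,100,32,9,76
    }
}
```

Note that Split allocates one string per token before any parsing happens, which matters for lists with thousands of items.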

However, I don't know that you should use a string at all. If the
string is rather large, you should be parsing it apart as you are retrieving
the information. For example, if the string was in a file, or being read
over a stream, I would parse it out as I read the characters from the
stream, not once the string was constructed.
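
One way this could look is sketched below. It is only an illustration: StreamTokens and Parse are hypothetical names, a StringReader stands in for the real file or network stream, and the code assumes non-negative integers that fit in an int (there is no overflow check).

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class StreamTokens
{
    // Reads characters one at a time and builds each number directly,
    // so the full input never has to exist as a single string.
    public static List<int> Parse(TextReader reader)
    {
        var values = new List<int>();
        int current = 0;
        bool hasDigits = false;
        int c;
        while ((c = reader.Read()) != -1)
        {
            char ch = (char)c;
            if (ch >= '0' && ch <= '9')
            {
                current = current * 10 + (ch - '0');
                hasDigits = true;
            }
            else if (ch == ';')
            {
                // A semicolon ends the current token; empty tokens are skipped.
                if (hasDigits) values.Add(current);
                current = 0;
                hasDigits = false;
            }
            // Assumption: any other character (such as whitespace) is ignored.
        }
        if (hasDigits) values.Add(current); // last token without a trailing ';'
        return values;
    }

    static void Main()
    {
        // StringReader is a stand-in for a StreamReader over a file or socket.
        using (var reader = new StringReader("1;32;100;32;09;76;"))
        {
            Console.WriteLine(string.Join(",", Parse(reader))); // prints 1,32,100,32,9,76
        }
    }
}
```

Because no intermediate strings are created, the only allocation that grows with the input is the result list.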

If you got it from someplace else, like a database field (where it is in
string format already) and you can't do anything about it being in a string,
use the Split method.

Hope this helps.


--
- Nicholas Paldino [.NET/C# MVP]
- mvp@spam.guard.caspershouse.com


Fast way for extracting tokens from a string. Jensen bredal
3/31/2005 9:36:56 PM

Hello,
I have a string formatted in the following way:

s = 1;32;100;32;09;.........;09;76;

I need to extract the numbers separated by the semicolons.
The list can contain several thousands of items and the code is
time critical.

How can I best extract them in C#?

Many Thanks in advance

JB

Re: Fast way for extracting tokens from a string. Jon Shemitz
3/31/2005 10:23:34 PM

That would not be my expectation. Split is going to create "several
thousands of" little strings, greatly increasing memory pressure. This
is a classic case where enumerating through Regex.Matches should be
faster than creating an array of strings. Plus a regex could ignore
white space, check the tokens are all digits, &c.

--

Re: Fast way for extracting tokens from a string. Jensen bredal
3/31/2005 10:59:54 PM
Many thanks...
The string comes from a database.



Re: Fast way for extracting tokens from a string. Jon Shemitz
4/1/2005 5:38:09 PM

<semicode> Your sample string

s = 1;32;100;32;09;.........;09;76;

contains a stream of digits followed by semicolons. The regex

(\d+);

will match a stream of digits followed by semicolons, capturing the
digits. This capture is only a matter of calculating string length and
start offset - no substring operations are done until you read a Value
property. So, do a foreach on Regex.Matches() of your data and the
@"(\d+);" regex pattern. Read the 2nd group in each match (Groups[1];
Groups[0] is the whole match) - that's got the captured digits, without
the semicolon.

</semicode>

--

Re: Fast way for extracting tokens from a string. Jensen bredal
4/2/2005 12:56:48 AM
Could you provide some sample code?
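
A minimal sketch of the Regex.Matches approach described earlier in the thread might look like the following. The class and method names (RegexTokens, Parse) are invented for this example, the pattern is the @"(\d+);" one from the post above, and the code assumes each captured token fits in an int.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class RegexTokens
{
    // One or more digits followed by a semicolon; the digits are captured.
    static readonly Regex Token = new Regex(@"(\d+);", RegexOptions.Compiled);

    public static List<int> Parse(string s)
    {
        var values = new List<int>();
        foreach (Match m in Token.Matches(s))
        {
            // Groups[0] is the whole match ("32;"); Groups[1] holds only the digits.
            values.Add(int.Parse(m.Groups[1].Value));
        }
        return values;
    }

    static void Main()
    {
        List<int> values = Parse("1;32;100;32;09;76;");
        Console.WriteLine(string.Join(",", values)); // prints 1,32,100,32,9,76
    }
}
```

Reading m.Groups[1].Value still creates a string per token, so whether this beats Split for a given input is something to measure rather than assume. A final token with no trailing semicolon would not match this pattern.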
