
c# : Regex question


Chris R. Timmons
4/1/2004 9:26:37 PM
"Du Dang" <vietquest@hotmail.com> wrote in
news:ZW5bc.14231$kc2.299185@nnrp1.uunet.ca:

Du,

You can use the "." character to match a newline if you use the
RegexOptions.Singleline option.

Try this:


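// Requires: using System; using System.Text.RegularExpressions;
string inputText = @"
<script1>
***stuff A
</script1>

***more stuff

<script2>
***stuff B
</script2>";

// .*? is a non-greedy match, so each <scriptN> block is matched separately.
string regex = @"<script\d>(?<contents>.*?)</script\d>";

// Singleline lets "." match newline characters as well.
MatchCollection mc = Regex.Matches(inputText, regex,
    RegexOptions.Singleline |
    RegexOptions.IgnoreCase |
    RegexOptions.IgnorePatternWhitespace);

foreach (Match m in mc)
    Console.WriteLine(m.Groups["contents"].Value);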
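To see what RegexOptions.Singleline changes, here is a minimal sketch that runs the same pattern with and without the option. The class name SinglelineDemo and the helper CountMatches are made up for this illustration, and the input is a shortened version of the sample text.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SinglelineDemo
{
    // Hypothetical helper: counts matches of the pattern from the post
    // under the given options.
    public static int CountMatches(string input, RegexOptions options)
    {
        string pattern = @"<script\d>(?<contents>.*?)</script\d>";
        return Regex.Matches(input, pattern, options).Count;
    }

    public static void Main()
    {
        string input =
            "<script1>\nstuff A\n</script1>\n<script2>\nstuff B\n</script2>";

        // Without Singleline, "." does not match "\n", so the pattern can
        // never get from an opening tag to its closing tag.
        Console.WriteLine(CountMatches(input, RegexOptions.None));       // 0

        // With Singleline, "." matches "\n" too, so both blocks are found.
        Console.WriteLine(CountMatches(input, RegexOptions.Singleline)); // 2
    }
}
```

This is also why [\s\S] works without any option: that character class matches newlines regardless of the Singleline setting.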


Hope this helps.

Chris.
-------------
C.R. Timmons Consulting, Inc.

Du Dang
4/1/2004 11:27:33 PM
Text:
=====================
<script1>
***stuff A
</script1>

***more stuff

<script2>
***stuff B
</script2>

=====================

Regex:
<script\d>[\s\S]+</script\d>

I use "[\s\S]" instead of "." because there are newline characters within the text.


The regex above gives me a single match running from <script1> to </script2>
instead of two separate matches.

How do I extract <script1> ... </script1> and <script2> ... </script2> as
separate matches?

Thanks,

Du


Brian Davis
4/2/2004 4:26:01 PM

In addition, you can use a named backreference to make sure that you don't
match anything like "<script1>....</script2>":

<script(?<num>\d+)>(?<contents>.*?)</script\k<num>>
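As a rough sketch of how that pattern behaves, the snippet below runs it over an input with one mismatched pair and one matching pair. The class name BackreferenceDemo, the helper ExtractContents, and the sample input are invented for this illustration; RegexOptions.Singleline is assumed so that "." can cross newlines, as in Chris's example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BackreferenceDemo
{
    // Hypothetical helper: returns the contents of every <scriptN>...</scriptN>
    // block whose opening and closing numbers agree.
    public static List<string> ExtractContents(string input)
    {
        // \k<num> must match the same digits that the "num" group captured.
        string pattern = @"<script(?<num>\d+)>(?<contents>.*?)</script\k<num>>";
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern,
                     RegexOptions.Singleline | RegexOptions.IgnoreCase))
        {
            results.Add(m.Groups["contents"].Value);
        }
        return results;
    }

    public static void Main()
    {
        // The first pair is mismatched (1 vs 2), the second pair agrees (3 vs 3).
        string input = "<script1>A</script2> <script3>B</script3>";
        foreach (string s in ExtractContents(input))
            Console.WriteLine(s); // prints "B" only
    }
}
```

With the simpler <script\d> ... </script\d> pattern, the same input would also yield "A", because any closing number is accepted there.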


Brian Davis
http://www.knowdotnet.com



Chris R. Timmons
4/2/2004 9:30:04 PM
"Du Dang" <vietquest@hotmail.com> wrote in
news:1krbc.14882$kc2.305674@nnrp1.uunet.ca:

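Quantifiers like + and * are "greedy": they match as many characters
as they can. A question mark placed directly after a quantifier makes
it non-greedy, so it matches the minimum number of characters required
for a successful match. In (?<contents>.*?), the first question mark
is part of the named-group syntax; the second one makes the * non-greedy.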
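The difference is easy to check with a small test. The sketch below compares the greedy and non-greedy forms on a one-line version of the sample text; the class name GreedyLazyDemo and the helper CountMatches are made up for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyLazyDemo
{
    // Hypothetical helper: counts how many matches a pattern produces.
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern, RegexOptions.Singleline).Count;
    }

    public static void Main()
    {
        string input = "<script1>A</script1> mid <script2>B</script2>";

        // Greedy: .* runs to the end of the input and backtracks only as far
        // as the last closing tag, giving one match that spans both blocks.
        Console.WriteLine(CountMatches(input, @"<script\d>.*</script\d>"));  // 1

        // Non-greedy: .*? stops at the first closing tag it can reach,
        // giving one match per block.
        Console.WriteLine(CountMatches(input, @"<script\d>.*?</script\d>")); // 2
    }
}
```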

A utility like Expresso
(http://www12.brinkster.com/ultrapico/Expresso.htm) can help in
understanding how greedy and non-greedy quantifiers behave.

Hope this helps.

Chris.
-------------
C.R. Timmons Consulting, Inc.

Du Dang
4/2/2004 11:47:55 PM
Thanks Chris, it works like a charm.

//(?<contents>.*?)
One thing I don't understand: why is the second question mark there?
My understanding is that a named group is written (?<name_here>expression_here)

I tried removing the second question mark and the expression stopped working.

Thanks again for your help,

Du

Du Dang
4/2/2004 11:51:08 PM
Hi Brian, thanks for helping out!!!

Regards,

Du

Du Dang
4/3/2004 3:00:35 AM
I think that clears things up quite a bit.

Thank you so much for your help!!!

Regards,

Du

