"Brian Davis" <@> wrote in message
news:#fSZDiPGEHA.3772@TK2MSFTNGP12.phx.gbl...
>
> In addition, you can use a named backreference to make sure that you don't
> match anything like "<script1>....</script2>":
>
> <script(?<num>\d+)>(?<contents>.*?)</script\k<num>>
>
>
> Brian Davis
>
> http://www.knowdotnet.com
>
>
> "Chris R. Timmons" <crtimmons@X_NOSPAM_Xcrtimmonsinc.com> wrote in message
> news:Xns94BEEE7EA30EDcrtimmonscrtimmonsin@207.46.248.16...
> > "Du Dang" <vietquest@hotmail.com> wrote in
> > news:ZW5bc.14231$kc2.299185@nnrp1.uunet.ca:
> >
> > > Text:
> > >=====================
> > ><script1>
> > > ***stuff A
> > ></script1>
> > >
> > > ***more stuff
> > >
> > ><script2>
> > > ***stuff B
> > ></script2>
> > >
> > >=====================
> > >
> > > Regex:
> > ><script>[\s\S]+</script>
> > >
> > > I use "[\s\S]" instead of "." because there are newline characters
> > > within the text.
> > >
> > >
> > > The regex above will give me the match from <script1> to
> > > </script2> instead of two separate matches.
> > >
> > > How do I extract <script1> ... </script1> and <script2> ...
> > > </script2> as separate matches?
> >
> > Du,
> >
> > You can use the "." character to match a newline if you use the
> > RegexOptions.Singleline option.
> >
> > Try this:
> >
> >
> > string inputText = @"
> > <script1>
> > ***stuff A
> > </script1>
> >
> > ***more stuff
> >
> > <script2>
> > ***stuff B
> > </script2>";
> >
> > string regex = @"<script\d>(?<contents>.*?)</script\d>";
> >
> > MatchCollection mc = Regex.Matches(inputText, regex,
> >     RegexOptions.Singleline |
> >     RegexOptions.IgnoreCase |
> >     RegexOptions.IgnorePatternWhitespace);
> >
> > foreach (Match m in mc)
> >     Console.WriteLine(m.Groups["contents"].ToString());
> >
> >
> > Hope this helps.
> >
> > Chris.
> > -------------
> > C.R. Timmons Consulting, Inc.
> >
> > http://www.crtimmonsinc.com/
>
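
For anyone who wants to try both suggestions together, here is a minimal sketch that combines the named backreference with RegexOptions.Singleline. The class name ScriptBlockDemo, the helper ExtractBlocks, and the Trim() call are my own additions for illustration and do not come from either post; the sample text is the one from the original question.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names here are illustrative only.
static class ScriptBlockDemo
{
    // \k<num> must repeat whatever digits (?<num>\d+) captured in the opening tag.
    const string Pattern = @"<script(?<num>\d+)>(?<contents>.*?)</script\k<num>>";

    // Returns the trimmed contents of each correctly paired <scriptN>...</scriptN> block.
    public static List<string> ExtractBlocks(string input)
    {
        var results = new List<string>();
        // Singleline lets "." match newlines; the lazy ".*?" takes as little text
        // as possible before a closing tag whose number matches the opening tag.
        foreach (Match m in Regex.Matches(input, Pattern,
                     RegexOptions.Singleline | RegexOptions.IgnoreCase))
        {
            results.Add(m.Groups["contents"].Value.Trim());
        }
        return results;
    }

    static void Main()
    {
        string text = "<script1>\n***stuff A\n</script1>\n\n***more stuff\n\n" +
                      "<script2>\n***stuff B\n</script2>";
        foreach (string block in ExtractBlocks(text))
            Console.WriteLine(block);

        // A mismatched pair yields no match because \k<num> requires "1", not "2".
        Console.WriteLine(ExtractBlocks("<script1>oops</script2>").Count);
    }
}
```

With the sample text this should print the two blocks separately, followed by 0 for the mismatched pair. The simpler <script\d>...</script\d> pattern would accept that mismatched pair, which is the case the backreference is meant to rule out.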