Hi,
I have the following code that reads a web page:
C#
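// Dispose the response, the stream, and the reader even if an exception is thrown
using (WebResponse response = request.GetResponse())
using (Stream stream = response.GetResponseStream())
using (StreamReader sr = new StreamReader(stream))
{
    htmlpage = sr.ReadToEnd();
}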

Once I get the web page, I try to extract my website's URLs to make sure they are being forwarded correctly.
My problem is that some URLs come out fine, while others have extra code in front of and after the URL, for example:

(Javascrip:xyw('http://www.mysite.com/xyzpage.html')
I am trying to strip everything in front of and after the URL so that I end up with
http://www.mysite.com/xyzpage.html
I tried the following, which doesn't work at all:
C#
string value = Regex.Match(str, @"\((\w+)\)").Groups[1].Value;

Any idea how to write that regex? I am not good at regular expressions.
Posted
Updated 31-Jul-12 17:05pm
v2

1 solution

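If you know that the URLs must start with http://, why not make that part of your required match? Your regex is incredibly vague right now: it just matches a run of word characters between parentheses, and because a URL contains characters such as ':', '/' and '.', it never matches your input.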
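
As a rough sketch of that suggestion, something like the following anchors the match on the http:// prefix and stops at the first character that cannot reasonably belong to the URL. The class name UrlExtractor, the method ExtractUrls, and the choice of stop characters (whitespace, quotes, angle brackets, parentheses) are assumptions made for illustration, not something taken from the question; the pattern also accepts https://.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, named here only for illustration.
public static class UrlExtractor
{
    // Assumed pattern: "http://" or "https://", followed by a run of characters
    // that are not whitespace, quotes, angle brackets, or parentheses.
    private static readonly Regex UrlPattern =
        new Regex(@"https?://[^\s'""<>()]+", RegexOptions.IgnoreCase);

    // Returns every URL found in the input, without the surrounding wrapper text.
    public static List<string> ExtractUrls(string input)
    {
        var urls = new List<string>();
        foreach (Match m in UrlPattern.Matches(input))
        {
            urls.Add(m.Value);
        }
        return urls;
    }

    public static void Main()
    {
        // Sample string copied from the question.
        string sample = "(Javascrip:xyw('http://www.mysite.com/xyzpage.html')";
        foreach (string url in ExtractUrls(sample))
        {
            Console.WriteLine(url); // prints http://www.mysite.com/xyzpage.html
        }
    }
}
```

Note that because parentheses and quotes are treated as terminators, a URL that legitimately contains one of those characters would be cut short, so the character class may need adjusting for your pages.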
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


