Okay, so I've been racking my brain for hours trying to get the hang of this, but I simply can't, and I'm wondering if anyone here could help me out.

I'm trying to use HTML Agility Pack to extract just one tr from a table. Here is the full HTML of the table.

HTML
<table border="0" cellpadding="3" cellspacing="2" style="border-collapse: collapse" width="100%">
		<tbody><tr>
			<td width="50%" style="border-removed solid; border-top-width: 2"></td>
			<td width="50%" style="border-removed solid; border-top-width: 2"></td>
		</tr>

				<tr valign="top">
			<td>Country</td>
			<td>United Kingdom</td>
		</tr>
		
				<tr valign="top">
			<td>Contact detail</td>
			<td>08700746464<br>Vodafone Head Office<br>The Courtyard<br>2-4 London Road<br>Newbury<br>Berkshire<br>RG14 1JX</td>
		</tr>
		
				<tr valign="top">
			<td colspan="2" style="height: 0.5em"></td>
		</tr>

		<tr valign="top">
			<td style="border-bottom-style: solid; border-bottom-width: 1">Ofcom Data</td>
			<td></td>
		</tr>

		<tr valign="top">
			<td>Network</td>
			<td>Vodafone Uk Ltd</td>
		</tr>

		
		
				<tr>
			<td>Change date</td>
			<td>11-2012</td>
		</tr>
		
				
		
		
	</tbody></table>


But the only part I actually want to extract is this:

HTML
Vodafone Uk Ltd


This has been driving me crazy because I haven't used HTML Agility Pack before; I'm fairly new and looking to learn.

I'd appreciate it if anyone could give me a hand. I'm probably going about it all wrong, but I have spent many hours trying the different methods I found on Google.

What I have tried:

C#
HtmlAgilityPack.HtmlDocument document = htmlWeb.Load("URL");

// Targets a specific node
HtmlNode content_wrapper = document.GetElementbyId("table");

System.Console.WriteLine(table.ToString());
HtmlNodeCollection search_results = content_wrapper.SelectNodes("//tr[@class='']");
foreach (HtmlNode result in tr)
{
    string recipe_name = result.SelectSingleNode("*[@class='Network']").InnerText;// error appears here
    System.Console.WriteLine(Network);
}
System.Console.ReadKey();
Updated 10-May-16 17:25pm
Comments
George Jonsson 10-May-16 22:59pm    
What is the error you get?
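This code 'result.SelectSingleNode("*[@class='Network']")' probably returns null: no element in your table has a class attribute of 'Network' ("Network" is the text of a td, not a class), so reading .InnerText from the result fails.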
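
To illustrate the point in the comment above, here is a minimal sketch of guarding against a null result. It assumes the HtmlAgilityPack NuGet package is referenced; the class name `NullCheckSketch`, the helper `TextOrNull`, and the one-row sample table are hypothetical and only serve the illustration.

```csharp
using System;
using HtmlAgilityPack;

public static class NullCheckSketch
{
    // Hypothetical helper: returns the trimmed text of the first node
    // matching xpath (relative to scope), or null when nothing matches.
    public static string TextOrNull(HtmlNode scope, string xpath)
    {
        HtmlNode node = scope.SelectSingleNode(xpath);
        // SelectSingleNode returns null when there is no match, so
        // the null-conditional operator avoids a NullReferenceException.
        return node?.InnerText.Trim();
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        // Cut-down sample row modelled on the table in the question.
        doc.LoadHtml("<table><tr><td>Network</td><td>Vodafone Uk Ltd</td></tr></table>");
        HtmlNode row = doc.DocumentNode.SelectSingleNode("//tr");

        // No child of the row has class="Network", so this is null.
        string byClass = TextOrNull(row, "*[@class='Network']");
        Console.WriteLine(byClass ?? "(no match)");

        // Selecting by position does find a cell.
        Console.WriteLine(TextOrNull(row, "td[1]"));
    }
}
```

With the sample row, the first lookup prints "(no match)" and the second prints "Network", which suggests the selector, not the library, is what needs changing.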

1 solution

Try this, provided the HTML page contains only one table.

C#
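// SelectNodes returns null (not an empty collection) when nothing matches.
var tds = document.DocumentNode.SelectNodes("//table//tr//td");
if (tds != null)
{
    // Stop one short of the end so that tds[i + 1] always exists.
    for (int i = 0; i < tds.Count - 1; i++)
    {
        string name = tds[i].InnerText.Trim();
        if (name == "Network")
        {
            // The value sits in the cell right after the label cell.
            string value = tds[i + 1].InnerText.Trim();
            // MessageBox.Show(value);
            Console.WriteLine(value);
            break;
        }
    }
}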
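
As an alternative to looping over every cell, the same label-then-value lookup can probably be expressed in a single XPath query. This is only a sketch, assuming the HtmlAgilityPack NuGet package is referenced and that the value cell immediately follows the label cell in the same row; the class `LabelLookup` and method `FindValue` are hypothetical names, not part of the library.

```csharp
using System;
using HtmlAgilityPack;

public static class LabelLookup
{
    // Hypothetical helper: finds the td whose text equals label and
    // returns the trimmed text of the td that follows it in the same row.
    // Assumes label contains no apostrophe, since it is pasted into the XPath.
    public static string FindValue(HtmlDocument document, string label)
    {
        string xpath = "//td[normalize-space()='" + label + "']/following-sibling::td[1]";
        HtmlNode cell = document.DocumentNode.SelectSingleNode(xpath);
        // null means no label cell (or no following cell) was found.
        return cell == null ? null : cell.InnerText.Trim();
    }

    public static void Main()
    {
        var document = new HtmlDocument();
        // Two rows taken from the table in the question.
        document.LoadHtml(
            "<table><tbody>" +
            "<tr valign=\"top\"><td>Country</td><td>United Kingdom</td></tr>" +
            "<tr valign=\"top\"><td>Network</td><td>Vodafone Uk Ltd</td></tr>" +
            "</tbody></table>");

        Console.WriteLine(FindValue(document, "Network"));
    }
}
```

For a live page, the document would come from `htmlWeb.Load(...)` as in the question rather than from `LoadHtml`.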
Comments
Sneha 4 13-Jul-18 8:00am    
"provided the html page should contain only one table." The above code is able to return more than one table.
Karthik_Mahalingam 21-Aug-18 6:30am    
That means your page contains more than one table.
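
When a page does contain several tables, one possible approach is to restrict the query to the table that holds a known marker cell. The sketch below assumes the HtmlAgilityPack NuGet package is referenced and that the wanted table is the one containing the "Ofcom Data" cell from the question; the class `ScopedLookup`, the method `FindNetwork`, and the first sample table are hypothetical.

```csharp
using System;
using HtmlAgilityPack;

public static class ScopedLookup
{
    // Hypothetical helper: looks only inside tables that contain an
    // "Ofcom Data" cell, then returns the cell after the "Network" label.
    public static string FindNetwork(HtmlDocument document)
    {
        const string xpath =
            "//table[.//td[normalize-space()='Ofcom Data']]" +
            "//td[normalize-space()='Network']/following-sibling::td[1]";
        HtmlNode cell = document.DocumentNode.SelectSingleNode(xpath);
        return cell?.InnerText.Trim();
    }

    public static void Main()
    {
        var document = new HtmlDocument();
        // The first table is an invented decoy; the second mirrors the question.
        document.LoadHtml(
            "<table><tr><td>Network</td><td>Some other value</td></tr></table>" +
            "<table><tr><td>Ofcom Data</td><td></td></tr>" +
            "<tr><td>Network</td><td>Vodafone Uk Ltd</td></tr></table>");

        Console.WriteLine(FindNetwork(document));
    }
}
```

If the real page offers a more stable hook, such as an id on the table, scoping by that would likely be more robust than matching on cell text.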

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


