C#
// Replace returns a new string, so the result has to be assigned
string entry = bookEntry.Text.Replace(" ", "+");
// used on each read operation
byte[] buf = new byte[8192];
// prepare the web page we will be asking for
string url = null;

if (string.IsNullOrEmpty(isbnCode.Text))
{
   string urlTitle =
                    String.Format(
                    "http://www.abebooks.com/servlet/SearchResults?bi=0&bx=off&ds=30&recentlyadded=all&sortby=2&sts=t&tn={0}&x=76&y=18",
                    entry);
   url = urlTitle;
}
else if (string.IsNullOrEmpty(entry))
{
   string urlIsbn =
                    String.Format(
                    "http://www.abebooks.com/servlet/SearchResults?bi=0&bx=off&ds=30&isbn={0}&recentlyadded=all&sortby=2&sts=t&x=46&y=11",
                isbnCode.Text);
   url = urlIsbn;
}

// prepare request
CookieContainer cookies = new CookieContainer();
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
HttpWebResponse response = null;

// set header values
request.Method = "GET";
request.CookieContainer = cookies;

// get response data
response = (HttpWebResponse)request.GetResponse();
request = null;

string responseData = new StreamReader(response.GetResponseStream()).ReadToEnd();
response.Close();

// show results
Response.Write(responseData);

HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.OptionFixNestedTags = true;
var rootNode = htmlDoc.DocumentNode;
var allBookResults = rootNode.SelectNodes("//div[@class='result-detail']");
string bookPrice = null;
var dataNode = allBookResults.SelectSingleNode(".//div[@Class='item-price']");
var bookPriceNode = dataNode.SelectSingleNode(".//span1");
bookPrice = bookPriceNode.InnerText.Trim();


Hi all, happy new year.

My question is how to get the response data (the HTML returned for the generated search URL) into the HTML Agility Pack. The end goal is to read the price from the search results list.

The page source of the website looks like this:

XML
<div class="result-detail">
<span property ="url" content="http://www.abebooks.com/servlet/BookDetailsPL?bi=11698135711"></span>
<div class="item-price" typeof="Offer" property="offers">
    <span property="price" content="41.95"></span>
    <span property="priceCurrency" content="USD"></span>
    <span property="itemCondition" content="UsedCondition"></span>
    <span property ="availability" content="InStock"></span>

</div>


Thanks!

Updated 2-Jan-14 4:23am
Comments
Taha Akhtar 1-Jan-14 11:07am    
Try dataNode.SelectSingleNode("span[@property='price']"); instead of var bookPriceNode = dataNode.SelectSingleNode(".//span1");
slayasty 1-Jan-14 15:18pm    
I see. But the problem is that the book results variable is null (I'm guessing because the HTML Agility Pack is not using the generated page).
Taha Akhtar 2-Jan-14 6:46am    
If you are using the code as posted, it seems that you are not loading the HTML document. Try htmlDoc.Load(responseData);
slayasty 2-Jan-14 7:17am    
I tried that, but now when it tries to load responseData it says "illegal character in path".
Taha Akhtar 2-Jan-14 10:20am    
Use htmlDoc.LoadHtml(responseData);

Also make sure the response data is valid.

1 solution

Your first issue is the XPath you use to get the 'item-price' nodes. It has a capital 'C' in @Class, but the HTML from the site uses a lowercase 'c'. XPath is case-sensitive.
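
The case-sensitivity point can be checked without the site. The sketch below uses the standard System.Xml classes rather than the HTML Agility Pack, purely to stay dependency-free. The class name XPathCaseSketch and the one-line sample markup are made up for illustration.

```csharp
using System.Xml;

public static class XPathCaseSketch
{
    // Hypothetical helper: returns true when the XPath finds a node
    // in a tiny sample document modelled on the page source above.
    public static bool Matches(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root><div class=\"item-price\"><span property=\"price\" content=\"41.95\" /></div></root>");
        return doc.SelectSingleNode(xpath) != null;
    }
}
```

Calling XPathCaseSketch.Matches("//div[@class='item-price']") finds the div, while the same query written with @Class finds nothing.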

The second issue is that you never load any HTML into your htmlDoc variable. You create a new instance of it, but the response data is never passed to it.
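
As a rough illustration of the missing step: LoadHtml parses markup that is already in a string, whereas Load treats a string argument as a file path, which would explain the "illegal character in path" error mentioned in the comments. The helper name HtmlLoadSketch.CountResults is hypothetical.

```csharp
using HtmlAgilityPack;

public static class HtmlLoadSketch
{
    // Hypothetical helper: parses an HTML string and counts the result blocks in it.
    public static int CountResults(string html)
    {
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string; Load(string) would expect a file path

        HtmlNodeCollection results = doc.DocumentNode.SelectNodes("//div[@class='result-detail']");

        // SelectNodes returns null, not an empty collection, when nothing matches
        return results == null ? 0 : results.Count;
    }
}
```

The null check matters here: a page with no matching div would otherwise cause a NullReferenceException as soon as the collection is used.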

I reworked your code. I would recommend not using the var keyword unless you are using it with LINQ; you should declare the actual type of your variables.

Here's your code refactored. I have tested this and it works.
C#
string bookPrice = string.Empty;
HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument();
htmlDoc.OptionFixNestedTags = true;
htmlDoc.LoadHtml(responseData); // parse the HTML string returned by the request
HtmlAgilityPack.HtmlNode rootNode = htmlDoc.DocumentNode;
// SelectNodes returns null (not an empty collection) when nothing matches
HtmlAgilityPack.HtmlNodeCollection allBookResults = rootNode.SelectNodes("//div[@class='result-detail']");

if (allBookResults != null)
{
    foreach (HtmlAgilityPack.HtmlNode node in allBookResults)
    {
        // the leading "." keeps the search inside the current result node
        HtmlAgilityPack.HtmlNode bookPriceNode = node.SelectSingleNode(".//div[@class='item-price']//span[@property='price']");
        if (bookPriceNode != null)
        {
            bookPrice = bookPriceNode.GetAttributeValue("content", string.Empty);
            break; // keep the price of the first listed result
        }
    }
}
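
One detail worth illustrating: when SelectSingleNode is called on a child node, an XPath beginning with // is still evaluated from the document root, so a per-result lookup needs a leading dot (.//). The sketch below is a minimal example rather than a drop-in replacement; it collects the price of every result that way. PriceSketch and ExtractPrices are names made up for illustration, and the markup it expects is the structure shown in the question.

```csharp
using System.Collections.Generic;
using HtmlAgilityPack;

public static class PriceSketch
{
    // Hypothetical helper: returns the "content" value of the price span
    // for each result block found in the HTML string.
    public static List<string> ExtractPrices(string html)
    {
        List<string> prices = new List<string>();
        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNodeCollection results = doc.DocumentNode.SelectNodes("//div[@class='result-detail']");
        if (results == null)
        {
            return prices; // no result blocks on the page
        }

        foreach (HtmlNode result in results)
        {
            // ".//" searches below this result only; "//" would restart at the document root
            HtmlNode priceNode = result.SelectSingleNode(".//div[@class='item-price']//span[@property='price']");
            if (priceNode != null)
            {
                prices.Add(priceNode.GetAttributeValue("content", string.Empty));
            }
        }

        return prices;
    }
}
```

Passing responseData to PriceSketch.ExtractPrices would give one string per listed book, which can then be converted with decimal.Parse and CultureInfo.InvariantCulture if numeric prices are needed.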
 
Comments
slayasty 2-Jan-14 17:43pm    
Mmm I understand better now. Thanks for the help!
idenizeni 2-Jan-14 19:08pm    
You're welcome :)

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


