Goodreads API Tutorial

Jesse G. Winston

Oct 7, 2012

CPOL

13 min read

How to query an API using Goodreads.com as an example.

Download source - 44.4 KB

Introduction

In this article we'll go through the process of setting up a web page that can query information using another site's API. As an example, we'll use the API provided by Goodreads, and our goal will be to display general information about a book when we provide an author and book title. To accomplish this, we'll use JavaScript with a little jQuery and an HTTP Handler written in C# as the back end, and we'll discuss converting XML to JSON so that our JavaScript can quickly read the data we query. Finally, we'll cache the data in order to reduce server load.

Before We Begin

Typically, in order to use an API, you'll need a developer key. This is a unique key tied to your account. It can be banned if abused, so it is important to read the terms of use associated with an API before you implement it on your site. To get a developer key for Goodreads, and to see the different methods available, visit http://www.goodreads.com/api. The particular method I'll be calling in this article is book.title. Once you have your developer key, Goodreads provides a sample URL where you can see an example of the XML that we'll get back, and the information therein. Notice the formatting of the URL; it is the pattern we'll need to follow as well.

We'll also be using a little jQuery, so we'll need to download the jQuery JavaScript file; it can be found at http://jquery.com/download/. I recommend the minified version for this exercise. Once you download the file, you can drag and drop it into your solution window, though I recommend putting it in its own folder inside your solution called "scripts".

Creating a Client Control

First, let's create a simple test page to display our data. All this page needs is some basic HTML and a call to a JavaScript function that will display the data.

<body>
    <form id="form1" runat="server">
    <h2>Get Book Information</h2>
    Author: <input type="text" id="authorTextbox" value="Patrick Rothfuss" /> <br />
    Title: <input type="text" id="titleTextbox" value="The Name of the Wind" /> <br />
    <input type="button" value="Get Book Information" id="getButton" onclick='getBookInformation()' />
    <br /><br />
    <div id="DataContainer" ></div>
    </form>
</body>

I've added a text box for the author and a text box for the book title. I've also pre-filled the text boxes with values, so that we can easily test whether the page is working without retyping them each time. You certainly don't need to do that though. In addition to the text boxes, there is a button that, when clicked, calls a JavaScript function called getBookInformation. We'll define this in a separate JavaScript file later.

The final important item is the div. It's blank now, but in our js file we'll reference that div and populate it with the data we get from Goodreads. It must have an id that we can reference, and here, I'm just calling it DataContainer.

We'll need to add a reference on this page to our js file once it's created. Otherwise, we're done with this page, and we'll let the JavaScript take care of the rest.

Creating the HTTP Handler and JavaScript

The HTTP Handler is where we'll be specifying where to get our data, and how to format it once we've got it. First, we'll add a Generic Handler to our project. I've called mine goodreadsHttpHandler.ashx. We'll write our code inside the ProcessRequest method, and we'll leave IsReusable alone. Before we forget, we want to change the Response.ContentType. We're going to be getting XML from Goodreads, then converting it to JSON, so we want to change the content type to "application/json" to reflect what we're ultimately returning.

context.Response.ContentType = "application/json";

Next, we're going to add a JavaScript file to our solution. For now, we'll just add a simple function that gets the information in our text boxes and passes it to the HTTP Handler.

function getBookInformation() {
	$.get('goodreadsHttpHandler.ashx'
	, { bookAuthor: $('#authorTextbox').val(), bookTitle: $('#titleTextbox').val() }
	, function (data) {
	}
	);
}

Essentially, the function looks up the handler and passes it the information it needs (in this case the bookAuthor and bookTitle) as query string parameters. Whatever the handler returns is then passed into the callback function as the parameter we're calling "data". Later, we'll display the data as HTML, which we define inside that callback. Note that authorTextbox and titleTextbox are the ids of the text boxes we created on the client control page.
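To make the hand-off concrete, here is a minimal sketch of the kind of URL such a GET request ends up targeting. The helper buildHandlerUrl is hypothetical (it is not part of jQuery or of the article's source), and jQuery's own encoding may differ in detail; for example, some versions encode spaces as "+" rather than "%20".

```javascript
// Hypothetical helper: builds the URL that a GET request with these
// parameters would target. Not part of jQuery or the article's code.
function buildHandlerUrl(handlerPath, params) {
    var pairs = [];
    for (var name in params) {
        if (Object.prototype.hasOwnProperty.call(params, name)) {
            // Encode names and values so characters such as "&" stay inside the value.
            pairs.push(encodeURIComponent(name) + "=" + encodeURIComponent(params[name]));
        }
    }
    return pairs.length ? handlerPath + "?" + pairs.join("&") : handlerPath;
}

var exampleUrl = buildHandlerUrl("goodreadsHttpHandler.ashx", {
    bookAuthor: "Patrick Rothfuss",
    bookTitle: "The Name of the Wind"
});
// exampleUrl is
// "goodreadsHttpHandler.ashx?bookAuthor=Patrick%20Rothfuss&bookTitle=The%20Name%20of%20the%20Wind"
```

The parameter names bookAuthor and bookTitle are the same names the handler will look for on the server side.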

Back in our handler, we need a way to reference what the JavaScript passes into it. We do this using the request's QueryString collection.

string bookTitle = context.Request.QueryString["bookTitle"];
string bookAuthor = context.Request.QueryString["bookAuthor"];

Now that our handler knows the book title and book author, it has all the information it needs to query the Goodreads API. So let's create a new method in our handler called GetGoodreadsURI, and it'll take the bookAuthor and bookTitle as parameters.

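public string GetGoodreadsURI(string bookAuthor, string bookTitle)
{
    string myKey = keyManager.GetConfigurationByKey("goodreadsDeveloperKey");
    string uri = "http://www.goodreads.com/book/title?format={0}&author={1}&key={2}&title={3}";
    // Escape the user-supplied values so characters such as '&' or '#' can't break the query string.
    string author = Uri.EscapeDataString(bookAuthor ?? String.Empty);
    string title = Uri.EscapeDataString(bookTitle ?? String.Empty);
    return String.Format(uri, "xml", author, myKey, title);
}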

In my example, I'm getting my developer key from another class that reads the web.config file. Your key should also be hidden from your users in a similar fashion. To add the key to your web.config, add the following inside the "configuration" block.

<appSettings>
    <add key="goodreadsDeveloperKey" value="YOUR_KEY_HERE"/>
</appSettings>

Whether you wrap it in another class or not, to read this value from your web.config, you would use:

WebConfigurationManager.AppSettings["goodreadsDeveloperKey"];

Returning to our GetGoodreadsURI method, notice that for the format we're choosing XML. This particular API Method that we're calling allows us to choose the format as JSON specifically. However, if we do, Goodreads will only return reviews. Since we'd like to get more information about a book, we'll make the call using XML (which provides more general information) and convert it to JSON.

Back in our ProcessRequest method, we'll call our new GetGoodreadsURI method, and call its return value uri.

string bookTitle = context.Request.QueryString["bookTitle"];
string bookAuthor = context.Request.QueryString["bookAuthor"];
string uri = GetGoodreadsURI(bookAuthor, bookTitle);

Now that we know where to look for the XML, we want to copy it into a string that we can then turn into JSON.

// Requires: using System.IO; using System.Net;
public string GetResponseFromAPI(string uri, out int serverStatus)
{
    string responseData = String.Empty;
    try
    {
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(uri);
        // Dispose the response so the underlying connection is released.
        using (HttpWebResponse res = (HttpWebResponse)req.GetResponse())
        {
            serverStatus = (int)res.StatusCode;

            using (Stream s = res.GetResponseStream())
            {
                using (StreamReader sr = new StreamReader(s))
                {
                    responseData = sr.ReadToEnd();
                }
            }
        }
        return responseData;
    }
    catch (WebException e)
    {
        // e.Response is null when no HTTP response arrived at all (e.g. a timeout).
        HttpWebResponse res = e.Response as HttpWebResponse;
        serverStatus = (res != null) ? (int)res.StatusCode : 0;
        return responseData;
    }
}

First, we're simply making the request and then storing the response. After that, we're reading through the entire response that's been returned. Finally, we put it into a string that we can easily read from later.

You don't need to grab the StatusCode the server returns, but for debugging purposes, or to supply relevant error messages to your users, it might be valuable. If everything is successful, we'll get a StatusCode of 200. In the above code, the only exception I'm catching is WebException, which is what GetResponse throws when the server answers with an error status such as 404 Not Found (i.e. the user searches for a book that Goodreads can't find in its database).

Now that we have the information from Goodreads captured in a string, we can convert it to JSON to send back to our JavaScript function. Converting it isn't difficult, but it is tedious. To start with, we'll want to create a text file that holds the JSON format we expect to send to our JavaScript. You can look through the sample XML file Goodreads provides in order to determine the information you want to return to the user. Based on what I've chosen, this is what my text file looks like:

{
	"Author":  "{{!Author!}}"
	, "Title":  "{{!Title!}}"
	, "Description": "{{!Description!}}"
	, "Average_Rating": "{{!Average_Rating!}} / 5"
	, "Cover_Image": "{{!Cover_Image!}}"
	, "Publication_Year": "{{!Publication_Year!}}"
	, "Publisher": "{{!Publisher!}}"
	, "ISBN": "{{!ISBN!}}"
	, "Reviews": "{{!Reviews!}}"
	, "Status": "{{!Status!}}"
}

Remember our empty JavaScript function from before? If we passed the content of this text file to it, and called it data (which we are), we could then reference data.Author, and we would get back {{!Author!}}. What we're working toward is taking information from the string that holds our XML, and using it to replace the relevant placeholders in our text file (which we'll also put into a string). So by replacing {{!Author!}} with, say, Patrick Rothfuss, we'll get back that author name when we call data.Author in our JavaScript function.
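The placeholder idea can be tried out in isolation. The following is only a sketch in JavaScript, using a two-field version of the template; the fillTemplate helper is hypothetical and stands in for the string replacement that the handler performs on the server.

```javascript
// The template text, shortened to two fields for illustration.
var template = '{ "Author": "{{!Author!}}", "Title": "{{!Title!}}" }';

// Hypothetical helper: swaps each {{!Name!}} placeholder for its value.
// Placeholders without a matching value are left untouched.
function fillTemplate(text, values) {
    return text.replace(/\{\{!(\w+)!\}\}/g, function (match, name) {
        return Object.prototype.hasOwnProperty.call(values, name) ? values[name] : match;
    });
}

// Parsing the raw template gives back the placeholders themselves.
var untouched = JSON.parse(template);
// untouched.Author is "{{!Author!}}"

// Parsing the filled template gives back the real values.
var filled = JSON.parse(fillTemplate(template, {
    Author: "Patrick Rothfuss",
    Title: "The Name of the Wind"
}));
// filled.Author is "Patrick Rothfuss"
```

This simple version assumes the values contain no double quotes or backslashes; values that might contain them need escaping first.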

Now that we have this text file, let's get it into a string. I'm going to create a new method called GetGoodreadsJSONResponse.

public string GetGoodreadsJSONResponse()
{
    string returnData;
    using (Stream s = Assembly.GetExecutingAssembly().GetManifestResourceStream(
	"YOUR_SOLUTION_NAME.JSON_Responses.goodreads.BookInformationFields.txt"))
    {
        using (StreamReader sr = new StreamReader(s))
        {
            returnData = sr.ReadToEnd();
        }
    }
    return returnData;
}

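In my example, I have a folder called “goodreads” inside a folder called “JSON Responses”. Within the “goodreads” folder is my text file. You don't have to follow that folder structure, but do note that with the GetManifestResourceStream method, you pass a resource name that uses periods where you'd normally see slashes (and underscores in place of spaces in folder names, hence JSON_Responses). The text file's Build Action must also be set to Embedded Resource; otherwise the method returns null instead of a stream. Beyond that, all this method does is return our text file as a string.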

Now, we can add these two new methods to our ProcessRequest.

int serverStatusCode;
string returnData = GetGoodreadsJSONResponse();
string responseData = GetResponseFromAPI(uri, out serverStatusCode);
if (serverStatusCode == 200)
{
	returnData = ConvertGoodreadsXMLtoJSON(responseData, returnData);
}

In the code above, there's one method that we haven't yet defined: the one that takes everything in the response string (formatted as XML) and puts it into our return string (formatted as JSON). Unfortunately, converting the XML to JSON takes a large method, which starts like this:

public string ConvertGoodreadsXMLtoJSON(string responseData, string bookFieldsAsJSON)
{
	XDocument xmlResponseData = XDocument.Parse(responseData);
	XElement goodreadsRoot = xmlResponseData.Element("GoodreadsResponse");
	XElement bookRootElement = goodreadsRoot.Element("book");
	XElement workRootElement = bookRootElement.Element("work");
	// The field-by-field extraction and the final return statement
	// are covered in the snippets that follow.
}

The first thing we're doing is creating an XML document based on the string that has the response data. Our string is already formatted as XML, so this is no problem. Next, we need to dive into each element to get information out of it. So if you look at the sample XML page again, you'll notice that there's an element called "GoodreadsResponse" that has all of the other elements within it. Inside that element is another element called "book" that houses everything related to the actual book. Under the "book" element is an element called "title". Now you'll notice that the "title" element actually has information we're looking for (particularly the name of the book).

XElement titleElement = bookRootElement.Element("title");
string bookTitleValue = titleElement.Value;

So we've essentially drilled down into the "title" element, and then gotten the value inside of it. To get the name of the author, you'd have to go from "GoodreadsResponse" to "book" to "authors" to "author" to "name" before you'd be at the element that has the value you want. Thus, it can be a tedious process to do for every single piece of information.

Now that we have the value in the title element, we want to replace the placeholder in our JSON string with it.

bookFieldsAsJSON = bookFieldsAsJSON.Replace("{{!Title!}}", bookTitleValue);

You would now go through this process for each value you want to extract from the original XML. However, one problem is that the "reviews_widget" and "description" values can contain HTML. If you put this HTML into your JSON string as is, its quotes and other special characters will make the JSON invalid. To avoid this, we have to serialize it (i.e. convert it to a JSON-ready string).

JavaScriptSerializer js = new JavaScriptSerializer();
XElement descriptionElement = bookRootElement.Element("description");
string bookDescriptionValue = js.Serialize(descriptionElement.Value).Trim();
bookDescriptionValue = bookDescriptionValue.Substring(1, bookDescriptionValue.Length - 2);

The last two lines of code might seem strange, but they're necessary every time we use the Serialize method. Serialize automatically puts quotes around the string it creates, but as you'll recall, our JSON text file already has quotes around each value that we want to display. The Substring call removes the first and last characters, which are those superfluous quotes. Without getting rid of them, the resulting JSON would be invalid.
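The effect of the quotes is easy to see in a small experiment. This sketch uses JavaScript's JSON.stringify as a stand-in for the Serialize method (the two escape in similar but not identical ways), and the sample description string is made up for illustration.

```javascript
// A made-up value containing HTML with double quotes, as a description might.
var rawDescription = '<a href="http://www.goodreads.com">A "great" book</a>';

// JSON.stringify escapes the value and wraps it in quotes, much like Serialize.
var serialized = JSON.stringify(rawDescription);

// Drop the first and last characters (the added quotes), as Substring does.
var inner = serialized.slice(1, -1);

var template = '{ "Description": "{{!Description!}}" }';

// A function is used as the replacement so any "$" in the value is taken literally.
var json = template.replace("{{!Description!}}", function () { return inner; });
var brokenJson = template.replace("{{!Description!}}", function () { return rawDescription; });

// Reports whether a piece of text is valid JSON.
function parses(text) {
    try { JSON.parse(text); return true; } catch (e) { return false; }
}
// parses(json) is true; parses(brokenJson) is false
```

Parsing json gives back the original description unchanged, while the unescaped version fails because the inner quotes end the string early.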

You'll need to serialize anything that might have HTML in it (or, more generally, any value that might contain double quotes or backslashes), so you'll definitely need to do this with the "reviews_widget" if you want to display it. You do NOT need to serialize URLs though, so you can pass the value for "image_url" as is without modifying it.

In case a book can't be found, I want to display an appropriate message, so I have a status field in my JSON text file. At the end of converting all the XML to JSON, I change the status to OK and then return my new string.

bookFieldsAsJSON = bookFieldsAsJSON.Replace("{{!Status!}}", "Status_OK");
return bookFieldsAsJSON;

Now we're ready to send this back to our JavaScript function. At this point, our ProcessRequest method is complete and functional, though later we'll come back and add caching.

public void ProcessRequest(HttpContext context)
{
	context.Response.ContentType = "application/json";
	
	string bookTitle = context.Request.QueryString["bookTitle"];
	string bookAuthor = context.Request.QueryString["bookAuthor"];
	
	string uri = GetGoodreadsURI(bookAuthor, bookTitle);
	
	int serverStatusCode;
	string returnData = GetGoodreadsJSONResponse();
	string responseData = GetResponseFromAPI(uri, out serverStatusCode);
	if (serverStatusCode == 200)
	{
		returnData = ConvertGoodreadsXMLtoJSON(responseData, returnData);
	}
	context.Response.Write(returnData);
}

Completing the JavaScript Function

Finally, our getBookInformation function is ready to do some work. We're going to use jQuery to insert HTML onto our page that contains the information our handler returned. To do this, we'll add a simple jQuery command inside our function.

function getBookInformation() {
	$.get('goodreadsHttpHandler.ashx'
	, { bookAuthor: $('#authorTextbox').val(), bookTitle: $('#titleTextbox').val() }
	, function (data) {
	$('#DataContainer').html(
	)
	}
	);
}

Remember, DataContainer is the name of the div we specified in our main HTML page earlier, so everything we put inside the .html() will get inserted into that div as HTML. This means that we can put HTML directly into that function, but everything that isn't Javascript has to be put inside quotation marks (including the HTML). You can style the content exactly how you want to using HTML and CSS, and it will automatically get filled into the div. To reference anything returned by our handler, we simply use data.whatever. So, if we want to get the author, we'd call data.Author (remember, these are based on the names you've given the values in your JSON text file).

function getBookInformation() {
    $.get('goodreadsHttpHandler.ashx'
    // Specify location of Author and Title to search for.
    , { bookAuthor: $('#authorTextbox').val(), bookTitle: $('#titleTextbox').val() }
    , function (data) {
        var titleStyle = "<p style=\"color:#666600;font-family:georgia,serif;\">";
        var spanStart = "<span style=\"color:black;\">";
        var spanEnd = "</span>";
        var reviewsDiv = "<div id=\"ReviewContainer\"><p style=\"cursor:pointer;color:#666600;font-family:georgia,serif;\">";
        // A string literal can't span lines, so the two halves are concatenated.
        var infoFromGoodreads = "<br/><br/><br/><p style=\"font-size:9px;\">"
            + "Information provided by <a href=\"http://www.goodreads.com\">goodreads</a>.</p>";
        if (data.Status === "Status_OK") {
            // Specify div to fill (change in else statements below also).
            $('#DataContainer').html(
                "<table><tbody><tr>"
                + "<td style=\"width:100px;\">" + "<img src=\"" + data.Cover_Image + "\" />" + "</td>"
                + "<td>" + titleStyle
                + "Author: " + spanStart + data.Author + spanEnd
                + "<br/>" + "Title: " + spanStart + data.Title + spanEnd
                + "<br/>" + "Average Rating: " + spanStart + data.Average_Rating + spanEnd
                + "<br/>" + "First Published: " + spanStart + data.Publication_Year + spanEnd
                + "<br/>" + "Publisher: " + spanStart + data.Publisher + spanEnd
                + "<br/>" + "ISBN: " + spanStart + data.ISBN + spanEnd
                + "</p></td></tr></tbody></table>"
                + titleStyle + "Description" + "<br/><br/>" + spanStart + data.Description + spanEnd + "</p>"
                + reviewsDiv + "Show reviews..." + "</p></div>"
                + infoFromGoodreads
            );
        } else if (data.Status === "Bad_XML") {
            $('#DataContainer').html("Unexpected XML encountered." + infoFromGoodreads);
        } else {
            $('#DataContainer').html("No book found." + infoFromGoodreads);
        }
        // Swap the "Show reviews..." prompt for the reviews widget when clicked.
        $("#ReviewContainer").click(function () { $('#ReviewContainer').html(data.Reviews); });
    }
    );
}
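Because plain-text values such as data.Author or data.Title are concatenated straight into markup, a title containing a character like "<" or "&" could garble the page. One possible precaution, shown here only as a sketch, is a small escaping helper; escapeHtml is a hypothetical name, not something jQuery or the article's source provides. It should not be applied to data.Description or data.Reviews, which are meant to carry HTML.

```javascript
// Hypothetical helper: escapes characters that have special meaning in HTML
// so that a plain-text value is displayed literally.
function escapeHtml(text) {
    return String(text)
        .replace(/&/g, "&amp;")   // must run first so later entities aren't re-escaped
        .replace(/</g, "&lt;")
        .replace(/>/g, "&gt;")
        .replace(/"/g, "&quot;")
        .replace(/'/g, "&#39;");
}
```

In the function above, it would be used as spanStart + escapeHtml(data.Title) + spanEnd.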

The last thing we need to do is add a reference to our script on the HTML page that's going to display this information. So go back to the client control and drag-and-drop both the JavaScript file we made and the jQuery script into the top of the HTML page. The jQuery script must come first, since our script depends on it.

<script src="scripts/jquery-1.8.0.min.js" type="text/javascript"></script>
<script src="scripts/goodreadsGetBookInfoJavascript.js" type="text/javascript"></script>

At this point, we have a working webpage that calls the Goodreads API and displays the data we've requested. However, there's nothing stopping a user from tapping the "Get Book Information" button as fast as they can, and getting our developer key banned. Additionally, it generally isn't worthwhile to make a call to Goodreads over and over again if people are searching for the same information repeatedly.
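Rapid clicking can also be dampened in the browser before any request is sent. The following is a minimal sketch of that idea, not part of the original source: throttle is a hypothetical helper, and the optional now parameter exists only so the clock can be replaced in tests. It complements server-side protection rather than replacing it, since a user can always bypass client-side code.

```javascript
// Hypothetical helper: wraps a function so that calls arriving sooner than
// minIntervalMs after the last accepted call are ignored.
// "now" is an optional clock function returning milliseconds.
function throttle(fn, minIntervalMs, now) {
    var clock = now || Date.now;
    var lastAccepted = null;
    return function () {
        var current = clock();
        if (lastAccepted !== null && current - lastAccepted < minIntervalMs) {
            return false; // too soon, so the call is dropped
        }
        lastAccepted = current;
        fn.apply(this, arguments);
        return true;
    };
}
```

Assuming a two-second interval is acceptable, the button could call a wrapped version created with throttle(getBookInformation, 2000) instead of calling getBookInformation directly.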

Caching the Data

First, let's create a new interface and call it ICacheData.

interface ICacheData
{
    string GetCacheValue(string key);
    void InsertToCache(string key, string value);
}

Now, create a new class called CacheManager that implements the interface we just made. This class only requires two methods. The first checks whether we already have something in the cache, and returns what's found. If something gets returned, we'll pass that to our user, rather than hitting Goodreads again.

public string GetCacheValue(string key)
{
    string returnData = null;
    if (null != HttpContext.Current.Cache[key])
    {
        returnData = HttpContext.Current.Cache[key].ToString();
    }
    return returnData;
}

Notice that we're accepting a string as a parameter. This needs to be something unique to each search. One possible key is the URI that we query to get the information to begin with (it contains the author and book title), so in our handler we'll pass that in as the "key" value.

Basically, we are just checking whether that particular URI key is in our cache. If it is, we return the stored value (our JSON string); if not, we return null. Back in our handler, if we get null, we build the JSON string as before and then insert it into the cache.

// Shared lock so concurrent requests don't race to add the same key.
private static readonly object _lockObject = new object();

public void InsertToCache(string key, string value)
{
    lock (_lockObject)
    {
        if (null == HttpContext.Current.Cache.Get(key))
        {
            HttpContext.Current.Cache.Add(
                key
                , value
                , null
                , DateTime.Now.Add(TimeSpan.FromSeconds(300))
                , System.Web.Caching.Cache.NoSlidingExpiration
                , System.Web.Caching.CacheItemPriority.Low
                , null
                );
        }
    }
}

Again, we're passing our URI as the key parameter. The value parameter will be the information we want cached. In our case, it's the final JSON string that we want to pass to our JavaScript function. The other code to note is the DateTime.Now.Add() call. It adds 300 seconds (5 minutes) to the current time, and that's how long our data will stay cached (in production code, you would want to make this configurable). If you expect that your data is going to be very dynamic and changing frequently, you may want to set that to something lower. On the other hand, if you expect your data won't change much, you could set this to a much longer span of time.

Now that we have methods to cache data, let's make our handler use them.

CacheManager cm = new CacheManager();
string returnData = cm.GetCacheValue(uri);

if (null == returnData)
{
    int serverStatusCode;
    returnData = GetGoodreadsJSONResponse();
    string responseData = GetResponseFromAPI(uri, out serverStatusCode);
    if (serverStatusCode == 200)
    {
        returnData = ConvertGoodreadsXMLtoJSON(responseData, returnData);
    }
    cm.InsertToCache(uri, returnData);
}

context.Response.Write(returnData);

As you can see, it's a fairly small addition to implement a simple form of caching data. We first check to see if data is cached, and if so, we just return it. If no data is cached, we go through the same code as before, but we cache it before moving forward.
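The check-then-insert flow is independent of ASP.NET, so it can be sketched in a few lines of JavaScript. Everything here is illustrative: createCache and getOrFetch are hypothetical names, the cache is a plain in-memory object rather than the ASP.NET cache, and the now parameter is an injectable clock used only to make expiry testable.

```javascript
// Hypothetical in-memory cache with absolute expiry, mirroring the idea of
// GetCacheValue and InsertToCache. "now" is an optional clock in milliseconds.
function createCache(lifetimeMs, now) {
    var clock = now || Date.now;
    var entries = {};
    return {
        // Returns the cached value, or null if it is missing or has expired.
        get: function (key) {
            var entry = Object.prototype.hasOwnProperty.call(entries, key) ? entries[key] : null;
            if (entry === null || clock() >= entry.expiresAt) {
                delete entries[key];
                return null;
            }
            return entry.value;
        },
        // Stores the value only if no unexpired entry exists for the key.
        insert: function (key, value) {
            if (this.get(key) === null) {
                entries[key] = { value: value, expiresAt: clock() + lifetimeMs };
            }
        }
    };
}

// Check the cache first; only call fetchFresh (the expensive lookup) on a miss.
function getOrFetch(cache, key, fetchFresh) {
    var value = cache.get(key);
    if (value === null) {
        value = fetchFresh(key);
        cache.insert(key, value);
    }
    return value;
}
```

With a 300000 ms lifetime, repeated lookups for the same key within five minutes reuse the stored value, and the first lookup after that fetches it again.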

*Note: Cache keys are case-sensitive. If you search for an author with the name capitalized, that data gets cached as expected, but if you then search again without capitalizing anything, the same data is cached a second time under a different key. You could avoid this by converting all search terms to lower case before putting them in your URI. Unfortunately, at the time of this writing, Goodreads returns a 404 error if the author name is all lower case or all upper case. If you're working with a different API, this hopefully won't be an issue.
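One way around this, sketched below under the assumption that the cache key does not have to be the request URI itself, is to normalize only the cache key and leave the search terms sent to Goodreads exactly as the user typed them. The makeCacheKey helper is hypothetical and is shown in JavaScript purely to illustrate the normalization.

```javascript
// Hypothetical helper: builds a case-insensitive cache key. The request sent
// to the API would still use the original, unmodified author and title.
function makeCacheKey(author, title) {
    return author.trim().toLowerCase() + "|" + title.trim().toLowerCase();
}

var keyA = makeCacheKey("Patrick Rothfuss", "The Name of the Wind");
var keyB = makeCacheKey("patrick rothfuss", " the name of the wind ");
// keyA and keyB are both "patrick rothfuss|the name of the wind"
```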