Using ZIP content for delivery over HTTP

30 Oct 2008 | MIT | 4 min read
This article shows how to use already compressed content for transmission over HTTP.

Introduction

Static content such as HTML and text files, styles, and client scripts can be compressed to reduce network usage. This article shows how to use already compressed content for transmission over HTTP.

Background

HTTP Request and Response

An HTTP session consists of pairs of requests and responses. Every request and response has a header block that contains metadata about the content. The headers can specify the content type, the encoding, and the cache parameters of the transmitted information. We are interested in the Accept-Encoding header of an HTTP request and the Content-Encoding header of an HTTP response. Most web servers do not set the Content-Encoding header, so the HTTP communication happens as shown in Figure 1: the requested content is transmitted as is.

Figure 1. Communication between a web browser and a web server without Content-Encoding header

To save network bandwidth, a web server can be configured to compress (encode) the requested content. Most web browsers and search engine bots support the DEFLATE compression format [1]. You may notice that your browser sends an Accept-Encoding header set to "gzip,deflate" to the web server. We are interested in the "deflate" encoding.
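A server should check the Accept-Encoding header before it sends encoded content. The following is a minimal Python sketch of that check, not code from the article's library; the function name accepts_deflate is hypothetical, and the sketch ignores the "*" wildcard for brevity.

```python
def accepts_deflate(accept_encoding):
    """Return True if an Accept-Encoding header value allows "deflate"."""
    if not accept_encoding:
        return False
    for item in accept_encoding.split(","):
        # Each item looks like "deflate" or "deflate;q=0.5".
        parts = [p.strip() for p in item.split(";")]
        coding = parts[0].lower()
        q = 1.0  # a coding without a q parameter has full weight
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not acceptable"
        if coding == "deflate" and q > 0:
            return True
    return False
```

For example, accepts_deflate("gzip,deflate") is true, while a header of "gzip" alone or "deflate;q=0" is rejected.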

Compression comes at a price: the initial response is delayed and the web server needs more computing power, which reduces the number of concurrent users it can serve. The problem can be solved by adding a cache of compressed content (see Figure 2) or by using pre-compressed data.

Figure 2. Communication between a web browser and a web server when compression is configured

The second approach looks more attractive because it requires no compression work from the web server at request time. However, it requires more work up front: the content has to be compressed before it is published, which is an additional headache for web designers and content authors.

ZIP File Format

ZIP is one of the most widely used compression formats. A ZIP file contains multiple files, each of which can be compressed with one of several methods; the most common one is DEFLATE. The file structure has two parts: the compressed data and a directory [2]. The compressed data part contains pairs of local file headers and compressed file data; the directory contains additional file attributes and references to the local file headers.

Figure 3. ZIP file format
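To make the layout concrete, here is a hedged Python sketch that copies the stored bytes of one entry out of a ZIP file without decompressing them. It is not the article's ZIP reader: the helper name read_raw_entry is hypothetical, the standard zipfile module is used only to look up the directory record, and the archive is assumed to fit in memory and to be unencrypted.

```python
import io
import struct
import zipfile
import zlib

# Fixed 30-byte part of a local file header: signature, version, flags,
# method, time, date, CRC-32, compressed size, uncompressed size,
# file name length, extra field length.
LOCAL_HEADER = struct.Struct("<4s5H3L2H")


def read_raw_entry(zip_bytes, name):
    """Return (compression method, stored bytes) for one entry."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        info = archive.getinfo(name)  # directory record for the entry
    fields = LOCAL_HEADER.unpack_from(zip_bytes, info.header_offset)
    if fields[0] != b"PK\x03\x04":
        raise ValueError("local file header signature not found")
    name_len, extra_len = fields[9], fields[10]
    # The entry data starts right after the variable-length name and extra field.
    start = info.header_offset + LOCAL_HEADER.size + name_len + extra_len
    return info.compress_type, zip_bytes[start:start + info.compress_size]


# Usage: build a small archive in memory and pull out the raw DEFLATE stream.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("hello.txt", "hello " * 100)
method, raw = read_raw_entry(buffer.getvalue(), "hello.txt")
# wbits=-15 tells zlib to expect a raw DEFLATE stream with no wrapper.
text = zlib.decompress(raw, -15)
```

The sizes are taken from the directory record rather than from the local header, because the local header may leave them as zero when the archive was written as a stream.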

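We can use entries that were stored with the DEFLATE method or with no compression. DEFLATE'd data can be sent over HTTP without re-encoding because it is already compressed (see Figure 4). The Content-Encoding header of the HTTP response has to be set to "deflate" to tell the web browser that the content is encoded. Strictly speaking, HTTP defines the "deflate" content coding as DEFLATE data in a zlib wrapper (RFC 1950), while a ZIP entry holds a raw DEFLATE stream; most browsers accept both.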

Figure 4. Communication between a web browser and a web server when a ZIP fragment is sent
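The decision of which headers to send can be sketched in Python as follows. This is an illustration under stated assumptions rather than the article's handler: build_response is a hypothetical name, and the fallback that decompresses the data for clients that do not accept "deflate" is an assumption, since the article does not say how such clients are handled.

```python
import zlib

ZIP_STORED, ZIP_DEFLATED = 0, 8  # method codes from the ZIP specification


def build_response(method, raw, client_accepts_deflate):
    """Return (headers, body) for entry bytes copied out of a ZIP file."""
    if method == ZIP_STORED:
        # Uncompressed entry: send as is, with no Content-Encoding header.
        return {"Content-Length": str(len(raw))}, raw
    if method != ZIP_DEFLATED:
        raise ValueError("unsupported ZIP compression method: %d" % method)
    if client_accepts_deflate:
        # Already compressed: pass the bytes through and label them.
        headers = {"Content-Encoding": "deflate",
                   "Content-Length": str(len(raw))}
        return headers, raw
    # Assumed fallback: inflate the raw DEFLATE stream for this client.
    body = zlib.decompress(raw, -15)
    return {"Content-Length": str(len(body))}, body
```

Only the fallback path costs CPU time; the common path is a plain copy of the bytes from the archive.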

Using the Code

The solution has two parts: a utility library and a web application. The utility library contains a configuration section, web cache, path rewrite module, and ZIP reader classes. The cache classes and path rewrite module can be used only within the web application context.

The web application contains a baseline implementation of the HTTP handler (httpzip.ashx) that lists and delivers the contents of a registered ZIP folder. The handler accepts three query string parameters:

  • name – refers to the registered ZIP archive;
  • action – list or get;
  • file – path of the file in the ZIP archive.

To get the rfc1951.txt file from the archive that is registered as deflate-rfcs, use the following URL:

http://servername/path/httpzip.ashx?name=deflate-rfcs&action=get&file=rfc1951.txt

The ZIP files are registered in the httpZip section of the web.config file:

XML
<configuration>
  <configSections>
    <section name="httpZip" 
      type="HttpZipFolder.HttpZipConfigurationSection,HttpZipFolder" />
  </configSections>
  <httpZip>
    <add name="deflate-rfcs" 
      prefix="rfcs" zipPath="~/App_Data/deflate-rfcs.zip" />
  </httpZip>
  ...
</configuration>

For convenience, the path rewrite HTTP module is included in the utility library. It can be registered in the web.config file:

XML
...
<system.web>
  <httpModules>
    <add name="httpziprewrite" 
      type="HttpZipFolder.HttpZipRewriteModule,HttpZipFolder" />
  </httpModules>
  ...
</system.web>
...

To get the rfc1951.txt file from the archive with a prefix rfcs, use the following URL:

http://servername/path/rfcs/rfc1951.txt
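The mapping from a prefixed path to the handler URL can be illustrated with a short Python sketch. It only mirrors the two URLs shown in this article; the function name rewrite and the PREFIXES table are hypothetical, and the real module's matching rules may differ.

```python
from urllib.parse import urlencode

# prefix -> registered archive name, mirroring the <httpZip> section above
PREFIXES = {"rfcs": "deflate-rfcs"}


def rewrite(app_relative_path):
    """Map 'prefix/file' to the handler URL, or return None if nothing matches."""
    prefix, _, file_path = app_relative_path.lstrip("/").partition("/")
    archive = PREFIXES.get(prefix)
    if archive is None or not file_path:
        return None  # not a ZIP folder request; leave the path alone
    query = urlencode({"name": archive, "action": "get", "file": file_path})
    return "httpzip.ashx?" + query
```

With this table, rewrite("rfcs/rfc1951.txt") produces the same query string as the handler URL given earlier, and paths with an unknown prefix are left untouched.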

Points of Interest

Most browsers send the If-Modified-Since header so that content already in the browser's cache does not have to be transmitted again. The handler returns 304 Not Modified as the response status if the header date matches the archived file date, which tells the browser to use the cached content.
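The date comparison can be sketched in Python as shown below. This is not the handler's code: conditional_status is a hypothetical name, and the entry date is assumed to be available as a timezone-aware UTC value, although ZIP archives store timestamps without a time zone.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_status(if_modified_since, entry_modified):
    """Return 304 when the header date matches the entry date, else 200."""
    if if_modified_since:
        try:
            header_date = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            return 200  # unparseable header: send the content
        if header_date.tzinfo is None:
            # Assumption: treat a date without a zone as UTC.
            header_date = header_date.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        if header_date == entry_modified.replace(microsecond=0):
            return 304
    return 200


# Usage: the Last-Modified value sent earlier comes back as If-Modified-Since.
modified = datetime(2008, 10, 30, 12, 0, 0, tzinfo=timezone.utc)
last_modified = format_datetime(modified, usegmt=True)
status = conditional_status(last_modified, modified)
```

A request without the header, or with a date that does not match, gets the full 200 response.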

When the path rewrite module is used, files delivered by the httpzip.ashx handler look like static files on the web server to a web browser.

Deploying a ZIP folder is easier than uploading separate HTML files and their resources, and it saves network bandwidth for both the client and the server.

References

  1. Deutsch, L.P., "DEFLATE Compressed Data Format Specification", RFC 1951, May 1996
  2. PKWARE, "APPNOTE.TXT - .ZIP File Format Specification", April 2004

License

This article, along with any associated source code and files, is licensed under The MIT License

