Part II of this series talks about how there's an impedence mismatch between how images are used on the web and what might be convenient for the database.  Part I was about what to store in a database when you are using it as an image store. 

If you were implementing any fat-client application that needed to operate on images stored in a database, you might do something like this:

public void GetImagesAndProcess(IDbConnection conn)
{
    IDbCommand cmd = conn.CreateCommand();
    cmd.CommandText = 
       
"select ImageData from Images " +
        "where id between 2 and 8"
;
    IDataReader dr = cmd.ExecuteReader();
    while (dr.Read())
    {
        byte[] imageBytes = (byte[])dr["ImageData"];
        ProcessImage(imageBytes);
    }
}

In other words, you can get a DataReader and process the images one at a time. A common thing to want to do is resize the images to thumbnails and put them on your window, and you can do that sequentially like in the example.

But, if my client is a web-browser, things are a little more complicated. For one, I now have three machines in play, the client, the web-server, and the database server.  Bytes in the blob field in the database eventually need to be served up via HTTP to the browser.

And, unlike most of the data on your page, images are not sent to the browser with the HTML. Instead, you send img tags to the browser, like so:

    <img src="..." />
    <img src="..." />
    <img src="..." />

Then the browser uses the src attribute to go get the image. Since my images are pulled live from a database, the src will be to some server page that will read the image blob from the database and send the bytes back with the Content-type set for this image. There are a couple of things to understand about this

  1. Each image will be requested independently and in any order
  2. You need some way of constructing src so that the server page knows which image to get.

The consequence of (1) is that I can no longer get all of the images I need via a DataReader and then send them to the view.  Instead, each image is randomly requested by the view.

The consequence of (2) is that I need some way of positively identifying the image so that I can construct a url for src that will request the specific image that is needed. The obvious choice is to use the primary key, and if the number of images are small, that might be ok, you'd do something like this when you generate your HTML:

public void GenerateImgTags(IDbConnection conn, HttpResponse response, String imageGetterUrl)
{
    IDbCommand cmd = conn.CreateCommand();
    cmd.CommandText =
      "select id from Images " +
      "where id between 2 and 8"
;
    IDataReader dr = cmd.ExecuteReader();
    while (dr.Read())
    {
        response.Write(
          
"<img src=\"" +
           imageGetterUrl +
"?id=" +
 dr["id"] +
          
"\" />\n");
    }
}

There are a couple of issues with this.

  1. You need to read all of the ids from the database.
  2. You need to expose your primary keys to the browser (which could be a security problem -- like if you use SSN for primary keys)

A convenient thing to send is an index number from 0 to the count of images (which you might know without going to the database, or can be quickly obtained from the database and possibly cached). Then your img tags will look something like this:

    <img src="...?index=0" />
    <img src="...?index=1" />
    <img src="...?index=2" />

where the index is not a primary key to the database but instead refers to a 0-based row number in a result dataset.  Since each of these images are requested independently and in any order, you now need a way to efficiently treat the database like a random access store.

This used to be difficult -- the naive solution, read the dataset and skip to the record is very inefficient (because all of the data might be transmitted), and the more correct way involved basically doing the same thing but in a vendor-specific stored procedure.

Luckily, common web interfaces require fast access to parts of a dataset by index in order to implement paging (splitting a large dataset into pages and having the user navigate via next/prev and page number links). This is essentially the same problem, and RDBMS vendors have solved it by making proprietary extensions to SQL (Here's a nice overview of various vendor-specific ways of paging via SQL extensions) -- so, it's not ideal, but there's no cross-database solution yet.  In SQL Server 2005, the extension is called Common Table Expressions and a sample one looks like this (to get the 0th image, you need to get row number 1):

WITH imageRows AS
   ( 
    
SELECT id, imagedata,
        row_number() OVER (ORDER BY id) AS rownum
 
     
FROM images WHERE id BETWEEN 2 AND 4
  

SELECT imageData FROM imageRows WHERE rownum=1

The main caveat with using this technique is that you have to be able to get the rows in the same order each time and each request must be to the same filtered records. Also, CTE is a SQL Server-only feature -- your SQL will not be compatible with other RDBMS's. Since the main alternative is stored procedures, this might be something you have to live with until paging is implemented in standard SQL, OLEDb/ODBC drivers or data-access objects (like ADO.NET or JDBC) and can be abstracted.