Basics of PDF graphics and how to edit

Frank Rem

9 Nov 2015, CPOL, 6 min read


This article explains the basics of PDF graphics and how graphics can be edited if you really have to.

This article is in the Product Showcase section for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

Download source code - 1.5 MB

A question that I often hear is: "How can I change the graphics of my PDF, such as replacing text with some other text or replacing a logo with another logo?" In general this is not a good idea. PDF is not designed for editing; it is designed as an end format, much like ink on paper. Nevertheless, there may be circumstances - such as when you don't have access to the source format - in which editing a PDF is a requirement. This article explains the basics of PDF graphics and how graphics can be edited if you really have to.

PDF graphics

A PDF document contains various types of information such as metadata (author, title, etc.), form fields, navigational data such as bookmarks, annotations such as comments, and last but not least, graphics. Graphics can roughly be divided into three categories: curves, text and images. The graphics on a PDF page are described by a sequence of operators. Operators can be divided into three groups:

draw operators that draw curves, text and images
graphics state operators that do things such as selecting a font, selecting a color or transforming the coordinate system (more about that later)
marked-content operators that associate high-level information with graphics but do not affect the appearance. I ignore this group for the purpose of this article.

Each operator takes zero or more operands. Here is a simple example that draws a straight red line:

150 250 m      % set the current point to (150, 250)
150 350 l      % append a straight line to (150, 350)
1 0 0 RG       % set the stroke color to red
S              % stroke the line

Each operator follows its operands (postfix notation). On the first line, operator 'm' uses operands 150 and 250.
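Because the operands come first, a content stream can be read with a simple loop: collect operands until a token appears that is not an operand, and treat that token as the operator. The following is a minimal sketch of that idea. The names ContentStreamSketch and Parse are hypothetical, and the tokenizer is deliberately incomplete: it only understands numbers, names such as /F1, and comments, so string operands such as (Hello World) would be split incorrectly.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class ContentStreamSketch
{
    // Returns (operator, operands) pairs in the order they appear in the stream.
    public static List<KeyValuePair<string, string[]>> Parse(string content)
    {
        var result = new List<KeyValuePair<string, string[]>>();
        var operands = new List<string>();
        foreach (string rawLine in content.Split('\n'))
        {
            // everything after % is a comment
            int comment = rawLine.IndexOf('%');
            string line = comment >= 0 ? rawLine.Substring(0, comment) : rawLine;
            string[] tokens = line.Split(
                new[] { ' ', '\t', '\r' }, StringSplitOptions.RemoveEmptyEntries);
            foreach (string token in tokens)
            {
                double number;
                bool isOperand = token[0] == '/' || double.TryParse(
                    token, NumberStyles.Float, CultureInfo.InvariantCulture, out number);
                if (isOperand)
                {
                    operands.Add(token);
                }
                else
                {
                    // an operator consumes all operands collected so far
                    result.Add(new KeyValuePair<string, string[]>(token, operands.ToArray()));
                    operands.Clear();
                }
            }
        }
        return result;
    }

    public static void Main()
    {
        string content =
            "150 250 m      % set the current point\n" +
            "150 350 l      % append a straight line\n" +
            "1 0 0 RG       % set the stroke color to red\n" +
            "S              % stroke the line\n";
        foreach (var op in Parse(content))
        {
            Console.WriteLine("{0} ({1})", op.Key, string.Join(", ", op.Value));
        }
    }
}
```

For the red line example, this lists m with two operands, l with two operands, RG with three operands, and S with none.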

Here is an example that involves text:

/F1 24 Tf        % set font to F1 and font size to 24
100 100 Td       % move text position to (100, 100)
(Hello World) Tj % draw the text 'Hello World'

On the first line, operator Tf takes operands /F1 and 24. Operand /F1 is the name of a font. Without going into the full details, it suffices to say that /F1 can be resolved to an actual font either inside or outside the PDF document.

And finally, here is a one-liner that draws an image:

/I1 Do           % draw image

Similar to selecting a font, /I1 is resolved to an actual image inside the PDF document.

Graphics state

As mentioned, operators can be divided into draw operators and graphics state operators. While the operators are processed from top to bottom, a graphics state is maintained. The graphics state operators change the graphics state, and the results of the draw operators are affected by it. In the first sample we saw that the RG operator changes the stroke color to red and the S operator strokes the line using the current stroke color.

Other graphics state operators set the line width, dash pattern, fill color, font size etc. Finally, there are two special operators that, respectively, save (q) and restore (Q) the graphics state. Simply put: the restore operator changes the graphics state back to the state at the previous save operator. They appear pair-wise and can be nested.
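The save and restore behavior maps naturally onto a stack. The sketch below replays a few operators and records which stroke color each S operator sees. It is an illustration only: GraphicsStateSketch and SaveRestoreSketch are hypothetical names, the state holds just two of the many real properties, the color is kept as a plain RGB string, and the input is assumed to have one operator per line.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Heavily reduced graphics state; a real one has many more properties.
public sealed class GraphicsStateSketch
{
    public string StrokeColor = "0 0 0"; // the initial stroke color is black
    public double LineWidth = 1.0;       // the initial line width is 1

    public GraphicsStateSketch Copy()
    {
        return (GraphicsStateSketch)MemberwiseClone();
    }
}

public static class SaveRestoreSketch
{
    // Replays the operators and returns the stroke color in effect at each S.
    public static List<string> StrokeColorsUsed(IEnumerable<string> lines)
    {
        var state = new GraphicsStateSketch();
        var saved = new Stack<GraphicsStateSketch>();
        var used = new List<string>();
        foreach (string line in lines)
        {
            string[] tokens = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            if (tokens.Length == 0) continue;
            switch (tokens[tokens.Length - 1]) // the operator is the last token
            {
                case "q": saved.Push(state.Copy()); break;   // save
                case "Q": state = saved.Pop(); break;        // restore
                case "RG": state.StrokeColor = string.Join(" ", tokens, 0, 3); break;
                case "w":
                    state.LineWidth = double.Parse(tokens[0], CultureInfo.InvariantCulture);
                    break;
                case "S": used.Add(state.StrokeColor); break;
                // all other operators are ignored in this sketch
            }
        }
        return used;
    }

    public static void Main()
    {
        string[] lines =
        {
            "q", "1 0 0 RG", "150 250 m", "150 350 l", "S", "Q",
            "150 250 m", "250 250 l", "S"
        };
        foreach (string color in StrokeColorsUsed(lines))
        {
            Console.WriteLine(color);
        }
    }
}
```

The first line is stroked in red (1 0 0). The second line is stroked after Q, so it uses the black (0 0 0) that was in effect at the matching q.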

Coordinate system

A crucial part of the PDF imaging model is the coordinate system. The coordinate system determines where on the page a given coordinate such as (150, 250) is located and how long a unit of length is. PDF defines several coordinate systems. The two most important ones are user space and device space.

Device space

The device space is determined by the output device, such as a printer or display, on which a PDF page is ultimately rendered. Suppose we render a PDF page to a Windows bitmap at 300 DPI. From a Windows development perspective, the device space then has its origin at the top-left corner, the x axis points to the right, the y axis points downwards, and the length of a unit (a pixel) is 1/300 inch.

User space

As opposed to the device space, the user space is device independent. For every page, it is initialized such that its origin lies at the bottom-left corner, the x axis points to the right, the y axis points upwards and the length of a unit is 1/72 inch, or 1 point. The coordinates in the above PDF operator examples are in user space.

Mapping user space to device space

How coordinates in user space are transformed to coordinates in device space is defined by the current transformation matrix or CTM. Let's see how this would look in code:

using System;
using System.Drawing;            // Bitmap, PointF
using System.Drawing.Drawing2D;  // Matrix
using System.Linq;

// width and height of a Letter page
float width = 612; // 612 points = 8.5 inches
float height = 792; // 792 points = 11 inches
// output device is a 600 dpi bitmap
float dpi = 600;
Bitmap bitmap = new Bitmap((int)(width * dpi / 72), (int)(height * dpi / 72));
// 4 page corners in user space
PointF[] points = new PointF[] { 
   new PointF(0, 0),           // bottom-left corner
   new PointF(0, height),      // top-left corner
   new PointF(width, height),  // top-right corner
   new PointF(width, 0)        // bottom-right corner
};
Console.WriteLine(
   string.Join("; ", points.Select(p => string.Format("({0}, {1})", p.X, p.Y))));
// calculate the coordinates of the corners in device space
Matrix ctm = new Matrix();
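// Matrix prepends each operation by default, so a point goes through the
// operations in reverse order: scale to pixels, translate, then flip
// flip vertical axis
ctm.Scale(1, -1);
ctm.Translate(0, -bitmap.Height);
// resolution
ctm.Scale(dpi / 72, dpi / 72);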
ctm.TransformPoints(points);
Console.WriteLine(
   string.Join("; ", points.Select(p => string.Format("({0}, {1})", (int)p.X, (int)p.Y))));

Changing the user space

The CTM is part of the graphics state and it can be changed using the cm operator. The cm operator takes six operands that represent a transformation matrix. Changing the CTM affects subsequent draw operators, as you will see in the following example.
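To make the six operands concrete: a b c d e f cm maps a point (x, y) to (a*x + c*y + e, b*x + d*y + f), and the new matrix is multiplied in front of the existing CTM. The sketch below shows this with Matrix3x2 from System.Numerics, which happens to use the same row-vector layout. The class and method names are hypothetical.

```csharp
using System;
using System.Numerics;

public static class CmOperatorSketch
{
    // Effect of "a b c d e f cm" on the current transformation matrix.
    public static Matrix3x2 Concatenate(
        Matrix3x2 ctm, float a, float b, float c, float d, float e, float f)
    {
        // the new matrix goes in front, so it is applied to a point first
        return new Matrix3x2(a, b, c, d, e, f) * ctm;
    }

    public static void Main()
    {
        // 2 0 0 2 10 20 cm: scale by 2, then move by (10, 20)
        Matrix3x2 ctm = Concatenate(Matrix3x2.Identity, 2, 0, 0, 2, 10, 20);
        Vector2 p = Vector2.Transform(new Vector2(5, 5), ctm);
        Console.WriteLine("{0}, {1}", p.X, p.Y); // prints "20, 30"
    }
}
```

Operands e and f hold the translation, while a, b, c and d hold scaling, rotation and skewing.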

We have a page that measures 200 pt by 200 pt. The following image shows the empty page with the user space coordinate system laid on top of it:

We draw a red square measuring 50 by 50 and a smaller blue square measuring 25 by 25 inside the red square like this:

Next we transform the user space by translating it by (50, 75). Note that this is done before the figure is drawn.

Finally, the user space is rotated 30 degrees like this:

So instead of transforming the squares, we transform the user space and then draw the squares inside that user space. Depending on where you are coming from, this may feel counter-intuitive.
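The effect of the two transformations can be checked numerically. The sketch below builds the CTM for a translation by (50, 75) followed by a rotation of 30 degrees and computes where the corners of the red square end up on the page. It assumes that the square is drawn with its bottom-left corner at the origin of user space, which the text above does not state explicitly. All names are hypothetical.

```csharp
using System;
using System.Numerics;

public static class UserSpaceSketch
{
    // CTM after "1 0 0 1 tx ty cm" followed by "cos sin -sin cos 0 0 cm".
    public static Matrix3x2 BuildCtm(float tx, float ty, float degrees)
    {
        Matrix3x2 ctm = Matrix3x2.Identity;
        ctm = Matrix3x2.CreateTranslation(tx, ty) * ctm;
        // the rotation is added last, so it is applied to a point first
        ctm = Matrix3x2.CreateRotation((float)(degrees * Math.PI / 180)) * ctm;
        return ctm;
    }

    // Page positions of a square drawn at the origin of the transformed user space.
    public static Vector2[] SquareOnPage(float size, Matrix3x2 ctm)
    {
        var corners = new[]
        {
            new Vector2(0, 0),       // bottom-left
            new Vector2(size, 0),    // bottom-right
            new Vector2(size, size), // top-right
            new Vector2(0, size)     // top-left
        };
        for (int i = 0; i < corners.Length; i++)
        {
            corners[i] = Vector2.Transform(corners[i], ctm);
        }
        return corners;
    }

    public static void Main()
    {
        foreach (Vector2 p in SquareOnPage(50, BuildCtm(50, 75, 30)))
        {
            Console.WriteLine("({0:F1}, {1:F1})", p.X, p.Y);
        }
    }
}
```

The bottom-left corner stays at (50, 75), the new origin, and the bottom-right corner lands at roughly (93.3, 100). The operands of the square itself never change; only the CTM does.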

Shapes

From a development point of view, a sequence of operators is not a convenient format. For example, you cannot easily navigate to an image on the page and retrieve its position. Its properties depend on the accumulated effect of all previous operators, so you would have to process all of them first. The same is true for text and curves.
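To give an impression of the work involved, here is a minimal sketch that finds the position of an image by tracking the CTM through q, Q and cm until it reaches a Do operator. It relies on the fact that an image is always drawn into the unit square of user space, so the CTM at that moment determines its position and size. The names are hypothetical, the input is assumed to have one operator per line, and every Do is assumed to refer to an image. This is not how PDFKit.NET is implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Numerics;

public static class ImageLocatorSketch
{
    // Returns the CTM that is in effect at each "/Name Do".
    public static Dictionary<string, Matrix3x2> FindImages(IEnumerable<string> lines)
    {
        Matrix3x2 ctm = Matrix3x2.Identity;
        var saved = new Stack<Matrix3x2>();
        var found = new Dictionary<string, Matrix3x2>();
        foreach (string line in lines)
        {
            string[] t = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
            if (t.Length == 0) continue;
            switch (t[t.Length - 1]) // the operator is the last token
            {
                case "q": saved.Push(ctm); break;
                case "Q": ctm = saved.Pop(); break;
                case "cm":
                    ctm = new Matrix3x2(
                        Num(t[0]), Num(t[1]), Num(t[2]), Num(t[3]), Num(t[4]), Num(t[5])) * ctm;
                    break;
                case "Do": found[t[0]] = ctm; break;
            }
        }
        return found;
    }

    private static float Num(string token)
    {
        return float.Parse(token, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        string[] lines =
        {
            "q", "1 0 0 1 50 75 cm",
            "q", "100 0 0 40 0 0 cm", "/I1 Do", "Q",
            "Q"
        };
        Matrix3x2 m = FindImages(lines)["/I1"];
        // the image occupies the unit square, so transform its two opposite corners
        Vector2 bottomLeft = Vector2.Transform(new Vector2(0, 0), m);
        Vector2 topRight = Vector2.Transform(new Vector2(1, 1), m);
        Console.WriteLine("{0}, {1}", bottomLeft.X, bottomLeft.Y); // prints "50, 75"
        Console.WriteLine("{0}, {1}", topRight.X, topRight.Y);     // prints "150, 115"
    }
}
```

Even this reduced version has to replay every operator before the image, and it still ignores clipping, colors, fonts and nested content.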

Changing a graphic, such as moving a single image or rotating a piece of text would be even harder because you would have to insert operators in such a way that they would only affect the targeted graphic.

PDFKit.NET allows you to extract all graphics on a page as a collection of shape objects. Internally it will do all the hard work of interpreting the operators, creating shape objects from draw operators and assigning properties that reflect the current graphics state. After extracting the shapes, you can remove shapes, insert new shapes and change their respective properties. When done, you can write the shapes back to a PDF page. This will in turn generate the required sequence of operators and operands.

Example: replacing a logo

To demonstrate the use of shapes to edit graphics, we are going to replace a logo. See below the images of the original PDF and the PDF after replacing the logo:

Here is all the code:

static void Main(string[] args)
{
   using (FileStream fileIn = new FileStream(
      "indesign_shortcuts.pdf", FileMode.Open, FileAccess.Read))
   {
      Document pdfIn = new Document(fileIn);
      Document pdfOut = new Document();
      foreach (Page page in pdfIn.Pages)
      {
         ShapeCollection shapes = page.CreateShapes();
         replaceLogo(shapes);
         // add modified shapes to the new document
         Page newPage = new Page(page.Width, page.Height);
         newPage.Overlay.Add(shapes);
         pdfOut.Pages.Add(newPage);
      }
      using (FileStream fileOut = new FileStream(
         "out.pdf", FileMode.Create, FileAccess.Write))
      {
         pdfOut.Write(fileOut);
      }
   }
}
static void replaceLogo(ShapeCollection shapes)
{
   for (int i = 0; i < shapes.Count; i++)
   {
      Shape shape = shapes[i];
   
      if (shape is ShapeCollection)
      {
         // recurse
         replaceLogo(shape as ShapeCollection);
      }
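      else if (shape is ImageShape)
      {
         // note: every image that is found is treated as the logo; if a page
         // holds other images, check e.g. the size or position first
         ImageShape oldLogo = shape as ImageShape;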
         shapes.RemoveAt(i);
         ImageShape newLogo = new ImageShape("new-logo.png");
         newLogo.Transform = oldLogo.Transform;
         newLogo.Width = oldLogo.Width;
         newLogo.Height = oldLogo.Height;
         shapes.Insert(i, newLogo); 
      }
   }
}

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Written By

Frank Rem

Software Developer

Netherlands

Worked for some years as a software engineer, architect and project leader for different software companies. Works at TallComponents, vendor of class libraries for creating, manipulating and rendering PDF documents.