PDF to BMP using Adobe Reader and API Functions

21 Dec 2015
Extract pages from a .pdf file and save as bitmaps

Introduction

For one of my projects, I needed to extract pages from .pdf files as images that are as sharp as possible, that is, at their original resolution (DPI). A lot of software claims to extract images from PDFs, and I tried several solutions, but they do a poor job of saving an image at the original DPI. One might think that the original size is the same as a 100% zoom level view in Adobe Reader, but often that is not true. And when the image is saved as, for example, .jpg, it becomes even more distorted. So I decided to write my own.

Background

I was looking for a free solution. Adobe Reader is free and comes with an ActiveX control that can be embedded in VB6. However, the available functions are very limited (as opposed to the ActiveX control that comes with Adobe Pro). So extracting the pages was a challenge, and I had to turn to API calls to find (child) windows and send them messages. I also wanted to hide the ActiveX Reader window, because it is not pleasant to watch it being selected, deselected, and resized repeatedly. Sending keystrokes and mouse clicks to a hidden window, or getting it to repaint, requires extra coding.

Using the Code

The code addresses several topics:

  • Find the handle of any window within the application by its Classname and Text
  • Find the number of pages in a .pdf
  • Get the DPI of any page in a .pdf
  • Two methods to extract a page from a .pdf:
      • Send a mouse click to a hidden window
      • Simulate a Control-C input with API function keybd_event
      • Get data from the Clipboard with API functions
      • Two methods to obtain a DPI other than the original
      • Resize a PictureBox.Picture to a high-quality image in another PictureBox
      • Paint a hidden window's content to a PictureBox
  • Save as image with various image format options (bmp, gif, jpg, png, tif) using GDI+
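
One item from the list above, simulating Control-C with keybd_event, can be sketched as follows. This is a minimal illustration and not the code from the download: the names SendCtrlC and CopyPageToPicture are hypothetical, the target window is assumed to have keyboard focus already, and VB6's built-in Clipboard object stands in for the Clipboard API functions that the article's source uses.

```vb
' Sketch only. Assumes the window that should receive Ctrl+C has keyboard focus.
Private Declare Sub keybd_event Lib "user32" (ByVal bVk As Byte, _
    ByVal bScan As Byte, ByVal dwFlags As Long, ByVal dwExtraInfo As Long)

Private Const VK_CONTROL As Long = &H11       ' virtual-key code for Ctrl
Private Const VK_C As Long = &H43             ' virtual-key code for the C key
Private Const KEYEVENTF_KEYUP As Long = &H2   ' flag: key is being released

' Hypothetical helper: press and release Ctrl+C
Private Sub SendCtrlC()
    keybd_event VK_CONTROL, 0, 0, 0                 ' Ctrl down
    keybd_event VK_C, 0, 0, 0                       ' C down
    keybd_event VK_C, 0, KEYEVENTF_KEYUP, 0         ' C up
    keybd_event VK_CONTROL, 0, KEYEVENTF_KEYUP, 0   ' Ctrl up
    DoEvents                                        ' give the target time to react
End Sub

' Hypothetical helper: copy whatever bitmap ends up on the Clipboard into pic.
' Assumption: the Reader answers Ctrl+C by placing a bitmap on the Clipboard.
Private Sub CopyPageToPicture(pic As PictureBox)
    Clipboard.Clear
    SendCtrlC
    If Clipboard.GetFormat(vbCFBitmap) Then
        Set pic.Picture = Clipboard.GetData(vbCFBitmap)
    End If
End Sub
```

Because keybd_event injects input into the system-wide input stream, the keystroke goes to whichever window has focus at that moment, so the focus has to be set before calling SendCtrlC.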

Download the source code to view all of these topics with explanatory comments. The following snippets address two of them:

  • Find the handle of any window within the application by its Classname and Text:
VB
Private Function FindWindowHandle(ByVal hwnd As Long, _
    SelectClass As String, SelectText As String, bSelect As Boolean) As Long
' A recursive function that goes through all the descendant windows of the window with handle hwnd.
' Returns the handle of the window with Classname = SelectClass and a Window Text
' that either contains SelectText if bSelect = True, or does not contain SelectText if bSelect = False.
' (There often is more than one window with the same class name.)
' SelectText may be empty (""); the function then only searches for a Classname.
' Note: hwnd has to be ByVal, because it is reassigned inside this function.

    Dim sClass As String, sText As String
    Dim sLen As Long
    Dim ParentHwnd As Long
    Dim FoundHwnd As Long
    FoundHwnd = 0

' Get the class name of the window with handle hwnd
    sClass = Space(64)
    sLen = GetClassName(hwnd, sClass, 63)
    sClass = Left(sClass, sLen)
    If StrComp(sClass, SelectClass, vbTextCompare) = 0 Then
        If SelectText <> "" Then
' Get the Window Text of the window
            sText = Space(256)
            sLen = SendMessageS(hwnd, WM_GETTEXT, 255, sText)
            sText = Left(sText, sLen)
            If bSelect = True Then
' bSelect = True: if the text contains SelectText we have found the window
                If InStr(sText, SelectText) > 0 Then
' FoundHwnd is the handle of the window with the required Classname and Text
                    FoundHwnd = hwnd
                End If
            Else
' bSelect = False: if the text does not contain SelectText we have found the window
                If InStr(sText, SelectText) = 0 Then
                    FoundHwnd = hwnd
                End If
            End If
        Else
            FoundHwnd = hwnd
        End If
    End If

' If the window is found, return its handle and exit
    If FoundHwnd <> 0 Then
        FindWindowHandle = FoundHwnd
        Exit Function
    End If

' If the window is not found, look at the first child window
    ParentHwnd = hwnd
    hwnd = FindWindowX(hwnd, 0, 0, 0)
    Do While hwnd
' Recursion: this function calls itself to search the child windows of the child windows,
' so all descendants are visited, not just one level of child windows
        FoundHwnd = FindWindowHandle(hwnd, SelectClass, SelectText, bSelect)

        If FoundHwnd <> 0 Then
            Exit Do
        End If
' FindWindowX is called repeatedly to find the next child window
        hwnd = FindWindowX(ParentHwnd, hwnd, 0, 0)
    Loop

    FindWindowHandle = FoundHwnd

End Function
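
The function above uses the aliases FindWindowX and SendMessageS, whose Declare statements are part of the download and are not shown here. The following is a sketch of what such declarations could look like, followed by a hypothetical call. The parameter types are inferred from how the snippet calls the functions, and the class name "AVL_AVView" and window text "AVPageView" are example values that should be verified with Spy++ for the installed Reader version.

```vb
' Assumed declarations (inferred from the calls in FindWindowHandle, not copied from the download)
Private Declare Function GetClassName Lib "user32" Alias "GetClassNameA" _
    (ByVal hwnd As Long, ByVal lpClassName As String, ByVal nMaxCount As Long) As Long

' Class and window text are declared As Long so that 0 (a null pointer) can be passed
Private Declare Function FindWindowX Lib "user32" Alias "FindWindowExA" _
    (ByVal hWndParent As Long, ByVal hWndChildAfter As Long, _
     ByVal lpszClass As Long, ByVal lpszWindow As Long) As Long

' lParam is declared As String so that WM_GETTEXT can fill a string buffer
Private Declare Function SendMessageS Lib "user32" Alias "SendMessageA" _
    (ByVal hwnd As Long, ByVal wMsg As Long, ByVal wParam As Long, _
     ByVal lParam As String) As Long

Private Const WM_GETTEXT As Long = &HD

' Hypothetical usage from within a Form that hosts the Reader control
Private Sub ShowPageViewHandle()
    Dim hPageView As Long

    ' Example class name and text: check them with Spy++ before relying on them
    hPageView = FindWindowHandle(Me.hwnd, "AVL_AVView", "AVPageView", True)

    If hPageView = 0 Then
        Debug.Print "Window not found"
    Else
        Debug.Print "Found window: &H" & Hex$(hPageView)
    End If
End Sub
```

Starting the search at Me.hwnd keeps it inside the application's own window tree, which matches the "any window within the application" scope described above.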

The second snippet shows how to send a mouse click to a hidden window:

VB
Private Sub SendLeftClick(ByVal hwnd As Long, ByVal hwnd2 As Long, x As Long, y As Long)
' Send a left mouse click to the invisible window with handle hwnd,
' whose top-level parent window has handle hwnd2

    Dim position As Long

' Set the window as active window
    Call SetActiveWindow(hwnd)
' Calculate lParam to pass the mouse's x and y position in the window (x and y in pixels):
' x goes in the low word, y in the high word
    position = y * &H10000 + x
' The required messages with their wParam and lParam were found by using Spy++
' lParam is HTCLIENT (= 1, low word) and WM_LBUTTONDOWN (= &H201, high word)
    Call SendMessage(hwnd, WM_MOUSEACTIVATE, ByVal hwnd2, ByVal CLng(&H2010001))
    Call SendMessage(hwnd, WM_SETCURSOR, ByVal CLng(0), ByVal CLng(&H2010001))
' wParam 1 means that the left button is down
    Call SendMessage(hwnd, WM_LBUTTONDOWN, ByVal CLng(1), ByVal position)
    Call SendMessage(hwnd, WM_LBUTTONUP, ByVal CLng(0), ByVal position)

End Sub
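
The lParam calculation used for the mouse messages (x in the low word, y in the high word) can be isolated in small helpers. This is a sketch under the assumption that both coordinates are non-negative client coordinates; the names MakeMouseLParam, LoWord and HiWord are hypothetical and do not appear in the download.

```vb
' Pack client coordinates into an lParam: low word = x, high word = y.
' y is limited to 0..32767 so that the multiplication cannot overflow a Long.
Private Function MakeMouseLParam(ByVal x As Long, ByVal y As Long) As Long
    MakeMouseLParam = ((y And &H7FFF&) * &H10000) Or (x And &HFFFF&)
End Function

' Extract the low word (bits 0..15) of a Long
Private Function LoWord(ByVal value As Long) As Long
    LoWord = value And &HFFFF&
End Function

' Extract the high word of a non-negative Long (bits 16..30)
Private Function HiWord(ByVal value As Long) As Long
    HiWord = (value And &H7FFF0000) \ &H10000
End Function
```

Note the trailing & in &HFFFF&: without it VB6 reads &HFFFF as the Integer -1 and the mask has no effect. Applied to the constant &H2010001 from the snippet above, LoWord gives 1 (HTCLIENT) and HiWord gives &H201 (WM_LBUTTONDOWN).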

Points of Interest

The application was written in VB6. I still like it a lot more than .NET, and IMHO it shows that anything can be done with VB6 and a few API calls. Of course, if you have another favorite programming language, the source can be rewritten. That should be fairly easy if you are familiar with the API, because the API functions are the core of this application.

History

Update: Adapted the code for Acrobat Reader DC. The DC ActiveX does not work with VB6 (nor with Visual Basic 2015, for that matter). Solution: Rename C:\Program Files (x86)\Common Files\Adobe\Acrobat\ActiveX\ and add a new directory (... )\ActiveX\ with the Acropdf.dll from the download in it. (The code also still works with previous versions of the Reader.) Also added: saving images in various image formats, and a routine to make PrintWindow() work with Adobe. The API function PrintWindow is notorious for returning black images with some applications, like Adobe. So I needed to add a check for black results, but more can be done to optimize the result.

VB
Call RedrawWindow(PageViewhWnd, ByVal 0&, ByVal 0&, _
    RDW_ERASE Or RDW_INVALIDATE Or RDW_FRAME Or RDW_ALLCHILDREN Or RDW_UPDATENOW)

' PrintWindow often returns a black image,
' especially with some applications (Adobe, among others).
' The problem is that the window has not finished
' its asynchronous painting when PrintWindow is done.
' The following calls seem to improve the PrintWindow result
' by adding to the time that PrintWindow is busy.
' Also, when printing a large window the chance that it returns black increases.
' It seems that blocks of 1024 x 1024 pixels can be returned in most cases,
' so PicSrc (the container for AcroPdf) is 1024 x 1024 and the full-size AcroPdf
' (with the PageView window) is moved (for example, AcroPdf1.Top = -1024)
For i = 1 To 5
' Repeatedly call PrintWindow
    PrintWindow picSrc.hWnd, PicTemp.hDC, 0&

    For j = 1 To 1000
' Send extra WM_PAINTs (or call RedrawWindow in this loop, that also works but takes longer)
        retVal = PostMessage(PageViewhWnd, WM_PAINT, PicTemp.hDC, 0&)
    Next j

Next i
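
The check for black results mentioned above is not shown in this snippet. One possible way to do such a check is to sample pixels from the target PictureBox with the GDI function GetPixel, as in the sketch below. The names LooksBlack and TryCapture, the retry count, and the sampling step of 32 pixels are assumptions for illustration, not the routine from the download; the target PictureBox is assumed to have AutoRedraw = True so that its hDC refers to a persistent bitmap.

```vb
Private Declare Function GetPixel Lib "gdi32" _
    (ByVal hdc As Long, ByVal x As Long, ByVal y As Long) As Long
Private Declare Function PrintWindow Lib "user32" _
    (ByVal hwnd As Long, ByVal hdcBlt As Long, ByVal nFlags As Long) As Long

' Hypothetical helper: True if every sampled pixel of pic is black (or cannot be read).
' Only every stepPx-th pixel is tested in both directions to keep the check fast.
Private Function LooksBlack(pic As PictureBox, ByVal stepPx As Long) As Boolean
    Dim x As Long, y As Long
    Dim w As Long, h As Long

    If stepPx < 1 Then stepPx = 1
    ' Convert the drawing area from the control's scale mode to pixels
    w = pic.ScaleX(pic.ScaleWidth, pic.ScaleMode, vbPixels)
    h = pic.ScaleY(pic.ScaleHeight, pic.ScaleMode, vbPixels)

    LooksBlack = True
    For y = 0 To h - 1 Step stepPx
        For x = 0 To w - 1 Step stepPx
            ' GetPixel returns 0 for black and -1 (CLR_INVALID) on failure
            If GetPixel(pic.hDC, x, y) > 0 Then
                LooksBlack = False
                Exit Function
            End If
        Next x
    Next y
End Function

' Hypothetical usage: retry PrintWindow until the result is no longer black
Private Function TryCapture(src As PictureBox, dst As PictureBox) As Boolean
    Dim i As Long
    For i = 1 To 5
        PrintWindow src.hWnd, dst.hDC, 0&
        If Not LooksBlack(dst, 32) Then
            TryCapture = True
            Exit Function
        End If
        DoEvents   ' let the source window finish painting before the next attempt
    Next i
End Function
```

A page that really is entirely black would be reported as a failed capture by this check, so it is a heuristic only.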

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Retired
Netherlands
My name is Franciska Ruessink. I studied IT at The Hague Hogeschool (University) and was employed in IT as software developer, database developer, network administrator, and IT project manager for several years.
After an early retirement I continue to write software for the fun of it and to help friends. I also did some web development. I usually work with VB but recently started again with VC++.
