Shell Extension with Keyboard Hook

31 Aug 2008 | CPOL | 9 min read

Introduction

Some time ago, I discovered something I didn't like about Windows. The only directory you can move a file to on a single key press is the Recycle Bin. So I decided to fix that by mapping F12 to a smart move feature that moves the selected files to Documents, Pictures, Music or Videos, depending on the extension. While doing that, I discovered that adding a keyboard hook to Windows Explorer is far from trivial. There's a good reason for that: you should not do it, unless you really have a good use case for it. I'm posting this to help anyone who's trying to find out how to hook to keyboard events in Windows Explorer. The sample covers my Smart Move application, which maps F12 to this move feature. I must warn you that the project is Vista-specific. It will fail to register in XP, due to missing dependencies. I'll provide pointers on how to alter it to work on XP (and earlier) where appropriate.

Disclaimer

I'm not a native developer by trade. I'm a C# developer with a huge VB background, so I might not be providing the cleanest code or implementation. I'm especially worried about the string handling, as there are more than enough string types in C++. I'm not using unsafe functions, and I'm freeing memory appropriately, so I think it should be OK, but be warned that the main point of the article is the keyboard hook and the rest is optional. I used a lot of external information to build this: a few sites I mention as we go, MSDN, and a book by Dino Esposito (old but still useful) that I borrowed from a friend.

Shell Extensions

Coding a Shell Extension in Visual Studio is easy. You must open a new ATL Project, and add a Class of type ATL Simple Object. The class will be configured as-is by default, except for the IObjectWithSite checkbox, which you must check. I called my class ShellExtension, creating an IShellExtension interface and a CShellExtension class.

So let's start with the hook.

Since you added the IObjectWithSite interface, your object will be "sited". What this means in this sample is that your SetSite method will be called if present. In the ShellExtension.h file, I have two public methods to make this happen.

The public section of the ShellExtension.h file looks like this:

C++
public:
    STDMETHOD(SubclassExplorer) (bool SubClass);
    STDMETHOD(SetSite) (IUnknown *pUnkSite);

I also have a few methods in the private section, and two private variables. I'll get to them in a moment:

C++
private:
    bool m_Subclassed;

    static BOOL CALLBACK WndEnumProc(HWND, LPARAM);
    static LRESULT CALLBACK KeyboardProc(int, WPARAM, LPARAM);
    static LRESULT CALLBACK NewExplorerWndProc(HWND, UINT, WPARAM, LPARAM);

    static VOID MoveSelectedFiles();
    static BOOL FindIShellView(HWND, IShellView**);

    static void AddFileToArray(LPCWSTR, LPWSTR, IFileOperation*);

The m_Subclassed boolean tells me whether the object is currently hooked to events. m_hwndExplorer (its declaration is omitted from the listing above) holds the HWND of the Explorer window I'm listening to.

The SetSite method, then, looks like this:

C++
STDMETHODIMP CShellExtension::SetSite(IUnknown *pUnkSite)
{
    HRESULT hr = SubclassExplorer(true);
    if (SUCCEEDED(hr))
        m_Subclassed = true;

    return S_OK;
}

SetSite calls SubclassExplorer, where the actual hook is installed. The destructor for this class (which you also have to declare in ShellExtension.h) is almost the same, but it instructs SubclassExplorer to unhook. It looks like this:

C++
CShellExtension::~CShellExtension()
{
    if (m_Subclassed)
    {
        SubclassExplorer(false);
        m_Subclassed = false;
    }
}

Registration

You'll have to register the class under HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\Browser Helper Objects if you want Explorer to load it. To add this registration to the COM self-registration of the CShellExtension class, add the following to ATLSmartMove.rgs (where ATLSmartMove is your project name):

HKLM {
  NoRemove SOFTWARE {
    NoRemove Microsoft {   
      NoRemove Windows {
        NoRemove CurrentVersion {
          NoRemove Explorer {
            NoRemove 'Browser Helper Objects' {
              ForceRemove '{6274E69B-9A6C-4818-97BA-123D645719C8}' = s 'SmartMove' 
            }
          }
        }
      }
    }
  }
}

The GUID must be the CLSID of your coclass, since Browser Helper Objects are registered by class ID (you can find it on the coclass entry in ATLSmartMove.idl in this project).

The Hook

So what does SubclassExplorer look like? Let me walk you through it line by line.

First, the header:

C++
STDMETHODIMP CShellExtension::SubclassExplorer(bool bSubclass){ 

I receive a single boolean value, telling me to hook (if true) or unhook (if false) from events.

Continuing with the hook, I have two if conditions. In the first one, if I'm subclassing (and not already subclassed), I set the actual hook.

C++
if (bSubclass && !m_Subclassed)
{
    g_hHook = SetWindowsHookEx(WH_KEYBOARD, KeyboardProc, NULL, GetCurrentThreadId());
}

SetWindowsHookEx installs the hook for the current thread only (that's what the GetCurrentThreadId() argument is for), and from then on keyboard events arrive at KeyboardProc. Unhooking is just as simple:

C++
if (!bSubclass && m_Subclassed)
{
     UnhookWindowsHookEx(g_hHook);
}

Return S_OK (as everything worked), and be done with the hook!

If all you were interested in was the hook, that's it. If you are interested in the F12 code, stay with me a little longer.

Getting Keyboard Events

As I told you, the keyboard events are now being dispatched to KeyboardProc. The code for KeyboardProc is fairly simple: if the event is mine, I handle it, and either way I pass it on to the next hook. I also bail out early on a few conditions, because this code runs for every single key pressed in any Windows Explorer window (which is almost everywhere in Windows) and I want it to be fast.

The hook is also called more than once per key press: once for keydown, once for keyup, and once for every auto-repeat while the key is held down. The flag comparisons with lParam filter those extra calls out: bit 31 (0x80000000) is set when the key is being released, and bit 30 (0x40000000) is set when the key was already down before the message. NEWFOLDERKEY is a constant that maps to VK_F12. KeyboardProc is a static method (there's no easy way to hook a non-static method with SetWindowsHookEx), so I can't use instance variables here or in the called methods. The code looks like this:

C++
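LRESULT CALLBACK CShellExtension::KeyboardProc(int nCode, WPARAM wParam, LPARAM lParam)
{
    // Negative codes must be passed on without any processing.
    if (nCode < 0)
        return CallNextHookEx(g_hHook, nCode, wParam, lParam);

    // Bit 31: the key is being released. Bit 30: the key was already down
    // (auto-repeat). Only the initial key-down gets past this point.
    if ((lParam & 0x80000000) || (lParam & 0x40000000))
        return CallNextHookEx(g_hHook, nCode, wParam, lParam);

    if (wParam == NEWFOLDERKEY)
    {
        MoveSelectedFiles();
    }

    // Always let the rest of the hook chain see the key.
    return CallNextHookEx(g_hHook, nCode, wParam, lParam);
}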
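
To make the flag test less cryptic, here is a minimal, portable sketch of the same check. The names kPreviousStateBit, kTransitionBit and isFreshKeyDown are made up for this illustration and are not part of the project. The bit positions are the documented layout of the lParam passed to a WH_KEYBOARD hook. The sketch takes a plain 32-bit value instead of an LPARAM so that it builds without the Windows headers.

```cpp
#include <cassert>
#include <cstdint>

// Bit 30 of the WH_KEYBOARD lParam: set when the key was already down
// before this message (auto-repeat while the key is held).
constexpr std::uint32_t kPreviousStateBit = 0x40000000u;

// Bit 31: set when the key is being released (key-up).
constexpr std::uint32_t kTransitionBit = 0x80000000u;

// Hypothetical helper: true only for the first key-down of a key press.
constexpr bool isFreshKeyDown(std::uint32_t lParam)
{
    return (lParam & (kPreviousStateBit | kTransitionBit)) == 0;
}

// Usage: the three kinds of calls a single key press can produce.
// The low 16 bits carry the repeat count and do not affect the result.
inline void keyFlagExamples()
{
    assert(isFreshKeyDown(0x00000001u));   // first key-down
    assert(!isFreshKeyDown(0x40000001u));  // auto-repeat while held
    assert(!isFreshKeyDown(0xC0000001u));  // key-up
}
```

Giving the two masks names costs nothing at run time and documents why both of them lead to an early exit.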

Move Selected Files

Moving the files is actually quite easy. What's difficult is knowing which files are selected. Here's what the method header looks like:

C++
void CShellExtension::MoveSelectedFiles()
{

Although I'm hooked to the Windows Explorer window, I'm not really in the process as a regular Shell Extension. In a context menu, for instance, you'll get the information on what is selected when the code calls you. That's not true here. Luckily, you can always know what's selected on a Windows Explorer window, even from outside the process. You only need the HWND for that.

The next few lines are used to find the hwnd for the current explorer instance. As I'm sited, I'm running on the same thread as the explorer. So to find the hwnd, I walk every window in the thread and look for the top-level Explorer window. It looks like this:

C++
EnumThreadWindows(GetCurrentThreadId(), WndEnumProc, reinterpret_cast<LPARAM>(&m_hwndExplorer));

// MoveSelectedFiles returns void, so just bail out if no Explorer window was found.
if (!IsWindow(m_hwndExplorer))
{
    return;
}
else
    g_hwndExplorer = m_hwndExplorer;

The call will send every window in the thread to WndEnumProc (one of our private methods) and stop when WndEnumProc returns FALSE. WndEnumProc looks for CabinetWClass, the class of the top-level Explorer window in Vista. If you want to port this code to an older OS, you should look for ExploreWClass instead. WndEnumProc is pretty simple, and it looks like this:

C++
BOOL CALLBACK CShellExtension::WndEnumProc(HWND hwnd, LPARAM lParam)
{
    TCHAR szClassName[MAX_PATH] = {0};

    GetClassName(hwnd, szClassName, MAX_PATH);

    if (!lstrcmpi(szClassName, __TEXT("CabinetWClass")))
    {
        HWND* phWnd = reinterpret_cast<HWND*>(lParam);
        *phWnd = hwnd;
        return FALSE;
    }

    return TRUE;
}  
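
The pattern used here (pass the address of a result variable through an integer-sized context parameter, write to it on a match, and return FALSE to stop) can be sketched without the Windows headers. Everything in this block is a stand-in invented for illustration: FakeWindow, enumWindows, findCabinetWindow and findExplorerHandle are hypothetical, and the comparison is case-sensitive, unlike the lstrcmpi call above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Stand-in for a window: an integer "handle" plus its class name.
struct FakeWindow
{
    int handle;
    std::string className;
};

// Same shape as WNDENUMPROC: return false to stop the enumeration.
using EnumCallback = bool (*)(const FakeWindow&, std::intptr_t);

// Mimics EnumThreadWindows: calls the callback until it returns false.
void enumWindows(const std::vector<FakeWindow>& windows,
                 EnumCallback callback, std::intptr_t context)
{
    for (const FakeWindow& window : windows)
    {
        if (!callback(window, context))
            break;
    }
}

// Mirrors WndEnumProc: store the first match through the context pointer.
bool findCabinetWindow(const FakeWindow& window, std::intptr_t context)
{
    assert(context != 0);
    if (window.className == "CabinetWClass")
    {
        *reinterpret_cast<int*>(context) = window.handle;
        return false;  // found it, stop enumerating
    }
    return true;       // keep going
}

int findExplorerHandle(const std::vector<FakeWindow>& windows)
{
    int found = 0;  // cleared first, so "no match" is detectable afterwards
    enumWindows(windows, findCabinetWindow,
                reinterpret_cast<std::intptr_t>(&found));
    return found;
}
```

Note that the result variable is reset before each enumeration. Without that, a handle left over from an earlier call would survive when nothing matches.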

I also need to get the paths to all the common folders (Documents, Pictures, Videos and Music). SHGetKnownFolderPath and the FOLDERID constants are new in Vista, so if you're porting to XP you'll have to use SHGetFolderPath with the matching CSIDL constants, and check which ones work. The code looks like this:

C++
SHGetKnownFolderPath(FOLDERID_Documents, 0, NULL, ppszDocumentsPath);
SHGetKnownFolderPath(FOLDERID_Music, 0, NULL, ppszMusicPath);
SHGetKnownFolderPath(FOLDERID_Pictures, 0, NULL, ppszPicturesPath);
SHGetKnownFolderPath(FOLDERID_Videos, 0, NULL, ppszVideosPath);

Next, I'll show you the code to get an IShellView from the HWND of the Explorer window. I took it from Raymond Chen's blog. I understand what it's doing, but I could never have guessed it myself. So here it goes:

C++
BOOL CShellExtension::FindIShellView(HWND hwnd, IShellView** psv)
{
    BOOL fFound = FALSE;
    IShellWindows *psw;
    if (SUCCEEDED(CoCreateInstance(CLSID_ShellWindows, NULL, CLSCTX_ALL,
                                   IID_IShellWindows, (void**)&psw))) {
        VARIANT v;
        V_VT(&v) = VT_I4;
        IDispatch *pdisp;
        for (V_I4(&v) = 0; !fFound && psw->Item(v, &pdisp) == S_OK;
             V_I4(&v)++) {
            IWebBrowserApp *pwba;
            if (SUCCEEDED(pdisp->QueryInterface(IID_IWebBrowserApp, (void**)&pwba))) {
                HWND hwndWBA;
                if (SUCCEEDED(pwba->get_HWND((LONG_PTR*)&hwndWBA)) &&
                    hwndWBA == hwnd) {
                    IServiceProvider *psp;
                    if (SUCCEEDED(pwba->QueryInterface(IID_IServiceProvider, (void**)&psp))) {
                        IShellBrowser *psb;
                        if (SUCCEEDED(psp->QueryService(SID_STopLevelBrowser,
                                          IID_IShellBrowser, (void**)&psb))) {
                            if (SUCCEEDED(psb->QueryActiveShellView(psv))) {
                                fFound = TRUE;
                            }
                            psb->Release();
                        }
                        psp->Release();
                    }
                }
                pwba->Release();
            }
            pdisp->Release();
        }
        psw->Release();
    }

    return fFound;
}

Cool, huh? So let's move on. IShellView will give you the list of files that are selected.

The following code I took from here. It's a part of MoveSelectedFiles, so let's go to that:

C++
IShellView* psv;
if (FindIShellView(g_hwndExplorer, &psv))
{
    CComPtr<IDataObject> spDataObject;
    if (SUCCEEDED(psv->GetItemObject(SVGIO_SELECTION,
          IID_PPV_ARGS(&spDataObject))))
    {
        FORMATETC fmt = { CF_HDROP, NULL, DVASPECT_CONTENT,
                          -1, TYMED_HGLOBAL };
        STGMEDIUM stg;
        stg.tymed =  TYMED_HGLOBAL;

        if (SUCCEEDED(spDataObject->GetData(&fmt, &stg)))
        {
            HDROP hDrop = (HDROP) GlobalLock ( stg.hGlobal );

            UINT uNumFiles = DragQueryFile ( hDrop, 0xFFFFFFFF, NULL, 0 );
            HRESULT hr = S_OK;

            IFileOperation *pfo;

            hr = CoCreateInstance(CLSID_FileOperation,
                          NULL,
                          CLSCTX_ALL,
                          IID_PPV_ARGS(&pfo));

            pfo->SetOperationFlags(FOFX_SHOWELEVATIONPROMPT);

In this first part, I find the IShellView from the hwnd of the window, and use it to get the selected files (that's what that SVGIO_SELECTION constant up there means). DragQueryFile will tell me the names of the files. I also create a FileOperation object (cast to the IFileOperation interface). This is a Vista-specific class that will let me move the files from one place to another. The earlier version of the same thing was the SHFileOperation function, but IFileOperation is way cooler. You can get more information here.

Next, I iterate through the files, adding them to the collection:

C++
for (UINT i = 0; i < uNumFiles; i++)
{
    TCHAR szPath[MAX_PATH];
    szPath[0] = 0;
    DragQueryFile(hDrop, i, szPath, MAX_PATH);

    // Skip entries that could not be read, directories and drive roots.
    if (szPath[0] != 0 && !(PathIsDirectory(szPath) || PathIsRoot(szPath)))
    {
        // PathMatchSpecEx returns S_OK on a match and S_FALSE otherwise.
        if (PathMatchSpecEx(szPath, GRAPHICFILES, PMSF_MULTIPLE) == S_OK)
            AddFileToArray(szPath, *ppszPicturesPath, pfo);
        else if (PathMatchSpecEx(szPath, VIDEOFILES, PMSF_MULTIPLE) == S_OK)
            AddFileToArray(szPath, *ppszVideosPath, pfo);
        else if (PathMatchSpecEx(szPath, MUSICFILES, PMSF_MULTIPLE) == S_OK)
            AddFileToArray(szPath, *ppszMusicPath, pfo);
        else
            AddFileToArray(szPath, *ppszDocumentsPath, pfo);
    }
}

I'm ignoring directories and root paths. Also, I check the extension to see where to put each file. PathMatchSpecEx tells me whether szPath matches a wildcard pattern (*.jpg, *.bmp, etc). GRAPHICFILES, VIDEOFILES and MUSICFILES are constants where I hardcoded well-known extensions for each file type:

C++
// Array sizes are deduced from the literals, so the lists can be edited safely.
const TCHAR GRAPHICFILES[] = TEXT("*.png;*.bmp;*.jpg;*.gif;*.jpeg;*.pcd;*.pcx;*.svg");
const TCHAR VIDEOFILES[] = TEXT("*.avi;*.mpeg;*.mpg;*.flv;*.swf;*.fla;*.wmv;*.3gp;*.divx;*.rm;*.rmvb;*.srt;*.xvid;*.vid");
const TCHAR MUSICFILES[] = TEXT("*.aac;*.aif;*.aiff;*.aud;*.m3u;*.mid;*.midi;*.mp1;*.mp2;*.mp3;*.mpa;*.mpga;*.ogg;*.omf;*.omg;*.ra;*.r1m;*.wav;*.wave;*.wma");
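
For readers who want to try out the routing rule away from the shell APIs, here is a rough portable sketch of the same decision: pictures first, then videos, then music, with Documents as the fallback. It compares lower-cased extensions instead of calling PathMatchSpecEx, and only lists a few of the extensions above. Destination, lowerExtension, contains and destinationFor are names invented for this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

enum class Destination { Documents, Pictures, Videos, Music };

// Returns the lower-cased extension including the dot, or "" if the file
// name has none (a dot in a parent folder name does not count).
std::string lowerExtension(const std::string& path)
{
    const std::string::size_type slash = path.find_last_of("\\/");
    const std::string::size_type dot = path.find_last_of('.');
    if (dot == std::string::npos ||
        (slash != std::string::npos && dot < slash))
        return "";
    std::string ext = path.substr(dot);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}

bool contains(const std::vector<std::string>& list, const std::string& ext)
{
    return std::find(list.begin(), list.end(), ext) != list.end();
}

// Same precedence as the loop in MoveSelectedFiles.
Destination destinationFor(const std::string& path)
{
    static const std::vector<std::string> pictures = {".png", ".bmp", ".jpg", ".gif", ".jpeg"};
    static const std::vector<std::string> videos = {".avi", ".mpeg", ".mpg", ".wmv"};
    static const std::vector<std::string> music = {".mp3", ".ogg", ".wav", ".wma"};

    const std::string ext = lowerExtension(path);
    if (contains(pictures, ext)) return Destination::Pictures;
    if (contains(videos, ext)) return Destination::Videos;
    if (contains(music, ext)) return Destination::Music;
    return Destination::Documents;
}

// Usage example.
inline void routeExamples()
{
    assert(destinationFor("C:\\tmp\\photo.JPG") == Destination::Pictures);
    assert(destinationFor("report.docx") == Destination::Documents);
}
```

Keeping the rule in one small function like this makes it easy to check the precedence and the fallback without touching Explorer at all.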

AddFileToArray adds the move operation to the IFileOperation. It creates two IShellItem objects from the path of the source and the destination. It's pretty simple, and it works like this:

C++
void CShellExtension::AddFileToArray(LPCWSTR szPath, LPWSTR pszDest, IFileOperation* pfo)
{
    IShellItem *psiFrom = NULL;

    HRESULT hr = SHCreateItemFromParsingName(szPath, NULL, IID_PPV_ARGS(&psiFrom));

    if (SUCCEEDED(hr))
    {
        IShellItem *psiTo = NULL;

        if (NULL != pszDest)
            hr = SHCreateItemFromParsingName(pszDest,  NULL, IID_PPV_ARGS(&psiTo));

        if (SUCCEEDED(hr))
        {
            hr = pfo->MoveItem(psiFrom, psiTo, NULL, NULL);
            if (NULL != psiTo)
                psiTo->Release();
        }

        psiFrom->Release();
    }
} 

Finally, I call the PerformOperations() method on the IFileOperation object, and do some cleanup:

C++
                pfo->PerformOperations();
                pfo->Release();

                GlobalUnlock ( stg.hGlobal );
                ReleaseStgMedium ( &stg );

            }

        }

        psv->Release();
    }

    CoTaskMemFree(*ppszDocumentsPath);
    CoTaskMemFree(*ppszMusicPath);
    CoTaskMemFree(*ppszPicturesPath);
    CoTaskMemFree(*ppszVideosPath);
} 

And that's it!

Points of Interest

So, there you go. Vista breaks compatibility for Shell Extensions in a rather spectacular way, but it's pretty cool about it. IFileOperation is way better than SHFileOperation. Let me explain why (moved up here from the comments):

  1. In SHFileOperation, to use several items, you use a \0 separated string. That's bad in a few ways, the worst of them being that after you zero out the memory, you have to work out by hand the position where the next item starts (as in szMyItems[wcslen(szMyItems) + 1]). In my first version, using SHFileOperation, I actually did that. It sent a shiver down my spine.
  2. IFileOperation is transactional. That is, you can put 3 copy operations, 4 move operations and 1 delete in a single PerformOperations() call, and when you hit Undo in Windows, it will undo all 8 operations. That's my favorite feature.
  3. IFileOperation is more verbose, but also way cleaner. The same code in SHFileOperation includes two strings which may be wrongly formatted and carry no information whatsoever on the amount or type of data they contain. The IShellItem is way more expressive: if you have several, you use a collection. The collection has a different method. You can name every item.
  4. As far as functionality goes, SHFileOperation isn't very consistent when returning error codes (see: http://shellrevealed.com/blogs/shellblog/archive/2006/09/11/Common-Questions-Concerning-the-SHFileOperation-API_3A00_-Part-1.aspx). There are several errors that actually return S_OK, and you have to check fAnyOperationsAborted to see if it actually worked. IFileOperation returns an HRESULT with S_OK or failure at the PerformOperations level, so you can check there. It shows the error messages on its own, so you don't have to show them. As a matter of fact, you only need to get the return value if you're planning on doing something with it, and that's why I'm ignoring it.
  5. IFileOperation provides a flag for Elevation in Vista.
  6. IFileOperation provides a way of hooking a sink to the progress, so you can know how many files were copied, how many are left, etc. in an event-oriented fashion.
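
To illustrate the first point, this is roughly what building the \0 separated list looks like when a std::wstring does the bookkeeping instead of hand-computed offsets. It is only a sketch: makeDoubleNullList is a hypothetical helper, not something from the attached source. It relies on SHFileOperation expecting every path to end in a null character and the whole list to end in one more.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds a double-null-terminated path list: each path is followed by one
// L'\0', and the whole list by a second L'\0'.
std::wstring makeDoubleNullList(const std::vector<std::wstring>& paths)
{
    std::wstring list;
    for (const std::wstring& path : paths)
    {
        // An embedded null would silently split one path into two entries.
        assert(path.find(L'\0') == std::wstring::npos);
        list += path;
        list.push_back(L'\0');  // terminator for this path
    }
    list.push_back(L'\0');      // terminator for the whole list
    return list;                // list.c_str() would be passed as pFrom
}
```

std::wstring can hold embedded null characters, so there is no need to zero a buffer or compute wcslen(...) + 1 offsets by hand.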

Also, hooking to keyboard events in Windows Explorer is way more difficult than, say, hooking the right click of a mouse. The fact that KeyboardProc should be static makes sense because there's really only one process, but it's pretty limiting.

At the end of the day, I think it was an interesting little project to undertake, and I hope you can get something from it. I might have omitted details in the code, but the attached source is complete. My next objective is to be able to deploy it, which I haven't managed so far.

Finally, thanks to all the people I took information from!

Update

After a whole day of using the extension, I'm finding out that the hook gets unloaded after some idle time. It shouldn't, and I'm investigating why it does. The result is that closing the Explorer window after the extension got unloaded will crash your Explorer process. I'll post the fix as soon as I find it.

Found the problem: there were quite a few static variables/references, and multiple Explorer windows were stepping on each other. I changed the code and the article to avoid this.
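
One general way to keep several Explorer windows from overwriting each other's state is to store the hook handle per thread instead of in a single global. The sketch below shows that idea in portable C++ and is not necessarily what the attached source does. HookRegistry, ThreadId and HookHandle are stand-ins invented for this illustration (in real code they would be the DWORD from GetCurrentThreadId and the HHOOK from SetWindowsHookEx). A real version would also need a lock around the map, since each Explorer window runs on its own thread.

```cpp
#include <cassert>
#include <map>

// Stand-ins for the Win32 types so the sketch builds anywhere.
using ThreadId = unsigned long;  // plays the role of a DWORD thread id
using HookHandle = int;          // plays the role of an HHOOK; 0 means "none"

// Hypothetical registry: one hook handle per thread.
class HookRegistry
{
public:
    // Registers a hook for a thread. Returns false, and changes nothing,
    // if that thread already has one.
    bool add(ThreadId thread, HookHandle hook)
    {
        assert(hook != 0);
        return hooks_.emplace(thread, hook).second;
    }

    // Looks up the hook for a thread, or 0 if it has none.
    HookHandle find(ThreadId thread) const
    {
        const auto it = hooks_.find(thread);
        return it == hooks_.end() ? 0 : it->second;
    }

    // Removes and returns the hook for a thread, or 0 if it has none.
    HookHandle remove(ThreadId thread)
    {
        const auto it = hooks_.find(thread);
        if (it == hooks_.end())
            return 0;
        const HookHandle hook = it->second;
        hooks_.erase(it);
        return hook;
    }

private:
    std::map<ThreadId, HookHandle> hooks_;
};
```

With something like this, a static KeyboardProc could look up the handle for the current thread before calling CallNextHookEx, and closing one window would only unhook that window's thread.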

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)