Microsoft Internet Explorer was fully scriptable using OLE Automation. This functionality is no longer available with the new Microsoft Edge browser. This tip presents a way to automate Edge and other Chrome based browsers using only VBA.
Introduction
Internet Explorer classic (IE in the following) was based on ActiveX technology. It was very easy to automate IE for tasks like Webscraping or testing from OLE-aware programming languages like VBA. But Microsoft will end support for IE in the near future and wants users to move to newer browsers like Microsoft Edge.
Microsoft Edge is no longer based on ActiveX technology. Microsoft seems uninterested in creating a drop-in replacement for the IE OLE Object. There are libraries that try to fill this gap using Selenium, see Seleniumbasic as an example. But this requires the installation of a Webdriver, which might not be feasible in some environments. The following solution needs no additional software, apart from a Chrome-based browser.
Keep in mind, that all running Edge procceses must be terminated before running the code. Otherwise the tabs are opened in the currently running process, not the one that has been started and subsequent communication between VBA and Edge fails.
CDP Protocol
The code uses the Chrome Devtools Protocol (CDP) to communicate with the browser. A full documentation of the protocol can be found here. The code implements only a very narrow set of functions:
- Basic functions to set up the communication channel
- Navigation to a url
- Evaluate arbitrary JavaScript expressions in the context of a page and return the result
But these functions should suffice to do basic Webscraping. The main code is as follows:
Sub runedge()
Dim objBrowser As clsEdge
Set objBrowser = New clsEdge
Call objBrowser.start
Call objBrowser.attach("")
Call objBrowser.navigate("https://google.de")
Call objBrowser.waitCompletion
Call objBrowser.jsEval("alert(""hi"")")
Call objBrowser.jsEval("document.getElementsByName(""q"")[0].value=""automate edge vba""")
Call objBrowser.jsEval("document.getElementsByName(""q"")[0].form.submit()")
Call objBrowser.waitCompletion
Call objBrowser.jsEval("document.evaluate("".//h3[text()='Automate Chrome / Edge using VBA - CodeProject']"", document).iterateNext().click()")
Call objBrowser.waitCompletion
Dim strVotes As String
strVotes = objBrowser.jsEval("ctl00_RateArticle_VountCountHist.innerText")
MsgBox ("finish! Vote count is " & strVotes)
objBrowser.closeBrowser
End Sub
The class clsEdge
implements the CDP protocol. The CDP protocol is a message-based protocol. Messages are encoded as JSON. To generate and parse JSON, the code uses the VBA-JSON library from here.
Low-Level Communication with Pipes
The low-level access to the CDP protocol is avaible by two means: Either Edge starts a small Webserver on a specific port or via pipes. The Webserver lacks any security features. Any user on the computer has access to the webserver. This may pose no risks on single user computers or dedicated virtual containers. But if the process is run on a terminal server with more than one user, this is not acceptable. That's why the code uses pipes to communicate with Edge.
Edge uses the third file descriptor (fd) for reading messages and the fourth fd for writing messages. Passing fds from a parent process to child process is common under Unix, but not under Windows. The WinApi call to create a child process (CreateProcess
) allows to setup pipes for the three common fds (stdin
, stdout
, stderr
) using the STARTUPINFO
structure, see CreateProcessA function (processthreadsapi.h) and STARTUPINFOA structure (processthreadsapi.h). Other fds cannot be passed to the child process.
In order to set up the fourth and fifth fds, one must use an undocumented feature of the Microsoft Visual C Runtime (MSVCRT
): If an application is compiled with Microsoft C, than one can pass the pipes using the lpReserved2
parameter of the STARTUPINFO
structure. See "Undocumented CreateProcess" for more details (scroll down the page).
The structure that can be passed in lpReserved2
is defined in the module modExec
.
Public Type STDIO_BUFFER
number_of_fds As Long
crt_flags(0 To 4) As Byte
os_handle(0 To 4) As LongPtr
End Type
The structure is defined to pass five fds in the os_handle
array. The values for the crt_flags
array can be obtained from https://github.com/libuv/libuv/blob/v1.x/src/win/process-stdio.c. The fields of the struct
must lie contiguously in memory (packed). VBA aligns struct
fields to 4 byte boundaries (on 32-bit systems). That's why a second struct
with raw types is defined.
Public Type STDIO_BUFFER2
number_of_fds As Long
raw_bytes(0 To 24) As Byte
End Type
After populating the STDIO_BUFFER struct
, the content is copied using MoveMemory
to the STDIO_BUFFER2
struct. The size of 25 bytes is enought to hold crt_flags
(5 bytes) and the pointers (20 bytes).
History
- 8th July, 2021: Initial version
- 18th August 2021, added support for 64bit Office
- 3rd November 2021, some minor improvements