Wednesday, September 16, 2009

Listening to HTML Events from BHO

Creating a plug-in for Internet Explorer was an interesting proposition. After years of .NET development, the first thing I investigated was how to write my add-in in C#. I was surprised to find no official support from Microsoft for .NET development of IE add-ins. Thanks to COM Interop, however, it's still possible. What's more, information and sample code abound for making it happen.

A number of Code Project articles have been written about how to set up your add-in project using COM Interop. But I wanted to get straight to writing C# code and seeing its effects within IE. In other words, I wanted an existing library that did the COM Interop work for me, and which I could simply reference from my project. Two such solutions exist: one commercial and one open source. The commercial one is Add-in Express for IE. I was always suspicious of their tool since a free trial isn't available, so I never shelled out the $149 (USD). The free open-source alternative, known as SpicIE, is good enough.

My plug-in needs to be kept in sync with whatever text is currently selected within the browser window. Doing this requires listening to HTML events raised on the browser's current page. The SpicIE Contrib companion website includes a bare-bones sample showing how to do this. That sample, however, hooks up the event handlers only after the entire page has loaded. I was looking for something more interactive: ideally, the handlers should be hooked up as soon as the user is able to interact with page elements. This turned out to be a non-trivial task.

In response to the above problem, I developed a reusable class called HtmlPageEventManager whose purpose is to subscribe to a given list of HTML events for each new webpage that's opened. The goal is to attach handlers as soon as the user can begin interacting with page elements, even before the page load is complete. Using this class is simple -- just call the constructor:

    var evts = new List<HtmlEvent>() {
        HtmlEvent.onclick, HtmlEvent.ondblclick, HtmlEvent.onkeydown,
        HtmlEvent.onselectstart, HtmlEvent.onselectionchange,
        HtmlEvent.onfocusin
    };
    new HtmlPageEventManager( this, evts, this.HtmlEventHandler );

You can download my solution here to try it out. And feel free to use it in your own projects if you find it useful.

I encountered some issues while creating HtmlPageEventManager.
  • As mentioned above, I want to add the HTML event handlers as soon as possible, i.e. even before the page load is complete. Doing so in the NavigateComplete handler seemed to work great, until I tested opening a hyperlink in a new tab or window. In the new tab or window, the HTML event handlers often didn't function correctly. I could see from debugging that IHTMLDocument3.attachEvent was indeed called on the correct document or element instance and returned 'true', indicating that the handlers were attached, but the handlers were never invoked afterwards. It seemed that the document/element wasn't "ready" to have events attached, although that was at odds with the 'true' return value.
    Via the debugger, I discovered that from within OnNavigateComplete, I can detect the problem situation by looking at the IWebBrowser2.Busy property, and adjust my strategy concerning when to add the handlers. So now the registrations occur in OnNavigateComplete only when browser.Busy is true. And if they haven't been registered by the time OnDocumentComplete is reached, then they get registered there instead.

  • When I refreshed a webpage, the event registrations were always lost. That is, they weren't reconnected to the new page elements. I found out that the NavigateComplete and DocumentComplete handlers aren't called on a "Refresh". When I tested it earlier, it seemed like they were called, but what I observed must have been those two handlers being called for frames within the main document, not for the main document itself. I overcame this issue by subscribing to the BHO's DownloadComplete event, where I again call my "RegisterEventHandlers" helper method that attaches the HTML event handlers. I incorporated the idea from this code of a "normalPageLoad" member variable to conditionally call RegisterEventHandlers.
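
The decisions described in the two items above can be modelled roughly as follows. This is only a sketch, not the actual HtmlPageEventManager source: the class name RegistrationCoordinator, the "register" delegate (standing in for RegisterEventHandlers), and the exact way the normalPageLoad flag is set and cleared are all assumptions made for illustration. It also ignores frames and the fact that IE may raise DownloadComplete more than once per page.

```csharp
using System;

// Hypothetical helper that decides WHEN to attach the HTML event handlers.
// The browser events are passed in as plain method calls so the logic can
// be read (and tested) without referencing SHDocVw or SpicIE.
public sealed class RegistrationCoordinator
{
    private readonly Action register;  // stands in for RegisterEventHandlers
    private bool registered;           // handlers already attached for this page?
    private bool normalPageLoad;       // true between NavigateComplete and DocumentComplete

    public RegistrationCoordinator(Action register)
    {
        if (register == null) throw new ArgumentNullException("register");
        this.register = register;
    }

    // Call from the NavigateComplete handler, passing IWebBrowser2.Busy.
    public void OnNavigateComplete(bool browserBusy)
    {
        normalPageLoad = true;
        registered = false;

        // Register early only when the browser reports Busy; otherwise
        // wait for DocumentComplete (the new tab/window case).
        if (browserBusy) Register();
    }

    // Call from the DocumentComplete handler: the fallback registration point.
    public void OnDocumentComplete()
    {
        if (!registered) Register();
        normalPageLoad = false;
    }

    // Call from the DownloadComplete handler. Outside a normal page load,
    // this is assumed to be a refresh, so the handlers are attached again.
    public void OnDownloadComplete()
    {
        if (normalPageLoad) return;
        Register();
    }

    private void Register()
    {
        register();
        registered = true;
    }
}
```

With a counter as the delegate, calling OnNavigateComplete(true) followed by OnDocumentComplete() registers once; OnNavigateComplete(false) followed by OnDocumentComplete() also registers once, but only at the second call; and a lone OnDownloadComplete() afterwards registers again, which is the refresh case.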