VB6 AddIn hangs OL2007 (calling DoEvents) -- Repro
Rather than getting lost in a lot of technical definition and supposition,
I'll begin with a description of my repro, what to do with it, and the
visible difference between OL2007 and previous versions. (I think it will be
easier to demonstrate first, than to try to explain.)
This repro is an AddIn written in VB6, with a modeless form that uses a
timer object to perform [contrived] background processing on an interval,
set by the user, in milliseconds. Each time the timer event fires, it
executes a loop, the number of iterations for which is also set by the user.
(Note 1)
The repro AddIn is flagged to load on next startup only. OL must not be
running when you register the DLL. The repro's form will be displayed when
OL is started one time only. The form can also be shown by disconnecting
and then connecting the AddIn, using the built-in COM AddIns dialog. (Note 2)
When the repro's form is first displayed, its background process launch
timer is disabled. Clicking the start button enables this timer; as it
runs, the status of background processing is displayed in the form, cycling
between "running" and "idle". While status is "running" the processing loop
is performing its [useless and contrived] "work"; when the loop ends,
status changes to "idle", and it waits for the next timer event to fire.
(Note 3) Click the cancel button to disable the timer, to make background
processing stop. (Additional detail about status transitions and event
firings is sent to the debugging port.)
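
For readers who want to follow along with a debug-output viewer, the logging can be sketched roughly as below. This is a guess at the shape, not the attached source: LogStatus, LogDuration and the message text are made-up names, and using DateDiff for timing is an assumption (it matches the whole-second accuracy mentioned later). Only the OutputDebugString declaration is the standard Win32 one.

```vb
' Hypothetical logging helpers for the repro's form module.
' Procedure names and message formats are assumptions.
Private Declare Sub OutputDebugString Lib "kernel32" _
    Alias "OutputDebugStringA" (ByVal lpOutputString As String)

Private Sub LogStatus(ByVal sStatus As String)
    ' Write a status transition ("running" / "idle") to the debugging port.
    OutputDebugString "repro: status=" & sStatus & vbCrLf
End Sub

Private Sub LogDuration(ByVal dtStart As Date)
    ' DateDiff("s", ...) has whole-second resolution only.
    OutputDebugString "repro: loop took " & _
        CStr(DateDiff("s", dtStart, Now)) & " s" & vbCrLf
End Sub
```

A caller would capture Now before entering the loop and pass it to LogDuration when the loop ends.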
If the "Call DoEvents in loop" checkbox is checked, the repro [intuitively]
calls DoEvents once each iteration of the loop; if the checkbox is clear it
[also intuitively] skips that call. This is key to proving out the cause
of the problem in OL2007: when the checkbox is clear, OL2007 will not hang;
when it is checked, and you are typing in an email while the background
processing is running, OL2007 will hang.
Steps to reproduce OL2007 hanging:
1. With OL not running, register the repro DLL using regsvr32.exe.
2. Start OL -- the repro's form should display for one Outlook session only.
(Use the COM AddIns management dialog to disconnect and then connect the
AddIn, to show the form again in the same OL session.)
2a. After closing the COM AddIns dialog [and all of its modal parents] the
repro's [modeless] form will be displayed -- because it is modeless, it may
get buried behind OL, but it has a taskbar icon that will be in Outlook's
group, if grouping is active.
3. Create a new email, click the Start button in the repro's form, and then
start typing in the new email.
3a. Holding down a key and letting it repeat will not cause OL2007 to hang,
because OL2007 suspends the AddIn's execution until the key is released.
3b. In fact, typing very rapidly (e.g., bouncing the same key as fast as you
can) will not cause OL to hang straight away -- UI activity in OL2007
suspends AddIn processing. (This can be seen using a debugging port reader:
the duration of each processing loop is output there, and it takes longer
to complete if you hold a key for several seconds while status is
"running".)
3c. Position the repro's form so it is visible while you are typing; if you
can see its status transitioning between "idle" and "running" as you are
typing, it should not take very long for OL 2007 to hang.
The "Call DoEvents in loop" checkbox is checked by default when the form
loads. If the test above is performed with the checkbox cleared, the email
composition window (in fact, all of OL) will seem frozen while background
processing is running; when it completes, all user input that occurred in
the [apparently] frozen UI plays out almost instantaneously. This behavior
tends to be unsettling to users -- thus the reason for calling DoEvents in
the first place. However, hanging OL in mid-composition is more than
unsettling, so avoiding calls to DoEvents under OL2007 seems the lesser of
two evils.
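
One way an existing AddIn might apply that "lesser evil" is to decide once, at connect time, whether yielding is allowed, and route every yield through a wrapper. The sketch below is only an illustration: InitYieldPolicy, YieldIfSafe and m_bMayYield are invented names, and treating every major version from 12 up as unsafe is an assumption (only 2003 and 2007 were tested with this repro).

```vb
' Hypothetical workaround for a standard module (.bas).
' All names here are assumptions, not part of the attached repro.
Private m_bMayYield As Boolean

Public Sub InitYieldPolicy(ByVal sHostVersion As String)
    ' Pass Outlook's Application.Version string, e.g. from OnConnection.
    ' Val() stops parsing at the second period, so a string that
    ' starts with "12.0." yields 12 (Outlook 2007).
    m_bMayYield = (Val(sHostVersion) < 12)
End Sub

Public Sub YieldIfSafe()
    ' Call this inside long loops instead of calling DoEvents directly.
    If m_bMayYield Then DoEvents
End Sub
```

Under OL2003 this behaves as before; under OL2007 the UI will appear frozen during long loops, which is the trade-off described above.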
The repro DLL was built on an OL2003 system, so it will work in either 2003
or 2007 -- though it will not hang OL2003. Under OL2003, native OL code and
AddIns seem to run without preference for one or the other, i.e., UI
activity does NOT suspend AddIn processing. (Logically it must slow the
AddIn down infinitesimally, but with whole-second accuracy of duration
timing, it's imperceptible using this repro.) In OL2003, an AddIn can yield
the processor to permit OL's UI to be updated, and then continue safely on
its merry way. (This is EXACTLY as it should be.)
Conversely, under OL2007, preference is apparently given to OL UI
processing. As is stated multiple times above, OL2007 suspends AddIn
processing while the user interacts with its UI. Clearly the mechanism that
does this is fatally flawed, and causes OL2007 to hang.
Note that it is NOT the AddIn that causes OL2007 to hang! It merely gives
OL2007 enough [proverbial] rope to hang itself. The tragedy to me is that
this new OL2007 behavior is unnecessary when using well-written AddIns
(which will often be callers of DoEvents). It's crappy AddIns that block
OL's UI for inordinate lengths of time that no doubt prompted this change --
how appropriate that well-behaved AddIns get the shaft... NOT! It's a
travesty as well, because even this ill-considered mechanism does not
overcome a tight loop: OL2007 is just as unresponsive as earlier versions.
So basically the only thing it accomplishes is breaking countless VB6
AddIns.
The culprit seems to be an Office-specific function within the VB6
runtime -- which obviously predates OL2007, but still... Perhaps they
should've provided an Office object model method to do the equivalent of
DoEvents, rather than build Office-specific behavior into the VB6 runtime --
this would've allowed them to update that method along with Office.
As is, as of Office 2007, the VB6 runtime is left with a very dangerous and
detrimental call. Perhaps it's intended to force developers to quit using
VB6, but in reality it will force us to release updated VB6-based versions
of existing AddIns, because this problem is too severe to force users to
deal with it long enough to port code to a supported language/compiler.
All sources are attached. Any comments are welcome.
From what I have found, mine is the first repro of this widely reported
problem, as well as the first to identify the problem. If you republish or
otherwise distribute any part of this, please give appropriate credit.
-Mark McGinty
(Note 1) The code inside the loop concatenates a short string to itself
until the length of the string exceeds an arbitrary length, then it
truncates the string back to a short one and does it again. (In other
words, it exercises string object memory allocation. It is unremarkable,
but it is such that it cannot be optimized away as an invariant loop. Also
it does not touch OOM in any way, eliminating that as a source of the
problem.)
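
A rough reconstruction of that loop, for anyone who does not want to open the attachment. The procedure name, the "abc" seed string and the MAX_LEN threshold are assumptions; only the general shape (self-concatenate, truncate, optionally call DoEvents once per iteration) comes from the description above.

```vb
' Hypothetical reconstruction of the contrived work loop.
' MAX_LEN, the seed string and the names are assumptions.
Private Const MAX_LEN As Long = 65536

Private Sub DoContrivedWork(ByVal lIterations As Long, _
                            ByVal bCallDoEvents As Boolean)
    Dim i As Long
    Dim s As String

    s = "abc"
    For i = 1 To lIterations
        s = s & s                        ' concatenate the string to itself
        If Len(s) > MAX_LEN Then
            s = Left$(s, 3)              ' truncate back to a short string
        End If
        If bCallDoEvents Then DoEvents   ' the call that OL2007 chokes on
    Next i
End Sub
```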
(Note 2) When you connect the AddIn it loads a form, and then tries to make
that [modeless] form visible -- which initially fails because the built-in
COM AddIn dialog is modal. It eats the error, and retries every 3/4 second
(using another timer) until it succeeds. (I agree it's quick and dirty, not
suggested for use in useful AddIns.)
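
The retry amounts to something like the following sketch. It assumes a Timer control named tmrShow (a made-up name) sitting on the repro's form with Interval set to 750; a VB6 Timer on a loaded form fires even while the form is hidden.

```vb
' Hypothetical retry handler in the repro's form module.
' tmrShow is an assumed control name, with Interval = 750 (ms).
Private Sub tmrShow_Timer()
    On Error Resume Next        ' eat the error raised while a modal dialog is up
    Me.Show vbModeless
    If Err.Number = 0 Then
        tmrShow.Enabled = False ' form is visible now; stop retrying
    End If
    On Error GoTo 0
End Sub
```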
(Note 3) If the timer fires while the loop is still running, the timer's
event handler is coded to exit. However, in testing, the timer never fired
while the loop was running (even though the duration of the loop exceeded
the timer's interval.)
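
The guard described in Note 3 can be sketched as below. The control names (tmrWork, lblStatus, chkDoEvents) and the module-level variables are assumptions, and the loop body is left as a placeholder comment.

```vb
' Hypothetical timer handler with a re-entrancy guard.
' Control and variable names are assumptions.
Private m_bRunning As Boolean
Private m_lIterations As Long   ' set from the form before the timer starts

Private Sub tmrWork_Timer()
    Dim i As Long

    If m_bRunning Then Exit Sub ' previous loop still running: ignore this tick
    m_bRunning = True
    lblStatus.Caption = "running"

    For i = 1 To m_lIterations
        ' contrived work goes here
        If chkDoEvents.Value = vbChecked Then DoEvents
    Next i

    lblStatus.Caption = "idle"
    m_bRunning = False
End Sub
```

The flag costs nothing and covers the case where a DoEvents call inside the loop lets a pending timer event through, even though that never happened in testing.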