Phillip Trelford's Array

POKE 36879,255

A single line stack trace

This week I was assigned a bug with a single line stack trace:

Telerik.Windows.Controls.RadDocking.<>c__DisplayClass1b'::'<DropWindow>b__19
 

The exception was of type NullReferenceException. The issue could be reproduced by repeatedly docking and undocking a window in the application for about 30 seconds. The result was an unhandled exception that took down the application.

The single line indicated that the exception originated somewhere in Telerik’s RadControls for Silverlight, probably in a compiler-generated class for a closure.

Ildasm

Ildasm is a tool that lets you look at the .Net IL code generated by the compiler. Looking at the Telerik Docking dll with Ildasm, the generated class and method can be seen:

IL DASM - Telerik.Windows.Controls.RadDocking.c_DisplayClass1b

DropWindow_b_19

.method public hidebysig instance void 'b__19'() cil managed
{
 // Code size 18 (0x12)
 .maxstack 8
 IL_0000: ldarg.0
 IL_0001: ldfld class Telerik.Windows.Controls.RadPane 
  Telerik.Windows.Controls.RadDocking/'<>c__DisplayClass1b'::activePane
 IL_0006: callvirt instance class Telerik.Windows.Controls.RadPaneGroup 
  Telerik.Windows.Controls.RadPane::get_PaneGroup()
 IL_000b: callvirt instance bool [System.Windows]
  System.Windows.Controls.Control::Focus()
 IL_0010: pop
 IL_0011: ret
} // end of method '<>c__DisplayClass1b'::'b__19'

The IL code shows a PaneGroup property being accessed, followed by a call to a Focus method. The c__DisplayClass class name indicates a closure.

Source Code

Telerik’s source code contains a RadDocking class with a DropWindow method that contains a closure that calls SetFocus on PaneGroup. Bingo!

Dispatcher.BeginInvoke(() => activePane.PaneGroup.SetFocus());

Workaround

The workaround is a common one in C#: add a null check against the property (PaneGroup) before calling the method (SetFocus).
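A minimal sketch of that null check, using hypothetical stand-in types (Pane, PaneGroup, FocusWorkaround) rather than Telerik's real classes; the property is named after the get_PaneGroup accessor seen in the IL. The property is read once into a local so that the check and the call see the same value:

```csharp
using System;

// Hypothetical stand-ins for the Telerik types; only the shape matters here.
class PaneGroup
{
    public bool HasFocus { get; private set; }
    public void SetFocus() { HasFocus = true; }
}

class Pane
{
    public PaneGroup PaneGroup { get; set; }
}

static class FocusWorkaround
{
    // Builds the action that would be handed to Dispatcher.BeginInvoke.
    public static Action CreateFocusAction(Pane activePane)
    {
        return () =>
        {
            // Read the property once: it may have been cleared since queuing.
            var group = activePane.PaneGroup;
            if (group != null)
                group.SetFocus();
        };
    }
}
```

In the real application the returned action would be passed to Dispatcher.BeginInvoke; if the pane has lost its group by the time the action runs, it simply does nothing.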

What can we learn?

This fatal exception was found in a third party framework, thankfully during development. Let's examine how this happened and what can be done.

Null checks

Tony Hoare, inventor of QuickSort, speaking at a conference in 2009:

I call it my billion-dollar mistake.

The billion-dollar mistake is the invention of the null reference in 1965.

C# references are null by default, and nullability is implicit.

Are null references really a bad thing? – Top Answer on Stack Overflow:

The problem is that because in theory any object can be a null and toss an exception when you attempt to use it, your object-oriented code is basically a collection of unexploded bombs.

How could this be done differently?

In F# references are not nullable by default, and nullability is explicit via the Option type, i.e. this issue could be removed by design.

Mutation

The PaneGroup property is most likely initialized with a valid reference before the call to BeginInvoke. The BeginInvoke method adds the Action to a queue and calls it some time in the future.

C# objects are mutable by default.

This means that the state of the PaneGroup property may be mutated (set to null) before the closure is called.

F# objects are immutable by default, i.e. this issue could be removed by design.
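The race described in this section can be reproduced without Silverlight. The sketch below is an illustration under stated assumptions: a plain Queue<Action> stands in for the dispatcher queue, and DockPane, Group and MutationDemo are hypothetical names, not Telerik types. It contrasts reading the property when the closure runs with reading it when the closure is queued:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins; a Queue<Action> plays the role of the dispatcher.
class Group
{
    public int FocusCount;
    public void SetFocus() { FocusCount++; }
}

class DockPane
{
    public Group PaneGroup { get; set; }
}

static class MutationDemo
{
    public static readonly Queue<Action> Pending = new Queue<Action>();

    // Mirrors the failing closure: the property is read when the action runs.
    public static void QueueLateRead(DockPane pane)
    {
        Pending.Enqueue(() => pane.PaneGroup.SetFocus());
    }

    // Reads the property at queue time, so a later mutation cannot null it.
    public static void QueueEarlyRead(DockPane pane)
    {
        var group = pane.PaneGroup;
        Pending.Enqueue(() => { if (group != null) group.SetFocus(); });
    }

    // Runs everything that has been queued, like the dispatcher would later.
    public static void Drain()
    {
        while (Pending.Count > 0)
            Pending.Dequeue()();
    }
}
```

Reading early trades one problem for another: the closure can no longer hit a null, but it may focus a group the pane has since left, so whether that is acceptable depends on the application.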

BeginInvoke

It looks like SetFocus is being called asynchronously as UI Duct Tape to work around another issue where focus cannot be set until the control is initialized:

It’s a standing joke on my current Silverlight project that when something isn’t working, just try Dispatcher.BeginInvoke.

This issue would require a framework fix where you could specify the control that receives focus by default.

Asynchronous calls

As the call to the closure was asynchronous it would be added to a queue, and later processed. The act of adding the closure to the queue removes its calling context, which makes debugging hard.
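One way to keep some of that context is to record the stack at the moment the closure is queued. The wrapper below is a sketch only: QueuedCallContext and WithCallSite are hypothetical names, and it relies on Environment.StackTrace as found in desktop .NET, so its availability in a given Silverlight profile is an assumption to verify.

```csharp
using System;

static class QueuedCallContext
{
    // Wraps an action so that a failure reports where the action was queued.
    public static Action WithCallSite(Action action)
    {
        // Captured now, while the queuing code is still on the stack.
        string queuedFrom = Environment.StackTrace;
        return () =>
        {
            try
            {
                action();
            }
            catch (Exception ex)
            {
                // Keep the original exception and attach the queuing call site.
                throw new InvalidOperationException(
                    "Queued action failed. Queued from:" + Environment.NewLine + queuedFrom,
                    ex);
            }
        };
    }
}
```

Capturing a stack trace on every queued call is not free, so a wrapper like this is probably best limited to debug builds or to the calls under investigation.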

Conclusions

Just this single line stack trace demonstrates a cacophony of language and framework design issues. Nullability by default in C# makes code look like a collection of unexploded bombs. Add asynchronous calls to the mix and you have even more chances of triggering one of those bombs. Worse, working around the framework often forces you to make asynchronous calls to work around other issues. Finally, when a bomb does go off you are left with very little information to diagnose it.

Is OOP really a good paradigm for modern asynchronous UI programming?

Reporting Bugs to Microsoft

Software is a product of humans; humans exhibit defects, as does their software. Microsoft software is no different. You can report issues on the Microsoft Connect site or to Microsoft Support directly. My recommendation is that if possible you should do both.

Silverlight

One of the Microsoft frameworks I have been exercising heavily over the last few years is Silverlight. In the process, as with any framework, I have encountered bugs. Several have been minor or easily worked around so have not been reported. Others, however, have warranted reporting; so far only one out of five has been fixed.

COM Leak

The first major issue I discovered was a Silverlight 4 COM memory leak which I reported back in April 2011. I received this canned response a day later:

We are rerouting this issue to the appropriate group within the Visual Studio Product Team for triage and resolution. These specialized experts will follow-up with your issue.

The Silverlight team has not followed up on this issue. In the end I found a workaround for this. The issue still exists in Silverlight 5.

Disappearing windows

The next serious issue encountered was in the Silverlight 5 native window support. When a borderless window is maximized on a second monitor it disappears off screen. The issue received 17 votes and 12 people reported that they could reproduce the issue. Again no response from the Silverlight team and the issue still exists.

Performance degradation

After not receiving any human feedback from the Silverlight team via the Microsoft Connect site for over a year I reported the next issue directly to Microsoft Support, with mixed results. The issue relates to a serious performance degradation when opening multiple windows in Silverlight 5. It was reported over 10 weeks ago, and relatively quickly it was accepted as a bug in Silverlight by the three Microsoft support engineers that have been investigating it (the first a lucky intern). They did make contact every week or two to let me know that they were still looking at it. Unfortunately reporting the issue solely to the Microsoft Support team has delayed the issue being reported to the Silverlight team by over 10 weeks, as they put off doing this while they looked for workarounds. With hindsight I would have reported the issue on Microsoft Connect at the same time, which I have now done.

Visual Studio 2010 SP1 hangs

After applying SP1 to Visual Studio 2010 the IDE started hanging while debugging Silverlight applications. We encountered the first hang when expanding an F# discriminated union type in the debugger (the issue, however, is not isolated to F#). In this case the F# team responded and diagnosed that the issue was a bug in the debugger introduced with SP1. A number of workarounds were offered and the issue has been fixed for Visual Studio 2012.

Context Menus & Child Windows

Not officially part of the Silverlight release, the ContextMenu and ChildWindow controls are distributed with the Silverlight Toolkit. Neither of these controls has been updated to support the new Silverlight 5 Multiple Window feature, so if you open a context menu or child window from a secondary window it will actually appear in the main window. As the Silverlight Toolkit is open source I was able to work around both of these issues relatively easily, and package up the fixes in another open source project.

WPF

As mentioned at the start, a lot of my work recently has been with Silverlight so this is where I have been finding bugs. Silverlight was originally called WPF/E, and is a subset of WPF. WinRT in Windows 8 is itself derived from Silverlight 3. Some bugs are unique to Silverlight but many are inherited from WPF; for example, windows with a black background flash white when first shown. On the pragmatic side, sometimes a workaround for a WPF-originated issue can be applied to Silverlight and vice versa; the same may soon apply to WinRT. If you’re interested in hearing the point of view of a WPF developer, check out six years of WPF.

Conclusions

Some teams at Microsoft do respond quickly to reported issues; for example, I have always had a speedy response from the Visual F# team. Otherwise, reporting a bug on a public forum means that not only do the product teams see the issue, so does the community, who may have already found a workaround. Reporting issues to Microsoft Support means you are guaranteed contact with support engineers at Microsoft, but this may also mean it takes longer to be seen by the product team as they look for a workaround, so I’d advise reporting issues on Microsoft Connect too. Hopefully Microsoft will continue to open source more of its projects in the future (F# is open source, as is the Silverlight Toolkit) so that the community has the chance to fix issues as well.

Update: Microsoft no longer accepts WPF & Silverlight runtime issues, please use the respective forums instead:

Silverlight 5 Native Windows P/Invoke

Silverlight 5 brings native window and P/Invoke support for Silverlight desktop apps (elevated trust out-of-browser applications). The Silverlight Window class contains a subset of the properties and methods provided by the equivalent WPF Window class. That subset of functionality may be enough for many users, and anything else can be implemented with a sprinkling of P/Invoke. Remember that .Net is mostly managed spackle over the Win32 API. With Silverlight 5 you can create your own managed spackle using P/Invoke.

Setting a Silverlight window’s transparency

WPF provides an AllowsTransparency property that you can set. Win32 provides a SetLayeredWindowAttributes function you can call:

public static void SetTransparency(IntPtr hwnd, byte alpha)
{
    // Note: the window must be in the hidden state for this to take effect
    SetWindowLong(hwnd, GWL_EXSTYLE, (int)(WS_EX_LAYERED | WS_EX_TRANSPARENT));
    SetLayeredWindowAttributes(hwnd, 0, alpha, LWA_ALPHA);
}

The SetLayeredWindowAttributes function works on layered windows, so first we must set the window’s extended style to include WS_EX_LAYERED using the Win32 SetWindowLong function.

[DllImport("user32.dll", SetLastError = true)]
static extern int GetWindowLong(IntPtr hwnd, int nIndex);

[DllImport("user32.dll")]
static extern int SetWindowLong(IntPtr hWnd, int nIndex, int dwNewLong);
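The snippets in this post use several Win32 constants and a SetLayeredWindowAttributes import that are not shown. A sketch of those declarations follows, with the constant values as defined in winuser.h; the containing class name NativeWindowConstants is an assumption, and the constants are typed as uint to match the (int) casts used in the post's code:

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical container class; the values are those defined in winuser.h.
static class NativeWindowConstants
{
    public const int GWL_EXSTYLE = -20;               // index of the extended window styles
    public const uint WS_EX_TRANSPARENT = 0x00000020; // mouse input passes through the window
    public const uint WS_EX_LAYERED = 0x00080000;     // required by SetLayeredWindowAttributes
    public const uint WS_EX_NOACTIVATE = 0x08000000;  // window never becomes the foreground window
    public const uint LWA_ALPHA = 0x00000002;         // use bAlpha to set the window opacity

    // BOOL SetLayeredWindowAttributes(HWND, COLORREF, BYTE, DWORD)
    [DllImport("user32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool SetLayeredWindowAttributes(
        IntPtr hwnd, uint crKey, byte bAlpha, uint dwFlags);
}
```

An alpha of 0 makes the window fully transparent and 255 fully opaque.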

Finally before we call our SetTransparency function we need to find the window’s handle:

public static IntPtr FindHwnd(Window window)
{
    var oldTitle = window.Title;
    var id = oldTitle + "(" + Guid.NewGuid().ToString() + ")";
    window.Title = id;
    var hwnd = FindWindowByCaption(IntPtr.Zero, id);
    window.Title = oldTitle;
    return hwnd;
}

The trick here is to temporarily give the window a unique title (using a GUID) and then find the window via its caption (title) using the Win32 FindWindow function.

[DllImport("user32.dll", EntryPoint = "FindWindow", SetLastError = true)]
static extern IntPtr FindWindowByCaption(IntPtr ZeroOnly, string lpCaption);

Note: you cannot click on the new transparent window, which limits its usefulness to specific operations, for example a drag and drop cue window.

Removing a Silverlight window from the task bar

WPF provides a ShowInTaskbar property for this. Win32 provides an extended window style, WS_EX_NOACTIVATE, which has the same effect:

public static void RemoveFromTaskBar(IntPtr hwnd)
{
    // Note: the window must be hidden for this to take effect
    SetWindowLong(hwnd, GWL_EXSTYLE, (int)WS_EX_NOACTIVATE);
}

Note: the window must have been shown then hidden before the method can work.

window.Show();
window.Hide();
Hwnd.RemoveFromTaskBar(hwnd);

Detecting when a window is Activated/Deactivated

WPF provides Activated and Deactivated events. Win32 provides a WM_ACTIVATE message that you can listen for:

[AllowReversePInvokeCalls]
private IntPtr WindowHook(int code, IntPtr wParam, IntPtr lParam)
{
    if (code == HC_ACTION)
    {
        var messageInfo = new CWPSTRUCT();
        Marshal.PtrToStructure(lParam, messageInfo);

        if (messageInfo.message == WM_ACTIVATE)
        {
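            // For WM_ACTIVATE, lParam is the handle of the other window:
            // the one being activated when wParam is WA_INACTIVE,
            // otherwise the one being deactivated. It may be null.
            var hwnd = messageInfo.lparam;
            // The activation state is in the low-order word of wParam.
            if (((int)messageInfo.wparam & 0xFFFF) == WA_INACTIVE)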
            {
                var e = WindowActivated;
                if (e != null)
                    e(this, new WindowActivatedEventArgs(hwnd));
            }
            else
            {
                var e = WindowDeactivated;
                if (e != null)
                    e(this, new WindowDeactivatedEventArgs(hwnd));
            }
        }

    }
    return CallNextHookEx(_hHook, code, wParam, lParam);
}

The method above is a hook for window messages. It raises an event when a message signals that a window has become active or inactive. Calling SetWindowsHookEx installs it:

// Keep the returned handle: the hook passes it on to CallNextHookEx
_hHook = SetWindowsHookEx(WH_CALLWNDPROC, _callback, IntPtr.Zero, GetCurrentThreadId());
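The hook relies on a CWPSTRUCT type, a callback delegate and a few imports that are not shown. The declarations below are a sketch of what they might look like: the field order follows the Win32 CWPSTRUCT and the lowercase field names follow the hook code, while the HookProc and HookNative names are assumptions. CWPSTRUCT is declared as a class because the Marshal.PtrToStructure(IntPtr, object) overload used in the hook fills an existing object instance.

```csharp
using System;
using System.Runtime.InteropServices;

// Mirrors the Win32 CWPSTRUCT: lParam, wParam, message, hwnd, in that order.
[StructLayout(LayoutKind.Sequential)]
class CWPSTRUCT
{
    public IntPtr lparam;
    public IntPtr wparam;
    public int message;
    public IntPtr hwnd;
}

// Hypothetical delegate name; the signature matches the WindowHook method.
delegate IntPtr HookProc(int code, IntPtr wParam, IntPtr lParam);

// Hypothetical container class; constant values are those from winuser.h.
static class HookNative
{
    public const int WH_CALLWNDPROC = 4;
    public const int HC_ACTION = 0;
    public const int WM_ACTIVATE = 0x0006;
    public const int WA_INACTIVE = 0;

    [DllImport("user32.dll", SetLastError = true)]
    public static extern IntPtr SetWindowsHookEx(
        int idHook, HookProc lpfn, IntPtr hMod, uint dwThreadId);

    [DllImport("user32.dll")]
    public static extern IntPtr CallNextHookEx(
        IntPtr hhk, int nCode, IntPtr wParam, IntPtr lParam);

    [DllImport("user32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    public static extern bool UnhookWindowsHookEx(IntPtr hhk);

    [DllImport("kernel32.dll")]
    public static extern uint GetCurrentThreadId();
}
```

The delegate passed to SetWindowsHookEx should be held in a field (such as _callback) for as long as the hook is installed, otherwise the garbage collector may reclaim it while native code still holds the pointer. UnhookWindowsHookEx removes the hook when it is no longer needed.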

Summary

With a sprinkling of managed spackle we’ve been able to extend the Silverlight 5 multi-window feature. Silverlight 5’s P/Invoke feature provides a lot of customization potential.

The source code to this post is available on BitBucket: