<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	>

<channel>
	<title>Inside Echobit</title>
	<atom:link href="http://inside.echobit.net/dreijer/feed/" rel="self" type="application/rss+xml" />
	<link>http://inside.echobit.net/dreijer</link>
	<description>Dreijer's Notebook</description>
	<pubDate>Sun, 16 Nov 2008 16:37:38 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.7</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>When security gets in the way of things</title>
		<link>http://inside.echobit.net/dreijer/archives/2008/11/16/when-security-gets-in-the-way-of-things/</link>
		<comments>http://inside.echobit.net/dreijer/archives/2008/11/16/when-security-gets-in-the-way-of-things/#comments</comments>
		<pubDate>Sun, 16 Nov 2008 16:36:53 +0000</pubDate>
		<dc:creator>dreijer</dc:creator>
		
		<category><![CDATA[Rants]]></category>

		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://inside.echobit.net/?p=39</guid>
		<description><![CDATA[A while back I went clothes shopping with my brother. While we were waiting in line, it occurred to me how the stores go to great lengths to prevent shoplifting. They obviously cannot have security cameras in the fitting rooms so they need another mechanism, and one popular way is to allow the customers to [...]]]></description>
			<content:encoded><![CDATA[<p>A while back I went clothes shopping with <a href="http://volaband.com/">my brother</a>. While we were waiting in line, it occurred to me how the stores go to great lengths to prevent shoplifting. They obviously cannot have security cameras in the fitting rooms so they need another mechanism, and one popular way is to allow the customers to bring only a certain number of items into the fitting rooms. A lot of places enforce this by counting the number of items you&#8217;re bringing into the room (with an upper limit) and handing you a small badge showing exactly how many you&#8217;re bringing with you. When you come back out, the number on the badge is compared to the number of items you&#8217;re carrying.</p>
<p>This is a pretty simple and straightforward scheme that works quite well. With this post, however, I wanted to highlight that the approach has become so commonplace that the stores (and their employees) seem to have forgotten why it was created in the first place.<span id="more-39"></span></p>
<p>My brother and I happened to end up in a store that used this exact approach. After having picked out the clothes we wanted to try on, we went to the fitting room area where we were met by a huge line of people waiting for their turn. When we finally got to the front of the line, it turned out that there were actually plenty of fitting rooms available but there was only one sales assistant around who could hand out badges. Everybody therefore had to wait for him to go through the line, one by one, and show each customer to an available fitting room.</p>
<p>Unfortunately, he was so busy counting the number of items for the people entering the fitting rooms that he barely had time to look at the people coming out and instead they just dumped the badges on the nearest table and left. The sales assistant didn&#8217;t seem to care at all.</p>
<p>When my brother and I came out of the fitting room, the sales assistant was nowhere to be found. Other people came out too and looked similarly confused. Just like the people before us, we simply dumped our badges and the clothes we didn&#8217;t want to buy and left.</p>
<p>What annoyed me the most was that we&#8217;d spent more than 10 minutes in line for no reason at all. There was absolutely <strong>no</strong> point whatsoever in having the sales assistant hand out badges since he never checked them when people came back out.</p>
<p>In my opinion, this is what happens when security gets in the way of things. It&#8217;s understandable that the store wants to avoid shoplifting, but if they don&#8217;t follow through on their security measures then it just becomes a customer annoyance.</p>
<p>I hate wasting my time&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://inside.echobit.net/dreijer/archives/2008/11/16/when-security-gets-in-the-way-of-things/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Security is only as strong as the weakest link</title>
		<link>http://inside.echobit.net/dreijer/archives/2008/11/02/security-is-only-as-strong-as-the-weakest-link/</link>
		<comments>http://inside.echobit.net/dreijer/archives/2008/11/02/security-is-only-as-strong-as-the-weakest-link/#comments</comments>
		<pubDate>Sun, 02 Nov 2008 21:40:28 +0000</pubDate>
		<dc:creator>dreijer</dc:creator>
		
		<category><![CDATA[Rants]]></category>

		<category><![CDATA[Security]]></category>

		<guid isPermaLink="false">http://inside.echobit.net/dreijer/?p=201</guid>
		<description><![CDATA[I recently had to register myself at the Danish Consulate in New York since I&#8217;ve relocated to the US. The registration page asked for various information such as name, address, phone number, e-mail address, and addresses of relatives. It also asked for my passport information, although that was optional.
Most people probably wouldn&#8217;t have noticed, but [...]]]></description>
			<content:encoded><![CDATA[<p>I recently had to register myself at the Danish Consulate in New York since I&#8217;ve relocated to the US. The registration page asked for various information such as name, address, phone number, e-mail address, and addresses of relatives. It also asked for my passport information, although that was optional.</p>
<p>Most people probably wouldn&#8217;t have noticed, but as a security-conscious IT professional I immediately saw that the registration page wasn&#8217;t encrypted with SSL. This, in my opinion, is particularly bad practice for a government-controlled website that expects its users to enter confidential information &#8212; and we&#8217;re not &#8220;just&#8221; talking credit card information here.<span id="more-201"></span></p>
<p>Since I had to complete the form, I reluctantly filled out the remaining fields and hit Submit. I was redirected to a confirmation page, which told me that a confirmation e-mail had been sent to me to verify the e-mail address I had entered.</p>
<p>Fair enough. That&#8217;s standard practice these days.</p>
<p>A couple of minutes later the confirmation e-mail arrived. I was horrified to learn, however, that all the information I had entered on the registration page had been reprinted in the e-mail &#8212; even my passport information.</p>
<p>That did it. I immediately fired off an e-mail to the Consulate trying to voice my concerns about the security of the site. I was fortunate enough to have a contact at the Consulate from a previous correspondence, and when I told her about my experiences she was kind enough to forward my e-mail to the person responsible.</p>
<p>I received a response within an hour (what a pleasant surprise) and it turns out the site was supposed to be SSL encrypted, but for some reason the main page was linking to the wrong version of the page. This just illustrates how easily things can go wrong, even if it was done with the best intentions.</p>
<p>The confirmation e-mail was deliberate, though, and the government official assured me that they&#8217;d address the (obvious) security issue in an upcoming large-scale redesign of the site in January.</p>
<p>I&#8217;m very pleased that the Consulate responded so quickly to my concerns. I think it happens way too often that sites remain broken and unsafe for long periods of time even though the security holes are known to the maintainers.</p>
<p>Kudos to the Consulate!</p>
]]></content:encoded>
			<wfw:commentRss>http://inside.echobit.net/dreijer/archives/2008/11/02/security-is-only-as-strong-as-the-weakest-link/feed/</wfw:commentRss>
		</item>
		<item>
		<title>How Task Manager displays 16-bit processes</title>
		<link>http://inside.echobit.net/dreijer/archives/2008/10/17/how-task-manager-displays-16-bit-processes/</link>
		<comments>http://inside.echobit.net/dreijer/archives/2008/10/17/how-task-manager-displays-16-bit-processes/#comments</comments>
		<pubDate>Fri, 17 Oct 2008 23:32:56 +0000</pubDate>
		<dc:creator>dreijer</dc:creator>
		
		<category><![CDATA[Internals]]></category>

		<guid isPermaLink="false">http://inside.echobit.net/dreijer/?p=97</guid>
		<description><![CDATA[When Microsoft made the shift from 16-bit to 32-bit they had to still include support for the many 16-bit applications. These applications run in real mode whereas 32-bit applications operate in protected mode. As a result, Windows had to run these legacy applications through an emulation layer (a Virtual DOS Machine [VDM]) called NTVDM. NTVDM [...]]]></description>
			<content:encoded><![CDATA[<p>When Microsoft <a href="http://en.wikipedia.org/wiki/Windows#Hybrid_16.2F32-bit_operating_environments">made the shift</a> from 16-bit to 32-bit they had to still include support for the many 16-bit applications. These applications run in real mode whereas 32-bit applications operate in protected mode. As a result, Windows had to run these legacy applications through an emulation layer (a Virtual DOS Machine [VDM]) called <a href="http://en.wikipedia.org/wiki/NTVDM">NTVDM</a>. NTVDM has shipped with all 32-bit releases of Windows, but is no longer included in 64-bit Windows versions.</p>
<p>When a 16-bit application is launched on 32-bit Windows, NTVDM is used as a proxy application in order to launch the original application. NTVDM provides a complete <a href="http://en.wikipedia.org/wiki/Virtual_8086_mode">virtual 8086 mode</a> environment for the 16-bit application to run in. (In fact, all the proxied applications share a dedicated thread in NTVDM.) Since these applications are hosted internally by NTVDM, they only show up in Task Manager if the user has enabled the &#8220;Options-&gt;Show 16-bit tasks&#8221; menu option.<span id="more-97"></span></p>
<p>As can be seen in the screenshot below, two 16-bit applications (wowexec.exe and rdo001gl.exe) are hosted by ntvdm.exe on my computer. Wowexec.exe works together with ntvdm.exe to provide a 16-bit environment.</p>
<p style="text-align: center;"><a href="http://inside.echobit.net/dreijer/wp-content/uploads/2008/10/taskmgr_processes.png"><img class="size-medium wp-image-185 aligncenter" title="taskmgr_processes" src="http://inside.echobit.net/dreijer/wp-content/uploads/2008/10/taskmgr_processes-300x261.png" alt="" width="300" height="261" /></a></p>
<p>If you use <a href="http://technet.microsoft.com/en-us/sysinternals/bb896653.aspx">Process Explorer</a> from Sysinternals, these 16-bit processes won’t show up in the process list because they&#8217;re not considered &#8220;real&#8221; processes on a 32-bit operating system. Personally, though, I find it quite useful that I can view all the processes running on my system whether they’re 16-bit or 32-bit. It’s sort of weird if an application’s window is present in the taskbar but a corresponding process cannot be found in the process list.</p>
<p>So, how does Task Manager go about showing these 16-bit processes? It uses something called the Virtual DOS Machine Debug library (<a href="http://support.microsoft.com/kb/182559">VDMDBG</a>), part of the Windows SDK, which lets you access 16-bit process information on a 32-bit operating system. For instance, VDMDBG lets you enumerate all VDMs currently running 16-bit processes (or tasks, as they’re referred to internally), or all the tasks running in a particular VDM.</p>
<p>Two functions are central in updating the process list view in taskmgr.exe: <strong>CProcPage::UpdateProcInfoArray</strong> and <strong>CProcPage::UpdateProcListview</strong>. The first function obtains a listing of all the processes currently running on the system by calling <a href="http://msdn.microsoft.com/en-us/library/ms725506(VS.85).aspx"><strong>ntdll!ZwQuerySystemInformation</strong></a> and steps through each one and adds it to an internal array. The function also extracts various information about the process (image name, CPU time, etc.) and calls <strong>CProcInfo::SetData</strong> to set it internally. <strong>CProcPage::UpdateProcListview</strong>, on the other hand, is responsible for updating the GUI by tapping into the aforementioned internal process info array.</p>
<p>The <strong>CProcInfo::SetData</strong> function is particularly interesting because it checks to see if the current process is ntvdm.exe:</p>
<pre>push    <strong>offset aNtvdm_exe ; "ntvdm.exe"</strong>
push    <strong>eax             ; wchar_t *</strong>
call    <strong>ds:__imp___wcsicmp</strong>
test    eax, eax
pop     ecx
pop     ecx
jnz     loc_100CA37</pre>
<p>If it is, <strong>CProcInfo::SetData</strong> calls <strong><a href="http://msdn.microsoft.com/en-us/library/bb963831(VS.85).aspx">VDMDBG!VDMEnumTaskWOWEx</a></strong> to obtain information about the 16-bit processes currently being hosted by ntvdm.exe. The second parameter to the function is a pointer to a <a href="http://msdn.microsoft.com/en-us/library/bb963828(VS.85).aspx">callback function</a>, which is set to <strong>CProcPage::WowTaskCallback</strong>.</p>
<p>In the screenshot above of Task Manager, ntvdm.exe hosted two 16-bit applications, wowexec.exe and rdo001gl.exe. On my computer, we therefore expect <strong>CProcPage::WowTaskCallback</strong> to be called twice, once for each task. To verify, we can set a breakpoint in the function and take a look at the fourth and fifth parameters passed to it:</p>
<pre>0:000&gt; da poi(ebp+14)
001ae6f8  <strong>"RDO001GL"</strong>
0:000&gt; da poi(ebp+18)
001ae701  <strong>"C:\PROGRA~1\BC31\BOOK\RDO001GL.E"</strong>
001ae721  <strong>"XE"</strong></pre>
<p><strong>CProcPage::WowTaskCallback</strong> calls <strong>CProcPage::SetDataWowTask</strong> to obtain information about the process, and to add it to the internal process info array alongside the 32-bit processes. However, to distinguish the two types of processes (16-bit and 32-bit), Task Manager displays the 16-bit processes as sub-processes of the ntvdm.exe process by indenting them in the process list.</p>
<p>That&#8217;s all there is to it.</p>
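<p>The flow described in this post can be modeled in a few lines of portable C++. This is only a sketch under stated assumptions: <code>ProcRow</code>, <code>EqualsNoCase</code>, <code>TaskEnumerator</code> and <code>BuildRows</code> are hypothetical names, and the enumerator parameter merely stands in for <strong>VDMDBG!VDMEnumTaskWOWEx</strong>. None of it is Task Manager&#8217;s actual code.</p>

```cpp
#include <cstddef>
#include <cwctype>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one row in the process list.
struct ProcRow {
    std::wstring name;
    bool is16Bit;  // true for tasks hosted inside ntvdm.exe (shown indented)
};

// Case-insensitive comparison, playing the role of _wcsicmp in the disassembly.
bool EqualsNoCase(const std::wstring& a, const std::wstring& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t k = 0; k < a.size(); ++k) {
        if (std::towlower(a[k]) != std::towlower(b[k])) return false;
    }
    return true;
}

// Hypothetical enumerator: invokes the callback once per hosted 16-bit task.
// In the real program this role is played by VDMEnumTaskWOWEx.
using TaskCallback = std::function<void(const std::wstring&)>;
using TaskEnumerator = std::function<void(const TaskCallback&)>;

std::vector<ProcRow> BuildRows(const std::vector<std::wstring>& images,
                               const TaskEnumerator& enumerateTasks) {
    std::vector<ProcRow> rows;
    for (const std::wstring& image : images) {
        rows.push_back({image, false});
        // Only ntvdm.exe can host 16-bit tasks, so only then do we enumerate.
        if (EqualsNoCase(image, L"ntvdm.exe")) {
            enumerateTasks([&rows](const std::wstring& task) {
                rows.push_back({task, true});
            });
        }
    }
    return rows;
}
```

<p>Passing the enumerator in as a parameter keeps the sketch compilable without the Windows SDK. A real implementation would call the VDMDBG API at that point instead.</p>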
]]></content:encoded>
			<wfw:commentRss>http://inside.echobit.net/dreijer/archives/2008/10/17/how-task-manager-displays-16-bit-processes/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Name clashing</title>
		<link>http://inside.echobit.net/dreijer/archives/2008/10/10/name-clashing/</link>
		<comments>http://inside.echobit.net/dreijer/archives/2008/10/10/name-clashing/#comments</comments>
		<pubDate>Fri, 10 Oct 2008 13:43:43 +0000</pubDate>
		<dc:creator>dreijer</dc:creator>
		
		<category><![CDATA[Various]]></category>

		<guid isPermaLink="false">http://inside.echobit.net/dreijer/?p=147</guid>
		<description><![CDATA[I was in Redmond last week with Ken and Steve to attend Microsoft&#8217;s Driver Developer Conference (DDC). When registering on the first day, we all received these small laptop bags. At some point I was just toying around with mine and my eye caught onto the handle of the zipper:
It had my name on it and [...]]]></description>
			<content:encoded><![CDATA[<p>I was in Redmond last week with <a href="http://www.nynaeve.net">Ken</a> and <a href="http://kernelmustard.com/">Steve</a> to attend Microsoft&#8217;s Driver Developer Conference (DDC). When registering on the first day, we all received these small laptop bags. At some point I was just toying around with mine and my eye caught onto the handle of the zipper:</p>
<p>It had my name on it and it was even spelled with the Danish o-with-a-slash (ø) character (see picture below).<span id="more-147"></span></p>
<p>My first thought was that Microsoft certainly went all out to impress their conference attendees by individually customizing each bag, but it quickly turned out that both Ken and Steve had the same name on their bags. It was merely a coincidence (rather than a gesture): the company that made the bags simply has a name similar to mine.</p>
<p style="text-align: center;"><a href="http://inside.echobit.net/dreijer/wp-content/uploads/2008/10/img_0275.jpg"><img class="size-medium wp-image-149 aligncenter" title="img_0275" src="http://inside.echobit.net/dreijer/wp-content/uploads/2008/10/img_0275-300x225.jpg" alt="" width="300" height="225" /></a></p>
<p style="text-align: left;">It still made my day, though. What are the odds&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://inside.echobit.net/dreijer/archives/2008/10/10/name-clashing/feed/</wfw:commentRss>
		</item>
		<item>
		<title>For-loops and the Visual Studio debugger</title>
		<link>http://inside.echobit.net/dreijer/archives/2008/10/07/for-loops-and-the-visual-studio-debugger/</link>
		<comments>http://inside.echobit.net/dreijer/archives/2008/10/07/for-loops-and-the-visual-studio-debugger/#comments</comments>
		<pubDate>Tue, 07 Oct 2008 15:23:25 +0000</pubDate>
		<dc:creator>dreijer</dc:creator>
		
		<category><![CDATA[Development]]></category>

		<guid isPermaLink="false">http://inside.echobit.net/?p=33</guid>
		<description><![CDATA[A while ago, a friend of mine discovered an interesting discrepancy in how the Visual Studio debugger shows local variables in relation to for-loops. When he demonstrated the issue, I decided to investigate the problem a little further.
To start things off, consider the following code snippet:
void SomeFunction()
{
    for (int i = 0; [...]]]></description>
			<content:encoded><![CDATA[<p>A while ago, a <a href="http://www.blacksmith-studios.dk/blog/">friend of mine</a> discovered an interesting discrepancy in how the Visual Studio debugger shows local variables in relation to for-loops. When he demonstrated the issue, I decided to investigate the problem a little further.</p>
<p>To start things off, consider the following code snippet:</p>
<pre>void SomeFunction()
{
    for (int i = 0; i &lt; 1; ++i);

    for (int i = 0; i &lt; 1; ++i);
}</pre>
<p>The function above contains two for-loops that do absolutely nothing useful (unless you consider looping useful). What makes these two for-statements interesting, though, is how they both use an iterator variable named <code>i</code>. According to the C++ standard (C++98), <code>i</code> is local to the for-statement in which it is defined. That is, code outside the scope of the for-statement cannot access <code>i</code>.<span id="more-33"></span></p>
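<p>A small sketch makes the scoping rule concrete. The function name <code>OuterValueAfterLoops</code> is made up for illustration. With a standards-conforming compiler, each for-statement gets its own <code>i</code>, and the outer variable is untouched once the loops are done.</p>

```cpp
// Each for-statement below declares its own `i`, which hides the outer one
// only for the duration of that statement.
int OuterValueAfterLoops() {
    int i = 42;                          // outer variable

    for (int i = 0; i < 3; ++i) { }      // first loop-local `i`

    for (int i = 10; i < 12; ++i) { }    // second, unrelated loop-local `i`

    return i;                            // refers to the outer `i` again
}
```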
<p>Older Microsoft C++ compilers (pre-Visual Studio 2005) didn&#8217;t follow the standard and instead considered <code>i</code> to be valid outside the scope of the for-loop. Fortunately, Microsoft decided to make their newer versions of the compiler significantly more standards-compliant, and one of the things they fixed was the scope of for-loops. As a result, the two iterator variables (<code>i</code>) in the code snippet above are considered to be separate entities.</p>
<p>However, it looks like they forgot to update the Visual Studio debugger to reflect this change. The screen shot of the Locals window below illustrates what I’m talking about:</p>
<p style="text-align: center;"><a href="http://inside.echobit.net/dreijer/wp-content/uploads/2008/10/locals.png"><img class="size-medium wp-image-113 aligncenter" title="The Visual Studio Locals window" src="http://inside.echobit.net/dreijer/wp-content/uploads/2008/10/locals.png" alt="" width="288" height="119" /></a></p>
<p>For some reason, the variable <code>i</code> is shown twice in the list when execution hits the second loop, even though it has gone out of scope in the first loop and is no longer valid. The disassembly of the code reveals that <code>i</code> has been created as a local variable on the stack (a separate stack location exists for each loop variable). The compiler could easily have used the same local stack location for both of these loop variables, of course, since they’re never used at the same time, but for this post we&#8217;re only considering unoptimized debug builds.</p>
<p>The Visual Studio debugger and WinDbg differ somewhat in how they show these stale variables. The first simply shows all the variables in the Locals window as in the previous screenshot, which makes it really hard to distinguish them from each other. WinDbg, on the other hand, shows the variables that have gone out of scope as &lt;Eclipsed&gt;:</p>
<p style="text-align: center;"><a href="http://inside.echobit.net/dreijer/wp-content/uploads/2008/10/eclipsed.png"><img class="size-medium wp-image-121 aligncenter" title="eclipsed" src="http://inside.echobit.net/dreijer/wp-content/uploads/2008/10/eclipsed-300x139.png" alt="" width="300" height="139" /></a></p>
<p>Executing &#8220;dv i&#8221; in WinDbg yields the following output:</p>
<pre>0:000&gt; dv i
    i = 1
    i = 1</pre>
<p>This tells us that the debugger is very much aware of the presence of both of the iterator variables even though neither is valid anymore.</p>
<p>If we dig further, it turns out that variables declared in the body of a single-line for-loop also show up in the debugger once they have gone out of scope. The following piece of code illustrates that:</p>
<pre>for (int i = 0; i &lt; 1; ++i)
    int foo = 0;

for (int i = 0; i &lt; 1; ++i)
    int foo = 0;

int i = 1;

// When we get here, we have three <code>i</code>'s and two <code>foo</code>'s in the Locals window</pre>
<p>Contrast this to the behavior of e.g. while-loops where variables declared in the body of the loop are correctly removed from the Locals window once they go out of scope.</p>
<p>Interestingly, the issue seems to only manifest itself whenever scope brackets, { and }, aren&#8217;t used. Consider the following example:</p>
<pre>for (int i = 0; i &lt; 1; ++i)
{
    int foo = 0;
}

for (int i = 0; i &lt; 1; ++i)
{
    int foo = 0;
}

{
    int bar = 1;
}

// When we get here, only the two i's are present in the Locals window</pre>
<p>In this example, there is always only one <code>foo</code> or one <code>bar</code> shown in the debugger, and they disappear from the Locals window whenever the variable goes out of scope. As soon as the scope brackets are removed, though, we get the aforementioned behavior.</p>
<p>It&#8217;s worth mentioning that the assembly code of a for-loop with and without the scope brackets is, of course, completely identical.</p>
<p>This lingering local variable issue is most likely a bug in the debugger since the for-loops should essentially work just like while- and do-while-loops. It’s probably a relic from the days when the Visual C++ compiler didn&#8217;t conform as much to the C++ standard as it does now and variables declared in for-loops didn&#8217;t just have local scope.</p>
<h2>Implications</h2>
<p>At first, this behavior seems quite harmless. However, take a look at what happens in the following case:</p>
<pre>int n = /* someArbitraryNumber */;

for (int i = 0; i &lt; n; ++i);

int m = /* someOtherArbitraryNumber */;

for (int i = 0; i &lt; m; ++i);

// .. A few more loops using i ..

int i = /* Complex calculation */;

...

// When looking at i in the Locals window at this point, it's difficult to determine exactly
// which one is used in the calculation below since multiple exist.
int result = i + /* some other variables */;</pre>
<p>In the example above, the variable <code>i</code> is used multiple times. When you get to calculating the result (the last line), how do you know which <code>i</code> to look at in the debugger? If you&#8217;re lucky enough to step over the complex calculation in the debugger, the correct variable will most likely be highlighted in red by Visual Studio, but you could just as well have set a breakpoint further down the code path or have broken into the debugger due to an exception.</p>
<p>It&#8217;s fairly easy to get past this issue simply by looking at the actual instruction in disassembly mode to figure out the stack location that’s being referred to, or by hovering the mouse over the variable in the source file, but it seems to me that this is unnecessary work for something that the debugger should be able to tell you right away in the Locals window. After all, the previous instances of <code>i</code> have gone out of scope and are no longer valid, so why show them?</p>
<h2>Test Script</h2>
<p>I&#8217;ve uploaded a simple test script if you want to test out the behavior yourself: <a href="http://inside.echobit.net/dreijer/files/loop.cpp">loop.cpp<br />
</a></p>
]]></content:encoded>
			<wfw:commentRss>http://inside.echobit.net/dreijer/archives/2008/10/07/for-loops-and-the-visual-studio-debugger/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Stupid mistake: Forgetting to return a value</title>
		<link>http://inside.echobit.net/dreijer/archives/2008/08/23/stupid-mistake-forgetting-to-return-a-value/</link>
		<comments>http://inside.echobit.net/dreijer/archives/2008/08/23/stupid-mistake-forgetting-to-return-a-value/#comments</comments>
		<pubDate>Sat, 23 Aug 2008 10:52:39 +0000</pubDate>
		<dc:creator>dreijer</dc:creator>
		
		<category><![CDATA[Stupid mistakes]]></category>

		<guid isPermaLink="false">http://inside.echobit.net/?p=41</guid>
		<description><![CDATA[No matter how long you&#8217;ve been programming, you&#8217;re bound to hit a problem at some point which takes you multiple hours or days to fix, and which turns out to be a simple mistake on your part. This post is the first in a new series I’ll be writing on stupid programming mistakes I&#8217;ve made [...]]]></description>
			<content:encoded><![CDATA[<p>No matter how long you&#8217;ve been programming, you&#8217;re bound to hit a problem at some point which takes you multiple hours or days to fix, and which turns out to be a simple mistake on your part. This post is the first in a new series I’ll be writing on stupid programming mistakes I&#8217;ve made in the past (and that I&#8217;m not particularly proud of).</p>
<p>A couple of days ago, I hit a problem when testing the <a href="http://www.lanbridger.com">LAN Bridger</a> central server, which is hosted on a Linux box. I do most of my development and testing on Windows, though, so as a result the LAN Bridger server runs on both Windows and Linux for ease of debugging. From time to time, and especially toward the end of a release cycle, I typically have to compile and test the server thoroughly on Linux to make sure everything works.</p>
<p>I usually do all of my testing with debug builds (contrary to <a href="http://www.nynaeve.net/?p=184">Ken’s beliefs</a>). Once I’m sure everything runs smoothly and I don’t get any assertions or erratic behavior, I turn to release builds. In this particular case, the server worked flawlessly for debug builds, but exhibited a rather strange behavior for release builds.<span id="more-41"></span></p>
<p>LAN Bridger uses a connection manager, which is responsible for timing out connections if they become idle. It basically checks to see when the last packet was received and if it’s more than a certain threshold, the connection is closed. When testing the debug build of the server, the connections were correctly timed out and removed. For release builds, not so much.</p>
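<p>As a rough illustration of that idle check (not LAN Bridger&#8217;s actual code), the sketch below assumes a millisecond tick counter stored per connection. The names <code>Connection</code>, <code>lastPacketTick</code> and <code>HasTimedOut</code> are hypothetical.</p>

```cpp
#include <cstdint>

// Hypothetical per-connection state; the field name is illustrative only.
struct Connection {
    std::uint32_t lastPacketTick;  // tick count (ms) when the last packet arrived
};

// Returns true if more than thresholdMs has elapsed since the last packet.
bool HasTimedOut(const Connection& c, std::uint32_t nowTick,
                 std::uint32_t thresholdMs) {
    // Unsigned subtraction gives the right elapsed time even if the
    // 32-bit tick counter wrapped around between the two readings.
    const std::uint32_t elapsed = nowTick - c.lastPacketTick;
    return elapsed > thresholdMs;
}
```

<p>A check like this is only as good as the timestamps fed into it, which is exactly where things went wrong here.</p>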
<p>My first course of action was to attach a debugger (gdb) to the server process. Unfortunately, since this was a release build, debugging became much harder. For instance, as shown in the thread listing below it was pretty difficult to identify which thread corresponded to the connection manager, even after showing the stack traces for each thread:</p>
<pre>(gdb) info threads
  9 Thread 114696 (LWP 2979) 0x08129acf in nanosleep () at cryptlib.h:1144
  8 Thread 98311 (LWP 2978)  0x08129891 in accept () at cryptlib.h:1144
  7 Thread 81926 (LWP 2977)  0x081299b1 in recvfrom () at cryptlib.h:1144
* 6 Thread 65541 (LWP 2976)  0x08129acf in nanosleep () at cryptlib.h:1144
  5 Thread 49156 (LWP 2975)  __pthread_sigsuspend (set=0xb6f8dcdc) at ../linuxthreads/sysdeps/
                             unix/sysv/linux/pt-sigsuspend.c:56
  4 Thread 32771 (LWP 2974)  __pthread_sigsuspend (set=0xb778dcdc) at ../linuxthreads/sysdeps/
                             unix/sysv/linux/pt-sigsuspend.c:56
  3 Thread 16386 (LWP 2973)  __pthread_sigsuspend (set=0xb7f8dcdc) at ../linuxthreads/sysdeps/
                             unix/sysv/linux/pt-sigsuspend.c:56
  2 Thread 32769 (LWP 2972)  0x082936fa in poll ()
  1 Thread 16384 (LWP 2969)  __pthread_sigsuspend (set=0xbf97834c) at ../linuxthreads/sysdeps/
                             unix/sysv/linux/pt-sigsuspend.c:56</pre>
<p>By manually setting a breakpoint in the connection manager&#8217;s main loop, I was able to determine that it was thread number 6. The stack trace for that particular thread was very uninformative:</p>
<pre>(gdb) bt
#0  0x08129acf in nanosleep () at cryptlib.h:1144
#1  0x00000000 in ?? ()</pre>
<p>The first thought that hit me was that I was trashing the stack, but setting a breakpoint in the connection manager thread clearly indicated that it was just a misleading stack trace and that the thread was running as intended.</p>
<p>What puzzled me the most was why connections were removed correctly for debug builds, but not for release builds. The server also ran flawlessly on Windows. The only real difference between the debug and release builds was the <strong>-O2</strong> compiler flag. To rule out the possibility that the compiler was making a funky optimization error, I decided to upgrade gcc to the latest version (4.3).</p>
<p>This was my first real mistake – never assume that the compiler is doing something wrong until you’ve thoroughly examined all other possibilities. I guess I got blinded by certain optimization bugs I had <a href="http://www.nynaeve.net/?p=108">read about</a> in the past. Unfortunately, the bug also manifested itself even when the server was compiled with the updated version of gcc.</p>
<p>Having wasted valuable time upgrading the compiler and making both the server and all of its third-party dependencies compile and link, I went back to the root of the problem. By stepping through the connection manager, I noticed that the variable, which held the time the last packet was received, was zero. Interestingly, that particular variable was only being assigned values from a single function, <strong>GetTickCount()</strong>, which was supposed to return a monotonically increasing number, i.e. it should never be zero.</p>
<p>Here’s where I made the second mistake. Rather than just checking the return value of the <strong>GetTickCount()</strong> function, I decided to verify that the calculation in <strong>GetTickCount()</strong> was correct. (I’d recently updated the implementation of the function so I thought it quite possible that I&#8217;d made a mistake somewhere.) The result of the calculation, however, was correct.</p>
<p>Finally, I checked the actual return value of <strong>GetTickCount()</strong>. The value of the <strong>eax</strong> register was 0. This was impossible given that I’d just verified that the calculation was correct&#8230; unless I’d forgotten to return the result!</p>
<h2>My own fault</h2>
<p>There were several reasons why this mistake had gone unnoticed. First of all, I had originally coded the function while on Windows and it never got compiled by the Visual Studio compiler since the function was only defined for Linux. Secondly, I had deliberately left out the &#8216;<strong>-Wall</strong>&#8217; compiler flag because I got a ton of warnings from third-party libraries. Unfortunately, this meant I didn’t get a warning when I forgot to return a value from the <strong>GetTickCount()</strong> function. (Compare that to the Visual Studio compiler which automatically promotes such [rather serious] warnings to errors.)</p>
<p>These are both mistakes on my end, and indeed some that I should’ve caught. However, the point of this story is to highlight just how easily small things can lead you astray. For instance, I didn’t notice that the problem was actually in the <strong>GetTickCount()</strong> function until much later in the process after taking a major detour.</p>
<p>I can&#8217;t help but wonder how quickly I would’ve caught this mistake if I had been debugging on Windows. As you might have guessed, I’m much more comfortable with WinDbg and the Visual Studio debugger than I am with gdb, and it wouldn’t have taken me more than a few seconds to notice the incorrect value of the timestamp variable, or the incorrect return value of the <strong>GetTickCount()</strong> function. A bug like this really shows the importance of knowing your tools &#8212; and knowing them well. On the plus side, though, I got a chance to work much more closely with gdb than I usually do.</p>
<h2>Wrapping Up: Why it worked for debug builds</h2>
<p>The <strong>GetTickCount()</strong> function is used to get a timestamp that can be used in timeout operations. The connection manager worked correctly for debug builds because of how the compiler repurposes the registers. On x86, return values are stored in the <strong>eax</strong> register, but since <strong>GetTickCount()</strong> didn’t return a value, <strong>eax</strong> was essentially left untouched and would instead contain the result of a previous calculation or the return value from another function.</p>
<p>For the debug builds, when receiving a packet and storing the current time in a timestamp variable, <strong>GetTickCount()</strong> might return zero due to the way the compiler had purposed the registers. When the connection manager later checked to see if the connection had timed out, it subtracted the previously stored timestamp from the current value of <strong>GetTickCount()</strong>, which might be non-zero depending on the current value of the <strong>eax</strong> register. The difference might then be larger than the timeout threshold, and the connection would be removed.</p>
<p>For the release builds, <strong>eax</strong> was always zero when <strong>GetTickCount()</strong> was called, and thus the timeout threshold was never exceeded and no connections were ever removed.</p>
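<p>To make the failure mode concrete, here is a minimal sketch of what a Linux stand-in for <strong>GetTickCount()</strong> and the timeout check might look like. The function bodies, the <strong>HasTimedOut</strong> helper, and the use of <strong>clock_gettime</strong> are assumptions for illustration, not the server&#8217;s actual code. GCC&#8217;s <strong>-Wreturn-type</strong> warning, which <strong>-Wall</strong> enables, is the one that flags a missing return like this.</p>

```cpp
#include <cstdint>
#include <time.h>

// Hypothetical Linux stand-in for the Win32 GetTickCount(): milliseconds
// read from a monotonic clock. Name and body are assumptions for illustration.
std::uint32_t GetTickCount()
{
    timespec ts{};
    clock_gettime(CLOCK_MONOTONIC, &ts);

    const std::uint64_t ms =
        static_cast<std::uint64_t>(ts.tv_sec) * 1000u +
        static_cast<std::uint64_t>(ts.tv_nsec) / 1000000u;

    // This is the kind of statement that was missing. Without it the caller
    // reads whatever happens to be in eax (formally undefined behavior).
    return static_cast<std::uint32_t>(ms);
}

// Hypothetical timeout check: elapsed time is "now" minus the stored timestamp.
bool HasTimedOut(std::uint32_t lastPacketTick, std::uint32_t nowTick,
                 std::uint32_t timeoutMs)
{
    // Unsigned subtraction stays correct across a 32-bit wraparound.
    return (nowTick - lastPacketTick) > timeoutMs;
}
```

<p>With both the stored timestamp and the current tick stuck at zero, as in the release builds, <strong>HasTimedOut(0, 0, timeout)</strong> is false for any threshold, which matches the connections that were never removed.</p>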
<p><strong>If you’d like to share your own stupid mistake stories, drop me a line at [dreijer at echobit dot net] and I&#8217;ll publish them here (with due credit, of course).</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://inside.echobit.net/dreijer/archives/2008/08/23/stupid-mistake-forgetting-to-return-a-value/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Reflections on Hungarian notation</title>
		<link>http://inside.echobit.net/dreijer/archives/2008/08/12/reflections-on-hungarian-notation/</link>
		<comments>http://inside.echobit.net/dreijer/archives/2008/08/12/reflections-on-hungarian-notation/#comments</comments>
		<pubDate>Tue, 12 Aug 2008 20:41:07 +0000</pubDate>
		<dc:creator>dreijer</dc:creator>
		
		<category><![CDATA[Development]]></category>

		<category><![CDATA[Rants]]></category>

		<guid isPermaLink="false">http://inside.echobit.net/?p=40</guid>
		<description><![CDATA[Coding style is a very sensitive subject. The war on Hungarian notation, for instance, has been going on for ages and is still very much alive.
A few days ago I stumbled upon Herb Sutter&#8217;s latest remarks on his personal preference and some of the comments to his post sparked my interest. For instance, I find the [...]]]></description>
			<content:encoded><![CDATA[<p>Coding style is a very sensitive subject. The war on Hungarian notation, for instance, has been going on for ages and is still very much alive.</p>
<p>A few days ago I stumbled upon Herb Sutter&#8217;s latest remarks on his <a href="http://herbsutter.wordpress.com/2008/07/15/hungarian-notation-is-clearly-goodbad/">personal preference</a> and some of the comments to his post sparked my interest.<span id="more-40"></span> For instance, I find the following statement particularly bad:</p>
<blockquote><p>&#8220;The compiler or IDE knows the type of the variable, so why do you need to prefix it?&#8221;</p></blockquote>
<p>The answer is <strong>readability</strong>. I think <a href="http://herbsutter.wordpress.com/2008/07/15/hungarian-notation-is-clearly-goodbad/#comment-686">John</a> hit it spot on in his comment to Herb&#8217;s post:</p>
<blockquote><p>“Variable names are for the code reader, not just the compiler. … Most people read code more often than they write it. The IDE is not always available (and it’s certainly not quick, if we’re talking about Visual Studio). The variable declaration is often far away from its usage. Having the type embedded in the name saves people from having to dig around for the declaration.”</p></blockquote>
<p>Personally, I don&#8217;t prefix variables with type information such as <strong>dw </strong>or <strong>ul </strong>since I don&#8217;t believe it makes the code any clearer. I&#8217;m more interested in <em>how </em>the variable is used. For instance, I use <strong>p</strong> to denote pointer types since the semantics are clearly different from using an integral type, as illustrated below:<br />
&#8220;if (foo == 0)” and “if (pFoo == 0)”</p>
<p>One of my biggest pet peeves is member variables. Here at work, people have very different ideas on how member variables should be identified. Some just write them like they do local variables, others capitalize the first letter, and yet others prefix the variables with an underscore or <strong>m_</strong>. I prefer the last option simply because it improves readability. It enables me to quickly get an overview of a piece of code and the scope of its variables without having to use any features of the IDE.</p>
<p>To demonstrate to our coworkers the usefulness of showing the scope of their variables by augmenting the variable name, a <a href="http://steinware.dk">friend</a> and I compiled a list with three different scenarios:</p>
<ul>
<li>If intelligent syntax highlighting isn&#8217;t available, you cannot figure out whether you&#8217;re assigning a value to a local or a member variable just by looking at the code. Instead, you have to manually search through the function or rely on the IDE (e.g. by hovering the cursor over the variable name).</li>
<li>Local and member variable name clashing:
<pre>void SomeClass::SomeFunc()
{
    // Define a local variable
    int foo = ...;

    // Assign the local variable to a member variable
    // We have to use the 'this' keyword since the two variables are named identically
    this-&gt;foo = foo;
}</pre>
</li>
<li>Prefixing member variables makes auto-suggestion tools like IntelliSense and Visual Assist more accurate. For instance, typing &#8216;m_&#8217; would cause the IDE to suggest only member variables, whereas typing the variable name would bring up local, global, and member variables in the suggestion box. (I know this point sort of contradicts the first scenario since it relies on the IDE, but I think it&#8217;s important to highlight how prefixing can also increase the usefulness of such tools.)</li>
</ul>
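<p>For comparison, here is a small sketch of the same kind of assignment with prefixed members. The <strong>Connection</strong> class and its members are made up for illustration; the point is only that the prefix makes the scope visible at the point of use and removes the name clash.</p>

```cpp
#include <string>

// Hypothetical class, for illustration only; all names are assumptions.
class Connection
{
public:
    void SetTimeout(int timeoutMs)
    {
        // The prefix shows which side is the member, so 'this->' isn't needed.
        m_timeoutMs = timeoutMs;
    }

    void Rename(const std::string& name)
    {
        const std::string trimmed = name.substr(0, 16); // local: no prefix
        m_name = trimmed;                               // member: m_ prefix
    }

    int TimeoutMs() const { return m_timeoutMs; }
    const std::string& Name() const { return m_name; }

private:
    int m_timeoutMs = 0;
    std::string m_name;
};
```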
<p>Before concluding this post, I think it&#8217;s also worth commenting on some people&#8217;s habit of embedding type information in variables. As I said before, I don&#8217;t personally do it, but not because I think it creates a huge maintainability issue. Changing the type of a variable (e.g. from int to float) changes the contract between the variable and the code, and the programmer is forced to <a href="http://blogs.msdn.com/larryosterman/archive/2004/06/22/162629.aspx#163529">go through every single location</a> that uses it to make sure the code still behaves correctly. Not too bad, in my opinion&#8230;</p>
<p>These viewpoints are, of course, completely my own, and I&#8217;m sure a lot of people disagree with me. If you&#8217;re one of those, I&#8217;d love to hear your sentiments about why you think I&#8217;m wrong.</p>
]]></content:encoded>
			<wfw:commentRss>http://inside.echobit.net/dreijer/archives/2008/08/12/reflections-on-hungarian-notation/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Be careful who you blame (another McAfee issue)</title>
		<link>http://inside.echobit.net/dreijer/archives/2008/08/10/be-careful-who-you-blame-another-mcafee-issue/</link>
		<comments>http://inside.echobit.net/dreijer/archives/2008/08/10/be-careful-who-you-blame-another-mcafee-issue/#comments</comments>
		<pubDate>Sun, 10 Aug 2008 19:56:47 +0000</pubDate>
		<dc:creator>dreijer</dc:creator>
		
		<category><![CDATA[Rants]]></category>

		<guid isPermaLink="false">http://inside.echobit.net/?p=24</guid>
		<description><![CDATA[I recently talked about how third-party applications sometimes have a bad influence on other applications. In this post, I&#8217;ll continue that series.
I just got a new laptop from work preloaded with the usual stuff such as an office suite and antivirus software. I tend to prefer manually installing only the software I need when I [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://inside.echobit.net/archives/2008/02/11/be-careful-who-you-blame/">I recently talked about</a> how third-party applications sometimes have a bad influence on other applications. In this post, I&#8217;ll continue that series.</p>
<p>I just got a new laptop from work preloaded with the usual stuff such as an office suite and antivirus software. I tend to prefer manually installing only the software I need when I get my hands on a new computer, but since this was for work I was pretty limited in what I was allowed to do with it. So, I just shed a tear and then went along pretending I was happy.</p>
<p>Not surprisingly, it didn&#8217;t take long before I started noticing problems.<span id="more-24"></span></p>
<p>Here at work, we use Exchange and Outlook 2007 for e-mail. I&#8217;ve installed two profiles, one for my Exchange account and one for my personal account. I keep the Exchange profile active most of the day (after all, I&#8217;m at work..), but occasionally I exit Outlook and switch profiles so I can check my personal e-mail.</p>
<p>When trying to relaunch Outlook, however, nothing happens. A quick look in Task Manager reveals two instances of the Outlook.exe process, one of them consuming more memory than the other. Killing the first process (the one with the biggest memory footprint) causes the second to continue running and the new instance of Outlook to appear.</p>
<p>It would seem the old instance of Outlook isn&#8217;t shutting down correctly, and the lingering process is keeping all new instances of Outlook from loading properly. This issue is quite problematic for ordinary (read: non-technical) users who think the application has already exited because the Outlook window has disappeared. Instead, they probably try to launch the application five more times before giving up and finally calling tech support.</p>
<p>It&#8217;s not hard to guess who gets the blame: Outlook, yet again.</p>
<p>The prospect of falsely blaming Outlook, together with the fact that a coworker had seen similar behavior with his copy of Outlook, made me dig deeper to find the root cause of why the application was hanging at shutdown. Was it really a bug in Outlook, or was it caused by something else?</p>
<p>The first step was to figure out <em>when</em> the issue surfaced. It occurred to me that I&#8217;d only noticed the behavior when I was switching from the Exchange profile to my personal e-mail profile. This led me to believe it might be an issue with Exchange since my personal e-mail accounts only used POP3 and IMAP. The company I work for uses an off-site e-mail provider, and as such we have to log on the very first time Outlook connects to the Exchange server. Digging deeper, it turned out Outlook didn&#8217;t hang if I dismissed the Log On dialog and instead quit the application right away. Furthermore, if I <em>did</em> log on and the first entry in my Inbox was anything but an e-mail, such as a calendar event, and I closed Outlook before opening any e-mails, the application wouldn&#8217;t hang upon shutdown either.</p>
<p>Unfortunately, the previous analysis didn&#8217;t shed any light on <em>why</em> this issue only happened when accessing Exchange. The natural next step was therefore to attach a debugger to the lingering process. WinDbg spat out the following message right away:</p>
<pre>Break-in sent, waiting 30 seconds...
WARNING: Break-in timed out, suspending.
This is usually caused by another thread holding the loader lock</pre>
<p>Aha, sounds like we&#8217;re dealing with a deadlocked process. WinDbg revealed that only two threads were left in the process:</p>
<pre>0:000&gt; ~* k

.  0  Id: d34.5d4 Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr
0013f7bc 7c90df3c ntdll!KiFastSystemCallRet
0013f7c0 7c8025db ntdll!NtWaitForSingleObject+0xc
0013f824 7c802542 kernel32!WaitForSingleObjectEx+0xa8
0013f838 77566f71 kernel32!WaitForSingleObject+0x12
0013f854 775146e7 ole32!CDllHost::ClientCleanupFinish+0x30
0013f880 77514657 ole32!DllHostProcessUninitialize+0x80
0013f89c 774ff231 ole32!ApartmentUninitialize+0xd6
0013f8b4 774fee98 ole32!wCoUninitialize+0x41
0013f8d0 05f9d912 <strong>ole32!CoUninitialize+0x5b</strong>
WARNING: Stack unwind information not available. Following frames may be wrong.
0013fd34 05fa6b01 <strong>saPlugin+0xd912</strong>
0013fd5c 7c923aba <strong>saPlugin!DllUnregisterServer+0x62e1</strong> ; DllMain
0013fde0 7c81ca96 ntdll!LdrShutdownProcess+0x14f
0013fed4 7c81cb0e kernel32!_ExitProcess+0x42
0013fee8 78131720 kernel32!ExitProcess+0x14
0013fef0 78131a03 msvcr80!__crtExitProcess+0x14
0013ff2c 78131a4b msvcr80!_cinit+0x101
0013ff3c 300051f6 msvcr80!exit+0xd
0013ffc0 7c817067 OUTLOOK+0x51f6
0013fff0 00000000 kernel32!BaseProcessStart+0x23

1  Id: d34.17a8 Suspend: 1 Teb: 7ffde000 Unfrozen
ChildEBP RetAddr
00fffc0c 7c90df3c ntdll!KiFastSystemCallRet
00fffc10 7c91b22b ntdll!NtWaitForSingleObject+0xc
00fffc98 7c901046 ntdll!RtlpWaitForCriticalSection+0x132
00fffca0 7c91e395 ntdll!RtlEnterCriticalSection+0x46
00fffd18 7c90e437 ntdll!_LdrpInitialize+0xf0
00000000 00000000 ntdll!KiUserApcDispatcher+0x7</pre>
<p>The stack trace for the main thread (thread 0) shows that the application is shutting down. In fact, it&#8217;s stuck trying to uninitialize COM in the saplugin module. Let&#8217;s have a look at who owns that module:</p>
<pre>0:000&gt; lmvm <strong>saplugin</strong>
start    end        module name
04e20000 04e57000   saPlugin  (export symbols)  <strong>C:\Program Files\SiteAdvisor\6261\saPlugin.dll</strong>
Loaded symbol image file: C:\Program Files\SiteAdvisor\6261\saPlugin.dll
Image path: C:\Program Files\SiteAdvisor\6261\saPlugin.dll
Image name: saPlugin.dll
Timestamp:        Fri May 16 18:40:18 2008 (482DB8F2)
CheckSum:         00040AA8
ImageSize:        00037000
File version:     2.6.0.6261
Product version:  2.6.0.0
File flags:       0 (Mask 3F)
File OS:          4 Unknown Win32
File type:        1.0 App
File date:        00000000.00000000
Translations:     0409.04b0</pre>
<p>Interesting. The module, which is called SiteAdvisor, seems to be owned by McAfee and has been installed as part of McAfee VirusScan. According to their <a href="http://www.siteadvisor.com/">website</a>, SiteAdvisor protects your computer against viruses, spam, and adware. I&#8217;d previously noticed it as a plugin in Firefox, but had decided against uninstalling it in order to figure out what it actually did (who knows, maybe it did something good).</p>
<p>What puzzled me, however, was why it suddenly appeared in Outlook. It hadn&#8217;t been installed as an add-in, so something had to cause saplugin.dll to be loaded by Outlook. I restarted Outlook and instructed it to break whenever saplugin.dll was loaded:</p>
<pre>0:000&gt; sxe ld:saplugin; g

ModLoad: 06000000 06037000   <strong>C:\Program Files\SiteAdvisor\6261\saPlugin.dll</strong>
eax=00000003 ebx=00000000 ecx=0602f050 edx=f6000000 esi=00266110 edi=00000000
eip=7c90e4f4 esp=0013dc54 ebp=0013dd48 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000246
ntdll!KiFastSystemCallRet:
7c90e4f4 c3

0:000&gt; kb
ChildEBP RetAddr  Args to Child
0013dc50 7c90d50c 7c91d956 00000990 ffffffff ntdll!KiFastSystemCallRet
0013dc54 7c91d956 00000990 ffffffff 0013dd2c ntdll!ZwMapViewOfSection+0xc
0013dd48 7c91624a 001d8d10 0013ddd4 0013e2fc ntdll!LdrpMapDll+0x759
0013e008 7c9164b3 00000000 001d8d10 0013e2fc ntdll!LdrpLoadDll+0x1e9
0013e2b0 7c801bbd 001d8d10 0013e2fc 0013e2dc ntdll!LdrLoadDll+0x230
0013e318 7c80aeec 0013e338 00000000 00000000 kernel32!LoadLibraryExW+0x18e
0013e32c 10001282 <strong>0013e338 </strong>003a0043 0050005c kernel32!LoadLibraryW+0x11
WARNING: Stack unwind information not available. Following frames may be wrong.
<strong>0013e780 7e44f8ee 00050003 000c1570 0013e7ac saHook!saHooker_Uninitialize+0xc2</strong>
0013e7b4 7c90e453 0013e7c4 00000080 00000080 USER32!__fnHkINLPCBTCREATESTRUCT+0x82
0013e840 7e43e1ad 7e43e18a 00000003 000c1570 ntdll!KiUserCallbackDispatcher+0x13
0013e868 74730f0a 000102df 00000003 000c1570 USER32!NtUserCallNextHookEx+0xc
0013e8b0 7e431923 00000003 000c1570 0013e910 MSCTF!SysCBTProc+0xd2
0013e8e4 7e44f8e7 00050003 000c1570 0013e910 USER32!DispatchHookA+0x101
0013e918 7c90e453 0013e928 0000007c 0000007c USER32!__fnHkINLPCBTCREATESTRUCT+0x7b
0013e9a0 7e42e389 7e42e34f 00000000 0013eec8 ntdll!KiUserCallbackDispatcher+0x13
0013ee44 7e42e442 00000000 0013eec8 00000000 USER32!NtUserCreateWindowEx+0xc
0013eef0 7e42d0d6 00000000 32a7d830 00000000 USER32!_CreateWindowEx+0x1ed
0013ef2c 32629a19 00000000 32a7d830 00000000 USER32!CreateWindowExW+0x33
0013ef94 326d3cc1 00000000 32a7d830 00000000 mso!Ordinal6700+0x301
0013efdc 326d3ac8 0008157e 0127ad00 7c80e4cd mso!Ordinal3679+0x6b

0:000&gt; du <strong>0013e338 </strong>
0013e338  "C:\Program Files\SiteAdvisor\626"
0013e378  "1\saPlugin.dll"</pre>
<p>That explained it! SiteAdvisor was injected into all processes by means of a <a href="http://msdn.microsoft.com/en-us/library/ms632589(VS.85).aspx">global hook</a> called <strong>sahook</strong>.</p>
<p>Now, the stack trace shown earlier, which identified saplugin as being part of the critical path that led to the deadlock, could just as well have been caused by an earlier event that eventually manifested itself when SiteAdvisor was being unloaded. As with classic memory corruption cases, it might be the case that some other module in the process had done something bad which later caused the process to deadlock. Consequently, more digging for proof was required before blaming SiteAdvisor.</p>
<p>Fortunately, I didn&#8217;t have to look far. It was clear that the main thread of execution was waiting for something &#8212; after all it was stuck in a call to <strong>WaitForSingleObject</strong>. The second thread in the process was attempting to acquire a lock as can be seen by the call to <strong>RtlEnterCriticalSection</strong>:</p>
<pre>0:000&gt; ~1 kb
ChildEBP RetAddr  Args to Child
00fffc0c 7c90df3c 7c91b22b 00000218 00000000 ntdll!KiFastSystemCallRet
00fffc10 7c91b22b 00000218 00000000 00000000 ntdll!NtWaitForSingleObject+0xc
00fffc98 7c901046 0197b178 7c91e395 7c97b178 ntdll!RtlpWaitForCriticalSection+0x132
00fffca0 7c91e395 <strong>7c97b178 </strong>00fffd2c 00000004 ntdll!RtlEnterCriticalSection+0x46
00fffd18 7c90e437 00fffd2c 7c900000 00000000 ntdll!_LdrpInitialize+0xf0
00000000 00000000 00000000 00000000 00000000 ntdll!KiUserApcDispatcher+0x7

0:000&gt; !critsec <strong>7c97b178</strong>
CritSec <strong>ntdll!LdrpLoaderLock</strong>+0 at <strong>7c97b178</strong>
LockCount          1
RecursionCount     1
OwningThread       <strong>5d4</strong>
EntryCount         64
ContentionCount    64
*** Locked

0:000&gt; ~
.  0  Id: d34.<strong>5d4</strong> Suspend: 1 Teb: 7ffdf000 Unfrozen
   1  Id: d34.17a8 Suspend: 1 Teb: 7ffde000 Unfrozen</pre>
<p>In other words, the second thread is trying to acquire the loader lock, which is already owned by the main thread. The latter is in turn waiting for COM to shut down (that second thread could very well be an APC call dispatched by the COM runtime), and voilà, we have a deadlock.</p>
<p><a href="http://msdn.microsoft.com/en-us/library/ms688715(VS.85).aspx">MSDN states</a> <a href="http://msdn.microsoft.com/en-us/library/ms695279(VS.85).aspx">in several places</a> that &#8220;Because there is no way to control the order in which in-process servers are loaded or unloaded, do not call <strong>CoInitialize</strong>, <strong>CoInitializeEx</strong>, or <strong>CoUninitialize</strong> from the <strong>DllMain</strong> function.&#8221; In general, doing something non-trivial in <strong>DllMain</strong> is a big no-no, and I&#8217;d expect anyone developing DLLs to have read Microsoft&#8217;s excellent document titled <a href="http://www.microsoft.com/whdc/driver/kernel/DLL_bestprac.mspx">Best Practices for Creating DLLs</a>.</p>
<p>Unfortunately, the stack trace I printed at the beginning of this post clearly shows that the saplugin module is calling <strong>CoUninitialize</strong> in its <strong>DllMain</strong> function while the loader lock has been acquired. (The loader lock is acquired by the call to <strong>LdrShutdownProcess</strong>.) This is a clear violation of the restrictions highlighted above.</p>
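<p>The lock cycle can be modelled in portable C++ without touching any Windows API. The sketch below is only an analogy under stated assumptions: <strong>g_loaderLock</strong> stands in for the loader lock, the worker stands in for the second thread, and a timed lock attempt replaces the wait that would otherwise never return. None of these names come from Windows or from SiteAdvisor.</p>

```cpp
#include <chrono>
#include <future>
#include <mutex>

// Stand-in for the OS loader lock; purely a model, not a Windows API.
std::timed_mutex g_loaderLock;

// Models the second thread: it needs the loader lock before it can finish.
// A real thread would block forever; try_lock_for lets the model give up.
bool WorkerCanFinish(std::chrono::milliseconds patience)
{
    if (!g_loaderLock.try_lock_for(patience))
        return false;
    g_loaderLock.unlock();
    return true;
}

// Models cleanup done from DllMain: the caller already holds the loader
// lock and then waits for the worker, as in the stack traces above.
bool CleanupUnderLoaderLock()
{
    std::lock_guard<std::timed_mutex> held(g_loaderLock);
    auto worker = std::async(std::launch::async, WorkerCanFinish,
                             std::chrono::milliseconds(100));
    return worker.get(); // the worker cannot get the lock while we hold it
}

// Models cleanup done from an explicit shutdown call made before unload,
// when the loader lock is not held by the caller.
bool CleanupOutsideLoaderLock()
{
    auto worker = std::async(std::launch::async, WorkerCanFinish,
                             std::chrono::milliseconds(100));
    return worker.get();
}
```

<p>In this model the first function reports failure because the worker times out waiting for a lock its caller holds, while the second succeeds. Doing the cleanup from an explicit shutdown routine rather than from <strong>DllMain</strong> is one common way to honor the restriction quoted above.</p>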
<p>I&#8217;ve uninstalled SiteAdvisor and the issue seems to be gone. Unfortunately, this case illustrates how easy it is for third-party applications to make their host applications exhibit undefined behavior.</p>
<p>With the previous issues I&#8217;ve had with McAfee&#8217;s software, I&#8217;m really starting to dislike the invasiveness (and bugginess) of their applications.</p>
]]></content:encoded>
			<wfw:commentRss>http://inside.echobit.net/dreijer/archives/2008/08/10/be-careful-who-you-blame-another-mcafee-issue/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Be careful who you blame</title>
		<link>http://inside.echobit.net/dreijer/archives/2008/02/11/be-careful-who-you-blame/</link>
		<comments>http://inside.echobit.net/dreijer/archives/2008/02/11/be-careful-who-you-blame/#comments</comments>
		<pubDate>Mon, 11 Feb 2008 17:54:17 +0000</pubDate>
		<dc:creator>dreijer</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://inside.echobit.net/archives/2008/02/11/be-careful-who-you-blame/</guid>
		<description><![CDATA[Don&#8217;t you just hate it when people mistakenly blame Windows and Microsoft for operating system crashes? I do. Most people don&#8217;t realize that it&#8217;s often the fault of a third-party component such as a kernel-mode driver. It must be rough for Microsoft &#8212; or any operating system developer, really &#8212; to deal with being blamed [...]]]></description>
			<content:encoded><![CDATA[<p>Don&#8217;t you just hate it when people mistakenly blame Windows and Microsoft for operating system crashes? I do. Most people don&#8217;t realize that it&#8217;s often the fault of a third-party component such as a kernel-mode driver. It must be rough for Microsoft &#8212; or any operating system developer, really &#8212; to deal with being blamed for others&#8217; faults.</p>
<p>The other day I caught myself making the exact same mistake, although in a slightly different setting. For quite some time, I&#8217;ve been having issues with Outlook 2003/2007. The application would hang for 30-60 seconds at startup and every time I received more than one e-mail. Outlook would always become responsive after a while, though, so I just ignored the problem and did other things during the wait.</p>
<p>This unresponsive behavior was particularly annoying if I was just opening Outlook to look up a particular e-mail and instead had to wait around for the application to load. I would see the same behavior if I received multiple e-mails and tried viewing the first one while the others were still downloading.<span id="more-22"></span></p>
<p>I had the great idea to blame Outlook &#8212; specifically, the IMAP portions. After all, Microsoft Office is a huge suite and has <a href="http://inside.echobit.net/archives/2008/01/15/importing-text-data-in-excel-2007/">been shown to be buggy in the past</a>. I probably blamed the IMAP portion because the aforementioned behavior first started when I added an IMAP account to Outlook. (This turned out to be a coincidence, though.)</p>
<p>Fed up with this waste of time, however, I finally decided to take a closer look at what was really going on. I attached a debugger and after a cursory look at the running threads, I noticed the following thread&#8217;s call stack (I&#8217;ve highlighted the interesting part):</p>
<pre>0  Id: e98.f6c Suspend: 1 Teb: 7ffdf000 Unfrozen
ChildEBP RetAddr  Args to Child
<strong>002ef868 774d06a0 776177d4 00000508 00000000 ntdll!KiFastSystemCallRet (FPO: [0,0,0])
002ef86c 776177d4 00000508 00000000 00000000 ntdll!NtWaitForSingleObject+0xc (FPO: [3,0,0])
002ef8dc 77617742 00000508 ffffffff 00000000 kernel32!WaitForSingleObjectEx+0xbe (FPO: [Non-Fpo])
002ef8f0 1190ad11 00000508 ffffffff 1192226c kernel32!WaitForSingleObject+0x12 (FPO: [Non-Fpo])
WARNING: Stack unwind information not available. Following frames may be wrong.
002ef910 119096c8 011992a4 00000000 002ef940 Scanotlk!ExchEntryPoint+0xc18
002ef920 1190a864 02b707c8 011992a4 0119b6d8 Scanotlk+0x96c8
002ef940 3010b112 02b707d4 011992a4 00000001 Scanotlk!ExchEntryPoint+0x76b</strong>
002ef964 2feb0bef 0119b60c 0098a3ac 02cc02c4 OUTLOOK!GetCurrentDate+0xac028
002ef984 30131c0c 0119460c 0098a3ac 02cc02c4 OUTLOOK!HrDisplayFolderPickerForOutlookToday+0x4ff65
002ef9b4 30131b47 00000001 083b7f40 00000001 OUTLOOK!GetCurrentDate+0xd2b22
002ef9e8 2faa2b74 00000001 00000000 083b7f40 OUTLOOK!GetCurrentDate+0xd2a5d
002ef9fc 67af64a2 01ca46a4 00000001 083b7f40 OUTLOOK!SmoothScroll+0x2591
002efa1c 76661a10 000d03be 0000040d 00000000 olmapi32!HrThisThreadAdviseSinkEx+0x328
002efa48 76661ae8 67af6443 000d03be 0000040d USER32!InternalCallWinProc+0x23
002efac0 76662a47 003e40ac 67af6443 000d03be USER32!UserCallWinProcCheckWow+0x14b (FPO: [Non-Fpo])
002efb24 76653c8a 67af6443 00000001 002efb44 USER32!DispatchMessageWorker+0x322 (FPO: [Non-Fpo])
002efb34 6945d8c7 002efb5c 002efb5c 002efb7c USER32!DispatchMessageA+0xf (FPO: [Non-Fpo])
002efb44 2fa62ec2 002efb5c 00000000 3041aa00 mso!Ordinal4661+0x399
002efb7c 2fa62dab 00000000 00000000 00000000 OUTLOOK!GetCentralObject+0x16ed4
002efba4 2f9a51cd 2f9a0000 00000000 003a23ac OUTLOOK!GetCentralObject+0x16dbd</pre>
<p>As the output above shows, Outlook calls into the Scanotlk module and then waits for something. Astute readers might notice that the thread being blocked on the wait is actually the main thread of execution, i.e. the UI thread. Aha! That explained why the UI was hanging.</p>
<p>The Scanotlk module turns out to be part of McAfee VirusScan Enterprise:</p>
<pre>0:033&gt; lm mScanotlk
start    end        module name
11900000 11942000   Scanotlk   (export symbols)       C:\Program Files\McAfee\VirusScan Enterprise\Scanotlk.dll</pre>
<p>When I disabled the On-Delivery E-mail Scanner and relaunched Outlook, the application stopped hanging on the receipt of e-mail. Problem solved.</p>
<p>This story has an important moral to it, though. I mentioned in the beginning that we software users should really be more careful who we blame when something doesn&#8217;t work. Often, it&#8217;s not what&#8217;s actually right in front of us. In this case, I instinctively blamed Outlook. Who would&#8217;ve known that the hang was actually caused by a third-party add-in?</p>
<p><span style="text-decoration: underline;"><strong>Resources:</strong></span></p>
<ul>
<li>ExchEntryPoint: <a href="http://support.microsoft.com/kb/285999">http://support.microsoft.com/kb/285999</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://inside.echobit.net/dreijer/archives/2008/02/11/be-careful-who-you-blame/feed/</wfw:commentRss>
		</item>
		<item>
		<title>I expect you to log my data, sir</title>
		<link>http://inside.echobit.net/dreijer/archives/2008/01/22/i-expect-you-to-log-my-data-sir/</link>
		<comments>http://inside.echobit.net/dreijer/archives/2008/01/22/i-expect-you-to-log-my-data-sir/#comments</comments>
		<pubDate>Tue, 22 Jan 2008 09:42:51 +0000</pubDate>
		<dc:creator>dreijer</dc:creator>
		
		<category><![CDATA[Rants]]></category>

		<guid isPermaLink="false">http://inside.echobit.net/archives/2008/01/22/i-expect-you-to-log-my-data-sir/</guid>
		<description><![CDATA[I&#8217;ve been struggling with this one for a while now, and after I lost yet another chat log I&#8217;ve finally had enough.
When I first started instant messaging way back in the days, I used Windows Live Messenger (or MSN Messenger as it used to be called). Even though Microsoft has added quite a few features [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been struggling with this one for a while now, and after I lost yet another chat log I&#8217;ve finally had enough.</p>
<p>When I first started instant messaging way back in the day, I used Windows Live Messenger (or MSN Messenger as it used to be called). Even though Microsoft has added quite a few features that I find irrelevant and completely useless, and file transfers never seem to go any faster than 1 kB/s, I&#8217;ve stuck with it for the sake of compatibility with the majority of my IM friends, who also happen to use Live Messenger.</p>
<p>However, over the past few months I&#8217;ve noticed a disturbing trend in how Live Messenger stores my conversations: a conversation is only stored to disk <strong>once the chat window has been closed</strong>. That is, Live Messenger doesn&#8217;t log the conversation continuously like any other application would, but instead relies on the user to indicate that he&#8217;s finished.<span id="more-16"></span></p>
<p>At first, this might not seem so bad. At least your conversations are stored to disk at some point, right? Sadly, no. If you shut down Windows and there are open conversation windows, these are for some reason not closed normally and thus the conversations aren&#8217;t stored to disk. Imagine having had a conversation with a coworker all day and deciding that you should reboot your computer because Windows Update has installed new patches. Once the system comes back up, the entire conversation in Live Messenger is nowhere to be seen&#8230;</p>
<p>There is also the scenario of Live Messenger crashing, of course, which has been known to happen occasionally. In that case, your conversations aren&#8217;t saved either.</p>
<p>In my opinion, logging is something that should happen transparently to the user. He shouldn&#8217;t have to think about whether his current conversation has been saved or not. For instance, after having an hour long chat conversation in which several important things have been said, I shouldn&#8217;t have to manually close the conversation window to make sure it&#8217;s been stored to disk.</p>
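<p>What I have in mind is nothing fancy. Here is a minimal sketch of a log that writes every message the moment it arrives; the <strong>ConversationLog</strong> class, its method, and the line format are made up for illustration and say nothing about how Live Messenger is actually implemented.</p>

```cpp
#include <fstream>
#include <string>

// Hypothetical logger, for illustration only; all names are assumptions.
class ConversationLog
{
public:
    explicit ConversationLog(const std::string& path)
        : m_file(path, std::ios::app) // append, so earlier sessions are kept
    {
    }

    // Write one message and flush right away, so closing the window is not
    // what saves the conversation. Flushing hands the data to the OS;
    // surviving a power loss would additionally need an OS-level sync.
    bool Append(const std::string& sender, const std::string& text)
    {
        m_file << sender << ": " << text << '\n';
        m_file.flush();
        return m_file.good();
    }

private:
    std::ofstream m_file;
};
```

<p>With something like this, an application crash or an abrupt exit at shutdown should cost at most the message being written at that moment rather than the whole conversation.</p>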
<p>What&#8217;s the point in having logs if you can&#8217;t rely on them being accurate? If an application tells me that it saves my conversations, it really should do just that.</p>
<p>My friends all tell me to switch to another IM application, and maybe I will. I just find it very sad that an application like Live Messenger, which is so widely used, can&#8217;t get the simple features right before adding tons of other ones. I don&#8217;t care if I can insert cool, animated smileys or make the window shake from side to side if I can&#8217;t expect the application to actually do what it was originally built to do (or at least was supposed to do): simple and reliable chatting.</p>
]]></content:encoded>
			<wfw:commentRss>http://inside.echobit.net/dreijer/archives/2008/01/22/i-expect-you-to-log-my-data-sir/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
