<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments for Preshing on Programming</title>
	<atom:link href="http://preshing.com/comments/feed" rel="self" type="application/rss+xml" />
	<link>http://preshing.com</link>
	<description></description>
	<lastBuildDate>Thu, 17 May 2012 10:56:11 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
	<item>
		<title>Comment on Finding Bottlenecks by Random Breaking by Jeff Preshing</title>
		<link>http://preshing.com/20110723/finding-bottlenecks-by-random-breaking#comment-11293</link>
		<dc:creator>Jeff Preshing</dc:creator>
		<pubDate>Thu, 17 May 2012 10:56:11 +0000</pubDate>
		<guid isPermaLink="false">http://preshing.com/?p=1156#comment-11293</guid>
		<description>xperf rocks, and random breaking is by no means a substitute. However, it can be a forehead-slap moment when you realize it’s possible to identify certain (obvious) bottlenecks this way &lt;em&gt;at all&lt;/em&gt;. At least, for me it was.

One thing I didn’t mention in the article is that when breaking, for some reason, there’s no guarantee you’ll end up looking at the guilty thread. A couple of times, I’ve had to check the busiest thread IDs using &lt;a href=&quot;http://technet.microsoft.com/en-us/sysinternals/bb896653&quot; rel=&quot;nofollow&quot;&gt;Process Explorer&lt;/a&gt;, then switch to those threads to view their stacks after breaking.</description>
		<content:encoded><![CDATA[<p>xperf rocks, and random breaking is by no means a substitute. However, it can be a forehead-slap moment when you realize it’s possible to identify certain (obvious) bottlenecks this way <em>at all</em>. At least, for me it was.</p>
<p>One thing I didn’t mention in the article is that when breaking, for some reason, there’s no guarantee you’ll end up looking at the guilty thread. A couple of times, I’ve had to check the busiest thread IDs using <a href="http://technet.microsoft.com/en-us/sysinternals/bb896653" rel="nofollow">Process Explorer</a>, then switch to those threads to view their stacks after breaking.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Finding Bottlenecks by Random Breaking by Bruce Dawson</title>
		<link>http://preshing.com/20110723/finding-bottlenecks-by-random-breaking#comment-11281</link>
		<dc:creator>Bruce Dawson</dc:creator>
		<pubDate>Thu, 17 May 2012 06:20:53 +0000</pubDate>
		<guid isPermaLink="false">http://preshing.com/?p=1156#comment-11281</guid>
		<description>I&#039;ve used this technique before, and it certainly can work, but I&#039;m not sure it&#039;s a great idea these days, except for the most highly focused of performance problems. Given that you can get a high-frequency high-quality CPU sampling profiler for free, what is the advantage to random breaking?

Xperf: up to 8 kHz sampling rate
Random breaking: up to 0.5 Hz sampling rate

Granted, the learning curve for xperf is significantly higher, but for any non-trivial work it is well worth it.

http://randomascii.wordpress.com/2011/09/03/xperf-for-excess-cpu-consumption/</description>
		<content:encoded><![CDATA[<p>I&#8217;ve used this technique before, and it certainly can work, but I&#8217;m not sure it&#8217;s a great idea these days, except for the most highly focused of performance problems. Given that you can get a high-frequency high-quality CPU sampling profiler for free, what is the advantage to random breaking?</p>
<p>Xperf: up to 8 kHz sampling rate<br />
Random breaking: up to 0.5 Hz sampling rate</p>
<p>Granted, the learning curve for xperf is significantly higher, but for any non-trivial work it is well worth it.</p>
<p><a href="http://randomascii.wordpress.com/2011/09/03/xperf-for-excess-cpu-consumption/" rel="nofollow">http://randomascii.wordpress.com/2011/09/03/xperf-for-excess-cpu-consumption/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Roll Your Own Lightweight Mutex by Bruce Dawson</title>
		<link>http://preshing.com/20120226/roll-your-own-lightweight-mutex#comment-11280</link>
		<dc:creator>Bruce Dawson</dc:creator>
		<pubDate>Thu, 17 May 2012 06:16:01 +0000</pubDate>
		<guid isPermaLink="false">http://preshing.com/?p=2749#comment-11280</guid>
		<description>Another reason to not roll your own is that the OS locks are likely to have additional features that your lock won&#039;t have. On Windows I can think of several:

1) When you create a critical section you can specify a spin count, which can give a nice balance between spinning and doing a kernel transition to wait on a semaphore.
2) Windows 7 has a flag you can set (RTL_CRITICAL_SECTION_FLAG_DYNAMIC_SPIN) which automates the process of choosing the correct spin count.
3) Critical sections in Windows are ETW-instrumented, so you can follow wait chains, which allows sophisticated profiling.

As I said in a recent post:
&quot;The ability to profile your code trumps any small performance improvement.&quot;
http://randomascii.wordpress.com/2012/05/05/xperf-wait-analysisfinding-idle-time/</description>
		<content:encoded><![CDATA[<p>Another reason to not roll your own is that the OS locks are likely to have additional features that your lock won&#8217;t have. On Windows I can think of several:</p>
<p>1) When you create a critical section you can specify a spin count, which can give a nice balance between spinning and doing a kernel transition to wait on a semaphore.<br />
2) Windows 7 has a flag you can set (RTL_CRITICAL_SECTION_FLAG_DYNAMIC_SPIN) which automates the process of choosing the correct spin count.<br />
3) Critical sections in Windows are ETW-instrumented, so you can follow wait chains, which allows sophisticated profiling.</p>
<p>As I said in a recent post:<br />
&#8220;The ability to profile your code trumps any small performance improvement.&#8221;<br />
<a href="http://randomascii.wordpress.com/2012/05/05/xperf-wait-analysisfinding-idle-time/" rel="nofollow">http://randomascii.wordpress.com/2012/05/05/xperf-wait-analysisfinding-idle-time/</a></p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on xkcd Password Generator by Graham</title>
		<link>http://preshing.com/20110811/xkcd-password-generator#comment-11201</link>
		<dc:creator>Graham</dc:creator>
		<pubDate>Wed, 16 May 2012 06:26:29 +0000</pubDate>
		<guid isPermaLink="false">http://preshing.com/?p=1404#comment-11201</guid>
		<description>I&#039;ve found that the small American English dictionary (sudo apt-get install wamerican-small) makes a good source of words.  It has 51175 entries, but after stripping possessives and limiting to 3-8 character words, there are only 25077 entries.

Then all you do is take a random selection from those 25077 common words:

grep -v \&#039;s /usr/share/dict/american-english-small &#124; egrep &#039;^.{3,8}$&#039; &#124; shuf -n 100 &#124; fmt</description>
		<content:encoded><![CDATA[<p>I&#8217;ve found that the small American English dictionary (sudo apt-get install wamerican-small) makes a good source of words.  It has 51175 entries, but after stripping possessives and limiting to 3-8 character words, there are only 25077 entries.</p>
<p>Then all you do is take a random selection from those 25077 common words:</p>
<p>grep -v \'s /usr/share/dict/american-english-small | egrep '^.{3,8}$' | shuf -n 100 | fmt</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Memory Reordering Caught in the Act by Jeff Preshing</title>
		<link>http://preshing.com/20120515/memory-reordering-caught-in-the-act#comment-11190</link>
		<dc:creator>Jeff Preshing</dc:creator>
		<pubDate>Wed, 16 May 2012 02:22:04 +0000</pubDate>
		<guid isPermaLink="false">http://preshing.com/?p=3026#comment-11190</guid>
		<description>The main reason reordering is detected so rarely in this sample is that the semaphore timing is unpredictable and I introduced randomness to compensate. You could probably detect reordering much more frequently if you used a different synchronization method in the worker threads; for example, busy-waiting on a flag. I chose not to do this in the sample because I&#039;d then need to ensure correct memory ordering around the flag, and that would draw attention away from the place where I really wanted to demonstrate reordering.

Also, as Bruce Dawson points out below, memory reordering is not necessarily an artifact of CPU instructions being switched. Memory reordering tends to be more the result of the way CPU caches work. If you are really interested in those details, I&#039;d recommend Appendix C of &lt;a href=&quot;http://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.2011.01.02a.pdf&quot; rel=&quot;nofollow&quot;&gt;Is Parallel Programming Hard&lt;/a&gt;.</description>
		<content:encoded><![CDATA[<p>The main reason reordering is detected so rarely in this sample is that the semaphore timing is unpredictable and I introduced randomness to compensate. You could probably detect reordering much more frequently if you used a different synchronization method in the worker threads; for example, busy-waiting on a flag. I chose not to do this in the sample because I&#8217;d then need to ensure correct memory ordering around the flag, and that would draw attention away from the place where I really wanted to demonstrate reordering.</p>
<p>Also, as Bruce Dawson points out below, memory reordering is not necessarily an artifact of CPU instructions being switched. Memory reordering tends to be more the result of the way CPU caches work. If you are really interested in those details, I&#8217;d recommend Appendix C of <a href="http://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.2011.01.02a.pdf" rel="nofollow">Is Parallel Programming Hard</a>.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Memory Reordering Caught in the Act by Jeff Preshing</title>
		<link>http://preshing.com/20120515/memory-reordering-caught-in-the-act#comment-11189</link>
		<dc:creator>Jeff Preshing</dc:creator>
		<pubDate>Wed, 16 May 2012 01:42:47 +0000</pubDate>
		<guid isPermaLink="false">http://preshing.com/?p=3026#comment-11189</guid>
		<description>Ha. Windows guy alert! Fixed now.</description>
		<content:encoded><![CDATA[<p>Ha. Windows guy alert! Fixed now.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Memory Reordering Caught in the Act by Jeff Preshing</title>
		<link>http://preshing.com/20120515/memory-reordering-caught-in-the-act#comment-11187</link>
		<dc:creator>Jeff Preshing</dc:creator>
		<pubDate>Wed, 16 May 2012 00:54:18 +0000</pubDate>
		<guid isPermaLink="false">http://preshing.com/?p=3026#comment-11187</guid>
		<description>Hi Bruce. Indeed, my original wording made it sound like instruction reordering was the sole reason for memory reordering. I&#039;ve since revised that paragraph. In this post, I don&#039;t really want to say too much about the causes of memory reordering -- just the effects.</description>
		<content:encoded><![CDATA[<p>Hi Bruce. Indeed, my original wording made it sound like instruction reordering was the sole reason for memory reordering. I&#8217;ve since revised that paragraph. In this post, I don&#8217;t really want to say too much about the causes of memory reordering &#8212; just the effects.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Memory Reordering Caught in the Act by Roy</title>
		<link>http://preshing.com/20120515/memory-reordering-caught-in-the-act#comment-11183</link>
		<dc:creator>Roy</dc:creator>
		<pubDate>Wed, 16 May 2012 00:24:44 +0000</pubDate>
		<guid isPermaLink="false">http://preshing.com/?p=3026#comment-11183</guid>
		<description>On my Dual Dual-Core Xeon 5150, I get a significantly higher rate of reordering...

163090 reorders detected after 3000077 iterations - 5.4% of the time.  That&#039;s not rare.</description>
		<content:encoded><![CDATA[<p>On my Dual Dual-Core Xeon 5150, I get a significantly higher rate of reordering&#8230;</p>
<p>163090 reorders detected after 3000077 iterations &#8211; 5.4% of the time.  That&#8217;s not rare.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Memory Reordering Caught in the Act by Adam White</title>
		<link>http://preshing.com/20120515/memory-reordering-caught-in-the-act#comment-11182</link>
		<dc:creator>Adam White</dc:creator>
		<pubDate>Wed, 16 May 2012 00:19:08 +0000</pubDate>
		<guid isPermaLink="false">http://preshing.com/?p=3026#comment-11182</guid>
		<description>I have little to say about the article. I just want to compliment you on your awesome Jumpman avatar.  It really brings back thoughts of my childhood spent on a C64.</description>
		<content:encoded><![CDATA[<p>I have little to say about the article. I just want to compliment you on your awesome Jumpman avatar.  It really brings back thoughts of my childhood spent on a C64.</p>
]]></content:encoded>
	</item>
	<item>
		<title>Comment on Memory Reordering Caught in the Act by Jeff Preshing</title>
		<link>http://preshing.com/20120515/memory-reordering-caught-in-the-act#comment-11174</link>
		<dc:creator>Jeff Preshing</dc:creator>
		<pubDate>Tue, 15 May 2012 22:11:36 +0000</pubDate>
		<guid isPermaLink="false">http://preshing.com/?p=3026#comment-11174</guid>
		<description>I tried it on a Core 2 Duo E6300 in Win32. The rate of occurrence indeed increased from 1 out of 7500 to 1 out of 4700 or so.</description>
		<content:encoded><![CDATA[<p>I tried it on a Core 2 Duo E6300 in Win32. The rate of occurrence indeed increased from 1 out of 7500 to 1 out of 4700 or so.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

