Tuning the garbage collector to the specific context of a particular application can significantly improve the performance of both non-threaded and multi-threaded applications. In this post I discuss the gcConcurrent and gcServer settings, which allow you to exercise some control over how the Garbage Collector operates.

Articles in This Series

Part 1 – Basic Housekeeping

Part 2 – Improving Performance Through Stack Allocation

Part 3 – Increasing the Size of your Stack

Part 4 – Choosing the Right Garbage Collector Settings

 

Concurrent Garbage Collection, In My Program?

By default the CLR Garbage Collector operates concurrently. This means it runs in a separate thread and only periodically blocks the application threads. This is the best behavior for most Windows applications as it allows the interface to keep updating while the Garbage Collector operates.

 

However, you may not want the overhead of all this context switching. If your application is console-based or does not need to keep updating an interface, you can see significant performance gains by not having your application’s Garbage Collector in a separate thread.

 

So, How Do I Make My Garbage Collector Not Concurrent?

In your application’s app.config file set the gcConcurrent tag to false:

<configuration>
   <runtime>
       <gcConcurrent enabled="false"/>
   </runtime>
</configuration>

 

You should also know:

-          The default value is true, so explicitly setting it to true will have no effect.

-          In ASP.NET applications you do not need to change this setting.

-          On machines with only one processor this has no effect.

 

What If I Have Lots of Threads?

On the other side of the coin, you could have an application with many threads on a computer with many cores. In this case your performance may be limited by the single Garbage Collector thread’s ability to allocate or deallocate fast enough. Your threads would be waiting on your Garbage Collector.

 

The solution to this problem is to have many Garbage Collectors, each with their own thread. This would remove the limitation in exchange for a small amount of overhead. In .NET, this is called Server Mode.

 

When the Garbage Collector is in Server Mode there is one Garbage Collector thread per core and one set of heaps per garbage collector thread. The application threads allocate from them in a round-robin fashion.

 

How Do I Enable Server Mode?

In your application’s app.config file set the gcServer tag to true:

<configuration>
   <runtime>
      <gcServer enabled="true"/>
   </runtime>
</configuration>

 

You should also know:

-          The default value is false, so explicitly setting it to false will have no effect.

-          This setting should be fastest on computers with more than two cores.

-          You can use the GCSettings.IsServerGC property to check if this type of garbage collection is enabled.
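To confirm which settings actually took effect at run time, a small check along these lines can help. This is only a minimal sketch: the GcReport class and its Describe method are hypothetical names, while GCSettings.IsServerGC and GCSettings.LatencyMode come from the System.Runtime namespace. As far as I know, a latency mode of Batch indicates that concurrent collection is disabled, and Interactive indicates that it is enabled.

```csharp
using System;
using System.Runtime;

static class GcReport
{
    // Hypothetical helper: summarizes the GC configuration the runtime is using.
    public static string Describe()
    {
        return "Server GC: " + GCSettings.IsServerGC +
               ", latency mode: " + GCSettings.LatencyMode;
    }
}
```

Calling Console.WriteLine(GcReport.Describe()) at startup is an easy way to verify that the app.config file was picked up.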

 

 

References:

There seems to be little information floating around about what exactly these settings do. I obtained most of my information from the MSDN library and this dotnetmonster.com thread.

 

I have a new CodeProject article up which details how to make a Debugger Visualizer in the case where you need to custom serialize the object. The actual classes I build in the tutorial are only useful with our DotImage product line. However, the process of creating a Custom Serializer should be useful to any .NET developer.

 

Debugger Visualizer

 

In our case, we automatically serialize AtalaImages to PNG format by default. It turns out that PNG encoding and decoding is slow enough that visualizing a medium-sized image would cause Visual Studio to time out. I didn't have the option of changing our default serialization process as it would have a huge impact on our customers, some of whom have databases full of already serialized AtalaImages.

 

What I ended up doing was making a Custom Serializer which serializes the AtalaImage into a bitmap instead of a PNG (a process which is pretty much a row by row copy). This sped things up enough so that now just about any image works.

I’ve been doing a lot of work in the PDF space lately. While implementing Binary Cross Reference Streams I was surprised to see that they could be encoded with PNG Predictors. This was surprising to me because binary cross reference streams aren’t images, they are byte tables:

 

Data

 

While the values vary, they are often within the same ranges. The first byte can only contain the numbers 0, 1 and 2 and is often the same from row to row, the middle two bytes are often the same and the last byte generally increases by one each time. Knowing this about the data, you can develop a lossless algorithm which normalizes the data and so makes PKZIP/GZIP/FLATE compression work much better.

 

For the cross reference stream in PDF documents the most common predictor algorithm is the mind-blowingly simple UP filter:

 

Up(x) = Raw(x) – Prior(x)

 

This means each byte is simply its own value minus the value of the byte above it. A complete list of different PNG Predictors is available in the spec. Let’s take a look at what this very simple algorithm does to a small sample table:

 

02 0002 00

02 0002 01

02 0002 02

02 0002 03

The last column increases by one in each row as it is an index. Applying the Up filter to this table gives:

 

02 0002 00

00 0000 01

00 0000 01

00 0000 01

 

This example may look contrived but it’s actually right out of the PDF 1.7 Spec. By subtracting values like this you can decrease the vocabulary the ZIP encoder has to know and so significantly reduce the encoded stream size.
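The filter is small enough to sketch in a few lines. The UpPredictor class below is a hypothetical illustration, not code from any PDF or PNG library. It assumes fixed-width rows, treats the row above the first row as all zeros, and leaves out the per-row filter-type byte that real PNG-predicted streams carry in front of each row.

```csharp
using System;

static class UpPredictor
{
    // Up(x) = Raw(x) - Prior(x), computed modulo 256 for every byte.
    public static byte[] Encode(byte[] raw, int rowLength)
    {
        byte[] filtered = new byte[raw.Length];
        for (int i = 0; i < raw.Length; i++)
        {
            // The byte directly above; zero for the first row.
            byte prior = i >= rowLength ? raw[i - rowLength] : (byte)0;
            filtered[i] = unchecked((byte)(raw[i] - prior));
        }
        return filtered;
    }

    // Reverses the filter: Raw(x) = Up(x) + Prior(x), modulo 256.
    public static byte[] Decode(byte[] filtered, int rowLength)
    {
        byte[] raw = new byte[filtered.Length];
        for (int i = 0; i < filtered.Length; i++)
        {
            // Prior comes from the already reconstructed row above.
            byte prior = i >= rowLength ? raw[i - rowLength] : (byte)0;
            raw[i] = unchecked((byte)(filtered[i] + prior));
        }
        return raw;
    }
}
```

Feeding the four-row sample table above through Encode with a row length of 4 reproduces the filtered table, and Decode restores the original bytes.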

 

A particularly great example comes right from the libpng docs:

If you make a 24-bit 4096 x 4096 RGB image which contains one pixel of each color it is 48 MB as raw data, 36 MB with normal GZIP compression and an insane 59,852 bytes with PNG Predictor Filtering.

 

The real moral of the story here is that images are just a subclass of multidimensional byte tables and the same kinds of techniques can be used on both to achieve much better rates of compression. That is, if you have a priori information about the data and the compression algorithm.

 

Synopsis

 

I gave an hour-long talk today, here at Atalasoft, on Concurrency in F#. It featured some slides and a small ant colony simulation to demonstrate different kinds of threading. Overall, I liked developing in F# quite a bit; however, puzzling through the interpreter errors was a brutal process indeed.

 

You can grab my slides here and my ant colony simulation here.

 

The Long Version

 

It all started two weeks ago…

 

…I had been reading about F# on various blogs for months and had done little more than dabble with it. At the time there was an open slot at my office to give a lunchtime talk. So, in order to spur myself to actually get some real experience I committed to giving a talk about F#.

 

Thinking back to Rich Hickey’s Clojure talk in Northampton (which I was very impressed with) I decided to write a simple ant colony simulation. An ant colony is a great environment to test out different threading techniques because it involves a large number of very small tasks. Incidentally, it’s also pretty for an audience to look at.

 

I built my simulation with threading abstracted as much as possible. It was surprising how well this worked out in F#; it was particularly easy due to the functional nature of the language. I kept the behavioral code, forms code and in-between code separate while using wrapper functions for locking, looping and other small threading things.

 

Originally, I had planned to build 5 different styles of threading into my simulation for the sake of testing:

1) Single Thread via Array2.iter

2) Asynchronous Execution via Async.Parallel

3) Massive Shared-Memory Threading via .NET’s lock and Thread

4) Message Passing via Mailbox

5) Software Transactional Memory via Greg Neverov’s Library

 

In the end, I struggled with the new syntax much more than I had anticipated and so only managed to get Single Thread, Async and Shared-Memory done for my presentation. If you are interested, you can grab the simulation here and the slides here.

 

This is how it ended up looking:

 

Ant Colony Sim

 

 

I plan on finishing up my sim by adding the other two threading styles (Mailboxes and STM) and making it convenient to switch between them with form buttons. This should make it easy to profile and see how the different setups compare performance-wise.


After that I think the next step is to post sections of it on the fshub forums and let the experts tear it apart. Really, that’s the only way to learn anything truly useful.

 

 

Attachment Disclaimer:

The Ant Colony Simulation attached to this post is provided as-is with no warranties or guarantees of any kind. I claim no responsibility for any effect it may have on you or your computer. I would like to hear from you if you do anything cool with it though.

 

edit: It looks like the image and download links were broken all weekend. They should work now.

In the previous article I discussed a few of the benefits of stack allocation as well as a couple of C# keywords which help you to leverage those benefits. However, the one megabyte default stack size is too small for stack allocation to be used with a large dataset. Alternatively, in some threading situations one megabyte per thread/fiber can be too large and bottleneck your system. In this article I will discuss the different ways you can modify the stack size.

 

Articles in This Series

Part 1 – Basic Housekeeping

Part 2 – Improving Performance Through Stack Allocation

Part 3 – Increasing the Size of your Stack

Part 4 – Choosing the Right Garbage Collector Settings

Why Not To Increase Your Stack Size

There are many cases in which it is best not to increase your stack size. In fact, right inside the Microsoft documentation for the Thread Constructor it states:

 

If a thread has memory problems, the most likely cause is programming error, such as infinite recursion.

 

And the Thread Stack Size operating system documentation gives this advice:

It is best to choose as small a stack size as possible and commit the stack that is needed for the thread or fiber to run reliably. Every page that is reserved for the stack cannot be used for any other purpose.

This is all generally good advice to follow. However, there are some cases in which it may be appropriate or even necessary to change your stack size.

 

Why You Might Want To Modify Your Stack Size

 

Scenario 1: Many Threads

If you are in a situation where you need to create a great number of threads (or fibers), each will require its own stack. In this case each of those threads having a large stack size can eat up a ton of memory. By decreasing your stack size it is possible to accommodate a much larger number of threads.

 

Scenario 2: Optimization

You may want to utilize the convenience and speed of stack allocation. Some might see this as poor design but many an ugly hack has been made in the name of performance.

 

Other Scenarios

Obviously, there are other scenarios where stack size modification could be helpful. I would love to hear about your personal experience with it.

 

Stack Size Modification Techniques

In C++ you can simply specify the linker’s /stack option but in C# you have to jump through a few hoops in order to change stack size.

 

The Easiest Way ( .NET 2.0 )

In .NET 2.0 and newer you can simply specify the stack size in a thread’s constructor. Unfortunately, this method is only compatible with Windows XP and newer operating systems. You can specify this parameter on older platforms but it will have no effect; the stack size in the binary header will be used.

 

    using System.Threading;

   

    Thread T = new Thread(threadDelegate, stackSizeInBytes);

    T.Start();

 

Pros:

-Very Easy

-Can Dynamically Specify Thread Size at Creation Time

 

Cons:

-Only Available in .NET 2.0 and Above

-Stack Size Parameter Ignored in Pre-XP Operating Systems
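As a slightly fuller sketch of the constructor approach, the following runs a deliberately recursive method on a thread with a caller-chosen stack size and waits for the result. BigStackDemo, Depth and RunWithStack are hypothetical names made up for illustration; the only framework call involved is the Thread(ThreadStart, int) constructor described above.

```csharp
using System;
using System.Threading;

static class BigStackDemo
{
    // Hypothetical workload: non-tail recursion that consumes one stack frame per level.
    static int Depth(int n)
    {
        return n == 0 ? 0 : 1 + Depth(n - 1);
    }

    // Runs Depth(depth) on a new thread whose stack is stackSizeInBytes large.
    public static int RunWithStack(int stackSizeInBytes, int depth)
    {
        int result = 0;
        Thread t = new Thread(() => result = Depth(depth), stackSizeInBytes);
        t.Start();
        t.Join();   // wait so that result is assigned before it is read
        return result;
    }
}
```

For example, BigStackDemo.RunWithStack(16 * 1024 * 1024, 50000) asks for a 16 MB stack. The numbers are arbitrary; how deep you can actually recurse depends on the frame size the JIT produces.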

 

The Old Way ( .NET 1.x )

In .NET 1.x the only way to programmatically specify the stack size is to P/Invoke into kernel32.dll and call CreateThread. This method also has the advantage of being extremely backwards compatible. It’s not pretty, but it gets the job done.

 

using System.Runtime.InteropServices;

unsafe class Kernel32Thread

[DllImport("kernel32.dll")]

static extern IntPtr CreateThread(...

hThread = CreateThread( IntPtr.Zero, stackSizeInBytes, threadDelegate, pArguments, 0, out threadId );

WaitForSingleObject( hThread, timeout );

CloseHandle( hThread );

 

This is only a general overview of what is necessary. The complete code needed is fairly large, so I have attached it as a separate file.

 

The MSDN documentation specifies that this will be backwards compatible to Windows 2000. However, kernel32.dll supported specifying the stack size all the way back to Win95 and NT 3.1.

 

Pros:

-Backwards Compatible to Windows 95

-Can Dynamically Specify Thread Size at Creation Time

-.NET 1.x Support

 

Cons:

-Unsafe

-External Calls to kernel32.dll

-Difficult

 

Links:

If you are interested, you can learn about creating a thread in another process in an article on Mike Stall’s blog.

Maxim Alekseyken has a Code Project article which describes running a thread directly from inline byte code.

 

The Static Way ( External Utility )

The last option is to use an external utility to modify the binary executable’s header. Visual Studio comes with a tool for this task and it is very simple to use:

 

            EDITBIN.EXE /STACK:reserve[,commit] <files>

 

Here reserve is the maximum amount of memory to allocate for the stack. The meaning of the commit value depends on your operating system:

           

The optional commit argument is subject to interpretation by the operating system. In Windows NT, Windows 95, and Windows 98, commit specifies the amount of physical memory to allocate at a time.

 

An example use would be:

 

            EDITBIN.EXE /STACK:131072 file.exe

 

In my opinion, it’s best to do stack size changes in the code if at all possible. Using a command line utility, even if it’s in the post build event, is not always obvious and could be easily overlooked. Whichever method you use, keep in mind that a StackOverflowException will be thrown if you try to use more memory than is available in your stack.

 

Pros:

-Very Easy

-Backwards Compatible to Windows 95

 

Cons:

-No Dynamically Sized Stacks

-Not Part of the Code and So Easy To Forget About

 

Misc Extra Info on Stack Size Modification

From The Thread Stack Size section of the MSDN Win32 Development Documentation:

 

·         The default size for the reserved and initially committed stack memory is specified in the executable file header.

·         Thread or fiber creation fails if there is not enough memory to reserve or commit the number of bytes requested.

·         The operating system rounds up the specified size to the nearest multiple of the system's allocation granularity (typically 64 KB).

 

 

Articles in This Series

Part 1 – Basic Housekeeping

Part 2 – Improving Performance Through Stack Allocation

Part 3 – Increasing the Size of your Stack

Part 4 – Choosing the Right Garbage Collector Settings

Introduction

In C#, when you create managed objects or arrays of value types, they are created on the Heap and you are passed back a reference to the memory in which that allocated object lives. This is normally a very good thing because it allows you to safely do what you need with it and have it be magically garbage collected when there are no longer any strong references. However, this process incurs a lot of overhead both at allocation time and during garbage collection.

 

The alternative to this is stack allocation. With a few exceptions in which the compiler is trying to do some fancy tricks or save you from yourself, local references (pointers), structs and other value types are kept on the stack. This makes sense because if all of these tiny objects had to be heap allocated and then garbage collected it would incur a lot of extra overhead. If you are willing to use unsafe code you can leverage the stack to greatly enhance the performance of some types of applications.

 

When is it done for me?

Local variables of simple value types are normally allocated on the stack:

 

int num = 10;

 

When a struct is a local variable, all of its unmanaged members will be kept on the stack with it.

 

struct TestStruct1

{

      public int i;

      public byte y;

      public double z;

}

TestStruct1 ts1 = new TestStruct1();

 

The following example is a managed struct. The struct and integer i will normally be kept on the stack while k will be a reference to an array allocated on the heap.

 

struct TestStruct2

{

      public int i;

      public int[] k;

}

TestStruct2 ts2 = new TestStruct2();

ts2.k = new int[1024];       

 

The compiler will also sometimes decide to put things on the stack on its own. I did an experiment with TestStruct2 in which I allocated it in both an unsafe and a normal context. In the unsafe context the array was put on the heap, but in the normal context when I looked into memory the array had actually been allocated on the stack.

 

Unsafe Code

In order to have control over stack allocation you need to execute your code in contexts marked as unsafe. You can do this by using the unsafe keyword at the class, method or code block level:

 

unsafe class Class1

{

}

 

static unsafe void Main(string[] args)

{

}

 

unsafe

{

}

 

You will also need to make sure the /unsafe compiler flag is enabled.

 

It is important to understand that unsafe code can open the door to security and stability problems. Before writing unsafe code for use in a production environment, make sure to read both the Security section of the C# developer’s guide and the section on Unsafe Code.

 

It is also important to note that libraries or applications with unsafe code can only be used in Full Trust environments. This is particularly an issue if you are writing a web application which will be deployed in an IIS context which you do not have control over.

 

The stackalloc keyword

C# has the stackalloc keyword which lets you force unmanaged type arrays to be allocated on the stack inside unsafe contexts:

 

int* ptr = stackalloc int[1024];

 

This will allocate a contiguous array of 1024 integers on the stack. Similarly, you can allocate an array of unmanaged structs on the stack as well:

 

TestStruct1* ptr2 = stackalloc TestStruct1[10];

 

This type of allocation is really fast, the garbage collector is never invoked and traversing the array has greatly reduced overhead. This can be a huge win for performance-minded applications. There is one big disadvantage that should be kept in mind, however: because things allocated on the stack will go away when they go out of scope, they must be copied if you want to use their contents outside of the current method.
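The copy-out requirement is easiest to see in a small example. The sketch below uses hypothetical names (StackAllocDemo, Squares) and must be compiled with the /unsafe flag; it fills a stackalloc buffer and then copies the values into an ordinary heap array so they survive after the method returns.

```csharp
using System;

static class StackAllocDemo
{
    // Builds the first `count` squares in a stack buffer, then copies them
    // to a heap array so the caller can keep them after this method returns.
    public static unsafe int[] Squares(int count)
    {
        if (count < 0)
            throw new ArgumentOutOfRangeException("count");

        int* buffer = stackalloc int[count];   // lives only until this method exits
        for (int i = 0; i < count; i++)
            buffer[i] = i * i;

        int[] result = new int[count];         // heap copy that outlives the stack frame
        for (int i = 0; i < count; i++)
            result[i] = buffer[i];
        return result;
    }
}
```

Returning the pointer itself instead of the copy would hand the caller memory that is no longer valid. Keep count small here, since the buffer comes out of the thread's limited stack.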

 

The only case in which memory gets any overrun checking in unsafe code is with stackalloc'd memory: using stackalloc enables the CLR's buffer overrun detection, which terminates the process if an overrun is detected. This checking does not come for free and so you may see even better performance when using the fixed keyword inside a stack allocated struct.

 

The fixed Keyword

.NET 2.0 and later have a keyword called fixed for defining fixed length arrays as part of a struct in unsafe contexts. Unlike using dynamic array allocation, a fixed length array will be considered as part of the struct instead of just a reference. It also has the added bonus of being an unmanaged type and so a struct which uses it will be stack allocated by default.

 

unsafe struct TestStructFixed

{

      public int i;

      public fixed int k[1024];

}

TestStructFixed tsf = new TestStructFixed();

 

It is very important to note that arrays defined with the fixed keyword will do no bounds checking for you.
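Because nothing checks the index for you, any guard has to be written by hand. The following sketch shows one way to do that; Samples, FixedBufferDemo and SumOfFirst are hypothetical names, the buffer length of 8 is arbitrary, and the code must be compiled with the /unsafe flag.

```csharp
using System;

unsafe struct Samples
{
    public int Count;
    public fixed int Values[8];   // stored inline as part of the struct
}

static class FixedBufferDemo
{
    const int Capacity = 8;       // must match the fixed buffer length above

    // Fills the first `count` slots with 1..count and returns their sum.
    public static unsafe int SumOfFirst(int count)
    {
        // Manual bounds check: the runtime will not do this for a fixed buffer.
        if (count < 0 || count > Capacity)
            throw new ArgumentOutOfRangeException("count");

        Samples s = new Samples();   // local struct, so the buffer is on the stack
        s.Count = count;
        for (int i = 0; i < s.Count; i++)
            s.Values[i] = i + 1;

        int sum = 0;
        for (int i = 0; i < s.Count; i++)
            sum += s.Values[i];
        return sum;
    }
}
```

Without the guard at the top, a count of 9 would silently write past the end of the buffer into whatever sits next to it on the stack.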

 

Overview of Stack Allocation

 

Pros:

-Fast Allocation

-Fast Array Traversal

-No Need for Garbage Collection

 

Cons:

-Only Basic Types and Pointers Can Be Stack Allocated

-Contents Must be Copied to Be Used Out of the Current Scope

-Stack Memory is, By Default, Very Limited in Size ( One Megabyte )

-Unsafe Code Opens the Potential for Security and Stability Problems

 

stackalloc Keyword:

-Used For Dynamic Array Allocation

-Can Allocate Arrays of Any Unmanaged Type

-Bounds Checked

 

fixed Keyword:

-Used For Static Sized Arrays Inside Structs

-Only Available in .NET 2.0 and Newer

-Not Bounds Checked

Additional Information

In this post I did not fully explore all of the differences between the stack and the heap. To explore these differences further, C# Corner has a fairly comprehensive set of articles on the topic.

 

 

I went to my first Code Camp this weekend. I was a bit wary at first because it was hosted by Microsoft and I hate Corporate Kool-Aid. Thankfully, that was kept to a minimum and the focus was where it should be: on the code.

 

All of the panels I went to were worthwhile to some degree. However, three stood out to me as particularly informative or entertaining:

 

Advanced Techniques for Everyday Development by Edwin Ames

While this talk was all business and no glitter it was by far the most informative presentation I went to at Code Camp. Edwin focused on NMock2, which I previously hadn’t heard of. I had read about mock objects but actually building them always seemed more pain and time than they were worth. With NMock2 it is possible to quickly create mock objects for testing in .NET.  It also allows the user to easily build in assertions and expected output. I’m going to start using NMock2 for my own testing right away.

 

Extending Powershell by Lou Franco

Atalasoft’s own Lou Franco gave a really entertaining introductory talk about Powershell. Powershell is the scripting language Microsoft should have had built into its operating systems ages ago. With it you get all of the power of Linux shell scripting along with full .NET support. Powershell is not only useful for programmers who want to throw something together fast. It’s also a great tool for Sysadmins and IT workers as it also supports access to the guts of many of Microsoft’s server and operating system products.

 

An Introduction to Game Development with XNA by Chris Bowen

I was happy to see that Chris didn’t dwell on the obviously-premade-by-Microsoft-Marketing XNA slides and got right into the meat of things. Using XNA, Chris built a pong clone complete with graphics, sound and simple AI, in about an hour. Everyone from Atalasoft who attended was blown away. Granted, all of the code came from prewritten snippets, but it was clear to everyone who attended just how fast and simple game development with XNA is. He also mentioned a Boston XNA users group; if I lived in Boston I would definitely check it out.

 

Room for Improvement

The worst thing about Code Camp was the poor accommodations. There was no Wifi for non-presenters which really shows poor hospitality on the part of Microsoft. It’s 2008, there are whole towns with free Wifi, is some internet really too much to ask for? The catering was a fiasco, with the beverages showing up way after the food and there being no dressing for the salad. The coffee disappeared after about noon and so I almost passed out in a couple of the afternoon talks. The food for day two was exactly the same (Plain Cheese Pizza) and so we decided to bail on the second half of the day and go get some real food.  It wouldn’t have been so bad if enough time had been given for lunch so that we could go get something and stock up on caffeine. Also, the content of the panels was great, but it would have been nice to see more Advanced-rated panels.

 

In Summary,

- Overall, I found my first Code Camp to be a very positive experience.

- There were some great panels and I learned a lot.

- It was full of friendly and like-minded developers.

However…

- Two days without Wifi is a lot to ask.

- A longer Mid-day break would have helped a lot.

- I would like to see more Advanced panels.

 

 

This is the first in a series of posts I will be writing about managing memory in .NET. Before I move on to more complex techniques, I thought it would be good to cover the basics.

Articles in This Series

Part 1 – Basic Housekeeping

Part 2 – Improving Performance Through Stack Allocation

Part 3 – Increasing the Size of the Stack

Part 4 – Garbage Collector Settings  

Disposing IDisposable Objects

When an object implements IDisposable you can explicitly determine when its resources are released. In fact, IDisposable objects sometimes will not release unmanaged resources unless they are explicitly told to and so can leak memory if not handled properly. If at all possible it is best to use IDisposable objects inside a using statement:

 

using (AtalaImage img = new AtalaImage(@"C:\File.png"))

{

    ...

}

 

A using statement will ensure that the object is cleaned up even if an exception is thrown or a method is terminated unexpectedly.  If the object needs to be kept around for an indeterminate amount of time be sure to call the Dispose() method when you are done with it.

 

class MyClass

{

    AtalaImage img;

 

    public void NewImage(string filename)

    {

        if (img != null)

            img.Dispose();

        img = new AtalaImage(filename);

    }

    public void Clean()

    {

        if (img != null)

            img.Dispose();

    }

}

 

Please note that this is a toy example. In most cases you should implement the IDisposable pattern yourself when keeping references to IDisposable objects. See Steve's article IDisposable Made E-Z for more information.
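As a rough idea of what that looks like, here is a minimal sketch of a class that owns a disposable object and implements IDisposable itself. StreamHolder is a hypothetical name, and a plain Stream stands in for an image type so the example does not depend on any particular library. It assumes the class is sealed and holds only managed resources, so no finalizer is included.

```csharp
using System;
using System.IO;

sealed class StreamHolder : IDisposable
{
    private Stream stream;   // stand-in for an owned IDisposable such as an image
    private bool disposed;

    public bool IsDisposed { get { return disposed; } }

    // Takes ownership of `next`, disposing whatever was held before.
    public void Replace(Stream next)
    {
        if (disposed)
            throw new ObjectDisposedException("StreamHolder");
        if (stream != null)
            stream.Dispose();
        stream = next;
    }

    // Safe to call more than once; releases the owned object exactly once.
    public void Dispose()
    {
        if (disposed)
            return;
        if (stream != null)
            stream.Dispose();
        stream = null;
        disposed = true;
    }
}
```

With this in place the holder itself can go inside a using statement, and whatever it owns is released along with it.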

 

Forcing Garbage Collection

Sometimes when a lot of memory is needed it can help a lot to stop