Thursday, June 26, 2008 6:32 PM
A Safe and Asynchronous One to Many Stream Copy Through IL and Inheritance
Because .NET Streams have state, they are difficult to use in multithreaded environments. In this post I discuss ways to manage or work around problems arising from the statefulness of .NET Streams. I explain how this is possible both through traditional inheritance and also through some indulgence in hacking of object protection levels by emitting custom IL.
I wanted to call this post “Using IL to Break the Object Model For an Easy, Secure and Asynchronous One to Many Stream Copy in .NET” but it was too long.
The intended way to deal with Streams in a multithreaded environment is through the use of Asynchronous I/O calls. While these methods are usable in many cases, they often come with their own set of headaches. They depend on the initial position of the Stream and so require management of this position for multiple threads reading from the same stream. As stated in the MSDN Stream documentation, this behavior is predictable:
The current position in the stream is updated when the asynchronous read or write is issued, not when the I/O operation completes.
This implementation can be extremely useful for multithreaded writing to a stream. However, in the case of reading it can be a huge headache to deal with. For instance, if your consumers are all in different threads making calls at unpredictable times, the position at the time a BeginRead call is made can be impossible to guarantee without locking.
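To make the locking concrete, here is a minimal sketch of one way to do it. SharedStreamReader and BeginReadAt are hypothetical names, not part of .NET. It assumes the stream behaves as the quoted documentation describes, capturing its position when the read is issued (for example a FileStream opened for asynchronous I/O), and that all access to the stream goes through this one wrapper.

```csharp
using System;
using System.IO;

// Hypothetical helper: a single instance is shared by every consumer of the stream.
sealed class SharedStreamReader
{
    private readonly Stream _stream;
    private readonly object _gate = new object();

    public SharedStreamReader(Stream stream)
    {
        _stream = stream;
    }

    // Each consumer tracks its own offset and passes it in on every call.
    public IAsyncResult BeginReadAt(long offset, byte[] buffer,
                                    AsyncCallback callback, object state)
    {
        // Seeking and issuing the read must happen as one step; otherwise
        // another thread could move the position in between.
        lock (_gate)
        {
            _stream.Position = offset;
            return _stream.BeginRead(buffer, 0, buffer.Length, callback, state);
        }
    }

    public int EndRead(IAsyncResult result)
    {
        return _stream.EndRead(result);
    }
}
```

The lock only covers issuing the read, so this is no help for a stream that reads its position later, when the work actually runs.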
Another problem with using the Asynchronous I/O methods is the handling of exceptions. An exception thrown by the operation that BeginRead starts is held on the thread pool side and won’t be exposed until EndRead is called. This means that immediate exception cleanup or recovery is not possible.
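In practice this means the error handling ends up in the completion callback, wrapped around EndRead. A small sketch follows; the ReadCallbacks class is a hypothetical name, and it assumes the stream was passed to BeginRead as the state object.

```csharp
using System;
using System.IO;

static class ReadCallbacks
{
    // Callback handed to BeginRead. Assumes the stream itself was passed
    // as the state argument so it can be recovered from AsyncState.
    public static void OnReadCompleted(IAsyncResult result)
    {
        var stream = (Stream)result.AsyncState;
        try
        {
            // EndRead returns the byte count or rethrows the stored exception.
            int bytesRead = stream.EndRead(result);
            Console.WriteLine("Read {0} bytes", bytesRead);
        }
        catch (IOException ex)
        {
            // A failure of the asynchronous read only becomes visible here.
            Console.WriteLine("Read failed: {0}", ex.Message);
        }
    }
}
```

A call would look like stream.BeginRead(buffer, 0, buffer.Length, ReadCallbacks.OnReadCompleted, stream).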
Providing a Read Only MemoryStream to Untrusted Client Components
Consider a plugin system where you have many dynamically loaded and untrusted client assemblies in your application. You want to be able to provide the same large set of data to each of them without allowing any of them to corrupt or damage the data the others are processing.
It would be really nice if we had a way to easily build read only copies of current MemoryStream objects. We could then simply pass these read only streams on to each of the untrustworthy components.
Now, you might think that at this point we already have our solution. Don’t we simply call GetBuffer and feed that to a new MemoryStream along with false for the writable parameter? Unfortunately, if the data in your stream is not exactly the same size as its buffer, your new stream will have a bunch of extra data on the end. MemoryStream does have a constructor which lets you specify an index and count along with a buffer, but using this will end up copying your buffer.
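The extra data problem with GetBuffer is easy to reproduce. The sketch below uses hypothetical names (GetBufferPitfall, WrapBackingArray, Demo) and only shows the whole-buffer case described above.

```csharp
using System;
using System.IO;

static class GetBufferPitfall
{
    // Wraps the source's entire backing array in a read only stream.
    public static MemoryStream WrapBackingArray(MemoryStream source)
    {
        return new MemoryStream(source.GetBuffer(), false);
    }

    public static void Demo()
    {
        var source = new MemoryStream();             // expandable, so GetBuffer is allowed
        source.Write(new byte[] { 1, 2, 3 }, 0, 3);  // 3 bytes of real data

        MemoryStream view = WrapBackingArray(source);

        Console.WriteLine(source.Length);                   // 3
        Console.WriteLine(view.Length == source.Capacity);  // True
        Console.WriteLine(view.Length > source.Length);     // True: unused capacity shows up as data
        Console.WriteLine(view.CanWrite);                   // False
    }
}
```

The new stream is read only as requested, but its length is the capacity of the backing array rather than the length of the data.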
Ideally we would be able to do a member-wise clone on a MemoryStream and then set it to read only. Unfortunately this is all but impossible due to two limitations in its design. The first is that the MemberwiseClone method is protected and so we can only gain access to it by making our own subclass of MemoryStream. The second is that even though writability is controlled by a single private boolean, it can only be set inside MemoryStream’s constructor.
This leaves us with three options. The first is to do a member-wise clone the hard way and figure out a way to simultaneously change the value of a private variable. The second is to create a new object which inherits from MemoryStream and add the functionality we need to it. The third is to create an entirely new, more robust, stream object which is tailored specifically to the task at hand.
In his article “C# Object Clone Wars” Timm Martin discusses six different ways to clone a .NET object. Most of the techniques discussed will only work if you have control of the object or require copying of the data. However, the fifth entry in the article links to another blog by a man only known as Whizzo.
Whizzo’s article is entitled “Object Cloning Using IL in C#”. In this blog he provides an example for emitting IL directly in order to do a member-wise clone of any object. I found myself thoroughly impressed by this technique. By manipulating the IL you can bend the access restrictions imposed by the C# language. Whizzo’s emitted IL code was able to make a perfect copy of my MemoryStream.
With a little help from my trusty .NET Reflector I was able to figure out the name of the internal variable which controlled whether a MemoryStream was read only. A simple check later I was sneaking my new variable value in through the back door:
if (field.Name == "_writable")
//bools are really just int32s
It was exactly as I hoped. My clones were perfect copies except for being read only. This example code was just a nasty hack to see what could be done. A better approach might be to use a delegate method to be able to extend this further and differently for different objects.
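The check shown earlier sits inside a loop over the type's fields. For context, here is a rough, self-contained sketch of the whole technique. It is not Whizzo's code or the attached test code; the ReadOnlyCloner and MakeReadOnlyCopy names are hypothetical, and it relies on the private field being called "_writable", an implementation detail that could change between framework versions.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Reflection.Emit;

static class ReadOnlyCloner
{
    private const BindingFlags Flags = BindingFlags.Instance | BindingFlags.Public |
                                       BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    private static readonly Func<MemoryStream, MemoryStream> Cloner = Build();

    private static Func<MemoryStream, MemoryStream> Build()
    {
        Type type = typeof(MemoryStream);
        if (type.GetField("_writable", Flags) == null)
            throw new InvalidOperationException("Expected private field _writable was not found.");

        // Owning the method by MemoryStream with skipVisibility lets the IL touch private fields.
        var method = new DynamicMethod("CloneReadOnly", type, new[] { type }, type, true);
        ILGenerator il = method.GetILGenerator();
        LocalBuilder copy = il.DeclareLocal(type);

        // copy = new MemoryStream();
        il.Emit(OpCodes.Newobj, type.GetConstructor(Type.EmptyTypes));
        il.Emit(OpCodes.Stloc, copy);

        foreach (FieldInfo field in type.GetFields(Flags))
        {
            il.Emit(OpCodes.Ldloc, copy);
            if (field.Name == "_writable")
            {
                // bools are int32s on the IL stack, so 0 means false
                il.Emit(OpCodes.Ldc_I4_0);
            }
            else
            {
                // copy.field = source.field (the buffer reference is shared, not copied)
                il.Emit(OpCodes.Ldarg_0);
                il.Emit(OpCodes.Ldfld, field);
            }
            il.Emit(OpCodes.Stfld, field);
        }

        il.Emit(OpCodes.Ldloc, copy);
        il.Emit(OpCodes.Ret);
        return (Func<MemoryStream, MemoryStream>)method.CreateDelegate(
            typeof(Func<MemoryStream, MemoryStream>));
    }

    public static MemoryStream MakeReadOnlyCopy(MemoryStream source)
    {
        return Cloner(source);
    }
}
```

Each clone gets its own position and length but shares the source's byte array. If the source stream later grows and reallocates its buffer, existing clones keep pointing at the old array.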
Wrappers, Inheritance (and Reality)
As for my current project here at Atalasoft, I ended up making a slightly extended MemoryStream with a customized constructor which allows you to provide a buffer and specify how much of that buffer is already filled. The code is easy to understand and it’s unlikely to have bugs. It just gets the job done.
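The class itself is not shown here, so the following is only a guess at what such a subclass could look like, not the Atalasoft implementation. PrefilledReadOnlyStream and its filled parameter are hypothetical names, and the sketch has not been hardened against every write path.

```csharp
using System;
using System.IO;

// Hypothetical class; the name and members are illustrative only.
sealed class PrefilledReadOnlyStream : MemoryStream
{
    // buffer: existing array to wrap.
    // filled: number of bytes at the start of buffer that hold real data.
    public PrefilledReadOnlyStream(byte[] buffer, int filled)
        : base(buffer)
    {
        // Trim the logical length to the filled part of the buffer.
        // Calling through base bypasses the throwing override below.
        base.SetLength(filled);
    }

    public override bool CanWrite
    {
        get { return false; }
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException("Stream is read only.");
    }

    public override void WriteByte(byte value)
    {
        throw new NotSupportedException("Stream is read only.");
    }

    public override void SetLength(long value)
    {
        throw new NotSupportedException("Stream is read only.");
    }
}
```

Something like new PrefilledReadOnlyStream(ms1.GetBuffer(), (int)ms1.Length) would then give each consumer its own position over the same data.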
It's way less cool though.
Edit: I've attached my test code to this post. It runs a series of experiments which both test the IL cloned object and compare the cloned object with MemoryStreams created by calls to new MemoryStream(ms1.GetBuffer()) and new MemoryStream(ms1.ToArray()).