Visual Studio 2013 support


I’m pleased to let you know that all Mindscape products have full support for Visual Studio 2013!

If you have an active subscription, you can download the latest nightly builds to get this new support. If your subscription has ended, you can renew it to obtain the latest builds.

Some products needed explicit support to work at all (e.g. Web Workbench), while others just have nice-to-have improvements like putting WPF controls into the toolbox for you.

Happy Coding!

Runtime code generation for types

Last week I discussed how you could use runtime code generation to create fast methods to replace slow reflection code, even when you didn’t have the target types to compile against. But runtime code generation allows you to go further than isolated methods: you can actually create whole new types at runtime.

Why might you want to do that? The classic example is to subclass a user-defined class so as to intercept certain calls that you are interested in. Some object-relational mappers do this, for example, so that users can write entity properties using automatic property syntax and yet the object-relational mapper can still intercept the getters and setters to perform, for example, lazy loading. Another example could be to create objects that are implemented according to a regular pattern but whose properties are determined from a schema that is only available at runtime (say, from a configuration file or a database’s catalogue); such objects might not be terribly useful for programming with, unless there were design-time interfaces through which you could address them, but could be handy for using with ASP.NET Dynamic Data or a data grid control.

Of the methods I talked about in my previous post, expression trees can’t be used to implement whole types. They’re restricted to, well, expressions. CodeDom can be used, in the same way and with the same considerations as before, so I’m not going to repeat that discussion. Dynamic methods can’t be used, because like expression trees they represent methods rather than types — but there is a closely related technique which can.

Enter the Reflection.Emit namespace

As well as the DynamicMethod class, the Reflection.Emit namespace includes builder classes for all the elements that make up a .NET assembly: modules, types, fields, properties, methods and so on. In many cases, the builder class is actually just like the familiar Reflection equivalent — for example, PropertyBuilder is a derived class of PropertyInfo — but writeable instead of read-only. For generating code, the MethodBuilder class exposes an ILGenerator which is the same as that exposed by DynamicMethod. So creating a type with these builder classes is a bit like combining what you already know about read-only Reflection with what you already know about dynamic method code generation. Easy, right?

Well, there are a few small hoops that you need to jump through. But once you’ve got your head around them… yes, it’s not too difficult. Let’s try it!

Preliminaries

The first thing we need to do is get some preliminaries out of the way. A type has to live inside an assembly. But it’s not quite that simple. Back in the day, some bright spark at Microsoft decided it would be a good idea to distinguish between modules (roughly, physical files that contain stuff) and assemblies (roughly, units of deployment and versioning). The idea was that your assembly, instead of being a huge monolithic DLL or EXE, could be split up so that large resources could be… actually, I don’t know why I’m explaining this, as only seven people even remember that the whole ‘multi-module assembly’ concept ever existed, and they are one by one being erased from history by a mysterious and terrifying force with a dodgy Austrian accent. All you need to know is that your ‘assembly’ is actually a wrapper around a thing called a ‘module,’ and it is within this mysterious ‘module’ that all the excitement happens.

So without further ado, let’s create a dynamic assembly-module-thingy to hold our runtime generated types.

AssemblyName name = new AssemblyName("MyRuntimeTypes");
AssemblyBuilder assembly = AppDomain.CurrentDomain.DefineDynamicAssembly(name, AssemblyBuilderAccess.Run);
ModuleBuilder module = assembly.DefineDynamicModule("MyRuntimeTypes");

Not much to see here, except for the AssemblyBuilderAccess option. For most dynamic assemblies, you’ll set this to Run. However, if you want to be able to persist the emitted assembly, you can specify RunAndSave instead. (In this case, you’ll need to call the DefineDynamicModule overload which also takes a file name.) You might do this to cache the dynamic assembly to save rebuilding it on every run, or for diagnostic purposes so you can open it in Reflector or ILDASM and see where it all went so horribly wrong.

Great, so now we have a big empty module. Time to get down to business.

Defining a dynamic type

To define a dynamic type in a module, you call the DefineType method. (Gasps of amazement.) This gives you a TypeBuilder object, which you can then party on. Once you’ve got the type the way you want it, you call CreateType and this gives you a Type object representing the compiled type — you can now create instances of this using Activator.CreateInstance or the faster techniques I discussed in the previous post.

For my example, I’m going to imagine that my system includes a user profile object whose properties are defined in a configuration file or something, and I have to implement an object with these properties so it can be displayed in a dynamic form builder or a data grid or something like that. (It’s not a stupendously realistic example but I want to keep it reasonably simple so as not to get bogged down in extraneous detail.) So here’s how I define and build that type:

TypeBuilder type = module.DefineType("Mindscape.RuntimeTypes.Profile");
// magic goes here
Type compiledType = type.CreateType();

What about inheritance? You can pass a parent type to DefineType:

TypeBuilder type = module.DefineType("Mindscape.RuntimeTypes.Profile", TypeAttributes.Public, typeof(ProfileBase));

If you want your dynamic type to implement an interface, you can specify it using AddInterfaceImplementation:

type.AddInterfaceImplementation(typeof(IProfile));

Of course this doesn’t create the actual properties or methods to implement the interface — it just marks the type as implementing the interface. You’ll still need to supply the interface members, or you’ll get an error when you create the type.
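To make that concrete, here is a minimal sketch of a dynamic type that both declares and implements an interface. IGreeter and the Greeter type are hypothetical names invented for the illustration, and the sketch uses the static AssemblyBuilder.DefineDynamicAssembly entry point found on modern .NET (the AppDomain call shown earlier plays the same role on the .NET Framework).

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical interface, invented for this sketch.
public interface IGreeter
{
    string Greet();
}

public static class InterfaceSketch
{
    public static IGreeter CreateGreeter()
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("InterfaceSketchTypes"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("InterfaceSketchTypes");

        TypeBuilder type = module.DefineType("Greeter", TypeAttributes.Public);
        type.AddInterfaceImplementation(typeof(IGreeter));

        // Methods that implement interface members must be virtual.
        MethodBuilder greet = type.DefineMethod(
            "Greet",
            MethodAttributes.Public | MethodAttributes.Virtual | MethodAttributes.Final |
            MethodAttributes.HideBySig | MethodAttributes.NewSlot,
            typeof(string),
            Type.EmptyTypes);

        ILGenerator il = greet.GetILGenerator();
        il.Emit(OpCodes.Ldstr, "hello");   // push the string to return
        il.Emit(OpCodes.Ret);

        // Tie the method explicitly to the interface member it implements.
        type.DefineMethodOverride(greet, typeof(IGreeter).GetMethod("Greet"));

        // No constructor was defined, so the type gets a default public one.
        return (IGreeter)Activator.CreateInstance(type.CreateType());
    }
}
```

If you comment out the DefineMethod block, CreateType should fail because the interface member has no implementation, which is the error described above.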

Implementing the type’s members: fields

Implementing a member field is dead easy: call the DefineField method:

FieldBuilder myField = type.DefineField("_myField", typeof(int), FieldAttributes.Private);

You’re probably going to want to refer to your fields when you come to build your methods and properties (otherwise, there’s not much point having the fields, right?), so you’ll usually want to keep references to them.

Implementing the type’s members: methods

Defining a method is dead easy too: call the (you guessed it) DefineMethod method.

MethodBuilder myMethod = type.DefineMethod("GetLength", MethodAttributes.Public, typeof(int), new Type[] { typeof(string) });

Here the third argument is the return type of the method, and the fourth the parameter types: so the above is equivalent to the signature public int GetLength(string).

Of course if you try to create the type in this state you will get an error because you haven’t provided a method body for GetLength. Fortunately, you can do this in exactly the same way as you did for ‘floating’ dynamic methods, as discussed in my earlier post.

ILGenerator ilgen = myMethod.GetILGenerator();
ilgen.Emit(OpCodes.Ldarg_1);
ilgen.Emit(OpCodes.Callvirt, typeof(string).GetProperty("Length").GetGetMethod());
ilgen.Emit(OpCodes.Ret);

As always with IL generation the tricky bit is knowing what IL to write and as always the answer is to cheat. Write an example of the type and its methods in C#, compile it (in Release mode), open the compiled assembly in ILDASM and copy the IL opcodes from there (mutatis mutandis of course). It’s not as hard as it looks! However, there are some special things to be aware of when your method needs to refer to other members of the type you’re building. We’ll talk about those in a moment.
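Putting the pieces so far together, the following sketch builds a type containing just the GetLength method and calls it through Reflection. The StringHelper type name and the wrapper method are made up for the example, and it assumes the static AssemblyBuilder.DefineDynamicAssembly entry point of modern .NET rather than the AppDomain call used in this post.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class GetLengthSketch
{
    // Hypothetical helper: builds a type with GetLength(string) and invokes it once.
    public static int CallGeneratedGetLength(string input)
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("GetLengthSketchTypes"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("GetLengthSketchTypes");
        TypeBuilder type = module.DefineType("StringHelper", TypeAttributes.Public);

        MethodBuilder method = type.DefineMethod(
            "GetLength", MethodAttributes.Public, typeof(int), new Type[] { typeof(string) });

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_1);   // argument 0 is 'this', so the string is argument 1
        il.Emit(OpCodes.Callvirt, typeof(string).GetProperty("Length").GetGetMethod());
        il.Emit(OpCodes.Ret);

        Type compiled = type.CreateType();
        object instance = Activator.CreateInstance(compiled);

        // Reflection is fine for a one-off call; for hot paths, bind a delegate instead.
        return (int)compiled.GetMethod("GetLength").Invoke(instance, new object[] { input });
    }
}
```

In real code you would build the type once and keep hold of it, rather than rebuilding it for each call as this sketch does.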

Implementing the type’s properties

After all that alarming IL, properties are nice and easy. The only thing to watch out for is that you need to implement the get and set code (if you have any of course) as methods, conventionally called get_Xxx and set_Xxx (where Xxx is your property name). The signatures of the getter and setter must be consistent with the property type. With this done, you just associate those methods with the property using SetGetMethod and/or SetSetMethod. Unlike the C# syntactic sugar, you don’t write any code in the property itself.

PropertyBuilder myProperty = type.DefineProperty("MyProperty", PropertyAttributes.None, typeof(int), Type.EmptyTypes);
 
MethodBuilder myPropertyGetter = type.DefineMethod("get_MyProperty", MethodAttributes.Public, typeof(int), Type.EmptyTypes);  // getter takes no args and returns the property type
// implement myPropertyGetter using ILGenerator in the usual way
 
myProperty.SetGetMethod(myPropertyGetter);

Notice that the property itself doesn’t even have an access modifier like public or protected — the access modifiers are on the getter and setter, again slightly different from the way C# lays it out.

And there’s more

TypeBuilder also provides methods for defining events, constructors, generic type parameters etc., setting attributes on members, creating nested types and all the other spiffy things you can do to make your CLR types a joy to behold. I’m not going to plough through them all — you can figure them out if you need them.
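As a taste of one of these, here is a rough sketch of defining a constructor with DefineConstructor. The Counter type, its _start field and the GetStart method are all invented for the illustration. The one thing that tends to catch people out is that, unlike in C#, the call to the base class constructor is not generated for you, so you have to emit it yourself.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class ConstructorSketch
{
    // Hypothetical example: builds a type equivalent to
    //   public class Counter { private int _start; public Counter(int start) { _start = start; } public int GetStart() { return _start; } }
    public static Type CreateCounterType()
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("ConstructorSketchTypes"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("ConstructorSketchTypes");
        TypeBuilder type = module.DefineType("Counter", TypeAttributes.Public);

        FieldBuilder start = type.DefineField("_start", typeof(int), FieldAttributes.Private);

        ConstructorBuilder ctor = type.DefineConstructor(
            MethodAttributes.Public, CallingConventions.Standard, new Type[] { typeof(int) });
        ILGenerator ctorIL = ctor.GetILGenerator();
        ctorIL.Emit(OpCodes.Ldarg_0);                                          // this
        ctorIL.Emit(OpCodes.Call, typeof(object).GetConstructor(Type.EmptyTypes)); // base()
        ctorIL.Emit(OpCodes.Ldarg_0);                                          // this
        ctorIL.Emit(OpCodes.Ldarg_1);                                          // the int argument
        ctorIL.Emit(OpCodes.Stfld, start);                                     // this._start = argument
        ctorIL.Emit(OpCodes.Ret);

        MethodBuilder getStart = type.DefineMethod(
            "GetStart", MethodAttributes.Public, typeof(int), Type.EmptyTypes);
        ILGenerator getIL = getStart.GetILGenerator();
        getIL.Emit(OpCodes.Ldarg_0);
        getIL.Emit(OpCodes.Ldfld, start);
        getIL.Emit(OpCodes.Ret);

        return type.CreateType();
    }
}
```

Once you define any constructor yourself, the type no longer gets a default one, so instances have to be created with an argument, for example Activator.CreateInstance(counterType, 42).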

Putting it all together

We’ve seen how to define members in isolation. Now let’s put it together and see how to build an entire type.

For our example, I just want to create a set of read-write properties according to some input spec (maybe read in from a config file), but I want the class to raise property changed notifications so it can be used in WPF or Silverlight.

This leaves me with a design decision: should I implement INotifyPropertyChanged in the runtime generated code, or should I create a base class which implements INotifyPropertyChanged and have my runtime generated class take advantage of that? From the point of view of a consumer of the runtime generated class, there’s not much difference between these two options, but from the point of view of implementation, the second is a lot easier, both to understand and to maintain, because I can write the INotifyPropertyChanged implementation, which is unchanging, in C# which everybody understands, and use IL only for the dynamic bits.

So enter a base class:

public class ViewModelBase : INotifyPropertyChanged
{
  protected void Set<T>(ref T field, T value, string propertyName)
  {
    if (!Object.Equals(field, value))
    {
      field = value;
      OnPropertyChanged(propertyName);
    }
  }
 
  protected virtual void OnPropertyChanged(string propertyName)
  {
    var handler = PropertyChanged;
    if (handler != null)
    {
      handler(this, new PropertyChangedEventArgs(propertyName));
    }
  }
 
  public event PropertyChangedEventHandler PropertyChanged;
}

You’re cordially invited to ILDASM the ViewModelBase class and imagine the joy of generating the equivalent code dynamically into the Profile class.

With that helper sorted, we can set to implementing the Profile class itself:

private static Type CreateProfileType(Dictionary<string, Type> properties)
{
  AssemblyName name = new AssemblyName("MyRuntimeTypes");
  AssemblyBuilder assembly = AppDomain.CurrentDomain.DefineDynamicAssembly(name, AssemblyBuilderAccess.Run);
  ModuleBuilder module = assembly.DefineDynamicModule("MyRuntimeTypes");
 
  // Our runtime-generated type inherits from ViewModelBase
  TypeBuilder type = module.DefineType("Mindscape.RuntimeTypes.Profile", TypeAttributes.Public, typeof(ViewModelBase));
 
  foreach (var property in properties)
  {
    ImplementProperty(type, property.Key, property.Value);
  }
 
  return type.CreateType();
}

This is all familiar framework, but this time, we’re implementing a property for every entry in the dictionary, instead of just some hardwired name. The real work happens in the ImplementProperty method:

private static void ImplementProperty(TypeBuilder type, string propertyName, Type propertyType)
{
  FieldBuilder field = type.DefineField("_" + propertyName, propertyType, FieldAttributes.Private);
 
  PropertyBuilder property = type.DefineProperty(propertyName, PropertyAttributes.None, propertyType, Type.EmptyTypes);
 
  MethodBuilder getter = type.DefineMethod("get_" + propertyName, MethodAttributes.Public, propertyType, Type.EmptyTypes);
  ILGenerator getterIL = getter.GetILGenerator();
  getterIL.Emit(OpCodes.Ldarg_0);
  getterIL.Emit(OpCodes.Ldfld, field);
  getterIL.Emit(OpCodes.Ret);
 
  MethodBuilder setter = type.DefineMethod("set_" + propertyName, MethodAttributes.Public, typeof(void), new Type[] { propertyType });
  ILGenerator setterIL = setter.GetILGenerator();
  setterIL.Emit(OpCodes.Ldarg_0);
  setterIL.Emit(OpCodes.Ldarg_0);
  setterIL.Emit(OpCodes.Ldflda, field);
  setterIL.Emit(OpCodes.Ldarg_1);
  setterIL.Emit(OpCodes.Ldstr, propertyName);
  setterIL.Emit(OpCodes.Call, typeof(ViewModelBase).GetMethod("Set", BindingFlags.NonPublic | BindingFlags.Instance).MakeGenericMethod(propertyType));
  setterIL.Emit(OpCodes.Ret);
 
  property.SetGetMethod(getter);
  property.SetSetMethod(setter);
}

Whew! Although all that IL looks a bit scary, we can pick this apart bit by bit to see how each property gets implemented. First we need a backing field, so we call DefineField. We’re going to need to refer to this field later, when we implement the property getter and setter, so we keep hold of the returned FieldBuilder. Next we define the property itself, which is as exciting as a not very exciting thing. Then comes the good bit: defining the property getter and setter. Notice the “get_”/“set_” prefix convention and the getter and setter signatures.

The setter has a couple of interesting features. The first is that when we call the base class Set method in the setter, we can pass a MethodInfo obtained in the usual way, via good old-fashioned Reflection. The second is that when we want to refer to the backing field — to get its value in the getter, and to pass a reference to it in the setter — we can do so using the FieldBuilder we got back from DefineField. This might seem inconsistent — how come we use an old-school Reflection MethodInfo for the method but a dynamic FieldBuilder for the field? Well, all the ‘builder’ classes are also ‘info’ classes: for example, a FieldBuilder is a FieldInfo. The Emit method accepts the ‘info’ classes, so that we can pass members that aren’t part of the dynamic assembly (like the ViewModelBase.Set method), and thanks to inheritance this also allows the Emit method to accept the ‘builder’ classes, which we need so we can pass members which *are* part of the dynamic assembly (like the field).

Finally, how did I know what IL to write in the getter and setter? You should know this by now: I cheated. I wrote an example Profile class in C#, compiled that, opened it in ILDASM and just adapted the method bodies to the property name and type variables.

It works!

And that’s it. We now have a working type — we can instantiate it from a list of properties, set those dynamically created properties, see the change notifications and retrieve their values.

var t = CreateProfileType(new Dictionary<string, Type> { { "SomeInt", typeof(int) }, { "SomeStr", typeof(string) } });
dynamic o = Activator.CreateInstance(t);
((INotifyPropertyChanged)o).PropertyChanged += (s, e) => { Console.WriteLine("changed " + e.PropertyName); };
o.SomeInt = 123;
o.SomeStr = "hello";
Console.WriteLine(o.SomeInt);
Console.WriteLine(o.SomeStr);

As I said, it’s not a very realistic example, but I hope it’s simple enough to be easy to follow, and that this has given you enough of a leg up to create your own more sophisticated runtime generated types when the occasion demands! Have fun!

P.S. Carel Lotz was kind enough to draw our attention to a library named RunSharp which provides a rather friendlier wrapper over some of the IL nastiness. He’s got a tutorial on it at http://fromthedevtrenches.blogspot.com/2010/08/runsharp-il-generation-for-dummies.html — check it out!

Tagged as Visual Studio

Reflection, performance and runtime code generation

If you’re developing a library or a utility module, you’ll often need to make it work with different types, including types that you don’t know about at build time. In many cases, you can handle this with .NET generics, but sometimes you need to work with the specific features of types, without knowing what those types are. For example, in LightSpeed, we need to be able to set the fields of an entity from a database, without knowing what those entity fields are. Or if you’re writing a generic Clone method, you need to be able to copy fields from one object to another. You can’t do this with generics because you need to access the specific fields, whereas generics only allow you to access stuff in interfaces or other constraints declared at design time.

The traditional solution to this is to use Reflection. Reflection gives you a way to enumerate and invoke methods, properties, constructors and fields. It’s pretty easy to write the ‘set all fields’ or ‘copy all fields’ code this way. However, the big problem is that it’s slow. Horribly, horribly slow.
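For reference, a Reflection-based ‘copy all fields’ routine might look something like the sketch below. ReflectionCopier and SamplePerson are hypothetical names invented for the illustration; this is not how LightSpeed does it, just the simplest thing that could work.

```csharp
using System;
using System.Reflection;

public static class ReflectionCopier
{
    // Copies every instance field of source's type from source to target.
    // Note: private fields declared on base classes are not returned by GetFields
    // on the derived type, so this sketch does not copy them.
    public static void CopyFields(object source, object target)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (target == null) throw new ArgumentNullException(nameof(target));

        Type type = source.GetType();
        if (!type.IsInstanceOfType(target))
        {
            throw new ArgumentException("Target must be compatible with the source type.", nameof(target));
        }

        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
        foreach (FieldInfo field in type.GetFields(flags))
        {
            // Each GetValue/SetValue call goes through Reflection, which is where the time goes.
            field.SetValue(target, field.GetValue(source));
        }
    }
}

// Hypothetical type to copy, standing in for a user's class.
public class SamplePerson
{
    public string Name;
    private int _age;

    public SamplePerson(string name, int age)
    {
        Name = name;
        _age = age;
    }

    public int Age { get { return _age; } }
}
```

Every call repeats the field lookup and pays the Reflection cost per field, which is exactly the overhead the rest of this post is about removing.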

If your code runs only occasionally, this may not be a big deal, but if you’re working with a lot of objects, Reflection code quickly becomes a bottleneck. At this point, it’s worth looking at an alternative: runtime code generation.

The idea of runtime code generation goes something like this.

  • If we could write code that was specific to each type we had to handle, then that would run really fast.
  • But we can’t do that, because we don’t have access to the types that the user of our library is going to come up with.
  • But if we did have access to those types, the code we’d write for each type would follow a predictable pattern (maybe simple, maybe complicated — but there must *be* a pattern or we couldn’t be writing a generic library in the first place!).
  • Now if the type-specific code follows a predictable pattern, we can write a program where we give it a type, and it generates and compiles the type-specific code.
  • And in that case, we can run that program at run-time, and it will create fast, type-specific methods for each of our user’s types!

At first this may sound crazy. Running a code generator and compiler sounds like it will be even slower than using Reflection. And won’t the code generator have to use Reflection to get the details of the user type anyway? Yes and yes. But the code generator and compiler have to run only once per type. They produce a compiled method, and we can call that method again and again, without needing to regenerate or recompile it.
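The ‘once per type’ part usually comes down to a cache keyed by type. Here is a rough sketch of one; GeneratedMethodCache is a made-up name, and the generator is passed in as a delegate so that any of the techniques below could sit behind it.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper: remembers one generated delegate per type.
public sealed class GeneratedMethodCache<TDelegate> where TDelegate : class
{
    private readonly ConcurrentDictionary<Type, TDelegate> _cache =
        new ConcurrentDictionary<Type, TDelegate>();

    private readonly Func<Type, TDelegate> _generate;

    public GeneratedMethodCache(Func<Type, TDelegate> generate)
    {
        if (generate == null) throw new ArgumentNullException(nameof(generate));
        _generate = generate;
    }

    // Runs the (slow) generator the first time a type is seen and reuses the result afterwards.
    // Under contention GetOrAdd may run the generator more than once for the same type,
    // but only one result is kept, which is harmless when generation has no side effects.
    public TDelegate Get(Type type)
    {
        return _cache.GetOrAdd(type, _generate);
    }
}
```

Usage would be along the lines of creating one cache per kind of generated method, then calling cache.Get(someType) wherever you previously called the Reflection code.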

Okay, so maybe it’s not crazy, but isn’t it fearfully difficult? Well, it can be, but it certainly doesn’t have to be. We’ll look at three ways, starting out at the highest level and gradually dropping down to the lowest.

Generating code using a high-level language

The easiest way to generate code is as C# or Visual Basic! For example, you could build up C# code as a string, either by concatenating strings or by hosting a templating engine such as T4. You can then compile the generated code using CodeDom.

This is pretty easy, though it requires careful attention to detail. However, CodeDom is quite heavyweight and slow, and you have to basically compile an entire assembly, which means you incur even more overhead creating classes and locating your generated method in the assembly. Since we’re doing this for performance reasons, this often makes CodeDom a somewhat unattractive choice, but for some scenarios it can be excellent.

Generating code with expression trees

Expression trees were introduced as part of LINQ in .NET 3.5. An expression tree is like an abstract representation of a piece of code, not bound to any particular language. Usually the compiler creates expression trees for you as part of a LINQ expression, but you can also create them yourself using the expressions API, and (tada!) you can compile them to delegates.

The trick to runtime code generation using expression trees is to figure out what your desired code would look like as a lambda. For example, suppose you want to generate a method to instantiate objects of an unknown type — basically the same as Activator.CreateInstance, but fast. (I’m going to ignore the possibility of using generics and the new() constraint, which would solve this problem in many, but not all, cases, because I want to show you expression trees, and this is a nice simple example.) Here’s a lambda that would do that:

Func<T> creator = () => new T();
// usage: T instance = creator();

How can we create a corresponding expression tree for a type? Fortunately, the expressions API corresponds pretty closely to the constructs we used in our lambda:

var lambdaBody = Expression.New(type);  // this corresponds to "new T()"
var lambdaExpr = Expression.Lambda<Func<object>>(lambdaBody);  // this corresponds to "() => ..."
Func<object> creator = lambdaExpr.Compile();

The flow here is a bit tricky to read because it’s inside-out rather than left-to-right. We first create an expression representing the body of the lambda: “new T()”. Our body is so simple we can do this using a single Expression.New call. Then we wrap it in an Expression.Lambda, which roughly corresponds to sticking the “() =>” on the front. .NET requires the Expression.Lambda instead of allowing us to compile the body directly because in the more general case the lambda might have parameters.

If you’re wondering whether this is worthwhile, calling the “creator” delegate is now about five times faster than calling Activator.CreateInstance (at least on my machine). If the delegate were replacing more Reflection calls — as in the object cloning example — then the savings would probably be even greater.

Expression trees are reasonably easy to write, reasonably safe, and quite efficient. For many runtime code generation applications, these could well be the sweet spot.
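One wrinkle worth knowing about: the creator above is fine for classes, but Expression.Lambda<Func<object>> will reject a body whose type is a value type, because the struct needs boxing first. A small variation, sketched here with a made-up ExpressionCreator helper, wraps the body in Expression.Convert so it copes with both.

```csharp
using System;
using System.Linq.Expressions;

public static class ExpressionCreator
{
    // Builds the equivalent of "() => (object)new T()" for a type known only at runtime.
    // Expression.New throws if a reference type has no public parameterless constructor.
    public static Func<object> Build(Type type)
    {
        Expression newExpr = Expression.New(type);                      // "new T()"
        Expression boxed = Expression.Convert(newExpr, typeof(object)); // "(object)..." boxes value types
        return Expression.Lambda<Func<object>>(boxed).Compile();        // "() => ..."
    }
}
```

For reference types the conversion costs essentially nothing, so there is little reason not to include it.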

Generating code using dynamic methods

Expression trees are spiffy, but have some limitations. One of these is that in .NET 3.5 you are limited to simple expressions. (This limitation is greatly relaxed in .NET 4, which provides new expression types to represent things like loops, if statements and goto. Yes, all this whizzy modern technology, and it still has goto.) Another is that you can’t use an expression tree to generate a member method, which is important if your generated code needs access to private members of the user type, as in the cloning example which needs access to private fields. For that, you need to drop down to the lowest level: generating IL directly using dynamic methods.

(By the way, dynamic methods aren’t the only place you can use the direct IL generation technique. If you need to generate an entire class or assembly, then you can’t use expression trees for that, so you have to either move up to CodeDom, or down to AssemblyBuilder, TypeBuilder and related Reflection.Emit classes, in which case you’ll be generating IL into the method bodies. I’m not going to cover that here; maybe some other time.)

Setting up a dynamic method is pretty easy. Here’s one for the ‘instantiate a new object’ example we had for expression trees:

// using System.Reflection.Emit;
 
var dynamicMethod = new DynamicMethod(
  "create_" + type.Name,          // name of the dynamic method
  typeof(object),                 // the return type
  new Type[0],                    // the dynamic method has no parameters
  type);                          // which type it's a member of
// TODO: create method body (this must be emitted before calling CreateDelegate)
var creator = (Func<object>)dynamicMethod.CreateDelegate(typeof(Func<object>));

You just need to provide a name for the method, specify the signature (parameter and return types), and say which type the method is a member of, though this is usually only an issue if the method needs to access private members such as fields.

The tricky bit is generating the IL for the dynamic method. No friendly expression trees or C# compiler to help you here: just man versus IL opcode, two enter, only one leaves. This means you have two alternatives: (1) learn IL or (2) cheat.

I’m not going to talk about option 1.

To cheat, you write an example instance of your dynamic method in a high level language such as C#, baking in an example ‘user type’ you have created for the purpose. You then compile this in the usual way to produce an EXE or DLL. (Make sure you compile in Release mode — if you’re in Debug mode then the compiler will emit all sorts of extra nonsense which is no use to you.)

Now open your compiled EXE or DLL in ILDASM. Yes, that’s right, ILDASM, the application you last looked at in 2002 and which you thought had been completely superseded by Reflector. ILDASM will show you the IL op codes that your example method compiled to, and all you need to do is copy those op codes into your DynamicMethod, replacing specific references to fields, types, etc. of your example type with dynamically-generated references to the fields, types, etc. of the type for which you are creating the DynamicMethod.

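For example, here’s what ILDASM shows for new-ing up an object:

[Screenshot of ILDASM output: a newobj instruction calling the example type’s constructor, followed by ret.]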

We can map this across pretty mechanically to IL op code objects in the System.Reflection.Emit.OpCodes class, and thereby generate the equivalent IL into our dynamic method.

var constructor = type.GetConstructor(new Type[] { });  // we want to call the default constructor
 
var ilGenerator = dynamicMethod.GetILGenerator();
ilGenerator.Emit(OpCodes.Newobj, constructor);
ilGenerator.Emit(OpCodes.Ret);

Obviously, in most cases, the code generation would be more complex: for example, in the object cloning scenario, you’d loop over the fields you wanted to copy, emitting a Ldfld (load field) and Stfld (store to field) op code for each one until you’d built up the entire dynamic method.
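To give a flavour of that, here is a rough sketch of a field copier built with DynamicMethod. DynamicCopier and SampleOrder are hypothetical names invented for the illustration, and the sketch assumes reference types only, so that a simple castclass is enough to get from object to the real type.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class DynamicCopier
{
    // Generates a method equivalent to:
    //   (source, target) => { ((T)target).f1 = ((T)source).f1; ((T)target).f2 = ((T)source).f2; ... }
    public static Action<object, object> Build(Type type)
    {
        if (type.IsValueType)
        {
            throw new ArgumentException("This sketch only handles reference types.", nameof(type));
        }

        var method = new DynamicMethod(
            "copy_" + type.Name,
            typeof(void),
            new Type[] { typeof(object), typeof(object) },
            type,    // owner type, so the method can see the type's private fields
            true);   // skip visibility checks

        ILGenerator il = method.GetILGenerator();
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        // Reflection runs here, once, while generating; the emitted IL has no Reflection in it.
        foreach (FieldInfo field in type.GetFields(flags))
        {
            il.Emit(OpCodes.Ldarg_1);           // target
            il.Emit(OpCodes.Castclass, type);
            il.Emit(OpCodes.Ldarg_0);           // source
            il.Emit(OpCodes.Castclass, type);
            il.Emit(OpCodes.Ldfld, field);      // load source.field
            il.Emit(OpCodes.Stfld, field);      // store into target.field
        }
        il.Emit(OpCodes.Ret);

        return (Action<object, object>)method.CreateDelegate(typeof(Action<object, object>));
    }
}

// Hypothetical type to copy, standing in for a user's class.
public class SampleOrder
{
    private string _product;
    private int _quantity;

    public SampleOrder(string product, int quantity)
    {
        _product = product;
        _quantity = quantity;
    }

    public string Product { get { return _product; } }
    public int Quantity { get { return _quantity; } }
}
```

You would call Build once per type, keep the returned delegate, and then invoke it as copier(source, target) for each object you need to copy.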

IL generation is low-level, but that makes it very efficient. There’s no extra compilation step as there was with expression trees. If you have to generate a lot of dynamic methods, then IL and DynamicMethod could be the way to go.

Conclusion

Most applications don’t need runtime code generation — either they know everything they need to know at compile time, or the performance considerations are such that Reflection is good enough. However, if you’re running a lot of Reflection code, then replacing it with type-specific code generated at runtime can give you a big performance boost. You’ll need to figure out how to trade off difficulty (and hence maintainability) against performance, but I hope this article gives you some idea of what the options are and what tradeoffs you’re making with each one. Have fun!


In bed with Roslyn

Microsoft have been talking about their “C# compiler as a service” project, aka Roslyn, for a couple of years now, and yesterday the C# team finally released a preview version. Being a bit of a language geek, I was interested to see if I could use Roslyn not only to perform analysis and refactorings but to extend the C# language itself. The reason for wanting to do this is to replace repetitive and error-prone boilerplate code with terser, more expressive code — quicker to write, easier to read and no chance for human error to muck it up.

The motivating example

If you’ve written much WPF or Silverlight code, you’ve undoubtedly written a bunch of dependency properties. Because C# has no built-in support for DPs, this means you have to write a bunch of code spread across several members, something like this:

public static readonly DependencyProperty WidgetudeProperty =
  DependencyProperty.Register("Widgetude", typeof(int), typeof(Widget),
    new FrameworkPropertyMetadata(OnWidgetudeChanged));
 
public int Widgetude
{
  get { return (int)GetValue(WidgetudeProperty); }
  set { SetValue(WidgetudeProperty, value); }
}
 
private static void OnWidgetudeChanged(DependencyObject d, DependencyPropertyChangedEventArgs e)
{
  // on-change logic here
}

This is not only boring to write, it is hard to read because the code is cluttered with boilerplate, and it is prone to error. If I had a dollar for everybody who put business logic in the CLR property setter and then wondered why it wasn’t being run when a binding updated… well, I’d have $3.00. And it’s awful to maintain. For example, if I change the property to a double, I have to update the type in three places. If I change the property name, I have to update it in seven!

Wouldn’t it be sweet if C# had language support for DPs, so you could write:

[dependency] public int Widgetude
{
  change { /* on-change logic here */ }
}

and the C# compiler would treat it as though you’d written all the boilerplate? Of course, that’s impossible: even if you used something like PostSharp to do a rewriting at the IL level, there’s no way to get a “property change” element as far as IL, because it’s simply not legal C# and the compiler will reject it.

Enter Roslyn.

Roslyn is forgiving

Roslyn is about building tooling for source code, and one of the nasty facts about source code is that it is in an invalid state more often than not. For example, you may be halfway through writing an ‘if’ statement and haven’t done the closing brace yet. Or you’ve written a call to a method but you haven’t written that method yet. Or you’ve misspelt ‘get’ as ‘grt’ in a property accessor.

A compiler won’t stand for this nonsense. Roslyn, at least at the parsing stage, is much more forgiving. You can write any old gibberish and Roslyn will do its best to organise it into a nice syntax tree, even if that syntax tree is uncompilable.

This forgiving nature means we can sneak our own illegal constructs into Roslyn, and use the Roslyn APIs to transform them into legal constructs. It’s a risky business, because Roslyn will try to interpret whatever we write as valid C#, so we probably shouldn’t be relying on any particular interpretation of invalid C#, but what the heck. If you’re going to have a sweet, forgiving nature, you must expect people to take advantage of it.

Reading code in with Roslyn

Reading code in with Roslyn is a snap — just call SyntaxTree.ParseCompilationUnit.

SyntaxTree tree = SyntaxTree.ParseCompilationUnit(sourceCode);
var root = (CompilationUnitSyntax)(tree.Root);

A SyntaxTree is an abstract representation of the source code — warts and all. It includes everything: good code, bad code, whitespace, rude comments about the client, rude comments about the client’s mother, the lot. This is all represented as a tree — at the top is a root node containing usings, members, trivia (e.g. whitespace and comments), etc., then each member (e.g. a namespace) contains sub-members (e.g. the classes in that namespace), and so on down to individual statements and expressions.

If we give Roslyn our made-up dependency property syntax (suitably enclosed in a class of course), it cheerfully parses it as a property (represented as a PropertyDeclarationSyntax object). And the change { } fragment looks enough like a get or set accessor that the Roslyn parser actually accepts it as an accessor, noting only with clinical disinterest that it wasn’t recognised as the get or set accessor.

Of course, the compiler stage of Roslyn would pitch a fit if it saw an unrecognised accessor. But we’re not going to ask the compiler stage its opinion. We’re going to take what the parser produces, and turn it into something of our own.

Rewriting code with Roslyn

Roslyn syntax trees are immutable. This is computer science jargon for “good,” but it does mean we can’t just party on the SyntaxTree object. Instead we have to transform it to produce a new SyntaxTree object. Roslyn provides a bunch of helper methods for updating, extending and pruning syntax trees, but for our purposes we’re going to use a gadget called SyntaxRewriter. SyntaxRewriter lets you traverse an entire syntax tree making whatever changes you want (though again, remember that ‘making changes’ doesn’t actually modify an existing object, it produces a new one).
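To make the 'changes produce a new object' idea concrete, here is a toy sketch in plain C#. ToyClassNode and WithMembers are hypothetical stand-ins I made up for illustration. They are not Roslyn types, and real syntax nodes carry far more detail.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-in for an immutable syntax node; not a Roslyn type.
public sealed class ToyClassNode
{
    public string Name { get; }
    public IReadOnlyList<string> Members { get; }

    public ToyClassNode(string name, IEnumerable<string> members)
    {
        Name = name;
        // Copy the members so later changes to the caller's collection cannot leak in.
        Members = members.ToList().AsReadOnly();
    }

    // "Changing" the node means building a new one; this instance is left untouched.
    public ToyClassNode WithMembers(IEnumerable<string> members)
    {
        return new ToyClassNode(Name, members);
    }
}

public static class ImmutableDemo
{
    public static void Main()
    {
        var original = new ToyClassNode("Widget", new[] { "Widgetude" });
        var updated = original.WithMembers(
            new[] { "WidgetudeProperty" }.Concat(original.Members));

        Console.WriteLine(original.Members.Count);             // 1
        Console.WriteLine(updated.Members.Count);              // 2
        Console.WriteLine(ReferenceEquals(original, updated)); // False
    }
}
```

Anybody still holding a reference to the original node sees exactly what they saw before, which is the whole point.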

Our syntax rewriter is going to rewrite our property to be a CLR dependency property wrapper, and it’s also going to inject a DependencyProperty field and an optional change callback into the enclosing class. That means we’re interested in two kinds of syntax tree node. First, we’re interested in property declarations, because (a) we need to rewrite them and (b) we need to generate fields and callback methods from them. Second, we’re interested in class declarations, because we need to poke those fields and callback methods into the containing class.

This observation makes it easy to stub out our rewriter.

public class DPRewriter : SyntaxRewriter
{
  private readonly List<FieldDeclarationSyntax> _fields = new List<FieldDeclarationSyntax>();
  private readonly List<MethodDeclarationSyntax> _methods = new List<MethodDeclarationSyntax>();
 
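  protected override SyntaxNode VisitPropertyDeclaration(PropertyDeclarationSyntax node)
  {
    // TBA: for now, fall back to the default traversal so the stub compiles
    return base.VisitPropertyDeclaration(node);
  }
 
  protected override SyntaxNode VisitClassDeclaration(ClassDeclarationSyntax node)
  {
    // TBA: for now, fall back to the default traversal so the stub compiles
    return base.VisitClassDeclaration(node);
  }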
}

The fields and methods collections are going to hold the dependency property fields and the change callback methods that we generate during property visits, so that we can later inject them into the class.

I’ve also been lazy and chosen an attribute-like syntax for marking a property as a dependency property. This makes the first stage of our property handler pretty easy too:

protected override SyntaxNode VisitPropertyDeclaration(PropertyDeclarationSyntax node)
{
  bool isDP = node.Attributes
                  .SelectMany(a => a.ChildNodes().OfType<AttributeSyntax>())
                  .Any(a => a.Name.PlainName == "dependency");
  if (!isDP)
  {
    return base.VisitPropertyDeclaration(node);
  }
 
  // real work TBA
}

Unfortunately, this is pretty much the end of the easy bit. Now we need to dive in and actually do our rewriting.

Rewriting the dependency property syntax

You probably want to see some code here, but you see, the other thing about Roslyn is, Roslyn is verbose. Really, really, verbose. I mean, it makes CodeDom look terse. It’s not really Roslyn’s fault — you’re manipulating syntax trees which are trying to provide full flexibility and fidelity to source code, so there’s bound to be a lot of detail and depth — but it does make it a pain in the backside to post any useful chunk of Roslyn code.

Consider yourself warned.

Our rewrite code needs to do three things:

  • Add a field of type DependencyProperty to the fields list
  • Replace the property with the DP implementation
  • If the property had a change ‘accessor,’ wrap the body of that accessor up in a method and add that to the methods list

To build new Roslyn objects, we use the Syntax.* factory methods. Each of these takes 84 parameters, most of which are optional, so if you don’t like C# named arguments, look away now. Here, for example, is how to build the field. (BIG CAVEAT: I’ve only been playing with Roslyn for a few hours. I’m pretty sure I’m using the wrong idioms for some things, and that there are easier ways to do others. Don’t take this as a best practice guide.)

// Desired output:
// public static DependencyProperty XxxProperty = DependencyProperty.Register("Xxx",
//   typeof(PropType), typeof(OwnerType),
//   new FrameworkPropertyMetadata(OnXxxChanged));
 
// Set up some useful elements
var changeCallback = node.AccessorList.Accessors.FirstOrDefault(a => a.Kind == SyntaxKind.UnknownAccessorDeclaration && a.Keyword.ValueText == "change");
bool hasChangeCallback = changeCallback != null;
 
var dpType = Syntax.ParseTypeName("System.Windows.DependencyProperty");
 
var ownerType = Syntax.ParseTypeName(node.FirstAncestorOrSelf<ClassDeclarationSyntax>().Identifier.ValueText);
string propName = node.Identifier.ValueText;
ExpressionSyntax propNameExpr = Syntax.LiteralExpression(SyntaxKind.StringLiteralExpression, Syntax.Literal(text: '"' + propName + '"', value: propName));
ExpressionSyntax typeofPropType = Syntax.TypeOfExpression(argumentList: Syntax.ArgumentList(arguments: Syntax.SeparatedList(Syntax.Argument(expression: node.Type))));
ExpressionSyntax typeofOwnerType = Syntax.TypeOfExpression(argumentList: Syntax.ArgumentList(arguments: Syntax.SeparatedList(Syntax.Argument(expression: ownerType))));
 
// Build the arguments to the Register call
var registerArgs = new List<ArgumentSyntax> {
  Syntax.Argument(expression: propNameExpr),
  Syntax.Argument(expression: typeofPropType),
  Syntax.Argument(expression: typeofOwnerType)
};
 
if (hasChangeCallback)
{
  ExpressionSyntax changeMethod = Syntax.ParseName("On" + propName + "Changed");
  ExpressionSyntax fpm = Syntax.ObjectCreationExpression(
    type: Syntax.ParseTypeName("System.Windows.FrameworkPropertyMetadata"),
    argumentListOpt: Syntax.ArgumentList(
      arguments: Syntax.SeparatedList(Syntax.Argument(expression: changeMethod))
    )
  );
  registerArgs.Add(Syntax.Argument(expression: fpm));
}
 
var argSeparators = Enumerable.Repeat(Syntax.Token(SyntaxKind.CommaToken), registerArgs.Count - 1).ToList();
 
// Build the call to the Register method
ExpressionSyntax dpexpr = Syntax.InvocationExpression(
  expression: Syntax.ParseName("System.Windows.DependencyProperty.Register"),
  argumentList: Syntax.ArgumentList(
    arguments: Syntax.SeparatedList(registerArgs, argSeparators)
  )
);
 
// Build the field variable
string fieldName = propName + "Property";
VariableDeclaratorSyntax declarator = Syntax.VariableDeclarator(
  identifier: Syntax.Identifier(fieldName),
  initializerOpt: Syntax.EqualsValueClause(
    value: dpexpr
  )
);
 
// Build the field declaration
FieldDeclarationSyntax newField = Syntax.FieldDeclaration(
  modifiers: Syntax.TokenList(Syntax.Token(SyntaxKind.PublicKeyword), Syntax.Token(SyntaxKind.StaticKeyword)),
  declaration: Syntax.VariableDeclaration(
    type: dpType,
    variables: Syntax.SeparatedList(declarator)));
 
// Store it to add to the class later
_fields.Add(newField);

Don’t panic at this. I know it’s long by bloggy standards (about 50 lines), and that it appears very dense. But a lot of the density is down to the named arguments and the deep nesting of Roslyn trees, so if you walk through it methodically, it’s not that difficult, just lengthy. Here’s what happens:

  • We build up ExpressionSyntax objects representing the arguments to the DependencyProperty.Register call: the property name literal string, the typeof() expressions for the property type and the owner type, and optionally a “new FrameworkPropertyMetadata” (ObjectCreationExpression) with a reference to the callback method name
  • We build a call to the DependencyProperty.Register method, passing the arguments we built in the last step. Roslyn asks us to wrap the arguments in an ArgumentList, which incurs some tedious messing around with separators. Roslyn may be forgiving on the read side, but it’s pedantic as all hell on the write side!
  • We build the field declaration itself. This comes in two bits. First we create a VariableDeclaratorSyntax representing the field variable and the initialiser for that variable. The initialiser is just the DependencyProperty.Register call we built in the previous step. Then we wrap this up with “public static” modifiers and a data type into a FieldDeclarationSyntax.
  • Job done — stick the FieldDeclarationSyntax in the collection for later processing.
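On the separator tedium: as far as I know, the later released Microsoft.CodeAnalysis.CSharp package has a SyntaxFactory.SeparatedList overload that takes just the nodes and supplies the commas itself. The sketch below assumes that package and its names rather than the CTP API used in this post. RegisterCallDemo and BuildRegisterCall are hypothetical names of mine. To keep it short, it parses the individual arguments from strings instead of building them node by node.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class RegisterCallDemo
{
    public static InvocationExpressionSyntax BuildRegisterCall(
        string propName, string propType, string ownerType)
    {
        // Parse each argument from source text rather than building it node by node.
        var args = new List<ArgumentSyntax>
        {
            SyntaxFactory.Argument(SyntaxFactory.ParseExpression("\"" + propName + "\"")),
            SyntaxFactory.Argument(SyntaxFactory.ParseExpression("typeof(" + propType + ")")),
            SyntaxFactory.Argument(SyntaxFactory.ParseExpression("typeof(" + ownerType + ")"))
        };

        // SeparatedList(nodes) inserts the comma tokens between the arguments for us.
        return SyntaxFactory.InvocationExpression(
            SyntaxFactory.ParseExpression("System.Windows.DependencyProperty.Register"),
            SyntaxFactory.ArgumentList(SyntaxFactory.SeparatedList(args)));
    }

    public static void Main()
    {
        InvocationExpressionSyntax call = BuildRegisterCall("Widgetude", "int", "Widget");
        // NormalizeWhitespace tidies the spacing before we print the expression.
        Console.WriteLine(call.NormalizeWhitespace().ToFullString());
    }
}
```

The tree-building version in this post gives finer control over each node. The string-parsing shortcut trades that away for brevity.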

After this epic, the other two bits are relatively concise. Here’s the code that generates the CLR wrapper property:

// Desired output:
// public PropType Xxx
// {
//   get { return (PropType)GetValue(XxxProperty); }
//   set { SetValue(XxxProperty, value); }
// }
ExpressionSyntax getval = Syntax.ParseExpression("GetValue(" + fieldName + ")");
ExpressionSyntax casty = Syntax.CastExpression(type: node.Type, expression: getval);
StatementSyntax getter = Syntax.ReturnStatement(expressionOpt: casty);
 
StatementSyntax setter = Syntax.ParseStatement("SetValue(" + fieldName + ", value);");
 
PropertyDeclarationSyntax newProperty = Syntax.PropertyDeclaration(
  modifiers: Syntax.TokenList(Syntax.Token(SyntaxKind.PublicKeyword)),
  type: node.Type,
  identifier: node.Identifier,
  accessorList: Syntax.AccessorList(
    accessors: Syntax.List(
      Syntax.AccessorDeclaration(
        kind: SyntaxKind.GetAccessorDeclaration,
        bodyOpt: Syntax.Block(
          statements: Syntax.List(
            getter
          )
        )
      ),
      Syntax.AccessorDeclaration(
        kind: SyntaxKind.SetAccessorDeclaration,
        bodyOpt: Syntax.Block(
          statements: Syntax.List(
            setter
          )
        )
      )
    )
  ));

Again, there’s lots of fiddly wrapper classes and lots of nesting, but the core concepts aren’t too hard. We build a GetValue call, apply a cast to it, and stick a return statement on it, and that’s our getter body. For our setter body, we build a SetValue call. Notice that Roslyn lets us build our code as C# source using string operations via the Parse methods, so we don’t have to do everything with trees. Then we bundle the getter and setter bodies up into accessors and stuff them into a property declaration.

Finally comes the on-changed callback. This is relatively easy because all we need to do is create a method, and transplant whatever the user wrote inside the change { } pseudo-accessor into that method.

// Desired output:
// private static void OnXxxChanged(DependencyObject d, DependencyPropertyChangedEventArgs e)
// {
//   /* body */
// }
 
// Reminder
var changeCallback = node.AccessorList.Accessors.FirstOrDefault(a => a.Kind == SyntaxKind.UnknownAccessorDeclaration && a.Keyword.ValueText == "change");
bool hasChangeCallback = changeCallback != null;
 
if (hasChangeCallback)
{
  // Specify the method parameters (always d and e)
  List<ParameterSyntax> parameterList = new List<ParameterSyntax>
  {
    Syntax.Parameter(identifier: Syntax.Identifier("d"), typeOpt: Syntax.ParseTypeName("System.Windows.DependencyObject")),
    Syntax.Parameter(identifier: Syntax.Identifier("e"), typeOpt: Syntax.ParseTypeName("System.Windows.DependencyPropertyChangedEventArgs")),
  };
  var paramSeparators = Enumerable.Repeat(Syntax.Token(SyntaxKind.CommaToken), parameterList.Count - 1).ToList();  // must be an easier way
  ParameterListSyntax parameters = Syntax.ParameterList(
    parameters: Syntax.SeparatedList(parameterList, paramSeparators)
  );
 
  // Build the method
  MethodDeclarationSyntax changeMethod = Syntax.MethodDeclaration(
    modifiers: Syntax.TokenList(Syntax.Token(SyntaxKind.PrivateKeyword), Syntax.Token(SyntaxKind.StaticKeyword)),
    identifier: Syntax.Identifier("On" + propName + "Changed"),
    returnType: Syntax.PredefinedType(Syntax.Token(SyntaxKind.VoidKeyword)),
    parameterList: parameters,
    bodyOpt: changeCallback.BodyOpt
  );
  _methods.Add(changeMethod);
}

A lot of the noise here comes from building up the parameter list; once we get into building the method, you can almost read off what you need to do from the desired output syntax. private static, those are the modifiers; OnXxxChanged, that’s the identifier; void, that’s the return type; and the method body, well hey, Roslyn has already parsed that into the body of the change pseudo-accessor, so we just pass a reference to that!

Taken together, this probably looks intimidating. Brave heart! It’s not difficult, just long and fiddly. Everything we’re doing here maps very exactly to the C# syntax in the “desired output” sections. If you feel yourself getting lost, map it back to those.

One final thing: our property rewriter needs to return the rewritten property.

return newProperty;

Back on familiar ground at last!

Rewriting the class

Okay, so we rewrote the property by returning it from VisitPropertyDeclaration, but how are we going to get those fields and methods we’ve been collecting emitted into the generated code? Well, fields and methods live at class level, so for this we need to rewrite the class. Fortunately, we only need to do a very simple rewrite, just jamming in a few extra members. Here’s the code:

protected override SyntaxNode VisitClassDeclaration(ClassDeclarationSyntax node)
{
  var newTypeDeclaration = (TypeDeclarationSyntax)base.VisitClassDeclaration(node);
 
  if (_fields.Count > 0 || _methods.Count > 0)
  {
    var members = new List<MemberDeclarationSyntax>(newTypeDeclaration.Members);
    members.InsertRange(0, _methods);
    members.InsertRange(0, _fields);
 
    return ((ClassDeclarationSyntax)newTypeDeclaration).Update(
        newTypeDeclaration.Attributes,
        newTypeDeclaration.Modifiers,
        newTypeDeclaration.Keyword,
        newTypeDeclaration.Identifier,
        newTypeDeclaration.TypeParameterListOpt,
        newTypeDeclaration.BaseListOpt,
        newTypeDeclaration.ConstraintClauses,
        newTypeDeclaration.OpenBraceToken,
        Syntax.List(members.AsEnumerable()),
        newTypeDeclaration.CloseBraceToken,
        newTypeDeclaration.SemicolonTokenOpt);
  }
 
  return newTypeDeclaration;
}

What’s the deal? The first thing we have to do is call the base class implementation, which goes off to all the elements within the class and gives them the chance to rewrite themselves. This is essential, because otherwise our property rewriter would never get called. And it’s also essential to do it first, so that the fields and methods that are by-products of the property rewriting are ready when we need them.

Then we take a copy of the existing members of the class, add in the fields and methods that came out of the property rewriting process, and use the Update helper method to create a copy of the class with the extended member collection. As before, it looks scarier than it is — most of the Update call is boilerplate copying, there’s just quite a bit of it!
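If the ordering argument feels abstract, here is the same shape boiled down to a toy in plain C#. ToyClass, ToyRewriter and ToyDPRewriter are made-up stand-ins, not Roslyn types, and members are just strings. The point is only to show why the base call has to come first, and how the by-products get injected afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy stand-ins for syntax nodes and a rewriter; none of these are Roslyn types.
public sealed class ToyClass
{
    public string Name { get; }
    public IReadOnlyList<string> Members { get; }

    public ToyClass(string name, IEnumerable<string> members)
    {
        Name = name;
        Members = members.ToList().AsReadOnly();
    }
}

public class ToyRewriter
{
    // Default behaviour: visit every member and rebuild the class from the results.
    public virtual ToyClass VisitClass(ToyClass node)
    {
        return new ToyClass(node.Name, node.Members.Select(m => VisitMember(m)));
    }

    public virtual string VisitMember(string member)
    {
        return member;
    }
}

public sealed class ToyDPRewriter : ToyRewriter
{
    private readonly List<string> _fields = new List<string>();

    public override string VisitMember(string member)
    {
        const string marker = "[dependency] ";
        if (!member.StartsWith(marker, StringComparison.Ordinal))
        {
            return member;
        }
        string name = member.Substring(marker.Length);
        _fields.Add(name + "Property"); // by-product, injected into the class later
        return name;                    // the rewritten member
    }

    public override ToyClass VisitClass(ToyClass node)
    {
        // Base call first: this is what triggers VisitMember and fills _fields.
        ToyClass visited = base.VisitClass(node);
        var result = new ToyClass(visited.Name, _fields.Concat(visited.Members));
        _fields.Clear(); // my addition: stops fields leaking into the next class visited
        return result;
    }
}

public static class RewriteDemo
{
    public static void Main()
    {
        var input = new ToyClass("Widget", new[] { "[dependency] Widgetude", "Size" });
        ToyClass output = new ToyDPRewriter().VisitClass(input);
        Console.WriteLine(string.Join(", ", output.Members));
        // WidgetudeProperty, Widgetude, Size
    }
}
```

Clearing the collection after injection is not something the rewriter in this post does. It only matters once the source contains more than one class, but it seemed worth flagging.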

Let’s rewrite

Finally, we need to call the rewriter to transform our illegal pseudo-C# into legal C#.

SyntaxTree tree = SyntaxTree.ParseCompilationUnit(sourceCode);
var root = (CompilationUnitSyntax)(tree.Root);
 
var rewriter = new DPRewriter();
var newRoot = (CompilationUnitSyntax)(rewriter.Visit(root));
newRoot = newRoot.Format();
Console.WriteLine(newRoot.ToString());

Just new up a rewriter, pass it the node you want to rewrite — in our case we’ll rewrite the whole source by passing the root — and call ToString() on the result to get the transformed code. Easy!

Here are the results (slightly reformatted for readability):

// Input:
public class Widget : DependencyObject
{
  [dependency] public int Widgetude
  {
    change { Console.WriteLine("DP change: " + e + " on " + d); }
  }
}
 
// Output:
public class Widget : DependencyObject
{
  public static System.Windows.DependencyProperty WidgetudeProperty =
    System.Windows.DependencyProperty.Register("Widgetude",
      typeof (int), typeof (Widget),
        new System.Windows.FrameworkPropertyMetadata(OnWidgetudeChanged));
 
  private static void OnWidgetudeChanged(System.Windows.DependencyObject d, System.Windows.DependencyPropertyChangedEventArgs e)
  {
    Console.WriteLine("DP change: " + e + " on " + d);
  }
 
  public int Widgetude
  {
    get { return (int)GetValue(WidgetudeProperty); }
    set { SetValue(WidgetudeProperty, value); }
  }
}

Sweet!

Take it for a spin

The possibilities for Roslyn make my head spin. It’s going to lower the barrier to entry for building code analysis and refactoring tools, which I reckon is going to lead to a huge flowering of new and cool products in this sector. But it also opens up some exciting new possibilities for code generation — from the legitimate, like aspect weaving on source code, to the slightly naughty, like syntax extensions.

You can get the Roslyn preview for VS2010 SP1 here. It has a bunch of great samples and walkthroughs which are well worth checking out. My code is about a million times uglier, and is I’m sure riddled with unnecessary kludges and workarounds, but if you want to have a play with it, here it is.

Enjoy, and go crazy!



NHibernate Designer 2 is here!

NHibernate Visual Designer for Visual Studio

I’m very pleased to announce the immediate availability of the NHibernate Designer 2.0! This release represents a big set of improvements for existing users and a great experience for new users.

Key new features:

  • File per class generation option added
  • Setup wizard for creating a new model
  • Generate mappings with either inline XML or Fluent NHibernate
  • Configuration guidance wizard to integrate quicker with your project
  • Automatically add references to required NHibernate assemblies
  • Fully integrated schema migrations framework
  • View entity database tables directly within Visual Studio
  • Set up validation in the designer for NHibernate.Validator
  • Per project code generation templates

Along with all these great additions we’ve added a lot of polish to really boost the ease of use, and we think you’ll be very pleased with the results. Read more about the NHibernate Designer 2 features here.

How to get it today

If you already have the NHibernate Designer installed in Visual Studio 2010, simply open the Extension Manager (Tools -> Extension Manager) and check for updates. The latest build will be there, install it and you’re away.

If you have not yet tried the NHibernate Designer then click here to install the latest version.

One more thing…

We’ve dropped the price for the launch of version 2.0. Down to a tight $99 USD per developer! Given the hours of time you’ll save it’s the perfect time to invest in your toolset!
