Sunday, May 27, 2012

Magic strings & member names

One of the most annoying things I see again and again when I get assigned to an already existing project is the massive amount of strings in the code, especially the ones called 'magic strings'. Magic strings are strings that break program correctness when they are changed. This example is most often found in WPF projects, where the INotifyPropertyChanged interface has an event which takes the property name in its argument:
     private string _MyProperty = null;
     public string MyProperty  
     {  
       get { return _MyProperty; }  
       set  
       {  
         _MyProperty = value;  
         RaisePropertyChanged("MyProperty");  
       }  
     }  
where RaisePropertyChanged obviously raises the PropertyChanged event if it is not null.
Now what happens when you rename MyProperty? The program will still compile, but any logic built on top of this property changed notification will break. Obviously not good.
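
To make the failure concrete, here is a minimal sketch of the situation after a rename. RenamedViewModel, MyRenamedProperty and RenameDemo are made-up names for this illustration only; the point is just that the stale string still compiles while a listener waiting for the new name never hears anything.

```csharp
using System.ComponentModel;

// Hypothetical view model: the property was renamed, the magic string was not.
public class RenamedViewModel : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    private string _MyRenamedProperty;
    public string MyRenamedProperty
    {
        get { return _MyRenamedProperty; }
        set
        {
            _MyRenamedProperty = value;
            // Stale name: the compiler has no way to notice this.
            RaisePropertyChanged("MyProperty");
        }
    }

    private void RaisePropertyChanged(string propertyName)
    {
        var handler = PropertyChanged;
        if (handler != null)
            handler(this, new PropertyChangedEventArgs(propertyName));
    }
}

public static class RenameDemo
{
    // Returns true if a listener waiting for the new name was notified.
    public static bool ListenerWasNotified()
    {
        var notified = false;
        var viewModel = new RenamedViewModel();
        viewModel.PropertyChanged += (sender, e) =>
        {
            if (e.PropertyName == "MyRenamedProperty")
                notified = true;
        };
        viewModel.MyRenamedProperty = "new value";
        return notified;
    }
}
```

With this setup ListenerWasNotified() returns false: the event fires, but with the old name.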

Now I'm not going to show you anything new; the solution for this is widely known to everyone who searches for the right terms, like here or here. All these techniques use a simple solution: pass in the property in a lambda expression. The compiler will complain if the property doesn't exist, and you only use the lambda expression to retrieve the name of the property.

Now here, instead of reinventing the wheel, I'll just try to paint it in a nice color.
You will notice that I am a big fan of fluent interfaces and fluent programming. I like it a lot when a program is almost self-developing: you get an instance, and you can only choose from 3-4 things to do with it. It's also good for everyone who reads it, saves time understanding the code, and sometimes even makes people (e.g. developers) smile :)


So let's see how we would want to use it. Let's put the above mentioned MyProperty in a class called MyClass. The possible usages I see:
  1.  I am inside MyClass. In this case I want the simplest possible access to my property, without any additional code. For example:
    var propName = MemberName.Prop(() => MyProperty);
    
  2.  I am somewhere else in the code, and I know I will need the name of the given property. I don't want to create an instance of the class, and I don't want to specify the type of the property either.
    var propName = MemberName<MyClass>.Prop(x => x.MyProperty);
    
    Here the easiest syntax I could come up with was an expression that takes an imaginary MyClass instance. This gets the maximum amount of help from IntelliSense.
  3.  I have an object of class MyClass, and I don't want to specify any types at all. I'd use:
     var propName = MemberName.For(myobject).Prop(x => x.MyProperty);
    
    Here I'm using a fluent style to speed up the retrieval.
All these requirements are easily achievable. Let's go through them one by one, starting with the first. First we will need a static class to contain the methods.
      /// <summary>  
      /// a helper class to retrieve the name of a property from a member expression
      /// </summary>  
      public static class MemberName  
      {  
      }  
Now we need to implement the static method that will take the simple member expression we try to use.
     /// <summary>  
     /// gets the property name for a member access  
     /// </summary>  
     /// <typeparam name="T"></typeparam>  
     /// <param name="propertyExpression"></param>  
     /// <returns></returns>  
     public static string Prop<T>(Expression<Func<T>> propertyExpression)  
     {  
       if (propertyExpression == null)  
         throw new ArgumentNullException(MemberName.Prop(() => propertyExpression));  
       if (propertyExpression.Body.NodeType != ExpressionType.MemberAccess)  
         throw new ArgumentException("Only direct member access expressions are supported!", MemberName.Prop(() => propertyExpression));  
       return ((MemberExpression)propertyExpression.Body).Member.Name;  
     }  
We define the method with one generic parameter, which is the type of the property. This is better than using Func<object>, because in that case the compiler wraps value-type properties in a conversion (boxing) node, and the body of the expression is no longer a plain member access.
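
The difference between the generic parameter and Func<object> can be checked directly by looking at the node type of the expression body. This is only a small sketch; Sample, Count and ConvertNodeDemo are hypothetical names that are not part of the helper.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical class with a value-type property, used only for this demo.
public class Sample
{
    public int Count { get; set; }
}

public static class ConvertNodeDemo
{
    // Generic version: T is inferred as int, so no conversion is needed.
    public static ExpressionType GenericBody<T>(Expression<Func<T>> expression)
    {
        return expression.Body.NodeType;
    }

    // Func<object> version: the int has to be boxed to object.
    public static ExpressionType ObjectBody(Expression<Func<object>> expression)
    {
        return expression.Body.NodeType;
    }

    public static ExpressionType[] Run()
    {
        var sample = new Sample();
        return new[]
        {
            GenericBody(() => sample.Count), // ExpressionType.MemberAccess
            ObjectBody(() => sample.Count),  // ExpressionType.Convert
        };
    }
}
```

With Func<object> the helper would have to unwrap the Convert node before it could read the member name, which is exactly the extra work the generic parameter avoids.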
Besides this we just do the basic stuff: check the arguments, and then use the magic from any of the above mentioned articles. One cool thing is that we can already use our function inside its own definition, since the argument exceptions all take the argument name as a string in the constructor.
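
Plugged back into the original WPF example, the setter could look roughly like this. It is a sketch that repeats the Prop method so it compiles on its own; the RaisePropertyChanged helper is an assumed implementation, since the post does not show one.

```csharp
using System;
using System.ComponentModel;
using System.Linq.Expressions;

public static class MemberName
{
    public static string Prop<T>(Expression<Func<T>> propertyExpression)
    {
        if (propertyExpression == null)
            throw new ArgumentNullException(MemberName.Prop(() => propertyExpression));
        if (propertyExpression.Body.NodeType != ExpressionType.MemberAccess)
            throw new ArgumentException("Only direct member access expressions are supported!", MemberName.Prop(() => propertyExpression));
        return ((MemberExpression)propertyExpression.Body).Member.Name;
    }
}

public class MyClass : INotifyPropertyChanged
{
    public event PropertyChangedEventHandler PropertyChanged;

    private string _MyProperty = null;
    public string MyProperty
    {
        get { return _MyProperty; }
        set
        {
            _MyProperty = value;
            // No magic string: renaming MyProperty updates this line too.
            RaisePropertyChanged(MemberName.Prop(() => MyProperty));
        }
    }

    // Assumed helper: raises the event only if somebody subscribed.
    private void RaisePropertyChanged(string propertyName)
    {
        var handler = PropertyChanged;
        if (handler != null)
            handler(this, new PropertyChangedEventArgs(propertyName));
    }
}
```

Keep in mind that building the expression tree on every set costs more than passing a constant string, so this may not suit very hot setters.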


How would the second use case look? We need a typed version of our class that takes and remembers the type argument, so that later, when you enter the argument for the Prop method, the compiler can infer the type of the lambda parameter and thus give you some nice IntelliSense.
We go about this almost the same way as above:
      /// <summary>
      /// a typed helper class to retrieve the name of a property from an expression
      /// </summary>
      /// <typeparam name="TClass">the type that declares the property</typeparam>
      public static class MemberName<TClass>
      {
          /// <summary>
          /// retrieves the name of the property based on an expression tree
          /// </summary>
          /// <param name="propertyExpression">a lambda that accesses the property</param>
          /// <returns>the name of the accessed property</returns>
          public static string Prop<TProp>(Expression<Func<TClass, TProp>> propertyExpression)
          {
              if (propertyExpression == null)
                  throw new ArgumentNullException(MemberName.Prop(() => propertyExpression));
              if (propertyExpression.Body.NodeType != ExpressionType.MemberAccess)
                  throw new ArgumentException("The parameter needs to be a member access", MemberName.Prop(() => propertyExpression));
              return ((MemberExpression)propertyExpression.Body).Member.Name;
          }
      }
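
A short usage sketch for the typed version. The Count property and the TypedDemo class are hypothetical additions, and the parameter names are written as plain strings here only so that the snippet compiles without the non-generic MemberName class.

```csharp
using System;
using System.Linq.Expressions;

public static class MemberName<TClass>
{
    public static string Prop<TProp>(Expression<Func<TClass, TProp>> propertyExpression)
    {
        // Plain strings keep this sketch standalone.
        if (propertyExpression == null)
            throw new ArgumentNullException("propertyExpression");
        if (propertyExpression.Body.NodeType != ExpressionType.MemberAccess)
            throw new ArgumentException("The parameter needs to be a member access", "propertyExpression");
        return ((MemberExpression)propertyExpression.Body).Member.Name;
    }
}

public class MyClass
{
    public string MyProperty { get; set; }
    public int Count { get; set; } // hypothetical value-type property
}

public static class TypedDemo
{
    // TProp is inferred as string; only TClass has to be written out.
    public static string StringProperty()
    {
        return MemberName<MyClass>.Prop(x => x.MyProperty);
    }

    // TProp is inferred as int, so no boxing conversion appears in the tree.
    public static string IntProperty()
    {
        return MemberName<MyClass>.Prop(x => x.Count);
    }

    // Anything that is not a member access is rejected by the argument check.
    public static bool MethodCallIsRejected()
    {
        try
        {
            MemberName<MyClass>.Prop(x => x.ToString());
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

No MyClass instance is ever created here; the lambda parameter x only exists inside the expression tree.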

So far so good. Now what about the third case?
Easy as well. We could try to make the MemberName<T> class non-static and create an instance of it with the .For() method of the non-generic class, but then we hit the problem that a class cannot have two methods with the same name and parameters where one is static and the other is not.
So we will have to introduce a third class, just to be able to keep the same name. (Of course you could just use a different name for the static and non-static methods and have only one version of the generic type, but I think for long-term code readability it is better if all these methods have the same name.)

So we create our third class:
      /// <summary>
      /// a typed, instantiable helper class to retrieve the name of a property from an expression
      /// </summary>
      /// <typeparam name="TClass">the type that declares the property</typeparam>
      public class MemberNameFor<TClass>
      {
          /// <summary>
          /// retrieves the name of the property based on an expression tree
          /// </summary>
          /// <param name="propertyExpression">a lambda that accesses the property</param>
          /// <returns>the name of the accessed property</returns>
          public string Prop<TProp>(Expression<Func<TClass, TProp>> propertyExpression)
          {
              if (propertyExpression == null)
                  throw new ArgumentNullException(MemberName.Prop(() => propertyExpression));
              if (propertyExpression.Body.NodeType != ExpressionType.MemberAccess)
                  throw new ArgumentException("The parameter needs to be a member access", MemberName.Prop(() => propertyExpression));
              return ((MemberExpression)propertyExpression.Body).Member.Name;
          }
      }
and then we can return a new instance of this class in our first class with the For method:
     /// <summary>
     /// gets a type specific version of property name getting
     /// </summary>
     /// <typeparam name="T"></typeparam>
     /// <param name="item"></param>
     /// <returns></returns>
     public static MemberNameFor<T> For<T>(T item)
     {
       return new MemberNameFor<T>();
     }
Now we can also replace the function in our second class (MemberName<TClass>) with a simple relay call, to remove the redundancy:
     /// <summary>
     /// retrieves the name of the property based on an expression tree
     /// </summary>
     /// <param name="propertyExpression"></param>
     /// <returns></returns>
     public static string Prop<TProp>(Expression<Func<TClass, TProp>> propertyExpression)
     {
       return new MemberNameFor<TClass>().Prop(propertyExpression);
     }
Cool, all this done.
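
Put together, the three classes and the three usages from the list above might look like this. The class bodies follow the snippets in this post; only the AllThreeUsages method is a hypothetical addition for demonstration.

```csharp
using System;
using System.Linq.Expressions;

public static class MemberName
{
    public static string Prop<T>(Expression<Func<T>> propertyExpression)
    {
        if (propertyExpression == null)
            throw new ArgumentNullException(MemberName.Prop(() => propertyExpression));
        if (propertyExpression.Body.NodeType != ExpressionType.MemberAccess)
            throw new ArgumentException("Only direct member access expressions are supported!", MemberName.Prop(() => propertyExpression));
        return ((MemberExpression)propertyExpression.Body).Member.Name;
    }

    // The item is only used to let the compiler infer T.
    public static MemberNameFor<T> For<T>(T item)
    {
        return new MemberNameFor<T>();
    }
}

public class MemberNameFor<TClass>
{
    public string Prop<TProp>(Expression<Func<TClass, TProp>> propertyExpression)
    {
        if (propertyExpression == null)
            throw new ArgumentNullException(MemberName.Prop(() => propertyExpression));
        if (propertyExpression.Body.NodeType != ExpressionType.MemberAccess)
            throw new ArgumentException("The parameter needs to be a member access", MemberName.Prop(() => propertyExpression));
        return ((MemberExpression)propertyExpression.Body).Member.Name;
    }
}

public static class MemberName<TClass>
{
    // Relay call, so the logic lives in one place.
    public static string Prop<TProp>(Expression<Func<TClass, TProp>> propertyExpression)
    {
        return new MemberNameFor<TClass>().Prop(propertyExpression);
    }
}

public class MyClass
{
    public string MyProperty { get; set; }

    // Hypothetical demo method showing use cases 1, 2 and 3 in order.
    public string[] AllThreeUsages()
    {
        var myobject = new MyClass();
        return new[]
        {
            MemberName.Prop(() => MyProperty),
            MemberName<MyClass>.Prop(x => x.MyProperty),
            MemberName.For(myobject).Prop(x => x.MyProperty),
        };
    }
}
```

One thing to be aware of: For infers T from the static type of its argument, so a variable declared as object gives you a MemberNameFor<object>, not one for the runtime type.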


So what else might we want? Obviously this technique is only needed where something stinks of reflection, but sometimes that is unavoidable.
So what about method calls?
Well, for the first case, when we are in the class itself, there is a simple solution.
Let's say we have a method:
     public string MyMethod(string myParameter)  
     {  
       throw new NotImplementedException();  
     }  
To get the following syntax:
 var methodname = MemberName.Method(MyMethod);  
all we need to do is add a function that will take a Func<TParam, TResult> to our MemberName static class:
     /// <summary>
     /// gets the name of the method specified
     /// </summary>
     /// <typeparam name="TParam">the type of the method's parameter</typeparam>
     /// <typeparam name="TResult">the return type of the method</typeparam>
     /// <param name="method">the method, passed as a method group</param>
     /// <returns>the name of the method</returns>
     public static string Method<TParam, TResult>(Func<TParam, TResult> method)
     {
       if (method == null)
         throw new ArgumentNullException(MemberName.Prop(() => method));
       return method.Method.Name;
     }
Now why are we using a Func<TParam, TResult> instead of the well working Expression? Well, first of all, it is enough here. We don't need an expression tree in this case, as we can get the method name from the delegate directly. The second reason you will see in just a minute.
As you can see from the definition, to support methods with other signatures all you have to do is add an overload of this method with as many type parameters as you need.
One unfortunate thing here: a method group has no type of its own, so the compiler often cannot infer the type arguments from it and you have to write them out, as in MemberName.Method<string, string>(MyMethod). This is always the case when the method has overloads, because the compiler cannot know which one you meant.
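
As a sketch of how such overloads could look, the version below covers methods with zero, one and two parameters. The private NameOf helper and the Add, Now and second MyMethod members are assumptions made for this example. The type arguments are written out explicitly in every call, which is the form that works whether or not the method is overloaded.

```csharp
using System;

public static class MemberName
{
    public static string Method<TResult>(Func<TResult> method)
    {
        return NameOf(method);
    }

    public static string Method<TParam, TResult>(Func<TParam, TResult> method)
    {
        return NameOf(method);
    }

    public static string Method<TParam1, TParam2, TResult>(Func<TParam1, TParam2, TResult> method)
    {
        return NameOf(method);
    }

    // Hypothetical shared helper: every delegate exposes its target method.
    private static string NameOf(Delegate method)
    {
        if (method == null)
            throw new ArgumentNullException("method");
        return method.Method.Name;
    }
}

public class MyClass
{
    public string MyMethod(string myParameter) { return myParameter; }
    public string MyMethod(int number) { return number.ToString(); } // overload
    public int Add(int a, int b) { return a + b; }
    public DateTime Now() { return DateTime.Now; }

    public string[] Names()
    {
        return new[]
        {
            // The type arguments pick the overload of MyMethod.
            MemberName.Method<string, string>(MyMethod),
            MemberName.Method<int, string>(MyMethod),
            MemberName.Method<int, int, int>(Add),
            MemberName.Method<DateTime>(Now),
        };
    }
}
```

This only gives a meaningful name for method groups. If you pass a lambda instead, the delegate points at a compiler-generated method and you get that generated name back.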


Now what about using this in cases 2 and 3?
The unfortunate thing is that a method group has no type of its own, so the compiler cannot infer the return type of a lambda that returns one. So if we want to use a syntax like the one above, i.e. pass the method as a method group:
 var methodname = MemberName<MyClass>.Method(x => x.MyMethod);
Then the method implementation in MemberName<TClass> would be something like:
     /// <summary>  
     /// gets the name of a method  
     /// </summary>  
     /// <typeparam name="TResult"></typeparam>  
     /// <param name="method"></param>  
     /// <returns></returns>  
     public static string Method<TResult>(Expression<Func<TClass, TResult>> method)  
     {  
       if (method == null)  
         throw new ArgumentNullException(MemberName.Prop(() => method));  
       if (method.Body.NodeType != ExpressionType.Call)  
         throw new ArgumentException("The parameter needs to be a method call", MemberName.Prop(() => method));  
       return ((MethodCallExpression)method.Body).Method.Name;  
     }  
but in this case we cannot simply write
 var methodName = MemberName<MyClass>.Method(x => x.MyMethod);
because the compiler will complain that it cannot infer the type arguments, and you will need to specify them manually:
 var methodName = MemberName<MyClass>.Method<Func<string, string>>(x => x.MyMethod);
This compiles, but as far as I can tell the body of such an expression is a delegate conversion rather than a call to MyMethod, so the ExpressionType.Call check above rejects it unless the implementation is extended to unwrap it. Specifying the type of the method explicitly also breaks the beauty of fluent coding, and the dynamic nature of the code.
Obviously the same holds for the third scenario.

But if we already have to spell out the types of the method, then probably the most user-friendly option is to specify the method call in an expression. This way all refactoring tools will automatically recognize the correct method signature, and rename the method call in the expression tree too if the original method is renamed.
In this case the usage would look like:
 var methodName = MemberName<MyClass>.Method(x => x.MyMethod(default(string)));
Here the drawback is that the argument types need to be specified explicitly, but at least you get a compile-time error if they don't match, and this binds your call to that exact method, so if its signature or name changes, the compiler will force the change to be propagated here too.
This syntax is already supported by the above implementation.
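
Here is a small sketch of the method call syntax in action. The two-parameter MyMethod overload and the CallDemo class are hypothetical, and the parameter name is a plain string only to keep the snippet standalone. Since the expression tree is inspected and never compiled or invoked, the methods themselves are never run.

```csharp
using System;
using System.Linq.Expressions;

public static class MemberName<TClass>
{
    public static string Method<TResult>(Expression<Func<TClass, TResult>> method)
    {
        // Plain strings keep this sketch standalone.
        if (method == null)
            throw new ArgumentNullException("method");
        if (method.Body.NodeType != ExpressionType.Call)
            throw new ArgumentException("The parameter needs to be a method call", "method");
        return ((MethodCallExpression)method.Body).Method.Name;
    }
}

public class MyClass
{
    public string MyProperty { get; set; }

    public string MyMethod(string myParameter)
    {
        throw new NotImplementedException();
    }

    // Hypothetical overload: the arguments in the lambda pick the overload.
    public int MyMethod(int a, int b)
    {
        throw new NotImplementedException();
    }
}

public static class CallDemo
{
    public static string SingleParameter()
    {
        return MemberName<MyClass>.Method(x => x.MyMethod(default(string)));
    }

    public static string TwoParameters()
    {
        return MemberName<MyClass>.Method(x => x.MyMethod(0, 0));
    }

    // A property access is not a call, so the argument check rejects it.
    public static bool PropertyIsRejected()
    {
        try
        {
            MemberName<MyClass>.Method(x => x.MyProperty);
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```

Both calls return "MyMethod" without ever hitting the NotImplementedException. Methods that return void would need an additional overload taking an Expression<Action<TClass>>, which is not covered here.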

Now we have achieved all the functionality we wanted. You can download the final source code here.

Any comments?