Monday, February 18, 2013

Memorizing method results with PostSharp (part 2)

This post is the continuation of the previous post Memorizing method results with PostSharp (part 1).
In the previous post we created an attribute, that allowed caching method results. The caching was implemented with a static dictionary object, that stored a reference to the objects and the method atguments as well. That solution is harly a good one for a production environment, as it might lead to memory leaks, the dictionary just growing and having more n more elements, withou allowing the GC to collect the objects that we dont use.

Reference to the object instance

Help from PostSharp

Now PostSharp has a solution for the first problem of our, having a reference to the object itself.
In the PostSharp documentation you can read about the lifecycle and scope of the postsharp aspects. The instance-scoped aspects seem to be exactly what we need. It creates one instance ofthe attribute for an instance of the class it is used on. In addition it automatically has an instance for static methods.
The documentation states that we only need to implement the IInstanceScopedAspect interface, and the magic will happen.
This interface has 2 methods:
object IInstanceScopedAspect.CreateInstance(AdviceArgs adviceArgs)
void IInstanceScopedAspect.RuntimeInitializeInstance()
The first one is called on to create the aspect when a new instance of the class is created. The second one is called on the newly created aspect.
We will simply copy over all the data from our aspect, and that should do. Since we are creating an instance aspect, we should be able to remove the first 2 keys from the dictionary, and just use the arguments as keys, and the return value as value.

Unit tests

Lets start first extending our test class with a static and a second instance method, and then write some tests for static, and 2 different instance methods.
        public class TestClass
        {
            public static int StaticNrOfExecutions = 0;
            [CacheResultAttribute.CacheResult(CacheLocations.Static)]
            public static int StaticDoubleTheNumber(int i)
            {
                StaticNrOfExecutions++;
                return i * 2;
            }

            public int CallStaticDoubleTheNumber(int i)
            {
                return StaticDoubleTheNumber(i);
            }
            
            public int NrOfExecutions = 0;
            public int NrOfExecutions2 = 0;
            [CacheResultAttribute.CacheResult(CacheLocations.Static)]
            public int DoubleTheNumber(int i)
            {
                NrOfExecutions++;
                return i * 2;
            }
            [CacheResultAttribute.CacheResult(CacheLocations.Static)]
            public int DoubleTheNumber2(int i)
            {
                NrOfExecutions2++;
                return i * 2;
            }
        }
So now having this class, lets write a test for static methods:
        [TestMethod]
        public void TestStaticResultCached()
        {
            var sa = new TestClass();
            var sb = new TestClass();
            Assert.AreEqual(0, TestClass.StaticNrOfExecutions);
            TestClass.StaticDoubleTheNumber(1);
            Assert.AreEqual(1, TestClass.StaticNrOfExecutions);
            TestClass.StaticDoubleTheNumber(2);
            Assert.AreEqual(2, TestClass.StaticNrOfExecutions);
            TestClass.StaticDoubleTheNumber(1);
            Assert.AreEqual(2, TestClass.StaticNrOfExecutions);
            sa.CallStaticDoubleTheNumber(4);
            Assert.AreEqual(3, TestClass.StaticNrOfExecutions);
            sa.CallStaticDoubleTheNumber(2);
            Assert.AreEqual(3, TestClass.StaticNrOfExecutions);
            sb.CallStaticDoubleTheNumber(4);
            Assert.AreEqual(3, TestClass.StaticNrOfExecutions);
        }
One for the 2 instance methods
[TestMethod]
        public void MethodsCachedSeparately()
        {
            var sa = new TestClass();
            Assert.AreEqual(0, sa.NrOfExecutions);
            Assert.AreEqual(0, sa.NrOfExecutions2);
            sa.DoubleTheNumber(1);
            Assert.AreEqual(1, sa.NrOfExecutions);
            Assert.AreEqual(0, sa.NrOfExecutions2);
            sa.DoubleTheNumber(2);
            Assert.AreEqual(2, sa.NrOfExecutions);
            Assert.AreEqual(0, sa.NrOfExecutions2);
            sa.DoubleTheNumber(1);
            Assert.AreEqual(2, sa.NrOfExecutions);
            Assert.AreEqual(0, sa.NrOfExecutions2);
            sa.DoubleTheNumber2(1);
            Assert.AreEqual(2, sa.NrOfExecutions);
            Assert.AreEqual(1, sa.NrOfExecutions2);
            sa.DoubleTheNumber2(2);
            Assert.AreEqual(2, sa.NrOfExecutions);
            Assert.AreEqual(2, sa.NrOfExecutions2);
            sa.DoubleTheNumber2(1);
            Assert.AreEqual(2, sa.NrOfExecutions);
            Assert.AreEqual(2, sa.NrOfExecutions2);

        }
and then a test for the same method on 2 objects.
[TestMethod]
        public void TestDifferentObjectsCached()
        {
            var sa = new TestClass();
            var sb = new TestClass();
            Assert.AreEqual(0, sa.NrOfExecutions);
            Assert.AreEqual(0, sb.NrOfExecutions);
            sa.DoubleTheNumber(1);
            Assert.AreEqual(1, sa.NrOfExecutions);
            Assert.AreEqual(0, sb.NrOfExecutions);
            sa.DoubleTheNumber(2);
            Assert.AreEqual(2, sa.NrOfExecutions);
            Assert.AreEqual(0, sb.NrOfExecutions);
            sa.DoubleTheNumber(1);
            Assert.AreEqual(2, sa.NrOfExecutions);
            Assert.AreEqual(0, sb.NrOfExecutions);
            sb.DoubleTheNumber(1);
            Assert.AreEqual(2, sa.NrOfExecutions);
            Assert.AreEqual(1, sb.NrOfExecutions);
        }
We trust PostSharp, that the instance attribute will go away with the object itself, so we dont do a test for that

The implementation

With the implementation we dont have a lot to change. We implement the required interface, and change our dictionary to only hold reference to the arguments.
    public class CacheResultAttribute : PostSharp.Aspects.OnMethodBoundaryAspect, IInstanceScopedAspect
    {
        private IDictionary methodCache = new Dictionary();

        // previous code
        [...]

        #region IInstanceScopedAspect implementation
        object IInstanceScopedAspect.CreateInstance(AdviceArgs adviceArgs)
        {
            var result = this.MemberwiseClone() as CacheResultAttribute;
            result.methodCache = new Dictionary();
            return result;
        }

        void IInstanceScopedAspect.RuntimeInitializeInstance()
        {
        }
        #endregion IInstanceScopedAspect implementation
    }
As you can see, we totally removed the necessity of storing anything static. Since the aspect lives as long as the object, and for static methods statically, no more implementation is required.
Also with this our methods get shorter as well, since we need less dictionary key lookups.
        public override void OnEntry(PostSharp.Aspects.MethodExecutionArgs args)
        {
            if (this.cacheLocation != CacheLocations.Static)
                throw new NotImplementedException("Only static cache location is implemented for method return cache");
            var item = args.Instance ?? args.Method.ReflectedType;
            object argsKey = args.Arguments.Count == 0 ? string.Empty : tupleCreator(args.Method.GetParameters().Select(x => x.ParameterType).ToArray(), args.Arguments.ToArray());
            if (methodCache.ContainsKey(argsKey))
            {
                args.ReturnValue = methodCache[argsKey];
                args.FlowBehavior = PostSharp.Aspects.FlowBehavior.Return;
            }

            base.OnEntry(args);
        }

        public override void OnExit(PostSharp.Aspects.MethodExecutionArgs args)
        {
            if (this.cacheLocation != CacheLocations.Static)
                throw new NotImplementedException("Only static cache location is implemented for method return cache");
            var item = args.Instance ?? args.Method.ReflectedType;
            object argsKey = args.Arguments.Count == 0 ? string.Empty : tupleCreator(args.Method.GetParameters().Select(x => x.ParameterType).ToArray(), args.Arguments.ToArray());
            methodCache.Add(argsKey, args.ReturnValue);

            base.OnExit(args);
        }
Great, now running our unit tests, we get them all passed. Amazing! First problem resolved, thank you PostSharp.

Reference to arguments

Now the other problem we are having, is hard references to arguments. Lets create a new TestClass2, with a method that takes TestClass as an argument.
        public class TestClass2
        {
            public int NrOfExecutions = 0;
            [CacheResultAttribute.CacheResult(CacheLocations.Static)]
            public int DoSomething(TestClass tc)
            {
                NrOfExecutions++;
                return 4;
            }
        }
Now we want to make sure, that the caching does not hold a reference to our argument object. Lets write a test:
        [TestMethod]
        public void TestReferenceToArgumentsNotRemembered()
        {
            var ta = new TestClass();
            var t2 = new TestClass2();
            Assert.AreEqual(0, t2.NrOfExecutions);
            t2.DoSomething(ta);
            Assert.AreEqual(1, t2.NrOfExecutions);
            t2.DoSomething(ta);
            Assert.AreEqual(1, t2.NrOfExecutions);
            var wr = new WeakReference(ta);
            ta = null;
            GC.Collect();
            Assert.AreEqual(null, wr.Target);
        }
Here we simply create the instance ta, then use it to call the method. Then create a WeakReference to it. WeakReference is a builtin framework class, that allows an optional reference to an object, as long as the object has a reference somewhere else. But WeakReference does not block the garbage collection on that object.
We remove our reference to ta, and we force a garbage collection, which should normally result in the ta to be garbage collected. If you run this test, you will see it fail, as the Dictionary inside our attribute has a strong reference to ta, and prevent the GC from collecting it.

The help again comes from outside. The framework provides a class with similar functionality as WeakReference, just in a table like form. It is called ConditionalWeakTable.
Lets try to use this class instead of the dictionary we had.
Well, in reality, we were using Tuple<> to wrap the arguments, and it is actually the tuple that has the reference to the object, and the dictionary has a reference to the tuple. We will rewrite our dictionary to use the keys directly, and use the ConditionalWeakTable class. We will need to handle the argumentless case separately, but otherwise we will store a ConditionalWeakTable for each argument, so an arg list of (string, int, Type) will be ConditionalWeakTable<object, ConditionalWeakTable<object, ConditionalWeakTable<object, object>>>. As you can see we store everything as objects. This might cause a problem, but lets see.
        /// <summary>
        /// the result in case there are no arguments
        /// </summary>
        private object noArgResult;
        /// <summary>
        /// whether the no argument result has been calculated
        /// this is required, as the result can be null
        /// </summary>
        private bool isNoArgResultInitialized = false;

        [NonSerialized]
        private ConditionalWeakTable<object, object> methodCache = new ConditionalWeakTable<object, object>();

        //previous code
        [...]

        public override void OnEntry(PostSharp.Aspects.MethodExecutionArgs args)
        {
            if (this.cacheLocation != CacheLocations.Static)
                throw new NotImplementedException("Only static cache location is implemented for method return cache");

            if (args.Arguments.Count == 0)
            {
                if (isNoArgResultInitialized)
                {
                    args.ReturnValue = noArgResult;
                    args.FlowBehavior = FlowBehavior.Return;
                }
            }
            else
            {
                var types = args.Method.GetParameters().Select(x => x.ParameterType).ToList();
                bool isMissingKey = false;
                var resultDictionary = this.methodCache;
                for (int i = 0; i < args.Arguments.Count - 1; i++)
                {
                    var arg = args.Arguments[i];
                    object result;
                    if (resultDictionary.TryGetValue(arg, out result))
                    {
                        resultDictionary = result as ConditionalWeakTable<object, object>;
                    }
                    else
                    {
                        isMissingKey = true;
                        break;
                    }
                }
                object finalresult;
                if (!isMissingKey && resultDictionary.TryGetValue(args.Arguments.Last(), out finalresult))
                {
                    args.ReturnValue = finalresult;
                    args.FlowBehavior = FlowBehavior.Return;
                }

            }

            base.OnEntry(args);
        }

        public override void OnExit(PostSharp.Aspects.MethodExecutionArgs args)
        {
            if (this.cacheLocation != CacheLocations.Static)
                throw new NotImplementedException("Only static cache location is implemented for method return cache");

            if (args.Arguments.Count == 0)
            {
                isNoArgResultInitialized = true;
                noArgResult = args.ReturnValue;
            }
            else
            {
                var table = methodCache;
                for (int i = 0; i < args.Arguments.Count - 1; i++)
                {
                    var arg = args.Arguments[i];
                    object nextTable;
                    if (table.TryGetValue(arg, out nextTable))
                    {
                        table = nextTable as ConditionalWeakTable<object, object>;
                    }
                    else
                    {
                        var nextTable2 = new ConditionalWeakTable<object, object>();
                        table.Add(arg, nextTable2);
                        table = nextTable2;
                    }
                }
                table.Add(args.Arguments.Last(), args.ReturnValue);
            }

            base.OnExit(args);
        }
Well, looks a bit more complicated, but it is not that much. We simply go through the arguments, and if another argument found, we add a new ConditionalWeaktable.

Cool, lets run our unit tests, and .... BANG. The last one passes, but the previoud ones all fail. Whaaaaat? What just happened?
To understand what happened, we need to have a look at the ConditionalWeakTable documentation, and see the note at the middle of the page: "ConditionalWeakTable class supports one attached value per managed object. Two keys are equal if passing them to the Object.ReferenceEquals method returns true". Hmm, ok but we didnt really passed in objects in our failed tests. We passed in and integer of 3, why did it fail?
Well the answer is boxing. C# allows boxing of any value into an obejct. This allows you to add an integer to an object list, like
var list = new List<object>();
int i = 8;
list.Add(i);
Here the compiler is actually boxing the value 8 to an object, and add it to the list.
Combining this with the fact, that the ConditionalWeakTable checks for object.ReferenceEquals, we now understand what happened. We have 2 different object, which both are the same integer, but do not say true to object.ReferenceEquals.
A quick test confirms it.

        [TestMethod]
        public void Test()
        {
            int i = 6;
            object a = i;
            object b = i;
            Assert.IsFalse(Object.ReferenceEquals(a, b));
        }
So now what? What we really want to, is not to have memory used up when it will not be required anymore. How do we know it is not required anymore? Well for an object it is easy, if no references exist to it, we dont need it. But with a number? or a string? Is it even our responsability to care about this?
Well, i would argue about this. The functionality we are trying to achieve, is to cache method results. Now if you cache a result for the number 3, do you want to hold on to that result forever? Well, with our intention, of having the cache static, it is probably what you want. So for simple types, it is probably not an issue to have them stored in the dictionary directly. The problem might arise, when you start to use quite complex tructs as arguments, as those will not be garbage collected ever. This is actually a limitation, that we will just accept, as for the time being, it is good enough.
So for the not reference types, we will just use a dictionary, and we will use the ConditionalWeakReference for the rest.
Implementing this gives us the code:

        public override void OnEntry(PostSharp.Aspects.MethodExecutionArgs args)
        {
            if (this.cacheLocation != CacheLocations.Static)
                throw new NotImplementedException("Only static cache location is implemented for method return cache");

            if (!this.methodCache.ContainsKey(args.Method))
            {
                this.methodCache.Add(args.Method, null);
            }

            if (args.Arguments.Count == 0)
            {
                if (isNoArgResultInitialized)
                {
                    args.ReturnValue = methodCache[args.Method];
                    args.FlowBehavior = FlowBehavior.Return;
                }
            }
            else
            {
                var types = args.Method.GetParameters().Select(x => x.ParameterType).ToList();
                bool isMissingKey = false;
                object resultDictionary = this.methodCache;
                if (resultDictionary != null)
                {
                    for (int i = 0; i < args.Arguments.Count; i++)
                    {
                        object result;
                        var arg = args.Arguments[i];
                        if (!types[i].IsValueType)
                        {
                            if ((resultDictionary as ConditionalWeakTable<object, object>).TryGetValue(arg, out result))
                            {
                                resultDictionary = result;
                            }
                            else
                            {
                                isMissingKey = true;
                                break;
                            }
                        }
                        else
                        {
                            if ((resultDictionary as IDictionary).Contains(arg))
                            {
                                resultDictionary = (resultDictionary as IDictionary)[arg];
                            }
                            else
                            {
                                isMissingKey = true;
                                break;
                            }
                        }

                    }
                    if (!isMissingKey)
                    {
                        args.ReturnValue = resultDictionary;
                        args.FlowBehavior = FlowBehavior.Return;
                    }
                }
            }

            base.OnEntry(args);
        }

        public override void OnExit(PostSharp.Aspects.MethodExecutionArgs args)
        {
            if (this.cacheLocation != CacheLocations.Static)
                throw new NotImplementedException("Only static cache location is implemented for method return cache");

            if (args.Arguments.Count == 0)
            {
                isNoArgResultInitialized = true;
                methodCache[args.Method] = args.ReturnValue;
            }
            else
            {
                var types = args.Method.GetParameters().Select(x => x.ParameterType).ToList();
                Func<object> getter = () => methodCache;
                Action<object> setter = o => methodCache = o;
                object table = getter();
                for (int i = 0; i < args.Arguments.Count; i++)
                {
                    var arg = args.Arguments[i];
                    if (!types[i].IsValueType)
                    {
                        if (getter() == null)
                        {
                            setter(new ConditionalWeakTable<object, object>());
                        }
                        var currenttable = getter();

                        getter = () =>
                        {
                            object temp;
                            (currenttable as ConditionalWeakTable<object, object>).TryGetValue(arg, out temp);
                            return temp;
                        };
                        setter = o =>
                        {
                            (currenttable as ConditionalWeakTable<object, object>).Add(arg, o);
                        };
                    }
                    else
                    {
                        if (getter() == null)
                        {
                            setter(new Hashtable());
                        }
                        var currenttable = getter();
                        getter = () => (currenttable as Hashtable)[arg];
                        setter = o =>
                        {
                            (currenttable as Hashtable).Add(arg, o);
                        };
                    }

                }
                setter(args.ReturnValue);
            }

            base.OnExit(args);
        }


Now when running our unit tests, we have all of them passed. So we managed what we set out for.

Limitations

Well, as we saw above, we might have some memory leak when using large value type objects, like complex structs as parameters. This is a VERY IMPORTANT limitaion, and you do have to be aware of this, when you use this attribute extensively.

Generics

Well what bout generic methods and generic types? Well, the above implementation does not really support them, so you have to keep that in mind. But the implementaion can be extended to support those concepts too.

Null values

Im not sure, but i think ConditionalWeakTable does not support null values, so an extra implementation is required to handle those too.

Another limitaiton i can think of comes up, when you have a reference type parameter after a value type parameter in the arguments list order. For example having [CacheResult] int Method(int number, TestClass testClass) Now when we first call this with 0 and ta (where ta is a TestClass instance), then the the result is cached in a HashTable, for 0 it has a value of a ConditionalWeakTable<object, object>, that has a weak reference to ta.
Now when ta goes out of scope, the ConditionalWeakTable will loose the reference to it, but the HashTable will still have in it an empty ConditionalWeakTable object. This is obviously some memory loss, even though it is less then having a reference to the class itself, if you use this attribute extensively, you should consider addressing that issue as well.

If you are curious how to remove some of these limitations, please follow onto the next post.

No comments:

Post a Comment