Showing posts with label System.XML.Serialization. Show all posts

Friday, December 21, 2012

XML (de)Serialization - A list of a base object, containing a mix of derived objects.

So here's the problem. I've got an XML file containing a list of basic shapes I need to draw in my application. I've broken the shapes down to different classes, but stuck them in a single XML list.

Here's an example XML:
<Device>
  <Shape>       
    <Line Colour="blue">
      <Point x="29" y="55"/>
      <Point x="43" y="55"/>
    </Line>
    <Ellipse Colour="yellow">
      <Point x="44" y="50"/>
      <Point x="53" y="59"/>
    </Ellipse>
    <Triangle Colour="red">       
      <Point x="1456" y="191"/>
      <Point x="1456" y="201"/>
      <Point x="1465" y="201"/>
    </Triangle>
  </Shape>
</Device>
Each device has a shape which is a list of different drawing objects. The simple way to do this would be to have a list for each object type (Line, Ellipse, Triangle) but that's not what I wanted. The order of the XML is also the order of drawing on the screen, so I wanted these to remain in a single list as a grouping of objects, derived from a simple object class.

Just being lazy and using [XmlElement] on the Shape list in a C# class did not work, so I had to go deeper. First, let's have a look at my objects. I defined my own Point class, instead of using System.Drawing.Point, just so they could be represented as attributes in my XML (a design decision).
    public sealed class Point
    {
        [XmlAttribute]
        public int x
        {
            get;
            set;
        }
        [XmlAttribute]
        public int y
        {
            get;
            set;
        }
    }
I then created a base drawing object, with a colour and a list of points. Because the number of points differs for each derived object, I mark the Point array in the base class with [XmlIgnore] so the XML Serialiser skips it there.
    public class DrawingObject
    {
        [XmlAttribute]
        public string Colour
        {
            get;
            set;
        }
        [XmlIgnore]
        public Point[] Points;
    }   
Now, I derive each specific object from this base class. To set the size of the Point array, I use a private field and hide the base class's Points field with a property of the same name, declared with the 'new' keyword. The [XmlElement] attribute is defined in these derived classes for the Serialiser (and yes, I realise the Ellipse is the same as a Line, but there's other code I removed for this example; it still shows several different derived classes).
    public sealed class Line : DrawingObject
    {
        private Point[] _points = new Point[2];
        [XmlElement("Point")]
        new public Point[] Points
        {
            get
            {
                return _points;
            }
            set
            {
                _points = value;
            }
        }
    }
    public sealed class Triangle : DrawingObject
    {
        private Point[] _points = new Point[3];
        [XmlElement("Point")]
        new public Point[] Points
        {
            get
            {
                return _points;
            }
            set
            {
                _points = value;
            }
        }
    }   
    public sealed class Ellipse : DrawingObject
    {
        private Point[] _points = new Point[2];
        [XmlElement("Point")]
        new public Point[] Points
        {
            get
            {
                return _points;
            }
            set
            {
                _points = value;
            }
        }
    }
Very good. Now, let's make a list of the base object and get the XML Serialiser to use the different element names (Line, Triangle, Ellipse) within the single list. This is where the class definition starts to look a little different. To get this to work, I used XmlChoiceIdentifierAttribute, which needs an enumeration that is itself left out of the XML. The XML Serialiser then uses this to keep track of which element name each item in the list has (http://msdn.microsoft.com/en-us/library/system.xml.serialization.xmlchoiceidentifierattribute%28v=vs.100%29.aspx).

So we define a public enumeration of the different object types:
    [XmlType(IncludeInSchema = false)]
    public enum ShapeChoiceType
    {
        Line,
        Triangle,
        Ellipse
    }
Then in our serialized Shape class we add a collection of this enumeration, so it can be matched item by item with the list being serialised. But we get the Serialiser to ignore it.
    // Do not serialize this next field:
    [XmlIgnore]
    public List<ShapeChoiceType> ItemType;
Finally we add the List! We have to use the XmlChoiceIdentifier, pointing to our List of ItemTypes, to help cast the objects. In our XmlElement definition, we specify the name of each object type, as well as what the C# type will be.
    [XmlElement("Line", typeof(Line))]
    [XmlElement("Triangle", typeof(Triangle))]
    [XmlElement("Ellipse", typeof(Ellipse))]
    [XmlChoiceIdentifier("ItemType")]
    public List<DrawingObject> DrawingObjects
    {
        get;
        set;
    }
This all builds fine! But the first time we try to deserialise the XML in the application, we get an error! Oh dear. The XmlSerializer generates and compiles its serialisation classes at run time, so a problem like this only shows up when the program runs.
    System.InvalidOperationException was caught
      Message=Unable to generate a temporary class (result=1).
    error CS1061: 'System.Collections.Generic.List' does not contain a definition for 'Length' and no extension method 'Length' accepting a first argument of type 'System.Collections.Generic.List' could be found (are you missing a using directive or an assembly reference?)
So, what does this mean? For reasons I'm not going into, I use List<T> for my collections. However, List<T> does not work with the XmlChoiceIdentifier: the generated code asks for a Length property, which an array has and a List<T> does not. This Microsoft bug report (http://connect.microsoft.com/VisualStudio/feedback/details/681487/xmlserializer-consider-that-an-element-adorned-with-xmlchoiceidentifier-could-be-an-ienumerable-or-an-icollection-but-code-generation-fail) shows that, by design, it needs to be an array. So, let's change it to arrays. And hey presto, it works!
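As far as I know, the choice identifier is only strictly needed when two element names map to the same C# type. Since each element here has its own class, a version that keeps List<T> and drops the enumeration may also be worth trying. The following is a minimal sketch under that assumption, not the approach used in this post. It also simplifies the classes by serialising Points directly on the base class, and it uses single quotes in the XML so it fits in a C# string.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public sealed class Point
{
    [XmlAttribute] public int x { get; set; }
    [XmlAttribute] public int y { get; set; }
}

// Simplification for this sketch: Points is serialised on the base class.
public class DrawingObject
{
    [XmlAttribute] public string Colour { get; set; }
    [XmlElement("Point")] public Point[] Points { get; set; }
}

public sealed class Line : DrawingObject { }
public sealed class Triangle : DrawingObject { }
public sealed class Ellipse : DrawingObject { }

public sealed class Shape
{
    // One XmlElement per element name; each name maps to a distinct type,
    // so no XmlChoiceIdentifier is applied here.
    [XmlElement("Line", typeof(Line))]
    [XmlElement("Triangle", typeof(Triangle))]
    [XmlElement("Ellipse", typeof(Ellipse))]
    public List<DrawingObject> DrawingObjects { get; set; }
}

public sealed class Device
{
    [XmlElement("Shape")]
    public List<Shape> Shapes { get; set; }
}

public static class Program
{
    public static void Main()
    {
        const string xml = @"<Device>
  <Shape>
    <Line Colour='blue'><Point x='29' y='55'/><Point x='43' y='55'/></Line>
    <Ellipse Colour='yellow'><Point x='44' y='50'/><Point x='53' y='59'/></Ellipse>
    <Triangle Colour='red'><Point x='1456' y='191'/><Point x='1456' y='201'/><Point x='1465' y='201'/></Triangle>
  </Shape>
</Device>";

        var serializer = new XmlSerializer(typeof(Device));
        Device device;
        using (var reader = new StringReader(xml))
        {
            device = (Device)serializer.Deserialize(reader);
        }

        // The list keeps the order of the elements in the XML.
        foreach (DrawingObject obj in device.Shapes[0].DrawingObjects)
        {
            Console.WriteLine("{0} ({1}): {2} points",
                obj.GetType().Name, obj.Colour, obj.Points.Length);
        }
    }
}
```

The trade-off is that the element name is recovered from the runtime type of each object rather than from a separate enumeration array.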

Final class definitions below!

    public sealed class Point
    {
        [XmlAttribute]
        public int x
        {
            get;
            set;
        }
        [XmlAttribute]
        public int y
        {
            get;
            set;
        }
    }
   
    public class DrawingObject
    {
        [XmlAttribute]
        public string Colour
        {
            get;
            set;
        }
        [XmlIgnore]
        public Point[] Points;
    }
   
    public sealed class Line : DrawingObject
    {
        private Point[] _points = new Point[2];
        [XmlElement("Point")]
        new public Point[] Points
        {
            get
            {
                return _points;
            }
            set
            {
                _points = value;
            }
        }
    }

    public sealed class Triangle : DrawingObject
    {
        private Point[] _points = new Point[3];
        [XmlElement("Point")]
        new public Point[] Points
        {
            get
            {
                return _points;
            }
            set
            {
                _points = value;
            }
        }
    }
   
    public sealed class Ellipse : DrawingObject
    {
        private Point[] _points = new Point[2];
        [XmlElement("Point")]
        new public Point[] Points
        {
            get
            {
                return _points;
            }
            set
            {
                _points = value;
            }
        }
    }
   
    [XmlType(IncludeInSchema = false)]
    public enum ShapeChoiceType
    {
        Line,
        Triangle,
        Ellipse
    }
   
    public sealed class Shape
    {
        [XmlElement("Line", typeof(Line))]
        [XmlElement("Triangle", typeof(Triangle))]
        [XmlElement("Ellipse", typeof(Ellipse))]
        [XmlChoiceIdentifier("ItemType")]
        public DrawingObject[] DrawingObjects
        {
            get;
            set;
        }

        // Do not serialize this next field:
        [XmlIgnore]
        public ShapeChoiceType[] ItemType;
    }

    public sealed class Device
    {
        [XmlElement("Shape")]
        public List<Shape> Shapes
        {
            get;
            set;
        }
    }
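
To try the array-based approach end to end, here is a minimal sketch that deserialises the example XML and walks the list in drawing order. It is a condensed variant of the classes above rather than a copy: Points is serialised directly on the base class to keep the sketch short, and the XML uses single quotes so it fits in a C# string. The null check on ItemType is only a precaution in case the serialiser leaves it unset.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public sealed class Point
{
    [XmlAttribute] public int x { get; set; }
    [XmlAttribute] public int y { get; set; }
}

// Simplification for this sketch: Points is serialised on the base class.
public class DrawingObject
{
    [XmlAttribute] public string Colour { get; set; }
    [XmlElement("Point")] public Point[] Points { get; set; }
}

public sealed class Line : DrawingObject { }
public sealed class Triangle : DrawingObject { }
public sealed class Ellipse : DrawingObject { }

[XmlType(IncludeInSchema = false)]
public enum ShapeChoiceType { Line, Triangle, Ellipse }

public sealed class Shape
{
    [XmlElement("Line", typeof(Line))]
    [XmlElement("Triangle", typeof(Triangle))]
    [XmlElement("Ellipse", typeof(Ellipse))]
    [XmlChoiceIdentifier("ItemType")]
    public DrawingObject[] DrawingObjects { get; set; }

    // Parallel array naming the element used for each drawing object.
    [XmlIgnore]
    public ShapeChoiceType[] ItemType;
}

public sealed class Device
{
    [XmlElement("Shape")]
    public List<Shape> Shapes { get; set; }
}

public static class Program
{
    public static void Main()
    {
        const string xml = @"<Device>
  <Shape>
    <Line Colour='blue'><Point x='29' y='55'/><Point x='43' y='55'/></Line>
    <Ellipse Colour='yellow'><Point x='44' y='50'/><Point x='53' y='59'/></Ellipse>
    <Triangle Colour='red'><Point x='1456' y='191'/><Point x='1456' y='201'/><Point x='1465' y='201'/></Triangle>
  </Shape>
</Device>";

        var serializer = new XmlSerializer(typeof(Device));
        Device device;
        using (var reader = new StringReader(xml))
        {
            device = (Device)serializer.Deserialize(reader);
        }

        Shape shape = device.Shapes[0];
        for (int i = 0; i < shape.DrawingObjects.Length; i++)
        {
            DrawingObject obj = shape.DrawingObjects[i];
            string choice = shape.ItemType != null ? shape.ItemType[i].ToString() : "(not set)";
            Console.WriteLine("{0}: {1}, colour {2}, {3} points",
                choice, obj.GetType().Name, obj.Colour, obj.Points.Length);
        }
    }
}
```

When serialising in the other direction, ItemType has to be filled in with one entry per drawing object before calling Serialize().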

Monday, September 20, 2010

C# Serialization problem

The core of the messaging program I am working on right now involves messages being encapsulated in XML. .NET makes this really easy with its System.Xml.Serialization classes. You just have to create a class, populate it, and feed it into an XmlSerializer:
XmlSerializer serializer
Then by calling the Serialize() method you can turn it into XML data. Simple! Of course there's lots of customisation you can do with it, and then it becomes a powerful tool. However, this entry is not discussing the basics; just do a web search and you can find hundreds of great examples.
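
For reference, a round trip can be sketched in a few lines. The ChatMessage class and its properties are invented for this illustration and are not the message classes from my program.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical message class, made up for this sketch.
public class ChatMessage
{
    public string Sender { get; set; }
    public string Body { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var message = new ChatMessage { Sender = "alice", Body = "Hello" };
        var serializer = new XmlSerializer(typeof(ChatMessage));

        // Object -> XML text.
        string xml;
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, message);
            xml = writer.ToString();
        }
        Console.WriteLine(xml);

        // XML text -> object.
        ChatMessage copy;
        using (var reader = new StringReader(xml))
        {
            copy = (ChatMessage)serializer.Deserialize(reader);
        }
        Console.WriteLine(copy.Sender + ": " + copy.Body);
    }
}
```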

My problem came from serializing my message classes. I had one base class, with lots of derived classes. The very first time I ran either Serialize() or Deserialize() in my program, it would take up to 15 seconds to complete. But only for the first serialize/deserialize. Every subsequent one was quick.

A bit of research showed that .NET generates the XML serialization assemblies at run time, only when they are needed. With the inheritance across my many classes, building the assembly took a lot of reflection and recursion. And since I had it in my startup, my program would take 15 seconds to start up! Not acceptable.
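
One way to see the effect for yourself is to time the first and second use of a serializer for the same type. This is a rough sketch with two made-up message classes (BaseMessage and TextMessage are assumptions, not my real classes). With only two classes the gap will be far smaller than my 15 seconds, and the actual numbers depend on the machine and the runtime.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Xml.Serialization;

// Hypothetical classes for this sketch. XmlInclude lets a serializer created
// for the base type handle the derived type as well.
[XmlInclude(typeof(TextMessage))]
public class BaseMessage { public string Id { get; set; } }
public class TextMessage : BaseMessage { public string Text { get; set; } }

public static class Program
{
    // Creates a serializer and serializes one message, returning elapsed ms.
    static long TimeSerialize(BaseMessage message)
    {
        Stopwatch watch = Stopwatch.StartNew();
        var serializer = new XmlSerializer(typeof(BaseMessage));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, message);
        }
        watch.Stop();
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        var message = new TextMessage { Id = "1", Text = "Hello" };
        Console.WriteLine("First call:  {0} ms", TimeSerialize(message));
        Console.WriteLine("Second call: {0} ms", TimeSerialize(message));
    }
}
```

The sketch uses the XmlSerializer(Type) constructor on purpose. As far as I know, the generated code is cached for that constructor, which is why the second call should be quick.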

It turns out there is a way to precompile these custom serialization assemblies, but it requires the construction of an additional DLL.

Since .NET 2.0, Microsoft has included a little executable in its SDK to do it. It lives in the SDK folder, so the path on your computer may differ from the one in my example. The program is called SGen. It looks into any of your assemblies and builds the serialization assemblies required. To use it, run the following command:
"C:\Program Files\Microsoft SDKs\Windows\v7.0A\bin\NETFX 4.0 tools\sgen.exe" /f /v /a:
The flags are:
  1. /f = forces a rewrite of the file
  2. /v = verbose output, it's handy for figuring out problems with the serialisation
  3. /a: = the assembly to generate the serialization assembly for. Follow it with the path to your .exe or .dll file.
Running this will create a file with the assembly name followed by .XmlSerializers.dll. Just place this DLL into the executable folder and it will be picked up automatically. At run time, the program checks for this file and no longer has to waste my 15 seconds compiling the serialisation code!
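
To check whether the pre-generated DLL is actually being picked up, one option is to list the loaded assemblies after creating a serializer. This is only a diagnostic sketch. The Message class is hypothetical, and the check relies on my assumption that only the SGen output has a name ending in .XmlSerializers.

```csharp
using System;
using System.Reflection;
using System.Xml.Serialization;

// Hypothetical class for this sketch; use one of your own serialized types.
public class Message
{
    public string Text { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Creating the serializer is what triggers the lookup (or the
        // run-time generation) of the serialization code.
        var serializer = new XmlSerializer(typeof(Message));

        bool found = false;
        foreach (Assembly assembly in AppDomain.CurrentDomain.GetAssemblies())
        {
            string name = assembly.GetName().Name;
            if (name.EndsWith(".XmlSerializers", StringComparison.OrdinalIgnoreCase))
            {
                Console.WriteLine("Pre-generated serializer assembly loaded: " + name);
                found = true;
            }
        }

        if (!found)
        {
            Console.WriteLine("No *.XmlSerializers assembly is loaded.");
        }
    }
}
```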

If you want to be tricky, move it into your post-build command: