Making the Most of .NET XML Serialization

.NET’s XML Serialization Framework is a handy way to convert your objects to and from XML, without having to deal with the nitty-gritty of working with XML structures, e.g. through DOM with XmlDocument, or the stream-based XmlReader and XmlWriter classes. Annotations can be placed on classes and properties using .NET attributes such as [XmlElement] in order to control the serialization (XML name and namespace, whether a property represents an XML element or an XML attribute, etc.).

The downside of the framework is that whilst the easy things are pretty easy, more sophisticated mappings between the XML and the object world can get tricky (this is in part due to mapping XML and objects being a complicated problem in general). In sufficiently complex scenarios, you will be better off using other frameworks, or just doing the XML conversions the “old-fashioned” way using regular XML libraries. However, if your needs are just a little outside what the XML Serialization Framework does well, or just a minority of your types are hard to serialize, there are some work-arounds you can employ.

Nullable Value Types for Attributes

This is one of the most commonly encountered inconveniences of the XML Serialization Framework. Basic types like strings, integers and DateTime objects can be [de]serialized just fine, but only reference types like string implicitly have the concept of being optional (use="optional" in W3C XML Schema), because reference types can be null. Value types can be made nullable using the Nullable<T> generic struct (or the “?” syntax), but the XML Serialization Framework doesn’t support nullable value types for attributes (strangely, it does for elements). You then have two choices: either have a default value (possibly set in your class constructor), or provide a proxy for serialization. With a proxy, the property is seen as nullable by the rest of your application, but gets serialized and deserialized through a separate proxy property that the XML Serialization Framework understands.

You’ll see in this example that the int? IntegerAttribute property is annotated with [XmlIgnore]. This hides it from the XML Serialization Framework (if you don’t, it will fail to construct the XmlSerializer). There is the non-nullable proxy property XmlProxy_IntegerAttribute, which the framework uses to do the serialization/deserialization of the integer attribute, and an additional property XmlProxy_IntegerAttributeSpecified, which the framework uses to determine (when serializing) or to flag (when deserializing) whether the attribute is actually present. As the proxy property has a different name from the desired element or attribute name, you will have to specify its name explicitly using an [XmlAttribute] annotation.

Note that the *Specified proxy property is also annotated with [XmlIgnore]. Both proxy properties carry the [EditorBrowsable(EditorBrowsableState.Never)] annotation, in order to hide them from IntelliSense. This is optional, and it only takes effect for classes in another assembly; you still see the proxy properties in IntelliSense if the classes are in the project you’re working in.

It is possible to do the proxying in the opposite direction from this example, i.e. the XML proxy properties could compute their values from the non-proxy nullable property, rather than the nullable property doing the computation, but in this specific case it’s a little less messy to use the proxy properties for the actual data storage.
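
As a minimal sketch of the nullable-attribute proxy described in this section, the class could look something like the following. The class name Example is hypothetical; the property names follow the text, and the proxy properties hold the actual data.

```csharp
using System.ComponentModel;
using System.Xml.Serialization;

// "Example" is a hypothetical class name used only for illustration.
public class Example
{
    // Application-facing nullable property, hidden from the serializer.
    [XmlIgnore]
    public int? IntegerAttribute
    {
        get
        {
            return XmlProxy_IntegerAttributeSpecified
                ? XmlProxy_IntegerAttribute
                : (int?)null;
        }
        set
        {
            XmlProxy_IntegerAttributeSpecified = value.HasValue;
            XmlProxy_IntegerAttribute = value.GetValueOrDefault();
        }
    }

    // Non-nullable proxy that the framework actually reads and writes.
    // The XML name is given explicitly because the property name differs.
    [XmlAttribute("IntegerAttribute")]
    [EditorBrowsable(EditorBrowsableState.Never)]
    public int XmlProxy_IntegerAttribute { get; set; }

    // The framework checks this when serializing and sets it when
    // deserializing, based on the "<PropertyName>Specified" naming pattern.
    [XmlIgnore]
    [EditorBrowsable(EditorBrowsableState.Never)]
    public bool XmlProxy_IntegerAttributeSpecified { get; set; }
}
```

Serializing an instance with IntegerAttribute set to 42 should write an IntegerAttribute="42" attribute on the element, while leaving it null should omit the attribute altogether.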

XML Proxy Type

The proxy concept with the nullable value type can be used more generally, where you have one type which you want to use in your application-facing class API, but it either is not serializable by the framework, or it is serializable but not into the XML structure you want.

For an XML attribute or for a single XML element, this can be done with a straightforward translation in either the proxy or proxied property.

Here, the XmlProxy_NonSerializableType class is an XML-serializable class, which is used as an intermediary when serializing or deserializing the data of the NonSerializableType class.
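
A sketch of how such a single-element proxy might be wired up is shown below. The type names NonSerializableType and XmlProxy_NonSerializableType come from the text; their members, the Container class and the XML names are assumptions made for illustration. Here the application-facing property stores the data and the proxy property does the translation.

```csharp
using System.ComponentModel;
using System.Xml.Serialization;

// Assumed shape: no parameterless constructor and no public setters,
// so XmlSerializer cannot handle this type directly.
public class NonSerializableType
{
    public NonSerializableType(string name, int value)
    {
        Name = name;
        Value = value;
    }

    public string Name { get; private set; }
    public int Value { get; private set; }
}

// Serializable intermediary carrying the same data.
public class XmlProxy_NonSerializableType
{
    [XmlAttribute("name")]
    public string Name { get; set; }

    [XmlAttribute("value")]
    public int Value { get; set; }
}

// Hypothetical containing class.
public class Container
{
    // What the rest of the application uses.
    [XmlIgnore]
    public NonSerializableType Item { get; set; }

    // What the framework uses; translates on every get and set.
    [XmlElement("Item")]
    [EditorBrowsable(EditorBrowsableState.Never)]
    public XmlProxy_NonSerializableType XmlProxy_Item
    {
        get
        {
            return Item == null
                ? null
                : new XmlProxy_NonSerializableType { Name = Item.Name, Value = Item.Value };
        }
        set
        {
            Item = value == null
                ? null
                : new NonSerializableType(value.Name, value.Value);
        }
    }
}
```

This works for a single element because the framework builds the whole proxy object before assigning it to the property, so the setter always sees complete data.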

Collections of Proxy Types

For a collection, proxying gets a bit trickier, but it can still be achieved. A collection would be used when an element can appear more than once (for example, maxOccurs="unbounded" in your schema).

When deserializing, .NET gets the collection object from the property and then calls Add on it to fill it in, rather than setting the collection property in one fell swoop with the full collection of deserialized objects. The class therefore needs to know whether it is currently working in XML-proxy mode (the framework is serializing or deserializing, so a collection of the proxy types is needed) or in application-facing non-proxy mode (a collection of the “normal” non-proxy types is needed). This can be done by having two private fields, one backing the application-facing property, and one backing the proxy property.

The implementation of the get and set methods on the properties will translate and change “mode” on-the-fly if necessary. For example, if the .NET XML serialization framework has filled in the proxy collection during deserialization, and the application-facing collection is then retrieved, the get method will automatically translate from the proxy to the non-proxy collection: the field for the non-proxy collection gets the translated values, and the field for the proxy collection is then set to null. Successive attempts to retrieve the collection from the application-facing non-proxy property will not have to do any translation, so the translation expense is incurred only when “changing” mode. The key feature is that only one of the two modes, proxy or non-proxy, has “ownership” of the data at any time, and that this is enforced by the implementation of all the get and set methods.
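
The two-field, mode-switching idea might be sketched as follows. All type and member names here (Point2D, XmlProxy_Point2D, Shape) are hypothetical, and the sketch assumes that callers do not hold on to a list obtained from one property while the other property is in use.

```csharp
using System.Collections.Generic;
using System.ComponentModel;
using System.Linq;
using System.Xml.Serialization;

// Hypothetical application type that XmlSerializer cannot handle directly.
public class Point2D
{
    public Point2D(int x, int y) { X = x; Y = y; }
    public int X { get; private set; }
    public int Y { get; private set; }
}

// Hypothetical serializable proxy for Point2D.
public class XmlProxy_Point2D
{
    [XmlAttribute("x")] public int X { get; set; }
    [XmlAttribute("y")] public int Y { get; set; }
}

public class Shape
{
    // Exactly one of these two fields is non-null at any time.
    private List<Point2D> points = new List<Point2D>();
    private List<XmlProxy_Point2D> proxyPoints;

    [XmlIgnore]
    public List<Point2D> Points
    {
        get
        {
            if (proxyPoints != null)
            {
                // Switch from proxy mode to application mode.
                points = proxyPoints.Select(p => new Point2D(p.X, p.Y)).ToList();
                proxyPoints = null;
            }
            return points;
        }
        set
        {
            points = value ?? new List<Point2D>();
            proxyPoints = null;
        }
    }

    [XmlElement("Point")]
    [EditorBrowsable(EditorBrowsableState.Never)]
    public List<XmlProxy_Point2D> XmlProxy_Points
    {
        get
        {
            if (proxyPoints == null)
            {
                // Switch from application mode to proxy mode.
                proxyPoints = points
                    .Select(p => new XmlProxy_Point2D { X = p.X, Y = p.Y })
                    .ToList();
                points = null;
            }
            return proxyPoints;
        }
        set
        {
            proxyPoints = value ?? new List<XmlProxy_Point2D>();
            points = null;
        }
    }
}
```

The cost of this design is that a list reference obtained before a mode change no longer reflects the object's data afterwards, which is why every access should go through the properties.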

Implementing IXmlSerializable

You can implement the IXmlSerializable interface in your class to have it do custom [de]serialization instead of the automatic framework behaviour. Given that the attraction of the XML Serialization Framework is to avoid having to deal with the XML structures directly, I’d consider this to be a last resort.  That said, if it gets you out of a hole you’ve got yourself into for one class, but the framework was quite nicely doing what you needed for many other classes, it might be just the ticket. This is a good article on the subject of implementing IXmlSerializable.
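
For a flavour of what this involves, here is a small sketch of a class that reads and writes itself as text of the form "1..5". The Range class and its text format are made up for illustration, and the reading code assumes well-formed, non-empty content.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical type that is written as <Range>min..max</Range>.
public class Range : IXmlSerializable
{
    public int Min { get; set; }
    public int Max { get; set; }

    // Returning null is the documented recommendation for this method.
    public XmlSchema GetSchema()
    {
        return null;
    }

    public void ReadXml(XmlReader reader)
    {
        // The reader starts on the wrapping element's start tag; this call
        // reads the text content and moves past the matching end tag.
        string text = reader.ReadElementContentAsString();
        string[] parts = text.Split(new[] { ".." }, StringSplitOptions.None);
        Min = XmlConvert.ToInt32(parts[0]);
        Max = XmlConvert.ToInt32(parts[1]);
    }

    public void WriteXml(XmlWriter writer)
    {
        // The framework writes the wrapping element; only the content goes here.
        writer.WriteString(XmlConvert.ToString(Min) + ".." + XmlConvert.ToString(Max));
    }
}
```

Note that ReadXml is responsible for consuming the whole element, including its end tag, whereas WriteXml writes only what goes inside the element.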

Dealing with Multiple Document Types

You may have a scenario where you have to read XML that may be in one of several possible formats, and you want to recognize the format and deserialize into the right type of object on-the-fly. Depending on the nature of the XML, there are a few possible approaches to this:

Element Name is the Class Name (Reflection Approach)

In this case, you have a different XML root element for each kind of document, and you want to deserialize using the class whose name matches the element name.

First, you can read the element name from the XML (this is delving into the XML structures, which we generally try to avoid when using the XML Serialization Framework, but as it is such a trivial interaction, and the real grunt work is still done by the framework, we’re not losing much):
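
A minimal sketch of this step, using XmlReader.MoveToContent to skip over the XML declaration, comments and whitespace. The helper class and method names are hypothetical.

```csharp
using System.IO;
using System.Xml;

// Hypothetical helper for peeking at the root element of a document.
public static class RootElementReader
{
    public static string GetRootElementName(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            // Advances to the first content node, which for a well-formed
            // document is the root element.
            reader.MoveToContent();
            return reader.LocalName;
        }
    }
}
```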

Then you could find the class with which to deserialize by reflection like this (adjust reflection as necessary if your classes are in a different assembly):
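
One way this could look is sketched below. The Invoice and Order classes and the ReflectionDeserializer helper are hypothetical, and the lookup matches on the simple class name within the helper's own assembly. With untrusted input you would probably want to restrict the candidate types further.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical document classes; the class name is the root element name.
public class Invoice { public int Id { get; set; } }
public class Order { public string Customer { get; set; } }

public static class ReflectionDeserializer
{
    public static object Deserialize(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();
            string rootName = reader.LocalName;

            // Find a class in this assembly whose simple name matches.
            Type type = typeof(ReflectionDeserializer).Assembly
                .GetTypes()
                .FirstOrDefault(t => t.IsClass && t.Name == rootName);
            if (type == null)
            {
                throw new InvalidOperationException(
                    "No class found for root element '" + rootName + "'.");
            }

            // The reader is still positioned on the root element, so the
            // serializer can carry on from here.
            return new XmlSerializer(type).Deserialize(reader);
        }
    }
}
```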

You would be advised to cache the mapping from the element name to the XmlSerializer object if performance is important.

Element Name Maps to the Class Name

This is a slight variation of the above: if the XML element names are different from the class names, or you don’t want to find the classes by reflection, you could instead create a Dictionary mapping from the element names in the XML to the XmlSerializer objects, but still look up the XML root element name to decide which serializer to use, as in the example above.
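
A sketch of the dictionary variant follows, with hypothetical document classes and element names. Each class declares its root element name with [XmlRoot] so that its serializer accepts that element, and the dictionary doubles as a cache of serializers.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical document classes whose XML names differ from the class names.
[XmlRoot("invoice")]
public class InvoiceDocument { public int Id { get; set; } }

[XmlRoot("purchase-order")]
public class OrderDocument { public string Customer { get; set; } }

public static class MappedDeserializer
{
    // Root element name -> serializer, built once and reused.
    private static readonly Dictionary<string, XmlSerializer> Serializers =
        new Dictionary<string, XmlSerializer>
        {
            { "invoice", new XmlSerializer(typeof(InvoiceDocument)) },
            { "purchase-order", new XmlSerializer(typeof(OrderDocument)) },
        };

    public static object Deserialize(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();

            XmlSerializer serializer;
            if (!Serializers.TryGetValue(reader.LocalName, out serializer))
            {
                throw new InvalidOperationException(
                    "Unknown root element '" + reader.LocalName + "'.");
            }
            return serializer.Deserialize(reader);
        }
    }
}
```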

Common Root Element with Polymorphism Through xsi:type

If your XML messages all have the same root element name, and the polymorphism is achieved by having types identified using xsi:type, then you can do something like this:
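
The following is a sketch under the assumption of a hypothetical Message base class with two derived types. The serializer is kept in a static field because serializers created with this constructor overload are not cached by the framework.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical message hierarchy sharing the root element <Message>.
public abstract class Message { public string Id { get; set; } }
public class TextMessage : Message { public string Body { get; set; } }
public class PingMessage : Message { public int Sequence { get; set; } }

public static class MessageReader
{
    // Created once and reused for every message.
    private static readonly XmlSerializer Serializer = new XmlSerializer(
        typeof(Message),
        new[] { typeof(TextMessage), typeof(PingMessage) });

    public static Message Read(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            // The xsi:type attribute on the root element selects the
            // concrete class that gets instantiated.
            return (Message)Serializer.Deserialize(reader);
        }
    }
}
```

With this in place, a document whose root element is <Message xsi:type="TextMessage"> (with the xsi prefix bound to http://www.w3.org/2001/XMLSchema-instance) should come back as a TextMessage instance.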

Note the second parameter to the XmlSerializer constructor which is an array of additional types that you want the .NET serializer to know about.

Custom Reader and Writer Settings

Finally, it’s worth remembering that not all of the behaviour of the XML Serialization Framework is controlled through the serialization code itself.

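You can configure instances of the XmlReaderSettings and XmlWriterSettings classes, pass them to XmlReader.Create and XmlWriter.Create, and then hand the resulting reader or writer to the Deserialize and Serialize methods of the XmlSerializer class respectively.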

For example, XmlReaderSettings allows control of:

  • Schema validation
  • DTD processing
  • Whitespace processing

XmlWriterSettings allows control of:

  • Indentation of elements
  • Whether the XML declaration should be written
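  • Handling of duplicate namespace declarations (the prefixes themselves are set through the XmlSerializerNamespaces object passed to Serialize)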
  • Encoding, and whether to include a BOM (Byte-Order-Mark)
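
As a sketch of how these settings reach the serializer, the settings objects go to XmlReader.Create and XmlWriter.Create, and the resulting reader or writer is then handed to XmlSerializer. The Note class, the SettingsExample helper and the particular settings chosen are only illustrative.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical type to serialize.
public class Note { public string Text { get; set; } }

public static class SettingsExample
{
    public static void Serialize<T>(T value, Stream output)
    {
        var settings = new XmlWriterSettings
        {
            Indent = true,                      // indent nested elements
            OmitXmlDeclaration = true,          // no <?xml ...?> line
            Encoding = new UTF8Encoding(false)  // UTF-8 without a BOM
        };
        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            new XmlSerializer(typeof(T)).Serialize(writer, value);
        }
    }

    public static T Deserialize<T>(string xml)
    {
        var settings = new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Prohibit, // reject documents with a DTD
            IgnoreWhitespace = true                 // skip insignificant whitespace
        };
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            return (T)new XmlSerializer(typeof(T)).Deserialize(reader);
        }
    }
}
```

The encoding setting only matters when writing to a stream; a writer created over a StringBuilder or other text target produces text rather than bytes.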