Unit testing XSD schemas

Once in a while a new task pops up that no one is really eager to work on. In my experience, on teams that do not focus on or make extensive use of Xml-related technologies, most (if not all) tasks that have anything to do with XSD schemas belong to this group. This was the case on our team recently, and I ended up being the “volunteer” since the schedule was tight and I had previously worked on the managed Xml team. So I started refreshing my rusty XSD skills and soon had something that more or less worked. It was a good starting point, but then I asked myself: “how do I test this?” I needed something lightweight that would fit in with our unit tests. I briefly searched the Internet but could not find anything suitable.

As the old saying goes, necessity is the mother of invention, so I came up with my own way of testing the schema. I like it because it consists of just 3 small helper methods (less than 30 lines in total) and one helper schema, and most of the unit tests are just 2-3 lines long. The tests also helped me come up with a better design than I originally had. Note that I don’t know if this is the “right” approach or if it would scale to bigger schemas. I only know that it worked fine for the schema I had to write.

So, let’s say we need to write a schema for Xml files that have a structure like this:

<Settings>
  <ServiceProvider Type="typeName">
    <Setting Name="Setting1" Value="Value1" />
    <Setting Name="Setting2" Value="Value2" />
  </ServiceProvider>

  <Factory Type="typeName">
    <Setting Name="Setting1" Value="Value1" />
    <Setting Name="Setting2" Value="Value2" />
    <Setting Name="Setting3" Value="Value3" />
  </Factory>
</Settings>

and that both ServiceProvider and Factory elements are optional.

First we need to create a starting schema. For new schemas I usually create a sample Xml file, open it in Visual Studio and use Xml → Create Schema. The schema Visual Studio creates is not really usable as is, but it gives me something I can iterate on. The main problem with the generated schema is that all the types are defined inline. This makes it hard to test; ideally we would like to test each type separately. Inline types lead to another problem: each element gets its own type even if the same element is used repeatedly (let alone cases where the same type is used for different elements or where inheritance is involved). The key to testing a schema is to have simple types. The simpler the type, the easier it is to test. Once a type is tested it can be used as a building block for more complicated types, and it does not need to be comprehensively tested again as part of them. For the Xml structure above we can identify three types:

  • Setting (for Setting element)
  • ServiceTypeInitializer (a common type for ServiceProvider and Factory elements)
  • Settings (for Settings element)

The problem with unit testing all these types in isolation is that the schema itself should not allow any element other than Settings as the document element. Fortunately, for testing purposes we can create a helper schema that allows document elements of types that normally cannot be document elements. We will conditionally add this helper schema to the schema set used for validating the input Xml. Why does the helper schema need to be added conditionally? The tested schema should accept only the Settings element as the document element. So, when testing the Settings element, we must leave the helper schema out of the schema set; otherwise the tests could not verify that Settings is the only element allowed as the document element. Let’s see what this looks like in practice. Here is the schema created by refactoring the initial schema created by Visual Studio:


<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified"
  xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xs:element name="Settings" type="Settings_Type" />

  <xs:complexType name="Settings_Type">
    <xs:sequence>
      <xs:element name="ServiceProvider" type="ServiceTypeInitializer_Type" minOccurs="0" maxOccurs="1" />
      <xs:element name="Factory" type="ServiceTypeInitializer_Type" minOccurs="0" maxOccurs="1" />
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="ServiceTypeInitializer_Type">
    <xs:sequence>
      <xs:element maxOccurs="unbounded" name="Setting" type="Setting_Type" />
    </xs:sequence>
    <xs:attribute name="Type" type="xs:string" use="required" />
  </xs:complexType>

  <xs:complexType name="Setting_Type">
    <xs:attribute name="Name" type="xs:string" use="required" />
    <xs:attribute name="Value" type="xs:string" use="required" />
  </xs:complexType>
</xs:schema>

Now let’s create the helper schema that will allow testing each of the types separately:


<?xml version="1.0" encoding="utf-8"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Setting" type="Setting_Type" />
  <xs:element name="ServiceTypeInitializer" type="ServiceTypeInitializer_Type" />
</xs:schema>

(In the above schema the Settings element is not present since it’s already allowed at the top level by the other schema). After creating the helper schema we need a function that will validate Xml documents against our schemas:


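// Needs: System.Collections.Generic, System.IO, System.Xml, System.Xml.Schema.
// schemaUnderTest and helperSchema are members of the test class (not shown here)
// that hold the two schemas described above.
private static IEnumerable<ValidationEventArgs> RunValidation(string inputXml, bool includeHelperSchema)
{
    var schemaSet = new XmlSchemaSet();
    schemaSet.Add(schemaUnderTest);

    // The helper schema is only added when testing types that are
    // normally not allowed as the document element.
    if (includeHelperSchema)
    {
        schemaSet.Add(helperSchema);
    }

    var readerSettings = new XmlReaderSettings()
    {
        Schemas = schemaSet,
        ValidationType = ValidationType.Schema,
        // Warnings are needed to detect documents that were not validated at all.
        ValidationFlags = XmlSchemaValidationFlags.ReportValidationWarnings,
    };

    // Collect errors and warnings instead of letting validation throw.
    var events = new List<ValidationEventArgs>();
    readerSettings.ValidationEventHandler += (s, e) => { events.Add(e); };

    using (var reader = XmlReader.Create(new StringReader(inputXml), readerSettings))
    {
        // Reading the whole document drives the validation.
        while (reader.Read())
        {
        }
    }

    return events;
}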

There are two interesting points here. First, we need to turn on reporting of validation warnings. This is because validation against an XmlSchemaSet has a nasty behavior: no error is reported if the document element of the validated Xml document is in a different namespace than the targetNamespace of the schema. This may result in accepting documents that are not validated at all. Turning on warnings is the first step toward catching this condition. The second interesting point is that, when no validation event handler is set, schema validation throws exceptions for validation errors but not for warnings. So, to catch the case where the expected and actual namespaces don’t match, we have to set XmlReaderSettings.ValidationEventHandler, which is invoked for both validation errors and warnings. Other than that the method is pretty straightforward: we create an XmlSchemaSet instance and add the schema under test and, conditionally, the helper schema. Then we create an XmlReaderSettings object and set it up for schema validation. We use the reader settings to create a validating XmlReader. Finally, we read the input Xml with the validating reader; all errors and warnings are reported by invoking the validation event handler we set.
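
To see the namespace pitfall in isolation, here is a minimal sketch. The class name NamespaceMismatchDemo, the one-element schema and the urn:unexpected namespace are made up for illustration and are not part of the test suite described in this post; only the System.Xml types and members are real.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class NamespaceMismatchDemo
{
    // Hypothetical one-element schema without a targetNamespace,
    // standing in for the schema under test.
    private const string Xsd =
        @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
            <xs:element name='Settings' />
          </xs:schema>";

    // Validates inputXml with the given flags and returns every reported event.
    public static List<ValidationEventArgs> Validate(string inputXml, XmlSchemaValidationFlags flags)
    {
        var schemaSet = new XmlSchemaSet();
        using (var schemaReader = XmlReader.Create(new StringReader(Xsd)))
        {
            // Passing null keeps the targetNamespace declared in the schema (none here).
            schemaSet.Add(null, schemaReader);
        }

        var readerSettings = new XmlReaderSettings
        {
            Schemas = schemaSet,
            ValidationType = ValidationType.Schema,
            ValidationFlags = flags,
        };

        var events = new List<ValidationEventArgs>();
        readerSettings.ValidationEventHandler += (sender, e) => events.Add(e);

        using (var reader = XmlReader.Create(new StringReader(inputXml), readerSettings))
        {
            while (reader.Read())
            {
            }
        }

        return events;
    }
}
```

Calling Validate("<Settings xmlns='urn:unexpected' />", XmlSchemaValidationFlags.None) should come back with no events even though nothing in the document was checked against the schema. With XmlSchemaValidationFlags.ReportValidationWarnings the same document should produce events of severity Warning, which is what a test can then assert on.
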
With the test driver method ready we can start writing test cases. We write test cases for each type, starting from “leaf” types (i.e. types that are defined using only pre-defined schema types) and moving to more complex types. If a type contains an element of a type that has already been tested, we just test that the schema accepts an Xml document with the simplest child element of that type and, if the element is mandatory, that the Xml is rejected when it does not contain the element. If there are multiple elements of the same type, we write test cases for the type itself rather than for every possible element of that type (the elements will be tested when testing their parent type). If there were a type hierarchy, we would write test cases for the base type and then test cases just for what was added (or removed, in the case of derivation by restriction) in the derived type. The test cases themselves are simple: in most cases a hardcoded minimal Xml document is validated using the validation method we created, and we check that the expected errors are reported or, for valid Xml documents, that there are no errors. Some examples:


[Fact]
public void Schema_accepts_minimal_valid_Xml()
{
    Assert.Empty(RunValidation("<Settings />", false));
}

[Fact]
public void Schema_rejects_Setting_Type_without_Name()
{
    var error = 
        RunValidation(@"<Setting Value=""ABC"" />", true)
        .Single();

    Assert.Equal(XmlSeverityType.Error, error.Severity);
    Assert.Equal(
        "The required attribute 'Name' is missing.",
        error.Message);
}
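
The same pattern extends to the next type up. Below is a sketch of what tests for ServiceTypeInitializer_Type could look like, written as a self-contained xUnit class. The class name, the AddSchema helper and the way the two schemas are inlined as strings are my assumptions for this example; the actual test suite may load the schemas differently. The assertions only check severities, not exact messages.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using Xunit;

public class ServiceTypeInitializerTypeTests
{
    // Copies of the two schemas from this post, inlined (with single quotes)
    // only to keep the sketch self-contained.
    private const string SchemaUnderTest = @"
<xs:schema elementFormDefault='qualified' xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='Settings' type='Settings_Type' />
  <xs:complexType name='Settings_Type'>
    <xs:sequence>
      <xs:element name='ServiceProvider' type='ServiceTypeInitializer_Type' minOccurs='0' />
      <xs:element name='Factory' type='ServiceTypeInitializer_Type' minOccurs='0' />
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name='ServiceTypeInitializer_Type'>
    <xs:sequence>
      <xs:element name='Setting' type='Setting_Type' maxOccurs='unbounded' />
    </xs:sequence>
    <xs:attribute name='Type' type='xs:string' use='required' />
  </xs:complexType>
  <xs:complexType name='Setting_Type'>
    <xs:attribute name='Name' type='xs:string' use='required' />
    <xs:attribute name='Value' type='xs:string' use='required' />
  </xs:complexType>
</xs:schema>";

    private const string HelperSchema = @"
<xs:schema elementFormDefault='qualified' xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='Setting' type='Setting_Type' />
  <xs:element name='ServiceTypeInitializer' type='ServiceTypeInitializer_Type' />
</xs:schema>";

    // Hypothetical helper: parses a schema from a string and adds it to the set.
    private static void AddSchema(XmlSchemaSet schemaSet, string xsd)
    {
        using (var reader = XmlReader.Create(new StringReader(xsd)))
        {
            schemaSet.Add(null, reader);
        }
    }

    // Same idea as the RunValidation method shown earlier in the post.
    private static List<ValidationEventArgs> RunValidation(string inputXml, bool includeHelperSchema)
    {
        var schemaSet = new XmlSchemaSet();
        AddSchema(schemaSet, SchemaUnderTest);
        if (includeHelperSchema)
        {
            AddSchema(schemaSet, HelperSchema);
        }

        var readerSettings = new XmlReaderSettings
        {
            Schemas = schemaSet,
            ValidationType = ValidationType.Schema,
            ValidationFlags = XmlSchemaValidationFlags.ReportValidationWarnings,
        };

        var events = new List<ValidationEventArgs>();
        readerSettings.ValidationEventHandler += (s, e) => events.Add(e);

        using (var reader = XmlReader.Create(new StringReader(inputXml), readerSettings))
        {
            while (reader.Read())
            {
            }
        }

        return events;
    }

    [Fact]
    public void Schema_accepts_ServiceTypeInitializer_Type_with_single_Setting()
    {
        Assert.Empty(RunValidation(
            "<ServiceTypeInitializer Type='T'><Setting Name='N' Value='V' /></ServiceTypeInitializer>", true));
    }

    [Fact]
    public void Schema_rejects_ServiceTypeInitializer_Type_without_Setting()
    {
        var events = RunValidation("<ServiceTypeInitializer Type='T' />", true);
        Assert.Contains(events, e => e.Severity == XmlSeverityType.Error);
    }

    [Fact]
    public void Schema_does_not_accept_Setting_as_document_element_without_helper_schema()
    {
        // Any reported event (error or warning) means the document was not accepted as valid.
        Assert.NotEmpty(RunValidation("<Setting Name='N' Value='V' />", false));
    }
}
```

Note that the Setting child is tested only in its simplest valid form here, because Setting_Type already has its own tests.
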

An example test suite using xUnit can be found on my GitHub. The Readme contains details about requirements, setting up the environment, building and running the tests. If you just want to see what’s most interesting (i.e. the code) you can find it here.

Pawel Kadluczka
