ReSharper plug-in: Minify XML

The Problem

I've been working on a little tool that uses WCF serialization lately and had to deal with XML files representing serialized data contracts. One of the requirements of the tool is that some users should be able to feed it such XML files. Think of it as feeding the tool an input XML file, processing it to create some output, and comparing that output with another XML file that represents the expected output. You figured it's some kind of testing tool, right? :)

Anyway, the expected output and the actual output are serialized when written to disk, and deserialized when read. The default settings for the WCF DataContractSerializer make sure that the entire serialized XML string gets saved on one single line. I know you can tweak these settings and change the formatting of the XML created by the DataContractSerializer, but if that's not an option, you'll have to make sure those files respect the expected format.
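
To make the "one single line" behavior concrete, here is a minimal sketch. The Order class and the SerializerDemo helper are hypothetical and only serve the illustration; the formatting itself is controlled by the XmlWriter handed to the serializer.

```csharp
using System;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// Hypothetical data contract, used only for this illustration.
[DataContract(Namespace = "")]
public class Order
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Customer { get; set; }
}

public static class SerializerDemo
{
    // Serializes the order; the XmlWriterSettings decide whether the output is indented.
    public static string Serialize(Order order, bool indent)
    {
        var settings = new XmlWriterSettings { Indent = indent, OmitXmlDeclaration = true };
        var builder = new StringBuilder();
        using (var writer = XmlWriter.Create(builder, settings))
        {
            new DataContractSerializer(typeof(Order)).WriteObject(writer, order);
        }
        return builder.ToString();
    }

    public static void Main()
    {
        var order = new Order { Id = 42, Customer = "Paul" };
        Console.WriteLine(Serialize(order, indent: false)); // everything on one line
        Console.WriteLine(Serialize(order, indent: true));  // one element per line
    }
}
```

Without indentation the writer emits no line breaks between elements, which is the shape my tool expects to read back.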

You have to know that some of the users of the tool have Visual Studio + ReSharper, so these guys will be tempted to open the XML files and change a little value here and there before saving the file and feeding it to the tool again. That's where formatting comes in: if a user wants to get a nice overview of what's inside the file and easily modify a value, he'll just format the document to make the job easier. However, this document must be minified onto one single line again for the tool to be able to deserialize the file. That's when it struck me that minifying CSS and JS is supported by various tools, but none is available for XML.

My Solution

Now that's a great tip, thank you Paul! When he told me it would be easy to do so using the ReSharper 6.1 SDK, I decided to give it a shot and time-box it to an hour. One hour only!

Hence, I installed the SDK, which at first sight looked pretty complete, including samples, project templates and a whole lot of item templates. Awesome! Creating a new ReSharper Plug-In project is really peanuts: it is even preconfigured to debug the project against the Visual Studio Experimental Hive, passing it the necessary parameters to plug in your plug-in and enabling some hidden internal ReSharper diagnostic windows.

In addition to that, Hadi Hariri has a great series on writing plug-ins for ReSharper, of which the first post helped me quite a long way towards implementing my own plug-in. Within a few minutes, I had my solution set up and could focus on the work at hand: minifying some XML. That's really a great experience when getting started with something you've never done before!

Simply put, the plug-in needs to know when it is available, and needs to know what it has to do. The availability part is easy: I want to be able to minify the entire file, so making it available on the root XML node makes sense. The execution itself is pretty straightforward as well: take the entire XML element (including its child nodes, if any) and replace it with a minified version, leaving the XML structure intact. That's just a matter of pulling a RegEx monkey out of your sleeve and making sure those whitespaces, tabs and carriage returns get removed.

Minify XML

Minifying is not necessarily equal to obfuscating or compressing. When it comes to XML, the element names and attribute names are usually meaningful as well. Changing them would definitely break deserialization of such files. Hence, I only had to take care of the whitespace formatting part.

Minifying the XML is done using a tiny XmlMinifier class shown below:

    using System;
    using System.Text.RegularExpressions;

    public class XmlMinifier : IMinifier
    {
        public string Minify(string input)
        {
            // Replace carriage returns, line feeds and tabs with a space
            string output = Regex.Replace(input, @"\r|\n|\t", " ");

            // Drop whitespace between tags, then collapse remaining runs of whitespace
            output = Regex.Replace(output, @">\s+<", "><").Trim();
            output = Regex.Replace(output, @"\s{2,}", " ");

            // Remove XML comments (Singleline lets '.' match across line breaks)
            output = Regex.Replace(output, "<!--.*?-->", String.Empty, RegexOptions.Singleline);

            return output;
        }
    }

For the full implementation details, I invite you to take a look at the GitHub repository. It's only something like 100 lines of code, of which the most meaningful are probably the ones shown above. You might wonder what I did the rest of that hour :)
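
To see what those replacements do to a formatted file, here is a small stand-alone sketch. MinifyDemo and the sample XML are made up for the illustration; the Minify method is modeled on the XmlMinifier class and inlined so the sample runs on its own.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MinifyDemo
{
    // Regex steps modeled on the XmlMinifier class from this post.
    public static string Minify(string input)
    {
        string output = Regex.Replace(input, @"\r|\n|\t", " ");
        output = Regex.Replace(output, @">\s+<", "><").Trim();
        output = Regex.Replace(output, @"\s{2,}", " ");
        output = Regex.Replace(output, "<!--.*?-->", String.Empty, RegexOptions.Singleline);
        return output;
    }

    public static void Main()
    {
        // Hypothetical formatted input, as a user might save it from Visual Studio.
        string formatted =
            "<Order>\n" +
            "  <!-- expected output -->\n" +
            "  <Id>42</Id>\n" +
            "  <Customer>Paul   Smith</Customer>\n" +
            "</Order>\n";

        Console.WriteLine(Minify(formatted));
        // prints: <Order><Id>42</Id><Customer>Paul Smith</Customer></Order>
    }
}
```

Note that the three spaces inside the Customer value are collapsed to one as well: the regular expressions don't know the difference between indentation and text content. That's fine for my files, but keep it in mind if whitespace inside values matters to you.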

How to use it

Once the plug-in is installed and you open any XML file in Visual Studio, you'll see a new option appearing when you put your cursor in the root XML element.

Press Alt-Enter (don't you dare use the mouse!) and select Minify file.

There you go!

Give me the goodies

Installing the plug-in is easy as well: simply fetch the dll and put it in the following location:

    C:\Program Files (x86)\JetBrains\ReSharper\v6.1\Bin\Plugins

Don't forget to unblock the file after downloading it because it might be very unsafe! :)

You can verify that the plug-in got installed correctly by navigating to ReSharper > Tools > Options and selecting Plug-ins.

Potential for Optimization

I think it would be cool if I could just minify a selected element and leave the rest of the file formatting untouched. However, in order for this to work without any strange behavior, I need to be able to save the file after each execution of the plug-in (or at least update the cache, because ReSharper seems to work on the cached source file as long as the file isn't saved). If anyone has an idea on how to do this ('cause I couldn't find a working sample), please reach out or submit a patch :-)
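
Outside of ReSharper, the XML side of that idea could look roughly like the sketch below. It swaps the regular expressions for LINQ to XML, and PartialMinifyDemo, MinifyElement and the sample document are all hypothetical. It says nothing about the caching issue, which is the part I'm actually stuck on.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class PartialMinifyDemo
{
    // Removes whitespace-only text nodes below the first element with the given name,
    // leaving the formatting of the rest of the document as it was.
    public static string MinifyElement(string xml, string elementName)
    {
        // PreserveWhitespace keeps the original indentation as text nodes.
        XDocument document = XDocument.Parse(xml, LoadOptions.PreserveWhitespace);
        XElement target = document.Descendants(elementName).First();

        target.DescendantNodes()
              .OfType<XText>()
              .Where(text => string.IsNullOrWhiteSpace(text.Value))
              .ToList()
              .ForEach(text => text.Remove());

        // DisableFormatting writes the tree back without re-indenting it.
        return document.ToString(SaveOptions.DisableFormatting);
    }

    public static void Main()
    {
        string xml =
            "<Orders>\n" +
            "  <Order>\n" +
            "    <Id>42</Id>\n" +
            "  </Order>\n" +
            "  <Note>untouched</Note>\n" +
            "</Orders>";

        // The Order element ends up on one line; the indentation around Note stays.
        Console.WriteLine(MinifyElement(xml, "Order"));
    }
}
```

Two things to be aware of in this sketch: XDocument.ToString() leaves out the XML declaration, and the line endings in the result follow the platform's default.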

Posted by Xavier Decoster on
Last revised: 27 Oct, 2012 08:48 PM