HTML Minifier - Taritsyn/WebMarkupMin GitHub Wiki

HTML Minifier minifies HTML and XHTML code. The output of the minification is valid HTML code.

Consider a simple example of using the HTML Minifier:

using System;
using System.Collections.Generic;

using WebMarkupMin.Core;

namespace WebMarkupMin.Sample.ConsoleApplication
{
    class Program
    {
        static void Main(string[] args)
        {
            const string htmlInput = @"<!DOCTYPE html>
<html>
    <head>
        <meta charset=""utf-8"" />
        <title>The test document</title>
        <link href=""favicon.ico"" rel=""shortcut icon"" type=""image/x-icon"" />
        <meta name=""viewport"" content=""width=device-width"" />
        <link rel=""stylesheet"" type=""text/css"" href=""/Content/Site.css"" />
    </head>
    <body>
        <p>Lorem ipsum dolor sit amet...</p>

        <script src=""http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.9.1.min.js""></script>
        <script>
            (window.jQuery) || document.write('<script src=""/Scripts/jquery-1.9.1.min.js""><\/script>');
        </script>
    </body>
</html>";

            // Create a minifier with the default settings.
            var htmlMinifier = new HtmlMinifier();

            // Minify the markup and request size statistics.
            MarkupMinificationResult result = htmlMinifier.Minify(htmlInput,
                generateStatistics: true);
            if (result.Errors.Count == 0)
            {
                MinificationStatistics statistics = result.Statistics;
                if (statistics != null)
                {
                    Console.WriteLine("Original size: {0:N0} Bytes",
                        statistics.OriginalSize);
                    Console.WriteLine("Minified size: {0:N0} Bytes",
                        statistics.MinifiedSize);
                    Console.WriteLine("Saved: {0:N2}%",
                        statistics.SavedInPercent);
                }
                Console.WriteLine("Minified content:{0}{0}{1}",
                    Environment.NewLine, result.MinifiedContent);
            }
            else
            {
                IList<MinificationErrorInfo> errors = result.Errors;

                Console.WriteLine("Found {0} error(s):", errors.Count);
                Console.WriteLine();

                foreach (var error in errors)
                {
                    Console.WriteLine("Line {0}, Column {1}: {2}",
                        error.LineNumber, error.ColumnNumber, error.Message);
                    Console.WriteLine();
                }
            }
        }
    }
}

First we create an instance of the HtmlMinifier class and then call its Minify method with the following parameters: the first parameter contains the HTML code, and the second is a flag that controls whether minification statistics are generated (the default value is false, because generating statistics takes time and additional resources). The Minify method returns an object of the MarkupMinificationResult type, which has the following properties:

  • MinifiedContent - minified HTML code;
  • Errors - list of errors that occurred during minification;
  • Warnings - list of warnings about problems that were found during minification;
  • Statistics - statistical information about minified code.

If the list of errors is empty, we print the minification statistics and the minified code to the console; otherwise we print the error information.
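The example above does not read the Warnings property. A minimal sketch of reporting warnings alongside the result might look like the following. It assumes that warning entries expose the same LineNumber, ColumnNumber and Message members as the error entries; the WarningReport class and its FormatIssue helper are hypothetical names introduced only for this illustration.

```csharp
using System;

using WebMarkupMin.Core;

namespace WebMarkupMin.Sample.ConsoleApplication
{
    // Hypothetical helper class, not part of WebMarkupMin.
    static class WarningReport
    {
        // Formats a position and a message in the same form
        // that the example above uses for errors.
        public static string FormatIssue(int lineNumber, int columnNumber,
            string message)
        {
            return string.Format("Line {0}, Column {1}: {2}",
                lineNumber, columnNumber, message);
        }

        static void Main(string[] args)
        {
            const string htmlInput =
                "<!DOCTYPE html><html><body><p>Lorem ipsum</p></body></html>";

            var htmlMinifier = new HtmlMinifier();
            MarkupMinificationResult result = htmlMinifier.Minify(htmlInput,
                generateStatistics: false);

            // Report warnings regardless of whether errors occurred.
            Console.WriteLine("Found {0} warning(s):", result.Warnings.Count);

            // Assumption: warning entries have the same members
            // (LineNumber, ColumnNumber, Message) as error entries.
            foreach (var warning in result.Warnings)
            {
                Console.WriteLine(FormatIssue(warning.LineNumber,
                    warning.ColumnNumber, warning.Message));
            }

            if (result.Errors.Count == 0)
            {
                Console.WriteLine(result.MinifiedContent);
            }
        }
    }
}
```

Keeping the formatting in a separate method lets the same line format be reused for both errors and warnings.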

Consider a more advanced example of using the HTML Minifier:

using System;
using System.Collections.Generic;
using System.Text;

using WebMarkupMin.Core;
using WebMarkupMin.Core.Loggers;

namespace WebMarkupMin.Sample.ConsoleApplication
{
    class Program
    {
        static void Main(string[] args)
        {
            const string htmlInput = @"<!DOCTYPE html>
<html>
    <head>
        <meta charset=""utf-8"" />
        <title>The test document</title>
        <link href=""favicon.ico"" rel=""shortcut icon"" type=""image/x-icon"" />
        <meta name=""viewport"" content=""width=device-width"" />
        <link rel=""stylesheet"" type=""text/css"" href=""/Content/Site.css"" />
    </head>
    <body>
        <p>Lorem ipsum dolor sit amet...</p>

        <script src=""http://ajax.aspnetcdn.com/ajax/jQuery/jquery-1.9.1.min.js""></script>
        <script>
            (window.jQuery) || document.write('<script src=""/Scripts/jquery-1.9.1.min.js""><\/script>');
        </script>
    </body>
</html>";

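            // Explicitly create the settings, the CSS and JS minifiers
            // and the logger that are passed to the HTML minifier.
            var settings = new HtmlMinificationSettings();
            var cssMinifier = new KristensenCssMinifier();
            var jsMinifier = new CrockfordJsMinifier();
            var logger = new NullLogger();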

            var htmlMinifier = new HtmlMinifier(settings, cssMinifier,
                jsMinifier, logger);

            MarkupMinificationResult result = htmlMinifier.Minify(htmlInput,
                fileContext: string.Empty,
                encoding: Encoding.GetEncoding(0),
                generateStatistics: false);
            if (result.Errors.Count == 0)
            {
                Console.WriteLine("Minified content:{0}{0}{1}",
                    Environment.NewLine, result.MinifiedContent);
            }
            else
            {
                IList<MinificationErrorInfo> errors = result.Errors;

                Console.WriteLine("Found {0:N0} error(s):", errors.Count);
                Console.WriteLine();

                foreach (var error in errors)
                {
                    Console.WriteLine("Line {0}, Column {1}: {2}",
                        error.LineNumber, error.ColumnNumber, error.Message);
                    Console.WriteLine();
                }
            }
        }
    }
}

When creating an instance of the HtmlMinifier class, we pass the following to the constructor: HTML minification settings, a CSS minifier, a JS minifier and a logger. Two additional parameters are passed to the Minify method:

  • fileContext. Can contain a path to the file or URL of the web page. The value of this parameter is used when logging.
  • encoding. Contains the text encoding that is used in the minification process and in statistics generation.

The parameter values in the code above correspond to the default values.
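To illustrate the two additional parameters with non-default values, the following sketch passes a file context and an explicit UTF-8 encoding. The FileContextSample class and the path in SamplePath are hypothetical; the path only labels the input and does not have to exist.

```csharp
using System;
using System.Text;

using WebMarkupMin.Core;

namespace WebMarkupMin.Sample.ConsoleApplication
{
    // Hypothetical sample class, not part of WebMarkupMin.
    static class FileContextSample
    {
        // Hypothetical path that identifies the input when logging.
        public const string SamplePath = "/Views/Home/Index.html";

        public static MarkupMinificationResult MinifyAsUtf8(string html,
            string fileContext)
        {
            var htmlMinifier = new HtmlMinifier();

            // Same Minify call as in the example above, but with
            // a non-empty file context, UTF-8 encoding and statistics.
            return htmlMinifier.Minify(html,
                fileContext: fileContext,
                encoding: Encoding.UTF8,
                generateStatistics: true);
        }

        static void Main(string[] args)
        {
            const string htmlInput = @"<!DOCTYPE html>
<html>
    <body>
        <p>Lorem ipsum dolor sit amet...</p>
    </body>
</html>";

            MarkupMinificationResult result = MinifyAsUtf8(htmlInput, SamplePath);
            if (result.Errors.Count == 0 && result.Statistics != null)
            {
                Console.WriteLine("Original size: {0:N0} Bytes",
                    result.Statistics.OriginalSize);
                Console.WriteLine("Minified size: {0:N0} Bytes",
                    result.Statistics.MinifiedSize);
            }
        }
    }
}
```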

Now let's consider the properties of the HtmlMinificationSettings class in detail:

Property name | Data type | Default value | Description
WhitespaceMinificationMode | WhitespaceMinificationMode enumeration | Medium | Whitespace minification mode. Can take the following values:
  • None. Keep whitespace.
  • Safe. Safe whitespace minification: removes whitespace characters from the beginning and end of the HTML document; replaces multiple whitespace characters with a single space; removes all leading and trailing whitespace characters from the DOCTYPE declaration; removes all leading and trailing whitespace characters from the outer and inner contents of invisible tags (html, head, body, meta, link, script, etc.); removes unnecessary leading and trailing whitespace characters from the outer contents of non-independent tags (li, dt, dd, rb, rtc, rt, rp, option, tr, td, th, etc.).
  • Medium. Medium whitespace minification: performs all operations of the safe whitespace minification and also removes all leading and trailing whitespace characters from the outer and inner contents of block-level tags.
  • Aggressive. Aggressive whitespace minification: performs all operations of the medium whitespace minification and also removes all leading and trailing whitespace characters from the inner contents of inline and inline-block tags.
PreserveNewLines | Boolean | false | Flag for whether to collapse whitespace to one newline string when whitespace contains a newline.
NewLineStyle | NewLineStyle enumeration | Auto | Style of the newline. Can take the following values:
  • Auto. Auto-detect style for newline based on the source input.
  • Native. CRLF on Windows, LF on other platforms.
  • Windows. Force the Windows style for newline (CRLF).
  • Mac. Force the Macintosh style for newline (CR).
  • Unix. Force the Unix style for newline (LF).
RemoveHtmlComments | Boolean | true | Flag for whether to remove all HTML comments, except conditional, noindex, KnockoutJS containerless comments, AngularJS 1.X comment directives, React DOM component comments and Blazor markers.
PreservableHtmlCommentList | String | Empty string | Comma-separated list of string representations of simple regular expressions that define which HTML comments cannot be removed (e.g. "/^\s*saved from url=\(\d+\)/i, /^\/?\$$/, /^[\[\]]$/"). Simple regular expressions are somewhat similar to ECMAScript regular expression literals. There are two varieties of simple regular expressions:
  • /pattern/
  • /pattern/i
RemoveHtmlCommentsFromScriptsAndStyles | Boolean | true | Flag for whether to remove HTML comments from script and style tags.
RemoveCdataSectionsFromScriptsAndStyles | Boolean | true | Flag for whether to remove CDATA sections from script and style tags.
UseShortDoctype | Boolean | true | Flag for whether to replace the existing document type declaration with the short declaration <!DOCTYPE html>.
CustomShortDoctype | String | Empty string | Custom short DOCTYPE (e.g. <!DOCTYPE HTML>, <!doctype html>, or <!doctypehtml>).
PreserveCase | Boolean | false | Flag for whether to preserve case of tag and attribute names (useful for Angular 2 templates).
UseMetaCharsetTag | Boolean | true | Flag for whether to replace the <meta http-equiv="content-type" content="text/html; charset=…"> tag with the <meta charset="…"> tag.
EmptyTagRenderMode | HtmlEmptyTagRenderMode enumeration | NoSlash | Render mode of HTML empty tag. Can take the following values:
  • NoSlash. Without slash (for example, <br>).
  • Slash. With slash (for example, <br/>).
  • SpaceAndSlash. With space and slash (for example, <br />).
RemoveOptionalEndTags | Boolean | true | Flag for whether to remove optional end tags (html, head, body, p, li, dt, dd, rb, rtc, rt, rp, optgroup, option, colgroup, thead, tfoot, tbody, tr, th and td).
PreservableOptionalTagList | String | Empty string | Comma-separated list of names of optional tags, which should not be removed (e.g. "li, rb, rtc, rt, rp").
RemoveTagsWithoutContent | Boolean | false | Flag for whether to remove tags without content, except for textarea, tr, th and td tags, and tags with class, id, name, role, src and custom attributes.
CollapseBooleanAttributes | Boolean | true | Flag for whether to remove values from boolean attributes (for example, checked="checked" is transformed to checked).
AttributeQuotesStyle | HtmlAttributeQuotesStyle enumeration | Auto | Style of the HTML attribute quotes. Can take the following values:
  • Auto. Auto-detect style for attribute quotes based on the source input.
  • Optimal. Optimal style for attribute quotes based on the attribute value.
  • Single. Single quotes.
  • Double. Double quotes.
AttributeQuotesRemovalMode | HtmlAttributeQuotesRemovalMode enumeration | Html5 | HTML attribute quotes removal mode. Can take the following values:
  • KeepQuotes. Keep quotes.
  • Html4. Removes quotes in accordance with the HTML 4.X standard.
  • Html5. Removes quotes in accordance with the HTML5 standard.
RemoveEmptyAttributes | Boolean | true | Flag for whether to remove attributes, which have empty value (valid attributes are: class, id, name, style, title, lang, event attributes, action attribute of form tag and value attribute of input tag).
RemoveRedundantAttributes | Boolean | false | Flag for whether to remove redundant attributes:
  • <a id="…" name="…" …>
  • <area shape="rect" …>
  • <button type="submit" …>
  • <form autocomplete="on" …>
  • <form enctype="application/x-www-form-urlencoded" …>
  • <form method="get" …>
  • <img decoding="auto" …>
  • <input type="text" …>
  • <script src="…" charset="…" …>
  • <script language="javascript" …>
  • <textarea wrap="soft" …>
  • <track kind="subtitles" …>
RemoveJsTypeAttributes | Boolean | true | Flag for whether to remove type="text/javascript" attributes from script tags.
RemoveCssTypeAttributes | Boolean | true | Flag for whether to remove type="text/css" attributes from style and link tags.
PreservableAttributeList | String | Empty string | Comma-separated list of string representations of attribute expressions that define which attributes cannot be removed (e.g. "form[method=get i], input[type], [xmlns]"). Attribute expressions are somewhat similar to CSS attribute selectors. There are six varieties of attribute expressions:
  • [attrName]
  • tagName[attrName]
  • [attrName=attrValue]
  • tagName[attrName=attrValue]
  • [attrName=attrValue i]
  • tagName[attrName=attrValue i]
RemoveHttpProtocolFromAttributes | Boolean | false | Flag for whether to remove the HTTP protocol portion (http:) from URI-based attributes (tags marked with rel="external" are skipped).
RemoveHttpsProtocolFromAttributes | Boolean | false | Flag for whether to remove the HTTPS protocol portion (https:) from URI-based attributes (tags marked with rel="external" are skipped).
RemoveJsProtocolFromAttributes | Boolean | true | Flag for whether to remove the javascript: pseudo-protocol portion from event attributes.
MinifyEmbeddedCssCode | Boolean | true | Flag for whether to minify CSS code in style tags.
MinifyInlineCssCode | Boolean | true | Flag for whether to minify CSS code in style attributes.
MinifyEmbeddedJsCode | Boolean | true | Flag for whether to minify JS code in script tags.
MinifyInlineJsCode | Boolean | true | Flag for whether to minify JS code in event attributes and hyperlinks with javascript: pseudo-protocol.
MinifyEmbeddedJsonData | Boolean | true | Flag for whether to minify JSON data in script tags with application/json, application/ld+json, importmap and speculationrules types.
ProcessableScriptTypeList | String | "text/html" | Comma-separated list of types of script tags that are processed by the minifier (e.g. "text/html, text/ng-template"). Currently, only KnockoutJS, Kendo UI MVVM and AngularJS 1.X views are supported.
MinifyKnockoutBindingExpressions | Boolean | false | Flag for whether to minify the KnockoutJS binding expressions in data-bind attributes and containerless comments.
MinifyAngularBindingExpressions | Boolean | false | Flag for whether to minify the AngularJS 1.X binding expressions in Mustache-style tags ({{}}) and directives.
CustomAngularDirectiveList | String | Empty string | Comma-separated list of names of custom AngularJS 1.X directives (e.g. "myDir, btfCarousel") that contain expressions. If the value of the MinifyAngularBindingExpressions property is true, then the expressions in custom directives will be minified.
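As a rough illustration of how these properties are set, the sketch below builds a more conservative configuration than the defaults and passes it to the minifier. The property and enumeration names come from the table above; the ConservativeSettingsSample class and the chosen values are only an example and not a recommended configuration.

```csharp
using System;

using WebMarkupMin.Core;
using WebMarkupMin.Core.Loggers;

namespace WebMarkupMin.Sample.ConsoleApplication
{
    // Hypothetical sample class, not part of WebMarkupMin.
    static class ConservativeSettingsSample
    {
        public static HtmlMinificationSettings CreateSettings()
        {
            return new HtmlMinificationSettings
            {
                // Use the Safe whitespace mode instead of the default Medium.
                WhitespaceMinificationMode = WhitespaceMinificationMode.Safe,
                // Keep optional end tags such as </li> and </p>.
                RemoveOptionalEndTags = false,
                // Keep quotes around all attribute values.
                AttributeQuotesRemovalMode =
                    HtmlAttributeQuotesRemovalMode.KeepQuotes,
                // Render empty tags with a slash, for example <br/>.
                EmptyTagRenderMode = HtmlEmptyTagRenderMode.Slash,
                // Preserve comments that match this simple regular expression.
                PreservableHtmlCommentList = @"/^\s*saved from url=\(\d+\)/i"
            };
        }

        static void Main(string[] args)
        {
            const string htmlInput = @"<ul>
    <li><input type=""checkbox"" checked=""checked"" /> First</li>
    <li>Second<br /></li>
</ul>";

            // Same four-argument constructor as in the advanced example.
            var htmlMinifier = new HtmlMinifier(CreateSettings(),
                new KristensenCssMinifier(), new CrockfordJsMinifier(),
                new NullLogger());

            MarkupMinificationResult result = htmlMinifier.Minify(htmlInput,
                generateStatistics: false);
            if (result.Errors.Count == 0)
            {
                Console.WriteLine(result.MinifiedContent);
            }
        }
    }
}
```

Properties that are not assigned in the object initializer keep the default values listed in the table.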