Generating NSAttributedString from HTML - codepath/ios_guides GitHub Wiki

Generating NSAttributedString from HTML

When server-rendered or CMS-authored copy arrives as an HTML fragment (<p>, <b>, <a>, <ul>, etc.), the cheapest way to render it inside a UILabel or UITextView is to convert the markup to an NSAttributedString. Foundation's NSAttributedString(data:options:documentAttributes:) initializer accepts raw bytes and a documentType option, and returns an attributed string with inline styling, links, and basic block structure already applied.

This page covers the modern Swift API, the threading requirement that catches most teams off guard, and the common gotchas you will hit the first time you ship HTML-driven content.

Basic Usage

import UIKit

let html = "<p>Hello <b>world</b>. Visit <a href=\"https://example.com\">our site</a>.</p>"

// Encoding a Swift String as UTF-8 cannot fail, so no optional handling is needed.
let data = Data(html.utf8)

do {
    let attributed = try NSAttributedString(
        data: data,
        options: [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ],
        documentAttributes: nil
    )
    label.attributedText = attributed
} catch {
    print("Failed to parse HTML: \(error)")
}

The initializer is marked throws, so call it inside do/try/catch. Well-formed HTML rarely throws, but malformed input or an unsupported encoding can, so log or surface the error rather than swallowing it silently. The call also has to run on the main thread (see below).
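
One way to avoid swallowing the error is a small wrapper that logs the failure and falls back to plain text. This is a minimal sketch: the function name and the fallbackText parameter are hypothetical and not part of Foundation or UIKit.

```swift
import UIKit

/// Hypothetical helper: returns styled text when parsing succeeds and
/// falls back to caller-supplied plain text when it fails.
/// Call on the main thread.
func attributedString(fromHTML html: String, fallbackText: String) -> NSAttributedString {
    do {
        return try NSAttributedString(
            data: Data(html.utf8),
            options: [
                .documentType: NSAttributedString.DocumentType.html,
                .characterEncoding: String.Encoding.utf8.rawValue
            ],
            documentAttributes: nil
        )
    } catch {
        // Surface the failure instead of hiding it behind try?.
        print("HTML parse failed, using plain-text fallback: \(error)")
        return NSAttributedString(string: fallbackText)
    }
}
```

The caller always gets something displayable, and the log line still records that the markup could not be parsed.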

The Options Dictionary

The options parameter is a [NSAttributedString.DocumentReadingOptionKey: Any] dictionary. The two keys you will reach for almost every time are:

  • .documentType — the format of the input. Set this to NSAttributedString.DocumentType.html for HTML. Other accepted values include .plain, .rtf, and .rtfd.
  • .characterEncoding — the encoding used to decode the byte buffer, passed as the rawValue of a String.Encoding. Use .utf8 unless you know the source explicitly emits something else.

If .characterEncoding is omitted, the parser tries to detect the encoding from a <meta charset="…"> tag inside the HTML. A bare fragment usually has no such tag, so supply the encoding explicitly to avoid that guesswork.
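
Because the option has to match the bytes you pass in, it can help to keep both in one place. The sketch below assumes a hypothetical parseHTML(_:encoding:) helper that encodes the string and sets .characterEncoding from the same value; the error it throws when the string cannot be encoded is an arbitrary choice for illustration.

```swift
import UIKit

/// Hypothetical helper: encodes and parses with the same String.Encoding,
/// so the bytes and the .characterEncoding option cannot disagree.
/// Call on the main thread.
func parseHTML(_ html: String, encoding: String.Encoding = .utf8) throws -> NSAttributedString {
    guard let data = html.data(using: encoding) else {
        // The string contains characters the chosen encoding cannot represent.
        throw CocoaError(.fileWriteInapplicableStringEncoding)
    }
    return try NSAttributedString(
        data: data,
        options: [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: encoding.rawValue
        ],
        documentAttributes: nil
    )
}
```

With the default argument, callers that only deal with UTF-8 can write try parseHTML(html) and never touch the options dictionary.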

Main-Thread Requirement

NSAttributedString uses WebKit internally to parse HTML, and WebKit is not thread-safe. Apple's archived documentation states this directly:

Since OS X v10.4, NSAttributedString has used WebKit for all import (but not for export) of HTML documents. Because WebKit document loading is not thread safe, this has not been safe to use on background threads.

In practice, that means HTML parsing has to happen on the main thread. Calling the initializer from a background queue will, at best, hop back to the main thread (stalling whatever you were trying to parallelize) and at worst surface as a crash or NSInternalInconsistencyException if the main thread is blocked.

Do this:

DispatchQueue.main.async {
    let attributed = try? NSAttributedString(
        data: data,
        options: [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ],
        documentAttributes: nil
    )
    self.label.attributedText = attributed
}

Not this:

// ❌ Don't do this — HTML parsing must run on the main thread.
DispatchQueue.global(qos: .userInitiated).async {
    let attributed = try? NSAttributedString(data: data, options: opts, documentAttributes: nil)
    // ...
}
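
If the project uses Swift concurrency, the same rule can be expressed with @MainActor so that the compiler can flag off-main calls. This is a sketch under assumptions: both function names are hypothetical, the url and label parameters are placeholders, and URLSession's async data(from:) requires iOS 15 or later.

```swift
import UIKit

/// Hypothetical helper: isolating the parse to the main actor lets the
/// compiler flag calls made from other contexts.
@MainActor
func parseHTMLData(_ data: Data) throws -> NSAttributedString {
    try NSAttributedString(
        data: data,
        options: [
            .documentType: NSAttributedString.DocumentType.html,
            .characterEncoding: String.Encoding.utf8.rawValue
        ],
        documentAttributes: nil
    )
}

/// Fetches an HTML fragment and shows it in a label.
@MainActor
func renderHTML(from url: URL, into label: UILabel) async {
    do {
        // URLSession performs the network I/O off the main thread.
        let (data, _) = try await URLSession.shared.data(from: url)
        // Execution resumes on the main actor here, so parsing is safe.
        label.attributedText = try parseHTMLData(data)
    } catch {
        print("Failed to load or parse HTML: \(error)")
    }
}
```

Only the download runs off the main thread; the parse and the UI update stay together on the main actor.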

Performance

Because the initializer spins up WebKit (and, internally, JavaScriptCore), the first HTML parse in the lifetime of the app can take noticeably longer than later ones. That is far too slow to run inside tableView(_:cellForRowAt:) or collectionView(_:cellForItemAt:) for any kind of scrolling content. Subsequent calls are typically much faster thanks to internal caching, but you should not rely on that for hot paths.

The pragmatic pattern is to parse HTML up front (on the main thread) and cache the resulting NSAttributedString on your view model. A lazy property looks tempting here, but lazy initialization runs on whichever thread first reads the property — that can easily be a background prefetcher, which would reintroduce the threading hazard described above. Use a @MainActor-isolated method that the caller invokes explicitly on the main thread:

final class Article {
    let bodyHTML: String
    private(set) var attributedBody: NSAttributedString?

    init(bodyHTML: String) {
        self.bodyHTML = bodyHTML
    }

    @MainActor
    func prepareAttributedBody() {
        guard attributedBody == nil, let data = bodyHTML.data(using: .utf8) else { return }
        attributedBody = try? NSAttributedString(
            data: data,
            options: [
                .documentType: NSAttributedString.DocumentType.html,
                .characterEncoding: String.Encoding.utf8.rawValue
            ],
            documentAttributes: nil
        )
    }
}

If you have a list of items, do the parsing right after the network fetch resolves, on the main thread, and store the NSAttributedString for the cell to consume.
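
For a list, the parse-once-then-reuse idea could be sketched as a small main-actor cache that is warmed right after the fetch. AttributedHTMLCache and its methods are hypothetical names used for illustration, and keying by the raw HTML string is an assumption that suits short fragments.

```swift
import UIKit

/// Hypothetical main-actor cache keyed by the raw HTML string.
@MainActor
final class AttributedHTMLCache {
    private var storage: [String: NSAttributedString] = [:]

    /// Returns the cached result, parsing on the first request only.
    func attributedString(for html: String) -> NSAttributedString? {
        if let cached = storage[html] { return cached }
        do {
            let parsed = try NSAttributedString(
                data: Data(html.utf8),
                options: [
                    .documentType: NSAttributedString.DocumentType.html,
                    .characterEncoding: String.Encoding.utf8.rawValue
                ],
                documentAttributes: nil
            )
            storage[html] = parsed
            return parsed
        } catch {
            print("Failed to parse HTML: \(error)")
            return nil
        }
    }

    /// Parses a batch up front, for example right after a fetch completes.
    func warm(with fragments: [String]) {
        for html in fragments { _ = attributedString(for: html) }
    }
}
```

A cell can then read from the cache while configuring itself and get an already-parsed string instead of triggering a parse while scrolling.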

Overriding Font and Color

The HTML importer applies its own default font, usually 12pt Times, which is not what you want in an app that otherwise uses the system font. The cleanest way to override it is to prepend a <style> block that sets defaults on body, then parse the combined string:

let css = """
<style>
  body { font-family: -apple-system; font-size: 16px; color: #222222; }
  a    { color: #0a84ff; }
</style>
"""
let wrapped = css + html

Alternatively, parse first and then re-apply attributes to an NSMutableAttributedString by enumerating ranges:

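let mutable = NSMutableAttributedString(attributedString: attributed)
let baseDescriptor = UIFont.systemFont(ofSize: 16).fontDescriptor
let fullRange = NSRange(location: 0, length: mutable.length)
mutable.enumerateAttribute(.font, in: fullRange) { value, range, _ in
    guard let oldFont = value as? UIFont else { return }
    // Keep only the bold/italic traits the HTML parser inferred; the remaining
    // symbolic-trait bits describe the old font family and should not carry over.
    let traits = oldFont.fontDescriptor.symbolicTraits.intersection([.traitBold, .traitItalic])
    // withSymbolicTraits(_:) returns nil when no matching face exists; fall back to the plain system font.
    let descriptor = baseDescriptor.withSymbolicTraits(traits) ?? baseDescriptor
    mutable.addAttribute(.font, value: UIFont(descriptor: descriptor, size: 16), range: range)
}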

The CSS approach is shorter; the post-processing approach gives you finer control if the HTML has nested styling you want to selectively preserve.
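
The CSS approach can be wrapped in a small function so that every call site gets the same defaults. This is a sketch: wrapWithDefaultStyle and its parameters are hypothetical, and the default values simply mirror the ones used in the <style> block above.

```swift
import Foundation

/// Hypothetical helper: prepends a <style> block with app-wide defaults
/// to an HTML fragment. The result still has to be encoded and parsed.
func wrapWithDefaultStyle(
    _ html: String,
    fontSize: Int = 16,
    textColor: String = "#222222",
    linkColor: String = "#0a84ff"
) -> String {
    let css = """
    <style>
      body { font-family: -apple-system; font-size: \(fontSize)px; color: \(textColor); }
      a    { color: \(linkColor); }
    </style>
    """
    return css + html
}
```

Pass the returned string through the same data-and-options initializer as before. Note that the values are interpolated as-is, so they should come from your own code rather than from untrusted input.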

Objective-C

The same API is available from Objective-C:

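NSData *data = [html dataUsingEncoding:NSUTF8StringEncoding];
// Reading options use the ...DocumentOption keys; the ...DocumentAttribute keys
// describe attributes returned through documentAttributes.
NSDictionary<NSAttributedStringDocumentReadingOptionKey, id> *options = @{
    NSDocumentTypeDocumentOption: NSHTMLTextDocumentType,
    NSCharacterEncodingDocumentOption: @(NSUTF8StringEncoding)
};
NSError *error = nil;
NSAttributedString *attributed = [[NSAttributedString alloc]
    initWithData:data
         options:options
documentAttributes:nil
           error:&error];
if (attributed == nil) {
    NSLog(@"Failed to parse HTML: %@", error);
}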

When Not to Use This

The HTML importer is convenient for short snippets of rich text — article bodies, formatted descriptions, marketing copy. It is not a web renderer: it ignores most JavaScript, has uneven CSS support, and is slow enough that you should never invoke it inside a scroll callback. If you need to render arbitrarily complex HTML pages, use WKWebView instead. If you only need a couple of bold or italic ranges, build the NSAttributedString directly with addAttribute(_:value:range:) — it's faster and avoids the WebKit dependency entirely.
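
For the last case, building the string directly might look like the sketch below. The function name, the sample text, and the 16pt size are illustrative choices only.

```swift
import UIKit

/// Builds "Hello world" with "world" in bold, without the HTML importer.
func makeGreeting() -> NSAttributedString {
    let text = "Hello world"
    let result = NSMutableAttributedString(
        string: text,
        attributes: [.font: UIFont.systemFont(ofSize: 16)]
    )
    // NSString ranges use UTF-16 offsets, which is what NSAttributedString expects.
    let boldRange = (text as NSString).range(of: "world")
    if boldRange.location != NSNotFound {
        result.addAttribute(.font, value: UIFont.boldSystemFont(ofSize: 16), range: boldRange)
    }
    return result
}
```

Nothing here touches WebKit, so this version does not carry the HTML importer's main-thread constraint or first-parse cost.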
