Ideas - lo48576/iri-string GitHub Wiki

TODO (or not TODO) list.

Items listed here could be implemented in or integrated into the iri-string crate, but the author has not decided whether to implement them. If you want any of the features listed here, feel free to open an issue or a discussion.

I am not going to open issues for these items myself, since I either don't need them or don't have a concrete idea for them, and an issue created without anyone who needs the feature will stay stale for a long time. I don't want to actively create such long-term noise.

  • URI-to-IRI mapping (with percent-decoding)
    • RFC 3987, section 3.2
    • This conversion actively decodes percent-encoded characters (where possible), so users who want to use and/or print non-ASCII characters aggressively may wish for this.
    • However, this usually won't be necessary if the program handles non-user-facing data or attempts to stay in the safe ASCII string world.
      • Bidi, EAW, emojis, etc. Handling full-featured Unicode strings properly is quite hard!
    • RFC 3987 IRI normalization actively decodes unreserved characters, so this might be enough?
  • Protocol-aware normalization
    • Maybe HTTP/HTTPS-specific normalization could be used in many places?
    • Such normalization won't need to be done on a single string buffer or slice. The url crate or something similar would be an option.
  • RFC 6874 support
    • RFC 6874 extends the IP-literal rule defined in RFC 3986.
    • IPv6 addresses with a Zone ID (such as [2001:db8::7%25eth0]) become acceptable as a host.
      • Note that %25 is a percent-encoded percent (%) character.
    • Would defining Rfc6874UriSpec and Rfc6874IriSpec be useful?
    • Should fe80::a%en1 be allowed?
      • From RFC 6874: It is desirable for all browsers to recognise a ZoneID preceded by a percent-encoded "%". In the spirit of "be liberal with what you accept", we also suggest that URI parsers accept bare "%" signs when possible (i.e., a "%" not followed by two valid and meaningful hexadecimal characters). This would make it possible for a user to copy and paste a string such as "fe80::a%en1" from the output of a "ping" command and have it work.
      • Would implementing a conversion such as RiString::<Rfc6874IriSpec>::from_nondecoded(s: &str) be enough?
    • See also: https://github.com/yescallop/fluent-uri-rs/issues/1.
  • Proc macros to create URI/IRI strings
    • String literals can be validated at compile time.
    • It would be problematic if the versions of the macro and the crate mismatch, or if the crate is renamed (e.g. imported under a name other than iri_string).
    • I implemented a PoC of the proc macro (as a personal experiment) that does not suffer from version mismatch. It is actually possible, but the code will be dirty, and rustc also has a problem with submodule path resolution around #[path = ..].
  • Allow FixedBaseResolver to hold an owned IRI, not only a borrowed one.
    • An owned FixedBaseResolver with the 'static lifetime may be useful in some situations.
    • However, I'm not sure how such a feature should be provided. Using Cow<'_, _> makes the resolver type bigger, but having a separate resolver type is not ideal either.
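
The size trade-off in the last item (owned FixedBaseResolver) can be illustrated with a minimal sketch. BorrowedResolver and CowResolver below are hypothetical stand-ins that hold a plain string instead of the crate's IRI types; they are not the actual FixedBaseResolver definition, and only show why a Cow-based field makes the type bigger while allowing a 'static resolver.

```rust
use std::borrow::Cow;
use std::mem::size_of;

/// Hypothetical stand-in for a resolver that only borrows its base.
pub struct BorrowedResolver<'a> {
    base: &'a str,
}

impl<'a> BorrowedResolver<'a> {
    pub fn new(base: &'a str) -> Self {
        Self { base }
    }

    pub fn base(&self) -> &'a str {
        self.base
    }
}

/// Hypothetical stand-in for a resolver that may borrow or own its base.
pub struct CowResolver<'a> {
    base: Cow<'a, str>,
}

impl<'a> CowResolver<'a> {
    /// Accepts both `&'a str` (borrowed) and `String` (owned).
    pub fn new(base: impl Into<Cow<'a, str>>) -> Self {
        Self { base: base.into() }
    }

    pub fn base(&self) -> &str {
        &self.base
    }

    /// Copies the base if it was borrowed, so that the result no longer
    /// depends on the original lifetime.
    pub fn into_owned(self) -> CowResolver<'static> {
        CowResolver {
            base: Cow::Owned(self.base.into_owned()),
        }
    }
}

/// Returns `(size of the borrowed variant, size of the Cow variant)` in bytes.
pub fn resolver_sizes() -> (usize, usize) {
    (
        size_of::<BorrowedResolver<'_>>(),
        size_of::<CowResolver<'_>>(),
    )
}
```

The Cow variant has to be at least as large as a String (pointer, length, capacity), whereas the borrowed variant only needs a pointer and a length. A separate owned type would avoid that cost for borrowed users, at the price of a second public type.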
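
For the RFC 6874 item in the list above, a lenient conversion for bare "%" zone delimiters might look roughly like the following. This is only a sketch under assumptions: encode_bare_zone_delimiter is a hypothetical helper (not part of iri-string), it relies on "[" and "]" appearing only around an IP literal, and it leaves the string untouched whenever the "%" is followed by two hexadecimal digits, since that case is either already encoded or ambiguous.

```rust
use std::borrow::Cow;

/// Hypothetical helper: rewrites a bare `%` zone delimiter inside an IP
/// literal (e.g. `[fe80::a%en1]`) into the `%25` form used by RFC 6874.
///
/// The input is returned unchanged when there is no IP literal, when the
/// literal has no `%`, or when the first `%` is followed by two hex digits.
/// The result is NOT validated; a strict parser should check it afterwards.
pub fn encode_bare_zone_delimiter(input: &str) -> Cow<'_, str> {
    // Locate the bracketed IP literal, if any.
    let Some(open) = input.find('[') else {
        return Cow::Borrowed(input);
    };
    let Some(close_rel) = input[open..].find(']') else {
        return Cow::Borrowed(input);
    };
    let literal = &input[open + 1..open + close_rel];

    // The first `%` inside the literal is taken as the zone delimiter.
    let Some(pct_rel) = literal.find('%') else {
        return Cow::Borrowed(input);
    };
    let after = &literal.as_bytes()[pct_rel + 1..];
    let followed_by_two_hex =
        after.len() >= 2 && after[0].is_ascii_hexdigit() && after[1].is_ascii_hexdigit();
    if followed_by_two_hex {
        // Either `%25` (already encoded) or an ambiguous case: do not guess.
        return Cow::Borrowed(input);
    }

    // Insert "25" right after the bare `%`.
    let split = open + 1 + pct_rel + 1;
    let mut out = String::with_capacity(input.len() + 2);
    out.push_str(&input[..split]);
    out.push_str("25");
    out.push_str(&input[split..]);
    Cow::Owned(out)
}
```

With this sketch, a string copied from ping output such as http://[fe80::a%en1]/ becomes http://[fe80::a%25en1]/, while http://[2001:db8::7%25eth0]/ is passed through as is. Whether to reject or accept the ambiguous two-hex-digit case is a design choice that this sketch deliberately leaves to the strict parser.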