Junction Points, Symbolic Links and Reparse Points

Jean-Pierre André edited this page Jul 7, 2022 · 10 revisions

Table of Contents

  1. Introduction
  2. Directory Junctions
  3. Volume Junctions
  4. Symbolic Links
  5. Other Types of Reparse Points

Introduction

NTFS defines the concept of a "reparse point", an optional attribute of files and directories which requests some preprocessing before the file or directory is accessed. For instance, reparse points can be used to redirect access to files which have been moved to long-term storage, so that an application can retrieve them and make them directly accessible again.

A junction point is a specific reparse point which redirects access from a directory to another directory, on the same volume or on another one. There are two sorts of junction points: volume junctions, which redirect a directory to a whole volume (for instance to escape the limit of 26 drive letters in Windows), and directory junctions, which redirect a directory to another directory. In both situations the redirection target is defined by an absolute path.

The similar concept of symbolic link has been available since Windows Vista. A symbolic link can redirect to a file or a directory defined by an absolute or a relative path. When defined on a remote file system, symbolic links are processed on the local system, whereas directory junctions are processed on the file server, which makes a difference when the target is not accessible by the file server. The symbolic links created by Windows are different from the Interix symbolic links created by ntfs-3g, which are also interoperable with Windows.

Junction points have been available since Windows 2000, but they were not widely used until Windows Vista used directory junctions to redirect access to legacy directories (such as \Documents and Settings), in order to avoid breaking older software which accesses directories for which Vista defines a new location. Symbolic links are new to Vista and are used in paths (such as \Users\All Users) which were not used earlier.

We will hereafter describe how junction points and symbolic links are made to appear in Linux as symbolic links. Junction points and symbolic links created by Windows can thus be dereferenced, hard linked, renamed and deleted, but new ones cannot be created.

Finally, we will examine the use of reparse points to trigger upper layer features which ntfs-3g implements as plugins. Two of them are currently available: one for reading system compressed files, another for reading deduplicated files.

Directory Junctions

A directory junction, as created by Windows, always defines the full (case-insensitive) path to the target, including a drive letter. Examples of target definitions are:

C:\Users
c:\users (this is the same as C:\Users)
d:\
C:\Users\Tom\AppData\Local

Notes:

  • Windows does not accept the character '/' as a directory separator in the target definition,
  • when creating a junction, Windows translates a relative target definition to a full target,
  • only empty directories can be made directory junctions by setting reparse data.

In order to translate a directory junction to a Linux symbolic link, the following points have to be addressed:

  • translate the drive letter to a mount point
  • translate the case-insensitive path to a case-sensitive one
  • and, as these are not always possible, detect and signal problems

Translating the drive letter

The drive letter is a physical address only loosely related to the semantics of the target. A pluggable device (such as a USB key) gets different drive letters on different computers and, on a given computer, different devices get the same drive letter if they are plugged in turn into the same slot.

Translating drive letters to Linux paths can probably not be done automatically, but there are two possible ways to deal with them: recognizing directory junctions local to a device, which can be translated to relative paths, and relying on a user-defined mapping of drive letters to mount points.

Checking whether the drive letter designates the current volume can be approximated by making sure the target path designates an existing directory in the volume. After validity checks, C:\Users can be converted to ./Users and C:\Users\Tom\AppData\Local to ../AppData/Local (the relative path depends on where the junction itself is located). This is subject to errors, as a similar (case-insensitive) path meant for another volume may be found on the current volume. This would always be the case for a target defined as the root of a volume, as there would be no directory to check, so it is wise to always reject such target guesses.

Another option is to let the user define what a drive letter should be mapped to in Linux. Such definitions should be located in the .NTFS-3G directory of the current file system, as symbolic links to the matching mount point. Then, C:\Users can be converted automatically to ./.NTFS-3G/C:/Users with C: having to be defined explicitly as a symbolic link to some mount point.

Both methods are implemented in ntfs-3g, according to the following rules:

  • if the drive letter is not defined in /.NTFS-3G, an attempt is first made to interpret the junction point target as a path to an existing directory on the same volume. If such a directory is found, the path is converted to a relative symbolic link whose names are translated to match the directory chain exactly.
  • if the drive letter is defined in /.NTFS-3G or if the attempt to find a local directory fails (even if there is no drive letter defined in /.NTFS-3G), the junction is translated to a relative symbolic link referring to the possible definition. The drive letter should be defined as an upper-case letter followed by a colon, and the path should match the characters used in the junction point definition.

Note that .NTFS-3G is a hidden directory located at the root of the file system containing the junction point. It may have to be replicated if there are several NTFS file systems with junction points in them.
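The user-defined mapping described above can be set up with ordinary commands. The sketch below is only an illustration: the function name map_drive and its arguments are hypothetical and not part of ntfs-3g. It assumes a Linux system whose ln accepts the -n option, and write access to the mounted volume.

```shell
# map_drive VOLUME_ROOT LETTER MOUNT_POINT   (hypothetical helper)
# Define LETTER in VOLUME_ROOT/.NTFS-3G as a symbolic link to MOUNT_POINT.
# The entry is named with an upper-case letter followed by a colon,
# as required by the rules above.
map_drive() (
    root=$1
    letter=$(printf '%s' "$2" | tr '[:lower:]' '[:upper:]')
    target=$3
    case $letter in
        [A-Z]) ;;
        *) echo "map_drive: '$2' is not a drive letter" >&2; exit 1 ;;
    esac
    mkdir -p "$root/.NTFS-3G" || exit 1
    # -n replaces an existing definition instead of descending into it
    ln -sfn "$target" "$root/.NTFS-3G/$letter:"
)
```

For example, map_drive /Vista d /mnt/data would create /Vista/.NTFS-3G/D: pointing to /mnt/data, where /mnt/data stands for whatever mount point holds the D: volume on the Linux side.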

Translating the case-insensitive path

The target is defined in Windows as a case-insensitive path, with characters whose case may differ from those stored in the directory levels, but an exact case-sensitive match is required for a symbolic link to be valid on Linux.

This obviously leads to examining the path and adjusting the names to those defined in the directory levels. However, walking along a case-insensitive path may lead to ambiguities. For instance, both c:\Users and c:\users may be present and designate different directories. Trying to resolve such an ambiguity is probably useless, as the target is supposed to have been created by Windows according to its own rules, and Windows would not be able to make a better guess when faced with the same ambiguity.

Because of the possible ambiguities, the translation of a case-insensitive path is only done when searching for the target on the current volume. Only the drive letter is translated (and made upper case) when redirecting to a definition in .NTFS-3G, and user definitions should always match the target.
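The walk along a case-insensitive path can be sketched as follows. This is an illustration and not the ntfs-3g implementation: the function name resolve_nocase is hypothetical, case folding is done with tr rather than with the NTFS upper-case table, and treating an ambiguous or missing component as a failure is an assumption of the sketch. A target with no directory component (the root of a volume) is rejected, as advised earlier.

```shell
# resolve_nocase VOLUME_ROOT WINDOWS_PATH   (hypothetical helper)
# WINDOWS_PATH is the target with its drive letter removed, e.g. '\USERS\Tom'.
# Prints the matching path relative to VOLUME_ROOT with the exact names found
# on disk. Fails if a component is missing or matches several entries.
resolve_nocase() (
    root=$1
    rel=.
    want_path=$(printf '%s' "$2" | tr '\\' '/')
    lower() { printf '%s' "$1" | tr '[:upper:]' '[:lower:]'; }
    old_ifs=$IFS
    IFS=/
    set -f                      # no globbing while splitting the path
    set -- $want_path
    set +f
    IFS=$old_ifs
    for comp in "$@"; do
        [ -n "$comp" ] || continue
        want=$(lower "$comp")
        found=
        count=0
        for entry in "$root/$rel"/* "$root/$rel"/.[!.]*; do
            [ -e "$entry" ] || [ -L "$entry" ] || continue
            name=${entry##*/}
            if [ "$(lower "$name")" = "$want" ]; then
                found=$name
                count=$((count + 1))
            fi
        done
        [ "$count" -eq 1 ] || exit 1    # missing or ambiguous component
        rel=$rel/$found
    done
    [ "$rel" != . ] || exit 1           # reject a bare volume root
    printf '%s\n' "$rel"
)
```

With /Vista/Users present, resolve_nocase /Vista '\USERS' prints ./Users, whereas a volume holding both Users and users makes the call fail.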

Examples

Assuming the C: volume is mounted on /Vista and /Vista/.NTFS-3G/D:/Packages is defined as a symbolic link to /shared/packages:

  • if /Vista/Documents and Settings is a directory junction to C:\USERS and /Vista/Users exists on the same volume, it will be seen as a symbolic link to ./Users
  • if /Vista/global is defined as a directory junction to c:\Shared and there is no directory /Vista/shared to be found whatever the letter case, it will be seen as a symbolic link to ./.NTFS-3G/C:/Shared
  • if /Vista/Users/Tom/TomData is a directory junction to d:\shared\TomData, it will be seen as a symbolic link to ../../.NTFS-3G/D:/shared/TomData, even if there is no such directory.

Except in the first case, a second symbolic link has to be defined to get to the target directory.
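To find out which junctions still need such a second symbolic link, the links going through .NTFS-3G and leading nowhere can be listed. This is a minimal sketch with a hypothetical function name (list_undefined); it assumes file names without newlines.

```shell
# list_undefined VOLUME_ROOT   (hypothetical helper)
# Print every symbolic link below VOLUME_ROOT whose target goes through
# .NTFS-3G but does not currently lead to an existing file or directory.
list_undefined() {
    find "$1" -path "$1/.NTFS-3G" -prune -o -type l -print |
    while IFS= read -r link; do
        dest=$(readlink "$link")
        case $dest in
            .NTFS-3G/*|*/.NTFS-3G/*)
                # [ -e ] follows the link, so it fails for a dangling one
                [ -e "$link" ] || printf '%s -> %s\n' "$link" "$dest"
                ;;
        esac
    done
}
```

With the examples above and no definition for C:, list_undefined /Vista would report /Vista/global -> ./.NTFS-3G/C:/Shared.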

Volume Junctions

A volume junction, as created by Windows, defines a GUID to designate a physical drive. For example a target definition for a volume junction would appear as:

\\?\Volume{cb71f9d2-945f-11dd-8eac-00188b73099c}\

As the GUID is related to the physical drive (or USB port), the relation to the semantics of the data is poorly established, much like a drive letter. The volume junction itself apparently cannot be defined on a pluggable file system, but the target can be, which allows the same path to designate different data when the media is changed.

The way to make a volume junction appear like a symbolic link is also to define the volumes as symbolic links in the predefined location .NTFS-3G of the volume in which the junction is defined.

For instance, a volume junction in C:\Users\Tom\Data defined as \\?\Volume{[ID]}\ and designating a USB key which mounts on /media/TomData in Linux will be seen in /Vista/Users/Tom/Data as a symbolic link to ../../.NTFS-3G/Volume{[ID]}, which is expected to be defined as a symbolic link to /media/TomData. Of course, plugging in another USB key with a different label requires the definition to be adjusted.
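Since the GUID is awkward to type, the name to define can be taken from the symbolic link itself. The sketch below is an illustration only: define_for_link is a hypothetical name, and the code assumes the link has the form described above (a relative path going through .NTFS-3G) and that ln accepts the -n option.

```shell
# define_for_link LINK LINUX_TARGET   (hypothetical helper)
# LINK is a junction seen as a symbolic link going through .NTFS-3G.
# Create the .NTFS-3G entry which LINK refers to (a drive letter or a
# Volume{...} name) as a symbolic link to LINUX_TARGET.
define_for_link() (
    link=$1
    dest=$(readlink "$link") || exit 1
    case $dest in
        .NTFS-3G/*|*/.NTFS-3G/*) ;;
        *) exit 1 ;;                     # not a link through .NTFS-3G
    esac
    prefix=${dest%%.NTFS-3G/*}           # e.g. ../../
    name=${dest#"$prefix".NTFS-3G/}      # e.g. Volume{...} or C:/Shared
    name=${name%%/*}                     # keep the first component only
    dir=$(dirname "$link")/${prefix}.NTFS-3G
    mkdir -p "$dir" || exit 1
    ln -sfn "$2" "$dir/$name"
)
```

In the situation above, define_for_link /Vista/Users/Tom/Data /media/TomData would create the Volume{[ID]} entry in /Vista/.NTFS-3G, and running it again with another mount point adjusts the definition.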

Symbolic Links

A symbolic link, as created by Windows (since Vista), is very similar to a directory junction, but unlike a directory junction it can point to a file or to a remote network file or directory. The target may be defined as a path relative to the symbolic link position, or as an absolute path in the current volume or another one. Also note that symbolic links to files are different from symbolic links to directories and the target must match the definition.

If the target is defined as an absolute path, it is processed like a directory junction:

  • if no drive letter is present in the target definition, an attempt is made to translate the path to a case-sensitive one in the current volume,
  • if a drive letter is present in the target definition and not defined in .NTFS-3G, an attempt is also made to recognize the path in the current volume,
  • if a drive letter is present in the target definition, and defined in .NTFS-3G, the path is interpreted as if it were relative to .NTFS-3G. The path is not translated or checked, only the drive letter is capitalized.

In the three situations a symbolic link relative to the current location is generated.

If the target is defined as a relative path, an attempt is made to translate the path to a case-sensitive one. The translation fails if it leads to a loop or leads out of the current volume. If successful, a new symbolic link with the translated path is generated.
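The "leads out of the current volume" condition can be approximated by a purely lexical check, sketched below. The function name stays_in_volume and its DEPTH argument are hypothetical, the check does not cover loop detection, and it ignores the case translation of each component.

```shell
# stays_in_volume DEPTH WINDOWS_RELATIVE_PATH   (hypothetical helper)
# DEPTH is the number of directory levels between the volume root and the
# directory holding the symbolic link (0 when the link is in the root).
# Succeeds only if no prefix of the path climbs above the volume root.
stays_in_volume() (
    depth=$1
    path=$(printf '%s' "$2" | tr '\\' '/')
    IFS=/
    set -f                      # no globbing while splitting the path
    set -- $path
    for comp in "$@"; do
        case $comp in
            ''|.) ;;                          # no change of level
            ..)   depth=$((depth - 1)) ;;     # one level up
            *)    depth=$((depth + 1)) ;;     # one level down
        esac
        [ "$depth" -ge 0 ] || exit 1
    done
    exit 0
)
```

For a link located in \Users\Tom (depth 2), the target ..\..\Shared is accepted, while ..\..\..\Shared is rejected.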

Other Types of Reparse Points

Reparse points may be used by Windows to force some special processing when a file is requested. When such processing is not supported by ntfs-3g, the file or directory which holds the reparse point is made to appear as a symlink to "unsupported reparse point".
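Files and directories in this state can be located by searching for symbolic links with that target. This is a sketch under assumptions: list_unsupported is a hypothetical name, -lname is a find extension (available in GNU find), and the pattern is deliberately loose in case the exact wording of the target differs between ntfs-3g versions.

```shell
# list_unsupported VOLUME_ROOT   (hypothetical helper)
# Print the files and directories shown as symbolic links whose target
# text begins with "unsupported reparse".
list_unsupported() {
    find "$1" -type l -lname 'unsupported reparse*' -print
}
```

Each path printed is a candidate for the getfattr command shown at the end of this page.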

Since ntfs-3g-2016.2.22AR.1, plugins may be used to benefit from some features defined by reparse points:

  • System compression
  • Deduplicated files
  • OneDrive files

System compression is used by Windows 10 (only on powerful computers) to save space in the Windows system directory by compressing the executables and DLLs. The compression algorithms and the data layout used are different from the usual NTFS compression and they are unsuitable for compressing on the fly. A plugin has been developed for reading system compressed files, and it is available as a package from most distributions. Creating or updating such files is not supported; only normal files can be created and updated.

File deduplication is used by Windows Server 2012 to save storage space by sharing the space used by similar files, thus avoiding redundancy. The algorithms used and the data layout are also unsuitable for deduplicating on the fly. The plugin for reading deduplicated files is not part of released versions. Creating or updating such files is not supported; only normal files can be created and updated.

OneDrive is a pseudo-volume used by Windows 8.1 and 10 for storing files in the cloud, with the files marked as "always keep on this device" also stored locally. Only these local copies can be accessed by ntfs-3g through a plugin, and there is no synchronization with the cloud copy. Updated files can however be synchronized later, after rebooting into Windows. The plugin for accessing OneDrive files is not part of released versions.

When reporting an issue about an unsupported reparse point, please post the output of the commands below:

# For reporting an issue with a file or directory showing an unsupported
# reparse point, replace "file-or-directory" by the name of the said
# file or directory in the command below
#
getfattr -h -n system.ntfs_reparse_data -e hex file-or-directory
#
# For reporting your ntfs-3g configuration :
#
$(which ntfs-3g) -help 2>&1 | grep ration
file $(which ntfs-3g)
md5sum $(which ntfs-3g)
ls -ld $(strings $(which ntfs-3g) | grep ntfs-plugin | sed -e 's/ntfs-plugin.*//')
md5sum $(strings $(which ntfs-3g) | grep ntfs-plugin | sed -e 's/%08lx/*/')
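Before posting, the reparse tag can be read from the getfattr output: the tag is the first four bytes of the reparse data, stored least significant byte first. The sketch below uses a hypothetical function name (reparse_tag) and recognizes only four tag values from the Windows definitions (mount point, symbolic link, system compression, deduplication); any other value is reported as "other".

```shell
# reparse_tag HEX   (hypothetical helper)
# HEX is the value printed by getfattr -e hex, e.g. 0x030000a0...
# Prints the tag as a 32-bit hexadecimal number and a short description.
reparse_tag() (
    hex=${1#0x}
    case $hex in
        *[!0-9a-fA-F]*) exit 1 ;;        # not a hexadecimal string
    esac
    [ "${#hex}" -ge 8 ] || exit 1        # need at least four bytes
    b0=$(printf '%s' "$hex" | cut -c1-2)
    b1=$(printf '%s' "$hex" | cut -c3-4)
    b2=$(printf '%s' "$hex" | cut -c5-6)
    b3=$(printf '%s' "$hex" | cut -c7-8)
    # reverse the byte order to get the usual notation
    tag=$(printf '%s' "$b3$b2$b1$b0" | tr '[:upper:]' '[:lower:]')
    case $tag in
        a0000003) kind='junction (mount point)' ;;
        a000000c) kind='symbolic link' ;;
        80000017) kind='system compression' ;;
        80000013) kind='deduplication' ;;
        *)        kind='other' ;;
    esac
    printf '0x%s %s\n' "$tag" "$kind"
)
```

The argument is the part following system.ntfs_reparse_data= in the getfattr output.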