
Using Extended Attributes

Rakesh Pandit edited this page Aug 24, 2021 · 3 revisions

Table of Contents

  1. Introduction
  2. NTFS Attributes
  3. Reparse Data
  4. NTFS ACLs
  5. DOS names
  6. File times
  7. Object ids
  8. EFS Info
  9. NTFS EA

Introduction

Extended attributes are properties organized in (name, value) pairs, optionally attached to files or directories in order to record information which cannot be stored in the file itself. They are supported by operating systems such as Windows, Linux, Solaris, Mac OS X and others, with variations.

On Linux, specifically, four categories of extended attributes have been defined:

  • trusted : to record properties which should only be accessed by the kernel,
  • security : to record security properties of a file,
  • system : to record other system related properties on which the file owner has some control,
  • user : to record properties defined by applications.

The names of the extended attributes must be prefixed by the name of the category and a dot, hence these categories are generally referred to as name spaces. Examples of extended attribute names are security.selinux, system.posix_acl_access or user.mime_type. However, a high-level language may hide the system prefixing so that, for instance, the user.mime_type attribute name would appear as "mime_type" in source code.

They can be retrieved and set through system calls (getxattr(2), setxattr(2), removexattr(2)) or shell commands (getfattr(1), setfattr(1)), provided appropriate access conditions are met:

name space   condition for getting   condition for setting
trusted      root                    root
security     read access             root
system       read access             owner
user         read access             write access
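The naming rule and the system calls above can be sketched in Python, whose os module wraps getxattr(2) and setxattr(2) on Linux. This is a minimal sketch, not part of ntfs-3g: the helper names (qualified_name, split_name, set_user_attr, get_user_attr) are hypothetical, and the two functions that touch the file system only work where the kernel and the file system provide an extended attribute interface.

```python
import os

# The four Linux name spaces listed above.
NAMESPACES = ("trusted", "security", "system", "user")


def qualified_name(namespace, name):
    """Build a full attribute name such as 'user.mime_type'."""
    if namespace not in NAMESPACES:
        raise ValueError(f"unknown name space: {namespace!r}")
    return f"{namespace}.{name}"


def split_name(full_name):
    """Split 'user.mime_type' into ('user', 'mime_type')."""
    namespace, dot, name = full_name.partition(".")
    if not dot or not name or namespace not in NAMESPACES:
        raise ValueError(f"not a prefixed attribute name: {full_name!r}")
    return namespace, name


def set_user_attr(path, name, value):
    """Set an attribute in the user name space (needs write access)."""
    os.setxattr(path, qualified_name("user", name), value)


def get_user_attr(path, name):
    """Read an attribute in the user name space (needs read access)."""
    return os.getxattr(path, qualified_name("user", name))
```

For instance, set_user_attr("file", "mime_type", b"text/plain") would be expected to fail with a permission error if the caller has no write access to the file.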

The extended attributes are enabled through the mount option streams_interface=xattr. With this option (activated by default), the four name spaces are supported by ntfs-3g on Linux.

Your operating system has to provide an extended attribute interface for the features referenced on this page to be available. Some Linux distributions and most other operating systems do not support a compatible extended attribute interface.

On systems with open name spaces, such as Mac OS X, these features are enabled through the mount option streams_interface=openxattr.

The extended attributes in the user name space are stored on NTFS as alternate data streams whose name is the unprefixed name of the attribute, and whose content is the value of the attribute. They can be read and modified in Windows by using the standard file access functions, with a colon and the stream name (which is the unprefixed extended attribute name) appended to the file name.

Example:

The first part is executed on Linux, the second on Windows, and the last one back on Linux. The Linux file /media/tmp/tests/xattr/file is the same physical file as k:tests\xattr\file on an NTFS partition.

Linux

[user xattr]$ # Set the extended attribute "user.color" with value "green"
[user xattr]$ setfattr -n user.color -v green /media/tmp/tests/xattr/file
[user xattr]$ # Check the value of the attribute
[user xattr]$ getfattr -n user.color /media/tmp/tests/xattr/file
getfattr: Removing leading '/' from absolute path names
# file: media/tmp/tests/xattr/file
user.color="green"

Windows

D:\xattr>REM On Windows, display the alternate stream "color"
D:\xattr>more < k:tests\xattr\file:color
green
D:\xattr>REM Change the alternate stream contents to "white"
D:\xattr>echo white > k:tests\xattr\file:color
D:\xattr>more < k:tests\xattr\file:color
white

Linux

[user xattr]$ # Back on Linux, display the extended attribute "user.color"
[user xattr]$ getfattr -n user.color /media/tmp/tests/xattr/file
getfattr: Removing leading '/' from absolute path names
# file: media/tmp/tests/xattr/file
user.color="white"
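The correspondence between a Linux attribute name and the Windows stream name used in this example can be written down as a small helper. This is only an illustration of the naming rule; the function names are hypothetical and nothing here accesses a file.

```python
USER_PREFIX = "user."


def stream_name(attr_name):
    """Return the alternate data stream name for a user extended attribute."""
    if not attr_name.startswith(USER_PREFIX) or attr_name == USER_PREFIX:
        raise ValueError(f"not a user name space attribute: {attr_name!r}")
    return attr_name[len(USER_PREFIX):]


def windows_stream_path(windows_path, attr_name):
    """Append a colon and the stream name to a Windows file name."""
    return f"{windows_path}:{stream_name(attr_name)}"


print(windows_stream_path(r"k:tests\xattr\file", "user.color"))
```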

A few extended attributes in the system name space are used to give access to NTFS internal data which cannot be accessed through Linux standard methods. They can be accessed with any setting of the option streams_interface.

The program ntfscp.c and the shell command ntfscp.sh, available in tools.zip, demonstrate how files can be copied with their internal data.

Being in the system name space, these attributes are not returned when the extended attribute list is queried and, consequently, are not copied by standard file management tools.

(The extended attribute mapping described below is only available since ntfs-3g-2010.5.16AR.1) To get these extended attributes copied by standard tools, ntfs-3g has to be compiled with the ./configure option --enable-xattr-mappings and a mapping to an alternate extended attribute in the user name space has to be defined. The standard file defining the mapping is XattrMapping in the hidden directory .NTFS-3G on the root of the NTFS partition. An alternate location may be defined by the mount option xattrmapping=path where path is either a full path on a previously mounted volume or a path relative to the root of the same NTFS partition.

With such a mapping, the mapped extended attributes will be copied, for instance by tar with option --xattrs, by cp with option --preserve=xattr, by rsync with option -X, etc.

Please note: having standard tools blindly copy attributes may lead to conflicts (ACLs, symbolic links, OIDs, etc.) or undesired behavior (data compression).

The extended attribute mapping file should have a line per mapped attribute with two fields separated by a colon. The first field is the system name, the second is the user name.

Example:

system.ntfs_attrib:user.ntfs_attrib
# this is a comment line
system.ntfs_times:user.ntfs_times
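The format of the mapping file can be illustrated with a small parser. This is a sketch of the rules stated above (one mapping per line, two colon-separated fields, comment lines starting with #), not the parser used by ntfs-3g. Skipping blank lines and stripping surrounding spaces are assumptions, and the function name is hypothetical.

```python
def parse_xattr_mapping(text):
    """Return a dict mapping system attribute names to user attribute names."""
    mapping = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        # Assumption: blank lines are ignored, like comment lines.
        if not line or line.startswith("#"):
            continue
        system_name, colon, user_name = line.partition(":")
        system_name, user_name = system_name.strip(), user_name.strip()
        if not colon or not system_name or not user_name:
            raise ValueError(f"line {line_number}: expected system:user")
        mapping[system_name] = user_name
    return mapping


example = """\
system.ntfs_attrib:user.ntfs_attrib
# this is a comment line
system.ntfs_times:user.ntfs_times
"""
print(parse_xattr_mapping(example))
```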

The following sections describe how to access some NTFS internal data as extended attributes.

NTFS Attributes

The NTFS attributes are a set of miscellaneous flags associated with a file or directory.

(The attribute system.ntfs_attrib_be only exists since ntfs-3g-2010.5.22AC.5) The NTFS attributes are mapped to two four-byte word extended attributes named system.ntfs_attrib and system.ntfs_attrib_be. The value of the former is represented with the endianness of the processor used (suitable for use with system functions such as getxattr(2)), the value of the latter is represented as big-endian and is more convenient for use with commands such as getfattr(1).

Only eight flags can be changed. The system flag (FILE_ATTRIBUTE_SYSTEM) is used in symbolic links and special files, and should not be changed inadvertently. The compressed flag on a directory is only used when creating new files in the directory; it has no effect on existing files. The other changeable flags are not used by Linux.

Names of settable flags                          value (hex)
FILE_ATTRIBUTE_READONLY                          1
FILE_ATTRIBUTE_HIDDEN                            2
FILE_ATTRIBUTE_SYSTEM                            4
FILE_ATTRIBUTE_ARCHIVE                           20
FILE_ATTRIBUTE_TEMPORARY                         100
FILE_ATTRIBUTE_COMPRESSED (directories only)     800
FILE_ATTRIBUTE_OFFLINE                           1000
FILE_ATTRIBUTE_NOT_CONTENT_INDEXED               2000

When using the setfattr(1) command, be sure to define the value as a four-byte number (eight hexadecimal digits) representing the sum of the desired flag values.

Examples:

# Display the current NTFS attributes of the file source-file
getfattr -h -e hex -n system.ntfs_attrib_be source-file

# Set the NTFS read-only flag on the file target-file (any computer)
setfattr -h -v 0x00000001 -n system.ntfs_attrib_be target-file

# Set the compression flag on a directory (little-endian computer)
setfattr -h -v 0x10080000 -n system.ntfs_attrib target-directory
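The eight-digit values used in these examples can be derived mechanically from the flag table. The sketch below builds and decodes such values; the function names are hypothetical. The last example above decodes to 0x810, that is the compressed flag plus 0x10, which Windows defines as FILE_ATTRIBUTE_DIRECTORY. That flag is not in the table above and is mentioned here only to explain the example.

```python
# Settable flags from the table above (hexadecimal values).
FILE_ATTRIBUTE_READONLY = 0x1
FILE_ATTRIBUTE_HIDDEN = 0x2
FILE_ATTRIBUTE_SYSTEM = 0x4
FILE_ATTRIBUTE_ARCHIVE = 0x20
FILE_ATTRIBUTE_TEMPORARY = 0x100
FILE_ATTRIBUTE_COMPRESSED = 0x800
FILE_ATTRIBUTE_OFFLINE = 0x1000
FILE_ATTRIBUTE_NOT_CONTENT_INDEXED = 0x2000


def attrib_value(flags, byteorder="big"):
    """Format a flag sum as the eight hex digits expected by setfattr -v.

    Use byteorder="big" with system.ntfs_attrib_be, and the byte order of
    the processor ("little" on x86) with system.ntfs_attrib.
    """
    return "0x" + flags.to_bytes(4, byteorder).hex()


def decode_attrib(hex_value, byteorder="big"):
    """Turn a value printed by getfattr -e hex back into a flag sum."""
    return int.from_bytes(bytes.fromhex(hex_value[2:]), byteorder)


print(attrib_value(FILE_ATTRIBUTE_READONLY))
print(attrib_value(FILE_ATTRIBUTE_COMPRESSED | 0x10, "little"))
```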

Reparse Data

NTFS uses reparse data to define special actions before opening a file or directory. For instance, reparse data is used to define Windows junctions and symlinks.

If present, the reparse data of a file or directory is mapped to an extended attribute named system.ntfs_reparse_data, and is returned as a raw variable-length record of little-endian items. Please note:

  • setting invalid reparse data may have adverse consequences, as no check can be done when setting new data,
  • when creating or deleting a junction or symlink by setting or deleting reparse data, the status of the file is not updated in the system caches and subsequent actions on the file may fail; the caches can be updated by setting the owner again after the reparse data change,
  • when creating a junction or symlink by setting reparse data, the type of the target (plain file or directory) must match the type of the source on which the reparse data is set,
  • a file or directory cannot have both reparse data and a set of EAs.

Examples:

# Display the reparse data of the file source-file
getfattr -h -e hex -n system.ntfs_reparse_data source-file

# Copy the reparse data of the file source-file to the file target-file
REPARSE=$(getfattr -h -e hex -n system.ntfs_reparse_data source-file | \
         grep '=' | sed -e 's/^.*=//')
setfattr -h -v "$REPARSE" -n system.ntfs_reparse_data target-file
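To tell what kind of reparse point a file carries, the first bytes of the raw record can be decoded. The sketch below assumes the record starts with the header Windows documents for REPARSE_DATA_BUFFER (a 32-bit tag, a 16-bit data length and a 16-bit reserved field), and uses the Windows tag values for junctions and symlinks. None of this layout is stated on this page, and the function name is hypothetical.

```python
import struct

# Tag values as documented by Windows (assumption, not from this page).
IO_REPARSE_TAG_MOUNT_POINT = 0xA0000003  # junction
IO_REPARSE_TAG_SYMLINK = 0xA000000C

TAG_KINDS = {
    IO_REPARSE_TAG_MOUNT_POINT: "junction",
    IO_REPARSE_TAG_SYMLINK: "symlink",
}


def parse_reparse_header(raw):
    """Decode the little-endian header at the start of the reparse data."""
    if len(raw) < 8:
        raise ValueError("reparse data shorter than its 8-byte header")
    tag, data_length, _reserved = struct.unpack_from("<IHH", raw, 0)
    return {
        "tag": tag,
        "kind": TAG_KINDS.get(tag, "other"),
        "data_length": data_length,
    }


# A made-up record: symlink tag followed by four bytes of payload.
sample = struct.pack("<IHH", IO_REPARSE_TAG_SYMLINK, 4, 0) + b"abcd"
print(parse_reparse_header(sample))
```

The raw bytes could be obtained with os.getxattr(path, "system.ntfs_reparse_data", follow_symlinks=False) on a system where ntfs-3g exposes the attribute.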

NTFS ACLs

The NTFS ACLs are used to control access to files or directories. On Linux they are translated into ownership, permissions and POSIX ACL parameters.

The NTFS ACL of a file or directory is mapped to an extended attribute named system.ntfs_acl, and is returned as a raw variable-length record of little-endian items. All the components of the ACL are returned: the owner id, the group id, the discretionary ACL and the system ACL. The setting of a new NTFS ACL is rejected if the consistency check fails.

Examples:

# Display the NTFS ACL of the file source-file
getfattr -h -e hex -n system.ntfs_acl source-file

# Copy the ACL of the file source-file to the file target-file
ACL=$(getfattr -h -e hex -n system.ntfs_acl source-file | \
         grep '=' | sed -e 's/^.*=//')
setfattr -h -v "$ACL" -n system.ntfs_acl target-file
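The four components mentioned above can be located inside the raw record by reading its header. The sketch below assumes the record is a Windows self-relative security descriptor, whose 20-byte header holds a revision, a control word and the offsets of the owner, the group, the system ACL and the discretionary ACL. This layout comes from the Windows documentation, not from this page, and the function name is hypothetical.

```python
import struct

SE_SELF_RELATIVE = 0x8000  # control bit defined by Windows


def parse_security_descriptor_header(raw):
    """Decode the header of a self-relative security descriptor."""
    if len(raw) < 20:
        raise ValueError("security descriptor shorter than its header")
    revision, _sbz1, control, owner, group, sacl, dacl = struct.unpack_from(
        "<BBHIIII", raw, 0
    )
    return {
        "revision": revision,
        "control": control,
        "self_relative": bool(control & SE_SELF_RELATIVE),
        # An offset of zero means the component is absent.
        "owner_offset": owner,
        "group_offset": group,
        "sacl_offset": sacl,
        "dacl_offset": dacl,
    }


# A made-up header: DACL at offset 20, owner at 48, group at 60, no SACL.
sample = struct.pack("<BBHIIII", 1, 0, 0x8004, 48, 60, 0, 20)
print(parse_security_descriptor_header(sample))
```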

DOS names

An NTFS file may have an alternate short name to enable access by legacy applications which can only process files whose name complies with an 8.3 pattern (the main part of the name has at most 8 characters, followed by a dot and a suffix which has at most 3 characters). Linux usually does not use such short names.

The short name of a file can be queried and set as if it were the value of an extended attribute named system.ntfs_dos_name. When there is a short name, the character set used to build both the short and the long name is restricted: control characters (whose code is less than 0x20) and nine specific characters (<>:"/\|?*) are not allowed in either name. As a consequence, no short name can be set on a file whose existing name contains forbidden characters.

The Windows algorithm to derive a short name from a long name is not enforced by Linux; only the character set and the uniqueness of case-sensitive names are checked. When queried, the short name is always capitalized.

Getting or setting DOS names on files with several names (hard links) is buggy, and the problem was not detected in older versions. As it may lead to corruption, this facility has been discontinued.

Please note:

  • a file cannot have several short names and associated long names,
  • setting a DOS name requires having write access to the parent directory.

Examples:

# Display the short name of the file source-file
getfattr -h -n system.ntfs_dos_name source-file

# Set TARGET~1 as the short name of the file target-file
setfattr -h -v "TARGET~1" -n system.ntfs_dos_name target-file
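Before calling setfattr, a candidate short name can be checked against the rules described in this section. The sketch below only tests the 8.3 length pattern and the forbidden characters; it does not reproduce the Windows derivation algorithm or the uniqueness check. The nine forbidden characters are assumed to be <>:"/\|?* (the set Windows forbids in file names), and the function names are hypothetical.

```python
# Assumed set of the nine forbidden characters (backslash included).
FORBIDDEN_CHARS = set('<>:"/\\|?*')


def has_forbidden_chars(name):
    """True if the name holds a control character or a forbidden character."""
    return any(ord(ch) < 0x20 or ch in FORBIDDEN_CHARS for ch in name)


def fits_8_3_pattern(name):
    """True if the name has at most 8 characters, then optionally a dot
    and a suffix of at most 3 characters, with no forbidden character."""
    if not name or has_forbidden_chars(name):
        return False
    base, _dot, suffix = name.partition(".")
    if "." in suffix:
        return False
    return 1 <= len(base) <= 8 and len(suffix) <= 3


for candidate in ("TARGET~1", "README.TXT", "longfilename.txt", "bad:name"):
    print(candidate, fits_8_3_pattern(candidate))
```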

File times

An NTFS file is qualified by a set of four time stamps "representing the number of 100-nanosecond intervals since January 1, 1601 (UTC)", though UTC has not been defined for years before 1961 because of unknown variations of the Earth's rotation. These are:

  • the time the file was created,
  • the time the file was last written or truncated,
  • the time the file was last accessed,
  • the time some attribute of the file was last changed.

(The attribute system.ntfs_times_be only exists since ntfs-3g-2010.5.22AC.5) These times are mapped to two extended attributes named system.ntfs_times and system.ntfs_times_be as an array of 64-bit quantities in the above order. For the former, the times are represented with the endianness of the processor used; for the latter, they are represented as big-endian values. All four times can be queried, but only the first three can be set, as the last one records the time of any change for an unknown purpose (possibly journalling, mirroring or backup). Creation time is set if the value is defined with at least 8 bytes, modification time is set if the value is defined with at least 16 bytes, and access time is set if the value is defined with at least 24 bytes.

Example:

# Get and hex display the creation time of the file source-file
CRTIME=$(getfattr -h -e hex -n system.ntfs_times source-file | \
       grep '=' | sed -e 's/^.*=\(0x................\).*$/\1/')
echo "$CRTIME"
# Set only this creation time to the file target-file
setfattr -h -v "$CRTIME" -n system.ntfs_times target-file

(The attributes system.ntfs_crtime and system.ntfs_crtime_be only exist since ntfs-3g-2010.5.22AC.5) Moreover, the creation time is mapped to two extended attributes named system.ntfs_crtime and system.ntfs_crtime_be as 64-bit values represented in the two endianness modes.
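The 64-bit values can be converted to and from calendar dates with a little arithmetic. The sketch below decodes the array returned for system.ntfs_times_be (or system.ntfs_times when the matching byte order is passed). The function names and the labels given to the four entries are hypothetical, and sub-microsecond precision is dropped because Python's datetime cannot hold it.

```python
import struct
from datetime import datetime, timedelta, timezone

NTFS_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
TIME_NAMES = ("created", "written", "accessed", "changed")


def ntfs_time_to_datetime(ticks):
    """Convert a count of 100-nanosecond intervals since 1601 to a datetime."""
    return NTFS_EPOCH + timedelta(microseconds=ticks // 10)


def datetime_to_ntfs_time(moment):
    """Convert a timezone-aware datetime to 100-nanosecond intervals."""
    delta = moment - NTFS_EPOCH
    seconds = delta.days * 86400 + delta.seconds
    return seconds * 10_000_000 + delta.microseconds * 10


def decode_times(raw, byteorder="big"):
    """Decode one to four 64-bit time stamps, in the order listed above."""
    if len(raw) % 8 or not 8 <= len(raw) <= 32:
        raise ValueError("expected 8, 16, 24 or 32 bytes")
    prefix = ">" if byteorder == "big" else "<"
    ticks = struct.unpack(f"{prefix}{len(raw) // 8}Q", raw)
    return {name: ntfs_time_to_datetime(t) for name, t in zip(TIME_NAMES, ticks)}


unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
print(datetime_to_ntfs_time(unix_epoch))
```

Packing a single value with struct.pack(">Q", ticks) gives the 8 bytes that set only the creation time through system.ntfs_times_be.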

Object ids

Object ids (OIDs, also called GUIDs or UUIDs) are 16-byte labels identifying objects unambiguously. On NTFS, files and directories may have their own local OID which points to the initial OID of the file concatenated to the OID of the volume and the domain on which the file was initially created (64 bytes in all). The OID can thus be used to find files and directories which were relocated (see MSDN).

The full GUID of a file is mapped to an extended attribute named system.ntfs_object_id as a sequence of 64 bytes.
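The 64 bytes can be split into their four 16-byte components as sketched below. The field order (object id, birth volume id, birth object id, domain id) and the mixed-endian GUID layout handled by uuid.UUID(bytes_le=...) are assumptions based on the usual Windows and NTFS conventions; they are not spelled out on this page. The function and field names are hypothetical.

```python
import uuid

# Assumed order of the four 16-byte components.
OBJECT_ID_FIELDS = ("object_id", "birth_volume_id", "birth_object_id", "domain_id")


def split_object_id(raw):
    """Split the 64-byte value of system.ntfs_object_id into four GUIDs."""
    if len(raw) != 64:
        raise ValueError("expected exactly 64 bytes")
    return {
        name: uuid.UUID(bytes_le=raw[index * 16:(index + 1) * 16])
        for index, name in enumerate(OBJECT_ID_FIELDS)
    }


# A made-up value: one known GUID followed by three all-zero GUIDs.
known = uuid.UUID("01234567-89ab-cdef-0123-456789abcdef")
sample = known.bytes_le + bytes(48)
print(split_object_id(sample)["object_id"])
```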

EFS Info

EFS is the methodology used in NTFS to deal with encrypted files. An encrypted file consists of the user data, encoded such that an unauthorized person is not able to interpret it, and the decryption information, the EFS Info, which contains the key, encrypted so that only authorized people can get it in order to interpret the user data.

(The attribute system.ntfs_efsinfo only exists since ntfs-3g-2010.5.22AC.5) The decryption information is mapped to an extended attribute named system.ntfs_efsinfo. It can thus be retrieved and recreated, so that backup applications can save and restore encrypted files and associated data without interpreting anything. The EFS info is specific to a file, so the file contents, its user extended attributes and the EFS info have to be kept associated.

Reading and writing raw encrypted data and decryption information are only possible when using the mount option efs_raw. This causes the size of the raw data to appear slightly bigger than the original data. When the option efs_raw is set, the EFS info is automatically mapped to the alternate extended attribute user.ntfs.efsinfo in user name space for easier use with standard backup tools.

Please note: encryption and compression are not compatible. Do not restore an encrypted file into a directory marked for compression.

NTFS EA

The set of EAs is an attribute of files or directories originally designed to hold extended attributes for the OS/2 operating system. This attribute is not what Unix uses to store its own extended attributes, though the concepts are very similar. It was not much used until Windows 8 came out, but some usage is emerging now.

(The attribute system.ntfs_ea only exists since ntfs-3g-2014.2.15AR.1) The full set of EAs of a file or directory is mapped to the extended attribute system.ntfs_ea, and there is no direct access to an individual EA. The structure of each individual EA is what Windows defines as FILE_FULL_EA_INFORMATION, with the difference that the last NextEntryOffset field is not seen as zero as it is in the Win32 functions ZwQueryEaFile() and ZwSetEaFile(). The on-disk representations are however the same.

When setting a set of EAs, its consistency is checked and the update is rejected if the check fails.
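Since there is no direct access to an individual EA, a reader has to walk the whole set. The sketch below assumes the FILE_FULL_EA_INFORMATION layout documented by Windows: a 32-bit NextEntryOffset, an 8-bit Flags, an 8-bit EaNameLength, a 16-bit EaValueLength, then the name, a NUL byte and the value. Because the last NextEntryOffset is not zero in the value returned here, the loop stops at the end of the buffer and also accepts a zero offset. The function name is hypothetical.

```python
import struct


def parse_ea_set(raw):
    """Return a list of (name, flags, value) tuples from a raw set of EAs."""
    entries = []
    offset = 0
    while offset + 8 <= len(raw):
        next_offset, flags, name_len, value_len = struct.unpack_from(
            "<IBBH", raw, offset
        )
        name_start = offset + 8
        value_start = name_start + name_len + 1  # skip the NUL after the name
        value_end = value_start + value_len
        if value_end > len(raw):
            raise ValueError("truncated EA entry")
        name = raw[name_start:name_start + name_len].decode("ascii", "replace")
        entries.append((name, flags, raw[value_start:value_end]))
        if next_offset == 0:  # form used by the Windows functions
            break
        offset += next_offset
    return entries


# Two made-up entries of 12 bytes each, the last with a non-zero offset.
sample = (
    struct.pack("<IBBH", 12, 0, 1, 2) + b"A\x00xy"
    + struct.pack("<IBBH", 12, 0, 2, 1) + b"BC\x00z"
)
print(parse_ea_set(sample))
```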

Please note: a file or directory cannot have both a set of EAs and reparse data.