Linux exFAT traps
If you plan to use exFAT on Linux, beware of a critical issue:
You can happily create a filename containing invalid Unicode, but the FUSE exFAT driver (fuse-exfat) will crash every time you access it.
Even worse, fsck.exfat will not catch the problem and thus will not help.
There is a similar bug reported at:
- https://www.mail-archive.com/[email protected]/msg1756058.html
But that was not my case (fsck.exfat finished without errors).
I have a LUKS partition with exFAT for backups. So I first map that partition as a device using:
cryptsetup luksOpen /dev/sdd1 myluks
# this creates device /dev/mapper/myluks
Now I can use the regular fsck.exfat to test for problems:
fsck.exfat -rv /dev/mapper/myluks
exfatprogs version : 1.0.4
volume label [WD_LUKS]
sector size: 512.00 B
cluster size: 128.00 KB
volume size: 1023.99 GB
/dev/mapper/myluks: clean. directories 3149, files 19715
As you can see, everything is OK and shiny. Now we will mount it read-only and in debug mode:
mount.exfat-fuse -d -o ro /dev/mapper/myluks /mnt/test/
# it will stay attached to terminal
In another terminal I will just run find to scan all directories:
find /mnt/test/
# suddenly there will be errors:
find: '/mnt/test/some_path': Software caused connection abort
find: '/mnt/test/some_path': Transport endpoint is not connected
...
find: failed to read file names from file system at or below ‘/mnt/test/’: Transport endpoint is not connected
On the terminal running mount.exfat-fuse we will see:
LOOKUP /some_path
getattr /some_path
NODEID: 443
unique: 4478, success, outsize: 144
unique: 4480, opcode: OPENDIR (27), nodeid: 443, insize: 48, pid: 6552
unique: 4480, success, outsize: 32
unique: 4482, opcode: READDIR (28), nodeid: 443, insize: 80, pid: 6552
readdir[0] from 0
ERROR: illegal UTF-16 sequence.
BUG: failed to convert name to UTF-8.
Aborted (core dumped)
To try again we first have to force-unmount the filesystem:
umount -f /mnt/test
Related packages:
$ rpm -qf /usr/sbin/fsck.exfat
exfatprogs-1.0.4-150300.3.6.1.x86_64
$ rpm -qf /sbin/mount.exfat-fuse
fuse-exfat-1.3.0-bp154.1.20.x86_64
Now we have to install the debuginfo packages using:
zypper --plus-content debug in exfatprogs-debuginfo fuse-exfat-debuginfo libfuse2-debuginfo
Also install GDB:
zypper in gdb
This time we will run mount.exfat-fuse again, but under GDB:
gdb /sbin/mount.exfat-fuse
(gdb) run -d -o ro /dev/mapper/myluks /mnt/test/
Again in another terminal invoke:
find /mnt/test/
Now you will find that GDB caught the abort() call (so find just hangs and does not report an error yet):
readdir[0] from 0
ERROR: illegal UTF-16 sequence.
BUG: failed to convert name to UTF-8.
Program received signal SIGABRT, Aborted.
0x00007ffff7b95c6b in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: zypper install libfuse2-debuginfo-2.9.7-3.3.1.x86_64
# we can try backtrace:
(gdb) bt
#0 0x00007ffff7b95c6b in raise () from /lib64/libc.so.6
#1 0x00007ffff7b97305 in abort () from /lib64/libc.so.6
#2 0x0000555555403e89 in exfat_bug (format=format@entry=0x5555554096a8 "failed to convert name to UTF-8")
at log.c:58
#3 0x0000555555408054 in exfat_get_name (node=node@entry=0x5555558addf0,
buffer=buffer@entry=0x7fffffffda00 "(Precko") at utils.c:53
#4 0x0000555555402363 in fuse_exfat_readdir (path=<optimized out>, buffer=0x55555560e4b0,
filler=0x7ffff7d6dc60 <fill_dir>, offset=<optimized out>, fi=<optimized out>) at main.c:131
#5 0x00007ffff7d73287 in fuse_fs_readdir (fs=0x55555560f010,
path=0x55555562f660 "/PATH/dane/obecne", buf=0x55555560e4b0,
filler=0x7ffff7d6dc60 <fill_dir>, off=0, fi=0x7fffffffddd0) at fuse.c:2009
#6 0x00007ffff7d73448 in readdir_fill (fi=0x7fffffffddd0, dh=0x55555560e4b0, off=0, size=4096, ino=443,
req=0x55555562f9b0, f=0x55555560eeb0) at fuse.c:3467
#7 fuse_lib_readdir (req=0x55555562f9b0, ino=443, size=4096, off=0, llfi=<optimized out>) at fuse.c:3493
#8 0x00007ffff7d79f72 in do_readdir (req=<optimized out>, nodeid=<optimized out>, inarg=<optimized out>)
at fuse_lowlevel.c:1390
#9 0x00007ffff7d7b101 in fuse_ll_process_buf (data=0x55555560f1a0, buf=0x7fffffffe050, ch=<optimized out>)
at fuse_lowlevel.c:2443
#10 0x00007ffff7d779af in fuse_session_loop (se=se@entry=0x55555560e700) at fuse_loop.c:40
#11 0x00007ffff7d6fcd8 in fuse_loop (f=f@entry=0x55555560eeb0) at fuse.c:4322
#12 0x00007ffff7d8024c in fuse_main_common (argc=argc@entry=5, argv=argv@entry=0x7fffffffe1d0,
op=op@entry=0x55555560c1e0 <fuse_exfat_ops>, op_size=op_size@entry=360, user_data=user_data@entry=0x0,
compat=compat@entry=25) at helper.c:371
--Type <RET> for more, q to quit, c to continue without paging--
#13 0x00007ffff7d802fe in fuse_main_real_compat25 (argc=argc@entry=5, argv=argv@entry=0x7fffffffe1d0,
op=op@entry=0x55555560c1e0 <fuse_exfat_ops>, op_size=op_size@entry=360) at helper.c:476
#14 0x0000555555401d43 in fuse_exfat_main (mount_point=0x7fffffffe711 "/mnt/test/",
mount_options=0x55555560e330 "allow_other,big_writes,blkdev,default_permissions,debug,fsname=/dev/mapper/myluks,ro,blksize=4096") at main.c:511
#15 main (argc=<optimized out>, argv=<optimized out>) at main.c:603
We can already see the problematic path in frame #5:
#5 0x00007ffff7d73287 in fuse_fs_readdir (fs=0x55555560f010,
path=0x55555562f660 "/PATH/dane/obecne", buf=0x55555560e4b0,
To see exactly which filename caused the crash we have to look into frame 3:
(gdb) frame level 3
#3 0x0000555555408054 in exfat_get_name (node=node@entry=0x5555558addf0,
buffer=buffer@entry=0x7fffffffda00 "(Precko") at utils.c:53
53 in utils.c
Let's first try a simple command:
(gdb) print node->name
$13 = {{__u16 = 40}, {__u16 = 80}, {__u16 = 114}, {__u16 = 55981}, {__u16 = 121}, {__u16 = 32}, {
__u16 = 102}, {__u16 = 121}, {__u16 = 122}, {__u16 = 105}, {__u16 = 99}, {__u16 = 107}, {
...
To make it more readable (for later analysis) we can print them as 16-bit hex numbers:
(gdb) x/16xh node->name
0x5555558ade48: 0x0028 0x0050 0x0072 0xdaad 0x0079 0x0020 0x0066 0x0079
0x5555558ade58: 0x007a 0x0069 0x0063 0x006b 0xd842 0xdff3 0x006f 0x0062
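We can clearly see that there are 3 ASCII characters (0x28, 0x50 and 0x72, i.e. "(Pr") followed by the weird code unit 0xdaad. That value is a UTF-16 high surrogate (range 0xd800-0xdbff), which is only valid when immediately followed by a low surrogate (range 0xdc00-0xdfff); here it is followed by 0x0079.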
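To double-check that reading of the dump outside of GDB, here is a minimal Python sketch. It only assumes the 16 code units printed by x/16xh above; the helper names (to_utf16le, try_decode) are mine and not part of any exFAT tool. It packs the units as little-endian bytes and asks Python's strict UTF-16 decoder for an opinion, then prints a lossy view where invalid units are replaced by U+FFFD.

```python
import struct

# Code units copied from the `x/16xh node->name` dump above.
UNITS = [
    0x0028, 0x0050, 0x0072, 0xDAAD, 0x0079, 0x0020, 0x0066, 0x0079,
    0x007A, 0x0069, 0x0063, 0x006B, 0xD842, 0xDFF3, 0x006F, 0x0062,
]


def to_utf16le(units):
    """Pack 16-bit code units as little-endian bytes."""
    return struct.pack("<%dH" % len(units), *units)


def try_decode(units):
    """Return (text, None) on success or (None, error description) on failure."""
    try:
        return to_utf16le(units).decode("utf-16-le"), None
    except UnicodeDecodeError as exc:
        return None, "%s at byte offset %d" % (exc.reason, exc.start)


raw = to_utf16le(UNITS)
text, error = try_decode(UNITS)
print("strict decode:", text, "/", error)

# Lossy view for humans: invalid units become U+FFFD.
# ascii() keeps the output printable on any terminal encoding.
lossy = raw.decode("utf-16-le", errors="replace")
print("lossy decode: ", ascii(lossy))
```

Note that the later pair 0xd842 0xdff3 in the same dump is a well-formed surrogate pair, so only the 0xdaad unit is the problem.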
To print a Unicode (and potentially malformed) name we can use a trick from https://stackoverflow.com/questions/39141801/how-to-print-unicode-string-in-gdb-when-debugging-in-windows
(gdb) x/sh node->name
0x5555558ade48: u"(Pr\xdaady fyzick.....poradcu).pdf"
(Unprintable mess replaced with dots)
If you want to see the details of the node variable, you can follow another trick from https://stackoverflow.com/questions/1768620/how-do-i-show-what-fields-a-struct-has-in-gdb
(gdb) ptype /o node
type = const struct exfat_node {
/* 0 | 8 */ struct exfat_node *parent;
/* 8 | 8 */ struct exfat_node *child;
/* 16 | 8 */ struct exfat_node *next;
/* 24 | 8 */ struct exfat_node *prev;
/* 32 | 4 */ int references;
/* 36 | 4 */ uint32_t fptr_index;
/* 40 | 4 */ cluster_t fptr_cluster;
/* XXX 4-byte hole */
.....
/* 64 | 8 */ uint64_t size;
/* 72 | 8 */ time_t mtime;
/* 80 | 8 */ time_t atime;
/* 88 | 512 */ le16_t name[256];
So now we know which filename crashes fuse-exfat. The question is how to fix it. (Some people on the Internet advise using a Windows machine, but there is currently no good LUKS-compatible driver for Windows; all known projects have been abandoned.)
To see what the source looks like we can try:
zypper si fuse-exfat
rpmbuild -bp /usr/src/packages/SPECS/fuse-exfat.spec
less /usr/src/packages/BUILD/fuse-exfat-1.3.0/libexfat/utils.c
Important sources:
void exfat_get_name(const struct exfat_node* node,
        char buffer[EXFAT_UTF8_NAME_BUFFER_MAX])
{
    if (utf16_to_utf8(buffer, node->name, EXFAT_UTF8_NAME_BUFFER_MAX,
                EXFAT_NAME_MAX) != 0)
        exfat_bug("failed to convert name to UTF-8");
}
Finally, in /usr/src/packages/BUILD/fuse-exfat-1.3.0/libexfat/utf.c we can see:
int utf16_to_utf8(char* output, const le16_t* input, size_t outsize,
        size_t insize)
{
    const le16_t* inp = input;
    char* outp = output;
    wchar_t wc;
    while (inp - input < insize)
    {
        inp = utf16_to_wchar(inp, &wc, insize - (inp - input));
        if (inp == NULL)
        {
            exfat_error("illegal UTF-16 sequence");
            return -EILSEQ;
        }
        // ...
    }
    // ...
}
// and also
static const le16_t* utf16_to_wchar(const le16_t* input, wchar_t* wc,
        size_t insize)
{
    if ((le16_to_cpu(input[0]) & 0xfc00) == 0xd800)
    {
        if (insize < 2 || (le16_to_cpu(input[1]) & 0xfc00) != 0xdc00)
            return NULL;
        *wc = ((wchar_t) (le16_to_cpu(input[0]) & 0x3ff) << 10);
        *wc |= (le16_to_cpu(input[1]) & 0x3ff);
        *wc += 0x10000;
        return input + 2;
    }
    else
    {
        *wc = le16_to_cpu(*input);
        return input + 1;
    }
}
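So a high surrogate that is not followed by a low surrogate makes utf16_to_wchar() return NULL, utf16_to_utf8() then returns -EILSEQ, and exfat_get_name() calls exfat_bug(), which aborts the whole process. There is no way to handle this case gracefully...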
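To see which code units trip that check, the surrogate test from utf16_to_wchar() can be mirrored in a few lines of Python. This is only a sketch of the same mask logic: find_unpaired_surrogates is a hypothetical helper of mine, it works on already byte-swapped integers, and unlike the C code it keeps scanning after the first bad unit so that all offenders are reported.

```python
def find_unpaired_surrogates(units):
    """Return indices of code units that the utf16_to_wchar() check rejects.

    Mirrors the C logic: a unit matching 0xd800 under mask 0xfc00 (high
    surrogate) must be followed by a unit matching 0xdc00 (low surrogate).
    """
    bad = []
    i = 0
    while i < len(units):
        if (units[i] & 0xFC00) == 0xD800:
            has_low = i + 1 < len(units) and (units[i + 1] & 0xFC00) == 0xDC00
            if has_low:
                i += 2          # well-formed pair, skip both units
                continue
            bad.append(i)       # the C code would return NULL here
        i += 1
    return bad


# Code units from the gdb dump of node->name.
units = [
    0x0028, 0x0050, 0x0072, 0xDAAD, 0x0079, 0x0020, 0x0066, 0x0079,
    0x007A, 0x0069, 0x0063, 0x006B, 0xD842, 0xDFF3, 0x006F, 0x0062,
]
for index in find_unpaired_surrogates(units):
    print("unit %d is an unpaired high surrogate: 0x%04x" % (index, units[index]))
```

As in the C source shown above, a lone low surrogate is not flagged by this particular check; only a high surrogate without its partner is.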
Tested on this environment:
$ cat /etc/SUSE-brand
openSUSE
VERSION = 15.4
Install these packages:
$ sudo zypper in exfatprogs fuse-exfat libfuse2
Tested versions:
$ rpm -q exfatprogs fuse-exfat libfuse2
exfatprogs-1.0.4-150300.3.6.1.x86_64
fuse-exfat-1.3.0-bp154.1.20.x86_64
libfuse2-2.9.7-3.3.1.x86_64
How to reproduce:
# 1st terminal:
sudo mkdir -p /mnt/test
rm -f ~/exfat.img
cd
truncate -s 128M ~/exfat.img
/usr/sbin/mkfs.exfat -L EXFAT_BUG ~/exfat.img
loop_dev=`sudo /sbin/losetup -f --show ~/exfat.img`
sudo /usr/sbin/mount.exfat -d $loop_dev /mnt/test
# will print something like:
FUSE exfat 1.3.0
FUSE library version: 2.9.7
nullpath_ok: 0
...
Now open another terminal session and run:
# 2nd terminal:
sudo mkdir /mnt/test/bad`echo -ne '\xed\xaa\xad'`
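Why these three bytes? A small Python check (a sketch, not part of the reproducer) shows that ED AA AD is not valid UTF-8: it is the 3-byte form of the surrogate code point U+DAAD, the same value found in the gdb dump of my backup disk. Since the mkdir succeeds and the crash only shows up after a remount, this suggests the write path accepts such a name while the read path rejects it, but I have not verified that in the fuse-exfat source.

```python
name_bytes = b"\xed\xaa\xad"  # bytes appended to "bad" by the mkdir command


def is_valid_utf8(data):
    """Return True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The 'surrogatepass' error handler lets Python decode the 3-byte
# encoding of a surrogate code point instead of raising an error.
code_point = ord(name_bytes.decode("utf-8", errors="surrogatepass"))

print("valid UTF-8:", is_valid_utf8(name_bytes))
print("code point: ", hex(code_point))
```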
Now unmount and remount the exFAT filesystem to force the filenames to be reloaded from disk:
# 2nd terminal:
sudo umount /mnt/test
Back on the 1st terminal, mount the filesystem again:
# 1st terminal:
sudo /usr/sbin/mount.exfat -d $loop_dev /mnt/test
And on 2nd terminal:
# 2nd terminal:
ls -l /mnt/test/
ls: reading directory '/mnt/test/': Software caused connection abort
Oops, fatal error: on the 1st terminal you can see that the process was abort()-ed with:
readdir[0] from 0
ERROR: illegal UTF-16 sequence.
BUG: failed to convert name to UTF-8.
Aborted
Tip: before remounting you need to force-unmount:
# 1st terminal:
sudo umount -f /mnt/test
Now we are in serious trouble: every time some process accesses that filename, fuse-exfat will crash, leaving the mounted filesystem stuck and inaccessible...
Recently I got a ZIP file with a really weird filename encoding.
Running this command:
unzip -l archive.zip | od -c
revealed this 8-bit encoding:
0001140 3 7 203 e s t n e p r o h l
We know that 0203 octal (131 decimal, 0x83 hexadecimal) should be Ccaron (i.e. Č). I was unable to find any encoding in the iconv -l output that would fit this definition.
Thanks to my early involvement in Linux (I started with MCC around 1993 and later switched to RedHat 4.0 around 1996) I recalled the cstocs utility, written by a good Czech guy. There I found:
131 Ccaron
So it is the Cork encoding, originally used for TeX fonts.
To convert it you can download cstocs yourself:
- https://metacpan.org/dist/Cstools/view/bin/cstocs.PL
And use it with arguments like:
unzip -l archive.zip | cstocs cork utf8
Please note that there is no easy way to unpack files from a ZIP while re-encoding their filenames (without writing such a program yourself). However, now we at least know which encoding it uses. Some pointers:
- https://unix.stackexchange.com/questions/251969/how-can-i-correctly-decompress-a-zip-archive-of-files-with-hebrew-names
- https://unix.stackexchange.com/questions/362812/encoding-of-cyrillic-filenames-in-zip-files but no real solution in this case.
Maybe modifying this Python script would help :-)
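For what it is worth, here is a minimal sketch of how such a program could start, using only Python's standard zipfile module. Everything in it is an assumption on my side: the CORK_HIGH table contains just the one mapping confirmed above (0x83 is Ccaron) and has to be extended from the cstocs cork table, bytes below 0x80 are treated as plain ASCII, and decode_cork and list_names are hypothetical helper names. It relies on the fact that zipfile decodes names without the UTF-8 flag (bit 11) as cp437 by default, which can be reversed to get the raw bytes back.

```python
import io
import zipfile

# Partial table: only 0x83 -> Č (Ccaron) is confirmed by the text above.
# Extend it from the cstocs "cork" table as needed.
CORK_HIGH = {0x83: "\u010c"}


def decode_cork(raw):
    """Decode bytes with the partial Cork table; unknown bytes become U+FFFD."""
    return "".join(
        chr(b) if b < 0x80 else CORK_HIGH.get(b, "\ufffd") for b in raw
    )


def list_names(zip_source):
    """Return member names, re-decoding legacy (non-UTF-8) names as Cork."""
    names = []
    with zipfile.ZipFile(zip_source) as zf:
        for info in zf.infolist():
            if info.flag_bits & 0x800:
                # Bit 11 set: the name is already stored as UTF-8.
                names.append(info.filename)
            else:
                # zipfile decoded the raw bytes as cp437; undo that first.
                names.append(decode_cork(info.filename.encode("cp437")))
    return names


# Demo on a tiny in-memory archive (ASCII name only).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("plain.txt", "x")
print(list_names(demo))

# ascii() keeps the output printable on any terminal encoding.
print(ascii(decode_cork(b"\x83est")))
```

Extracting under the corrected names would additionally need reading each member with zf.open(info) and writing it to a path built from the decoded name; that part is left out here.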
- GitHub source for fuse-exfat from spec file:
- https://github.com/relan/exfat
- RPM Version: 1.3.0
- GitHub source for exfatprogs
- https://github.com/exfatprogs/exfatprogs
- RPM Version: 1.0.4
- Reported bug here:
- Interesting comment:
exfat-utils is superseded by exfatprogs.
- and also: