How to download a file with a non-ASCII filename from S3 - suguanyu/WechatTinyProgram GitHub Wiki

Ref: http://stackoverflow.com/questions/93551/how-to-encode-the-filename-parameter-of-content-disposition-header-in-http/29459051#29459051
Ref: https://forums.aws.amazon.com/message.jspa?messageID=410130

When I use the straightforward method to download an S3 file whose key contains non-ASCII characters, I get this error message:

<Error>
<Code>InvalidArgument</Code>
<Message>
Header value cannot be represented using ISO-8859-1.
</Message>
<ArgumentName>response-content-disposition</ArgumentName>
<ArgumentValue>
attachment; filename="oB57r0M2q8OjJrxBSziERXSUZmBQ-1491715829-中文.jpg"
</ArgumentValue>
<RequestId>6487281B6704800A</RequestId>
<HostId>
ZD5xMsOZ8y0U4aH+Dy7yjCNN54/Xlbvt++cgur02z2y0qkRXRkaxtxA+E3SDSOVlbuKL4r8gi90=
</HostId>
</Error>

Solution for PHP:

Solution 1

In PHP this did it for me (assuming the filename is UTF8 encoded):

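// filename= carries the ISO-8859-1 fallback; filename*= carries the percent-encoded UTF-8 name.
// Note: utf8_decode() is deprecated as of PHP 8.2.
header('Content-Disposition: attachment; '
    . 'filename="' . addslashes(utf8_decode($filename)) . '"; '
    . 'filename*=utf-8\'\'' . rawurlencode($filename));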

Tested against IE8-11, Firefox and Chrome. If the browser can interpret filename*=utf-8 it will use the UTF8 version of the filename, else it will use the decoded filename. If your filename contains characters that can't be represented in ISO-8859-1 you might want to consider using iconv instead.
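The S3 error above complains that the `response-content-disposition` value cannot be represented in ISO-8859-1, so one way to stay safe is to keep the whole value ASCII-only. The following is a minimal sketch of that idea, not the code from the answers quoted here: `contentDispositionValue` is a hypothetical helper name, it assumes `$filename` is valid UTF-8, and it replaces every non-ASCII character in the legacy `filename=` part with an underscore instead of converting it with `utf8_decode` or `iconv`.

```php
<?php
// Hypothetical helper (an assumption, not part of the quoted answers):
// builds a Content-Disposition value that contains only ASCII bytes.
function contentDispositionValue(string $filename): string
{
    // Legacy fallback: replace each non-printable or non-ASCII character,
    // plus the double quote and the backslash, with "_".
    // preg_replace returns null on invalid UTF-8, hence the "?? 'download'".
    $fallback = preg_replace('/[^\x20-\x7E]|["\\\\]/u', '_', $filename) ?? 'download';

    // filename*= carries the real name, percent-encoded as UTF-8.
    return 'attachment; filename="' . $fallback . '"; '
        . "filename*=UTF-8''" . rawurlencode($filename);
}

echo contentDispositionValue("中文.jpg"), PHP_EOL;
// attachment; filename="__.jpg"; filename*=UTF-8''%E4%B8%AD%E6%96%87.jpg
```

If you generate the download link with the AWS SDK for PHP, the resulting string could presumably be passed as the `ResponseContentDisposition` parameter of the `GetObject` command; that wiring is not shown in the sources referenced here, so treat it as an assumption to verify.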

Solution 2

I ended up with the following code in my "download.php" script (based on this blogpost and these test cases).

$il1_filename = utf8_decode($filename);
$to_underscore = "\"\\#*;:|<>/?";
$safe_filename = strtr($il1_filename, $to_underscore, str_repeat("_", strlen($to_underscore)));

header("Content-Disposition: attachment; filename=\"$safe_filename\""
    . ($safe_filename === $filename ? "" : "; filename*=UTF-8''" . rawurlencode($filename)));

This sends only the standard filename="..." form when the name survives the conversion unchanged (in practice, when it is plain ASCII and contains none of the characters in the underscore list); otherwise it also adds the URL-encoded filename*=UTF-8'' form. According to this specific test case, it should work from MSIE9 up, and on recent Firefox, Chrome and Safari; older MSIE versions should offer the ISO-8859-1 version of the filename, with underscores in place of characters that do not exist in this encoding.
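Because `utf8_decode` is deprecated as of PHP 8.2, the sanitizing step can be sketched with `mb_convert_encoding` instead. This is an illustration under assumptions, not the original author's code: `latin1SafeFilename` is a hypothetical name, the mbstring extension must be loaded, and the input is assumed to be valid UTF-8. Characters without an ISO-8859-1 equivalent come out of the conversion as `?` (mbstring's default substitute character), and since `?` is in the replacement list they end up as underscores.

```php
<?php
// Hypothetical helper; assumes the mbstring extension and valid UTF-8 input.
function latin1SafeFilename(string $filename): string
{
    // Convert UTF-8 to ISO-8859-1; unrepresentable characters become "?".
    $latin1 = mb_convert_encoding($filename, 'ISO-8859-1', 'UTF-8');

    // Characters to replace with "_" (quote, backslash, and other risky ones).
    $toUnderscore = "\"\\#*;:|<>/?";

    // strtr maps the i-th character of the list to the i-th "_".
    return strtr($latin1, $toUnderscore, str_repeat('_', strlen($toUnderscore)));
}

echo latin1SafeFilename('report:v2?.txt'), PHP_EOL; // report_v2_.txt
echo latin1SafeFilename("中文.jpg"), PHP_EOL;        // __.jpg
```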

Final note: the maximum size of each header field is 8190 bytes on Apache by default. A UTF-8 character can take up to four bytes, and rawurlencode turns each byte into three, so one character can cost 12 bytes. Pretty inefficient, but it should still be theoretically possible to fit more than 600 "smiles" (%F0%9F%98%81) in the filename.
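The arithmetic can be checked with a few lines of PHP. This is only a back-of-the-envelope sketch: it takes the 8190-byte limit from the note above as given and ignores the bytes used by the header name and the other parameters, so the real ceiling is somewhat lower.

```php
<?php
// Assumed limit: the per-field header size quoted in the note above.
$limit = 8190;

// U+1F601, the "smile" from the text; four bytes in UTF-8.
$smile = "\u{1F601}";

// Each of the four bytes becomes a three-character %XX escape.
$encoded = rawurlencode($smile);
$bytesPerChar = strlen($encoded);

// Upper bound on how many such characters fit in one header field.
$maxSmiles = intdiv($limit, $bytesPerChar);

echo $encoded, ' ', $bytesPerChar, ' ', $maxSmiles, PHP_EOL;
// %F0%9F%98%81 12 682
```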
