Nginx upload in multipart encoding - downgoon/hello-world GitHub Wiki

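Building the third-party upload modules

Three modules: Image Filter Module, Upload Module, and Upload Progress Module

The nginx upload module (nginx-upload-module) handles large file uploads efficiently and supports resumable uploads. The nginx-upload-progress-module extension can additionally report upload progress.

This article installs the upload module nginx-upload-module and the upload progress module nginx-upload-progress-module.

The upload module handles file uploads that use multipart/form-data encoding (RFC 1867).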

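Upload module source code: https://github.com/vkholodkov/nginx-upload-module

Upload progress module source code: https://github.com/masterzen/nginx-upload-progress-module

Upload architecture

[Image: upload architecture diagram]

Uploading an avatar usually takes two requests:

  • The client uploads the photo to a file server, and the file server returns the file's URL;
  • The front end takes that URL and tells the meta server to update the avatar address stored for the uid.

In this architecture there is a single upload request instead: Nginx saves the image and then notifies the backend meta server itself.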

Downloading the code

Download the source code for nginx and the third-party upload modules:

$ wget http://nginx.org/download/nginx-1.0.15.tar.gz
$ wget http://www.grid.net.ru/nginx/download/nginx_upload_module-2.2.0.tar.gz
$ wget https://github.com/masterzen/nginx-upload-progress-module/archive/v0.9.2.tar.gz

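Mind the versions: the Nginx release should not be too recent; this article uses nginx-1.0.15.tar.gz. In the author's tests, nginx_upload_module-2.2.0.tar.gz and nginx-1.11.10.tar.gz are incompatible, and the build fails with nginx_upload_module-2.2.0/ngx_http_upload_module.c:14:10: fatal error: 'md5.h' file not found

On macOS the build still fails. The upstream issue #61 has not been resolved yet. A temporary workaround, quoted from that issue: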

I am seconding @jslhcl report that branch 2.2 fails to compile on nginx 1.11.4

I worked around the problem by adding

#define NGX_HAVE_OPENSSL_MD5_H 1
#define NGX_OPENSSL_MD5 1
#define NGX_HAVE_OPENSSL_SHA1_H 1

on line 11 of ngx_http_upload_module.c

This is only a HACK FIX that will only work if you know you have openssl libraries installed. If you do not have openssl installed this will make your build even less correct.

Long-term solution would be nice.

sed -i '' '11i\
#define NGX_HAVE_OPENSSL_MD5_H 1\
#define NGX_OPENSSL_MD5 1\
#define NGX_HAVE_OPENSSL_SHA1_H 1\
' ./build/nginx_upload_module-2.2.0/ngx_http_upload_module.c

Compiling the modules

Recompile nginx with nginx-upload-module

Extract the archives side by side:

./nginx-1.0.15
./nginx-upload-progress-module-0.9.2
./nginx_upload_module-2.2.0

Then configure and build nginx with both modules:

cd nginx-1.0.15
./configure --add-module=../nginx_upload_module-2.2.0 --add-module=../nginx-upload-progress-module-0.9.2
make && make install

Output:

Configuration summary
  + using system PCRE library
  + OpenSSL library is not used
  + md5: using system crypto library
  + sha1: using system crypto library
  + using system zlib library

  nginx path prefix: "/usr/local/nginx"
  nginx binary file: "/usr/local/nginx/sbin/nginx"
  nginx configuration prefix: "/usr/local/nginx/conf"
  nginx configuration file: "/usr/local/nginx/conf/nginx.conf"
  nginx pid file: "/usr/local/nginx/logs/nginx.pid"
  nginx error log file: "/usr/local/nginx/logs/error.log"
  nginx http access log file: "/usr/local/nginx/logs/access.log"
  nginx http client request body temporary files: "client_body_temp"
  nginx http proxy temporary files: "proxy_temp"
  nginx http fastcgi temporary files: "fastcgi_temp"
  nginx http uwsgi temporary files: "uwsgi_temp"
  nginx http scgi temporary files: "scgi_temp"

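HTML form

The scenario: for a given user uid, upload an avatar (headface) and an ID card image (idcard) for real-name verification, together with a short self-description.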

<html>
<head>
<title>Test upload</title>
</head>
<body>
<h2>Select files to upload</h2>
<form enctype="multipart/form-data" action="/upload" method="post">
<input type="file" name="headface"><br>
<input type="file" name="idcard"><br>
<input type="text" name="selfdesc"><br>
<input type="hidden" name="uid" value="12345678">
<input type="submit" name="submit" value="Upload">
</form>
</body>
</html>

[Image: upload form]

Uploading from the command line

curl   -F   "[email protected]"   http://127.0.0.1/upload

Upload configuration

worker_processes  4;

error_log  logs/error.log notice;

#working_directory /usr/local/nginx;

events {
    worker_connections  1024;
}

http {
    include       mime.types;
    default_type  application/octet-stream;

    server {
        listen       80;
        client_max_body_size 100m;

        # Upload form should be submitted to this location
        location /upload {
            # Pass altered request body to this location
            upload_pass   @test;

            # Store files to this directory
            # The directory is hashed, subdirectories 0 1 2 3 4 5 6 7 8 9 should exist
            upload_store /tmp/upload 1;

            # Allow uploaded files to be read only by user
            upload_store_access user:r;

            # Set specified fields in request body
            upload_set_form_field "${upload_field_name}_name" $upload_file_name;
            upload_set_form_field "${upload_field_name}_content_type" $upload_content_type;
            upload_set_form_field "${upload_field_name}_path" $upload_tmp_path;

            # Inform backend about hash and size of a file
            upload_aggregate_form_field "${upload_field_name}_md5" $upload_file_md5;
            upload_aggregate_form_field "${upload_field_name}_size" $upload_file_size;

            upload_pass_form_field "^submit$|^description$";
            upload_pass_args on;
        }

        # Pass altered request body to a backend
        location @test {
            #proxy_pass   http://localhost:8080;
            return 200;
        }
    }
}
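
The comment in the configuration notes that the subdirectories 0 to 9 must already exist under the hashed upload_store. A minimal Python sketch for creating them, assuming a single hash level as in upload_store /tmp/upload 1. The function name make_hashed_store is made up for this illustration and is not part of the module:

```python
import os


def make_hashed_store(root: str) -> list[str]:
    """Create the ten digit-named subdirectories (0..9) under root.

    Assumption: one hash level, matching `upload_store /tmp/upload 1`.
    """
    created = []
    for digit in range(10):
        path = os.path.join(root, str(digit))
        # exist_ok=True makes the call safe to repeat
        os.makedirs(path, exist_ok=True)
        created.append(path)
    return created

# Example call (the path is the one used in the configuration above):
# make_hashed_store("/tmp/upload")
```

The directories must also be writable by the user the nginx worker processes run as.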

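Overview of multipart/form-data encoding

Problems that multipart/form-data encoding solves:

  • File upload: a file may be binary or text, anything is possible. What it is gets declared by the Content-Type of the file part.
  • Mixing files and metadata: besides the file itself, an upload carries descriptive information such as the user's description of the file, the user's uid, and the file's md5 and size. All of this can be regarded as the file's metadata. A single HTTP request therefore has to carry several kinds of content at once, not just the plain Key=Value pairs of application/x-www-form-urlencoded. For this purpose multipart/form-data introduces a boundary string as the separator. Browsers usually generate it randomly, roughly 32 to 64 characters long (RFC 2046 caps it at 70), so it is only probabilistically guaranteed not to occur inside the uploaded file.
  • Multiple files: multipart/form-data can encode input type=file, text, hidden and so on. Several file inputs can be encoded in the same request, so an upload is not limited to one file.
  • No file at all: although multipart/form-data was introduced for file uploads, it can just as well carry ordinary Key=Value fields. nginx_upload_module does exactly that when it notifies the backend: it takes the original upload request, strips the file contents, adds its own metadata fields, and forwards the result, which no longer contains any file content.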
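
To make the role of the boundary concrete, here is a rough Python sketch that assembles a multipart/form-data body by hand. It is an illustration only: encode_multipart is a hypothetical helper, the boundary is simply 32 random hex characters, and real clients (browsers, curl) pick their own boundary format.

```python
import uuid


def encode_multipart(fields, files):
    """Build (content_type, body) for a multipart/form-data request.

    fields: {name: text value}
    files:  {name: (filename, content_type, data_bytes)}
    """
    boundary = uuid.uuid4().hex  # 32 random hex characters
    delimiter = b"--" + boundary.encode("ascii")
    lines = []
    # File parts carry a filename and their own Content-Type
    for name, (filename, ctype, data) in files.items():
        lines.append(delimiter)
        lines.append(
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"'.encode("utf-8")
        )
        lines.append(f"Content-Type: {ctype}".encode("ascii"))
        lines.append(b"")
        lines.append(data)
    # Ordinary fields are just name plus value
    for name, value in fields.items():
        lines.append(delimiter)
        lines.append(f'Content-Disposition: form-data; name="{name}"'.encode("utf-8"))
        lines.append(b"")
        lines.append(value.encode("utf-8"))
    # The closing delimiter has two extra dashes at the end
    lines.append(delimiter + b"--")
    lines.append(b"")
    body = b"\r\n".join(lines)
    return f"multipart/form-data; boundary={boundary}", body
```

Calling encode_multipart({"uid": "12345678"}, {"headface": ("up1.txt", "text/plain", b"aaaaaa\n")}) yields a Content-Type header value and a body with the same part layout that a browser produces for a form like this, apart from the boundary text.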

Upload request

POST /upload HTTP/1.1
Host: localhost
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:50.0) Gecko/20100101 Firefox/50.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Referer: http://localhost/
Content-Type: multipart/form-data; boundary=---------------------------4642747761258070681269947205
Content-Length: 739

-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="headface"; filename="up1.txt"
Content-Type: text/plain

aaaaaa

-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="idcard"; filename="up2.txt"
Content-Type: text/plain

bbbbbb

-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="selfdesc"

kukubao
-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="uid"

12345678
-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="submit"

Upload
-----------------------------4642747761258070681269947205--

When an image is uploaded, the corresponding file part has Content-Type: image/jpeg

[Image: uploading an image]

Forwarded request

POST /upload HTTP/1.0
Host: localhost:8080
Connection: close
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:50.0) Gecko/20100101 Firefox/50.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3
Accept-Encoding: gzip, deflate
Referer: http://localhost/
Upgrade-Insecure-Requests: 1
Content-Type: multipart/form-data; boundary=---------------------------4642747761258070681269947205
Content-Length: 1499

-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="headface_name"

up1.txt
-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="headface_content_type"

text/plain
-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="headface_path"

/tmp/upload/9/0000000009
-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="headface_md5"

b1ffb6b5d22cd9f210fbc8b7fdaf0e19
-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="headface_size"

7
-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="idcard_name"

up2.txt
-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="idcard_content_type"

text/plain
-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="idcard_path"

/tmp/upload/0/0000000010
-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="idcard_md5"

7ed9295c3bdb1aaf2b427b64942b40fb
-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="idcard_size"

7
-----------------------------4642747761258070681269947205
Content-Disposition: form-data; name="submit"

Upload
-----------------------------4642747761258070681269947205--

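After Nginx receives the uploaded files, it saves them to the configured directory and then "forwards" a request to a backend such as Tomcat or Jetty, announcing that the upload is complete so the backend can do follow-up work such as recording it in MySQL. Note that selfdesc and uid do not appear in the forwarded request, because upload_pass_form_field in the configuration above only passes fields named submit or description. The form above uploaded two files, headface and idcard. The fields describing headface are: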

headface_name: up1.txt
headface_content_type: text/plain
headface_path: /tmp/upload/9/0000000009
headface_md5: b1ffb6b5d22cd9f210fbc8b7fdaf0e19
headface_size: 7
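
On the backend, the forwarded body can be read with any multipart parser. The following is a rough Python sketch that uses only the standard library's email package. parse_forwarded_fields is a hypothetical helper written for this illustration; a real Tomcat/Jetty or Python backend would normally rely on its framework's form parsing instead.

```python
import email


def parse_forwarded_fields(content_type: str, body: bytes) -> dict:
    """Return {field name: value} for a multipart/form-data body.

    content_type is the request's Content-Type header value,
    including the boundary parameter.
    """
    # Prepend the Content-Type header so the MIME parser sees a full message
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = email.message_from_bytes(raw)
    if not message.is_multipart():
        raise ValueError("body is not a multipart message")
    fields = {}
    for part in message.get_payload():
        # The field name lives in the Content-Disposition header of each part
        name = part.get_param("name", header="content-disposition")
        fields[name] = part.get_payload(decode=True).decode("utf-8")
    return fields
```

With the forwarded request shown earlier, the result would contain keys such as headface_name, headface_path, headface_md5 and headface_size, which the backend can then store next to the uid.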

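Note

The forwarded request no longer contains the file contents; only the file metadata is passed to the backend service. The notification nevertheless keeps the multipart/form-data encoding, although application/x-www-form-urlencoded would serve just as well.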
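
To illustrate that last point, the same metadata fits comfortably into an application/x-www-form-urlencoded body. A small Python sketch, where to_urlencoded is a hypothetical helper and the field values are taken from the forwarded request above; nginx_upload_module itself does not produce this encoding:

```python
from urllib.parse import parse_qs, urlencode


def to_urlencoded(meta: dict) -> str:
    """Encode file metadata as an application/x-www-form-urlencoded body."""
    return urlencode(meta)


meta = {
    "headface_name": "up1.txt",
    "headface_content_type": "text/plain",
    "headface_path": "/tmp/upload/9/0000000009",
    "headface_size": "7",
}
encoded = to_urlencoded(meta)
# parse_qs reverses the encoding; every value comes back as a one-element list
decoded = {key: values[0] for key, values in parse_qs(encoded).items()}
```

Since only short text values are left after the file contents are stripped, nothing in the notification actually needs a boundary-delimited body.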

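Appendix 1: The two encodings of an HTML form

Originally, HTML forms sent with HTTP POST did not support file uploads, which caused developers a lot of trouble. In 1995 the IETF published RFC 1867, "Form-based File Upload in HTML", to add that support. It introduced the multipart/form-data Content-Type so that binary data could be sent to the server.

When a form is submitted with POST, the enctype attribute of the <form> element controls the MIME encoding of the form data. Two values are commonly used: ① application/x-www-form-urlencoded (the default) and ② multipart/form-data. If you omit enctype, the form behaves as if enctype="application/x-www-form-urlencoded" had been written.

  • File upload form

Upload form: enctype="multipart/form-data"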

<form method="post" enctype="multipart/form-data" action="/upload">
    <input type="file" name="myfile" />
    <input type="submit" />
</form>
  • Regular form
<form method="post" enctype="application/x-www-form-urlencoded" action="/upload">
    <input type="text" name="myfile" />
    <input type="submit" />
</form>
  • multipart/form-data with several files and ordinary fields
<form method="post" enctype="multipart/form-data" action="/upload">
    <input type="file" name="file1" />
    <input type="file" name="file2" />
    <input type="text" name="desc" />
    <input type="hidden" name="uid" value="12345678" />
    <input type="submit" name="submit" value="Upload" />
</form>

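Note

multipart/form-data was designed to solve file upload, but it is more flexible than that: it is simply an encoding that is able to carry files. A multipart/form-data request may contain only ordinary fields such as <input type="text" name="desc" />, and it may contain several files rather than just one. A multipart/form-data form without any file upload: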

<form method="post" enctype="multipart/form-data" action="/upload">
    <input type="text" name="desc" />
    <input type="hidden" name="uid" value="12345678" />
    <input type="submit" name="submit" value="Upload" />
</form>

Appendix 2: Enabling resumable uploads in nginx_upload_module

Enabling resumable uploads in nginx

server {
[...]
        location /resumable_upload {
                upload_resumable on;
                upload_state_store /usr/local/nginx/upload_temp ;
                upload_pass @resumable_upload_handler;
                upload_store /usr/local/nginx/upload_temp;
                upload_set_form_field $upload_field_name.path "$upload_tmp_path";
        }
 
        location @resumable_upload_handler {
                proxy_pass http://localhost:8002;
        }
[...]
}

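Configuration notes:

  • upload_resumable on: enables resumable uploads;
  • upload_state_store /usr/local/nginx/upload_temp: sets the directory where the state files for resumable uploads are stored.

The key to the client/server session: Range

  • HTTP request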
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="big.TXT"
X-Content-Range: bytes 0-51200/511920
X-Session-ID: 1111215056

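The request declares the name of the file being uploaded and which byte range the current chunk covers. To make resuming possible later, the client must state not only the byte range but also identify its session: X-Session-ID: 1111215056

  • HTTP response
Range: 0-51200/511920

The server indicates that it has received the byte range 0-51200.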
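
The exchange above can be sketched from the client side in Python. The header names are the ones shown in this appendix; chunk_headers and next_offset are hypothetical helpers for illustration, and the sketch only computes headers and offsets without sending anything over the network.

```python
def chunk_headers(filename: str, total_size: int, chunk_size: int, session_id: str):
    """Yield (start, end, headers) for every chunk of a resumable upload.

    start and end are inclusive byte offsets, as in X-Content-Range.
    """
    for start in range(0, total_size, chunk_size):
        end = min(start + chunk_size, total_size) - 1
        headers = {
            "Content-Type": "application/octet-stream",
            "Content-Disposition": f'attachment; filename="{filename}"',
            "X-Content-Range": f"bytes {start}-{end}/{total_size}",
            "X-Session-ID": session_id,
        }
        yield start, end, headers


def next_offset(range_value: str) -> int:
    """Given a Range value like '0-51200/511920', return the next byte to send."""
    received, _total = range_value.split("/")
    _first, last = received.split("-")
    return int(last) + 1


chunks = list(chunk_headers("big.TXT", 511920, 51201, "1111215056"))
first_start, first_end, first_headers = chunks[0]
```

After an interruption, a client would read the server's Range value, call next_offset on it, and continue from that byte while reusing the same session ID.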

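Appendix 3: Browsing uploaded files

nginx does not enable directory listing by default; it has to be turned on with the autoindex directive.

For example, to serve the directory /var/www/image/ under the URL prefix /img/, configure: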

location /img/ {
    autoindex on;
    autoindex_exact_size  off;    
    alias /var/www/image/;
}

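Explanation:

autoindex on enables directory listing. autoindex_exact_size off shows approximate file sizes instead of exact byte counts. alias /var/www/image/ maps the location to the directory /var/www/image/.

Difference between the root and alias directives

  • alias directive

URL: /img/abc.txt File: /var/www/image/abc.txt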

location /img/ {
    alias /var/www/image/;
}

With this configuration, a request for a file under /img/ makes nginx look for it in the /var/www/image/ directory.

  • root directive

URL: /img/abc.txt File: /var/www/image/img/abc.txt

location /img/ {
    root /var/www/image;
}

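With this configuration, a request for a file under /img/ makes nginx look for it in the /var/www/image/img/ directory.

Important

Another important difference: the alias path here must end with "/", otherwise the file will not be found, because alias replaces the matched prefix /img/ verbatim and /var/www/image would yield /var/www/imageabc.txt. For root the trailing slash is optional.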
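
The two mapping rules can be modelled in a few lines of Python. This is only a simplified sketch of the path arithmetic for prefix locations (no regex locations, no URI normalization); resolve_alias and resolve_root are hypothetical names, not nginx APIs.

```python
def resolve_alias(location: str, alias: str, uri: str) -> str:
    """alias: the matched location prefix is replaced by the alias path."""
    if not uri.startswith(location):
        raise ValueError("URI does not match the location prefix")
    return alias + uri[len(location):]


def resolve_root(root: str, uri: str) -> str:
    """root: the full URI is appended to the root path."""
    # nginx concatenates root and URI; a trailing slash on root only produces
    # a harmless double slash, which is dropped here for readability
    return root.rstrip("/") + uri


alias_path = resolve_alias("/img/", "/var/www/image/", "/img/abc.txt")
root_path = resolve_root("/var/www/image", "/img/abc.txt")
broken_path = resolve_alias("/img/", "/var/www/image", "/img/abc.txt")
```

Here alias_path is /var/www/image/abc.txt and root_path is /var/www/image/img/abc.txt, while broken_path shows what the missing trailing slash does to the alias case: /var/www/imageabc.txt.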

Appendix 4: Returning JSON directly from nginx

Sometimes we only need to check that nginx is working and have it return some fixed content.

Returning HTML

location ~ ^/get_text {
    default_type text/html;
    add_header Content-Type 'text/html; charset=utf-8';
    return 200 'This is text!';
}

Request:

$ curl -i http://localhost/get_text
HTTP/1.1 200 OK
Server: nginx/1.0.15
Date: Sun, 05 Mar 2017 09:57:47 GMT
Content-Type: text/html
Content-Length: 13
Connection: keep-alive

This is text!%

Returning JSON

location ~ ^/get_json {
    default_type application/json;
    return 200 '{"status":"success","result":"nginx json"}';
}

