Home - Sage-Bionetworks/file-proxy GitHub Wiki
User's Guide
Introduction
One of the main functions of Synapse is to act as a repository for scientific data. Access to all data in Synapse is controlled through access control lists (ACLs). For non-human-subjects data, the ACL is often the only control required. Human-subjects data requires additional controls.
While Synapse provides physical storage for files (using Amazon's S3), not all data 'in' Synapse is stored in Synapse-controlled buckets. For example, data files can physically reside in user-owned S3 buckets, on SFTP servers, or on other types of file servers. In all cases, when a user wants to download a file from Synapse, a security check is made to ensure the user has the 'download' permission on the file. If the user is authorized, Synapse redirects the user to the actual data file with a "pre-signed" URL. Each pre-signed URL is composed of an expiration date (usually 30 seconds after the time of issue), a path to the file, and a keyed-hash message authentication code (HMAC).
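As a rough illustration of the three parts of a pre-signed URL, the sketch below builds one in Java. It is not taken from the Synapse or file-proxy source: the HMAC-SHA256 algorithm, the `path:expiration` message layout, the `expiration` and `hmac` query parameter names, the host `proxy.example.org`, and the secret `demo-secret` are all assumptions made for the example.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;

// Hypothetical sketch: none of these names come from the file-proxy code base.
final class PresignedUrlSketch {

    // Hex-encoded HMAC-SHA256 over "path:expiration" (message layout is an assumption).
    static String hmacHex(String path, long expiresEpochSeconds, String secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] raw = mac.doFinal(
                    (path + ":" + expiresEpochSeconds).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : raw) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is not available", e);
        }
    }

    // Combines the file path, the expiration time, and the HMAC into one URL.
    // The path is assumed to be URL-safe already; real code would encode it.
    static String presign(String baseUrl, String path, long nowEpochSeconds,
                          long ttlSeconds, String secret) {
        long expires = nowEpochSeconds + ttlSeconds;
        return baseUrl + path
                + "?expiration=" + expires
                + "&hmac=" + hmacHex(path, expires, secret);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() / 1000L;
        // 30-second lifetime, matching the typical expiration described above.
        System.out.println(
                presign("https://proxy.example.org", "/data/sample.txt", now, 30L, "demo-secret"));
    }
}
```

Because the HMAC covers both the path and the expiration, changing either value in the URL would invalidate the signature for anyone who does not hold the secret.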
For data stored in S3, the pre-signed URLs redirect to S3, which handles the URL validation and the actual file download. For files stored outside of Amazon, an additional proxy is needed to validate the pre-signed URL and then proxy the requested file contents. The primary purpose of this project is to provide such validation and file proxying.
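The validation step a proxy performs can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the file-proxy implementation: it assumes the HMAC is a hex-encoded HMAC-SHA256 over the string `path:expiration`, and the class and method names are hypothetical. The proxy recomputes the HMAC with the shared secret, rejects expired URLs, and compares the signatures in constant time.

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;

// Hypothetical sketch: names and message layout are assumptions for illustration.
final class PresignedUrlValidatorSketch {

    // Recomputes the hex-encoded HMAC-SHA256 the issuer is assumed to have produced.
    static String expectedHmac(String path, long expiresEpochSeconds, String secret) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
            byte[] raw = mac.doFinal(
                    (path + ":" + expiresEpochSeconds).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : raw) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 is not available", e);
        }
    }

    // Returns true only if the URL has not expired and its HMAC matches.
    static boolean isValid(String path, long expiresEpochSeconds, String presentedHmac,
                           long nowEpochSeconds, String secret) {
        if (nowEpochSeconds > expiresEpochSeconds) {
            return false; // the URL has expired
        }
        byte[] expected = expectedHmac(path, expiresEpochSeconds, secret)
                .getBytes(StandardCharsets.UTF_8);
        byte[] presented = presentedHmac.getBytes(StandardCharsets.UTF_8);
        // MessageDigest.isEqual compares in constant time to avoid timing leaks.
        return MessageDigest.isEqual(expected, presented);
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis() / 1000L;
        long expires = now + 30L;
        String sig = expectedHmac("/data/sample.txt", expires, "demo-secret");
        System.out.println(isValid("/data/sample.txt", expires, sig, now, "demo-secret"));
        System.out.println(isValid("/data/other.txt", expires, sig, now, "demo-secret"));
    }
}
```

Only after a check like this succeeds would a proxy go on to read the requested file and stream its contents back to the caller.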
The file-proxy is a Java web application (.war) designed to be deployed to Apache Tomcat. The file-proxy can be configured either to act as a proxy to an SFTP server (see: Setup-Proxy-SFTP) or to serve files directly from locally mounted drives (see: Setup-Proxy-Local).