Requirements: Part#1 - AhmadBadir/Downloader GitHub Wiki
Problem description
One way to handle the problem of storing large files is to break them into pieces that are placed on multiple disks or machines. Likewise, one way to download a data stream more quickly is to break it into parts and download the different parts from different peers (similar to the peer-to-peer file-sharing protocol BitTorrent). Downloading a stream in multiple small parts reduces the chance of the entire download failing (since each part can be restarted individually) and spreads load more evenly across the network.
In this project, you will build a multipart downloader that assembles a data stream from multiple, potentially endless, parts streaming individually from multiple machines. The format allows the same part to be stored redundantly in multiple locations, so a download can be resilient to failures. Since the multipart streams you will be downloading may be unbounded and never end, your program will assemble the stream incrementally from its parts as they are downloaded, displaying the file or streamed sequence of files (an animated sequence of images, for example) as the download progresses.
In later projects, you'll be asked to refine the problem itself. In this case, however, the problem is largely fixed. You are given a particular protocol format (described below) and are required to implement a single method, Multipart.openStream. The openStream() method of the API returns an InputStream for reading from the multipart stream; the GUI simply reads serially from this stream, but it could be used more ambitiously, for example to interleave the downloading of multiple streams.
The breakdown of the stream into parts is specified in a special manifest stream. Your code will have to parse this stream and use it to determine which parts to download. Furthermore, these parts can themselves be manifest streams, in which case your code should recursively stream the subsegments.
Both this parsing and the downloading process itself are naturally expressed as state machines. A variety of failures can occur, and it will be up to you to figure out what they are and decide how they should be handled.
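The incremental assembly described above can be sketched with the standard java.io.SequenceInputStream, which asks an Enumeration for the next stream only when the current one is exhausted. This is a minimal illustration under stated assumptions, not the required implementation: the class name LazyConcat and the method endlessSong are hypothetical, and in-memory strings stand in for parts that would really be downloaded.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;

/** Hypothetical sketch: an endless stream assembled lazily from alternating parts. */
final class LazyConcat {
    /**
     * Returns a stream that alternates forever between the two texts.
     * Each part is created only when the reader reaches it, so the
     * unbounded sequence is never materialized up front.
     */
    static InputStream endlessSong(String verse, String chorus) {
        Enumeration<InputStream> parts = new Enumeration<InputStream>() {
            private boolean verseNext = true;

            @Override
            public boolean hasMoreElements() {
                return true; // the sequence never ends
            }

            @Override
            public InputStream nextElement() {
                // Stand-in for "open the next part"; a real client would download here.
                String text = verseNext ? verse : chorus;
                verseNext = !verseNext;
                return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
            }
        };
        return new SequenceInputStream(parts);
    }
}
```

A reader that pulls ten bytes from endlessSong("ab", "cd") receives abcdabcdab and could keep reading indefinitely. A real implementation would also need to decide what happens when opening a part fails, which this sketch ignores.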
Specifications
Manifest stream format
A manifest stream specifies how a given stream is broken into parts. Your program can assume that a URL points to a manifest stream if and only if the stream has the content type text/segments-manifest or its URL ends with the suffix .segments. Each part in a manifest stream is separated by a line containing two stars. For example, the file picture.jpg may be broken into three parts all stored on the same machine:
    http://machine1.birzeit.edu/picture.jpg-segment1
    **
    http://machine1.birzeit.edu/picture.jpg-segment2
    **
    http://machine1.birzeit.edu/picture.jpg-segment3
To download this stream, your program would download each part in succession, thus recreating the original file.
The manifest stream can give alternatives, so that a part, or several parts, can be stored redundantly in different locations:
    http://machine1.birzeit.edu/picture.jpg-segment1
    http://machine2.birzeit.edu/picture.jpg-segment1
    **
    http://machine1.birzeit.edu/picture.jpg-segment2
    http://machine2.birzeit.edu/picture.jpg-segment2
    **
    http://machine1.birzeit.edu/picture.jpg-segment3
    http://machine2.birzeit.edu/picture.jpg-segment3
For each part, if the first machine is not accessible, your downloader should try the second.
Manifest streams can also be recursive. For example, the manifest stream http://machine1.birzeit.edu/endless.txt.segments might look like:
    http://machine1.birzeit.edu/verse.txt
    http://machine2.birzeit.edu/verse.txt
    http://hermachine.birzeit.edu/verse.txt
    **
    http://machine2.birzeit.edu/chorus.txt
    **
    http://machine1.birzeit.edu/endless.txt.segments
Thus the last part of this manifest stream refers back to itself, so a client reading from your Multipart implementation would receive an endless stream alternating between the contents of verse.txt and the contents of chorus.txt. Note also that in the preceding manifest stream there are three alternatives for verse.txt but only one for chorus.txt and endless.txt.segments.
As in BitTorrent, certain parts may be hosted only by certain servers. Conversely, it should never hurt if a nonexistent or even malformed URL is listed as an additional alternative source, since your code should be able to handle the failure and try a different alternative.
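The manifest format above can be parsed into a list of parts, where each part is a list of alternative URLs. The following is a minimal sketch, not a reference implementation. The class name ManifestParser and its methods are hypothetical, and several choices are assumptions the specification does not settle: blank lines are ignored, surrounding whitespace is trimmed, a part with no URLs is treated as a syntax error, and a content type carrying parameters (such as a charset) still counts as a manifest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: turns manifest text into parts, each with its alternative URLs. */
final class ManifestParser {
    private static final String SEPARATOR = "**";
    private static final String MANIFEST_TYPE = "text/segments-manifest";
    private static final String MANIFEST_SUFFIX = ".segments";

    /** True if the content type or the URL suffix marks the stream as a manifest. */
    static boolean isManifest(String url, String contentType) {
        // Assumption: parameters after the media type (e.g. "; charset=...") are tolerated.
        boolean byType = contentType != null
                && contentType.trim().toLowerCase(Locale.ROOT).startsWith(MANIFEST_TYPE);
        return byType || url.endsWith(MANIFEST_SUFFIX);
    }

    /** Splits manifest text into parts; each inner list holds one part's alternatives in order. */
    static List<List<String>> parse(String manifestText) {
        List<List<String>> parts = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String rawLine : manifestText.split("\\R")) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue; // assumption: blank lines carry no meaning
            }
            if (line.equals(SEPARATOR)) {
                addPart(parts, current);
                current = new ArrayList<>();
            } else {
                current.add(line); // URL validity is checked later, when it is opened
            }
        }
        addPart(parts, current);
        return parts;
    }

    private static void addPart(List<List<String>> parts, List<String> part) {
        if (part.isEmpty()) {
            // Assumption: a separator with nothing before or after it is a syntax error.
            throw new IllegalArgumentException("manifest part has no URLs");
        }
        parts.add(part);
    }
}
```

Applied to the second picture.jpg example, parse would return three parts with two alternatives each. The URLs are kept as plain strings on purpose, so that a malformed alternative only fails when it is actually tried.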
GUI
Will be provided for you.
Handling faults
A variety of failures may occur during execution, which your code should deal with gracefully. These include network failures (e.g. the program cannot download a file), manifest file syntax errors, and other failure scenarios.
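One of the failures mentioned above, an alternative that cannot be reached, can be handled by trying each alternative of a part in order. The sketch below shows one possible shape for that logic and is not the required design. AlternativeOpener and its Opener interface are hypothetical names; Opener abstracts "open this URL" so the fallback logic can be exercised without a network. In real use it might wrap new URL(url).openStream(), whose MalformedURLException is itself an IOException.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.List;

/** Hypothetical sketch: open the first alternative of a part that works. */
final class AlternativeOpener {
    /** Hypothetical abstraction over opening a URL, so tests can avoid the network. */
    interface Opener {
        InputStream open(String url) throws IOException;
    }

    /**
     * Tries each alternative in manifest order and returns the first stream that opens.
     * If every alternative fails, throws an IOException that records each failure.
     */
    static InputStream openFirstAvailable(List<String> alternatives, Opener opener)
            throws IOException {
        IOException failure =
                new IOException("no alternative could be opened: " + alternatives);
        for (String url : alternatives) {
            try {
                return opener.open(url);
            } catch (IOException e) {
                failure.addSuppressed(e); // remember why this alternative failed
            }
        }
        throw failure;
    }
}
```

This covers only failures that occur while opening a part. A connection that drops midway through a part is a separate scenario, and how to recover from it is left open here.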