MongoDB Files - Yash-777/mongo-java-driver GitHub Wiki

Mongo Files: a convention for storing large files in a MongoDB database using GridFS.

GridFS is a specification for storing and retrieving files that exceed the BSON document size limit of 16 MB. Instead of storing a file in a single document, GridFS divides the file into parts, or chunks, and stores each chunk as a separate document.

GridFS stores a large binary file by splitting it into smaller pieces called “chunks” and saving each piece in MongoDB. This spares you, the application developer, from writing the code that splits a large file into chunks, saves each chunk to MongoDB, and reassembles the chunks when the file is read back.
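To make the split-and-reassemble idea concrete, here is a minimal sketch in plain Java with no MongoDB dependency. `ChunkingSketch`, `split`, and `join` are hypothetical names used only for illustration; the driver's real implementation streams data and writes each piece to the chunks collection.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: ChunkingSketch is a hypothetical class, not part of any MongoDB driver.
class ChunkingSketch {

    // Split data into consecutive pieces of at most chunkSize bytes.
    static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < data.length; offset += chunkSize) {
            int end = Math.min(offset + chunkSize, data.length);
            chunks.add(Arrays.copyOfRange(data, offset, end));
        }
        return chunks;
    }

    // Concatenate the pieces, in list order, back into one byte array.
    static byte[] join(List<byte[]> chunks) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] chunk : chunks) {
            out.write(chunk, 0, chunk.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = new byte[10];
        for (int i = 0; i < original.length; i++) {
            original[i] = (byte) i;
        }
        List<byte[]> chunks = split(original, 4);
        System.out.println(chunks.size());                          // 3 (4 + 4 + 2 bytes)
        System.out.println(Arrays.equals(original, join(chunks)));  // true
    }
}
```

The last piece is simply shorter than the others, which is also how the final chunk of a GridFS file behaves.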

GridFS uses two collections to save a file to a database: fs.files and fs.chunks. (The default prefix is “fs”, but you can rename it.)

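  • fs.files « The fs.files collection contains the metadata for each stored file, one document per file.
  • fs.chunks « The fs.chunks collection contains the binary content of each file, split into chunks. The specification's default chunk size is 255 KB; the older driver used in the example below writes 256 KB (262144-byte) chunks, as the chunkSize field in its output shows.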
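The naming rule is just the bucket prefix plus a fixed suffix. The helper below is a hypothetical illustration of that rule (the driver derives these names internally, and `BucketNames` is not a driver class); the bucket name "photos" is an invented example.

```java
// Illustrative only: BucketNames is a hypothetical helper, not a driver class.
class BucketNames {

    // Name of the metadata collection for a bucket.
    static String filesCollection(String bucket) {
        return bucket + ".files";
    }

    // Name of the binary-chunk collection for a bucket.
    static String chunksCollection(String bucket) {
        return bucket + ".chunks";
    }

    public static void main(String[] args) {
        System.out.println(filesCollection("fs"));        // fs.files
        System.out.println(chunksCollection("fs"));       // fs.chunks
        System.out.println(chunksCollection("photos"));   // photos.chunks
    }
}
```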

GridFS Structure

Java Example:

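import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.UnknownHostException;
import java.util.Properties;

import com.mongodb.DB;
import com.mongodb.Mongo;
import com.mongodb.gridfs.GridFS;
import com.mongodb.gridfs.GridFSDBFile;
import com.mongodb.gridfs.GridFSInputFile;

public class MongoDBFiles {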
	
	static Properties props = new Properties();
	
	static String mongoDBHost, mongoDBName, mongoDBUserName, mongoDBPassword, mongoDB_BucketName;
	static Integer mongoDBPort;
	
	static {
		// Load mongo.properties from the classpath of this class
		ClassLoader classLoader = MongoDBFiles.class.getClassLoader();
		InputStream resourceAsStream = classLoader.getResourceAsStream("mongo.properties");
		try {
			props.load(resourceAsStream);
			mongoDBHost = props.getProperty("mongoDBHost");
			mongoDBName = props.getProperty("mongoDBName");
			mongoDBUserName = props.getProperty("mongoDBUserName");
			mongoDBPassword = props.getProperty("mongoDBPassword");
			mongoDB_BucketName = props.getProperty("mongoDB_BucketName");
			mongoDBPort = Integer.valueOf( props.getProperty("mongoDBPort") );
			
		} catch (IOException e) {
			e.printStackTrace();
		}
	}
	
	public static void main(String[] args) {
		upload( "E:\\IMediaWriterVedio7.mp4", "First");
	}
	
	public static void upload(String uploadFileLocation, String fileName) {
		Mongo mongoClient = null;
		try {
			mongoClient = new Mongo( mongoDBHost, mongoDBPort ); // Connect to MongoDB
			DB db = mongoClient.getDB( mongoDBName ); // Get database
			boolean auth = db.authenticate(mongoDBUserName, mongoDBPassword.toCharArray());
			System.out.println("Mongo DB Authentication >>> "+auth);
			if ( auth ) {
				//Create instance of GridFS implementation
				GridFS gridFs = new GridFS(db, mongoDB_BucketName);
				GridFSInputFile gridFsInputFile = gridFs.createFile(new File(uploadFileLocation));
				gridFsInputFile.setId("777");
				gridFsInputFile.setFilename(fileName); //Set a name on GridFS entry
				gridFsInputFile.save(); //Save the file to MongoDB
				System.out.println("Uploaded Successfully.");

				GridFSDBFile outputImageFile = gridFs.findOne(fileName);
				String downloadFileLocation = "E:\\IMediaWriterVedio"+fileName+".mp4";
				outputImageFile.writeTo(new File( downloadFileLocation ));
				System.out.println("Downloaded Successfully.");
			}
			// The connection is closed once, in the finally block below.
		} catch (UnknownHostException e) {
			e.printStackTrace();
		} catch (IOException e) {
			e.printStackTrace();
		} finally {
			if( mongoClient != null ) mongoClient.close();
		}
	}
}

The example needs the jars mongo-2.10.1.jar and log4mongo-java-0.7.4.jar, plus a mongo.properties file on the classpath with the following entries:

mongoDBHost : 127.0.0.1
mongoDBPort : 27017
mongoDBName : MyDBFiles

mongoDBUserName : Yash777
mongoDBPassword : secretKEY

mongoDB_BucketName : fs_Files_AND_Chunks_Collections
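`java.util.Properties` accepts `key : value` lines like the ones above, since `:` is a valid separator and the whitespace around it is skipped. The sketch below parses two of these entries from a string and reads the port with a fallback, so a missing entry does not raise a `NumberFormatException`. `PropertiesSketch`, `parse`, and `intProperty` are hypothetical names for illustration, and the fallback value is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Illustrative only: PropertiesSketch is a hypothetical helper class.
class PropertiesSketch {

    // Parse properties-file text held in a string.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Read an integer property, returning defaultValue when the key is absent.
    static int intProperty(Properties props, String key, int defaultValue) {
        String raw = props.getProperty(key);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties props = parse("mongoDBHost : 127.0.0.1\nmongoDBPort : 27017\n");
        System.out.println(props.getProperty("mongoDBHost"));           // 127.0.0.1
        System.out.println(intProperty(props, "mongoDBPort", 27017));   // 27017
    }
}
```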

Fields in the GridFS data model:

  • Bucket name « A prefix under which a GridFS system’s collections are stored. Collection names for the files and chunks collections are prefixed with the bucket name. The bucket name MUST be configurable by the user. Multiple buckets may exist within a single database. The default bucket name is ‘fs’.

  • Chunk « A section/part of a user file, stored as a single document in the ‘chunks’ collection of a GridFS bucket. The default size for the data field in chunks is 255KB. Chunk documents have the following form:

{
  "_id" : <ObjectId>,
  "files_id" : <TFileId>,
  "n" : <Int32>,
  "data" : <binary data>
}
  • Chunks collection « A collection in which chunks of a user file are stored. The name for this collection is the word 'chunks' prefixed by the bucket name. The default is ‘fs.chunks’.

  • Files collection « A collection that holds the metadata for stored files, with one files collection document per stored file. The name for this collection is the word ‘files’ prefixed by the bucket name. The default is ‘fs.files’. Files collection documents have the following form:

{
  "_id" : <TFileId>,
  "length" : <Int64>,
  "chunkSize" : <Int32>,
  "uploadDate" : <BSON datetime, ms since Unix epoch in UTC>,
  "md5" : <hex string>,
  "filename" : <string>,
  "contentType" : <string>,
  "aliases" : <string array>,
  "metadata" : <Document>
}
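The `md5` field holds a hex digest of the file's full contents, so a client can recompute the digest after download and compare it to detect corruption. A minimal sketch using the JDK's `MessageDigest` follows; `Md5Sketch` and `md5Hex` are hypothetical names, and the sketch hashes an in-memory byte array rather than streaming a real file.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustrative only: Md5Sketch is a hypothetical helper class.
class Md5Sketch {

    // Return the MD5 digest of data as a lowercase hex string.
    static String md5Hex(byte[] data) {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every standard Java platform.
            throw new IllegalStateException(e);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest(data)) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) {
        byte[] sample = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(md5Hex(sample)); // 900150983cd24fb0d6963f7d28e17f72
    }
}
```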

Files

INPUT FILE: IMediaWriterVedio7.mp4 (size on disk: 640 KB)

Statistics of MongoDB

QUERY for DB : db.stats()

/* 1 */
{
    "db" : "MyDBFiles",
    "collections" : 4,
    "objects" : 15,
    "avgObjSize" : 78750.9333333333340000,
    "dataSize" : 1.18126e+006,
    "storageSize" : 8.42138e+006,
    "numExtents" : 5,
    "indexes" : 4,
    "indexSize" : 32704,
    "fileSize" : 6.71089e+007,
    "nsSizeMB" : 16,
    "dataFileVersion" : {
        "major" : 4,
        "minor" : 5
    },
    "extentFreeList" : {
        "num" : 0,
        "totalSize" : 0
    },
    "ok" : 1.0000000000000000
}

QUERY for Collection : db.printCollectionStats()

  • CHUNKS : db.getCollection('fs_Files_AND_Chunks_Collections.chunks').stats()
  • Files : db.getCollection('fs_Files_AND_Chunks_Collections.files').stats()
/* 1 */
{
    "ns" : "MyDBFiles.fs_Files_AND_Chunks_Collections.chunks",
    "count" : 3,
    "size" : 1.1796e+006,
    "avgObjSize" : 393200,
    "storageSize" : 8.3968e+006,
    "numExtents" : 2,
    "nindexes" : 2,
    "lastExtentSize" : 8.38861e+006,
    "paddingFactor" : 1.0000000000000000,
    "systemFlags" : 1,
    "userFlags" : 1,
    "totalIndexSize" : 16352,
    "indexSizes" : {
        "_id_" : 8176,
        "files_id_1_n_1" : 8176
    },
    "ok" : 1.0000000000000000
}

/* 2 */
{
    "ns" : "MyDBFiles.fs_Files_AND_Chunks_Collections.files",
    "count" : 1,
    "size" : 240,
    "avgObjSize" : 240,
    "storageSize" : 8192,
    "numExtents" : 1,
    "nindexes" : 2,
    "lastExtentSize" : 8192,
    "paddingFactor" : 1.0000000000000000,
    "systemFlags" : 1,
    "userFlags" : 1,
    "totalIndexSize" : 16352,
    "indexSizes" : {
        "_id_" : 8176,
        "filename_1_uploadDate_1" : 8176
    },
    "ok" : 1.0000000000000000
}

/* 3 */
{
    "ns" : "MyDBFiles.system.indexes",
    "count" : 4,
    "size" : 704,
    "avgObjSize" : 176,
    "storageSize" : 8192,
    "numExtents" : 1,
    "nindexes" : 0,
    "lastExtentSize" : 8192,
    "paddingFactor" : 1.0000000000000000,
    "systemFlags" : 0,
    "userFlags" : 1,
    "totalIndexSize" : 0,
    "indexSizes" : {},
    "ok" : 1.0000000000000000
}

View Files:

db.getCollection('fs_Files_AND_Chunks_Collections.files').find({})

/* 1 */
{
    "_id" : "777",
    "chunkSize" : NumberLong(262144),
    "length" : NumberLong(653987),
    "md5" : "8a2834f14d2e6f67d438f0f6157e1509",
    "filename" : "First",
    "contentType" : null,
    "uploadDate" : ISODate("2016-12-29T11:29:09.695Z"),
    "aliases" : null
}
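The `length` and `chunkSize` values in this document determine how many chunk documents the file occupies. The arithmetic can be sketched as follows; `ChunkLayout` and its methods are hypothetical names for illustration. For this file the result agrees with the `count` of 3 reported by the chunks collection stats.

```java
// Illustrative only: ChunkLayout is a hypothetical helper class.
class ChunkLayout {

    // Number of chunks needed for a file of `length` bytes (ceiling division).
    static long chunkCount(long length, long chunkSize) {
        return (length + chunkSize - 1) / chunkSize;
    }

    // Size in bytes of the final chunk; equals chunkSize when the file divides evenly.
    static long lastChunkSize(long length, long chunkSize) {
        if (length == 0) {
            return 0;
        }
        long remainder = length % chunkSize;
        return remainder == 0 ? chunkSize : remainder;
    }

    public static void main(String[] args) {
        long length = 653987L;     // "length" from the files document
        long chunkSize = 262144L;  // "chunkSize" from the files document (256 KB)
        System.out.println(chunkCount(length, chunkSize));     // 3
        System.out.println(lastChunkSize(length, chunkSize));  // 129699
    }
}
```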

db.getCollection('fs_Files_AND_Chunks_Collections.chunks').find({})

/* 1 */
{
    "_id" : ObjectId("5864f3852f7c37b07df406cf"),
    "files_id" : "777",
    "n" : 0,
    "data" : { "$binary" : "AAAAHGZ0eXBpc...69Ig==", "$type" : "00" }
}

/* 2 */
{
    "_id" : ObjectId("5864f3852f7c37b07df406d0"),
    "files_id" : "777",
    "n" : 1,
    "data" : { "$binary" : "Yf/Jyof/J9DSXd3iCKd...kGXCEylzw==", "$type" : "00" }
}
/* 3 */
{
    "_id" : ObjectId("5864f3852f7c37b07df406d1"),
    "files_id" : "777",
    "n" : 2,
    "data" : { "$binary" : "clrfhyq0WmQcCD...jIuMTAw", "$type" : "00" }
}
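Reading a file back amounts to fetching every chunk with the matching `files_id`, ordering the chunks by `n`, and concatenating their `data`. The `files_id_1_n_1` index in the stats output supports that lookup. The sketch below shows the ordering and concatenation step on in-memory stand-ins. `ReassembleSketch` and its nested `Chunk` class are hypothetical, and the sample payloads are invented text, not the video data shown above.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustrative only: ReassembleSketch is a hypothetical class, not a driver API.
class ReassembleSketch {

    // Minimal stand-in for a chunk document: sequence number n and payload.
    static final class Chunk {
        final int n;
        final byte[] data;

        Chunk(int n, byte[] data) {
            this.n = n;
            this.data = data;
        }
    }

    // Sort chunks by n, check that none is missing, and concatenate the payloads.
    static byte[] reassemble(List<Chunk> chunks) {
        List<Chunk> ordered = new ArrayList<>(chunks);
        ordered.sort(Comparator.comparingInt(c -> c.n));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < ordered.size(); i++) {
            Chunk chunk = ordered.get(i);
            if (chunk.n != i) {
                throw new IllegalStateException("missing or duplicate chunk at n=" + i);
            }
            out.write(chunk.data, 0, chunk.data.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        List<Chunk> unordered = Arrays.asList(
                new Chunk(2, "rld".getBytes(StandardCharsets.US_ASCII)),
                new Chunk(0, "hello".getBytes(StandardCharsets.US_ASCII)),
                new Chunk(1, " wo".getBytes(StandardCharsets.US_ASCII)));
        byte[] file = reassemble(unordered);
        System.out.println(new String(file, StandardCharsets.US_ASCII)); // hello world
    }
}
```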