MongoDB Files - Yash-777/mongo-java-driver GitHub Wiki
Mongo Files: a convention for storing large files in a MongoDB database using GridFS.
GridFS is a specification for storing and retrieving files that exceed the BSON document size limit of 16 MB. Instead of storing a file in a single document, GridFS divides a file into parts, or chunks, and stores each of those chunks as a separate document.
GridFS stores large binary files by breaking them into smaller pieces called “chunks” and saving those in MongoDB. It saves you, the application developer, from having to write the code that breaks large files into chunks, saves the individual chunks into MongoDB, and then, when retrieving a file, combines all the chunks back together. The driver's GridFS implementation provides this functionality out of the box.
GridFS uses two collections to save a file to a database: fs.files and fs.chunks. (The default prefix is “fs”, but you can rename it.)
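- fs.files « The fs.files collection contains the metadata for each stored file (one document per file).
- fs.chunks « The fs.chunks collection contains the binary content of the file, broken up into chunks. The default chunk size is 255 KB in current drivers; the sample output further down reports a chunkSize of 262144 bytes (256 KB), which was the default of older drivers such as the 2.10.1 jar used here.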
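The splitting step described above can be sketched in plain Java without a database. This is only an illustration of the idea, not the driver's implementation; the class name ChunkSplitSketch and the method split are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: mimics how a file's bytes are cut into fixed-size chunks. */
final class ChunkSplitSketch {

    /** Splits data into consecutive chunks of at most chunkSize bytes. */
    static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        int offset = 0;
        while (offset < data.length) {
            // Use long arithmetic so offset + chunkSize cannot overflow.
            int end = (int) Math.min((long) offset + chunkSize, data.length);
            chunks.add(Arrays.copyOfRange(data, offset, end));
            offset = end;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // 10 bytes cut into pieces of 4 give chunks of 4, 4 and 2 bytes.
        for (byte[] chunk : split(new byte[10], 4)) {
            System.out.println(chunk.length);
        }
    }
}
```

Every chunk except the last is exactly chunkSize bytes long; the last one holds whatever remains.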
Java Example (uses the legacy 2.x driver API: Mongo, DB, GridFS):
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.net.UnknownHostException;
import java.util.Properties;

import com.mongodb.DB;
import com.mongodb.Mongo;
import com.mongodb.gridfs.GridFS;
import com.mongodb.gridfs.GridFSDBFile;
import com.mongodb.gridfs.GridFSInputFile;

public class MongoDBFiles {
    static Properties props = new Properties();
    static String mongoDBHost, mongoDBName, mongoDBUserName, mongoDBPassword, mongoDB_BucketName;
    static Integer mongoDBPort;

    // Load the connection settings from mongo.properties on the classpath.
    static {
        ClassLoader classLoader = MongoDBFiles.class.getClassLoader();
        try (InputStream resourceAsStream = classLoader.getResourceAsStream("mongo.properties")) {
            if (resourceAsStream == null) {
                throw new IOException("mongo.properties not found on the classpath");
            }
            props.load(resourceAsStream);
            mongoDBHost = props.getProperty("mongoDBHost");
            mongoDBName = props.getProperty("mongoDBName");
            mongoDBUserName = props.getProperty("mongoDBUserName");
            mongoDBPassword = props.getProperty("mongoDBPassword");
            mongoDB_BucketName = props.getProperty("mongoDB_BucketName");
            mongoDBPort = Integer.valueOf(props.getProperty("mongoDBPort"));
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        upload("E:\\IMediaWriterVedio7.mp4", "First");
    }

    // Uploads the file to GridFS, then reads it back and writes a copy to disk.
    public static void upload(String uploadFileLocation, String fileName) {
        Mongo mongoClient = null;
        try {
            mongoClient = new Mongo(mongoDBHost, mongoDBPort); // Connect to MongoDB
            DB db = mongoClient.getDB(mongoDBName); // Get database
            boolean auth = db.authenticate(mongoDBUserName, mongoDBPassword.toCharArray());
            System.out.println("Mongo DB Authentication >>> " + auth);
            if (auth) {
                // Create an instance of the GridFS implementation for the configured bucket
                GridFS gridFs = new GridFS(db, mongoDB_BucketName);
                GridFSInputFile gridFsInputFile = gridFs.createFile(new File(uploadFileLocation));
                gridFsInputFile.setId("777"); // Fixed _id for this demo
                gridFsInputFile.setFilename(fileName); // Set a name on the GridFS entry
                gridFsInputFile.save(); // Save the file to MongoDB
                System.out.println("Uploaded Successfully.");

                // Look the file up by name and write it back to disk
                GridFSDBFile outputImageFile = gridFs.findOne(fileName);
                String downloadFileLocation = "E:\\IMediaWriterVedio" + fileName + ".mp4";
                outputImageFile.writeTo(new File(downloadFileLocation));
                System.out.println("Downloaded Successfully.");
            }
        } catch (UnknownHostException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Close the connection exactly once, whether or not an error occurred
            if (mongoClient != null) mongoClient.close();
        }
    }
}
mongo.properties file (required jars: mongo-2.10.1.jar, log4mongo-java-0.7.4.jar):
mongoDBHost : 127.0.0.1
mongoDBPort : 27017
mongoDBName : MyDBFiles
mongoDBUserName : Yash777
mongoDBPassword : secretKEY
mongoDB_BucketName : fs_Files_AND_Chunks_Collections
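The "key : value" layout above is valid input for java.util.Properties, which accepts a colon as well as "=" as the separator. The sketch below parses such text from a string instead of the classpath so it can run on its own; the class name MongoPropertiesSketch and the fallback to port 27017 are assumptions made for this example, not part of the original code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

final class MongoPropertiesSketch {

    /** Parses properties text; both "key : value" and "key=value" lines are accepted. */
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Returns the configured port, or 27017 (an assumed fallback) when the key is absent. */
    static int port(Properties props) {
        // Properties keeps trailing whitespace in values, so trim before parsing.
        return Integer.parseInt(props.getProperty("mongoDBPort", "27017").trim());
    }

    public static void main(String[] args) {
        Properties props = parse("mongoDBHost : 127.0.0.1\nmongoDBPort : 27017\n");
        System.out.println(props.getProperty("mongoDBHost") + ":" + port(props));
    }
}
```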
Fields in the GridFS data model:
- Bucket name « A prefix under which a GridFS system’s collections are stored. Collection names for the files and chunks collections are prefixed with the bucket name. The bucket name MUST be configurable by the user. Multiple buckets may exist within a single database. The default bucket name is ‘fs’.
- Chunk « A section/part of a user file, stored as a single document in the ‘chunks’ collection of a GridFS bucket. The default size for the data field in chunks is 255 KB. Chunk documents have the following form:
{
"_id" : <ObjectId>,
"files_id" : <TFileId>,
"n" : <Int32>,
"data" : <binary data>
}
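Reading a file back amounts to collecting the chunk documents that share one files_id and concatenating their data in ascending n. A minimal in-memory sketch of that step follows; the Chunk class is a hypothetical stand-in for a chunks-collection document, and none of this is the driver's own code.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class ChunkReassemblySketch {

    /** Hypothetical in-memory stand-in for one document of the chunks collection. */
    static final class Chunk {
        final String filesId;
        final int n;
        final byte[] data;

        Chunk(String filesId, int n, byte[] data) {
            this.filesId = filesId;
            this.n = n;
            this.data = data;
        }
    }

    /** Concatenates the chunks of one file in ascending n, rejecting gaps and duplicates. */
    static byte[] reassemble(List<Chunk> chunks, String filesId) {
        List<Chunk> mine = new ArrayList<>();
        for (Chunk c : chunks) {
            if (c.filesId.equals(filesId)) {
                mine.add(c);
            }
        }
        mine.sort(Comparator.comparingInt((Chunk c) -> c.n));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < mine.size(); i++) {
            Chunk c = mine.get(i);
            if (c.n != i) {
                throw new IllegalStateException("missing or duplicate chunk at n=" + i);
            }
            out.write(c.data, 0, c.data.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        List<Chunk> docs = new ArrayList<>();
        docs.add(new Chunk("777", 1, new byte[] {3, 4})); // stored out of order on purpose
        docs.add(new Chunk("777", 0, new byte[] {1, 2}));
        System.out.println(reassemble(docs, "777").length + " bytes reassembled");
    }
}
```

Sorting by n rather than relying on insertion order is the point of the sketch: the pair (files_id, n) is what identifies a chunk's position in the file.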
- Chunks collection « A collection in which chunks of a user file are stored. The name for this collection is the word ‘chunks’ prefixed by the bucket name. The default is ‘fs.chunks’.
- Files collection « A collection that holds the metadata of the stored files. There is one files collection document per stored file. The name for this collection is the word ‘files’ prefixed by the bucket name. The default is ‘fs.files’. Files collection documents have the following form:
{
"_id" : <TFileId>,
"length" : <Int64>,
"chunkSize" : <Int32>,
"uploadDate" : <BSON datetime, ms since Unix epoch in UTC>,
"md5" : <hex string>,
"filename" : <string>,
"contentType" : <string>,
"aliases" : <string array>,
"metadata" : <Document>
}
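The md5 field is a lowercase hex string of the MD5 digest over the complete file content. A small sketch of how such a string can be produced with the JDK alone is shown here; the class name Md5HexSketch is invented for the example, and the legacy driver computes the value itself during save(), so application code normally does not need this.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class Md5HexSketch {

    /** Returns the MD5 digest of content as a 32-character lowercase hex string. */
    static String md5Hex(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(content);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff)); // mask to treat the byte as unsigned
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available in this JVM", e);
        }
    }

    public static void main(String[] args) {
        // Prints 900150983cd24fb0d6963f7d28e17f72
        System.out.println(md5Hex("abc".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

Comparing this value for a downloaded copy against the stored md5 is one way to check that a file survived the round trip unchanged.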
INPUT FILE: IMediaWriterVedio7.mp4 (size on disk: 640 KB)
QUERY for DB : db.stats()
/* 1 */
{
"db" : "MyDBFiles",
"collections" : 4,
"objects" : 15,
"avgObjSize" : 78750.9333333333340000,
"dataSize" : 1.18126e+006,
"storageSize" : 8.42138e+006,
"numExtents" : 5,
"indexes" : 4,
"indexSize" : 32704,
"fileSize" : 6.71089e+007,
"nsSizeMB" : 16,
"dataFileVersion" : {
"major" : 4,
"minor" : 5
},
"extentFreeList" : {
"num" : 0,
"totalSize" : 0
},
"ok" : 1.0000000000000000
}
QUERY for Collection : db.printCollectionStats()
- CHUNKS : db.getCollection('fs_Files_AND_Chunks_Collections.chunks').stats()
- Files : db.getCollection('fs_Files_AND_Chunks_Collections.files').stats()
/* 1 */
{
"ns" : "MyDBFiles.fs_Files_AND_Chunks_Collections.chunks",
"count" : 3,
"size" : 1.1796e+006,
"avgObjSize" : 393200,
"storageSize" : 8.3968e+006,
"numExtents" : 2,
"nindexes" : 2,
"lastExtentSize" : 8.38861e+006,
"paddingFactor" : 1.0000000000000000,
"systemFlags" : 1,
"userFlags" : 1,
"totalIndexSize" : 16352,
"indexSizes" : {
"_id_" : 8176,
"files_id_1_n_1" : 8176
},
"ok" : 1.0000000000000000
}
/* 2 */
{
"ns" : "MyDBFiles.fs_Files_AND_Chunks_Collections.files",
"count" : 1,
"size" : 240,
"avgObjSize" : 240,
"storageSize" : 8192,
"numExtents" : 1,
"nindexes" : 2,
"lastExtentSize" : 8192,
"paddingFactor" : 1.0000000000000000,
"systemFlags" : 1,
"userFlags" : 1,
"totalIndexSize" : 16352,
"indexSizes" : {
"_id_" : 8176,
"filename_1_uploadDate_1" : 8176
},
"ok" : 1.0000000000000000
}
/* 3 */
{
"ns" : "MyDBFiles.system.indexes",
"count" : 4,
"size" : 704,
"avgObjSize" : 176,
"storageSize" : 8192,
"numExtents" : 1,
"nindexes" : 0,
"lastExtentSize" : 8192,
"paddingFactor" : 1.0000000000000000,
"systemFlags" : 0,
"userFlags" : 1,
"totalIndexSize" : 0,
"indexSizes" : {},
"ok" : 1.0000000000000000
}
db.getCollection('fs_Files_AND_Chunks_Collections.files').find({})
/* 1 */
{
"_id" : "777",
"chunkSize" : NumberLong(262144),
"length" : NumberLong(653987),
"md5" : "8a2834f14d2e6f67d438f0f6157e1509",
"filename" : "First",
"contentType" : null,
"uploadDate" : ISODate("2016-12-29T11:29:09.695Z"),
"aliases" : null
}
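The length and chunkSize values in this document are enough to predict how many chunk documents belong to the file. The following sketch does the arithmetic; ChunkMathSketch and its method names are made up for illustration.

```java
final class ChunkMathSketch {

    /** Number of chunks needed for a file of the given length (ceiling division). */
    static long chunkCount(long length, long chunkSize) {
        return (length + chunkSize - 1) / chunkSize;
    }

    /** Size in bytes of the final chunk; 0 for an empty file. */
    static long lastChunkSize(long length, long chunkSize) {
        if (length == 0) {
            return 0;
        }
        long remainder = length % chunkSize;
        return remainder == 0 ? chunkSize : remainder;
    }

    public static void main(String[] args) {
        long length = 653_987L;    // "length" from the files document
        long chunkSize = 262_144L; // "chunkSize" from the files document
        // Prints: 3 chunks, last chunk 129699 bytes
        System.out.println(chunkCount(length, chunkSize) + " chunks, last chunk "
                + lastChunkSize(length, chunkSize) + " bytes");
    }
}
```

The result of 3 agrees with the "count" : 3 reported for the chunks collection in the stats output: two full chunks of 262144 bytes plus a final chunk of 129699 bytes.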
db.getCollection('fs_Files_AND_Chunks_Collections.chunks').find({})
/* 1 */
{
"_id" : ObjectId("5864f3852f7c37b07df406cf"),
"files_id" : "777",
"n" : 0,
"data" : { "$binary" : "AAAAHGZ0eXBpc...69Ig==", "$type" : "00" }
}
/* 2 */
{
"_id" : ObjectId("5864f3852f7c37b07df406d0"),
"files_id" : "777",
"n" : 1,
"data" : { "$binary" : "Yf/Jyof/J9DSXd3iCKd...kGXCEylzw==", "$type" : "00" }
}
/* 3 */
{
"_id" : ObjectId("5864f3852f7c37b07df406d1"),
"files_id" : "777",
"n" : 2,
"data" : { "$binary" : "clrfhyq0WmQcCD...jIuMTAw", "$type" : "00" }
}
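The data values are shown Base64-encoded by the shell. As a quick sanity check, the visible start of chunk n = 0 can be decoded to confirm that it begins like an MP4 file. The sketch below decodes only the first 12 characters shown above (the rest is elided in the output), and the class name ChunkPrefixSketch is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class ChunkPrefixSketch {

    /** Decodes a Base64 prefix whose length is a multiple of 4 characters. */
    static byte[] decodePrefix(String base64Prefix) {
        return Base64.getDecoder().decode(base64Prefix);
    }

    /** Reads bytes 4..7 as ASCII; in an MP4 file this is the type of the first box. */
    static String boxType(byte[] header) {
        return new String(header, 4, 4, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] header = decodePrefix("AAAAHGZ0eXBp"); // first 12 characters of chunk 0
        // Prints: ftyp
        System.out.println(boxType(header));
    }
}
```

The "ftyp" marker at offset 4 is what an MP4 container is expected to start with, so the first chunk does hold the beginning of the uploaded video.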