High level view of the process - divyadeep1/Internship_at_IITk GitHub Wiki

Overall process:

Images of sketches of floor plans (from a dataset) go through a pipeline, with the result being a 3D render of the sketch and a backprojection of the rendered model.

The dataset:

The dataset we worked with consists of over 40K images of floor plans (top views of floors, with room labels and dimensions specified in the image itself). A few images can be found in the './raw_images/' folder.

The Pipeline:

Rendering the 2D images in 3D requires a number of intermediate steps. Together, these steps are referred to as the pipeline, with the original sketch (image) as input and the 3D render and backprojection as output. For each image, the following steps are executed in the order stated:

1. Box+Thresh+Inversion:

Put simply, this pass takes the raw image, tries to remove noise, and generates a black-and-white image containing only the thick lines found in the raw image. Improvements at this stage, i.e. better noise removal, lead to better overall results.
Input - Raw image (./raw_images/)
Output - B/W image (./Input_images/)
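
The idea behind this pass can be sketched in a few lines. This is a minimal illustration, not the project's code: it assumes that "Box" refers to a box (mean) filter, and the function names, the 3x3 kernel, and the threshold of 128 are hypothetical choices. It uses plain Python lists so that it runs without dependencies; in practice OpenCV's `cv2.blur` and `cv2.threshold` would do the same job.

```python
def box_blur(img, k=3):
    """Average each pixel with its k x k neighbourhood (borders are clamped)."""
    h, w = len(img), len(img[0])
    r = k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total // (k * k)
    return out


def threshold_and_invert(img, thresh=128):
    """Dark pixels (ink) become 255, everything else becomes 0."""
    return [[255 if v < thresh else 0 for v in row] for row in img]


def preprocess(img, k=3, thresh=128):
    """Box filter, then threshold and invert in one step."""
    return threshold_and_invert(box_blur(img, k), thresh)


# A 7x7 white image with a 3-pixel-thick dark vertical line (columns 2-4)
# and one isolated dark pixel at row 3, column 6 acting as noise.
img = [[0 if 2 <= x <= 4 else 255 for x in range(7)] for _ in range(7)]
img[3][6] = 0

result = preprocess(img)
```

In this example the thick line survives as white pixels on a black background, while the isolated dark pixel is averaged away by the box filter before thresholding.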

2. Hough line Transform:

The Hough line transform detects straight lines in an image; we use it here to extract the horizontal and vertical lines. We use OpenCV's implementation in this step of the pipeline (see https://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/hough_lines/hough_lines.html for documentation). The output of this pass, the lines present in the image, forms the skeleton of the final render, so the better the line detection, the better the final result.
Input - B/W image (./Input_images/)
Output - Lines present in the image as detected by OpenCV's Hough line transform (./Screenshots/Snapped_hough_lines/)
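
The output folder name suggests that detected lines are snapped to the horizontal or vertical axis. The exact rule is not described here, so the following is only a sketch under assumptions: segments are `(x1, y1, x2, y2)` tuples (the layout used by OpenCV's probabilistic Hough transform), and `snap_segment` and its 5-degree tolerance are hypothetical.

```python
import math


def snap_segment(seg, tol_deg=5.0):
    """Snap a nearly axis-aligned segment to the axis; return None otherwise."""
    x1, y1, x2, y2 = seg
    # Angle between the segment and the horizontal axis, in the range 0..90.
    angle = math.degrees(math.atan2(abs(y2 - y1), abs(x2 - x1)))
    if angle <= tol_deg:
        # Nearly horizontal: both endpoints share the mean y.
        y = round((y1 + y2) / 2)
        return (min(x1, x2), y, max(x1, x2), y)
    if angle >= 90.0 - tol_deg:
        # Nearly vertical: both endpoints share the mean x.
        x = round((x1 + x2) / 2)
        return (x, min(y1, y2), x, max(y1, y2))
    return None  # diagonal segments are discarded


detected = [(10, 50, 110, 52), (30, 200, 32, 20), (0, 0, 50, 50)]
snapped = [s for s in map(snap_segment, detected) if s is not None]
```

Here the first segment becomes horizontal, the second becomes vertical, and the diagonal one is dropped.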

3. Cleaning:

The output of the previous step contains a lot of noise: overlapping lines, un-fused corners, small redundant lines detected over text, and some lines that do not overlap with the original image. This pass tries to handle a few of these problems: overlapping lines are merged into single lines, corners of lines that should have been connected are fused, and small redundant lines are removed. Small lines that are actually present in the image still sometimes cause problems.
Input - Lines present in the image as detected by OpenCV's Hough line transform (./Screenshots/Snapped_hough_lines/)
Output - Cleaned lines (as per the criteria stated above) (./Screenshots/Cleaned_lines)
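
Two of the cleaning rules above, merging overlapping lines and removing small ones, can be sketched as follows for horizontal lines. This is an illustration under assumptions, not the project's implementation: segments are `(x1, y, x2)` tuples with `x1 <= x2`, and `merge_horizontal` and its tolerances are hypothetical. Corner fusing is not shown.

```python
def merge_horizontal(segments, y_tol=3, gap_tol=5, min_len=10):
    """Merge overlapping horizontal segments, then drop the short ones."""
    # 1. Group segments whose y is within y_tol of the previous segment's y.
    rows = []
    for seg in sorted(segments, key=lambda s: s[1]):
        if rows and seg[1] - rows[-1][-1][1] <= y_tol:
            rows[-1].append(seg)
        else:
            rows.append([seg])

    cleaned = []
    for row in rows:
        y = row[0][1]  # smallest y in the group represents the whole row
        # 2. Merge x-intervals that overlap or are at most gap_tol apart.
        row.sort(key=lambda s: s[0])
        cur_x1, _, cur_x2 = row[0]
        for x1, _, x2 in row[1:]:
            if x1 <= cur_x2 + gap_tol:
                cur_x2 = max(cur_x2, x2)
            else:
                cleaned.append((cur_x1, y, cur_x2))
                cur_x1, cur_x2 = x1, x2
        cleaned.append((cur_x1, y, cur_x2))

    # 3. Remove segments that are too short to be walls.
    return [s for s in cleaned if s[2] - s[0] >= min_len]


lines = [(0, 100, 50), (45, 101, 120), (200, 100, 300), (400, 250, 404)]
cleaned_lines = merge_horizontal(lines)
```

The first two segments overlap and collapse into one, the third stays separate because the gap is too large, and the 4-pixel segment is removed as noise. Vertical lines can be handled the same way with the axes swapped.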

4. Extrusion and generation of .ply file:

Up to the previous step, we were dealing with one-dimensional lines, which can't be rendered in 3D. We need to extend the 1D lines to 2D and then extrude the 2D planes to 3D. This step does exactly that. The generated 3D surfaces are written to a PLY file. We chose the PLY format because writing a parser for it for OpenGL was simple, and because it can be loaded in WebGL with existing libraries (for example, three.js provides a PLYLoader), should we wish to render the 3D model in a browser.
Input - Cleaned lines (./Screenshots/Cleaned_lines)
Output - .ply file containing the vertices, normals, and surfaces of the 3D model. (./PLY source files/Generated)
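
The extrusion can be sketched as follows. This is a simplified illustration, not the project's code: each axis-aligned segment is first widened into a rectangle and then raised into a box, and the result is written as an ASCII PLY string. The names `wall_box`, `BOX_FACES`, and `to_ply`, the wall thickness and height, and the choice of y as the up axis are all assumptions. Normals, which the real output contains, are omitted to keep the example short.

```python
def wall_box(seg, thickness=4.0, height=30.0):
    """Return the 8 corners of a box built from an axis-aligned segment.

    The plan lies in the x-z plane and y points up. Segments are assumed
    to be exactly horizontal or vertical.
    """
    x1, z1, x2, z2 = (float(v) for v in seg)
    t = thickness / 2.0
    if z1 == z2:  # horizontal segment: widen along z
        xmin, xmax, zmin, zmax = min(x1, x2), max(x1, x2), z1 - t, z1 + t
    else:         # vertical segment: widen along x
        xmin, xmax, zmin, zmax = x1 - t, x1 + t, min(z1, z2), max(z1, z2)
    base = [(xmin, zmin), (xmax, zmin), (xmax, zmax), (xmin, zmax)]
    floor = [(x, 0.0, z) for x, z in base]   # 1D line -> 2D rectangle
    top = [(x, height, z) for x, z in base]  # 2D rectangle -> 3D box
    return floor + top


# Quad faces of one box as indices into its 8 corners
# (winding order is not made consistent in this sketch).
BOX_FACES = [(0, 1, 2, 3), (4, 5, 6, 7), (0, 1, 5, 4),
             (1, 2, 6, 5), (2, 3, 7, 6), (3, 0, 4, 7)]


def to_ply(segments, thickness=4.0, height=30.0):
    """Build an ASCII PLY document containing one box per segment."""
    vertices, faces = [], []
    for seg in segments:
        offset = len(vertices)
        vertices.extend(wall_box(seg, thickness, height))
        faces.extend(tuple(i + offset for i in f) for f in BOX_FACES)
    lines = ["ply", "format ascii 1.0",
             f"element vertex {len(vertices)}",
             "property float x", "property float y", "property float z",
             f"element face {len(faces)}",
             "property list uchar int vertex_indices",
             "end_header"]
    lines += [f"{x} {y} {z}" for x, y, z in vertices]
    lines += [f"{len(f)} " + " ".join(map(str, f)) for f in faces]
    return "\n".join(lines) + "\n"


ply_text = to_ply([(0, 0, 100, 0), (100, 0, 100, 80)])
```

Two wall segments produce 16 vertices and 12 faces. Writing `ply_text` to a file with a `.ply` extension gives a model that common PLY viewers should be able to open.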

These 4 steps are run for each image, and the resulting .ply file is ready to render.

5. Rendering:

We can render the model in OpenGL or WebGL:
5.a In OpenGL - Rendering in OpenGL displays the scene in a GLFW window. The mouse is used for looking around, and the keyboard arrow keys are used for navigating the scene.
5.b In WebGL - Rendering in WebGL requires a modern browser with JavaScript and WebGL support. To rotate the scene, press and hold the left mouse button and drag the mouse. The arrow keys move the camera in a horizontal plane, and the mouse scroll wheel zooms in and out of the scene. The disadvantage of this approach is that it consumes more resources than OpenGL. On the positive side, the controls are pre-implemented and work well.

6. Backprojection (optional):

As an optional step, we can backproject the rendered model, i.e., take a screenshot of its orthographic projection (instead of the default perspective projection), in order to analyze the problems that were carried forward through the pipeline and devise ways of rectifying them.
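
What makes the orthographic projection useful here can be shown with a small sketch. It assumes a top-down view with y as the up axis, so that the result is directly comparable with the original floor plan; the function name `ortho_top_view` and the scaling scheme are hypothetical. Unlike a perspective projection, nothing is divided by depth, so the height of a vertex has no effect on where it lands.

```python
def ortho_top_view(vertices, width, height):
    """Project (x, y, z) vertices straight down onto a width x height image."""
    xs = [v[0] for v in vertices]
    zs = [v[2] for v in vertices]
    # One uniform scale for both axes keeps the plan's proportions.
    span = max(max(xs) - min(xs), max(zs) - min(zs)) or 1.0
    scale = min(width - 1, height - 1) / span
    return [(round((x - min(xs)) * scale), round((z - min(zs)) * scale))
            for x, _, z in vertices]


# The second and third vertices differ only in height (y),
# so they land on the same pixel.
corners = [(0, 0, 0), (100, 0, 0), (100, 30, 0), (0, 30, 50)]
pixels = ortho_top_view(corners, 101, 101)
```

Because the bottom and top of each wall coincide in this view, the backprojection looks like a redrawn floor plan and can be compared with the input sketch.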