Web Service - KeithWilliamsGMIT/4th-Year-Mobile-Application-Development-Project GitHub Wiki

Overview

The web service for this project will have a number of responsibilities. This investigation will try to determine which technologies would be the most suitable for the given tasks.

Responsibilities

The following list outlines the responsibilities of this web service.

  1. Must provide a REST API for the mobile application.
  2. Must be capable of interacting with a database for storing user data.
  3. Must be deployable to Azure.
  4. Must be able to extract text from an image.

Technologies

Based on the responsibilities listed above, Python 3 seems to be a suitable language for developing this web service. The main reason is that the web service is not the main focus of the project, and Python allows for quick prototyping, which is exactly what is needed here. Python is also an interpreted language, so the same code runs on any operating system with a Python interpreter installed. Therefore, there shouldn't be a problem deploying the web service to Azure. However, Python alone will not be sufficient. It will also need to integrate with the following technologies in order to provide the required functionality.

Flask

Two of the most widely used web frameworks for Python are Django and Flask. For this project I chose to use the Flask micro-framework. It's a very lightweight framework that is quick and easy to set up. Flask will be used to provide the REST API, which will be consumed by the mobile application.
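
As a rough illustration of how little code a Flask REST API needs, here is a minimal sketch. The `/receipts` route, the `user_id` and `image_path` field names, and the in-memory list standing in for the database are all assumptions made for this example, not the project's actual API.

```python
def validate_receipt_payload(payload):
    """Return a list of problems with an incoming payload (empty if valid)."""
    if not isinstance(payload, dict):
        return ["body must be a JSON object"]
    errors = []
    # Hypothetical required fields, chosen for illustration only.
    for field in ("user_id", "image_path"):
        if not payload.get(field):
            errors.append(f"missing field: {field}")
    return errors


def create_app():
    """Application factory that builds the Flask app and its routes."""
    # Imported here so the validation helper above has no Flask dependency.
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    receipts = []  # in-memory stand-in for the database

    @app.route("/receipts", methods=["GET"])
    def list_receipts():
        return jsonify(receipts)

    @app.route("/receipts", methods=["POST"])
    def add_receipt():
        # silent=True returns None instead of raising on a malformed body.
        payload = request.get_json(silent=True)
        errors = validate_receipt_payload(payload)
        if errors:
            return jsonify({"errors": errors}), 400
        receipts.append(payload)
        return jsonify(payload), 201

    return app

# Calling create_app().run(debug=True) starts Flask's development server.
```

Keeping the validation in a plain function means it can be tested without starting a server or installing Flask.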

MongoDB

A database will be required to persist user data. This data will not be complex: each entry will contain a path to the image of the receipt, the date that was parsed from it and the identity of the user. MongoDB, a NoSQL document database, seems suitable for this project as each receipt can be treated as a new document. There are several drivers available that allow Python 3 to interact with MongoDB. The recommended driver is PyMongo.
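
The receipt-as-document idea could be sketched with PyMongo roughly as follows. The connection URI, the `receipts_db` database, the `receipts` collection and the field names are assumptions for illustration. Only `MongoClient`, `insert_one` and `find` are PyMongo calls.

```python
from datetime import datetime


def build_receipt_document(user_id, image_path, receipt_date):
    """Build the document stored for one receipt.

    receipt_date should be a datetime.datetime, since PyMongo cannot
    encode a plain datetime.date.
    """
    return {
        "user_id": user_id,
        "image_path": image_path,
        "date": receipt_date,
    }


def get_receipts_collection(uri="mongodb://localhost:27017"):
    """Connect to MongoDB and return the (hypothetical) receipts collection."""
    # Imported here so the other helpers can be used without PyMongo.
    from pymongo import MongoClient

    client = MongoClient(uri)
    return client["receipts_db"]["receipts"]


def save_receipt(collection, document):
    """Insert one receipt document and return its generated _id."""
    return collection.insert_one(document).inserted_id


def find_receipts_for_user(collection, user_id):
    """Return all receipt documents belonging to one user."""
    return list(collection.find({"user_id": user_id}))
```

With a MongoDB server running locally, usage might look like `save_receipt(get_receipts_collection(), build_receipt_document("u1", "images/r1.png", datetime(2017, 11, 3)))`.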

Tesseract

One of the key pieces of functionality that this web service must have, and one of the most difficult to implement, is extracting text from an image. Python offers bindings to many computer vision and OCR (Optical Character Recognition) libraries, for example pytesseract for the Tesseract OCR library. This library is a good starting point for extracting text from the image. Although Tesseract itself is written in C++, using it from Python allows for the quick prototyping mentioned earlier, which will help to reduce development time.
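
A minimal sketch of the extraction step, assuming pytesseract, Pillow and the Tesseract binary are installed. The `find_date` helper and its day/month/year pattern are assumptions for illustration, since real receipts may print dates in other formats.

```python
import re
from datetime import datetime

# Assumed format: day, month and four-digit year separated by / . or -
DATE_PATTERN = re.compile(r"\b(\d{1,2})[/.-](\d{1,2})[/.-](\d{4})\b")


def extract_text(image_path):
    """Run Tesseract OCR on an image file and return the recognised text."""
    # Imported here so find_date can be used without the OCR dependencies.
    from PIL import Image
    import pytesseract

    return pytesseract.image_to_string(Image.open(image_path))


def find_date(text):
    """Return the first valid day/month/year date in the text, or None."""
    for match in DATE_PATTERN.finditer(text):
        day, month, year = (int(part) for part in match.groups())
        try:
            return datetime(year, month, day)
        except ValueError:
            continue  # skip impossible dates such as 31/02/2017
    return None
```

The two steps would be chained as `find_date(extract_text("receipt.png"))`, where `receipt.png` is a placeholder path. OCR output is often noisy, so the date parsing is kept separate and can be tested on plain strings.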

Instructions for setting up and running this web service can be found in the README file for this project.