Program modules - shevdan/OCR_DataBases GitHub Wiki

Modules

Our program contains 4 modules.

Three of them implement the Data Bases Extension:

image_augment.py

This module increases the number of images available for training an ML model by applying random changes to each source image. It consists of 1 class - ImageAugment().

Attributes:

fullpath: str

    path to a zip file containing the images to be multiplied.
    Currently the input must be an archive

temp_drectory: Path

    Path of the temporary directory where all the files
    from the archive will be extracted

Methods

unzip_files()

    extracts the zip file into the temporary directory

augment_image(filename, number_mult)

    applies ImageDataGenerator to the image at the filename path and
    generates number_mult randomly modified images from it

process_folder(number_mult, dir_path)

    recursively walks through all the directories under dir_path
    and applies augment_image to every image

zip_files()

    zips the resulting files and removes the temporary directory
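
The unzip, walk, augment, and zip steps above can be sketched with the standard library alone. This is a minimal sketch, not the module's code: `augment_archive`, `fake_augment`, and `IMAGE_SUFFIXES` are hypothetical names, and the augmentation itself is stubbed out as a plain file copy instead of the ImageDataGenerator call the module uses.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path

# Assumption: which file extensions count as images.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def fake_augment(image_path: Path, number_mult: int) -> None:
    """Stand-in for augment_image: writes number_mult copies next to the original."""
    for i in range(number_mult):
        copy_name = f"{image_path.stem}_aug{i}{image_path.suffix}"
        shutil.copy(image_path, image_path.with_name(copy_name))


def augment_archive(fullpath: str, number_mult: int, output: str) -> Path:
    """Unzip fullpath, augment every image found, and zip the result to output."""
    temp_directory = Path(tempfile.mkdtemp())
    try:
        with zipfile.ZipFile(fullpath) as archive:
            archive.extractall(temp_directory)
        # Snapshot the image list first so new copies are not augmented again.
        images = [
            path for path in temp_directory.rglob("*")
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
        ]
        for image in images:
            fake_augment(image, number_mult)
        # make_archive appends ".zip" to the base name it is given.
        return Path(shutil.make_archive(output, "zip", root_dir=temp_directory))
    finally:
        # Mirrors zip_files(): the temporary directory is always removed.
        shutil.rmtree(temp_directory, ignore_errors=True)
```

Collecting the image paths before augmenting keeps the walk from picking up files that were generated during the same run.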

data_adt.py

Module that implements an ADT for working with files and expanding the data needed for ML. You can find details about this module here

convert_csv.py

Module that converts csv files containing image pixel data into image files. It consists of 1 class - CSVConvert()

Attributes:

fullpath: str

    path to the archive containing csv files
    Note: the output is correct only for csv files whose first
    column holds the image's unicode character and whose
    remaining columns hold the pixel values

im_size: tuple

    size of the images that will be saved.
    Default value is 28x28

output: str

    name of the archive with images that will be created in the same
    directory as csv archive. Default value is train_images

Methods:

unzip_files()

    extracts the zip file into the temporary directory

process_files(dir_str)

    recursively walks through all the directories under dir_str
    and converts each image array stored in the csv files into an image

convert_csv_to_image(filename)

    converts a numpy array into an image and
    saves it into the folder named after the image's symbol
    inside the output_file directory.

pixels_to_img(pixels, symb, cnt)

    converts one numpy array into an image and saves it into
    the folder named after the image's symbol
    inside the output_file directory

convert()

    extracts the archive containing csv files, converts the
    images stored in them, and archives the folder with the images
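
The row layout described above (symbol in the first column, pixels in the rest) can be illustrated as follows. This is a sketch under stated assumptions: `row_to_pgm` and `convert_csv_text` are hypothetical helpers, pixel values are assumed to be integers in 0..255, and the image is written as a binary PGM file using only the standard library, whereas the module itself works with numpy arrays.

```python
import csv
import io
from pathlib import Path


def row_to_pgm(row, im_size=(28, 28)):
    """Split one csv row into (symbol, PGM bytes); row[0] is the symbol."""
    symbol, pixels = row[0], [int(value) for value in row[1:]]
    width, height = im_size
    if len(pixels) != width * height:
        raise ValueError(f"expected {width * height} pixels, got {len(pixels)}")
    # Binary PGM: text header, then one byte per pixel in row-major order.
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return symbol, header + bytes(pixels)


def convert_csv_text(csv_text, output_dir, im_size=(28, 28)):
    """Write every row of csv_text as output_dir/<symbol>/<counter>.pgm."""
    written = []
    for cnt, row in enumerate(csv.reader(io.StringIO(csv_text))):
        if not row:
            continue  # skip blank lines
        symbol, data = row_to_pgm(row, im_size)
        folder = Path(output_dir) / symbol  # one folder per symbol
        folder.mkdir(parents=True, exist_ok=True)
        target = folder / f"{cnt}.pgm"
        target.write_bytes(data)
        written.append(target)
    return written
```

Symbols that are not valid folder names (for example `/`) would need escaping before being used as a path component; the sketch does not handle that case.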

The fourth module handles OCR. It consists of 1 class - OCR()

Class Attributes:

API_KEY

    string containing the secret key required to use the Azure API

ENDPOINT

    endpoint of the Azure API; also required to use the API

Attributes:

img_directory

    directory where the images are stored. Note: works even if the directory also contains files that are not images

filename

    string that defines the name of the file that will contain the recognized text

api_key

    key for the Microsoft Azure API; required to use the API

language

    optional parameter defining the language of the text to be recognized.
    Default value is 'en' for English. Possible values include: 'en',
    'es', 'fr', 'de', 'it', 'nl', 'pt'. Support for Azure OCR v. 3.2, which
    covers over 70 languages, is planned for the near future.

headers

    dictionary of headers sent with requests to the Azure API

params

    dictionary containing parameters of the request to the API
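
As an illustration of the `headers` and `params` attributes, the sketch below builds both dictionaries. The function name `build_request` and the `SUPPORTED_LANGUAGES` constant are hypothetical. The header names follow the usual Azure Cognitive Services convention for uploading binary image data, but the exact keys the module sends are an assumption.

```python
# Language codes listed for the `language` attribute above.
SUPPORTED_LANGUAGES = {"en", "es", "fr", "de", "it", "nl", "pt"}


def build_request(api_key: str, language: str = "en"):
    """Return (headers, params) for one recognition request."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    headers = {
        # Assumption: the key is passed in the subscription-key header.
        "Ocp-Apim-Subscription-Key": api_key,
        # Assumption: the image is sent as raw bytes in the request body.
        "Content-Type": "application/octet-stream",
    }
    params = {"language": language}
    return headers, params
```

Validating the language code before sending anything avoids spending an API call on a request that cannot succeed.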

Methods:

get_text(pathToImage)

    sends a request to the API and returns the json response
    converted into a python dictionary.
    Takes the full path to the image, formed by joining
    the directory and the name of the file inside it

parse_text()

    parses the json and gets the text from the image.

handler()

    iterates over each file within a directory, sends requests to the API
    to recognize text, gets the recognized text and saves it into the file.
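
The interplay of handler() and parse_text() can be sketched as below. Everything here is illustrative: `extract_text`, `handle_directory`, and `IMAGE_SUFFIXES` are hypothetical names, the API call is replaced by a `recognize` callable passed in by the caller, and the response is assumed to have a regions → lines → words layout. The actual shape depends on which Azure endpoint the module calls.

```python
from pathlib import Path

# Assumption: which file extensions are treated as images.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def extract_text(response: dict) -> str:
    """Join words into lines, assuming a regions -> lines -> words layout."""
    lines = []
    for region in response.get("regions", []):
        for line in region.get("lines", []):
            words = (word["text"] for word in line.get("words", []))
            lines.append(" ".join(words))
    return "\n".join(lines)


def handle_directory(img_directory, filename, recognize):
    """Run recognize on every image in img_directory and save the text to filename.

    recognize stands in for get_text: it takes an image path and returns
    the parsed json response as a dictionary.
    """
    chunks = []
    for path in sorted(Path(img_directory).iterdir()):
        # Files that are not images are skipped, as the img_directory note describes.
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            chunks.append(extract_text(recognize(path)))
    Path(filename).write_text("\n".join(chunks), encoding="utf-8")
    return len(chunks)
```

Passing the recognizer in as a callable keeps the directory loop testable without network access or an API key.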