Program modules - shevdan/OCR_DataBases GitHub Wiki
Modules
Our program contains 4 modules.
Three of them implement the Data Bases Extension:
image_augment.py
This module increases the number of images available for training an ML model by applying random changes to each image. It consists of one class, ImageAugment().
Attributes:
fullpath: str
path to the zip file containing the images to be augmented.
Currently the input must be an archive
temp_drectory: Path
Path of the temporary directory where all the files
from the archive will be extracted
Methods
unzip_files()
extracts the zip file into the temporary directory
augment_image(filename, number_mult)
applies ImageDataGenerator to the base image at the filename path
and generates the given number of randomly altered images from it
process_folder(number_mult, dir_path)
recursively walks through all the directories located by the dir_path
and applies augment_image to every image
zip_files()
zips the file and removes the temporary directory
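The unzip, walk, and re-zip flow described above can be sketched with the standard library alone. This is a minimal sketch, not the module's actual code: `augment_archive`, `copy_augment`, `output_path`, and `IMAGE_SUFFIXES` are hypothetical names, the result is written to a separate archive as an assumption, and `copy_augment` is only a placeholder for the step where the real class applies ImageDataGenerator.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path
from typing import Callable

# Assumption: which file extensions count as images.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def copy_augment(image_path: Path, number_mult: int) -> None:
    """Placeholder augmenter: writes plain copies where a real one
    would write randomly altered variants of the image."""
    for i in range(number_mult):
        target = image_path.with_name(
            f"{image_path.stem}_aug{i}{image_path.suffix}"
        )
        shutil.copyfile(image_path, target)


def augment_archive(
    fullpath: str,
    output_path: str,
    number_mult: int,
    augment: Callable[[Path, int], None],
) -> None:
    """Extract an archive, augment every image in it, and zip the result."""
    temp_directory = Path(tempfile.mkdtemp())
    try:
        # Step 1 (unzip_files): extract into the temporary directory.
        with zipfile.ZipFile(fullpath) as archive:
            archive.extractall(temp_directory)

        # Step 2 (process_folder): visit nested directories recursively.
        # The listing is materialized first, so freshly written variants
        # are not augmented again.
        for path in sorted(temp_directory.rglob("*")):
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
                augment(path, number_mult)

        # Step 3 (zip_files): pack originals and variants together.
        with zipfile.ZipFile(output_path, "w", zipfile.ZIP_DEFLATED) as archive:
            for path in sorted(temp_directory.rglob("*")):
                if path.is_file():
                    arcname = path.relative_to(temp_directory).as_posix()
                    archive.write(path, arcname)
    finally:
        # The temporary directory is removed even if a step fails.
        shutil.rmtree(temp_directory)
```

Passing the augmenter in as a callable keeps the file handling testable without an ML dependency; a Keras-based function with the same signature could be supplied in its place.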
data_adt.py
Module that implements ADT for operating with files and expanding data needed for ML. You can find details about this module here
convert_csv.py
Module that converts images stored as rows of pixel values in csv files into image files. It consists of one class, CSVConvert()
Attributes:
fullpath: str
path to the archive containing csv files
Note: correct output is produced only for csv files whose
first column contains the image's unicode character
and whose remaining columns contain the pixels
im_size: tuple
tuple that contains size of the image that will be saved.
Default value is 28x28 size
output: str
name of the archive with images that will be created in the same
directory as csv archive. Default value is train_images
Methods:
unzip_files()
extracts the zip file into the temporary directory
process_files(dir_str)
recursively walks through all the directories under the given path
and converts each image array stored in the csv files into an image
convert_csv_to_image(filename)
converts a numpy array into an image and
saves it into the folder named after the image's symbol
inside the output_file directory.
pixels_to_img(pixels, symb, cnt)
converts one numpy array into an image and saves it into
the folder named after the image's symbol
inside the output_file directory
convert()
extracts the archive containing csv files,
converts the images stored in them and archives the folder with images
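The row layout described above (symbol in the first column, pixels in the rest) can be illustrated with a short sketch. It uses plain lists where the module uses numpy arrays, the helper name `row_to_pixels` is hypothetical, and reading `im_size` as (height, width) is an assumption that makes no difference for the default 28x28 size.

```python
import csv
import io
from typing import List, Tuple


def row_to_pixels(
    row: List[str], im_size: Tuple[int, int] = (28, 28)
) -> Tuple[str, List[List[int]]]:
    """Split one csv row into its symbol and a grid of pixel values."""
    height, width = im_size  # assumption: (height, width) ordering
    symbol = row[0]
    values = [int(v) for v in row[1:]]
    if len(values) != height * width:
        raise ValueError(
            f"expected {height * width} pixels, got {len(values)}"
        )
    # Cut the flat pixel list into `height` rows of `width` values each.
    grid = [values[r * width:(r + 1) * width] for r in range(height)]
    return symbol, grid


# Two tiny 2x2 "images" stand in for real 28x28 rows.
sample = io.StringIO("A,0,255,255,0\nB,255,0,0,255\n")
for csv_row in csv.reader(sample):
    symb, pixels = row_to_pixels(csv_row, im_size=(2, 2))
    print(symb, pixels)
```

The length check reflects the note on the expected csv format: a row whose pixel count does not match the image size cannot be reshaped and is better rejected than silently truncated.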
The fourth module handles OCR. It consists of one class, OCR()
Class Attributes:
API_KEY
string containing the secret key required to use the Azure API
ENDPOINT
the second element required to use the API
Attributes:
img_directory
directory where the images are stored. Note: works even if the directory contains files other than images
filename
string that defines the name of the file that will contain the recognized text
api_key
Key to Microsoft Azure API. Necessary to use API
language
optional parameter defining the text language to be recognized.
Default value is 'en' for English. Possible values include: 'en',
'es', 'fr', 'de', 'it', 'nl', 'pt'. Support for Azure OCR v. 3.2, which
covers over 70 languages, is planned for the near future.
headers
dictionary of headers sent with requests to the Azure API
params
dictionary containing parameters of the request to the API
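As an illustration of the headers and params attributes, the sketch below builds both dictionaries from an API key and a language code. The function name `build_request_parts` is hypothetical, and the exact keys the class uses are not stated on this page: `Ocp-Apim-Subscription-Key` and the `application/octet-stream` content type are the usual choices when sending raw image bytes to Azure Cognitive Services, and the single `language` parameter is an assumption.

```python
from typing import Dict, Tuple

# Language codes listed in the attribute description above.
SUPPORTED_LANGUAGES = {"en", "es", "fr", "de", "it", "nl", "pt"}


def build_request_parts(
    api_key: str, language: str = "en"
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (headers, params) for a request to the OCR service."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    headers = {
        # Standard Azure Cognitive Services authentication header.
        "Ocp-Apim-Subscription-Key": api_key,
        # Assumption: the image is sent as raw bytes in the request body.
        "Content-Type": "application/octet-stream",
    }
    params = {"language": language}
    return headers, params


headers, params = build_request_parts("<your-key>", language="de")
```

Validating the language before any request is made gives a clear local error instead of a rejected API call.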
Methods:
get_text(pathToImage)
sends a request to the API and returns the json response
converted into a python dictionary.
Takes the full path to the image, obtained by concatenating
the directory and the name of the file inside the directory
parse_text()
parses the json and gets the text from the image.
handler()
iterates over each file within a directory, sends requests to the API
to recognize text, gets the recognized text and saves it into the file.
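The loop that handler() performs can be sketched as follows. This is a sketch under stated assumptions rather than the class's actual code: `handle_directory` and `IMAGE_EXTENSIONS` are hypothetical names, files are selected by extension, and the API call plus json parsing are replaced by a `recognize` callable so the example runs without network access.

```python
import os
from typing import Callable, Tuple

# Assumption: files are treated as images based on their extension.
IMAGE_EXTENSIONS: Tuple[str, ...] = (".png", ".jpg", ".jpeg")


def handle_directory(
    img_directory: str,
    filename: str,
    recognize: Callable[[str], str],
) -> int:
    """Recognize text in every image of a directory and save it to a file.

    Returns the number of images processed.
    """
    processed = 0
    with open(filename, "w", encoding="utf-8") as out:
        for name in sorted(os.listdir(img_directory)):
            # Skip anything that is not an image, so other files
            # in the directory do not break the run.
            if not name.lower().endswith(IMAGE_EXTENSIONS):
                continue
            # Full path = directory + file name, as get_text expects.
            path_to_image = os.path.join(img_directory, name)
            if not os.path.isfile(path_to_image):
                continue
            out.write(recognize(path_to_image) + "\n")
            processed += 1
    return processed
```

In the real class, `recognize` would correspond to calling get_text on the path and then extracting the text with parse_text.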