Python Splitter - Gatunox/python-splitter GitHub Wiki

Welcome to the python-splitter wiki!

Splitter is a fast and easy way to split the content of a HUGE file into multiple lines.

If you have ever worked with files created on a Mainframe (in the banking industry), you know those files are always HUGE, 50 MB or so. They are created as the output of a batch process or as the input to one.

Working with this kind of file in a regular text editor like Sublime Text or Notepad++ is practically impossible: the editor will hang or, in the best case, run VERY slowly. This is because advanced text editors generally try to process the entire file at once. Maybe you just need to verify a value within the file or look over the content for a quick analysis. So why use Mainframe tools that lack features we love, like quick find or multi-line editing (Sublime Text), or wait forever for your editor to respond?

This is where Splitter comes in: it splits these files into multiple (generally thousands of) lines so you can work with them in any text editor, in my case Sublime Text 👍

Just run splitter.py with -i/--inputfile STRING (the input filename) and -b/--bytes INT (the number of bytes for each line). Optionally, use -h/--help to show the usage message.

Example 1:

MacBook-Pro-Retina:python$ **python splitter.py -h**

Python 2.7.10 (default, Oct 23 2015, 19:19:21) 
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] 
usage: splitter.py [-h] -i STRING -b INT

Splitter script will split one file in multiple lines

optional arguments:
  -h, --help            show this help message and exit
  -i STRING, --inputfile STRING
                        Input filename
  -b INT, --bytes INT   Number of bytes per line

MacBook-Pro-Retina:python$ 
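
The usage text above looks like standard argparse output. As a rough illustration only (this is not the actual splitter.py source), a parser exposing the same flags might be sketched as follows. The function name `build_parser` is hypothetical, and converting `-b` with `type=int` is an assumption; the log in Example 2 prints the value as a string, so the real script may handle it differently.

```python
import argparse


def build_parser():
    """Build a parser with the -i and -b flags shown in the usage text.

    Hypothetical helper: the name and the type=int conversion are
    assumptions, not taken from splitter.py.
    """
    parser = argparse.ArgumentParser(
        prog="splitter.py",
        description="Splitter script will split one file in multiple lines",
    )
    # Both flags appear without brackets in the usage line, so they are
    # treated as required here.
    parser.add_argument(
        "-i", "--inputfile", metavar="STRING", required=True,
        help="Input filename",
    )
    parser.add_argument(
        "-b", "--bytes", metavar="INT", type=int, required=True,
        help="Number of bytes per line",
    )
    return parser
```

With this sketch, `build_parser().parse_args(["-i", "datafile", "-b", "10"])` yields a namespace whose `inputfile` is `"datafile"` and whose `bytes` is the integer `10`.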

Example 2:

Input file content = "01234567890123456789012345678901234567890123456789012345678901234567890..." 600 bytes long.

MacBook-Pro-Retina:python$ **python splitter.py -i datafile -b 10**

Python 2.7.10 (default, Oct 23 2015, 19:19:21) 
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] 
('Input file is ', 'datafile')
('Output file is ', 'datafile-splitted')
('Number of bytes ', '10')

Starting script excution...

Splitted 600 bytes in 60 lines 

End excution.

MacBook-Pro-Retina:python$ 

Output file content = 60 lines, each containing 0123456789

IMPORTANT:

  • The input file must exist.
  • The output file is created in the same location as the input file, with "-splitted" appended to its name.
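
The behavior described on this page (fixed-size chunks, one per line, written next to the input with "-splitted" appended) can be sketched roughly as below. This is a minimal illustration, not the actual splitter.py code: the function name `split_file`, the binary read mode, and the `\n` line separator are all assumptions.

```python
def split_file(input_path, bytes_per_line, suffix="-splitted"):
    """Copy input_path to input_path + suffix, one chunk per line.

    Hypothetical sketch: reads bytes_per_line bytes at a time, so the
    whole file is never held in memory. Returns a tuple of
    (output_path, total_bytes, line_count).
    """
    if bytes_per_line <= 0:
        raise ValueError("bytes_per_line must be a positive integer")

    output_path = input_path + suffix
    total_bytes = 0
    line_count = 0

    # Binary mode keeps the byte count exact regardless of encoding.
    with open(input_path, "rb") as src, open(output_path, "wb") as dst:
        while True:
            chunk = src.read(bytes_per_line)
            if not chunk:  # an empty read means end of file
                break
            dst.write(chunk + b"\n")  # assumed separator: a single newline
            total_bytes += len(chunk)
            line_count += 1

    return output_path, total_bytes, line_count
```

For the 600-byte input from Example 2, `split_file("datafile", 10)` would return `("datafile-splitted", 600, 60)`. If the file size is not a multiple of the chunk size, the last line in this sketch is simply shorter.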

Happy Coding.