AWS Glue Python Library Build - isgaur/AWS-BigData-Solutions GitHub Wiki

To import extra Python modules or packages into an AWS Glue job, package them as an .egg file using the steps below:

  1. Create a directory, for example "paramiko_dir".
  2. Run the command "python3 -m pip install paramiko -t paramiko_dir/"
  3. Go to the directory "paramiko_dir".
  4. Create a setup.py file with the content below:

################################

from setuptools import setup

setup(
    name="paramiko_module",
    version="0.1",
    packages=['paramiko'],
    install_requires=['pynacl', 'bcrypt', 'cryptography', 'cffi', 'pycparser'],
)

################################

  5. Run the command "sudo python3 setup.py bdist_egg"
  6. The egg file, for example "paramiko_module-0.1-py3.6.egg", is created under the dist folder. Go to the dist folder and copy the .egg file to the S3 location.
  7. Create a Glue job, add the Python library path as "s3://**/paramiko_module-0.1-py3.6.egg", and then run the job.
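
The egg file name in the steps above encodes the project name, the version, and the Python version that ran the build, so the name changes if the build machine does not run Python 3.6. The following is a minimal sketch of that naming pattern, not setuptools' own code; the helper `expected_egg_name` is hypothetical, and it assumes a pure-Python package (no platform suffix) whose name has hyphens replaced by underscores.

```python
import sys


def expected_egg_name(name, version, py_version=None):
    """Guess the file name that bdist_egg writes under dist/.

    Hypothetical helper for illustration only. Assumes a pure-Python
    package, so no platform tag is appended to the file name.
    """
    if py_version is None:
        # Default to the interpreter running this script, e.g. "3.6".
        py_version = "{}.{}".format(*sys.version_info[:2])
    # Assumption: hyphens in the project name become underscores.
    safe_name = name.replace("-", "_")
    return "{}-{}-py{}.egg".format(safe_name, version, py_version)


print(expected_egg_name("paramiko_module", "0.1", "3.6"))
# prints: paramiko_module-0.1-py3.6.egg
```

Calling the helper without `py_version` on the build machine gives a quick way to check which file name to expect in dist before copying it to S3.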

Another approach

=====Commands used======

To build paramiko:

$ wget https://files.pythonhosted.org/packages/ac/15/4351003352e11300b9f44a13576bff52dcdc6e4a911129c07447bda0a358/paramiko-2.7.1.tar.gz
$ tar -xvzf paramiko-2.7.1.tar.gz
$ cd paramiko-2.7.1
$ python3 setup.py bdist_egg
$ cd dist
$ aws s3 cp paramiko-2.7.1-py3.6.egg s3:////

To build pysftp:

$ wget https://files.pythonhosted.org/packages/36/60/45f30390a38b1f92e0a8cf4de178cd7c2bc3f874c85430e40ccf99df8fe7/pysftp-0.2.9.tar.gz
$ tar -xvzf pysftp-0.2.9.tar.gz
$ cd pysftp-0.2.9
$ python3 setup.py bdist_egg
$ cd dist
$ aws s3 cp pysftp-0.2.9-py3.6.egg s3:////
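
Since pysftp imports paramiko, a job that uses pysftp would likely need both eggs on its Python library path. AWS Glue takes multiple library paths as a single comma-separated string of complete S3 paths. The sketch below builds that string; the helper `python_library_path` and the bucket prefix `s3://example-bucket/glue-libs` are placeholders, not values from this page.

```python
def python_library_path(s3_prefix, egg_names):
    """Join egg file names into one comma-separated Glue library path.

    Hypothetical helper for illustration only. s3_prefix is a placeholder
    S3 location; replace it with the location the eggs were copied to.
    """
    prefix = s3_prefix.rstrip("/")
    # Each entry must be a complete S3 path; entries are separated by commas.
    return ",".join("{}/{}".format(prefix, name) for name in egg_names)


eggs = ["paramiko-2.7.1-py3.6.egg", "pysftp-0.2.9-py3.6.egg"]
print(python_library_path("s3://example-bucket/glue-libs/", eggs))
# prints: s3://example-bucket/glue-libs/paramiko-2.7.1-py3.6.egg,s3://example-bucket/glue-libs/pysftp-0.2.9-py3.6.egg
```

The resulting string can be pasted into the Python library path field of the Glue job.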
