Converting 2 to 3 Notes - MridulS/pysal GitHub Wiki

PySAL Python 2.7 -> PySAL Python 3.4+

NOTE: This has been completed with PySAL 1.11.0

In order to have Python 2.7 and Python 3.4+ compliant code, some compatibility changes need to be made in the Python 2.7 code. We plan to maintain at least a Python 2.7 release, which gets automatically converted into Python 3.4 at install time.

Once done, we can still focus on a 2.7+ code base and use automated conversion to keep Python 3.4 compatibility. Since any CI framework can test multiple versions of the library, we can keep valid tests for both versions simultaneously with little effort.

In fact, you probably write code that is very close to python 3-compliant already.

#Where are we now?

Currently, all tests are passing in python 2.7 and python 3.4 from ljwolf's py3conv_cleaning branch. py3conv_cleaning is the current version of the python3-ready code. That branch has been cleaned up for merging to pysal/master.

sjsrey's py3merge branch has merged these changes, and its tests are passing.

This does not mean it is release-ready. Distribution and more rigorous testing are warranted.

For Travis, the build works on 2.7, but I can't figure out how to get it to build correctly on Python 3. It's missing pysal.py, and I'm not sure how or why setuptools is looking for that file.

#Working on the Port

Since tests are passing, this is probably less urgent. But, if you'd like to validate, here's how to get a copy of PySAL working in Python 3.

##Converting the Code

Clone ljwolf's py3conv_cleaning branch. This will give you two scripts in the git root, mk2 and mk3. The standard workflow is:

1. Make changes in your personal py3conv branch.
2. git add the file you changed and git commit it.
3. Use mk3 to convert the file into Python 3.
   - All this script does is create a temporary branch, 2to3, where the output of the automated 2to3 changes goes.
   - The key thing to remember is that this branch, 2to3, is ephemeral and will be rewritten every time 2to3 is run!
4. Test in Python 3, either with nosetests-3.4 or by running the actual test_*.py file with python3.
5. To get back to your committed changes in the python2 branch, run mk2.

The mk2 and mk3 scripts are fewer than 4 lines long. mk2 is only git checkout py3conv_cleaning && git stash. mk3 remakes the 2to3 branch and dumps the converted code into it.
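For readers who want to see the shape of that workflow, here is a rough Python sketch of the two scripts. The mk2 commands are taken from the description above. The mk3 commands are an assumption about what "remakes the 2to3 branch" could look like, not the contents of the real script, and the function names and the `pysal` package path are hypothetical.

```python
import subprocess

SOURCE_BRANCH = "py3conv_cleaning"  # Python 2 branch named in these notes
SCRATCH_BRANCH = "2to3"             # ephemeral branch holding converted code


def mk3_commands(package_dir="pysal"):
    """Commands a hypothetical mk3 might run (assumed, not the real script)."""
    return [
        # -B creates the branch, or resets it if it already exists
        ["git", "checkout", "-B", SCRATCH_BRANCH],
        # -w writes changes back to the files, -n skips the .bak backups
        ["2to3", "-w", "-n", package_dir],
    ]


def mk2_commands():
    """The mk2 steps as described above: check out, then stash."""
    return [
        ["git", "checkout", SOURCE_BRANCH],
        ["git", "stash"],
    ]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.check_call(cmd)
```

Calling `run(mk3_commands())` only prints the commands, which makes it safe to inspect before letting it rewrite the scratch branch with `dry_run=False`.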

##What's Broken?

Currently, all tests are passing in python 2.7 and 3.4!

###Continuous integration

Test each testing directory separately using nosetests3 -w ./test, as nosetests3 will otherwise exit out of the testing suite prematurely. Not sure why.

MUST figure out how to configure .travis.yml to test 3.4 and 2.7, while converting 3.4 inline.
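Running each test directory on its own can be scripted. The sketch below builds one `nosetests3 -w <dir>` command per directory. It assumes the test directories are all named `tests`, and the helper names are made up for illustration; adjust both to the actual layout.

```python
import os


def find_test_dirs(root, dirname="tests"):
    """Return every directory under root whose name is `dirname`, sorted."""
    found = []
    for dirpath, _dirnames, _filenames in os.walk(root):
        if os.path.basename(dirpath) == dirname:
            found.append(dirpath)
    return sorted(found)


def nosetests_commands(root, runner="nosetests3"):
    """Build one command per test directory, so one failure cannot end the run."""
    return [[runner, "-w", test_dir] for test_dir in find_test_dirs(root)]
```

Each command list can then be handed to `subprocess.call` in turn, collecting the return codes instead of stopping at the first non-zero one.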

###Potential for ship-one codebase

If we want, we can investigate the use of __future__ and six to make the codebase work in both 3 and 2 without conversion. Right now, this is a far-off goal.
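As a small illustration of what a single codebase could look like, the sketch below uses only `__future__` imports, so the same file behaves identically on 2.7 and 3.4. The function is a made-up example rather than PySAL code; it mirrors the float-versus-floor division fixes listed in the development notes.

```python
from __future__ import division, print_function


def rows_and_share(n_cells, n_cols):
    """Return the number of full rows and the share of one column."""
    rows = n_cells // n_cols  # floor division: an int on both 2 and 3
    share = 1 / n_cols        # true division: a float on both 2 and 3
    return rows, share


print(rows_and_share(10, 4))  # prints (2, 0.25) on either interpreter
```

The `__future__` import has to be the first statement in the module, which is one reason adopting this style means touching every file.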

According to this guidance, if we want to ship one code base, and assuming it is coded in a forward-3-compatible style, the setup script can be modified so that either distutils or Distribute runs 2to3 for us:

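from distutils.core import setup

try:  # Python 3: build_py_2to3 runs 2to3 over the sources at build time
    from distutils.command.build_py import build_py_2to3 as build_py
except ImportError:  # Python 2: fall back to the plain build step
    from distutils.command.build_py import build_py

setup(
    cmdclass={'build_py': build_py},
    # ...
)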

Using Distribute:

from setuptools import setup  # Distribute installs under the setuptools name

setup(
    use_2to3=True,
    # ...
)

Full Development notes:

cg: all pass
core: By far the hardest fixes are here, thanks to changes in Python3's string model.
ESDA : Errors now are only showing in smoothing, with changes made to weights.
- test_gamma.py works on first try
- test_geary.py works on first try
- test_getisord.py works on first try
- test_join_counts.py works on first try
- test_mapclassify.py has a deprecation warning, but works on first try.
- test_mixture_smoothing.py works on first try
- test_moran.py works on first try
- test_smoothing.py needed major floatdiv modification. After the [assuncao rate bugfix merged](https://github.com/pysal/pysal/pull/657) to master, this is complete.
meta : no tests. But, we know its use of urllib2 will need to be rewritten. I've tried using six's urllib in place of urllib2, as recommended.
network: once the relative import is resolved, tests pass.
Inequality: all pass
Region: given fixed components pysal/weights, the following holds:
- test_components.py works on first try
- test_maxp.py: assertion errors in all tests. @dfolch & @ljwolf fixed.
- test_randomregion.py: works on first try
spatial_dynamics : all pass
spreg : We'll just list test by test:
- diagnostics: all work
- error:
  - tests_error_sp.py fails out with AssertionError on a matrix. The two matrices differ only in the last digit of one element.
  - test_error_sp_het.py passes
  - test_error_sp_het_regimes.py fails on some float/int stuff. After fixing those, it fails on lat2W issues, which also come down to floor division. Once floor division is used where it should be, tests pass.
  - test_error_sp_het_sparse.py works on first try.
  - test_error_sp_hom.py works on first try
  - test_error_sp_hom_regimes.py looks like same initial errors as test_error_sp_het_regimes.py. Fixed by swapping out for floor div on the regime test code.
  - test_error_sp_hom_sparse.py works on first try
  - test_error_sp_regimes.py also has float/floor errors. Once fixed, we fail out on two assertion errors; in both, the compared values are exceedingly close to each other, but the gap is slightly larger than in the other cases I'd consider closed.
  - test_error_sp_sparse.py fails out on an array comparison, but visual inspection reveals the arrays are the same, save one entry's ten-thousandth place decimal. So... pretty safe to say this passes in spirit.
- ml:
  - test_ml_error.py fails on an AttributeError: can't set attribute on line 254 of ml_error. This was not an easy fix. But, essentially, it involved giving every regression property declared with @property an accompanying setter. To do this, I had to modify the caching behavior. In concept, it's this flow. This keeps the setter/getter on a similar structure. I would've liked to implement this as a decorator that modifies properties, something like cached_property, but I couldn't get it to work immediately. This is probably the cleaner method going forward, but we would need two decorators to express the getter and setter behavior described above. Once this is hacked into RegressionPropsVM and RegressionPropsY in spreg/utils.py, this, and everything else with this problem, passes.
  - test_ml_error_regimes.py had same regime problems as above. Once fixed, it had attribute errors like ml error. See ml_error.py for discussion of resolution.
  - test_ml_lag.py had the same error as ml_error.
  - test_ml_lag_regimes.py had same error as ml_lag.
- ols:
  - test_ols.py had ml_error-like attribute errors
  - test_ols_regimes.py too
  - test_ols_sparse.py works on first try
- test_probit.py had the same attribute errors, and required significant modification, since it did not inherit from RegressionPropsVM and RegressionPropsY. So, significant modification, same strategy. Tests now pass.
- twosls:
  - test_twosls.py had attribute errors all over the place. Same idea as the fix discussed in ml_error, but it needed to be custom for twosls because it overrides the vm method. Now passes.
  - test_twosls_regimes.py had both attribute errors and float-floor errors.
  - test_twosls_sp.py failed on attribute errors. See twosls for the answer. Now passes.
  - test_twosls_sp_regimes.py failed like plain twosls_regimes.py. Now passes.
  - test_twosls_sp_sparse.py failed on attribute errors. Now passes.
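The property-plus-setter fix mentioned for ml_error can be pictured with a minimal sketch. This is not the code in spreg/utils.py: the class, the `mean` attribute, and the `_cache` dictionary are hypothetical stand-ins, and the only point is that the getter and the setter read and write the same cache.

```python
class CachedProps(object):
    """Hypothetical stand-in for classes such as RegressionPropsVM."""

    def __init__(self, values):
        self.values = list(values)
        self._cache = {}

    @property
    def mean(self):
        # Compute on first access, then reuse the cached value.
        if "mean" not in self._cache:
            self._cache["mean"] = sum(self.values) / float(len(self.values))
        return self._cache["mean"]

    @mean.setter
    def mean(self, value):
        # Write to the same cache the getter reads from, so that
        # `obj.mean = x` no longer raises AttributeError.
        self._cache["mean"] = value
```

With this shape, `CachedProps([1, 2, 3]).mean` is computed lazily, and a subclass or an estimator can still overwrite it by plain assignment.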
weights:
- fixed the simple float conversion issues in tests.
- lat2W was using float div rather than floor div; fixing that makes lat2W work correctly. We currently don't test lat2W in weights, but it gets tested by proxy because it's used in spreg. Should probably also sanity check the hexagonal lattice generator.
- test_Distance fails out on using lists where we should be using sets, so the results are identical, but they're not correctly ordered. They correspond by index to the correct neighbors. Tests rewritten here using dictionaries of the form: {id:{neighbor_id:weight}}, which can be changed in the future to construct a numpy array for allclose comparisons.
- test_user fails out in what looks like the exact same error & numbers from test_Distance. Tests rewritten to use dictionaries.
- test_weights also fails out because we're comparing against lists. The set of neighbors would be fine. Tests rewritten to use dictionaries.
- test_Wsets fails out on lists v. sets. Rewritten to use sets.
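The dictionary form used in the rewritten weights tests, {id:{neighbor_id:weight}}, can be built from parallel neighbor and weight lists. The helper below is an illustrative sketch with a made-up name, not the code in the test suite; it shows why the comparison stops depending on neighbor order.

```python
def to_weight_dict(neighbors, weights):
    """Build {id: {neighbor_id: weight}} from parallel per-id lists.

    `neighbors` maps id -> list of neighbor ids, and `weights` maps
    id -> list of weights in the same order as that neighbor list.
    """
    return {i: dict(zip(neighbors[i], weights[i])) for i in neighbors}


# The same neighbors listed in a different order compare equal.
first = to_weight_dict({0: [1, 2]}, {0: [1.0, 0.5]})
second = to_weight_dict({0: [2, 1]}, {0: [0.5, 1.0]})
assert first == second
```

Dictionary equality is exact, so weights that differ by floating-point noise would still need a tolerance-based check such as the allclose comparison suggested above.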