Introduction

There is a good chance that you are still working on Python 2 product or test code. If so, you probably keep seeing the deprecation message below as a reminder whenever you work with Python 2 or pip. The key point to note is that Python 3.x is not backward compatible with Python 2.x.

“DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won’t be maintained after that date. A future version of pip will drop support for Python 2.7.”

In addition, the Python site notes: “Being the last of the 2.x series, 2.7 will receive bugfix support until 2020. Support officially stops January 1 2020, but the final release will occur after that date.” [1]

So Python 2 becomes unsupported at the end of this year. It is therefore important to migrate current Python 2 code to Python 3 syntax and to stick with Python 3 going forward. Why don't teams simply jump-start this migration? One major hurdle is that most working code simply breaks under Python 3 (read more at why-was-python-3-made-incompatible-with-python-2), either because of the language syntax itself or because of issues with third-party APIs. Why would anyone bother to migrate when the code needs no specific improvement right now, such as better performance or a reliability fix? No extra time or budget is allocated for the effort, so it sinks to the bottom of the priority list.

The same thing happened with the Couchbase test infra: the migration was not prioritized earlier. The official end-of-support announcement above is what made the team prioritize it. Missing the bug-fix support deadline is tolerable, because your code still works, but it is better to finish close to that date so that you stay in step with the rest of the Python community and its support.

This document provides tips and tricks for upgrading to Python 3, along with the common problems encountered during the Couchbase test infra migration.

Couchbase is an open source, enterprise-class, multicloud-to-edge NoSQL database. The Couchbase functional testing framework, TestRunner, was developed in Python 2. The TestRunner git repository can be found at https://github.com/couchbase/testrunner. Our goal now is a complete switch to the Python 3 runtime rather than running Python 3 and Python 2 side by side.

As part of the Python 3 porting process, we identified the major changes needed and the porting steps, and we ran into a number of problems along the way. We are sharing those problems and their solutions here so that they can help you with your own port. You can pick the latest Python 3.x version (3.7 or 3.6, depending on which pre-release, stable, or security-fixes version is available on a specific platform), which we refer to as Python 3 throughout this blog. See more details on the releases at Python releases download and Python 3 documentation.

Cheat Sheet

 

Major Changes

To give an idea of the key changes, here is a summary of the code changes needed when moving from Python 2 to Python 3.

Python 2: Text is UTF-8: str
Python 3: Text is Unicode: str, u''

Python 2: Binary is the same as text: bytes/str
  Example: file.read(6) == 'GIF89a'
Python 3: Binary data carries a b prefix: bytes, b''
  Use decode() to get the string and encode() to get bytes.
  Examples:
  file.read(6) == b'GIF89a'
  b'hello'.decode() → 'hello'
  'hello'.encode() → b'hello'
  str(b'hello') → "b'hello'"

Python 2: Print statement
  Example: print ' '
Python 3: Print function
  Example: print(' ')

Python 2: Integer division
  Example: 5/2 = 2
Python 3: Floor division; use two slashes
  Example: 5//2 = 2 and 5/2 = 2.5

Python 2: Float division
  Example: 5/2.0 = 2.5 or 5.0/2 = 2.5
Python 3: Float division; use a single slash
  Example: 5/2 = 2.5

Python 2: The long type is different from int: long
Python 3: There is no long type; it is the same as int

Python 2: xrange()
Python 3: range()

Python 2: Iteration functions had an iter prefix: iterxxx()
  Example: iteritems()
Python 3: Dropped the iter prefix: xxx()
  Example: items()

Python 2: Calls such as map(), filter(), zip(), and dict.keys() return fully built lists (all elements are loaded into memory at once)
  Example: for i in d.keys()
Python 3: The same calls return lazy iterators or views (an element is produced only when it is accessed); wrap the call in list() when a real list is needed
  Example: for i in list(d.keys())

Python 2: Dictionaries can be ordered against each other by default, so a list of dicts sorts without a key
  Example: sorted(dict)
Python 3: Dictionaries can't be ordered with < or > directly; sorted() should have a key
  Example: sorted(expected_result, key=(lambda x: x[bucket.name]['name']))
  For general dict/list comparison, you can use the below:
    from deepdiff import DeepDiff
    diffs = DeepDiff(actual_result['results'], expected_result['results'], ignore_order=True)
    if diffs:
      self.assertTrue(False, diffs)
  Bytes and strings as values:
    diffs = DeepDiff(set(actual_indexes), set(indexes_names), ignore_order=True, ignore_string_type_changes=True)

Python 2: string.replace(data[i],…)
Python 3: data[i].replace(..)

Python 2: urllib.urlencode()
Python 3: New modules:
  • http.client
  • urllib.request, urllib.error, urllib.parse
  • sgmllib3k
  Example: urllib.parse.urlencode()

Python 2: string.lowercase
Python 3: Attributes: string.ascii_lowercase, string.ascii_uppercase

See the testrunner py3 commits for the changes.
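Several of the cheat-sheet entries can be checked in a few lines under any Python 3 interpreter. The snippet below is only an illustrative sketch; the variable names and the query parameters ("bucket", "replicas") are made up for the example and do not come from the testrunner code.

```python
from urllib.parse import urlencode  # Python 2: urllib.urlencode

# Text versus binary data: bytes literals carry a b prefix.
decoded = b"hello".decode()   # bytes -> str
encoded = "hello".encode()    # str -> bytes
as_repr = str(b"hello")       # gives the repr of the bytes, not the text

# Division: / is always float division, // is floor division.
true_div = 5 / 2
floor_div = 5 // 2

# items() replaces iteritems(); range() replaces xrange().
counts = {"a": 1, "b": 2}
pairs = sorted(counts.items())
first_three = list(range(3))

# Hypothetical query parameters, used only to show the new module path.
query = urlencode({"bucket": "default", "replicas": 1})

print(decoded, encoded, as_repr, true_div, floor_div, pairs, first_three, query)
```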

Python 3 Setup

To set up Python 3 from scratch, run the commands below on a new host for each of the major supported platforms.
Later, at runtime, use either the python3 command or python inside a Python 3 virtual env. Use either pip3 or pip3.x (pip3.6, for example) to install packages, depending on the installed Python 3 version.
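Because python can still point at Python 2 outside a virtual env, a small guard at the top of an entry script can catch the wrong interpreter early. This is a sketch of one possible check, not something taken from testrunner; the function name running_python3 is hypothetical.

```python
import sys


def running_python3():
    """Return True when the current interpreter is Python 3 or newer."""
    return sys.version_info[0] >= 3


if not running_python3():
    # Stop early with a readable message instead of a later SyntaxError.
    raise SystemExit("This code needs Python 3; found %s" % sys.version.split()[0])
```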

Mac OS (Example: Your laptop)

Direct setup (pip3 automatically installed):

(https://wsvincent.com/install-python3-mac/)

Virtual environment setup:

Install required libraries:

For now, the below modification is required to the common Python 3 http client; otherwise, you would hit an error.

CentOS (Example node: Jenkins Slave)

Direct setup and virtual environment:

Install required libraries:

Perform Couchbase CSDK and Python SDK installation on new slave:

For now, the below modification is required to the common Python 3 http client; otherwise, you would hit an error.

Ubuntu (slave used for Python 3 runtime verification)

Direct setup:

Install the required libraries:

Install CSDK and Python SDK: (Ref: https://docs.couchbase.com/c-sdk/2.10/start-using-sdk.html )

For now, the below modification is required to the common Python 3 http client; otherwise, you would hit an error.

Windows

Download and install: https://www.python.org/ftp/python/3.7.4/python-3.7.4.exe

Install required libraries:


Porting Process

At a high level, porting is a three-step process: 1) auto conversion, 2) manual changes, 3) runtime validation and fix.

First, clone the original repository and apply the basic automatic conversion changes. Check in the changes to a new repository until the full conversion is done. This way, the current regression cycles can continue without interruption.

1. Auto conversion

There is an automated tool called 2to3, provided by the Python 3 team, that takes care of a few common patterns such as print, exceptions, list wrapping, and relative imports.
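To make those patterns concrete, here is roughly what such code looks like once converted, with the Python 2 form kept in a comment above each line. It is a hand-written illustration, not actual 2to3 output from testrunner; parse_port and settings are hypothetical names.

```python
def parse_port(value):
    """Return value as an int port, or None when it is not numeric."""
    try:
        return int(value)
    # Python 2: except ValueError, e:
    except ValueError as e:
        # Python 2: print "bad port:", e
        print("bad port:", e)
        return None


settings = {"host": "127.0.0.1", "port": "8091"}

# Python 2: keys = settings.keys()   (already a list)
# 2to3 wraps the view in list() so indexing and mutation keep working.
keys = list(settings.keys())

# Python 2: for name, value in settings.iteritems():
for name, value in settings.items():
    print(name, value)

# Relative imports change from "import helper" to "from . import helper"
# inside a package; that form is not shown because it needs a package to run.
port = parse_port(settings["port"])
```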

You can start with a single directory in the locally cloned workspace to double-check the results. Later, the conversion can be run on the entire code base so that the basic porting is taken care of.

Below are some sample 2to3 conversion commands on macOS. In the last command, note that all idioms were applied. This way, the first conversion pass takes care of the key changes.

 

2. Manual changes

The auto conversion doesn't complete the port. Beyond the common syntax changes handled by the 2to3 tool, you might run into the common problems below during the porting process.

Run the test class, check for errors, and fix them appropriately: switch from bytes to str or from str to bytes, or, for a sort/comparison issue, fix the key name in the sorted function. This is an iterative process until all of the code has been validated at runtime.

Once a common pattern is clear, you can use grep and sed to replace it across many class files. If you are not sure about other code until runtime, defer the change until that test class is executed.

Third-party libraries/modules might have changed; search the web for those issues and use the libraries appropriately.

Make sure every code path is covered by running across all supported platforms and parameters.

3. Runtime Validation and Fix

Once the conversion is done, run the code extensively, because Python is a dynamic language. Changes made through visual static code inspection alone can break things. You can start with basic sanity tests and acceptance tests, and then select the full tests from a single module.

Once comfortable, go through all the other modules one by one. Keep checking the changes into the new repository. In addition, make sure the ported changes in this new repository cause no regressions by running sanity tests on the newer builds. The validation should also include all the supported platforms with Python 3.

 

Python 3 Ported Code and Status

Below is the new repository for the Python 3 ported code until it is merged into the main repository. The plan is to do one cycle of porting, or to take the changes from the main repo in between and merge them manually into this one.

https://github.com/couchbaselabs/testrunner-py3/

(Branch: master)

Many common changes are already done, but the port is not complete because other runtime issues might remain. Fixes in common code can also regress earlier fixes because of assumptions about input value type conversions. Some ported code still has to be validated with Python 3, and the effort is still in progress.

Now, let me show you the common issues that came up during runtime validation. You can use this as a reference when you hit an issue to see whether yours is similar, apply the same solution, and see if it works for you. If you have new ideas, put them in the comments.

Common Runtime Problems

 

1. Problem(s):

  • You might get some of the TypeErrors below during runtime, such as str instead of bytes or bytes instead of str
  • Error#1. TypeError: can't concat str to bytes
  • Error#2. TypeError: must be str, not bytes
  • Error#3. TypeError: a bytes-like object is required, not 'str'
  • Error#4. TypeError: Cannot mix str and non-str arguments

Solution(s):

Check the types of the variables in the statement and use xxx.encode() to get bytes, xxx.decode() to get a string, the b prefix, or str(). Sometimes the input type might not be known; in that case, use try: x.encode() except AttributeError: pass
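The try/except idea can be wrapped in two small helpers so that call sites stay readable. This is a minimal sketch assuming UTF-8 data; the helper names to_bytes and to_str are hypothetical and are not part of testrunner.

```python
def to_bytes(value, encoding="utf-8"):
    """Return value as bytes; pass it through if it has no encode()."""
    try:
        return value.encode(encoding)
    except AttributeError:
        # bytes objects have no encode() in Python 3, so they are returned as-is.
        return value


def to_str(value, encoding="utf-8"):
    """Return value as str; pass it through if it has no decode()."""
    try:
        return value.decode(encoding)
    except AttributeError:
        # str objects have no decode() in Python 3, so they are returned as-is.
        return value


# Mixed inputs no longer raise "can't concat str to bytes".
payload = to_bytes("key-") + to_bytes(b"001")
message = "status: " + to_str(b"ok")
```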


2. Problem(s):

TypeError: root – ERROR – ——->installation failed: a bytes-like object is required, not ‘str’

Solution(s): 

In this case, add b as a prefix to the string under comparison, or change the bytes type to the string type. Example: lib/remote/remote_util.py

Surround the code with try-except to find the exact line causing the error (say, the TypeError above).

Sample output after traceback.print_exc(), which shows the full stack trace, similar to Java.

Fix with changes to lib/remote/remote_util.py as below.
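As a generic illustration of the technique (this is not the actual remote_util.py change), the sketch below reproduces the error with a str pattern against bytes output, prints the stack trace, and shows the b-prefixed comparison. The function names and the sample output are hypothetical.

```python
import traceback


def contains_error(output):
    """Return True if the command output (bytes) mentions an error."""
    # The pattern carries a b prefix so that both sides are bytes.
    return b"error" in output.lower()


def broken_check(output):
    # Deliberately wrong: a str pattern against bytes output raises TypeError.
    return "error" in output


try:
    broken_check(b"installation failed: error 1")
except TypeError:
    # Prints the full stack trace, including the failing line, to stderr.
    traceback.print_exc()
```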

3. Problem(s):

 

Solution(s):

 

4. Problem(s):

AttributeError: suite_setUp() or suite_tearDown() is missing for some test suites.

Solution(s):

Add the dummy suite_setUp() and suite_tearDown() methods. 
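A sketch of what such dummy methods can look like. Putting them in a mixin is a design choice made for this example so that several test classes can share them; the class names are hypothetical, and the methods can equally be added straight to each test class.

```python
import unittest


class SuiteHooksMixin(object):
    """No-op suite hooks for test classes that need no suite-level work."""

    def suite_setUp(self):
        pass

    def suite_tearDown(self):
        pass


class ExampleTests(SuiteHooksMixin, unittest.TestCase):
    def test_nothing(self):
        self.assertTrue(True)
```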

 

5. Problem(s):

 

Solution(s):

 

6. Problem(s):

AttributeError: ‘Transport’ object has no attribute ‘_Thread__stop’

Solution(s):

There is no direct way to stop a non-daemonic thread. Syntax-wise, the call becomes t._stop(), but the recommendation is a graceful shutdown: set a global flag and check it in the thread's run() to break out.

(https://stackoverflow.com/questions/27102881/python-threading-self-stop-event-object-is-not-callable)
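A minimal sketch of the graceful-shutdown approach. It uses a threading.Event in place of a plain global flag, which is a substitution made for this example; the Worker class is hypothetical. The attribute is deliberately not named _stop, because Thread already uses that name internally in Python 3.

```python
import threading
import time


class Worker(threading.Thread):
    def __init__(self):
        super().__init__()
        # Not called _stop: Thread uses that name internally in Python 3.
        self._stop_requested = threading.Event()
        self.iterations = 0

    def run(self):
        # Check the flag on every pass and leave the loop once it is set.
        while not self._stop_requested.is_set():
            self.iterations += 1
            self._stop_requested.wait(0.01)

    def stop(self):
        self._stop_requested.set()


worker = Worker()
worker.start()
time.sleep(0.05)
worker.stop()
worker.join(timeout=2)
```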

7. Problem(s):

Test expirytests.ExpiryTests.test_expired_keys was not found: module ‘string’ has no attribute ‘translate’

Solution(s):

Rewrite with the str static methods. The old way of getting all characters no longer exists, so we used the total set that the earlier code used.

vi lib/membase/api/tap.py 
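A generic sketch of the str-based rewrite, using str.maketrans() and str.translate() in place of the removed string.translate(). It is not the tap.py code; the function strip_nonprintable and its input are made up for illustration.

```python
import string


def strip_nonprintable(text):
    """Drop characters that are not in string.printable."""
    # Python 2 style: string.translate(text, table, deletechars)
    # Python 3: build the table with str.maketrans, then call text.translate.
    unwanted = "".join(ch for ch in set(text) if ch not in string.printable)
    table = str.maketrans("", "", unwanted)
    return text.translate(table)


cleaned = strip_nonprintable("key\x00-\x01one")
```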

 

8. Problem(s):

TabError: inconsistent use of tabs and spaces in indentation

 

Solution(s):

Search for tab characters and replace them with space characters.

For the above issue, remove the tab characters.
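The search can also be scripted. The helpers below are a sketch with hypothetical names; the width of 4 spaces is an assumption, and expandtabs() also expands tabs inside string literals, so review the result before checking it in.

```python
def find_tab_lines(source):
    """Return 1-based line numbers whose indentation contains a tab."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        indent = line[:len(line) - len(line.lstrip())]
        if "\t" in indent:
            hits.append(number)
    return hits


def tabs_to_spaces(source, width=4):
    """Replace every tab with spaces up to the next multiple of width."""
    return source.expandtabs(width)


sample = "def f():\n\treturn 1\n    # ok\n"
fixed = tabs_to_spaces(sample)
```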

 

9. Problem(s):

Solution(s):

Case sensitivity issue. Fixed by changing the key from x_couchbase_meta to X_Couchbase_Meta.

 

10. Problem(s):

  • Error#1. TypeError: '<' not supported between instances of 'dict' and 'dict'
  • Error#2. TypeError: 'cmp' is an invalid keyword argument for this function

Solution(s):
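One common way to resolve both errors, offered as a sketch rather than the exact testrunner fix: give sorted() a key for the field to sort on, and wrap any old comparison function with functools.cmp_to_key. The rows data and the compare_items function are hypothetical.

```python
from functools import cmp_to_key

rows = [{"name": "travel", "items": 3}, {"name": "beer", "items": 7}]

# Python 2: sorted(rows) compared the dicts themselves.
# Python 3: name the field to sort on.
by_name = sorted(rows, key=lambda row: row["name"])


def compare_items(left, right):
    """Old-style comparison: returns a negative, zero, or positive int."""
    return (left["items"] > right["items"]) - (left["items"] < right["items"])


# Python 2: sorted(rows, cmp=compare_items)
by_items = sorted(rows, key=cmp_to_key(compare_items))
```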

   

11. Problem(s):

Solution(s):

 

12. Problem(s):

Solution(s):

 

13. Problem(s):

Solution(s):

Here, it should return int, as Python 3 doesn't compare automatically the way Python 2 did.

 

14. Problem(s):

Solution(s):

 

15. Problem(s):

Solution(s):

Converted the key to a string so that ch is a string instead of the int it would be with a binary key. See the file.
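The underlying behavior is easy to see in isolation: iterating over bytes yields ints in Python 3, while iterating over a str yields one-character strings. A minimal sketch with a made-up key:

```python
binary_key = b"key1"

# Iterating over bytes yields ints in Python 3.
as_ints = [ch for ch in binary_key]

# Decoding first yields one-character strings, as the Python 2 code expected.
if isinstance(binary_key, bytes):
    text_key = binary_key.decode("utf-8")
else:
    text_key = binary_key
as_chars = [ch for ch in text_key]
```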

 

16. Problem(s):

TypeError: ‘FileNotFoundError’ object is not subscriptable

Solution(s):

This changed in Python 3: FileNotFoundError is not subscriptable, so use the errno attribute instead, e.errno.
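A short sketch of the attribute-based form. The function name and the path are hypothetical; the path is simply assumed not to exist.

```python
import errno


def missing_file_errno(path):
    """Return the errno raised when opening a missing path, else None."""
    try:
        with open(path) as handle:
            handle.read()
    except FileNotFoundError as e:
        # Python 2 style e[0] fails here; use the errno attribute.
        return e.errno
    return None


code = missing_file_errno("/no/such/dir/no-such-file.txt")
```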

 

17. Problem(s):

Solution(s):

The nested dictionary/list comparison was not working because the earlier sorted function, which sorted completely, is no longer available. Use the deepdiff module and its DeepDiff class to do the comparison.

 

18. Problem(s):

AttributeError: module ‘string’ has no attribute ‘replace’

Solution(s):

Call replace directly on the str variable, like below, to fix the issue.
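A minimal sketch of that change, using made-up data rather than the testrunner code:

```python
data = ["node-1:8091", "node-2:8091"]

# Python 2: string.replace(data[i], ":8091", "")
# Python 3: call replace() on the str itself.
hosts = [entry.replace(":8091", "") for entry in data]
```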

 

19. Problem(s):

Solution(s):

Use str or int function appropriately.

 

20. Problem(s):

NameError: name ‘cmp’ is not defined

Solution(s):

Use deepdiff module and DeepDiff class to do object comparison.

 

21. Problem(s):

Solution(s):

Convert str to int as below for the above type error issue.

—-

That's all for now on this short list of problems; more might follow in the next blogs.

 

Further readings

The following references helped us. You can also read further at the reference links below to get more details and improve your code porting to Python 3.

  1. https://www.python.org/dev/peps/pep-0373/
  2. https://wiki.python.org/moin/Python2orPython3
  3. https://www.toptal.com/python/python-3-is-it-worth-the-switch
  4. https://weknowinc.com/blog/running-multiple-python-versions-mac-osx
  5. https://docs.python.org/3/howto/pyporting.html
  6. https://wsvincent.com/install-python3-mac/
  7. http://python3porting.com/pdfs/SupportingPython3-screen-1.0-latest.pdf
  8. https://riptutorial.com/Download/python-language.pdf
  9. https://docs.couchbase.com/python-sdk/2.5/start-using-sdk.html
  10. https://docs.couchbase.com/c-sdk/2.10/start-using-sdk.html
  11. https://pypi.org/project/deepdiff/
  12. https://buildmedia.readthedocs.org/media/pdf/portingguide/latest/portingguide.pdf
  13. http://ptgmedia.pearsoncmg.com/imprint_downloads/informit/promotions/python/python2python3.pdf

Hope you had a good reading time!

Disclaimer: Please view this as a quick reference for your Python 3 upgrade rather than a complete resolution of porting issues. Our intent here is to help you at some level and give you a jump start on the porting process. Please feel free to share anything new you have learned that can help us. Your positive feedback is appreciated!

 

Thanks to Raju Suravarjjala and Keshav Murthy for their key inputs and feedback.

 

Posted by Jagadesh Munta, Principal Software Engineer, Couchbase

Jagadesh Munta is a Principal Software Engineer at Couchbase Inc., USA. Prior to this, he was a veteran at Sun Microsystems and Oracle for 19 years combined. Jagadesh holds a Master's in Software Engineering from San Jose State University, USA and a B.Tech. in Computer Science and Engineering from JNTU, India. He is the author of “Software Quality and Java Automation Engineer Survival Guide”, written to help software developers and quality automation engineers.
