Refactoring in Python is not that hard

In this article, I will walk you through:

Signs that you need to refactor
What is considered refactoring?
What to do before refactoring
A step-by-step guide to restructuring a Python package

The examples are written specifically with Python in mind. However, the general principles should hold true across all languages.

Signs that you need to refactor

The versions of the language, frameworks, or libraries you depend on are no longer supported. Even the best-written code becomes obsolete after some time.
You want the new features in newer versions of frameworks, libraries, or even the programming language itself.
Your code was written quick and dirty and carries a lot of technical debt.
Your code is hard to navigate, with countless layers of indirection.
Your code does not comply with standards, style guides, or conventions.

What is considered refactoring?

The term ‘refactoring’ covers a broad array of actions and definitions. They all lead to a common goal, a cleaner and better code base, but many different actions can be considered ‘refactoring’:

Upgrade your dependencies to newer versions
Rename your functions, classes, modules, etc.
Reorganize your code base, for example by moving a function from one file to another
Improve implementation of a function to increase performance
Reformat your code to make it standards compliant

Before Refactoring

Although the details differ from one programming language to another, there are some general steps and principles to follow when refactoring.

Step 1: Get A Glimpse of Your Code Quality

Run static analysis tools. They give you a summary report with quantitative statistics that you can compare from one run to the next as the refactoring progresses.

In Python, I personally use Prospector, a static analysis suite that wraps other tools such as pep8 and Pylint. What I like most about Prospector is its ability to detect and adapt to the frameworks I use (Django, Celery).

Prospector found 3000+ messages. That number should drop sharply after refactoring.

Do note that not all of the messages are valid; some of them will be false positives.
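
To compare runs over time, it helps to tally the messages per tool. The sketch below assumes the report was saved with `prospector --output-format json` and that it contains a top-level "messages" list whose entries carry a "source" field; check that shape against your installed Prospector version. The sample data is made up for illustration.

```python
import json
from collections import Counter


def count_messages_by_source(report):
    """Count messages per tool in a parsed Prospector JSON report.

    Assumes the report is a dict with a "messages" list and that each
    message has a "source" key naming the tool that raised it.
    """
    return Counter(
        message.get("source", "unknown")
        for message in report.get("messages", [])
    )


# Made-up stand-in for json.load() on a real report file.
sample_report = json.loads("""
{"messages": [
  {"source": "pylint", "code": "unused-import", "message": "Unused import os"},
  {"source": "pylint", "code": "line-too-long", "message": "Line too long"},
  {"source": "pep8", "code": "E501", "message": "line too long"}
]}
""")

counts = count_messages_by_source(sample_report)
for source, total in counts.most_common():
    print(f"{source}: {total}")
```

Saving these counts before and after each refactoring pass gives you a simple number to track.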

Step 2: Prepare and Verify Test Cases

Code without tests is bad code by design. Do you have test cases in place? If so, how complete are they? Are those test cases up to date?

Why do we need test cases before refactoring? You might argue that you won’t need any, especially if you are only making a simple change such as adding validation. Well, trust me, you are going to get caught out by unexpected behavior from the masterpiece you just touched up.

I’m not going to try to convince you of the importance of test code in general. The point of having tests before refactoring is to ensure that the behavior of your system stays the same after refactoring.

Even if there are test cases in place and they give you a green light, you should still verify the test code. Let me tell you why.

Imagine you run whatever tests are available before you attempt to refactor, and you get the following results.

All tests passed. Or rather, they only look like they passed.

However, when you dig further, you find this test case.

https://gist.github.com/melvinkcx/9045f6317597cac41d0b8478f9da180f

So, you see the point of verifying it?
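
As a hypothetical illustration (this is not the code from the gist), here are two common ways a test can stay green without checking anything, next to a version that pins the behavior down. The function parse_port and all test names are invented for this sketch.

```python
import unittest


def parse_port(value):
    """Hypothetical function under test: turn a string into a TCP port."""
    port = int(value)
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


class LooksGreenTest(unittest.TestCase):
    def test_parse_port(self):
        # Any truthy return value satisfies assertTrue, so a wrong port
        # number (or even a string) would still pass.
        self.assertTrue(parse_port("8080"))

    def test_invalid_port(self):
        # The exception is swallowed and nothing is asserted, so this
        # passes whether or not parse_port validates its input.
        try:
            parse_port("70000")
        except Exception:
            pass


class VerifiedTest(unittest.TestCase):
    def test_parse_port(self):
        # Pins down the exact return value.
        self.assertEqual(parse_port("8080"), 8080)

    def test_invalid_port(self):
        # Fails if the range check is ever removed.
        with self.assertRaises(ValueError):
            parse_port("70000")
```

Both classes pass under `python -m unittest`, but only VerifiedTest would fail if a refactoring broke parse_port.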

Besides, test cases are often the best documentation for a piece of software. When you are navigating through the code, they are also your best GPS.

Start Refactoring — Reorganizing / Restructuring

In this section, I will walk you through refactoring an example config.py: reorganizing its structure, merging duplicated methods, decomposing classes, and writing test code to ensure backward compatibility.

config.py looks like this:

https://gist.github.com/melvinkcx/8bcf0c390b6353c5022b356d8a3cd508

Step 1: Write Backward Compatible Code

This step is crucial. Before refactoring our code, test cases MUST be in place. In this case, we write backward-compatible code to ensure that all existing references to the classes, functions, and constants keep working.

In __init__.py, we redefine the class/method signatures:

https://gist.github.com/melvinkcx/9bf28f8752e28d3566a47a15e5218bc9

The __init__.py is incomplete for now. We will revisit the file later. Next, we write a test case to make sure we can still import from the package as if we were importing from the old module.

https://gist.github.com/melvinkcx/70a12ed35694695bd335504617182d89

This is a simple test case. You may notice that it does not catch every backward compatibility issue.
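
A minimal sketch of such an import check follows. It is not the gist's code: the helper missing_names is invented here, and the standard-library json module stands in for the real package so the sketch runs anywhere. In the article's project, the module path would be utils.config and the names would be whatever the old config.py exposed.

```python
import importlib
import unittest


def missing_names(module_path, expected_names):
    """Return the expected names that the imported module does not expose."""
    module = importlib.import_module(module_path)
    return [name for name in expected_names if not hasattr(module, name)]


class BackwardCompatibilityTest(unittest.TestCase):
    # Stand-in values; replace with "utils.config" and the old public names.
    MODULE_PATH = "json"
    OLD_PUBLIC_NAMES = ["dumps", "loads"]

    def test_old_names_still_importable(self):
        self.assertEqual(
            missing_names(self.MODULE_PATH, self.OLD_PUBLIC_NAMES), []
        )
```

A check like this only proves the names still exist. It says nothing about changed signatures or return values, which is one kind of issue a simple import test misses.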

Step 2: Reorganizing Package Structure

This section gives you an idea of how you can reorganize your Python package. Let’s revisit the config.py we have:

https://gist.github.com/melvinkcx/8bcf0c390b6353c5022b356d8a3cd508

Can you spot what’s wrong here? It is messy: constants, helpers, and duplicated code all sit in a single file. As the code in config.py grows, it will become increasingly difficult to navigate. With this messy structure, you are creating a breeding ground for circular dependencies and hidden coupling, and refining the recipe for the tastiest spaghetti code.

How can you reorganize config.py? To me, separation of concerns is the first thing that comes to mind. The following structure is often considered good practice for a Python package (a similar layout is common in Django projects as well).

config/
├── abstracts.py    # All the abstract classes should live here
├── constants.py    # All the constants should live here
├── exceptions.py   # All custom exceptions should live here
├── helpers.py      # All helpers should live here
├── __init__.py     # All backward compatible code in here
├── mixins.py       # All mixins go here
├── serializers.py  # All common serializers go here
└── tests.py        # All `config` related tests should live here

Let’s revisit our config.py before refactoring and identify where each piece of code should reside.

https://gist.github.com/melvinkcx/c7a84f573382edaeb21abfd94d437948

After refactoring, config.py should become a Python package named config with an __init__.py in it.

utils/
├── config.py          # To be removed
└── config/
    ├── constants.py
    ├── helpers.py
    ├── __init__.py
    └── tests.py

In utils.config.constants:

https://gist.github.com/melvinkcx/39cde812a7aa2b29f74d5262e1dc4587

In utils.config.helpers:

https://gist.github.com/melvinkcx/2c637a5f80e59cb5102fb1c559ec791d

Step 3: Eliminate and Merge Duplicates

In utils.config.helpers, there are two similar functions: get_logging_level() and ConfigHelper()._get_logging_level(). Assuming both implementations are identical, we have to find the best place to host the function.

In this case, I remove the standalone get_logging_level() and keep the one in ConfigHelper.

https://gist.github.com/melvinkcx/d86c5e1aa0b3cb22bf51ccb4dfb46a25

Step 4: Decompose ConfigHelper

Hierarchy of Decomposed ConfigHelper

We host our AbstractBaseConfigHelper in abstracts.py:

https://gist.github.com/melvinkcx/fd8d58ed0aa3891ee8a171bf2a89d881

In mixins.py:

https://gist.github.com/melvinkcx/32ed9d493e8641d776cdacaa354d3e24

In helpers.py:

https://gist.github.com/melvinkcx/6a1dddf4484cbe752fbc75cf6e4c97dc

ConfigHelper is now decomposed into multiple classes and mixins.
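
The overall shape of such a decomposition can be sketched in a single file. Only the names AbstractBaseConfigHelper, ConfigHelper, and _get_logging_level come from the article; the mixin name, the get() method, the settings dict, and the LOG_LEVEL key are assumptions made for illustration and may differ from the gists.

```python
import logging
from abc import ABC, abstractmethod


class AbstractBaseConfigHelper(ABC):
    """abstracts.py: the contract every config helper must satisfy."""

    def __init__(self, settings):
        self.settings = dict(settings)

    @abstractmethod
    def get(self, key, default=None):
        """Return the value stored under key, or default if it is missing."""


class LoggingLevelMixin:
    """mixins.py: one narrow behavior that relies on the host class's get()."""

    _LEVELS = {
        "DEBUG": logging.DEBUG,
        "INFO": logging.INFO,
        "WARNING": logging.WARNING,
        "ERROR": logging.ERROR,
        "CRITICAL": logging.CRITICAL,
    }

    def _get_logging_level(self):
        # Unknown or missing level names fall back to INFO.
        name = str(self.get("LOG_LEVEL", "INFO")).upper()
        return self._LEVELS.get(name, logging.INFO)


class ConfigHelper(LoggingLevelMixin, AbstractBaseConfigHelper):
    """helpers.py: the concrete class that existing callers keep using."""

    def get(self, key, default=None):
        return self.settings.get(key, default)


helper = ConfigHelper({"LOG_LEVEL": "debug"})
print(helper._get_logging_level() == logging.DEBUG)
```

Listing the mixin before the abstract base keeps the mixin's methods ahead of the base class in the method resolution order, while __init__ is still inherited from AbstractBaseConfigHelper.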

Step 5: Complete Our Backward Compatibility Code

In Step 1, we added some code in __init__.py. However, it is largely incomplete. Let’s revisit the file:

https://gist.github.com/melvinkcx/336239b9bb7620cb06ce0d4cbbbc1cb7

Notice that the bridge between the code above and our newly organized config package is still missing. To establish the bridge, we edit our __init__.py into:

https://gist.github.com/melvinkcx/3dda9ed799c217aed7aa6166c8a1c5e0
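
To show the idea of the bridge end to end, the sketch below writes a tiny package to a temporary directory and imports it. The file contents are embedded as strings only so that the example runs as a single script. The package name demo_config, the constant DEFAULT_LOG_LEVEL, and the method bodies are all hypothetical and are not taken from the gists.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical file contents for the reorganized package.
FILES = {
    "constants.py": "DEFAULT_LOG_LEVEL = 'INFO'\n",
    "helpers.py": (
        "from .constants import DEFAULT_LOG_LEVEL\n"
        "\n"
        "class ConfigHelper:\n"
        "    def _get_logging_level(self):\n"
        "        return DEFAULT_LOG_LEVEL\n"
    ),
    # The bridge: re-export the old public names from their new homes.
    "__init__.py": (
        "from .constants import DEFAULT_LOG_LEVEL\n"
        "from .helpers import ConfigHelper\n"
        "\n"
        "def get_logging_level():\n"
        "    # Old module-level function kept as a thin wrapper.\n"
        "    return ConfigHelper()._get_logging_level()\n"
        "\n"
        "__all__ = ['DEFAULT_LOG_LEVEL', 'ConfigHelper', 'get_logging_level']\n"
    ),
}


def load_demo_package(name="demo_config"):
    """Write the demo package to a temporary directory and import it."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = Path(root) / name
        package_dir.mkdir()
        for filename, source in FILES.items():
            (package_dir / filename).write_text(source)
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            return importlib.import_module(name)
        finally:
            sys.path.remove(root)


package = load_demo_package()
# Old-style access still works even though the code moved to submodules.
print(package.get_logging_level(), package.DEFAULT_LOG_LEVEL)
```

In a real project, only the contents of __init__.py matter: it imports each name from its new module so that existing imports from the package root keep resolving.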

Step 6: Notify The Developer

By the end of Step 5, our config is properly refactored. However, we need to keep the other developers notified about the change. Is there any straightforward way? Yes. We can emit a warning message whenever a developer uses an obsolete function/class/method. For example, we annotate the old functions/classes/methods with decorators:

https://gist.github.com/melvinkcx/562d3beaf86546ba4124a54d7b5771e4

In our __init__.py, we add the decorator like this:

https://gist.github.com/melvinkcx/704d4e9316885e88459508a80e68b248
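
One common way to build such a decorator uses the standard warnings module. This is a minimal sketch and not necessarily what the gists contain: the decorator name deprecated, its replacement argument, and the body of get_logging_level are assumptions. Note that it warns when the old function is called, not when it is imported.

```python
import functools
import warnings


def deprecated(replacement):
    """Mark a callable as obsolete and point callers at its replacement."""

    def decorator(func):
        @functools.wraps(func)  # keep the original name and docstring
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__}() is deprecated; use {replacement} instead.",
                category=DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller's line
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


@deprecated("ConfigHelper")
def get_logging_level():
    # Hypothetical legacy function kept only for old callers.
    return "INFO"
```

Python hides DeprecationWarning in many contexts by default, so running with `python -W default` is a simple way to make these messages visible during development.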

After Restructuring

After restructuring our Python package, we run our test cases and make sure they all pass.

Conclusion

By now, you should be able to assess the quality of your code base, understand what refactoring means, recognize when it is needed, and know how to restructure or reorganize a Python package.

First published on 2018-11-20

Republished on Hackernoon

Melvin Koh