Python project integrity requires extra efforts
Posted on Fri, 06 Oct 2017 in Python
Python is a very flexible programming language. Its dynamic nature allows programmers to code elegant solutions that almost impossible to make with other more strict languages, for example, Java. However, you have to pay for everything. While Python code base grows, it requires more and more efforts to keep project integrity. Without these extra efforts, the project will fall apart.
Problem #1: a code navigation
The main problem is a code navigation. In this case, I use "code navigation" as a term that includes auto-completion, go-to-declaration and find-usage IDE functions, possibility to use grep through a code base easily. More or less it correlates with the easiness of finding a particular piece of code from any other piece of code in a project.
On early project stages, developers often sacrifice it for a more compact solution. It almost always causes several problems later. One of them is a high cost of debugging.
Traceback in Python isn't the most useful tool. Add some decorators, couple of lambdas, and anything from itertools, and you end up with a bunch of calls that says nothing about how a code really works. In this case, you have to use print statements, grep, and a debugger. All this means that you will lose hours trying to find and localize a problem.
Well, debugging is difficult. It is not a surprising thing. Yes, it is hard. Yes, it takes a lot of time. However, in the end, we'll find what we want. The worst part when you are stuck while writing a code. Just remember the last time you almost recalled how a part of a system worked but you had to find the last tile to complete the puzzle...
And then the chain of calls that took you hours to found it broke in the middle, IDE gave up and showed you "No declaration found" or suggested too many options.
Of course, this happens when you are working on the most complex and tricky part of a project. Of course, 3 years ago it was the most elegant and clean part of the project. But during this time it put on weight, became much more complex, lost flexibility and elegance. And it doesn't matter who decided to implement it this way. You have to deal with it. And it is extremely difficult to do.
What to do?
Is there any way how avoid this kind of problems? I have no good answer to it. Perhaps, it wise to follow The Zen of Python: explicit is better than implicit, even if it means to write more code. Do not afraid of adding some Java-style code if it helps to follow The Zen of Python. But basically I think we should follow this two rules:
Rule #1: You have to have really good reason to break a code navigation
There are a million ways how to break a code navigation in Python. But you have to have a strong bullet-proof reason to do it. A desire to write a beauty and elegant piece of code isn't enough. Only if there is a high risk to end up with extremely complex code, it is OK to implement some tricky solution.
You should always remember that a code is read more often than written. Keep its readability and integrity is the main priority.
Rule #2: All places that broke code navigation should be logged carefully
All tricky solutions should be logged carefully. Make sure that you have all significant details about everything that could help you in future. Log everything that could help you to reconstruct an event sequence that causes any given error in production.
Type hinting could help, but not much
You can use type hinting as an additional defensive line. It helps IDE - at least PyCharm - to suggest adequate things. Static analysis allows you to find some tricky problems in your code that you hardly ever find using other methods.
However, type hinting is very hard to use especially if you are working on a big project. It requires obligatory checks for all pull requests. So with regular tests, your CI-server should check types. You can't rely on manual checks. After a week nobody will do them.
Tools
Unfortunately, I can't say anything about tools. I doubt there is a tool or metric for a project integrity. mypy is a must-have tool for type checking. But other aspects can be controlled using common sense only.
Tests could help. Not small unit tests. But bigger ones that check interaction of several subsystems. Writing this kind of test often allows me to find places where an integrity and a code navigation fail.
Code review gives good results. To understand code written by someone else you have to jump from one part of it to another. In this condition, all tricky moments show themselves brightly. For me, the main signal about a potential problem is a strong desire to open IDE for code review.
In conclusion, I want to say do not affray of straightforward solutions that easy to read and that doesn't break IDE. They save your and your teammates' nerves and time.
Got a question? Hit me on Twitter: avkorablev