THE MANIFESTO OF THE SCARRED: WHY YOUR CODE IS A CRIME AGAINST ENGINEERING
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/service/processor.py", line 442, in process_payload
    result = core_logic.calculate_weighted_offset(payload['delta'], config.STRATEGY_ALPHA)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/core/logic.py", line 89, in calculate_weighted_offset
    return (delta * self.multiplier) / (config_val - 1)
            ~~~~~~^~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for *: 'NoneType' and 'float'
>>> # 3:14 AM. Production is down. The "Senior" dev who wrote this is asleep.
>>> # The docstring for 'calculate_weighted_offset' reads: "Calculates the offset."
>>> # I want to retire.
The Myth of the “Self-Documenting” Variable Name
I’ve spent thirty years watching languages evolve and developers devolve. I started on Python 1.5.2, back before list comprehensions even existed (those arrived in 2.0), when we were just happy to have an interpreter that didn’t segfault. Back then, we didn’t have the luxury of 64GB of RAM to run a “vibrant” (wait, I’m not allowed to say that word, and I wouldn’t anyway because it’s garbage) IDE that suggests code completions. We had man pages. We had grep. We had the source code, and if the source code didn’t tell you why a function expected a specific bitmask, you were dead in the water.
Today, I see these kids coming out of bootcamps claiming that “good code documents itself.” That is a lie. It is a lie born of laziness and nurtured by the arrogance of people who have never had to debug a race condition in a distributed system at three in the morning. “Self-documenting code” only works if the reader has the exact same mental model as the author. Newsflash: they don’t. Especially not after three years of technical debt has piled up on top of that “clean” architecture.
When you refuse to write python documentation, you are essentially saying, “I am so confident in my naming conventions that I expect you to intuit the edge cases of this float division.” Look at the traceback above. payload['delta'] was None. Why? Because the upstream service changed its schema. If there had been a shred of python documentation explaining that delta must be a non-null integer representing milliseconds (and that config_val must never equal 1, since the result is divided by config_val - 1), the validation layer might have caught it. Instead, we get a TypeError in the core logic because someone thought calculate_weighted_offset was “obvious.”
Documentation isn’t about explaining what the code does; the code already does that. Documentation is about explaining the intent, the constraints, and the failures. If I see one more function named handle_data(data) with no docstring, I’m going to uninstall your interpreter.
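To make that concrete, here is a minimal sketch of a docstring that covers intent, constraints, and failures for the function in the traceback. The standalone signature, the units, and the validation rules are all assumptions made up for illustration; the real method's contract is unknown, which is rather the point.

```python
def calculate_weighted_offset(delta, multiplier, config_val):
    """Scale a timing delta by a strategy weight.

    Hypothetical contract, invented for illustration only.

    Args:
        delta: Elapsed time in milliseconds. Must not be None; payloads
            that omit the field must be rejected before this call.
        multiplier: Weight applied to delta.
        config_val: Strategy constant. Must not equal 1, because the
            result is divided by (config_val - 1).

    Returns:
        The weighted offset as a float.

    Raises:
        ValueError: If delta is None or config_val equals 1.
    """
    if delta is None:
        raise ValueError("delta is required (milliseconds), got None")
    if config_val == 1:
        raise ValueError("config_val must not be 1 (division by zero)")
    return (delta * multiplier) / (config_val - 1)


print(calculate_weighted_offset(10, 2.0, 3))  # 10.0
```

The failure now surfaces at the boundary with a message that names the field, instead of as a TypeError three frames deep.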
The Sphinx-Induced Migraine and the Death of Clarity
We’ve moved from simple text files to complex build pipelines just to generate a few HTML pages. I remember when pydoc was the gold standard. You ran it, it spat out some text, and you moved on with your life. Now, we have Sphinx 7.2.6, and with it comes a configuration file (conf.py) that is often more complex than the actual application code.
The problem isn’t the tool; it’s the way it’s abused. Developers spend hours tweaking the CSS of their ReadTheDocs theme while the actual python documentation inside the modules is out of date. They use autodoc to pull in docstrings, but since the docstrings are empty or contain “TODO: Add description,” the resulting “pretty” website is just a graveyard of function signatures.
Let’s look at what pydoc gives us for a basic module. This is what I want to see in my terminal when I’m desperate:
$ python3.11 -m pydoc math.sqrt
Help on built-in function sqrt in module math:

sqrt(x, /)
    Return the square root of x.
(END)
Simple. Direct. It tells me the input and the output. But in modern “enterprise” Python, we’ve buried this simplicity under layers of reStructuredText (reST) directives that no one remembers how to type. If you can’t write your python documentation in a way that survives a cat command in a terminal, you’ve failed. Sphinx 7.2.6 is a powerful engine, but it has become a crutch for people who think that “looking professional” is the same as “being useful.”
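If you want to check the "survives cat" property from a script rather than by eyeball, the standard library can render the same help text as a plain string. A rough sketch, assuming you only care about the text pydoc itself would produce:

```python
import math
import pydoc

# pydoc.plaintext renders without the backspace-overstrike bold that the
# pager version uses, so the result is safe to cat, grep, or log.
text = pydoc.render_doc(math.sqrt, renderer=pydoc.plaintext)
print(text)
```

If what comes out of that is unreadable, no theme is going to save it.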
reStructuredText: The Syntax That Time Forgot (and Why You Must Use It)
I hate reStructuredText. I’ve hated it since it was proposed in PEP 287. The indentation rules are a nightmare, and the difference between a reference and a literal is enough to make a seasoned sysadmin weep. But here is the hard truth: it is the backbone of the Python ecosystem. If you try to use plain Markdown for your python documentation, you eventually hit a wall where you can’t cross-reference a class or link to a specific PEP.
Then came MyST-Parser. People thought they could escape the rigors of reST by using Markdown. All they did was create a fragmented mess where half the docs are in .rst and half are in .md, and the build fails because someone used a single backtick instead of a double backtick.
The internal mechanics of how Python handles these strings are actually quite elegant, if you bother to look. When you define a docstring, it’s stored in the __doc__ attribute of the object. The inspect module, specifically inspect.getdoc(), is what most tools use to extract this.
import inspect

def legacy_function(x: int) -> int:
    """
    Perform a bitwise operation.

    :param x: The integer to shift.
    :return: The shifted integer.
    """
    return x << 1

print(f"Raw __doc__: {repr(legacy_function.__doc__)}")
print(f"Cleaned doc: {repr(inspect.getdoc(legacy_function))}")
The inspect.getdoc() function uses inspect.cleandoc() under the hood. It handles the indentation stripping so that your python documentation doesn’t look like a jagged mess when it’s printed. cleandoc() has been in the standard library since Python 2.6 and most people still ignore it, but it’s the reason your docstrings don’t look like garbage in the REPL. If you’re writing your own tooling, learn how inspect works. Don’t just regex the source code like an amateur.
The typing Module: A Band-Aid on a Gaping Wound
Python 3.5 gave us typing, and by Python 3.10, we finally got the | operator for unions. It’s supposed to make python documentation redundant, right? Wrong. Type hints are for the machine; docstrings are for the human.
Mypy 1.8.0 will tell you that a variable is a Union[str, List[int]], but it won’t tell you why it might be a string or what that list of integers represents. I’ve seen codebases where the type hints are so dense that the actual logic is obscured, yet there isn’t a single line of python documentation explaining the business logic.
# Modern "Type-Safe" Code with zero explanation
def process_registry(
    mapping: dict[str, list[tuple[int, str | None]]],
    timeout: float = 30.0
) -> bool:
    ...
What is the int in that tuple? Is it a Unix timestamp? A file descriptor? A count of failed login attempts? Mypy doesn’t care, but the engineer who has to fix this at 3:00 AM certainly does. The typing module has changed how we write python documentation by offloading the “what” to the type system, but it has made the “why” even more critical. If you aren’t using docstrings to explain the semantics of your complex types, you are just writing a puzzle for your colleagues to solve.
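Here is one way the same signature could carry its own semantics. Everything about the meaning below (health checks, Unix timestamps, what timeout measures) is invented for the example, since the original gives no hint; the FailedCheck alias is likewise hypothetical.

```python
import time
from typing import Optional

# Hypothetical meaning, invented for this example: each entry records one
# failed health check as (unix_timestamp_seconds, error_message_or_None).
FailedCheck = tuple[int, Optional[str]]


def process_registry(
    mapping: dict[str, list[FailedCheck]],
    timeout: float = 30.0,
) -> bool:
    """Report whether every host in the registry is currently healthy.

    Args:
        mapping: Hostname mapped to its failed health checks. Each check
            is (Unix timestamp in seconds, error message or None when
            the host gave no reason).
        timeout: Age in seconds after which a failure is ignored.

    Returns:
        True if no host has a failure newer than timeout seconds.
    """
    cutoff = time.time() - timeout
    return not any(
        ts >= cutoff for checks in mapping.values() for ts, _ in checks
    )
```

The type alias tells Mypy the shape; the docstring tells the 3:00 AM engineer what the numbers mean.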
Google-Style vs. NumPy-Style: Pick a Side and Stay There
There are two primary schools of thought for formatting python documentation, and I have seen blood spilled over which is better.
The Google Style
Google-style is more concise. It’s designed for people who don’t want to spend their lives writing boilerplate. It uses indentation to denote sections.
def fetch_system_logs(server_id: int, limit: int = 100) -> list[str]:
    """Fetches logs from the specified server.

    Args:
        server_id: The unique identifier for the hardware node.
        limit: The maximum number of log entries to return. Defaults to 100.

    Returns:
        A list of log strings, sorted by timestamp descending.

    Raises:
        ConnectionError: If the server is unreachable.
    """
    pass
The NumPy Style
NumPy-style is for the heavy hitters. It’s verbose, it uses underlines, and it looks like a scientific paper. It’s great for complex mathematical functions where you need to explain the theory behind the parameters.
def calculate_eigenvalue(matrix, tolerance=1e-9):
    """
    Compute the dominant eigenvalue using the power iteration method.

    Parameters
    ----------
    matrix : numpy.ndarray
        A square matrix of shape (N, N).
    tolerance : float, optional
        The convergence threshold for the iteration. Default is 1e-9.

    Returns
    -------
    float
        The estimated eigenvalue.

    See Also
    --------
    numpy.linalg.eig : NumPy's general eigenvalue solver.
    """
    pass
In my day, we didn’t have “styles.” We had a paragraph of text and we liked it. But if you’re working in a modern stack, pick one. Mixing these styles in a single project is a firing offense. Sphinx 7.2.6 only understands either of them through the sphinx.ext.napoleon extension, and a codebase that mixes both makes your python documentation look like it was written by a committee of people who hate each other.
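One way to put the decision in writing is in the Sphinx configuration itself. This is a sketch of a conf.py fragment, assuming the project uses autodoc together with the napoleon extension and has settled on Google style; swap the two flags if you went the other way.

```python
# conf.py (fragment): record the project's one docstring style.
extensions = [
    "sphinx.ext.autodoc",   # pulls docstrings out of the modules
    "sphinx.ext.napoleon",  # translates Google/NumPy sections into reST
]

# Both options default to True; turning one off makes the choice explicit.
napoleon_google_docstring = True
napoleon_numpy_docstring = False
```

It won't stop anyone from writing the wrong style, but it does mean the next argument can be settled by pointing at a file.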
The Lies of the README.md
The README.md is the “pretty face” of the project, and it is almost always a lie. It contains a “Quick Start” guide that worked on the author’s machine six months ago and hasn’t been updated since. It uses words like “easy” and “simple” to describe a process that involves installing three different versions of Python and a specific C++ compiler.
A real README should be a technical specification. It should list the Python version (e.g., Python 3.11.4), the required system dependencies (e.g., libssl-dev), and the exact steps to run the tests. If your README doesn’t include a section on how to generate the python documentation, then the documentation doesn’t exist.
I’ve seen projects where the README is 2000 words of marketing fluff and zero words on how to handle a database migration. This is what happens when you let product managers near your repository. A README should be written by the person who had the hardest time setting up the dev environment.
The ReadTheDocs Pipeline: A Rube Goldberg Machine for Typos
We used to just host HTML files on a static server. Now, we have CI/CD pipelines that trigger on every git push. You push a change to a comment, and suddenly a GitHub Action is spinning up a container, installing Sphinx 7.2.6, downloading half the internet via pip, and trying to build your python documentation.
And it fails. It always fails. It fails because sphinx-rtd-theme had a breaking change, or because your requirements.txt didn’t pin the version of urllib3. You spend four hours debugging the documentation build for a two-line code change. This is the “overhead” no one talks about.
The maintenance of a python documentation pipeline is a full-time job. If you aren’t prepared to monitor your build logs, don’t bother with automated docs. Just put a text file in the root directory and call it DOCS.txt. At least that won’t break your deployment pipeline because of a CSS linter.
Docstring Inheritance: The Silent Killer of Logic
One of the most misunderstood features of Python is how docstrings interact with inheritance. If you have an abstract base class (ABC) and you don’t provide a docstring for the implementation, some tools will try to “inherit” the docstring from the parent. This is dangerous.
from abc import ABC, abstractmethod

class BaseProcessor(ABC):
    @abstractmethod
    def run(self):
        """Execute the primary logic of the service."""
        pass

class FileProcessor(BaseProcessor):
    def run(self):
        # No docstring here
        import os
        os.remove("/")  # Oops
If I’m looking at FileProcessor.run(), I might see the inherited docstring and assume it’s safe. But the implementation might be doing something radically different (or dangerous). The raw __doc__ attribute on the override is None, yet inspect.getdoc() has walked the inheritance hierarchy since Python 3.5, so any tool built on it will happily show you the parent’s promise as if the child had made it.
>>> import inspect
>>> FileProcessor.run.__doc__ is None
True
>>> inspect.getdoc(FileProcessor.run)
'Execute the primary logic of the service.'
When writing python documentation for class hierarchies, you must be explicit. Don’t rely on the reader to go hunting up the MRO (Method Resolution Order) to find out what a function is supposed to do. If you override it, document it.
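If you'd rather have a machine nag about this than me, a check along these lines is enough. The helper name undocumented_overrides is made up for this sketch; it only looks at plain functions defined directly on the class, and it will also flag an undocumented __init__, which you may or may not want.

```python
import inspect
from abc import ABC, abstractmethod


def undocumented_overrides(cls):
    """Return names of methods that override a parent without a docstring."""
    missing = []
    for name, member in vars(cls).items():
        if not inspect.isfunction(member) or member.__doc__:
            continue
        # Only flag real overrides: some ancestor must define the same name.
        if any(name in vars(base) for base in cls.__mro__[1:]):
            missing.append(name)
    return sorted(missing)


class BaseProcessor(ABC):
    @abstractmethod
    def run(self):
        """Execute the primary logic of the service."""


class FileProcessor(BaseProcessor):
    def run(self):
        return "processed"


print(undocumented_overrides(FileProcessor))  # ['run']
```

It reads the raw __doc__ on purpose, so an inherited docstring can't paper over the gap.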
The Era of LLMs: Hallucinated Documentation
Now we have AI. People are using LLMs to generate their python documentation. I’ve seen the output. It’s terrifying. The AI will look at a function, guess what it does based on the name, and write a beautiful, grammatically correct docstring that is factually wrong. It will claim a function returns a list when it actually returns a generator. It will say a parameter is optional when the code clearly requires it.
The danger of AI-generated python documentation is that it looks authoritative. A human-written “TODO” is honest. An AI-generated paragraph about a function it doesn’t understand is a trap. If you use an LLM to write your docs, you are just adding more noise to the signal. You are making the 3:00 AM outage even harder to solve because now the engineer has to figure out if the documentation is lying to them.
I don’t care how “advanced” the model is. If it didn’t write the code, it shouldn’t write the docs. Documentation is an act of reflection. It’s the moment where the developer realizes, “Wait, this function signature is actually terrible,” and changes it. If you automate that process, you lose the last line of defense against bad design.
The Internal Mechanics of inspect.getdoc()
Let’s get technical for a second. Why do we care about inspect.getdoc()? Because it’s the only way to reliably get the documentation of an object in a programmatic way.
import inspect

def complex_function():
    """
    Line one.
        Line two (indented).
    Line three.
    """
    pass

doc = inspect.getdoc(complex_function)
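The getdoc function does several things:
1. It fetches obj.__doc__.
2. If that is None, it looks for a docstring further up the inheritance hierarchy (for classes, methods, properties, and descriptors).
3. If what it ends up with is not a string, it returns None.
4. It calls inspect.cleandoc(), which expands tabs, strips the leading whitespace from the first line, finds the minimum indentation of all non-blank lines after the first one, removes that amount from each of those lines, and drops leading and trailing blank lines.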
This is why your docstrings can be indented inside a class or function but still look correct when you run help(). If you were to just use obj.__doc__, you’d get all the leading whitespace from the source file (at least on Python 3.12 and earlier; the 3.13 compiler strips the common indentation itself), which would break most terminal-based pagers. This is the kind of detail that matters when you’re building tools that actually help people.
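You can watch the cleanup step in isolation by feeding cleandoc() a raw string shaped like an indented docstring. A small sketch; the string literal stands in for what an indented function body would have stored in __doc__:

```python
import inspect

# A raw string shaped like a docstring written inside an indented function.
raw = "\n    Line one.\n        Line two (indented).\n    Line three.\n    "

cleaned = inspect.cleandoc(raw)
print(repr(cleaned))
# 'Line one.\n    Line two (indented).\nLine three.'
```

The common four-space margin is gone, the relative indentation of line two survives, and the blank lines at both ends are dropped.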
Hard-Truth Checklist for Junior Developers
If you want to survive in this industry without me breathing down your neck, follow these rules for your python documentation:
- The 3:00 AM Rule: If I wake you up and show you a function, can you tell me what it does without looking at the implementation? If not, the docstring is a failure.
- No Fluff: Do not use words like “vibrant,” “seamless,” or “comprehensive.” Tell me the inputs, the outputs, and what happens when it breaks.
- Pin Your Versions: If your docs depend on Sphinx 7.2.6, put it in a requirements-docs.txt file. Don’t make me guess.
- Test Your Examples: If you put a code snippet in your python documentation, use doctest to ensure it actually runs. There is nothing worse than a “Quick Start” that throws a SyntaxError.
- Document the Exceptions: I don’t care about the “happy path.” Tell me what Exception subclasses I need to catch.
- Check the REPL: Run python3 -c "import my_module; help(my_module.my_func)". If the output is unreadable in a standard 80-column terminal, fix it.
- Type Hints are Not Enough: data: dict is useless. data: dict # Map of user_id to session_token is a start. A full docstring is better.
- Update the README: If you change the installation process, update the README.md before you commit the code. Not after. Not “later.” Now.
- Stop Using AI for Docs: If you didn’t think about the code enough to describe it in English, you didn’t think about the code enough to write it in Python.
- Respect the Man Page: If your library is big enough, write a manual. A real one. Not a collection of blog posts.
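For the "Test Your Examples" rule, this is roughly what it looks like in practice. The function clamp_ms and the helper run_examples are invented for this sketch; in a real project you would more likely run python -m doctest on the module or call doctest.testmod().

```python
import doctest


def clamp_ms(delta, limit=1000):
    """Clamp a delay in milliseconds to an upper limit.

    >>> clamp_ms(250)
    250
    >>> clamp_ms(5000)
    1000
    """
    return min(delta, limit)


def run_examples(func):
    """Run the doctest examples in func's docstring and return the totals."""
    finder = doctest.DocTestFinder()
    runner = doctest.DocTestRunner(verbose=False)
    for test in finder.find(func, globs={func.__name__: func}):
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_examples(clamp_ms)
print(results.failed, results.attempted)  # 0 2
```

Change the implementation without updating the docstring and the failed count stops being zero, which is exactly the alarm you want.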
Documentation is the difference between a professional engineer and a script kiddie. Back in my day, we understood that. Now, I’m just hoping the next generation learns it before the last of us retires and the whole system collapses under the weight of “self-documenting” garbage.
Now get back to work and fix those docstrings. I’m watching the commit logs.