Technical Debt our Best Frenemy Forever
All applications grow. As their codebase grows, their complexity
grows.
You can think of code as nodes on a graph of functionality where the
edges represent dependencies between the nodes. The problem is that
each node on the graph was implemented by a human who more than likely
missed several important things and made several limiting
assumptions when they implemented it. As an application grows, we
continue to add more imperfect nodes to the graph. The bugs and
limitations in these nodes are together what we call technical debt
and it can compound very quickly up until the point where it is very
difficult to progress. When a codebase has a lot of technical debt,
introducing new features will constantly break old features, and every
release brings as much breakage as it does positive change. Working on
code with a large amount of technical debt is like playing
whack-a-mole: you knock one down and the other buggers keep popping
their heads back up in places that you never expect.
Code with a lot of technical debt is hard to change, because when a
developer makes a change they need to chase the ramifications down
through the complexity of the whole system. So while a developer may
have been able to successfully make the change without introducing any
new problems, the process of making the change was much harder than it
needed to be.
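One way to picture "chasing the ramifications" is as a walk over the dependency graph. The sketch below is only an illustration: the module names and the `affectedBy` helper are hypothetical and not taken from any real codebase. It shows how a change to one low-level node can touch most of the graph.

```typescript
// Hypothetical modules; each key lists the modules it depends on.
type DependencyGraph = Record<string, string[]>;

const dependsOn: DependencyGraph = {
  "chat-view": ["chat-model", "contacts"],
  "wallet-view": ["wallet-model"],
  "chat-model": ["transport"],
  "wallet-model": ["transport"],
  contacts: [],
  transport: [],
};

// Returns every module that directly or transitively depends on `changed`,
// i.e. everything a developer has to re-check after touching that node.
function affectedBy(graph: DependencyGraph, changed: string): string[] {
  const affected: string[] = [];
  const queue: string[] = [changed];
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const name of Object.keys(graph)) {
      const dependsOnCurrent = graph[name].indexOf(current) !== -1;
      if (dependsOnCurrent && affected.indexOf(name) === -1) {
        affected.push(name);
        queue.push(name);
      }
    }
  }
  return affected.sort();
}

// Changing the leaf "contacts" touches one module; changing "transport"
// touches four of the other five.
const contactsImpact = affectedBy(dependsOn, "contacts");
const transportImpact = affectedBy(dependsOn, "transport");
```

The more edges a node has pointing at it, the longer this list gets, which is the cost the paragraph above describes.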
The experience of working on a project with a high degree of technical
debt is rarely a good one. Product managers become frustrated and the
developers start to dislike working on the codebase. In general,
everyone starts to become a bit demoralized as they drastically scale
back their visions for the number of changes that can be made to the
software in a given period of time. This overall drag of technical
debt is further complicated because developers unconsciously avoid the
pain of changing things that have become too complex to reason
about. Instead, it becomes easier to add more code and more debt, or
simply to move to another project that doesn’t have as much debt. Product
managers start to avoid the pain as well, unconsciously steering away
from the hairball areas that seem to gobble up time and produce few
tangible results.
One could characterize technical debt as a bad thing, and it does feel bad,
but it happens every single time you develop an application. We need
to accumulate technical debt to learn about our domain, full
stop. There is no way around accumulating technical debt. Yes,
hard-won experience can help you avoid some tech debt, but the
hindsight you acquire while gaining domain knowledge is a much more
accurate lens than trying to see the future and prevent tech debt at
the outset. In fact, many attempts at preventing tech debt actually
manifest as causes of technical debt. So technical debt is absolutely
unavoidable. Technical debt is our frenemy; it’s just an active part
of the process that we can’t live without.
When we get to the point where we accumulate so much tech debt that we
can’t cover the interest, and can’t progress, there is no choice but
to go back to the base nodes and start firming things up. We need to
separate these nodes, tease them apart and understand their full needs
and effects on the greater application. In this process, we
integrate the information that we gained while initially sprinting
forward, brutally get rid of the things we thought we might need,
and destroy abstractions that we imagined would be helpful but on
reflection provide little actual utility. We reduce and distill the
code until we arrive at its essence, its stupid-simple and
boring essence.
We need to make the foundation nodes on the graph solid as bedrock and
then continue up the nodes in layers to shore things up. You know that
you are refactoring correctly when the number of lines in your
codebase drastically shrinks, and your code is re-rendered as
simple, understandable, boring code. This shoring up and elimination
of tech debt restores forward momentum on software projects and
basically makes them fun to work on again. It unlocks their ability to
reach their potential.
It’s important to realize that this process is deliberate and
careful, not reactive. We let the code that
exists guide us; we don’t just dump it and start over. I’ve seen teams
take wild swings and come up with simple-sounding but drastic
solutions so that they can do something now and get the ball
rolling. This is an understandable reaction, as folks are
frustrated. But trust me, don’t elect a crazy orange-faced leader and
think that is going to fix everything. The solution is to use diligent
reflection, careful refactoring, and reduction of the codebase.
During the process of reducing technical debt, forward movement in
terms of additional external features will slow and perhaps stop, and
this is frustrating for product managers and other stakeholders as
they cannot see and feel the improvements. But this situation is no
more frustrating than the current situation, where each release results
in little forward motion anyway. Product managers do have a role in
the refactoring process; refactoring goals and milestones are just
different, but no less rewarding. While refactoring, improvements do
come and they come much more quickly than one might think. The backlog
of bugs starts to shrink, strange application behaviors begin to
vanish, and the application starts behaving the way it is intended to.
Status
At Status, I see folks struggling to move the codebase forward. The
status-react codebase has become quite complex and is hard to
effectively reason about. Looking through the code I see several
patterns that have decreased the ability for individual developers to
understand the full implications of the code they are currently
looking at, and increased the likelihood that making changes locally
could have unknown ramifications in other parts of the application. I
also see examples of code that creates tech debt in order to solve
problems created by older tech debt. The codebase looks exactly like a
codebase where coders have been sprinting ahead accruing features for
quite some time. To me, the status-react codebase appears to have
reached a limit in the amount of technical debt it can accumulate.
What Status is doing is challenging. It is challenging both at a
technical level, and at an organizational level. It’s a bold
experiment in autonomy and it’s a worthy one. I’d like to offer some
practical suggestions on code patterns that will support the autonomy
of individual developers and small teams to work on a large codebase
and have a high degree of confidence about the code they are working
on and its relationship to the greater codebase. We can structure the
code to support autonomy by creating a contract between the local code
and the vastly complex world of the larger application so that the
relationship between the two is explicit and understood. We want to
support developers so that they can act in confidence locally and know
that they are not causing spooky action at a distance and wreaking
havoc on other parts of the application.
Encapsulation, components, and strict API boundaries have served
developers as time-proven tools that help reduce the scope of things
developers need to consider when working on a piece of code. As
developers we benefit tremendously from encapsulation and division of
responsibility. We benefit when we use components and libraries that
hide their internals from us and present us with a stable API as a
contract between us and them. This is what allows us as developers to
coordinate with a myriad of completely independent entities across the
FOSS landscape.
Let’s think about that: at Status we effortlessly cooperate with thousands
and thousands of other open source programmers. The open source ecosystem
is an absolutely incredible distributed team. Cooperating with this
immense distributed team is practically effortless. We can do it
because we aren’t all up in other people’s business and they aren’t
messing in ours. The defined API boundary between external code and
our code is a contract between them and us that supports our
independence.
Because of the nature of how we work at Status, I believe that using
patterns that support code independence will not only provide more
reliability to the codebase but will also structure the code in a way
that supports and encourages Status’s ideals.
Refactoring and strict boundaries don’t guarantee solid,
well-functioning code, but they do improve the likelihood that it will
emerge. They create an environment where a developer can reason much
more effectively about what they are working on because the scope and
ramifications of the code at hand are much smaller.
Re-frame and scale
Re-frame was designed for and is used successfully on Web applications
of a certain scale. It is normally used as a central controller that
synthesizes many individual parts into a complete whole. When the team
at day8 (the authors of Re-frame) create event handlers they end up
being very shallow and explicit. The event handlers they write are
shallow because when you are developing a normal Web application it’s
only one or two event hops to a REST API call or a third party library
call. The event handlers are shallow because they are either just
altering the app-db or talking to stateful encapsulated systems
like databases on HTTP servers.
Most of the work they do at day8 is data and display oriented, and
they benefit from the Re-frame pattern because it does data and
display really well. The vast majority (90%) of their event
handlers are simple and alter the database only.
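To make "shallow" concrete, here is a rough TypeScript analogue of a handler that only alters the application database. This is not Re-frame's actual ClojureScript API; the `AppDb` shape, the event names, and `handleEvent` are all assumptions made up for the sketch.

```typescript
// Hypothetical application database shape.
interface AppDb {
  readonly contacts: readonly string[];
  readonly selectedContact: string | null;
}

// Hypothetical events the UI can dispatch.
type AppEvent =
  | { type: "contact-added"; name: string }
  | { type: "contact-selected"; name: string };

// A shallow handler: a pure function from (db, event) to the next db.
// It performs no I/O and dispatches no further events.
function handleEvent(db: AppDb, event: AppEvent): AppDb {
  switch (event.type) {
    case "contact-added":
      return { ...db, contacts: db.contacts.concat(event.name) };
    case "contact-selected":
      return { ...db, selectedContact: event.name };
  }
}

const emptyDb: AppDb = { contacts: [], selectedContact: null };
const afterAdd = handleEvent(emptyDb, { type: "contact-added", name: "alice" });
const afterSelect = handleEvent(afterAdd, { type: "contact-selected", name: "alice" });
```

Handlers like this stay easy to reason about because the whole effect of an event is visible in one place. The trouble described next starts when a single event fans out through many hops of further events and side effects.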
Beyond a certain scale Re-frame ceases to be helpful and starts to
display the common symptoms of mutable global state.
I would say that Status.im is currently experiencing the shortcomings
of taking the Re-frame pattern too far. The Re-frame pattern has
deeply entangled itself into different parts of the codebase that
would be much better off as separate entities.
Breaking the Re-frame rules
There truly is no silver bullet. There are good guidelines, principles,
and trade-offs, but there is no one pattern that you can simply lean on
and expect everything to turn out OK.
In support of reliability, simplicity, and understandability via code
isolation with strict boundaries, I am going to suggest that
developers do things like create namespaces and components that manage
their own local state. I’m going to recommend this because I strongly
believe it is the correct trade-off to make. Some folks may resist
this, but in many cases it is what will allow the creation of local
components that don’t leak local information into the global
context. Again, this trade-off is the very same trade-off that we
benefit from whenever we use a third-party library like web3 or a React
Native component. It is much better to compromise and use local
mutable state, and in return get the concrete benefits of simple code
that isn’t complected with the greater application, than to hold on to a
notion that we are doing something the “right” way when that way isn’t
actually offering any real utility.
Could you imagine if all of the internal state of the web3 library,
the HTTP server (threads, request state), all the third-party React
Native components, and the Ethereum node were all present in app-db?
It would be a nightmare because it would become very hard to tell where
layers of functionality begin and end. Would we be able to see the
boundaries of the HTTP server versus Ethereum? As these things mingle
together they start to become complected into one big mass and the
surrounding application code begins to depend on them being one single
thing.
So at times, I’m going to suggest deviating from the Re-frame way, and
in its place, I’m going to suggest some very straightforward,
stateful, boring, yet time-proven patterns.
To be clear, I’m not suggesting ditching Re-frame but rather
relegating it to the UI and the high-level data that’s needed for the
UI. Let Re-frame do Re-frame at a scale that makes sense. I want
Re-frame to be the brain that coordinates the application but doesn’t
get overzealous and micro-manage the parts of the body that already
know locally how to do what needs to be done.
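As a minimal sketch of that division of labor, the stateful part below keeps its retry bookkeeping to itself and reports only a high-level status to the central state. Everything here is hypothetical: `UiState`, `ConnectionManager`, and the `tryConnect` callback are stand-ins I made up, not names from status-react or Re-frame.

```typescript
// Hypothetical high-level state that the UI (the "brain") cares about.
type ConnectionStatus = "offline" | "connecting" | "online";

interface UiState {
  connection: ConnectionStatus;
}

// A stateful "body part": it knows locally how to connect and retry.
class ConnectionManager {
  private attempts = 0; // local detail, never copied into UiState
  private readonly maxAttempts = 3;

  constructor(private readonly report: (status: ConnectionStatus) => void) {}

  // `tryConnect` stands in for a real network call; it returns true on success.
  connect(tryConnect: () => boolean): void {
    this.attempts = 0;
    this.report("connecting");
    while (this.attempts < this.maxAttempts) {
      this.attempts += 1;
      if (tryConnect()) {
        this.report("online");
        return;
      }
    }
    this.report("offline");
  }
}

// Wiring: the central state learns the outcome, not how it was reached.
const ui: UiState = { connection: "offline" };
const manager = new ConnectionManager((status) => {
  ui.connection = status;
});

let calls = 0;
manager.connect(() => {
  calls += 1;
  return calls === 2; // fail once, then succeed
});
```

The retry count never appears in the shared state, so no other part of the application can come to depend on it.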
Recommendations
Isolate chat/wallet functionality into a well-tested library
Isolate all code that talks to the blockchain and mail server for chat
and wallet functionality into a high-level library (or several
libraries) with a well-defined API and set up an integration test
suite that thoroughly tests this code (in Node) against geth (or
testrpc) and status-go.
Think in terms of a separate library that has an API that you could
build a completely different chat application on. Imagine you wanted
to make a command-line, curses-based chat client in Node and you wanted
to use this library.
The library API should be minimal and provide the consumer a level of
abstraction that leaks as few internal implementation
details as possible. In an ideal world, you would be able to
dramatically alter how the API is implemented (say moving to Swarm PSS
or Signal) and it would make zero difference to the UI of the
application that utilizes it.
This library would need to maintain local state much the way that the
web3 environment underneath it does.
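Here is a rough sketch of what such a boundary could look like. It is not a proposal for the real API: `ChatClient`, `Transport`, and the in-process loopback transport are all hypothetical names invented for illustration, and a real transport would be asynchronous and talk to the network.

```typescript
// Hypothetical internal contract. The UI never sees this, so it can be
// re-implemented on a different protocol without the UI noticing.
interface Transport {
  publish(topic: string, payload: string): void;
  subscribe(topic: string, onPayload: (payload: string) => void): void;
}

// Hypothetical public API: the only surface the UI is written against.
interface ChatClient {
  send(chatId: string, text: string): void;
  history(chatId: string): string[];
}

// Stand-in transport for this sketch: delivers messages in-process.
function createLoopbackTransport(): Transport {
  const listeners: Record<string, Array<(payload: string) => void>> =
    Object.create(null);
  return {
    publish(topic, payload) {
      for (const listener of listeners[topic] || []) {
        listener(payload);
      }
    },
    subscribe(topic, onPayload) {
      (listeners[topic] = listeners[topic] || []).push(onPayload);
    },
  };
}

function createChatClient(transport: Transport): ChatClient {
  // Local state owned by the library and invisible to its consumers.
  const messages: Record<string, string[]> = Object.create(null);

  const ensureJoined = (chatId: string): string[] => {
    if (!messages[chatId]) {
      const log: string[] = [];
      messages[chatId] = log;
      transport.subscribe(chatId, (payload) => {
        log.push(payload);
      });
    }
    return messages[chatId];
  };

  return {
    send(chatId, text) {
      ensureJoined(chatId);
      transport.publish(chatId, text);
    },
    history(chatId) {
      return ensureJoined(chatId).slice(); // copy, so callers cannot mutate our state
    },
  };
}

const sharedTransport = createLoopbackTransport();
const alice = createChatClient(sharedTransport);
const bob = createChatClient(sharedTransport);
bob.history("general"); // bob joins the chat before alice writes
alice.send("general", "hello");
```

A curses-based client and a React Native client could both be written against `ChatClient` alone. Swapping the loopback transport for a different one would change only the argument passed to `createChatClient`.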
The code in the core library would be written in a way that makes it very
explicit how chat communication on the Status platform works. It might
be a good idea to write the core code in a straightforward literate
(well commented) style to help all current and future developers
understand the internals.
The integration testing of this library should be extensive and
rigorous and reflect the importance of this core library to the
functioning and success of Status. Erroneous situations should be
exercised and well understood.
This library should be the rock-solid foundation that the rest of the
application springs from.
Careful refactoring is the way to get there
As mentioned above I recommend careful refactoring and distilling the
code down to its essence. Refactoring is the tool that will help
produce the core Status chat/wallet library mentioned above.
I am against starting an initiative where a team or a developer goes
off on their own and writes this high-level library from scratch. We
can keep the application functioning and whole while we clean things
up.
Break up UI code into components
Currently, our view code benefits from encapsulation when we use
components that are provided to us by React Native and third-party
libraries.
IMHO Status’s view code would benefit further if we broke up the UI
into small, medium, and large-scale reusable components that do not
rely internally on Re-frame subscriptions and events. Again these
components would have a high-level well-defined API and would
encapsulate their behavior such that the internals of the components
would know nothing of Re-frame and the structure of app-db. We would
then assemble these independent components and wire them into the
larger application by providing data from the appropriate
subscriptions etc.
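A small sketch of that split follows. To stay self-contained it renders to plain data rather than React Native elements, and every name in it (`contactList`, `Row`, the `store` object standing in for app-db plus its subscriptions and events) is hypothetical.

```typescript
// The component's whole API: plain data in, callbacks out.
interface ContactListProps {
  contacts: readonly string[];
  onSelect: (name: string) => void;
}

// Simplified stand-in for a rendered, pressable list row.
interface Row {
  label: string;
  press: () => void;
}

// The component knows nothing about any global store or its layout.
function contactList(props: ContactListProps): Row[] {
  return props.contacts.map((name) => ({
    label: "> " + name,
    press: () => props.onSelect(name),
  }));
}

// Hypothetical application-level state, standing in for app-db.
const store = {
  contacts: ["alice", "bob"],
  selected: null as string | null,
};

// The wiring layer is the only place that knows both sides.
function connectedContactList(): Row[] {
  return contactList({
    contacts: store.contacts,
    onSelect: (name) => {
      store.selected = name;
    },
  });
}
```

Because `contactList` takes everything through its props, it can be tested or reused with any data source, and a change to the shape of the store only touches `connectedContactList`.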
This isolation would, yet again, free developers from the need to
understand more of the application than necessary. It would give
developers independence from the larger picture and allow them to
concentrate on the task at hand, which is to create great UX.
These components should feel free to deviate from the Re-frame pattern
and use local state, and even talk directly to libraries and services. As
a motivating example, think about the HTML video tag and all the
functionality it provides: it talks directly to browser services,
fetches the video, and keeps track of all the internal playback state.
I could imagine a wallet React component that manages all of its own
navigation, and talks directly to a high-level wallet API, without
using Re-frame as an intermediary.
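Stripped of rendering, such a component might look something like the sketch below. The `WalletApi` interface, the screen names, and the in-memory fake wallet are assumptions made up for illustration. A real component would render views and talk to an asynchronous API.

```typescript
// Hypothetical high-level wallet API that the component talks to directly.
interface WalletApi {
  balance(): number;
  send(to: string, amount: number): boolean;
}

type WalletScreen = "overview" | "send" | "receipt";

class WalletComponent {
  // Navigation state stays local to the component.
  private readonly stack: WalletScreen[] = ["overview"];

  constructor(private readonly api: WalletApi) {}

  current(): WalletScreen {
    return this.stack[this.stack.length - 1];
  }

  openSend(): void {
    this.stack.push("send");
  }

  back(): void {
    if (this.stack.length > 1) {
      this.stack.pop();
    }
  }

  // Calls the wallet API directly; shows the receipt only on success.
  submit(to: string, amount: number): void {
    if (this.current() === "send" && this.api.send(to, amount)) {
      this.stack.push("receipt");
    }
  }
}

// In-memory fake used only to exercise the sketch.
function createFakeWallet(initialFunds: number): WalletApi {
  let funds = initialFunds;
  return {
    balance: () => funds,
    send(_to, amount) {
      if (amount <= 0 || amount > funds) {
        return false;
      }
      funds -= amount;
      return true;
    },
  };
}
```

The rest of the application only needs to mount the component and hand it a `WalletApi`. Which screen the wallet is on is nobody else's business.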
Specifics
I’m aware that this post is general. I address subjects like technical
debt and refactoring, while not providing specific examples. This is
something that I’d like to have a chance to do.
I think communicating specific examples of technical debt and
demonstrating what I mean when I talk about refactoring is going to
require a wider bandwidth than a post on discuss.
If people are open to it, I’d like to pair program with some folks and
share concrete examples of reducing technical debt in the status-react
codebase. We could also experiment with some mob programming sessions
where we have lots of developers participate in a single programming
session.
I’m also happy to talk with anyone to answer questions.
Conclusion
IMHO the status-react codebase has accumulated enough technical debt
to cause significant friction in moving the codebase forward and
allowing Status to reach its stated goals. Technical debt is not a bad
thing but rather a normal feedback signal on the road to learning how
to build what you are building. There is no way to solve technical
debt other than careful consideration of the code and refactoring it
down to its essence. Drastic measures (Write tests for everything!
Write specs for everything! Re-write it completely!) normally just
produce more code and are a clumsy way to address the fundamental
problems.
Code isolation patterns will allow a distributed team to coordinate
much more easily and allow folks to focus on the problem in front of
them with confidence.
I fully believe that these things will help restore forward momentum
in the project and bring much more joy to everyone working on Status.
Big thanks to folks for taking the time to read this!
Bruce Hauman
bruce.stateofus.eth