Monday, January 31, 2022

Ports and Adapters part 1

A love story

Today I want to begin a series about a technique which helps to isolate code from changes in implementation details.  Sometimes those details may seem central to the code in question, but many times they just aren't.

I'm going to start with a little story.  I don't know if it's interesting or not, but it's all true.

Many years ago, back when my razor stubble was brown and I was a hired gun programmer, I was responsible for maintaining a corporate security library.  This was an important library.  Dozens of applications depended on it to authenticate and authorize corporate users.

There was just one problem:  It relied on an obsolete LDAP library which was no longer being maintained, and that had to change.

This was a pretty frightening task.  It absolutely had to work.  This was before the idea of providing such a thing as a service had become popular, and the library was statically linked to all those applications.  A mistake meant a patch, a patch that would get attention, and what's more a patch that would have to be quickly rolled out to all affected applications.

I hate attention.  Well, bad attention anyway.  I suppose I'm also not really a big fan of emergency deployments. 

I had to formulate a plan of attack.  As per my usual practice, I banged my head against things until I eventually worked out what I could have found in a book.  Let me walk you through my process:

Step 1:  Tests

First, I built a comprehensive suite of unit tests that validated all of the library's functions.  The tests were a little more integration-ey than I would build today, but when things were green?  I had great confidence that I would have a successful build.  We were also adopting Jenkins at the time, so every code commit was tested and built automatically.

Step 2: Isolation

Next up, I isolated all of the library calls behind interfaces.  That was a lot of effort, but with the tests backing me up, I was able to keep everything working just as before.  I did have to add a factory to instantiate the main library class, but kept that hidden behind a facade that looked unchanged to the API users. 
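
Here is a minimal sketch of what that kind of isolation might look like in Java.  None of these names come from the original library: DirectoryGateway, LegacyLdapGateway, GatewayFactory and SecurityLibrary are all hypothetical, and the obsolete LDAP calls are replaced by an in-memory map so the example runs on its own.

```java
import java.util.Map;

// Hypothetical interface: the only view of the directory the rest of the library sees.
interface DirectoryGateway {
    boolean authenticate(String user, String password);
}

// Hypothetical wrapper around the obsolete LDAP library. The real calls are
// stubbed with an in-memory map for the sake of the sketch.
class LegacyLdapGateway implements DirectoryGateway {
    private final Map<String, String> fakeDirectory = Map.of("alice", "s3cret");

    @Override
    public boolean authenticate(String user, String password) {
        // In a real library, this is where the obsolete LDAP client would be called.
        return user != null && password != null
                && password.equals(fakeDirectory.get(user));
    }
}

// Factory: the single place that knows which implementation gets built.
class GatewayFactory {
    static DirectoryGateway create() {
        return new LegacyLdapGateway();
    }
}

// Facade: the entry point keeps the signature its callers already use.
class SecurityLibrary {
    private final DirectoryGateway gateway = GatewayFactory.create();

    public boolean login(String user, String password) {
        return gateway.authenticate(user, password);
    }
}

class IsolationDemo {
    public static void main(String[] args) {
        SecurityLibrary library = new SecurityLibrary();
        System.out.println(library.login("alice", "s3cret")); // true
        System.out.println(library.login("alice", "wrong"));  // false
    }
}
```

Callers only ever touch SecurityLibrary, so the factory and the interface can change underneath without them noticing.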

Step 3: In with the new

Now I started on the new code.  By sticking to the interfaces which allowed me to accomplish step 2, I was able to swap back and forth between the fully functional production implementation and the one under development.  All I had to do was change the name of the class from 'new obsoleteImplementation()' to 'new unfinishedImplementation()'.  I could run the test suite and get a pretty solid idea of how far I had come and how far I had to go.
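
To make the idea concrete, here is a rough sketch of one set of checks run against both implementations.  The class names are modeled on the ones mentioned above; the UserDirectory interface, its methods and the ContractChecks helper are assumptions made up for illustration, and both implementations are stubs rather than real LDAP code.

```java
import java.util.Set;
import java.util.function.BooleanSupplier;

// Hypothetical interface extracted during the isolation step.
interface UserDirectory {
    boolean authenticate(String user, String password);
    Set<String> groupsOf(String user);
}

// Stand-in for the working production code (stubbed, no real LDAP calls).
class ObsoleteImplementation implements UserDirectory {
    public boolean authenticate(String user, String password) {
        return "alice".equals(user) && "s3cret".equals(password);
    }
    public Set<String> groupsOf(String user) {
        return "alice".equals(user) ? Set.of("staff") : Set.of();
    }
}

// Stand-in for the rewrite: one method ported, one still to do.
class UnfinishedImplementation implements UserDirectory {
    public boolean authenticate(String user, String password) {
        return "alice".equals(user) && "s3cret".equals(password);
    }
    public Set<String> groupsOf(String user) {
        throw new UnsupportedOperationException("not ported yet");
    }
}

class ContractChecks {
    // Runs the same checks against whichever implementation is passed in
    // and returns how many of them failed.
    static int failures(UserDirectory directory) {
        int failed = 0;
        if (!passes(() -> directory.authenticate("alice", "s3cret"))) failed++;
        if (!passes(() -> !directory.authenticate("alice", "wrong"))) failed++;
        if (!passes(() -> directory.groupsOf("alice").contains("staff"))) failed++;
        return failed;
    }

    // A check that throws counts as a failure instead of aborting the run.
    private static boolean passes(BooleanSupplier check) {
        try {
            return check.getAsBoolean();
        } catch (RuntimeException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("old: " + failures(new ObsoleteImplementation()) + " failing");
        System.out.println("new: " + failures(new UnfinishedImplementation()) + " failing");
    }
}
```

The count of failing checks for the new class is a crude but honest progress meter: when it reaches zero, the two implementations agree on everything the checks cover.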

Step 4: Risky business

I realized that this was dangerous territory.  If I shipped a library that had an unexpected bug under load or under some kind of unexpected error condition, there could be really big implications.  The user base included both internal and external entities, offering lots of exposure.  That was too big a risk, so I had to do something to mitigate it.

Step 5: On reflection, this is a good idea

I was working in Java, so I decided to take advantage of reflection to create the class.  If you aren't familiar with the concept, it is just a way of creating an object using the name of the class as a string value.  That came in handy, because now I could just have the two different class names for the now-isolated library and use either one live at run-time.  To make it as safe as possible, I initially defaulted to the old library, but gave the clients an option to set an environment variable to enable the new one.
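
A sketch of that approach might look something like this.  The environment variable name SECURITY_LDAP_IMPL and the LdapClient, OldLdapClient and NewLdapClient types are hypothetical; only Class.forName, getDeclaredConstructor().newInstance() and System.getenv() are standard Java.  The environment is passed in as a map so the loader can be exercised without touching the real process environment.

```java
import java.util.Map;

// Hypothetical interface shared by both implementations.
interface LdapClient {
    String name();
}

class OldLdapClient implements LdapClient {
    public String name() { return "old"; }
}

class NewLdapClient implements LdapClient {
    public String name() { return "new"; }
}

class LdapClientLoader {
    // Hypothetical variable name; clients set it to a fully qualified class name.
    static final String ENV_VAR = "SECURITY_LDAP_IMPL";

    // The safe default: the implementation that is already in production.
    static final String DEFAULT_CLASS = OldLdapClient.class.getName();

    static LdapClient load(Map<String, String> env) {
        String className = env.getOrDefault(ENV_VAR, DEFAULT_CLASS);
        try {
            // Reflection: look the class up by name and call its no-argument constructor.
            Class<?> type = Class.forName(className);
            return (LdapClient) type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot load LDAP implementation: " + className, e);
        }
    }

    public static void main(String[] args) {
        // In production, the real environment decides which class is used.
        System.out.println(load(System.getenv()).name());
    }
}
```

With this shape, changing which implementation is the default is a one-line change to DEFAULT_CLASS, and the environment variable remains as an escape hatch.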

Fortunately, I had good ties with developers for several of the other projects, and I was able to cajole them into testing and then deploying with the optional library enabled.

Step 6: Once more into the breach

Once I had good feedback, I felt safe making the new library the default, while allowing an environment variable to enable the old library.  I kept that around for a good long while, until I was sure it was safe to get rid of it.

Phew

That was a lot of effort and a lot of worry as well. We can do a little better than this.  In fact, we can do a great deal better than this, although frankly I was proud of my accomplishment.  What I'd stumbled into was a more general idea around dependency isolation.  You'll hear terms like 'hexagonal architecture' used to tell you what to do.  

But how do you actually DO it?

That's what I want to dig into in the next few posts.  Swapping dependencies can be a giant pain point, but it does not have to be.  The most difficult thing, really, is adapting how you think about systems.  The trick is to stop trying to adapt our code to someone else's idea of how an API should work.  That is OK for quick demonstrations or tutorials, but it's not how I believe we should build systems.

Instead, when we require a capability, we should design an API for that capability ourselves, whether or not we intend to implement it.  The design should be harmonious with our existing system, or at least follow similar conventions.  It should not feel tacked on.

Integrating external libraries deeply into our own code base is a code smell.  I want my code talking to my own libraries, which will act as adapters to the third party code I want or need to use.  Let those adapters have the weird stuff that thinks the way other people do.  My job is to make those adapters conform to the expectations I've set for/with my API design.
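
As a small illustration, here is one way the split could look in Java.  Everything here is hypothetical: Authenticator is a port designed in the application's own vocabulary, VendorLdapClient is a made-up stand-in for a third-party library with its own conventions (distinguished names and integer result codes), and VendorLdapAuthenticator is the adapter that translates between the two.

```java
// Port: the API we designed for the capability we need, in our own terms.
interface Authenticator {
    boolean isValid(String username, String password);
}

// Hypothetical third-party client. It thinks in distinguished names and
// integer result codes, which is not how the rest of our code thinks.
class VendorLdapClient {
    static final int SUCCESS = 0;
    static final int INVALID_CREDENTIALS = 49;

    int bind(String distinguishedName, char[] secret) {
        boolean ok = "uid=alice,ou=people,dc=example,dc=com".equals(distinguishedName)
                && "s3cret".equals(new String(secret));
        return ok ? SUCCESS : INVALID_CREDENTIALS;
    }
}

// Adapter: all of the vendor's weirdness lives here and nowhere else.
class VendorLdapAuthenticator implements Authenticator {
    private final VendorLdapClient client;

    VendorLdapAuthenticator(VendorLdapClient client) {
        this.client = client;
    }

    @Override
    public boolean isValid(String username, String password) {
        // A real adapter would escape the username before building the name.
        String dn = "uid=" + username + ",ou=people,dc=example,dc=com";
        return client.bind(dn, password.toCharArray()) == VendorLdapClient.SUCCESS;
    }
}

// Application code: depends only on the port, never on the vendor types.
class LoginService {
    private final Authenticator authenticator;

    LoginService(Authenticator authenticator) {
        this.authenticator = authenticator;
    }

    String greet(String username, String password) {
        return authenticator.isValid(username, password)
                ? "Welcome, " + username
                : "Access denied";
    }
}

class AdapterDemo {
    public static void main(String[] args) {
        LoginService service =
                new LoginService(new VendorLdapAuthenticator(new VendorLdapClient()));
        System.out.println(service.greet("alice", "s3cret")); // Welcome, alice
        System.out.println(service.greet("alice", "nope"));   // Access denied
    }
}
```

Because LoginService only knows about Authenticator, replacing the vendor library means writing one new adapter, and a test can pass in a one-line fake instead of a directory server.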

Next time around, I'll dig a little deeper into what I mean by designing an API ourselves, and what the benefits (and costs) are.

Peace,

JD

Friday, January 28, 2022

Check in, check out, Daniel-san

When you start to work on projects with more than one developer, you suddenly find yourself having to solve what sounds like a very simple problem:  Sharing code.  At its heart it IS simple, but the reality is that you have to take a disciplined approach.  Trying to do this without using a tool designed for the job is a likely path to madness, and it's a bit mad not to do so given that the tools you need are widely available at no cost.  Every professional shop, open source project and even many independent individuals make use of some kind of version control system.

Version control at first feels like a burden.  If anything though, it is quite the opposite.  Knowing that your old code is out there, ready to be brought back into your project anytime you need it?  That's gold.  It frees you up to experiment, to explore, to go down paths that you really aren't sure lead anywhere.

So how does one get started?  Naturally, as with anything else, it begins with education and access.  In this case, you need to determine what you have available to you first.  If there's an existing system ready for you to use, you probably want to take advantage of that.  If you're starting from a clean slate, you need to get something set up.  There are many services out there which provide version control, and it can even be free if you don't have a problem with other people possibly seeing your code.  It can also be free if you feel safe just running everything on your own computer or a server you control, although then you may need to install a service and keep it running.

Here are some of the more popular version control systems:

CVS - An old standby, it still works but frankly it's lacking a bit in features more modern systems have.
SVN - More modern and quite functional.  I have worked in Subversion shops for years, and continue to do so.
BitKeeper - This was paid software for years and I've never actually used it myself.  It basically came about as an answer to the difficulties Linus Torvalds was having with Linux development.
Git - A slightly more convoluted system than some, but clearly very powerful.  This ALSO came about due to Linux development, and apparently due to issues Linus was having with BitKeeper.

There are others, but these are probably the main ones most of you will be looking at.

I personally use Git (hosted on a service) for code I share on this blog and for my own experimental work. It doesn't cost me anything, and it's nicely integrated with my IntelliJ IDE.   The hosting service also supports something called a 'gist', which is (as far as I know) a unique way to share a subset of a project in order to request assistance or provide examples.

The basic idea behind version control is the same, no matter what system you use.  You make changes to software on your own computer and make sure that things work the way you want.  When you're happy with the code, you check it in to your version control repository.  If you are unhappy with the code, or have broken something to the point where fixing it is a major burden, you can just pull the last working copy back down and you're back to a known good starting point.

If multiple programmers are working on a project, things are much the same, except that you will pull the last working copy down a bit more often as you are getting all the changes that others have checked in as well.  Things are quite simple as long as two developers aren't working on the same exact files.  If they are working on the same files, some manual intervention is likely going to be needed to ensure that changes don't conflict.  That last process is called 'merging'.


Merging is a source of difficulty, or it can be, depending on your development practices.  I prefer to keep commit changes small and isolated whenever possible.  This keeps the differences (deltas) down to manageable levels, and if I've added two new source files rather than modified an existing one, we're not going to run into any problems.

Avoiding the Deadly Quadrant

I saw a video recently which illustrated an important concept in a very elegant way.



This is important, because in 3/4 of this diagram, your code is inherently safe to run in a multi-threaded environment.  There are no synchronization blocks required, there is no need for complicated gatekeeping.  And yet, somehow a lot of code winds up with mutable data and synchronization headaches.

Applying just a few functional programming principles to your work can go a long way.  Parameters should generally be considered inviolate:  return your results properly and don't try modifying your inputs directly.  Prefer constants to variables.
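
A tiny sketch of those habits in Java, using a made-up Prices class and a made-up tax rate:  the input list is never modified, the result comes back through the return value, and the only shared value is a constant.

```java
import java.util.ArrayList;
import java.util.List;

class Prices {
    // A constant, not a variable: safe to read from any number of threads.
    static final int TAX_PERCENT = 25;

    // Leaves the input untouched and hands back a new, unmodifiable list.
    // Prices are whole cents to keep the arithmetic exact.
    static List<Integer> withTax(List<Integer> pricesInCents) {
        List<Integer> result = new ArrayList<>(pricesInCents.size());
        for (int price : pricesInCents) {
            result.add(price + price * TAX_PERCENT / 100);
        }
        return List.copyOf(result);
    }

    public static void main(String[] args) {
        List<Integer> input = List.of(1000, 200);
        System.out.println(withTax(input)); // [1250, 250]
        System.out.println(input);          // [1000, 200]
    }
}
```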

When you kick off a process, you really don't want it randomly reaching out and modifying some kind of global state.  If it REALLY needs to send messages home, give it a tool to do so, such as a callback function it can use for that purpose.
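
Here is a rough sketch of the callback idea, with a hypothetical ImportJob.  The job never touches global state; anything it wants to report goes through the Consumer it was handed, and the caller decides what happens to those messages.

```java
import java.util.List;
import java.util.function.Consumer;

class ImportJob {
    // Processes the records and reports progress only through the callback.
    // Returns the number of records imported.
    static int run(List<String> records, Consumer<String> onProgress) {
        int imported = 0;
        for (String record : records) {
            if (record.trim().isEmpty()) {
                onProgress.accept("skipped blank record");
                continue;
            }
            imported++;
            onProgress.accept("imported " + record);
        }
        return imported;
    }

    public static void main(String[] args) {
        // Here the caller chooses to print the messages; a test could collect them instead.
        int count = run(List.of("a", " ", "b"), System.out::println);
        System.out.println(count + " records imported");
    }
}
```

If the job runs on another thread, the callback itself has to be safe to call from that thread, but that concern now sits in one small, visible place.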

New Directions

I've been learning lately.

I mean, I have been learning a lot.  Some of it is completely new, some of it is just fresh perspective on old ideas.

This is undoubtedly the natural consequence of being put in charge of developing a green-field project (can you believe it?!?) using both familiar and unfamiliar technologies.  It's a mobile app.  It's an API.  It's a cloud native, event driven...  work in progress.  As a consequence, I run into unexpected things all the time.  I also get to see up close what works and doesn't, as my client is flexible enough to let us experiment with features.

I think I need to record this stuff for anyone who might be interested.  I'm not saying that I will be giving up on discussions of OOP principles, but I will also be expanding my reach.

There was no single trigger for this, but the past year or so has helped me to understand a few new tools and concepts.  I'm still working out others, as I have been all along.  But now I'm going to write it down here.

I hope it proves useful.


JD