Ever read The Pragmatic Programmer, the book from the late 90s? If you haven’t yet, I’d recommend that you at least read the first two chapters, especially if you’re a programmer. It’s essentially the origin of the DRY principle of programming, and some of the things it talks about apply to real-life situations too. I haven’t personally read the whole book, but even a few chapters give you insight that I feel is invaluable for writing short, efficient code. But enough of that: you’re here because you’re too lazy to read the book and just need a short summary, so let’s get to it. I’ll be focusing on the DRY principle, specifically duplication, since the book covers far more than that and it’s impossible to cram it all into a single blog post.
In a nutshell, the DRY principle is “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system”, AKA Keep it DRY, AKA Don’t Repeat Yourself. So what are the different causes of duplication and how are they dealt with? The book describes four ways duplication can occur:
* Imposed Duplication – developers feel they have no choice, the environment seems to require duplication
* Inadvertent Duplication – developers don’t realize that they are duplicating information
* Impatient Duplication – developers get lazy and duplicate because it seems easier
* Interdeveloper Duplication – multiple people on a team or on different teams duplicate a piece of information
Most of you, if not all, probably have a very good idea of the scenarios above and have experienced them, some more than others. The most common one in this programme might be the fourth, interdeveloper duplication. Let’s go into more detail on each form of duplication to understand them better.
Let’s start with imposed duplication. Assume we have the same information represented in multiple forms, probably as a result of working with several programming languages or frameworks. Good practice is to find a common metadata representation and reference it from each environment, rather than redefining the data in every environment and ending up with redundant code shaped by the restrictions each environment imposes. A common format for such a representation is XML, which you are probably familiar with.
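To make the "common representation" idea a bit more concrete, here is a minimal sketch. Everything in it is made up for illustration: the `USER_FIELDS` list, the two type maps and the helper functions are hypothetical, and a plain Python list stands in for the XML file you might use in practice. The point is only that the field list lives in one place and each environment's version is generated from it.

```python
# Hypothetical single source of truth for a "user" record.
# In a real project this could just as well live in an XML or JSON file.
USER_FIELDS = [
    ("id", "integer"),
    ("name", "text"),
    ("email", "text"),
]

# How each abstract type is spelled in each target environment (assumed mappings).
SQL_TYPES = {"integer": "INTEGER", "text": "TEXT"}
TS_TYPES = {"integer": "number", "text": "string"}


def to_sql(table, fields):
    """Generate a CREATE TABLE statement from the shared field list."""
    cols = ", ".join(f"{name} {SQL_TYPES[kind]}" for name, kind in fields)
    return f"CREATE TABLE {table} ({cols});"


def to_typescript(name, fields):
    """Generate a TypeScript interface from the same field list."""
    body = " ".join(f"{field}: {TS_TYPES[kind]};" for field, kind in fields)
    return f"interface {name} {{ {body} }}"


print(to_sql("users", USER_FIELDS))
print(to_typescript("User", USER_FIELDS))
```

Adding a field now means touching `USER_FIELDS` once, and both outputs follow along.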
DOCUMENT YOUR CODE! DOCUMENT YOUR CODE! This is probably a line most people have heard from their programming lecturers, supervisors or even the scrum master, but I have often found myself at a loss trying to figure out what exactly I should be documenting, so let’s try to clarify that. First off, if you’re going to document your code, AVOID OVER-DOCUMENTATION. Code that needs heavy commenting is often bad code. Comments that restate what the code does are a second representation of information already in the code, so they are redundant, and every change to the code then means changing the comments too. The basic practice is to let the low-level knowledge live in the code and reserve your comments for the high-level knowledge.
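Here is a small, made-up comparison of the two commenting styles. Both functions are hypothetical; the first one has a comment that just repeats the code, the second keeps the detail in the code and uses the comment for the reasoning behind it.

```python
# Redundant: the comment says exactly what the code already says,
# and it goes stale the moment the loop changes.
def total_redundant(prices):
    # Loop over prices, add each one to total, then return total.
    total = 0
    for price in prices:
        total += price
    return total


# Better: the code carries the low-level detail,
# the comment carries the high-level knowledge.
def total_in_cents(prices_in_cents):
    # Amounts are kept as integer cents so that sums stay exact;
    # floats would accumulate rounding error.
    return sum(prices_in_cents)
```

If the second function is ever rewritten, its comment still holds as long as the design decision does.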
The primary cause of inadvertent duplication is usually a flaw in the design of the system. Developers follow a standard that leads to repeated information, and modules end up affecting the whole system when they break instead of keeping the impact local. This is normally only realized in the middle or at the end of the development timeline. There is not much I can say here other than: make the design of your system as sound as you can, so that the errors that do arise come from minor details that wouldn’t cripple the whole system. One way of doing this is to normalize your databases, which avoids common modification anomalies such as the update and delete anomalies and minimizes redesign when the database is extended, something that is very common in any project.
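As a rough illustration of the normalization point, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `customers` and `orders` tables and their contents are invented for the example. Because the email is stored once on the customer instead of being copied onto every order, a single `UPDATE` is enough and no order can be left with a stale copy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The email lives in exactly one row, not on every order.
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        item TEXT NOT NULL
    );
    INSERT INTO customers (id, email) VALUES (1, 'old@example.com');
    INSERT INTO orders (customer_id, item) VALUES (1, 'keyboard'), (1, 'mouse');
""")

# One update in one place; there is no second copy to forget about.
conn.execute("UPDATE customers SET email = ? WHERE id = ?", ("new@example.com", 1))

rows = conn.execute("""
    SELECT orders.item, customers.email
    FROM orders JOIN customers ON customers.id = orders.customer_id
    ORDER BY orders.id
""").fetchall()
print(rows)
```

In a denormalized design with an `email` column on `orders`, the same change would need one update per order, which is exactly where update anomalies creep in.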
It has been hypothesised that humans are inherently lazy, so it comes as no surprise that one of the major causes of duplication, impatient duplication, comes from one of our most built-in traits. Remember when I mentioned using a common metadata representation instead of duplicating the data every time you define it in a different environment? (That was a mouthful. LOL) Sometimes programmers are in a position to do that but don’t, because duplicating seems easier. DON’T. Lazy programming in this case, and in a lot of other cases, leads to more code than there should be, which defeats the very point of being lazy in the first place. Shortcuts lead to long delays in the future. Ever heard of the Y2K problem?
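A tiny, hypothetical example of resisting the shortcut: instead of typing the same magic number into every function that needs it, name it once. `MAX_USERNAME_LENGTH` and both functions below are made up for illustration.

```python
# Defined once. The impatient version would hard-code 20 in both functions
# below, and sooner or later only one of them would get updated.
MAX_USERNAME_LENGTH = 20  # hypothetical limit


def is_valid_username(name):
    """Accept non-empty names up to the shared length limit."""
    return 0 < len(name) <= MAX_USERNAME_LENGTH


def truncate_username(name):
    """Cut a name down to the same shared limit."""
    return name[:MAX_USERNAME_LENGTH]
```

It costs a few extra seconds today, and changing the limit later is a one-line edit.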
Ah yes, interdeveloper duplication. Personally my most feared type of duplication, and also one we might or might not relate to the most. Do you have a supervisor (or Scrum Master, ahem) who is constantly emphasising communication between development teams at any time of the day? This type of duplication is probably why. For the most part it can be hard for team members to communicate with one another. Sometimes it isn’t even the developers’ fault. Developers can get so invested in their code that they forget to check the team chat group, and some even lose track of time entirely. This type of duplication is arguably the toughest to detect and handle. Here’s an excerpt from the book:
We heard firsthand of a U.S. state whose governmental computer systems were
surveyed for Y2K compliance. The audit turned up more than 10,000 programs, each
containing its own version of Social Security number validation.
Developers at different stages of development might replicate functionality, sometimes in enormous amounts. This is where communication between team members comes in, and it is probably the best way to handle this kind of duplication. Developers on a team should make a point of learning from each other and reading each other’s documentation, to avoid duplicate functionality within the system and exorbitant amounts of time spent maintaining that large system later on.
And that’s it. Hope this was helpful and that you’ll follow better programming practices in the future.
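Once the team agrees on where shared utilities live, the code side of the fix can be as simple as the sketch below. It is only an illustration: the function names are hypothetical, and the check covers just the common written format (three digits, two digits, four digits, separated by hyphens), not the full rules for what counts as a valid number.

```python
import re

# Format check only; assumes the "123-45-6789" written form.
_SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def is_valid_ssn_format(value):
    """The one shared validator that every other module imports."""
    return _SSN_PATTERN.fullmatch(value) is not None


# Two hypothetical callers, possibly written by different developers.
# Neither rolls its own check; both reuse the shared one.
def register_employee(name, ssn):
    if not is_valid_ssn_format(ssn):
        raise ValueError("invalid SSN format")
    return {"name": name, "ssn": ssn}


def file_tax_form(ssn, amount):
    if not is_valid_ssn_format(ssn):
        raise ValueError("invalid SSN format")
    return {"ssn": ssn, "amount": amount}
```

When the validation rule changes, there is one function to fix rather than thousands of private copies to hunt down.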
Posted by Cyrus Muriithi