This article is part of a series on SOA Development and Delivery. (read previous article)
Let’s get started on our SOA Development and Delivery journey by talking about version control. But before we delve right in, let’s take a moment to reflect on this axiom:
Axiom 1: Developing a SOA Composite Application is software development.
When we sit down to create a SOA Application, i.e. composites, user interfaces, services, etc., we are actually embarking on a software development exercise. I think that some people don’t believe (or at least they don’t admit to themselves) that this is the case. Why? Well acknowledging that it is, in fact, software development implies a whole bunch of practices are necessary – like version control and testing for example. And those are hard, right?
Well, they are certainly more work. But its a bit like insurance – it is a cost you accept in the present to offset or prevent a much more significant potential cost/pain in the future. In their quintessential book Continuous Delivery, Dave Farley and Jez Humble say:
“In software, when something is painful, the way to reduce the pain is to do it more frequently, not less.”
Putting in the extra effort upfront will save you from a lot more pain and effort later on. Using version control fits into this category. It can be a bit painful, but it is definitely worth it in the end. Consider this axiom:
Axiom 2: Developing software without version control is like mixing chemicals without reading the labels – sooner or later, it is going to blow up in your face.
Why is this true? Consider the following questions – how would you answer these if you are not using version control?
The answer is, of course, that you can’t answer them. At least not with any level of confidence. I would go so far as to say that if you don’t have your SOA applications under version control, you should stop whatever you are doing, and go set it up right now! Sooner or later you are going to encounter a critical problem in your production environment that you simply cannot fix.
That leads us to one more axiom:
Axiom 3: If it is not in version control, it does not exist.
If you cannot retrieve a previous revision of an artifact, then it may as well not have ever existed at all – it is of approximately the same value to you either way.
Now, what do I mean by ‘using’ version control? Well, it is more than just checking in your changes. That is an excellent start, and it is much better than nothing at all, but really, you should be thinking about a few more advanced ways of using your version control system as well – like branching and tagging for example. We will come back and discuss these in some depth later in this article.
For now, let’s cover some basics.
There are a number excellent version control systems available, both free and commercial. I have used most of them – rcs, SCCS, CVS, Subversion, ClearCase, Perforce, Mercurial, and git to name a few, and even some proprietary ones that exist only inside the organizations that use them or are included in a particular operating system (e.g. OpenVMS) or file system (e.g. zfs).
Today I mostly use Subversion, but I am going through a transition to git. Let’s talk about these two in a little more detail.
Firstly, both of them are free, and both are widely used. That means that there is good tool support, a large community of people creating helpful content about how to use them, and a pretty good chance that anyone you get working on your project will have some level of familiarity with them. They both also have excellent, freely available books to help you get started: Version Control with Subversion and Pro Git.
The other thing that is very useful about these two, specifically in the context of Oracle SOA Suite development, is that they use an atomic commit which means that you can change a number of files and commit all of those changes as a single revision. CVS, for example, does not allow you to do this because it versions each file individually. This capability is great in a SOA environment, as many of the changes we need to make involve making changes to more than one file. Having the project in a state where some, but not all, of those file changes are committed, is not useful.
JDeveloper supports Subversion quite well, although you do need to invest a little time to make sure you know how to drive it. And support for git was added in JDeveloper 220.127.116.11.
There are three key things that are driving my personal migration from Subversion to git:
I personally find the git workflow more complex than the Subversion one, but for me at least, the time has come to make the move. I think that moving to distributed version control is pretty much inevitable in the modern world.
I think the most important thing to say about which version control system is the right one to use is this – any is better than none. If you want me to recommend one, I would have to say git.
Version control is for source artifacts, not derived artifacts like binaries, deployable packages, etc. Source artifacts does not just mean source code – it means anything that is needed to recreate your production environment from scratch. Craig Barr proposed an excellent list in this article which I am quoting here:
Personally, I do not agree with putting your binaries into your version control system. I think that binaries belong in a separate repository, because they have quite different characteristics and management needs. We’ll talk a lot more about this in a future post on binary management, but for now a couple of examples to illustrate the point:
|Tend to be relatively small files||Tend to be relatively large files|
|Tend to change frequently||Tend to never change after they are created|
|Usually we want to keep all revisions||Often we only want to keep important and recent revisions|
|Are created by a person||Are (or at least should be) created by some automated/programmatic process|
|Cannot easily be recreated it they are lost or damaged||Can be easily recreated if they are lost or damaged (assuming you still have the source, etc.)|
So what do you put into version control for SOA, OSB, ADF?
The simplest answer to this question is ‘whatever is left in the project after you have executed the clean action on the project in the IDE.’ Note that you would need to make sure you have disabled automatic builds in eclipse, otherwise it will just go ahead and build again.
We need to be careful about a couple of things here:
The same goes for ADF and OSB projects. Given the flexibility provided by the tools, there is really no ‘one size fits all’ answer to this question, you need to invest the time to work out the correct answer for your own projects.
At the end of the day, there are two ways you can get this wrong:
So if you are limited for time, better to opt for too much than too little.
A question that comes up fairly often, particularly with Subversion, is how to structure the repository.
There are two common approaches here – one is to have a single Subversion repository with all the projects in the same repository. The other is to have one repository per project (or development group, or whatever unit).
There are of course advantages and disadvantages to each approach. Commonly cited issues include: the amount of administration overhead, the time taken to perform backups (which is done by dumping the whole repository), issues with revision numbers (which are shared across the whole repository) and comments (knowing which ones are for a specific project), different security requirements, or code separation requirements for different projects, and different Subversion workflow requirements across projects.
In practice though, I think that these can be handled with approximately the same amount of effort regardless of the approach chosen, assuming a suitably experienced Subversion administrator.
I believe that the one repository approach is easier, and I would recommend taking that approach unless there is some specific reason not to. The most likely reason is that some project team does not want their code stored with another team’s code due to some kind of confidentiality or licensing issue (perceived or real).
So, If you are using Subversion, I would recommend a single Subversion repository, shared by all projects for a SOA environment.
Inside the repository, you should create zero or more levels of directories that you use to organise projects into logical groups, then under these, create a directory for each project (i.e. SOA Application, etc.) and under that create the recommended Subversion trunk, tags, and branches directories. This is also consistent with the approach recommended in Version Control with Subversion. So your repository might look like this:
root - businessUnit1 - project1 - composites - GetCustomerDetails - trunk - tags - branches - ProcessOrder - trunk - tags - branches - ui-projhects - ... - ... - businessUnit2 - ...
With git, I am in the habit of creating a git repository for each project, as that is a more natural way to organise things in git.
Tagging, in the context of a version control system, is essentially making a named copy at a given point in time. (Though more than likely you will just be copying pointers, not all of the content.)
Why would you want to be able to go back to a given point in time? There are two excellent reasons:
So, the first one tells us that you must tag whenever you are going to release. You might also want to tag whenever you reach a significant or meaningful milestone.
The second reason can be addressed with tagging, but you might be better to use a branch in that case. We will talk about branches in a moment.
Often people ask about the relationship between the ‘versions’ in the version control system and the build system, and the runtime versions. It is important to understand that there is not, and need not really be, any kind of direct relationship between them, other than for releases or release candidates.
Normally you are going to be building the latest version of the code – this is sometimes called the ‘head’, or ‘trunk’ or ‘tip’ – and executing tests against that. So you don’t need to know which ‘version’ (‘revision’ is a more accurate word from the point of view of the version control system) it is – you can just refer to it as the latest version. In Subversion, this is done by ending the URL with ‘/trunk‘.
The only other revisions that you are likely to build are the latest versions on a particular branch. Again, this can be done without knowing the revision number.
You would not need to build any tagged/released version again – you could just go and get the binary from the binary repository. Although you could easily build it again if you needed to by referring to the tag name in the URL.
And all of the other revisions are essentially old, discarded points in time that you have moved on from.
So you do need to know which tag relates to which binary version, and the easiest way to do this is to just use the binary version number in the tag. So, for example, you might tag revision 126 as ‘VERSION-2.3‘. If you needed to come back later and look at that version, you could just end your Subversion URL with /tags/VERSION-2.3.
That brings us to branching…
Keeping a log of all the changes over time is all very well and good, but projects don’t always follow a straight line. What happens when you do find the problem in your old version 2.3, the one you have in production, but you have already done a heap of work on version 2.4?
This is one thing that branches are good for. A branch lets you start a parallel stream of work from a given point in time (like a tag, but it can be from any revision). Consider the following diagram:
Here we can see that development of the next version can continue on the trunk, while another team work on fixing the production bug in version 2.3 in the branch. The two have no effect on each other. We can build either the trunk or the branch, depending on what we want to do.
When you finish fixing the bug, let’s say that is with revision 131. You should tag that one too, so you have another tag to go back to in case you find a bug in that ‘fixed’ version.
The other useful thing about branches, is that you can merge them back into the trunk (or any other branch for that matter). This provides an ideal way to isolate potentially dangerous work. If you are trying something new, and you are not sure how it is going to work out, you can do that work in a branch. If it proves to be ok, you can merge the changes in that branch back into the trunk at some point in the future. But if it is not ok, you can just forget the branch and go on as if it never happened.
One really important thing to know is that you can merge backwards. This lets you essentially get rid of a bad commit. This is definitely something that you should learn how to do in your version control system.
Here is another meaningful question – how often should developers commit their changes to the trunk?
Later on we are going to talk about continuous integration. The word ‘integration’ in ‘continuous integration’ refers to the process of merging changes into a trunk. One of the foundation principles of continuous integration is that all developers must commit regularly to the trunk. What does regularly mean? Well no less frequently than once every day.
This drives some desirable behaviors. How much meaningful work can a developer do in a day? Not much right? Right! That’s the whole idea! By forcing developers to work in small iterations, we are minimizing the size of potential integration issues that can occur. Less code to integrate – less serious problems integrating. Less serious problems, easier to fix. Less problems in the codebase at any given point in time – better quality! That is what continuous integration is aiming to achieve.
When we are using continuous integration, we build every time a developer commits to the trunk (or to a branch). But now the question arises – do we really want to build all of the developer’s little intermediate commits, that we know are not going to work anyway, since they are work in progress?
There are two ways we are going to talk about (in future articles) to address this:
At this point, it starts to become difficult to talk about some of the strategies I want to discuss before we have moved forward a few more steps and talked about a few more topics, especially continuous integration . So let’s put version control on hold, and come back to it later.