Hi,

I've been hunting around for a solution to merge two svn repositories containing the same base code but developed at two different locations.

Is there a way to merge them into one repository?

For example, the original repository has a revision number of 30. We made a copy of it, placed it at a remote location and developed there until the revision number reached 40. Can we update the original repository to revision 40?
Updated 20-Feb-12 14:53pm

How to Merge Two SVN Repositories
If you don't care about retaining all the history of one of the repositories, you can just create a new directory under one project's repository, then import the other.

If you care about retaining the history of both, then you can use 'svnadmin dump' to dump one repository, and 'svnadmin load' to load it into the other repository. The revision numbers will be off, but you'll still have the history.

After reading the advice above, which comes from the Subversion FAQ, one might get the impression that merging two SVN repositories is a trivial process. That largely depends on your situation, but it is usually not as streamlined as the quoted text implies.


Tools

The only tool I needed to do this is VisualSVN Server. It is a Subversion server for Windows that includes all the subversion binaries we need.

After installing VisualSVN Server, it is convenient to add its bin folder to the PATH environment variable, since we will be using several of its utilities from the command line.
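To confirm that the PATH change took effect, a quick check along the following lines could be run. This is a minimal sketch that assumes a Bash-like shell is available (for example Git Bash on Windows); the function name missing_tools is made up for illustration.

```shell
# Hypothetical helper: print the name of every given tool that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        # command -v succeeds only if the shell can resolve the name
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Calling missing_tools svnadmin svndumpfilter svnsync svnlook should print nothing when all four utilities can be found.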

Initial Plan

Since we wanted to retain the full history of both repositories, the initial plan was to follow the advice of the Subversion FAQ. However, ‘svnadmin dump’ and ‘svnadmin load’ expect a filesystem path to the repository rather than a URL, so you need access to the servers that host both repositories. That is possible for the repository we want to dump, which is the private repository on our own server. As far as I know, it is not possible for the repository at Google Code, which is the one we want to load the dump into.

To overcome this problem we can mirror the Google Code repository locally and work on this mirror instead. Then we can reset the Google Code repository to revision zero and sync it to this mirror, which would finally contain the result of merging the two repositories. If this seems a little vague to you now, continue reading and more details will come.

Dumping the Private Repository

Since I have RDP access to our server, I can use ‘svnadmin dump’ from a command line to achieve this, like so:

svnadmin dump /path/to/privaterepo > PrivateRepo.dump

This dumps the entire repository to the file PrivateRepo.dump. But here another problem came up: our repository on the server hosts not only the one project we need to dump but also several other projects. The repository structure can be outlined like this:

PrivateRepo
    Ra
    ProjectX
    ProjectY

Since we are only interested in the project called ‘Ra’, we can use another Subversion utility called svndumpfilter to filter out the unneeded projects from the dump file like this:

svndumpfilter include --drop-empty-revs --renumber-revs Ra < PrivateRepo.dump > Ra.dump

This will filter out the unneeded projects from PrivateRepo.dump and save the result to the Ra.dump file which should only include the project ‘Ra’. The optional arguments --drop-empty-revs and --renumber-revs are necessary here to remove any empty revisions resulting from filtering and to appropriately renumber the revisions that are left.
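If the intermediate PrivateRepo.dump file is not needed, the two commands can in principle be chained with a pipe. The following is a minimal sketch, assuming a Bash-like shell; the function name dump_project is hypothetical, while the commands and options are the ones used above.

```shell
# Hypothetical helper: dump one project from a multi-project repository.
# $1 = filesystem path to the repository, $2 = top-level project folder to keep
dump_project() {
    repo_path=$1
    project=$2
    # Stream the full dump straight into the filter instead of saving it first
    svnadmin dump "$repo_path" |
        svndumpfilter include --drop-empty-revs --renumber-revs "$project"
}
```

It would be used as dump_project /path/to/privaterepo Ra > Ra.dump.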

If this works for you, good. However, filtering does not always work as expected. svndumpfilter can fail when a path you include was copied from a path you exclude, which is what happened in my case, with an error similar to this:

svndumpfilter: Invalid copy source path '/ProjectX'

My solution was to mirror (synchronize) project ‘Ra’ to its own dedicated local repository on my machine, and then dump that mirror instead.



Synchronizing Two Repositories

This can be done in the following steps:

1. Using VisualSVN Server Manager, create a new user. Let’s assume username: user1 and password: user1_pass



2. Right-click Repositories and create a new repository. We will call it ‘RaMirror’. It is important here to keep ‘Create default structure (trunk, branches, tags)’ unchecked since we want this repository to remain at revision zero in order to be able to sync it.



3. Right-click the repository ‘RaMirror’ and click on Properties. Here you should make sure that the user we have created in step one has read and write access to this repository.

4. Before we can sync the two repositories ‘Ra’ and ‘RaMirror’ we need to edit the hook scripts for ‘RaMirror’. If you installed VisualSVN Server accepting all defaults, the folder that contains the repository files for ‘RaMirror’ will usually be ‘C:\Repositories\RaMirror’. Under the hooks folder, you will find the default hook scripts.

These are Unix shell scripts. You can of course modify them to work on Windows if you want, but in our situation we don’t really need to. We can just rename all hook files to use the ‘cmd’ or ‘bat’ file extension instead of the default ‘tmpl’ extension, which makes them executable on Windows.

And for all the post scripts:

post-commit.cmd
post-lock.cmd
post-revprop-change.cmd
post-unlock.cmd

We can pretty much clear their contents, because they are mostly used to send notification emails after the corresponding action takes place. As for the other scripts:

pre-commit.cmd
pre-lock.cmd
pre-revprop-change.cmd
pre-unlock.cmd
start-commit.cmd

These perform checks to make sure that the provided user credentials are allowed to perform commits, change revision properties, lock/unlock files, and so on.

Since we created this repository, RaMirror, locally and have full privileges and read/write access, we can replace their contents with a hard-coded success exit code, like so:

exit 0
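The hook changes in step 4 can also be scripted. The sketch below assumes a Bash-like shell on the Windows machine (for example Git Bash) and the .cmd naming used above; write_permissive_hooks is a hypothetical name. Of these hooks, pre-revprop-change is the one svnsync itself relies on, because it records its state in revision properties of the destination repository.

```shell
# Hypothetical helper: create hook files that always report success.
# $1 = filesystem path to the repository (the folder that contains "hooks")
# Note: this overwrites any hook of the same name that already exists.
write_permissive_hooks() {
    hooks_dir="$1/hooks"
    for hook in start-commit pre-commit pre-lock pre-unlock pre-revprop-change; do
        # Each hook consists of a single line that exits with code 0
        printf 'exit 0\n' > "$hooks_dir/$hook.cmd"
    done
}
```

For the repository in this walkthrough, the argument would be the RaMirror folder under the repositories root.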

5. We are now ready to sync the repository ‘RaMirror’ with the project that we want to dump, which is project ‘Ra’. To achieve this we will use another subversion utility called svnsync in two steps:

Firstly, we initialize the syncing process, like so:

svnsync init --source-username src_user1 --source-password src_user1_pass --sync-username user1 --sync-password user1_pass file:///Repositories/RaMirror svn://ra-ajax.org/Ra

Here we are using svnsync with the init subcommand. We are providing credentials using --source-username, --source-password for the source repository that we want to mirror, and --sync-username, --sync-password for the destination repository which is ‘RaMirror’. Then we provide the URL of the destination repository and the URL of the source repository respectively.

Secondly, we start the actual synchronization process using svnsync with the sync subcommand:

svnsync sync --source-username src_user1 --source-password src_user1_pass --sync-username user1 --sync-password user1_pass file:///Repositories/RaMirror

Here we only need to provide the URL of the destination repository. After this finishes successfully you should have a mirror of the source subversion repository with all its history and revisions. And now we can dump this mirror to a dump file:

svnadmin dump /Repositories/RaMirror > Ra.dump

This will dump all revisions to Ra.dump. However, the first revision just adds the same files and directories that already exist in the repository we want to load this dump file into, so we need to start at the first revision that contains actual changes. Assuming that this is revision 2 and that the last revision in the repository is 100, the command we should actually use is:

svnadmin dump --incremental -r 2:100 /Repositories/RaMirror > Ra.dump

Note that we also pass the --incremental option so that the first dumped revision, 2 in our case, describes only the changes made in that revision and not everything that existed in the repository as of that revision.
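Rather than looking up the last revision by hand, it could be queried with svnlook youngest, which prints the newest revision number of a repository. This is a sketch only, assuming a Bash-like shell; dump_from_revision is a hypothetical name.

```shell
# Hypothetical helper: incrementally dump everything from a given revision
# up to the newest one.
# $1 = filesystem path to the repository, $2 = first revision to include
dump_from_revision() {
    repo_path=$1
    first_rev=$2
    # Ask the repository for its newest revision number
    last_rev=$(svnlook youngest "$repo_path") || return 1
    svnadmin dump --incremental -r "$first_rev:$last_rev" "$repo_path"
}
```

For the example above this would be called as dump_from_revision /Repositories/RaMirror 2 > Ra.dump.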

I also similarly mirrored the repository at Google Code to a local SVN repository and named it ‘RaGMirror’.
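Because the same init and sync steps are needed for both mirrors, they could be wrapped in a small function. This is a sketch under assumptions: a Bash-like shell, and credentials left out for brevity (the --source-username, --source-password, --sync-username and --sync-password options shown earlier can be added where needed). The name mirror_repository is hypothetical.

```shell
# Hypothetical helper: initialize a mirror and copy all revisions into it.
# $1 = URL of the empty destination repository, $2 = URL of the source
mirror_repository() {
    dest_url=$1
    source_url=$2
    # svnsync takes the destination first, then the source
    svnsync init "$dest_url" "$source_url" &&
        svnsync sync "$dest_url"
}
```

For the first mirror this would be mirror_repository file:///Repositories/RaMirror svn://ra-ajax.org/Ra.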

Final Steps

Since the name of the project we mirrored is ‘Ra’, the paths of the files and folders in Ra.dump that reside in the root of the project are prefixed with a ‘Ra’ folder. We need these files to be created at the root when we load this dump file, not under a subfolder, so we need to do some editing. You can read more about this here.

I used Notepad++ to search for all instances of ‘Node-path: Ra/’ and replaced them all with ‘Node-path: ’. We also need to search for the section that creates the ‘Ra’ subfolder and remove it. It would look like this:

Node-path: Ra
Node-kind: dir
Node-action: add
Prop-content-length: some_number
Content-length: some_number
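The search-and-replace part of this edit could also be scripted with sed. This is only a sketch under several assumptions: a sed that passes binary data through unchanged, a history without copies (copy sources appear in separate Node-copyfrom-path lines, which this does not touch), and no file content that happens to contain a line starting with ‘Node-path: Ra/’. It does not remove the block that creates the ‘Ra’ folder, so that part still has to be done by hand. The name strip_ra_prefix is hypothetical.

```shell
# Hypothetical helper: read a dump on stdin and write it to stdout with the
# leading "Ra/" removed from every Node-path header.
strip_ra_prefix() {
    # LC_ALL=C makes sed treat the input as plain bytes
    LC_ALL=C sed -e 's|^Node-path: Ra/|Node-path: |'
}
```

It would be used as strip_ra_prefix < Ra.dump > Ra.edited.dump, where Ra.edited.dump is an arbitrary output name.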

Then we can save the dump file and start to load it into the trunk of our local ‘RaGMirror’ repository:

svnadmin load --parent-dir trunk /Repositories/RaGMirror < Ra.dump

After the loading process is finished successfully, the ‘RaGMirror’ repository would contain all revisions and full history of the two repositories that we wanted to merge.
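After a load like this, svnadmin verify can be used to check the integrity of the repository. The sketch below combines the two steps, assuming a Bash-like shell; load_and_verify is a hypothetical name.

```shell
# Hypothetical helper: load a dump file under a parent directory, then verify.
# $1 = filesystem path to the target repository
# $2 = existing directory in the repository to load under
# $3 = dump file to load
load_and_verify() {
    repo_path=$1
    parent_dir=$2
    dump_file=$3
    # Verification runs only if the load itself succeeded
    svnadmin load --parent-dir "$parent_dir" "$repo_path" < "$dump_file" &&
        svnadmin verify "$repo_path"
}
```

For the example above this would be load_and_verify /Repositories/RaGMirror trunk Ra.dump.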

The final step is to sync this local repository ‘RaGMirror’ back to the public Google Code repository. This works like step five in the previous section, with the credentials and the source and destination URLs changed accordingly. However, before this can be done, the repository at Google Code must be reset to revision 0.

Be careful, as many things can go wrong. Be sure to have backups of every repository you are about to change, and use the information provided here at your own risk. The image below shows the two revisions where both repositories merged seamlessly.
 
 
Assuming that the existing repositories have a structure like:

repository root
    branches
    tags
    trunk

and you want a structure something like:

repository root
    projectA
        branches
        tags
        trunk
    projectB
        branches
        tags
        trunk

Then for each of your project repositories:

svnadmin dump <filesystem path to project<n> repos> > project<n>.dmp

Then for each of the dump files:

svnadmin load --parent-dir "project<n>" <filesystem path to repos> < project<n>.dmp

The directory named by --parent-dir must already exist in the target repository, so create it first (for example with svn mkdir) if needed.

More complex manipulations are possible, but this is the simplest and most straightforward. Changing the source repository structure during a dump/load is hazardous, but doable through a combination of svnadmin dump, svndumpfilter, hand-editing or additional text filters, and svnadmin load.
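The per-project steps could be looped over in a script. This is a minimal sketch under stated assumptions: a Bash-like shell, Unix-style absolute paths (so that file:// followed by the path forms a valid URL), a target repository that already exists, and a project folder name taken from the last component of each source repository path. The name merge_project_repos is hypothetical.

```shell
# Hypothetical helper: load several repositories into one, each under its
# own top-level folder.
# $1 = filesystem path to the combined repository
# remaining arguments = filesystem paths to the source repositories
merge_project_repos() {
    target_repo=$1
    shift
    for source_repo in "$@"; do
        project=$(basename "$source_repo")
        # The parent folder has to exist before loading into it
        svn mkdir -m "Create root for $project" "file://$target_repo/$project" || return 1
        # Without pipefail, the status checked here is that of svnadmin load
        svnadmin dump "$source_repo" |
            svnadmin load --parent-dir "$project" "$target_repo" || return 1
    done
}
```

A call such as merge_project_repos /srv/svn/combined /srv/svn/projectA /srv/svn/projectB (all three paths are placeholders) would produce the projectA and projectB layout described above.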

Dealing with a third party provider

Request svnadmin dump files for each of your repositories. The provider should be willing/able to provide this - it is your code!
Create an SVN repository locally.
Perform the actions listed above for the dump files.
Verify the repository structure is correct with your favorite client.
Create a dump file for the combined repositories.
Request that the provider populate a new repository from this dump file.
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


