Latest articles

The Better "SVN -> Git" Guide

Converting a Subversion repository to Git can be a pain since there's no simple tool that automates the whole process. Worse, most guides and scripts out there completely ignore a couple basic differences between SVN and Git:

  • Unlike Git, SVN supports empty directories, and many SVN projects rely on them. They need to be preserved with placeholder files.
  • SVN repos don't usually get cloned, so they're more likely to contain large binary files. For example, I used to be in the habit of keeping my precompiled binaries in the repository in addition to the source code. Since Git repos are always cloned, it's good to prune out such files when converting.

I've recently gone through the pains of converting a few of my SVN projects to Git. So here are my notes on how to do it (just as much for my own reference as for anyone else).

One important note that Windows users (like me) aren't going to like to hear: since Git is heavily Linux-oriented and all the needed scripts are Linux shell scripts, you'll have to do this on a Linux system. (The "Git Bash" that comes with Git on Windows might be sufficient, but I haven't tried it.) If you don't have a Linux machine and Git Bash gives you problems, I recommend installing Linux into a VM using Oracle's free VirtualBox.

Also, you will need at least Git v1.7.7 (and also git-svn). Anything older than v1.7.7 lacks the --preserve-empty-dirs switch we'll be using. You can check your version of Git with:

$ git --version

If you need to upgrade it and you're on a system that uses apt-get, remember: with apt-get, you upgrade a single program with the install command, not the upgrade command. I.e.:

$ sudo apt-get install git git-svn
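The version check above can also be scripted. This is only a rough sketch: the helper name version_ge is made up for illustration, and it assumes a sort that supports -V (GNU coreutils does).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is >= version $2.
version_ge() {
    # sort -V orders version strings component by component, so if the
    # smaller of the two turns out to be $2, then $1 >= $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Only runs the check if git is actually installed.
if command -v git >/dev/null 2>&1; then
    # "git version 2.39.2" -> third word is the version number.
    have=$(git --version | awk '{print $3}')
    if version_ge "$have" 1.7.7; then
        echo "Git $have: --preserve-empty-dirs should be available"
    else
        echo "Git $have is older than 1.7.7; upgrade first" >&2
    fi
fi
```

A plain string comparison would get this wrong (1.7.10 sorts before 1.7.7 alphabetically), which is the reason for sort -V.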

1. Copy SVN Repo to Local System

We'll be creating a lot of files and directories, so we should work in a clean directory:

$ mkdir my-proj-convert-vcs
$ cd my-proj-convert-vcs

The SVN repository needs to be copied to your local system if it isn't already there. If your only way of accessing the repo is through SVN itself, you can do it like this:

$ svnadmin create my-local-svn-repo
$ cd my-local-svn-repo
$ echo '#!/bin/sh' > hooks/pre-revprop-change
$ chmod +x hooks/pre-revprop-change
$ svnsync init file://`pwd` https://url_to_svn_repo
$ svnsync sync file://`pwd`
$ cd ..

That svnsync sync... command may take a while as it downloads each revision in order.

2. Prune the Repo

If you don't have any big binary files (or anything else) that you want pruned out of the repo, you can skip this step.

First, dump the SVN repo:

$ svnadmin dump my-local-svn-repo > my-local-svn-repo.dump

Subversion has an official svndumpfilter tool for removing content from a dumped repo, but it's known to be crap. It didn't even work at all for me. Instead, you should use the vastly superior svndumpsanitizer.

Download, extract and compile svndumpsanitizer:

$ wget http://miria.linuxmaniac.net/svndumpsanitizer/svndumpsanitizer-0.8.4.tar.bz2
$ tar xvjf svndumpsanitizer-0.8.4.tar.bz2
$ gcc svndumpsanitizer-0.8.4/svndumpsanitizer.c -o svndumpsanitizer

Now, create a little script to run svndumpsanitizer. Depending on whether you're on a KDE-based, GNOME-based, or text-only system:

$ kate prune-repo.sh &
or
$ gedit prune-repo.sh &
or
$ pico prune-repo.sh

Enter something like this (note that svndumpsanitizer doesn't support wildcards):

#!/bin/sh
./svndumpsanitizer --infile my-local-svn-repo.dump --outfile my-pruned-svn-repo.dump \
    --exclude trunk/bin/myApp1 \
    --exclude trunk/bin/myApp1.exe \
    --exclude trunk/bin/myApp2 \
    --exclude trunk/bin/myApp2.exe \
    --exclude branches/fooBranch/myApp1 \
    --exclude branches/fooBranch/myApp1.exe \
    --exclude branches/fooBranch/myApp2 \
    --exclude branches/fooBranch/myApp2.exe

Save that, and then back at the command prompt, run it:

$ chmod +x prune-repo.sh
$ ./prune-repo.sh
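Since svndumpsanitizer takes every path literally, a long exclude list gets tedious to maintain inside the script. One possible approach, sketched here, is to keep the paths in a plain text file and generate the --exclude arguments from it. The file name exclude-paths.txt and the helper exclude_args are hypothetical, and the sketch assumes no path contains whitespace.

```shell
#!/bin/sh
# Hypothetical helper: print "--exclude" and a path, one word per line,
# for every non-blank, non-comment line of the list file given as $1.
exclude_args() {
    while IFS= read -r path || [ -n "$path" ]; do
        case "$path" in
            ''|'#'*) continue ;;   # skip blank lines and "#" comments
        esac
        printf '%s\n%s\n' --exclude "$path"
    done < "$1"
}

# Possible use inside prune-repo.sh (unquoted on purpose so the shell
# splits the output into separate arguments):
#
#   ./svndumpsanitizer --infile my-local-svn-repo.dump \
#       --outfile my-pruned-svn-repo.dump \
#       $(exclude_args exclude-paths.txt)
```

The only svndumpsanitizer flags used are the ones already shown in the script above; everything else is ordinary shell.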

Double-check that it actually pruned the files by comparing the file sizes and making sure the pruned version is indeed smaller:

$ ls -l
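If you'd rather not eyeball the ls output, the size comparison can be done in a few lines of shell. This is a minimal sketch; is_smaller is a made-up helper name, and the dump file names are the ones used earlier in this guide.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if file $2 is strictly smaller,
# in bytes, than file $1.
is_smaller() {
    # $(( ... )) strips the padding some wc implementations print.
    orig_size=$(( $(wc -c < "$1") ))
    new_size=$(( $(wc -c < "$2") ))
    [ "$new_size" -lt "$orig_size" ]
}

# Only runs when both dump files from this guide are present.
if [ -f my-local-svn-repo.dump ] && [ -f my-pruned-svn-repo.dump ]; then
    if is_smaller my-local-svn-repo.dump my-pruned-svn-repo.dump; then
        echo "OK: pruned dump is smaller"
    else
        echo "WARNING: pruned dump is not smaller" >&2
    fi
fi
```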

Now, we create our newly-pruned SVN repo:

$ svnadmin create my-pruned-svn-repo
$ svnadmin load --ignore-uuid my-pruned-svn-repo < my-pruned-svn-repo.dump

That last command may take a while. It creates a new SVN repository one commit at a time.

3. Convert the Authors

Make an empty checkout of any of your project's SVN repos. The original SVN repo works just as well as any:

$ svn co --depth empty https://url_to_svn_repo my-working-copy

Create this file and name it svn-authors.sh :

#!/usr/bin/env bash
svn log -q | \
    awk -F '|' \
    '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | \
    sort -u

Make it executable, and run it on your checked out working copy:

$ chmod +x svn-authors.sh
$ cd my-working-copy
$ ../svn-authors.sh > ../my-repo-authors.txt
$ cd ..

The file my-repo-authors.txt now contains a list of all the authors who have committed to the repo. It looks like this:

User1 = User1 <User1>
User2 = User2 <User2>
User3 = User3 <User3>

Edit that file, changing the right-hand side to each user's name and email for Git, in the form Full Name <email@address>. Don't change the left-hand side: those are the SVN user names.
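With more than a handful of committers, it's easy to miss an entry while editing. As a rough sketch, the following lists any SVN user whose right-hand side is still the placeholder that svn-authors.sh generates (the user name repeated, then the user name again in angle brackets). The helper name unedited_authors is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the SVN user name of every line in the
# authors file ($1) whose right-hand side is still "name <name>".
unedited_authors() {
    # Split each line on " = "; $1 is the SVN user, $2 the Git identity.
    awk -F ' = ' '$2 == $1 " <" $1 ">" { print $1 }' "$1"
}

# Possible use:
#
#   unedited_authors my-repo-authors.txt
#
# No output means every entry has been changed from its placeholder.
```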

4. Fix git-svn

This part is a bit of an annoyance. From v1.7.7 onward, git-svn has a --preserve-empty-dirs switch. Problem is, the damn thing's broken. If you try to use it as-is, the whole operation will likely just fail partway through. It has to be fixed.

First, find your git-svn file:

$ find / 2> /dev/null | grep git-svn

For me, it was at /usr/libexec/git-core/git-svn. Open it in your favorite editor:

$ sudo [your favorite editor] /path/to/git-svn

Now, in this git-svn file, search for: die "Failed to strip path (it should be somewhere near line 4583). Change the die to print and save. Your git-svn is now fixed.
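If you'd rather not edit the file by hand, the same one-word change can be made with sed. This is only a sketch of that idea: the helper name patch_git_svn is hypothetical, it assumes the line in your copy of git-svn really does begin with the text quoted above, and it keeps a .bak copy in case it doesn't do what you expect.

```shell
#!/bin/sh
# Hypothetical helper: in the file given as $1, turn the fatal
#   die "Failed to strip path ...
# into a non-fatal print, leaving the rest of the line untouched.
# The original file is kept alongside it with a .bak suffix.
patch_git_svn() {
    sed -i.bak 's/die "Failed to strip path/print "Failed to strip path/' "$1"
}

# Possible use (as root, since git-svn usually lives in a system directory):
#
#   patch_git_svn /path/to/git-svn
#
# Afterwards, compare the two files to confirm exactly one line changed:
#
#   diff /path/to/git-svn.bak /path/to/git-svn
```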

5. Convert to Git

As you may have already guessed, we're going to use git-svn. For very large repos (ex: ten or so thousand commits, hundreds of branches/tags, and thousands of files) git-svn has been known to take forever and then crap out. Allegedly, such repos can be converted quickly with svn-fe and git-fast-import, but good luck actually figuring out how to do it without screwing up your branches, tags, and empty dirs. Personally, I just gave up. This git-svn method may not be suitable for such huge repos, but at least it's actually feasible for mere mortals.

The exact flags to use depend on the structure of your SVN repo. If your repo uses the standard SVN trunk/branches/tags layout, then the proper command is:

$ git svn clone file://`pwd`/my-pruned-svn-repo --preserve-empty-dirs \
    --placeholder-filename=.stupidgit --authors-file=my-repo-authors.txt \
    --stdlayout my-temp-git-repo

The traditional name for the empty-directory-preserving placeholder file is .gitignore (and that's the default), but I think .stupidgit is much more appropriate (and satisfying).

Note that the above command is equivalent to:

$ git svn clone file://`pwd`/my-pruned-svn-repo --preserve-empty-dirs \
    --placeholder-filename=.stupidgit --authors-file=my-repo-authors.txt \
    --trunk=trunk --branches=branches --tags=tags \
    my-temp-git-repo

So if your SVN repo uses a non-standard layout for trunk/branches/tags, you handle it like this:

$ git svn clone file://`pwd`/my-pruned-svn-repo --preserve-empty-dirs \
    --placeholder-filename=.stupidgit --authors-file=my-repo-authors.txt \
    --trunk=whatever/trunk/path --branches=whatever/branches/path \
    --tags=whatever/tags/path my-temp-git-repo

Even though we now have a Git repo, we're still not done yet.

6. Clean Up the Mess Git Left Behind

First, we'll convert the ignore list, since Git didn't bother to do that automatically:

$ cd my-temp-git-repo
$ git svn show-ignore > .gitignore
$ git add .gitignore
$ git commit -m 'Convert svn:ignore properties to .gitignore.'

Even though Git was able to insert dummy files to preserve your empty directories, it was still too dumb to know when to actually get rid of them. So now you likely have a bunch of useless old directories that had already been deleted in SVN which Git wasn't intelligent enough to mimic the removal of. These directories are being held in existence by the .stupidgit placeholder files. You may also have unneeded .stupidgit files in directories that already have other files. So while some of your .stupidgit files are holding legitimate empty directories in existence, we need to remove the rest of them from version control. For each of these useless .stupidgit files, run:

$ git rm path/to/useless/placeholder/.stupidgit

Once you've gotten them all (but none of the ones you legitimately want to keep!), commit the changes:

$ git commit -m 'Remove superfluous .stupidgit files.'
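Hunting for candidates for that git rm step by hand gets old fast. The sketch below lists only the easy case: .stupidgit files sitting in a directory that also contains something else. Placeholders that keep an already-deleted directory alive still need a human decision. The helper name redundant_placeholders is made up, and the placeholder name is the one passed to --placeholder-filename earlier.

```shell
#!/bin/sh
# Hypothetical helper: print every .stupidgit file under $1 (default: the
# current directory) whose directory contains at least one other entry.
redundant_placeholders() {
    # Skip the .git directory itself; look only at regular files.
    find "${1:-.}" -name .git -prune -o -type f -name .stupidgit -print |
    while IFS= read -r placeholder; do
        dir=$(dirname "$placeholder")
        # Count directory entries other than the placeholder.
        others=$(( $(ls -A "$dir" | grep -c -v -x -F '.stupidgit') ))
        if [ "$others" -gt 0 ]; then
            printf '%s\n' "$placeholder"
        fi
    done
}

# Possible use, from inside my-temp-git-repo:
#
#   redundant_placeholders .
#
# Review the list before feeding any of it to git rm.
```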

Now we'll create a new bare Git repository (ie, a repository without a working copy):

$ cd ..
$ git init --bare my-bare-git-repo.git
$ cd my-bare-git-repo.git
$ git symbolic-ref HEAD refs/heads/trunk
$ cd ../my-temp-git-repo
$ git remote add bare ../my-bare-git-repo.git
$ git config remote.bare.push 'refs/remotes/*:refs/heads/*'
$ git push bare
$ cd ../my-bare-git-repo.git
$ git branch -m trunk master
$ cd ..
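The push refspec 'refs/remotes/*:refs/heads/*' is the part that's easy to gloss over: it copies every ref under refs/remotes/ to the same name under refs/heads/ in the bare repo. The sketch below doesn't call Git at all; it just mimics that renaming with sed (map_ref is a made-up helper) to show why the SVN tags land in the bare repo as branches named tags/something.

```shell
#!/bin/sh
# Hypothetical helper: show where a ref would end up under the refspec
# refs/remotes/*:refs/heads/* by rewriting the prefix.
map_ref() {
    printf '%s\n' "$1" | sed 's|^refs/remotes/|refs/heads/|'
}

map_ref refs/remotes/trunk          # prints refs/heads/trunk
map_ref refs/remotes/fooBranch      # prints refs/heads/fooBranch
map_ref refs/remotes/tags/v1.0      # prints refs/heads/tags/v1.0
```

If your git-svn stores its refs under an extra prefix (newer versions may default to origin/), the left side of the refspec would need to include that prefix for the names to come out the same.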

At this point, you can delete the temporary Git repo if you want:

$ rm -rf my-temp-git-repo

Create a script to convert the tags from Git branches into actual Git tags:

$ [your favorite editor] clean-tags.sh

Enter the following:

#!/bin/sh
git for-each-ref --format='%(refname)' refs/heads/tags | cut -d / -f 4 | while read ref
do
    git tag "$ref" "refs/heads/tags/$ref"
    git branch -D "tags/$ref"
done

Save, then exit back to the command line and run it on the bare Git repo:

$ chmod +x clean-tags.sh
$ cd my-bare-git-repo.git
$ ../clean-tags.sh
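For anyone wondering what the cut in clean-tags.sh is doing: it splits each ref name on "/" and keeps the fourth field, which is the tag name. The sketch below shows that, next to an alternative that strips the fixed prefix instead. Both helper names are hypothetical. The difference only matters if a tag name itself contains a slash, in which case the cut version keeps just the first piece.

```shell
#!/bin/sh
# What clean-tags.sh does: split on "/" and keep field 4.
tag_via_cut() {
    printf '%s\n' "$1" | cut -d / -f 4
}

# Alternative: remove the fixed prefix and keep everything after it.
tag_via_prefix() {
    printf '%s\n' "${1#refs/heads/tags/}"
}

tag_via_cut    refs/heads/tags/v1.0           # prints v1.0
tag_via_prefix refs/heads/tags/v1.0           # prints v1.0
tag_via_cut    refs/heads/tags/release/1.0    # prints release
tag_via_prefix refs/heads/tags/release/1.0    # prints release/1.0
```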

Finally, you're done! You can copy your my-bare-git-repo.git to whatever computer you want, clone from it, push it to BitBucket, etc.



Ancient: It's Not What You Think

Normal people are already well aware of this, but those big in computers apparently need to be told:

"Ancient: Of or in time long past, especially before the end of the Western Roman Empire"

The important thing to be aware of here is that "two years", or even "two decades", fails to qualify as "ancient" by a wide margin.



Goldie Parsing System v0.7 - API, 64-bit, Git

Goldie v0.7 is now released.

Goldie is a series of open-source parsing tools, including an optional D programming language library called GoldieLib. Goldie is compatible with GOLD Parser Builder and can be used either together with it, or as an alternative to it.

In this version:

(Tested to work on: DMD 2.052 - DMD 2.056, and partially DMD 2.057 as described below - but see this note regarding RDMD)



A Professional Coder With A Dynamic Language...

...is like a professional racer with automatic transmission.



Stupid Git

Jesus christ, I just want to shoot Git in the face.

I wish Git had a face so I could shoot it in the face.

I wouldn't even be trying to use the stupid thing if TortoiseGit wasn't so damn much better than TortoiseHg.

Of course, with Hg I'd be having all that fun dealing with Python's separate incompatible runtimes bullshit.

Ggrrrraaaaaaaa!!



Fuck SOPA *and* The Moronic SOPA Blackout Day

SOPA/PIPA is obviously Orwellian idiocy. In fact, it's so moronic, that even most of its former supporters ended up abandoning it. It's dying, it's been dying, and chances are it'll be dead by the end of the month.

And yet, we're having a big internet blackout day to stop it. But, let's take a moment to fully appreciate how this is supposed to work: Today, all the sites that are opposed to this fleeting carcass of a bill are supposed to shut down thereby forcing their users towards...the sites of those few companies who would embrace the Orwellian SOPA/PIPA? Umm...ok?

Alright, suppose I'm wrong and these fascist bills do manage to pass - just as the corrupt douchebag politicians passed the DMCA years ago. Am I really supposed to believe that's immutably "Game Over: They Win"? Well it will be if Americans repeat their reaction to the DMCA; that is, to roll over and say "Oh, ok, a bunch of out-of-touch politicians said this is so, so that's just how it is and we'd all better comply."

Jesus fuck, has nobody learned anything from Henry David Thoreau?

When an unjust law is passed, especially by corrupt out-of-touch lawmakers, you have a moral obligation to disregard it, if not flagrantly break it.

(I seem to remember the Germans rolling over for unjust leaders about, oh, 70-some years ago. Look how wonderfully their subservience turned out. Gee, heaven forbid anyone should have enough of a moral compass to rock the boat.)

If people (especially those in justice or law enforcement) aren't demonstrably willing to stand up and take a big steaming piss on unjust already-passed laws, then crap like SOPA/PIPA/DMCA/etc. are only going to keep popping up. Eventually, some of that shit flung against the wall is going to stick. And no "awareness day" stunt, no matter how big or small, is ever going to do a damn thing to stop that pattern.

This is still supposed to be a country run by the people, not by "the corporate entities" or even the lawmakers. So don't go pulling another DMCA: No matter what happens with SOPA/PIPA, have the balls - and the basic integrity - to exercise the power that you are constitutionally entitled to.

And as far as the ill-conceived SOPA blackout day: Fuck, I have better things to do today than deal with self-crippled sites and play "follow the leader".
