Bzrofs uses the FUSE (http://fuse.sourceforge.net/) Linux/BSD kernel extension to export a bzr (http://bazaar-vcs.org/) version repository as a file system. The primary use of bzrofs is to make versions visible to tools that are not version-aware.
The FUSE mountpoint exports the working directory as a read-write file system. In addition, two views of the repository are overlaid as read-only directories.
The synchronic repository view consists of revision snapshots. With bzr, a revision applies to the whole repository instead of an individual file. In this view, a snapshot appears as a directory containing files.
The diachronic repository view consists of file histories. In this view each file appears as a directory containing a series of versions.
This project is as much an exploration of FUSE services as it is of version control. It has been tested on Linux and lightly tested on Mac OS X (10.5). There is extensive debugging output when run in the foreground with the -f option.
By default, bzrofs exports its working directory, which must contain a bzr repository, usually found in a hidden subdirectory ".bzr". Repository versions and history are accessed through directories whose names begin with "#" and "##".
These and other defaults can be changed with the following options.
--dir=DIRECTORY
Exports DIRECTORY instead of the current directory. This is necessary when running bzrofs as a server (without FUSE's -f option).
--bzr=EXECUTABLE
Uses EXECUTABLE as the bzr command to access the repository. By default this is "/usr/bin/bzr" on Linux and "/usr/local/bin/bzr" on OS X.
--cache=SIZE
Caches the results of SIZE bzr commands. These include directory listings, version logs, and file revisions. The default SIZE is 32.
--tag=CHARACTER
Changes the special directory character from "#" to CHARACTER.
The options listed above may be followed by the standard FUSE options. Among the most useful are:
-f
Runs in foreground, with debug output directed to stderr.
-s
Runs single-threaded. By default, FUSE uses pthreads to overlap file system operations. Note that bzrofs single-threads all repository access.
-d
Produces detailed debug output for all file operations.
The final argument is the FUSE mount point.
This project is itself developed using bzr version control. Assuming it is in a directory ~/bzrofs, it can be mounted on ~/mnt with the command
bzrofs --dir=$HOME/bzrofs ~/mnt
Example Working Files
This README file is then visible as the read-write file
~/mnt/README.html
Working files are simply a reflection of the working directory. They can be added, deleted, edited, compiled, linked, and executed.
Example Snapshot
In addition, the most recent committed version is available read-only as
~/mnt/#-1/README.html
which can be abbreviated to
~/mnt/#/README.html
The previous committed version is
~/mnt/#-2/README.html
The README.html file didn't exist in the first committed version, so there is no
~/mnt/#1/README.html
Any bzr revisionspec without / can follow the "#", allowing names such as
~/mnt/#100/README.html
for the file as of revision 100, or
~/mnt/#tag:sourceforge-1.0/README.html
for the file as tagged for the SourceForge release.
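The naming rule above can be summarized in a few lines. The following Python sketch is only an illustration and is not taken from the bzrofs source; the function name snapshot_revspec and its tag parameter are hypothetical. It maps a snapshot directory name to the revision spec that would be handed to bzr, assuming that a bare "#" stands for "-1" and that the tag is a single character.

```python
def snapshot_revspec(name, tag="#"):
    """Return the bzr revision spec for a snapshot directory name, or None.

    Hypothetical helper: "#" alone means the most recent commit ("-1"),
    and anything after the tag character is taken as the revision spec.
    """
    if not name.startswith(tag) or name.startswith(tag * 2):
        return None          # a working file or a "##" history directory
    spec = name[len(tag):]
    if "/" in spec:
        return None          # a single path component cannot contain "/"
    return spec or "-1"      # a bare "#" abbreviates "#-1"


for name in ("#", "#-2", "#100", "#tag:sourceforge-1.0", "README.html"):
    print(name, "->", snapshot_revspec(name))
```

Names that do not start with the tag are ordinary working files, so the helper returns None for them.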
Because bzr revisions apply to all files in the repository,
~/mnt/#/
presents a complete directory tree of all currently committed files. Snapshot files can be opened in read-only editors and are available to common file system tools such as diff. For example, uncommitted changes can be located with the command
diff -r ~/mnt/# ~/mnt
Example History
Where a snapshot presents a single version of the entire repository, a history presents a directory of versions of a single file. Only those versions where the file was updated are included. Versions are numbered with the snapshot name, padded with leading zeros so that they will sort correctly in a directory listing.
~/mnt/##/README.html
lists files including the first committed version,
~/mnt/##/README.html/#098
through a recent version,
~/mnt/##/README.html/#130
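The effect of the zero padding can be seen in a small Python sketch. It is an illustration under stated assumptions, not the bzrofs implementation: the function name history_names is hypothetical, and the pad width is assumed to be the number of digits in the largest revision number.

```python
def history_names(revnos, tag="#"):
    """Name each revision with the tag and a zero-padded number.

    Assumption: the pad width is the digit count of the largest revision.
    The list of revision numbers must not be empty.
    """
    width = len(str(max(revnos)))
    return [f"{tag}{n:0{width}d}" for n in revnos]


unpadded = sorted(f"#{n}" for n in (98, 130))
padded = sorted(history_names([98, 130]))
print(unpadded)  # without padding, "#130" sorts before "#98"
print(padded)    # with padding, "#098" sorts before "#130"
```

A plain string sort, which is what a directory listing uses, then agrees with the numeric order of the revisions.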
File versions in a history are the same as file versions in a snapshot -- they are available read-only to standard tools. For example, the growth of the file can be seen with the command
wc ~/mnt/##/README.html/*
Similarly, the increase in source code is shown by
for v in ~/mnt/##/*
do
    echo -n $v
    wc $v/*.[ch] | grep total
done
The special "#" and "##" directories can appear anywhere in a path after the mountpoint and before the final name. This is useful when working in a project subdirectory. A snapshot of the current directory anywhere in the project tree always looks like "#". For example, if the project contains a subdirectory "tests", the commands
diff -r ~/mnt/tests/# ~/mnt/tests
or
cd ~/mnt/tests
diff -r ./# .
locate uncommitted changes in the current directory. Likewise
ls ~/mnt/tests/##
or
cd ~/mnt/tests
ls ./##
show which versions include changes to the tests.
Because bzrofs walks each component of a pathname without seeing what follows, special components such as "#" and "##" must appear as directories before the files they modify.
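Since each special component arrives before the names it modifies, a single left-to-right pass over the path is enough to decide which view is being requested. The Python sketch below illustrates that idea only; the function walk, the view names, and the returned tuple are hypothetical, and metadata directories such as "##Log" are treated simply as history components.

```python
def walk(path, tag="#"):
    """Classify a path relative to the mount point, one component at a time.

    Returns (view, revision, relative_path). Hypothetical sketch: view is
    "working", "snapshot", or "history"; revision is the text after "#".
    """
    view, revision, parts = "working", None, []
    for part in path.strip("/").split("/"):
        if not part:
            continue
        if part.startswith(tag * 2):
            view = "history"            # "##" switches to the history view
        elif part.startswith(tag):
            revision = part[len(tag):] or "-1"
            if view != "history":
                view = "snapshot"       # "#" switches to a snapshot view
        else:
            parts.append(part)          # ordinary name, kept in order
    return view, revision, "/".join(parts)


print(walk("tests/#/example.c"))
print(walk("##/README.html/#098"))
print(walk("README.html"))
```

Each component is examined without looking ahead, which is why a "#" placed after the file name it is meant to modify could not work.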
Some metadata is visible using additional ## directories. These metadata directory names are case-insensitive.
The log history is visible as
ls ~/mnt/##Log/*
where the log entry for revision 143 is
ls ~/mnt/##Log/#143
The command
grep -iH tags: ~/mnt/##Log/*
locates all tagged revisions.
The "##Annotate" directory presents annotated files at specific revisions.
There is no special provision for modifying the repository through bzrofs. Bzr commands work directly through the working directory. Bzrofs flushes its internal cached state when it detects file write operations in the .bzr directory.
There are issues with versioned files that have been renamed, even though rename tracking is one of the most useful features supported by bzr. The history directory "##" has entries for each version, but versions are not found if the file was moved or renamed. This should be corrected by tracing from an initial revision using object ids, but may require support from lower-level bzr tools.
Operations on working files do not require subprocesses, and are relatively fast.
The cost of bzr operations is dominated by the cost of starting Python interpreters. Bzrofs runs bzr commands as subprocesses, rather than using a library to read the repository data structures. This insulates it from changes to data formats and interfaces, and from cross-language issues.
Bzrofs implements fine-grained operations on the repository, requiring many bzr commands. A recursive diff issues
bzr ls
commands to list each directory involved and
bzr cat
commands to extract the contents of each file. This is significantly slower than a single
bzr diff
command.
To improve performance, bzrofs caches the results of each bzr command. For example, it uses a cached repository directory listing to look up file attributes and a cached file version to get both size and contents.
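The caching idea can be sketched in Python as follows. This is a minimal illustration, not the bzrofs code: the class name CommandCache is hypothetical, least-recently-used eviction is an assumption (only the cache size is documented), and fake_bzr is a stand-in for running bzr as a subprocess.

```python
from collections import OrderedDict


class CommandCache:
    """Keep the results of the most recent `size` commands.

    Assumption: least-recently-used eviction; only the size is documented.
    """

    def __init__(self, run, size=32):
        self.run = run                  # callable that actually runs a command
        self.size = size
        self.results = OrderedDict()

    def get(self, *command):
        if command in self.results:
            self.results.move_to_end(command)   # mark as recently used
            return self.results[command]
        result = self.run(*command)
        self.results[command] = result
        if len(self.results) > self.size:
            self.results.popitem(last=False)    # drop the oldest entry
        return result

    def flush(self):
        """Forget everything, for example after a write under .bzr."""
        self.results.clear()


calls = []


def fake_bzr(*command):
    # Stand-in for a bzr subprocess; records each invocation.
    calls.append(command)
    return "output of bzr " + " ".join(command)


cache = CommandCache(fake_bzr, size=2)
cache.get("cat", "README.html")
cache.get("cat", "README.html")    # served from the cache, no second call
print(len(calls))
```

Keying the cache on the full command means one cached file version can answer both a size query and a read of the contents.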
For safety, bzr operations and the bzrofs cache are single-threaded. This does not usually affect performance for a single user.
Performance might improve by piping commands through a
bzr shell
process, avoiding repeated startup costs.
The defaults and format of the
bzr ls
command changed between bzr 1.13 (bundled in Ubuntu Hardy Heron) and 1.14, specifically in the way a trailing / indicates a directory. This code relies on the behavior of versions 1.14 and later. Earlier versions could be supported by separating the listing by kind,
bzr ls --non-recursive --kind=file
bzr ls --non-recursive --kind=directory
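The two listings would then have to be combined into one. The Python sketch below shows one possible way, assuming each command prints one name per line; the function name merge_listings and the sample names are illustrative only.

```python
def merge_listings(files_output, dirs_output):
    """Merge per-kind listings into one sorted listing with "/" on directories.

    Assumption: each argument is the text printed by one of the two commands,
    with one entry per line.
    """
    files = [line for line in files_output.splitlines() if line]
    # Normalize so that a directory gets exactly one trailing slash.
    dirs = [line.rstrip("/") + "/" for line in dirs_output.splitlines() if line]
    return sorted(files + dirs)


print(merge_listings("README.html\nMakefile\n", "tests\n"))
```

The merged result marks directories with a trailing /, which is the form the rest of the code is described as expecting.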
See also bzrfs (https://launchpad.net/bzr-fs).
This project is hosted at SourceForge (http://sourceforge.net/projects/bzrofs/).