bzrofs

INTRODUCTION

Bzrofs uses the FUSE (http://fuse.sourceforge.net/) Linux/BSD kernel extension to export a bzr (http://bazaar-vcs.org/) version repository as a file system. The primary use of bzrofs is to make versions visible to tools that are not version-aware.

The FUSE mountpoint exports the working directory as a read-write file system. In addition, two views of the repository are overlaid as read-only directories.

The synchronic repository view consists of revision snapshots. With bzr, a revision applies to the whole repository instead of an individual file. In this view, a snapshot appears as a directory containing files.

The diachronic repository view consists of file histories. In this view each file appears as a directory containing a series of versions.

This project is as much an exploration of FUSE services as it is of version control. It has been tested on Linux and lightly tested on Mac OS X (10.5). There is extensive debugging output when run in foreground with the -f option.

USAGE

By default, bzrofs exports its working directory, which must contain a bzr repository, usually found in a hidden subdirectory ".bzr". Repository versions and history are accessed through directories whose names begin with "#" and "##".

These and other defaults can be changed with the following options.

--dir=DIRECTORY

Exports DIRECTORY instead of the current directory. This is necessary when running bzrofs as a server (without FUSE's -f option).

--bzr=EXECUTABLE

Uses EXECUTABLE as the bzr command to access the repository. By default this is "/usr/bin/bzr" on Linux and "/usr/local/bin/bzr" on OS X.

--cache=SIZE

Caches the results of SIZE bzr commands. These include directory listings, version logs, and file revisions. The default SIZE is 32.

--tag=CHARACTER

Changes the special directory character from "#" to CHARACTER.

The options listed above may be followed by the standard FUSE options. Among the most useful are

-f

Runs in foreground, with debug output directed to stderr.

-s

Runs single-threaded. By default, FUSE uses pthreads to overlap file system operations. Note that bzrofs single-threads all repository access.

-d

Produces detailed debug output for all file operations.

The final argument is the FUSE mount point.
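The argument order described above (bzrofs options, then FUSE options, then the mount point) can be sketched as a small wrapper function. This is only an illustration: the function name mount_project, the BZROFS override variable, and the cache size of 64 are assumptions of this sketch, not part of bzrofs.

```shell
# Mount a project with bzrofs in the foreground, single-threaded.
# Usage: mount_project PROJECT_DIR MOUNTPOINT
# BZROFS is a hypothetical variable naming the executable to run.
mount_project() {
    # bzrofs options come first, FUSE options (-f, -s) follow,
    # and the mount point is always the final argument.
    "${BZROFS:-bzrofs}" --dir="$1" --cache=64 -f -s "$2"
}
```

For example, mount_project "$HOME/bzrofs" "$HOME/mnt" would stay in the foreground, printing debug output to stderr, until the file system is unmounted.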

EXAMPLES

This project is itself developed using bzr version control. Assuming it is in a directory ~/bzrofs, it can be mounted on ~/mnt with the command

bzrofs --dir=~/bzrofs ~/mnt

Example Working Files

This README file is then visible as the read-write file

~/mnt/README.html

Working files are simply a reflection of the working directory. They can be added, deleted, edited, compiled, linked, and executed.

Example Snapshot

In addition, the most recent committed version is available read-only as

~/mnt/#-1/README.html

which can be abbreviated to

~/mnt/#/README.html

The previous committed version is

~/mnt/#-2/README.html

The README.html file didn't exist in the first committed version, so there is no

~/mnt/#1/README.html

Any bzr revisionspec without / can follow the "#", allowing names such as

~/mnt/#100/README.html

for the file as of revision 100, or

~/mnt/#tag:sourceforge-1.0/README.html

for the file as tagged for sourceforge release.
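As a rough illustration of the naming rule above, the following sketch builds a snapshot path from a mount point, a revisionspec, and a file name, and refuses any revisionspec that contains "/". The function name snapshot_path is hypothetical, and the default "#" tag character is assumed.

```shell
# Build the path of a file inside a snapshot directory.
# Usage: snapshot_path MOUNTPOINT REVISIONSPEC RELATIVE_PATH
# An empty REVISIONSPEC gives the abbreviated "#" form (latest commit).
snapshot_path() {
    case $2 in
        */*)
            # A "/" would be read as a path separator, so reject it.
            echo "revisionspec must not contain /: $2" >&2
            return 1
            ;;
    esac
    printf '%s/#%s/%s\n' "$1" "$2" "$3"
}
```

With these assumptions, snapshot_path ~/mnt -2 README.html prints the path of the previous committed version of this file.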

Because bzr revisions apply to all files in the repository,

~/mnt/#/

presents a complete directory tree of all currently committed files. Snapshot files can be opened in read-only editors and are available to common file system tools such as diff. For example, uncommitted changes can be located with the command

diff -r ~/mnt/# ~/mnt

Example History

Where a snapshot presents a single version of the entire repository, a history presents a directory of versions of a single file. Only those versions where the file was updated are included. Versions are numbered with the snapshot name, padded with leading zeros so that they will sort correctly in a directory listing.

~/mnt/##/README.html

lists files including the first committed version,

~/mnt/##/README.html/#098

through a recent version,

~/mnt/##/README.html/#130
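The reason for the zero padding can be seen in a short sketch. Without padding, "#130" sorts before "#98" because names are compared character by character. The function name history_name is hypothetical, and the fixed width of 3 is an assumption taken from the "#098" and "#130" names above; how bzrofs actually chooses the width is not stated here.

```shell
# Format a revision number as a history entry name, zero-padded so
# that lexical order (as used by ls) matches numeric order.
# Usage: history_name REVISION_NUMBER
history_name() {
    # The leading "#" is literal; %03d pads the number to three digits.
    printf '#%03d\n' "$1"
}
```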

File versions in a history are the same as file versions in a snapshot -- they are available read-only to standard tools. For example, the growth of the file can be seen with the command

wc ~/mnt/##/README.html/*

Similarly, the increase in source code is shown by

for v in ~/mnt/##/*
do
    echo -n "$v"
    wc "$v"/*.[ch] | grep total
done

PATHS

The special "#" and "##" directories can appear anywhere in a path after the mountpoint and before the final name. This is useful when working in a project subdirectory. A snapshot of the current directory anywhere in the project tree always looks like "#". For example, if the project contains a subdirectory "tests", the commands

diff -r ~/mnt/tests/# ~/mnt/tests

cd ~/mnt/tests
diff -r ./# .

locate uncommitted changes in the current directory. Likewise

ls ~/mnt/tests/##

cd ~/mnt/tests
ls ./##

show which versions include changes to the tests.

Because bzrofs walks each component of a pathname without seeing what follows, special components such as "#" and "##" must appear as directories before the files they modify.
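The component-by-component walk can be illustrated with a sketch that classifies each part of a path relative to the mount point, looking only at the components seen so far. This is a simplified model, not the bzrofs implementation: the function name classify_path and the output labels are hypothetical, and the default "#" tag character is assumed.

```shell
# Report how each component of a path (relative to the mount point)
# would be interpreted, one component per output line.
# Usage: classify_path RELATIVE_PATH
classify_path() (
    tag='#'          # default special character (see --tag)
    listing=no       # becomes yes once a "##" component has been seen
    IFS=/            # split the path on "/" (subshell, so not leaked)
    set -f           # do not glob-expand components such as "*"
    for part in $1; do
        case $part in
            '')
                ;;   # ignore empty components from leading or doubled "/"
            "$tag$tag")
                listing=yes
                echo "history"
                ;;
            "$tag$tag"*)
                listing=yes
                rest=${part#"$tag$tag"}
                echo "metadata $rest"
                ;;
            "$tag")
                echo "snapshot -1"
                ;;
            "$tag"*)
                rest=${part#"$tag"}
                if [ "$listing" = yes ]; then
                    echo "entry $rest"      # a version inside a listing
                else
                    echo "snapshot $rest"
                fi
                ;;
            *)
                echo "name $part"
                ;;
        esac
    done
)
```

In this model, "tests/#/foo.c" is read as a name, then the latest snapshot, then a name inside that snapshot. A "#" placed after the file name would arrive too late to affect how the file was looked up.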

METADATA

Some metadata is visible using additional ## directories. These metadata directory names are case-insensitive.

The log history is visible as

ls ~/mnt/##Log/*

where the log entry for revision 143 is

ls ~/mnt/##Log/#143

The command

grep -iH tags: ~/mnt/##Log/*

locates all tagged revisions.
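For scripting, it may be more convenient to print only the names of the tagged log entries. The sketch below assumes that each entry under the log directory is a plain file containing the log text for one revision, and that tagged revisions carry a line starting with "tags:". The function name tagged_revisions is hypothetical.

```shell
# Print the names of log entries that contain a "tags:" line.
# Usage: tagged_revisions LOGDIR    (for example ~/mnt/##Log)
tagged_revisions() {
    for entry in "$1"/*; do
        [ -f "$entry" ] || continue
        # Allow leading whitespace in case the entry text is indented.
        if grep -qi '^[[:space:]]*tags:' "$entry"; then
            basename "$entry"
        fi
    done
}
```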

The "##Annotate" directory presents annotated files at specific revisions.

BZR

There is no special provision for modifying the repository through bzrofs. Bzr commands work directly through the working directory. Bzrofs flushes internal cached state when it detects file write operations in the .bzr working directory.

BUGS

There are issues with versioned files that have been renamed, one of the most useful features supported by bzr. The history directory "##" has entries for each version, but the versions are not found if they are moved or renamed. This should be corrected by tracing from an initial revision using object ids, but may require support from lower-level bzr tools.

PERFORMANCE

Operations on working files do not require subprocesses, and are relatively fast.

The cost of bzr operations is dominated by the cost of starting Python interpreters. Bzrofs runs bzr commands as subprocesses, rather than using a library to read the repository data structures. This insulates it from changes to data formats and interfaces, and from cross-language issues.

Bzrofs implements fine-grain operations on the repository, requiring many bzr commands. A recursive diff issues

bzr ls

commands to list each directory involved and

bzr cat

commands to extract the contents of each file. This is significantly slower than a single

bzr diff

command.

To improve performance, bzrofs caches the results of each bzr command. For example, it uses a cached repository directory listing to look up file attributes and a cached file version to get both size and contents.
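The idea of caching command results can be sketched in shell with a file-backed cache keyed by the command line. This is not how bzrofs stores its cache; it only illustrates the principle. The names cached, cache_flush, CACHE_DIR, and CACHE_SIZE are hypothetical, and evicting the oldest entry first is an assumption of this sketch.

```shell
# File-backed cache of command output, keyed by the command line.
CACHE_DIR=${CACHE_DIR:-${TMPDIR:-/tmp}/bzrofs-cache-sketch.$$}
CACHE_SIZE=${CACHE_SIZE:-32}     # same default as --cache

# Usage: cached COMMAND [ARGS...]
cached() {
    mkdir -p "$CACHE_DIR" || return 1
    # Derive a file name from the command line (a checksum, so
    # collisions are possible in principle; acceptable for a sketch).
    key=$(printf '%s\n' "$@" | cksum | tr -cd '0-9')
    if [ ! -f "$CACHE_DIR/$key" ]; then
        # Miss: make room by removing the oldest entries first ...
        ls -t "$CACHE_DIR" | tail -n +"$CACHE_SIZE" |
            while read -r old; do rm -f "$CACHE_DIR/$old"; done
        # ... then run the command once and keep its output.
        if ! "$@" > "$CACHE_DIR/$key"; then
            rm -f "$CACHE_DIR/$key"
            return 1
        fi
    fi
    cat "$CACHE_DIR/$key"
}

# Discard everything, for example after a write under .bzr.
cache_flush() {
    rm -rf "$CACHE_DIR"
}
```

A call such as cached bzr cat -r 100 README.html would then start a bzr subprocess only the first time it is issued.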

For safety, bzr operations and the bzrofs cache are single-threaded. This does not usually affect performance for a single user.

Performance might improve by piping commands through a

bzr shell

process, avoiding repeated startup costs.

COMPATIBILITY

The defaults and format of the

bzr ls

command changed between bzr 1.13 (bundled in Ubuntu Hardy Heron) and 1.14, specifically in the way a trailing / indicates a directory. This code relies on the 1.14 and later versions. Earlier versions could be supported by separating the listing by kind,

bzr ls --non-recursive --kind=file

bzr ls --non-recursive --kind=directory
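One possible way to combine the two per-kind listings into a single listing with directories marked by a trailing "/" is sketched below. It is not part of bzrofs and has not been checked against every bzr release. The function name list_with_kinds and the BZR override variable are hypothetical.

```shell
# List one directory level of the current bzr tree, marking
# directories with a trailing "/" regardless of the bzr version.
# BZR optionally names the bzr executable to use.
list_with_kinds() {
    {
        # Files: drop blank lines and any trailing slash.
        "${BZR:-bzr}" ls --non-recursive --kind=file |
            sed -e '/^$/d' -e 's|/*$||'
        # Directories: normalize to exactly one trailing slash.
        "${BZR:-bzr}" ls --non-recursive --kind=directory |
            sed -e '/^$/d' -e 's|/*$|/|'
    } | LC_ALL=C sort
}
```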

HOSTED

This project is hosted at (http://sourceforge.net/projects/bzrofs/).
