Welcome to this case study on how a documentation group in a software Division in Agilent Technologies has started using a Wiki to create Online Help and Printed Books.
My name is Rahul Mehrotra and I’m here to talk about
How we created documentation before the Wiki
How we set up a Wiki that 500 people now use to create and modify documentation
How we now collaborate, create, moderate, publish, and manage content
I will try and spend most of the time on what I think is important but don’t let that stop you; please feel free to jump in, add comments, ask questions, and lead us to examine issues as they come to mind.
Before we started working with a Wiki, we used
FrameMaker to write the doc
WebWorks Publisher to generate html and pdf files for online help and printed books
CVS for version control and content management
Perl scripts to run these tools in batch mode and integrate doc into the nightly software build
But our attempt to arrive at a professional-looking end-result was forcing all content creation through the proverbial "eye of the needle"—Us
Even conducting classes in using FrameMaker, which we did for a group of engineers in China, didn’t magically solve our problems.
We still had to go through all their docs. Only now they resented our messing with them because they felt they were doing what we asked them to do.
This was problem #1: everything had to flow through us.
Most of our software engineers didn’t even want to write doc. As one of them put it, our tools were "a formidable barrier to content creation."
Their eyes glazed over when they saw what it took to make things happen—our tools, stylesheets, and do’s and don’ts.
So they wrote "inputs," which we laboriously copied into "doc." Copy and paste is all very well but—and most of you know this—Word and FrameMaker don’t play nice together.
We could see that if we dismantled the barrier to entry, the requirement that a subject-matter expert learn our desktop publishing processes, more people would be able to create content.
This was problem #2: our processes were complicated.
Even our eyes glazed over when we first made a flowchart to figure out our tasks and processes.
We really needed a publishing process anyone could use, and one that everyone would use.
We realized that if we simplified things enough so that more people could get it right the first time, our time could go to something better than just fixing what someone else had done.
This was problem #3: we didn’t have time for content.
We could recreate and restore any arbitrary version of content, for any release of any product. We had a very robust content management system:
version-controlled source (FrameMaker) files
version-controlled output (html/pdf) files
We could run a nightly build to take updated FrameMaker content and turn it into Help files that ship with the software.
But when you get to eight products with four releases a year (that’s 32) plus another 40 versions of legacy, maintaining content on the Web becomes more than a full-time task.
We could create the latest and greatest content but didn’t have a good way to put so much of it up on the Web so that people could see it.
This was problem #4: it took a lot of effort to keep content updated.
So we set out to look for a solution that would do it all, only better.
Distribute work
Simplify processes
Improve content
Streamline publishing
It seemed just the perfect job for the Wiki.
Here is why...
Wikipedia contains over 2 million articles in English, with over 100,000 articles in each of 17 other languages… mostly managed by people who do this in their spare time.
All we wanted was a system in which minor changes are incorporated directly.
Some people don’t like to wait a few months to see a typo fixed on the Web.
DMOZ, the Open Directory Project, plunged us into the world of Wikis.
It was like we’d just gone down the Rabbit hole and found ourselves in another world.
Wikis that work on thumb drives, Wikis that can be configured any way you want, Wikis that don’t cost any money, Wikis that work on any operating system, read any kind of file…the wonders never cease.
Like Alice in Wonderland,
"...(we) got so much into the way of expecting nothing but out-of-the-way things to happen, that it seemed quite dull and stupid for life to go on in the common way."
I can’t think of a better place to start.
One of the links DMOZ points to is the Wiki Matrix.
The Wiki Matrix has links to the features and descriptions of almost all Wiki engines.
It also has a Wiki Choice Wizard to guide you, based upon what you think you want.
We spent a lot of time at this site, answering questions in different ways to see the results of our choices, poring over feature comparisons, and haunting the bulletin boards of the probable choices.
You post something on a forum, a whole lot of people respond with tips and guidance. You email the developer of a Wiki and you get a response right away.
The folks out in the Open Source world can teach us a lot about speed, responsiveness, and all those sorts of things.
If we didn’t look at ALL Wiki engines, it sure felt like it.
Eventually...
We settled on three Wikis and ported over a few thousand pages of content (yes, a few thousand) to see how they actually handled Input (interface, workflow, and functionality) and Output (Web, online help, print-pdf).
MediaWiki is the Wiki engine used by Wikipedia
TWiki is a Wiki engine we really thought we could use, given its ease of use, features, and accessible technology
Confluence is the Wiki engine that eventually worked the best for us
I’d like to add one disclaimer about the various Wiki engines:
Your mileage may vary.
MediaWiki is free. TWiki is free. A lot of Wikis are free. We’d probably have gone with a free wiki if we had the know-how.
In our experience, everyone engaged in developing Wiki software is wonderful to work with. They are very responsive, very helpful, and very generous with their time. It’s almost as if no matter what you pick, you can’t go wrong.
We picked Confluence and the more we use it, the more we like it. Very quickly, here’s why:
Almost zero down-time
Amazingly easy to use for all tasks, from Administration to Authoring
The Confluence Wiki works for us but there are so many excellent Wiki engines out there that I’m sure there is another one that is perfect for you.
A 500-user Confluence license costs $5,000. That’s half the price of a single-user FrameMaker-WebWorks Publisher license. But it’s not free.
This is how we moved the mountain of legacy content that we had accumulated over the years…
Translate
We used a modified WebWorks Publisher template to generate Wiki Markup output from FrameMaker files. Then we ran Perl scripts to clean up the Wiki Markup files—remove double-spaces, add file names, titles, etc.—and complete some housekeeping tasks.
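As an illustration, the simplest of those clean-up passes can be sketched in shell; the function name and the specific fixes here are simplified stand-ins for what our Perl scripts actually did:

```shell
#!/bin/sh
# clean_markup: simplified sketch of one clean-up pass over a
# converted Wiki Markup file. It collapses runs of spaces and strips
# trailing whitespace; the real scripts also added file names,
# titles, and other housekeeping.
clean_markup() {
  sed -e 's/  */ /g' -e 's/ *$//' "$1"
}
```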
What you’re looking at here is work we did with our test installation of TWiki. We took what we did there and adapted it to work with other Wikis.
Moving thousands of pages into a number of Wiki installations was the best thing we could have done. We gained a solid understanding of exactly what we needed to do to translate and transfer large amounts of information. It has come in handy not just for translating into Wiki Markup but also for translating from Wiki Markup into html, chm, and pdf.
Transfer
We ran a Perl script to move files directly into the Wiki. Confluence supports WebDAV, an extension of http that lets you drag and drop files or run a script to copy them over.
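Because WebDAV is just http, the transfer boils down to one PUT per file. A sketch, with a hypothetical host name and path layout (a real Confluence installation has its own WebDAV scheme):

```shell
#!/bin/sh
# webdav_url: build the WebDAV target URL for a page in a space.
# The host and path layout are placeholders, not Confluence's
# actual scheme.
webdav_url() {
  printf 'https://wiki.example.com/webdav/%s/%s.txt' "$1" "$2"
}

# The upload itself is a single http PUT, e.g. with curl:
#   curl --user "$WIKI_USER:$WIKI_PASS" --upload-file intro.txt \
#        "$(webdav_url PROD Introduction)"
```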
Once the content was in the Wiki, we began a two-step clean-up:
All pervasive issues—those where we could detect a pattern, such as extra spaces before and after punctuation—were fixed using clean-up scripts
All one-of-a-kind issues were cleaned manually
And we used lists in the Wiki to keep track of who completed what so that no one missed out on the fun.
Manually editing all doc seems like a lot of work—who has the time—but, trust me, this one task has probably saved us more time than anything else. You see, doing the work made us experts on tool usage because each day someone would figure out a new and simpler way to do something. And what we learned still works. In the space of a month or so, we made the transition from FrameMaker experts to Confluence Wiki experts.
Today, when someone asks us how to do something, we know: been there, done that.
Some of our content is also drawn directly from the software code...
Now, when a programmer writes code, a Wiki Markup version of the information is extracted from the source and inserted (or updated) in the Wiki using the WebDAV widget, the one that lets us drag and drop content directly into the Confluence Wiki. Except now we’ve got it working off a script.
Once this information is in place, it is pulled into the topic by reference. The Confluence-specific term for it is "include": you can include, and thereby single-source, content in this manner.
The downside of this was that we missed out on having to re-create multiple tables for tens of thousands of pages of reference docs.
We used numerous such techniques to set up and organize content so that we would not need to fix things in multiple places.
We figured out this trick when we were translating the documents. It alone has more than made up for any extra time we spent on the translation task.
We used a plug-in called Theme Builder to tweak the layout, formats, menus, and UI.
I’m sure we could have gone in and tweaked the style sheets, learned to work around and under the hood, so to speak, but using this plug-in we were able to come up with a look and feel quickly. (A couple of days.)
Having a look and feel in place enabled us to work out the kinks in our content organization.
Working out the kinks in our content organization suddenly made finding things simple.
That made a lot of people very happy.
This plug-in makes people happy.
We don’t have the time to go over the issues of information architecture, especially the importance of how to organize content so that people find it where they think it should be.
So let’s just say a miracle occurred. Much to my dismay, information architecture has been the most overlooked and underrated aspect of our Wiki project, maybe because it seems too much like Black Magic.
After all the discussion that took place around issues of privilege and access, we seem to be settling on these five groups.
System Administrator: this is our IT group. They maintain the hardware, the software, backups, etc.
Confluence Administrators: this is some of us writers. We maintain the interface, add functionality to the Wiki, etc.
Learning Products: this is all of us writers. We create spaces for new products, add content, review changes, run builds, etc.
Confluence Users: this is everyone else in our Division. They can add and edit content
We wanted to keep it as open as the Wikipedia—let anyone and everyone edit content. But the reality is that we are a Corporation. We fear having to manage the editing frenzy that might take place. You never know; it may be the kind of thing people wait around for: when can I edit someone else’s engineering documentation?
So, for now, our customers get to read but not edit the content.
We rolled out the Wiki with a song, a limerick, posters, reminder notes, candy, and a training session.
A training session is invaluable because it does two things:
it shows everyone how to do things
it makes it clear that they now need to do it
We conducted multiple training sessions for folks in China, India, Japan, Belgium, the US, and the UK. Then we did it again for the folks in China, India, Japan, Belgium, the US, and the UK.
We also did informal training sessions for key people, one in each group, before we rolled it out to an entire group.
We did lots of training sessions to get the word out.
The training sessions were followed by posters that sprang up on office walls everywhere, catchy limerick and all.
That was followed by reminder notes and candy.
Candy works best.
Confluence has two kinds of areas: "Personal Space" and "Published Space"
We use personal spaces to keep track of what we need to do, the process documentation, the internal and development notes, you know; how we do what we do.
No more shared drives full of documents that no one reads. No Microsoft Project.
Throughout our Wiki development, we used personal spaces to communicate with each other.
Personal spaces have taught us more about collaboration than any collaboration tool.
Now the rest of the Division is figuring that out. People get hooked when they see how easy it is to set up and use a personal space website.
Using personal spaces to collaborate on how we are going to do things, the processes, has provided us with an excellent learning ground for collaborating on content.
Here is how content editing works:
If a person has edit permission, the Edit menu is displayed. This menu lets them Add, Edit, Copy, and Remove text and attached images.
Everyone in our Division has edit permission. That adds up to five hundred potential content contributors. ALL of them have registered. More than half have tried it out more than once. No one has come back saying they couldn’t do it.
We’ve kept adding a page equally simple: we ask people to copy an existing page and then edit it.
Confluence has the functionality for us to put together a template-based macro for adding new pages. It’s on our to-do list. For now, we are finding that people know exactly what to do and they know who to call when they don’t.
And that makes us wary about adding more bells and whistles.
When a person Edits a page or copies it...
They have a choice between Rich Text and Wiki Markup.
Rich Text offers point-and-click choices in the form of buttons.
Wiki Markup is for those who prefer to work with the code and a notation cheat sheet.
Clicking Save updates the Wiki.
So someone who has edit permissions can log in from anywhere in the world at any time and use a browser to update content.
Their changes will be visible instantly on the Web.
One interesting aside:
We had expected that we, the writers would use the Rich Text format because it is more visual and the programmers would use the Wiki Markup because it is more code-like.
The opposite is true. WE use Wiki Markup while the programmers use the Rich Text to add content and even drag and drop content from other files.
We have a simple three-step process for all non-text elements such as equations and graphics.
Create. We didn’t dictate which screen capture tool to use because people have their own preferences. But we have guidelines for screen captures and graphics. I don’t know if we all agree on what they are, but that is another story.
For equations, we decided to stay away from Math Markup Language simply because we haven’t been able to make it work for us in the multiple outputs that we generate—html, pdf, chm. Working on it.
So we ask people to use the Equation editor in Word to create the equation and take a screen capture of the equation. Everyone in Agilent has access to Word so this works for us.
Attach. Before an image can be added to a page, it needs to be attached or uploaded. In the case of the equation, we ask people to attach both the screen capture and the Word file. That way the equation can be edited and updated as needed.
Insert. The final step is to insert the graphic at the desired location on the page.
And that’s it. Honest. A content creator doesn’t need to do much more. Everything else happens in the background.
(our complicated processes)
Instead of asking people to learn to use our tools, we have modified our system to use their tools.
We monitor Wiki activity using four tools.
A Daily Summary that is generated each night. This summary is a record of all the changes made in the entire Wiki during the day.
A Space Watch that sends out a message each time any content in a specified area is updated.
A Watch List that is updated in real-time. This list provides each writer with updates on changes made to pages for which they are responsible.
A Dashboard that is updated in real-time. This list is an activity monitor to watch changes as they happen.
These reporting tools enable us to moderate content changes and support instant updates on the Web.
Tracking what has changed is a very simple task.
All the changes to a topic that someone in China made yesterday, for instance, can be viewed in one color-coded snapshot.
This is where the "validate" part comes in: the writer can see the changes and figure out if they are valid.
Backing out changes is also very simple: just "restore" a previous version.
(we were spending time on activities that did not add value to the content)
A majority of our time is now spent in validating and moderating content, not re-formatting it.
This is what the Confluence Wiki calls the Dashboard. It is an internal, development view of the entire repository.
From this page one gets to monitor all activity taking place in all of the spaces in real-time. (You can see the list of spaces on the left.)
In the past, keeping track of what was changing, and sometimes what all needed to change, was a cumbersome process. Now not just change itself but also change reporting is instantaneous.
We get to see it as it happens; we try and intervene only as needed. This is also a good place to keep on the lookout for changes by people who don’t normally make them.
This is one aspect of content management: ensuring content integrity
At the macro level we have the notion of SPACE, which groups all the content related to, in our case, a specific release of a specific software product.
That is what you see here: a snapshot of spaces defined in our Wiki, along with the contents of a single space. Each space contains all the documentation that ships with a version of one software product.
For now, we have created a separate space for each version of content.
At the same time we are experimenting with "Workflow" and "Publishing" plug-ins that one can use to set up a workflow where the content doesn’t become visible to the end-user until it is ready to be published.
When that happens, a single space will be used to publish content to multiple spaces.
This is another aspect of content management: ensuring version control
Here’s what happens in the background when content is changed...
A nightly build generates the html, chm, and pdf files that get added to the main software build. This is what it does:
Export the content from the Wiki
Massage the html look-and-feel
Gather the content into “books”
Generate html-help
Generate pdf files
Generate chm-help
As a result of this nightly build, content development remains an integral part of the software development process. When the software is ready to be released, the documentation is ready as well.
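The driver for those six steps can be sketched as a short shell loop; each step name below is a placeholder for one of our real Perl or shell scripts:

```shell
#!/bin/sh
# run_nightly_build: hypothetical skeleton of the nightly doc build.
# Each step name stands in for a real script; a failing step aborts
# the run so a broken export never reaches the software build.
run_nightly_build() {
  for step in export_wiki massage_html gather_books \
              gen_htmlhelp gen_pdf gen_chm; do
    "$step" || { echo "nightly build failed at: $step" >&2; return 1; }
  done
}
```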
Content on the Web is updated instantly so the author can see the results of their actions right away.
(the one where everything had to flow through us)
Now, while we do need to ensure some quality and standard, we do so without holding things up.
Think of a book as a collection of articles arranged in a logical sequence. When writers feel there has been enough change in a book they moderate, they can add it to the build list.
This list changes each day and the html, chm, and pdf files generated are tested (automatically) and checked (manually). We run tests for three main categories.
Hyperlinks
Help Links that connect content to the software
File generation and collation into installable units
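The hyperlink test, for instance, only needs to pull every href out of the generated html and check each target. The extraction can be a naive grep/sed pass because the html is machine-generated and predictably quoted (the function name is ours):

```shell
#!/bin/sh
# extract_links: list the href targets in an exported html file.
# Naive on purpose: generated html quotes attributes consistently,
# so a regular expression is enough here.
extract_links() {
  grep -o 'href="[^"]*"' "$1" | sed -e 's/^href="//' -e 's/"$//'
}
```

A wrapper can then loop over the list and flag any target that does not exist on disk.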
(it took a lot of effort to keep content updated).
Most of the content update and publishing is automatic and instantaneous.
This is what a user sees on the Web: documentation created using the Confluence Wiki.
Notice no Edit menus...
Apart from that, it is exactly what the authors see.
People can use the drop-down menus to go from one book to another. They can click links to headings within this page and to other resources. There is also a breadcrumb trail to indicate where they are. And there is the search.
This Web-based Context-sensitive online help is seen by the end-users of our bleeding-edge software products where the information changes frequently.
It’s perfect for new information, new examples, new functionality, new standards…just the kind of thing that works for all our early adopters, beta testers, and external partners.
This html-based Context-sensitive online help is seen by the end-users on Linux and Solaris.
Again, people can use the drop-down menus to go from one book to another. They can click links to headings within this page and to other resources. There is also a breadcrumb trail to indicate where they are. And there is the search.
This search is an applet-based search for which the content gets indexed and the infrastructure gets added as part of the nightly build. The Javascript menus are also added in as part of a wrapper script that adds this look-and-feel to each topic file exported from the Wiki. There is also a link to a pdf file of the entire book.
We use Perl and shell scripts to
Collect all the files exported from the Wiki
Remove the html elements we don’t want
Add those that we do
Index the content for search
Put together an installer
The Unix software build picks up the doc installer and incorporates it into the software installer so that when a person installs the software, the doc gets installed.
This compiled html Context-sensitive online help is seen by the end-users on Windows.
Again, people can use the drop-down menus to go from one book to another. They can click links to headings within this page and to other resources. There is also a breadcrumb trail to indicate where they are. There is also a link to a pdf file of the entire book.
We use Perl and shell scripts to
Collect all the files exported from the Wiki
Remove the html elements we don’t want
Add those that we do
For now, we are using Microsoft’s html Help Workshop to add in the topic index and compile the html files into a single .chm
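For reference, the Help Workshop input is a small ini-style project file. A minimal sketch, with illustrative file names; on Windows, `hhc help.hhp` then compiles the listed files into a single .chm:

```shell
#!/bin/sh
# make_hhp: emit a minimal HTML Help project file. The .chm and
# topic names are illustrative; hhc.exe (HTML Help Workshop,
# Windows only) compiles the result into a single .chm file.
make_hhp() {
  cat <<'EOF'
[OPTIONS]
Compiled file=help.chm
Default topic=index.html

[FILES]
index.html
EOF
}
```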
The Windows software build picks up the .chm file and incorporates it into the software installer so that when a person installs the software, this file gets installed.
And finally, here is the pdf file we generate from the Wiki. Links to these files are included in the .html and .chm files.
These pdf files provide a “book” for those people who prefer books.
To make these books, we’ve set up pages in the Wiki that concatenate, on the fly, all the topics that make up a book. We export these pages separately from the Wiki and use Perl and shell scripts to
Collect all the files exported from the Wiki
Remove the html elements we don’t want
Add those that we do
Then we use a product called pd4ml to generate pdf files. Pd4ml uses our html and css files to create pdf files for our books, complete with bookmarks.
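The clean-up before conversion can be sketched like this; the nav div selector is illustrative, and `html2pdf` is a placeholder name for the actual pd4ml invocation, whose options I won't reproduce here:

```shell
#!/bin/sh
# prepare_book: strip the html elements we don't want from an
# exported book page before pdf conversion. The "wiki-nav" div is
# an illustrative example of unwanted chrome, not a real class name.
prepare_book() {
  sed -e '/<div class="wiki-nav">/,/<\/div>/d' "$1" > "$2"
  # html2pdf "$2" "${2%.html}.pdf"   # placeholder for the pd4ml step
}
```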
They say a product is only as good as it is successful.
For that, I’d like to fall back on the Collaboration measure.
For us, the key statistics are
Number of people now actively involved in the direct creation of documentation
Number of people volunteering development time and resources (we have only been able to take advantage of four offers so far)
Single-sourcing from the code
Use of color in display of code
Conditional text display
Approval and Workflow
I think this, more than anything else, is a measure of the true success of our project: people are using it and they are collaborating with us to extend it for their uses.
I have numbers on how many people and how many pages but I think my informal little graphic says it all. The floor on the number of interactions each day continues to rise.
External Collaboration
Early adopters, Beta testers, Contractors...
Internal Collaboration
R&D, Tech Support, Marketing...
No PowerPoint presentation is complete without a bulleted list or two.
Here is what we have accomplished with our transition to the Wiki as our authoring, publishing, and content management tool
We no longer hold things up. Instead we use a number of Moderation tools to watch and intervene only as needed.
Adding and editing content is now simple. More important, we have adapted our tools to coexist with tools that people are already familiar with.
A majority of our time is now spent validating and moderating content instead of cutting, pasting, and reformatting.
Update on the Web is instantaneous. And we are working on automating the last parts of our publishing in other formats.
Here is my other list: the things that we need to get to "next time."
Each day we get requests to add a feature, another widget, some functionality that someone knows we absolutely must have. And we try our best to just say no. I don’t know what it is about minimalist interfaces: everyone loves them, finds them very usable and intuitive. And then wants to change them.
And here is why we don’t really want to start out with our list:
We are going to try and hold out for as long as possible to discover what people need so that we don’t have to depend upon what they say they want.
I’d like to leave you with this thought:
It took over a year to sell this concept.
It took less than a year to do all the work.
We did it in our spare time, in addition to all our regular "real" work.
It is that easy. Really.