created by Geraldine_VdAuwera
on 2019-11-21
TL;DR: In a few weeks, we're going to move the website and forum to a new online platform that will scale better with the needs of the research community. The website will still live at https://software.broadinstitute.org/gatk/ but there will be some important changes at the level of the user guide and the support forum in particular. Read on to get the lowdown on where this is coming from, where we're heading and how you can prepare for the upcoming migration.
When GATK was first released around 2010, its documentation lived in a rather primitive wiki that was half-public, half-private, and almost entirely aimed at developers. The wiki was supplemented by a proto-forum, hosted by Get Satisfaction and run by Eric Banks, one of the original authors of GATK who has since risen to the lofty position of Senior Director in the Data Sciences Platform at the Broad (a heartwarming rags-to-riches story for another time). Despite being an absolutely lovely human being in person, Eric was notoriously mean to the unfortunate few who dared ask questions on the old forum. So, in 2012, I was hired to be a human filter for his snark, plus, you know, build a proper user guide. Something that would enable researchers to use GATK without needing to be physically in the room with the developers for the darn thing to work. Coming out of a wetlab microbiology postdoc, I was uniquely unqualified for the job, but that too is a story for another time... Point is, that's how a little over seven years ago, Roger the summer intern and I built and launched the GATK website, which included a more formally structured user guide and the community forum hosted by Vanilla Forums that we have been using ever since.
Our little hand-rolled artisanal website has had a good run, with over 20 million page views to date and about two to three thousand unique visitors on any given weekday. But it's time to face the facts: we've outgrown it. In that time, and especially since the release of GATK4 two years ago (OMG has it really been two years already), the toolkit has expanded dramatically. It currently includes more than 200 tools and multiple Best Practices pipelines covering the major variant classes, plus use cases like mitochondrial analysis. We're aware that many of you find it difficult to find the information you need in our sprawling documentation. And there's more new stuff coming out soon that we haven't yet had a chance to talk about... So it's clear we're going to need both a better structure and way more elbow room than the current system can support.
For the past few months, we've been crafting a new web home for GATK documentation and support. This time, instead of building a traditional website, we're using a customer service system called Zendesk that includes a knowledge base module for documentation and a community forum for Q&A. Part of our support team has already been using this system for the Terra helpdesk. Although the Terra knowledge base itself is still a work in progress, it’s been a positive experience so far. That gives us confidence that adopting Zendesk for GATK will help us improve the usability of the GATK documentation. We're also looking forward to being able to streamline our support operation by consolidating across the multiple software products and services offered by the Data Sciences Platform. That's good for everyone — not just those of you who use WDL and Terra as well as GATK — because if our support team spends less time wrangling different internal systems, they can spend more time improving the docs and answering your questions.
Overall we're trying to minimize disruption but there will be a few important changes. Here's a breakdown of what you're most likely to care about.
The user guide will be organized a little differently and the search box will work better
We're taking this opportunity to update how the documentation content is organized to make it easier to find information. Hopefully this will be all upside, but if you get lost at any point, try the search box — it should work better in the new system.
Some links may break
This is perhaps the most important consequence of the fact that we're moving to a new content management system: all links to individual articles will change. We're going to set up an automated redirection system to map old URLs to the new ones, so that your bookmarks and links that people have previously posted online stay functional, but we can't guarantee that we'll be able to capture absolutely everything. We'll do our best to make the system handle missing content as gracefully as possible.
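To give a rough idea of what mapping old URLs to new ones involves, here is a minimal sketch in Python. It is an illustration only, not the actual redirect system: the paths, the `REDIRECTS` table, the `FALLBACK` page and the `resolve` function are all hypothetical names invented for this example.

```python
from typing import Tuple
from urllib.parse import urlsplit

# Hypothetical lookup table from old article paths to new ones.
# These paths are invented for illustration; they are not real GATK URLs.
REDIRECTS = {
    "/gatk/old-guide/article-a": "/new-guide/article-a",
    "/gatk/old-guide/article-b": "/new-guide/article-b",
}

# Hypothetical landing page for links that are missing from the table.
FALLBACK = "/new-guide/search"


def resolve(old_url: str) -> Tuple[int, str]:
    """Return an (HTTP status, target path) pair for an old URL."""
    # Compare on the path only, ignoring host, query string and trailing slash.
    path = urlsplit(old_url).path.rstrip("/")
    target = REDIRECTS.get(path)
    if target is not None:
        # Known article: permanent redirect to its new location.
        return 301, target
    # Unknown article: temporary redirect to a fallback page instead of an error.
    return 302, FALLBACK
```

In this sketch a known link gets a permanent redirect, while anything the table missed is sent to a fallback page, which is one way of handling missing content gracefully.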
GATK3 documentation will be archived in GitHub
We're aware that the awkward coexistence of docs from the GATK3 era and the newer GATK4 versions is one of the major sources of confusion in the current user guide. We've wrestled with this problem and ultimately decided to move the GATK3 content to the documentation archive in the old GATK repository in GitHub, where versions 1.x through 3.x of the code live. This way the old content will always remain available for anyone still working with older versions of the software, yet it will be clearer that it only applies to those versions, and it will be out of the way of anyone using the current versions of the tools. And of course, we'll do our best to include all those articles in the redirect mapping to keep links functional.
You'll need a new account for the community forum
Unfortunately we're not able to move existing forum accounts to the new platform. So to ask questions, start discussions or add comments in the new Community forum, you’ll need to create a new account. The good news is that this new account will work for all the other products we support, like Terra and Cromwell/WDL. And if you already have a Terra community forum account, you’re all set.
The new system will support single sign-on (SSO) with Google, Microsoft, Twitter and Facebook.
Old forum discussions will eventually be taken offline
I'll admit it: this is the part that makes me hyperventilate a little. We have over 17,000 discussion threads in the "Ask a question" section of the forum, and it's just not feasible to migrate them all over to the new platform. Most of them are out of date anyway, referencing old tools and methods that we no longer recommend, command syntax that no longer works, and my favorite, bugs that no longer occur! But there is still plenty of useful information in there that's not in the docs, from explanations of weird errors to strategies for customizing methods for non-standard use cases. So we're going to keep the old forum online in read-only mode for the next few months, and during that time we'll comb through the most frequently visited threads to capture the good stuff and turn it into documentation articles. We're also open to suggestions if there are any discussions that you have found particularly useful in your own work.
However, at some point we're going to have to shut down the old forum. The plan right now is to shut it down on February 1st, 2020, but we'll re-evaluate that timing if we feel that we need more time for the knowledge capture process. Your opinion on this matters a lot to us, so don't hesitate to nominate threads that you think would be useful to preserve in the knowledge base. We also recommend you save any threads that are important to you personally as a PDF or HTML page on your computer just in case. If all else fails, the Internet Archive's "Wayback Machine" does preserve snapshots of the forum, so it's very likely that those old forum discussions will actually outlive us all.
Ultimately the purpose of these resources is to help you use GATK effectively in your work, so we'd really like to hear from you, especially if you have concerns about how any of this is going to affect how you normally use the documentation and forum. We're very open to making amendments based on your feedback, both before the migration happens and during the months that follow. I have no doubt that as the dust settles and we put some mileage on the new platform, we'll see opportunities emerge to tweak it for the better. Don't be shy about volunteering your thoughts and suggestions!
From deena_b on 2019-11-21
Woah! Big change! Thanks for the heads up.
Would you please add a link to register for a Terra community forum account?
From matdmset on 2019-11-22
Thanks for the heads up!
Could you also add GitHub to the OAuth options? Since most of us already have a GitHub account, I suppose this’ll be a frequently used option!
Cheers
M
From kvn95ss on 2019-11-25
Big changes indeed! Could we better archive the current forum, rather than just cherry-picking the things which ‘we’ might deem useful? I’m pretty sure there are some posts which might benefit only a few people, but nonetheless removing the older, obscure posts would be a disservice to the people who benefit from them.
Isn’t it possible to keep the forum for a longer period of time, like a year or so? A permanent, read-only archive would have been nice, but I can see how the information might be confusing to some.
From danilovkiri on 2019-11-27
Great change! It would be nice to have ALL GATK3 tools in GATK4 before the transition. Unfortunately, several great GATK3-specific tools are still either not implemented or only partially implemented in GATK4. Parallelism is also a great concern in GATK4 as it has not been applied to the majority of the tools. My message is not to forget that there is still a lot to be done before one can finally forget about GATK3. Looking forward to GATK4 new releases.
From bhanuGandham on 2019-11-28
@deena_b
Here is a link to the terra forum community: https://support.terra.bio
@matdmset
You will be able to use Google, Microsoft, Twitter, and Facebook accounts, but not GitHub.
@kvn95ss
Thank you for the feedback! I hear you, and we will try our best to capture and transfer as much of the knowledge as we can from the old forum to the new site, but we won’t be able to keep the old forum around for a year. If need be, we will extend the read-only period a little beyond Feb 1st 2020. We will also create a public database of all the old threads in GitHub.
@danilovkiri
Thank you for your feedback. We are constantly trying to improve our tools and your feedback is very helpful in that process.
From igor on 2019-12-05
I am sad to hear that the forum is being sunset, especially since you are moving to a closed-source remotely hosted commercial solution. No one can guarantee that Zendesk will continue to provide a similar (or better) service years into the future, forcing another painful migration possibly without the ability to keep the data. Anyway, I am sure there was a lot of thought that went into the process, so my comments are irrelevant.
However, I would like to make a request that I think is fairly reasonable. Is it possible to offer a database dump of the current forum? If it is too much trouble for you to keep the content permanently available, why not make it possible for other interested parties to set up a mirror? As you already mentioned, it will be possible to use the “Way Back Machine”, but that does not always offer a great experience (no search, for example).
From Geraldine_VdAuwera on 2019-12-06
Hi @igor, I understand your feeling on this. It was definitely a tough decision to move away from Vanilla. To be frank the pace of change in technology is such that I wouldn’t be surprised if in 7+ years we need to adapt our platform again one way or another. In the meantime, we’re going to do our best to make the migration as painless as possible.
We are indeed going to save a database dump of the forum contents (except for user profiles, which will be deleted in order to protect private information). I’m not super comfortable with the idea of just putting that data dump out there for anyone to use, for a variety of reasons, but if someone is serious about wanting to help set up an archive of that content for the benefit of the community I’d be open to discussing it. If that is your case, you’re welcome to contact me by email at geraldine at broadinstitute dot org.
From krisss on 2019-12-09
Look Forward to use new software with most advanced features.
just want to check will you be using Artificial intelligence or Machine Learning to build new software?
From wbsimey on 2019-12-12
Thanks for the update Geraldine! Looking forward to the new site. I know Broad is here to support human research, but many of us non-model folks use GATK heavily and this will continue to grow as genome level analyses become increasingly common. What about the whacky idea of creating a non-model sub forum? Perhaps that could be community driven rather than having Broad moderators.
From Geraldine_VdAuwera on 2019-12-13
@wbsimey Yes actually, it's not that wacky!
@bhanuGandham has some cool ideas along those lines, which she will be developing in the course of the next few months. If you’re interested in being a thought partner on this project, please feel free to reach out to her through direct messages.
From davidben on 2019-12-13
> Look Forward to use new software with most advanced features. just want to check will you be using Artificial intelligence or Machine Learning to build new software?
@krisss We already do.
From bhanuGandham on 2019-12-16
@wbsimey I agree with you. And as Geraldine said, we are going to make provisions in the new forum to enable and encourage community driven discussions. If you are interested in collaborating in this effort please feel free to reach out to me directly via email at bgandham@broadinstitute.org and I can share more details with you.