
      Why All Genealogy Records Are Not Free


      The following article is from Eastman's Online Genealogy Newsletter and is copyrighted by Richard W. Eastman. It is re-published here with the permission of the author. Information about the newsletter is available at http://www.eogn.com.


      - Why Isn't It Free?

      The following was published as a Plus Edition article a couple of weeks
      ago. At the suggestion of several newsletter readers, I am now
      republishing it in the Standard Edition newsletter for all to see. If
      you wish, you do have permission to forward this article to others or to
      republish it elsewhere for non-profit use. For details, look at the
      menus on the upper right side of this web page and click on COPYRIGHTS.

      One topic that surprises me has appeared several times recently in
      comments from this newsletter's readers. Some people have questioned the
      idea of placing public domain data online and charging for access to
      that information, as is done by Ancestry.com, Footnote.com, FindMyPast,
      WorldVital Records, and others. One person claimed that it is illegal to
      charge for access to public domain data, and another reader stated that
      the online sites are "violating my constitutional rights to view the
      census."

      Sorry, folks, but that simply isn't true.

      Indeed, in the U.S. and Canada, governmental records are public domain,
      available free of charge to those who can travel to the repositories
      where the original records are stored. Many private records, such as
      church records, may not be public domain, but they are also often
      available at no charge if one can travel to view them. When travel is
      not an option, a trip to a local library may suffice if that library has
      microfilms of the original records that patrons can view for free. (For
      this article, I will ignore the costs of sending a filming crew to a
      repository to make the microfilms and the expenses of reproducing and
      distributing microfilms. However, those expenses are not trivial.)

      Given the fact that the records are already available "free of charge,"
      one might question the need to pay $50 or $100 or more per year to
      access the same records on a subscription service such as Footnote.com,
      Ancestry.com, Origins.net, NewEnglandAncestors.org, and other genealogy
      web sites.

      First of all, the idea that the records are available "free" is only
      true for those who live near the repository that houses the original
      records or photocopies of the records and can walk to that repository.
      If you have to travel some distance to a library that houses the records
      you seek, you will incur travel expenses. Even a trip to a library a few
      miles away will incur costs for gasoline and perhaps for parking. Such
      records are not truly "free."

      While perhaps the visitor doesn't pay anything to view records in books
      or in microfilms, that library had to pay someone for the books, the
      microfilm, the microfilm reader, the building, the employees, heat,
      electricity, etc. The library may not charge the patron to look at the
      microfilms, but the process certainly is not free. Information in a
      library is never really free. Someone always pays, usually the
      taxpayers.

      A longer trip will incur airfare or automobile expenses, along with
      hotel rooms and meals. I can go to Salt Lake City to view the “free”
      records available at the Family History Library. The last time I made
      that trip, it certainly was not “free.”

      A three-day trip to a distant repository can easily cost $500 or more.
      If I want to go back to the "old country" to look at records, expenses
      will be much higher, of course. For many who do not live near major
      genealogy libraries, this quickly changes the concept of "free."

      From the genealogist's viewpoint, accessing records published on the
      Internet greatly increases convenience and reduces travel expenses. From
      the publisher's viewpoint, the financial realities of publishing on the
      web add up rather quickly when one looks at the expenses involved with
      acquiring, digitizing, and electronically publishing records of interest
      to genealogists. Such an effort is not cheap.

      To be sure, there are hundreds of web pages available today at no charge
      that contain transcribed records from a variety of sources. RootsWeb has
      many such pages, as do freebmd.org.uk, genuki.org.uk, Find-A-Grave,
      hundreds of local society web sites, and many others. These web sites
      contain records transcribed by volunteers, and someone pays for the web
      servers, often without passing those expenses on to users. In most
      cases, the expenses are not huge, and advertising can help pay the
      bills. A few of these web sites may even contain images of the original
      records. Most of these sites have databases that contain hundreds or
      even thousands of records. In contrast, commercial services typically
      provide millions of records, usually many millions. With larger
      databases come larger expenses.

      Let's assume that a company or even a genealogy society, such as the New
      England Historic Genealogical Society, decides to make state vital
      records available on the World Wide Web. Once an agreement has been
      negotiated with the state, the company or society starts work. I will
      make some rough estimates of the expenses involved.

      In our example, let's say that the project entails 25 million
      handwritten records that were recorded over a 50-year period. (This
      would be for a state with a rather small population; many states will
      have more records than that in a 50-year period.) Digitizing these
      records will require thousands of man-hours. It is doubtful that anyone can
      find that number of unpaid volunteers to travel to the repository, run
      the scanners, and enter the data. In fact, the repository may not even
      have room for a crew of that size.

      If you own a scanner, calculate how many pages you can scan in one hour.
      Then calculate how long it would take you to scan twenty-five million
      pages. Using a scanner purchased at a local computer store, I can scan
      one page every 2 minutes. Assuming a 40-hour work week, I will need
      20,833 weeks, or roughly 400 years, for this project. Clearly,
      hobbyist-grade scanners will
      never get the job done. Expensive, high-speed scanners need to be
      purchased. Five thousand dollars is a typical price for high-volume
      scanners, and this project will probably require two or more of them.
      Next, operators need to be hired to sit at the scanners 40 hours a week
      to create the digitized images. Those operators need to be paid.
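
      The scanning arithmetic above can be checked with a short Python
      sketch. The page count, the 2-minute scan time, and the 40-hour week
      come from the text; the function name and the scanners parameter are
      my own illustrative assumptions, not part of the original article.

```python
# Figures from the article: 25 million pages, one page every 2 minutes,
# and a 40-hour work week. The `scanners` parameter is a hypothetical
# addition that lets you see the effect of running machines in parallel.
def weeks_to_scan(pages: int, minutes_per_page: float,
                  hours_per_week: float = 40.0, scanners: int = 1) -> float:
    """Return the number of work weeks needed to scan `pages` pages."""
    total_hours = pages * minutes_per_page / 60.0
    return total_hours / (hours_per_week * scanners)


weeks = weeks_to_scan(25_000_000, 2.0)
print(f"{weeks:,.0f} weeks, or about {weeks / 52:,.0f} years")
```

      Passing scanners=2 halves the figure, which is still far too long
      at hobbyist speeds. That is why faster equipment and paid operators
      are needed.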

      This process only makes scanned images of the records, probably the
      simplest and least-expensive part of the project. Somebody else then
      needs to make indexes as well. The process will vary, depending upon
      what is already available. In many cases, someone sitting at a computer
      will need to index each and every one of the millions of entries. Add in
      many more thousands of dollars in labor charges.

      Now we have created images, plus indexes to those images. We need some
      skilled programmers to combine all the data into one huge database.
      The labor of skilled database administrators is not cheap, either.

      Once the records have been digitized and a database has been created,
      the real expenses begin. This database with twenty-five million
      high-quality images requires several terabytes of disk storage. (A
      terabyte equals one thousand gigabytes, the same as one million
      megabytes.) The purchase of a high-uptime, high-throughput disk array of
      that size, along with built-in backup capabilities, easily costs $25,000
      or more per terabyte. Add in the expense of a web server, a database,
      and the required software, and the cost soon exceeds $100,000 for the
      required hardware and software to make these records available online to
      genealogists. This figure does not include the labor charges mentioned
      earlier. All this is for a small web site. High-activity web sites such
      as Ancestry.com will cost much, much more.
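
      As a rough sketch of the storage estimate, the Python below multiplies
      the number of images by an image size and prices the result per
      terabyte. The 25 million images and the $25,000 per terabyte come from
      the text. The 200,000-byte image size is purely my assumption for
      illustration; the article gives no per-image size.

```python
BYTES_PER_TERABYTE = 10**12  # decimal terabyte: one thousand gigabytes


def storage_cost(images: int, bytes_per_image: int,
                 dollars_per_terabyte: float) -> tuple[float, float]:
    """Return (terabytes needed, disk array cost in dollars)."""
    terabytes = images * bytes_per_image / BYTES_PER_TERABYTE
    return terabytes, terabytes * dollars_per_terabyte


# ASSUMPTION: 200,000 bytes per image is a hypothetical figure.
tb, cost = storage_cost(25_000_000, 200_000, 25_000)
print(f"{tb:.1f} TB of images, about ${cost:,.0f} for the disk array")
```

      Under that assumed image size, the result is consistent with the
      "several terabytes" and "exceeds $100,000" figures above. Larger
      images raise the cost in direct proportion.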

      Next, we need very high-speed connections to connect the hardware to the
      Internet so that we can serve 100 or more simultaneous users who wish to
      view these large graphics files. A single T-1 line is the minimum
      requirement for 20 or 30 simultaneous users, but most commercial web
      servers today are connected by multiple OC-3 connections. (I'll skip the
      technical discussion of T-1 and OC-3 connections. Let's just say that
      they are very high-speed lines, capable of handling many simultaneous
      users. They also cost a lot of money.)
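
      For readers who do want a little of the arithmetic, here is a minimal
      Python sketch. The line rates are the standard ones (1.544 megabits
      per second for a T-1, 155.52 for an OC-3). The 200,000-byte image
      size, the equal sharing of the line among users, and the neglect of
      protocol overhead are all simplifying assumptions of mine.

```python
T1_MBPS = 1.544    # standard T-1 line rate, megabits per second
OC3_MBPS = 155.52  # standard OC-3 line rate, megabits per second


def seconds_per_image(line_mbps: float, users: int, image_bytes: int) -> float:
    """Estimate seconds to deliver one image when `users` share a line equally.

    ASSUMPTION: every user downloads at the same time and protocol
    overhead is ignored, so real-world times would be somewhat longer.
    """
    per_user_bits_per_second = line_mbps * 1_000_000 / users
    return image_bytes * 8 / per_user_bits_per_second


# ASSUMPTION: 200,000 bytes per image is a hypothetical figure.
for name, rate, users in [("T-1", T1_MBPS, 30), ("OC-3", OC3_MBPS, 100)]:
    wait = seconds_per_image(rate, users, 200_000)
    print(f"{name} shared by {users} users: {wait:.1f} seconds per image")
```

      The sketch suggests why a single T-1 is only a bare minimum: with 30
      users all requesting images at once, each of them waits a long time
      for a single page.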

      In most cases, it is cheaper to install the disk array, database server,
      and web server at a commercial web hosting service than to build one's
      own data center. Hosting fees for a high-usage database start at $1,000
      a month and quickly go up. Way up. Commercial genealogy companies with
      lots of users typically pay $10,000 or more per month in hosting fees.
      This may seem high, but it is still much less expensive than building
      your own data center.

      The bottom line is clear to anyone with a calculator: more than a
      quarter million dollars is easily expended to make high-quality original
      source records available to genealogists. Following that cost are
      monthly fees to keep this data available.

      The result is a database in which one can search for a name, find it,
      double-click on the entry, and then see an image of the original record.
      In other words, primary source records are visible to anyone in Virginia
      or California or Australia or anywhere else in the world with no travel
      expenses required.

      Of course, I have ignored many other expenses. When a popular database
      of this sort is placed online, users will have questions. Someone needs
      to answer those questions, so we must create a customer service
      department. In the case of a society, a few members might step forward
      to answer questions. In the case of Ancestry.com, it means several
      hundred employees and a large building with telephones, computers, and
      high-speed data connections. Again, you can guess at the expenses.

      Where does this money come from?

      Yes, it would be nice to provide genealogy information online at no
      cost. However, if you are the person who wishes to provide that
      information, a few minutes with a calculator will quickly bring you back
      to reality.
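
      Those few minutes with a calculator might look something like the
      Python sketch below. The quarter-million-dollar startup cost, the
      $1,000 monthly hosting fee, and the $50 and $100 annual subscription
      prices come from this article. Recovering the costs within one year
      is my assumption, and the sketch ignores customer service and other
      ongoing labor, so treat the result as a floor rather than a forecast.

```python
import math


def subscribers_needed(upfront: float, monthly_hosting: float,
                       annual_fee: float, years: int = 1) -> int:
    """Return the subscribers needed to cover costs over `years` years.

    ASSUMPTION: costs are recovered evenly over the period and every
    subscriber pays the full annual fee for each year.
    """
    total_cost = upfront + monthly_hosting * 12 * years
    return math.ceil(total_cost / (annual_fee * years))


for fee in (50, 100):
    count = subscribers_needed(250_000, 1_000, fee)
    print(f"At ${fee} per year: {count:,} paying subscribers to break even")
```

      Even at the low end of the hosting fees, thousands of paying
      subscribers are needed before the project recovers its costs.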

      I like to use the analogy of water. Water is free. If I wish, I can
      obtain all the water I want at no charge. All I have to do is go to
      where the water is located. I can leave buckets on the lawn when it
      rains to obtain free water. If that is insufficient to meet my needs, I
      can walk to the nearest river or lake with buckets, scoop up all the
      water I want, and carry it home at no charge. Our ancestors did that
      centuries ago, and we can still do that today if we want. Nothing has
      changed. Water is still free.

      However, if we want the convenience of having water delivered to our
      homes, we will incur expenses. Our ancestors did not have this option.

      Someone paid to purchase large pumps, and they paid for the pipes to be
      buried underground to connect our house to the water mains. The entire
      construction effort cost many thousands of dollars. In addition,
      employees were hired to maintain the pumps and the pipes to make sure
      everything continues to work correctly. As a result, those who consume
      the water must pay a fee. Yes, the water is free, but the pipes, the
      pumps, and the employees are not. Almost all urban homeowners today pay a
      water bill. We pay for the convenience of home delivery. Those who do
      not want to pay the delivery fee could elect to have the water shut off
      and then obtain free water in the same manner that our ancestors did.

      In my mind, public domain information is the same. The information is
      free, always has been free, and probably always will be free. I can
      still obtain information today at no charge in the same manner I always
      have: by going to the source records and looking at them in person. If I
      want to go to the location where the information is located, I can do so
      at no charge, assuming I am willing to walk. If the information is
      located hundreds or thousands of miles away, I may encounter significant
      travel expenses, but the information itself remains free of charge.

      HOWEVER, if I want someone to conveniently deliver the information to my
      home at any hour of the day or night that I might want it, I have to pay
      for "the pipes" and for the labor of those who provide that convenient
      access. We might consider the information to still be free, but the
      "pipes" (the servers, the high-speed data connections, the data centers,
      and the air conditioning to keep the equipment cooled, etc.) are not
      free, nor is all the labor of the hundreds of people who are involved in
      delivering that information to me. Those who invest millions of dollars
      in high-speed data "pipes" and all the associated labor certainly do
      deserve fair compensation for their investments.

      Yes, the data was free once, and it is still free today. As always, I
      still may go to the location where the information is stored and, in
      most cases, I can look at that information free of charge. Nothing has
      changed. The only significant change is that we all now have another
      option: we can still do things the old way at no charge, or we may use
      new, convenient delivery options if we are willing to pay for that
      convenience.

      Personally, I cannot afford to travel to Maine or Texas or England or
      Sweden to look at every single bit of information about my ancestors
      that I want to see. I find it much cheaper to sit at home and pay $10 or
      $30 a month to look at that information. Heck, ten bucks won't even pay
      for the shuttle bus to the airport, much less airline tickets, hotels,
      restaurant meals, and other required expenses to look at the "free"
      records.
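
      The comparison is easy to put in numbers. This small Python sketch
      uses the $500 three-day trip mentioned earlier and the $10 to $30
      monthly fees above; the function names are mine and are only for
      illustration.

```python
def annual_subscription(monthly_fee: float) -> float:
    """Return the yearly cost of a subscription billed monthly."""
    return monthly_fee * 12


def months_covered(trip_cost: float, monthly_fee: float) -> float:
    """Return how many months of subscription one trip's cost would buy."""
    return trip_cost / monthly_fee


TRIP_COST = 500  # the three-day trip estimate from this article

for fee in (10, 30):
    print(f"${fee} a month is ${annual_subscription(fee):,.0f} a year; "
          f"one trip equals {months_covered(TRIP_COST, fee):.1f} months")
```

      Even at the higher fee, a single short trip costs more than a full
      year of online access.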

      The only practical method of placing large amounts of genealogy
      information on the web is to have someone pay the expenses of acquiring,
      digitizing, and providing the data. In most cases, this means that the
      customers who benefit will pay. If the genealogy public does not wish to
      pay the expenses of "piping" the information to our homes, we can always
      do what all the genealogists of yesteryear used to do: travel to the
      repositories where the documents are kept.

      As for me, I will choose the cheaper option and pay a modest fee for
      someone to "pipe" the information directly to my home.

      Do you have comments, questions, or corrections to this article? If so,
      please post your words at:
      http://blog.eogn.com/eastmans_online_genealogy/2009/12/why-isnt-it-free.html

      You may find that other newsletter readers have already posted comments,
      questions, or corrections to this article at the same place.



