1,446 American Civil War battles and incidents in 1 minute
This is the video that plots 1,446 Civil War battles and incidents in 1 minute.
The program that displays the animation is here. The code.exe program reads the data.txt list of latitude/longitude pairs and dates and plots them on a map.
Delphi 6 source code is included.
master.xls is a spreadsheet database of Civil War battle data. The cwdata.xls spreadsheet was extracted from it, and cwdata.xls was in turn used to generate the data.txt file that the code.exe animation program reads.
While consolidating and reconciling data from various sources in the preparation of master.xls, the following became evident:
1. There's a huge problem with data accuracy, consistency, completeness, etc. Every name and number needs to be taken with a grain of salt. Sometimes the numbers do more harm than good (for example, all the dates in one database were too early by one day). Inconsistency is rampant. For example, this page lists Manassas II as starting August 28 but this page at the same web site has it starting August 29.
2. A large number of battles and places are known by more than one name. Lists tend to use one of two approaches to solve this. One method is to key on the "preferred" name and list the alternate name alongside:
One problem with the above is that there are often more than two variations of the name:
With the above approach, finding a name requires searching two different columns. To address that, some databases list everything in one column:
Although every name can now be found with a simple search in one column, there are many redundant date listings (which would be even worse with additional columns like casualties), and it isn't obvious that two different rows are the same event.
The solution we recommend is this:
Now every name can be found by searching only one column, additional data (like casualties) only needs to be added to a single row, and there's no confusing two names as different events.
The rule that every alternate name has an "Instead see" entry pointing to its primary name should also be applied to cities, campaigns, generals, etc. Yes, you end up with a lot of primary and alternate columns, but computers are good at that, and the resulting consistency, flexibility, and ease of use are well worth the verbose structure.
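As a rough illustration of the "Instead see" rule, here is a minimal Python sketch. It is not taken from master.xls or from the Delphi source: the column names (name, instead_see, date) and the resolve function are hypothetical, and the rows are only a small sample. Every name gets its own row in one column, alternate rows carry nothing but a pointer, and all data lives on the primary row.

```python
# Hypothetical layout: one "name" column holds every primary and alternate name.
# Only primary rows carry data; alternate rows just point to their primary row.
ROWS = [
    {"name": "Antietam", "instead_see": None, "date": "1862-09-17"},
    {"name": "Sharpsburg", "instead_see": "Antietam", "date": None},
    {"name": "Bull Run I", "instead_see": None, "date": "1861-07-21"},
    {"name": "Manassas I", "instead_see": "Bull Run I", "date": None},
    {"name": "First Manassas", "instead_see": "Bull Run I", "date": None},
]

# Index every row by name so a single lookup finds any spelling.
INDEX = {row["name"]: row for row in ROWS}


def resolve(name):
    """Return the primary row for a primary or alternate name."""
    row = INDEX[name]  # raises KeyError for a name that is not listed
    target = row["instead_see"]
    if target is None:
        return row  # already the primary row
    primary = INDEX[target]
    if primary["instead_see"] is not None:
        # Enforce the rule: an alternate must point straight at a primary.
        raise ValueError(f"{name!r} points to another alternate: {target!r}")
    return primary


if __name__ == "__main__":
    print(resolve("Sharpsburg")["name"], resolve("Sharpsburg")["date"])
    print(resolve("First Manassas")["name"])
```

Because alternates never hold data of their own, adding a casualties column later means touching only the primary rows.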
3. Killed and wounded are obviously "casualties", but are missing and captured "casualties" or "losses"? Unfortunately, lists are inconsistent in what gets counted as which, resulting in lots of numbers that can't easily be reconciled. Any list should always clearly announce if missing/captured are counted as casualties or not. Being as detailed and succinct as possible is best:
Do the blank entries above mean unknown or zero? Again, databases are inconsistent. A database should always state explicitly that blank entries are unknown and zero entries are known, for example:
which shows there were zero captured in Battle 1 but an unknown number of missing in Battle 2.
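The blank-versus-zero distinction can be sketched in Python as follows. This is only an illustration under assumptions: the field names, the parse_count and total helpers, and every number below are made up for the example and do not come from any real battle record. A blank cell becomes None (unknown), "0" becomes 0 (known to be zero), and the caller states explicitly which fields are being counted.

```python
# Hypothetical field names; a real spreadsheet may use different columns.
CASUALTY_FIELDS = ("killed", "wounded")
LOSS_FIELDS = ("killed", "wounded", "missing", "captured")


def parse_count(cell):
    """Blank cell -> None (unknown); anything else -> int (known, possibly 0)."""
    cell = cell.strip()
    return int(cell) if cell else None


def total(record, fields):
    """Sum the known counts and report whether every field was known."""
    known = [record[f] for f in fields if record[f] is not None]
    return sum(known), len(known) == len(fields)


# Made-up numbers, for illustration only.
battle_1 = {"killed": 10, "wounded": 25, "missing": 3, "captured": parse_count("0")}
battle_2 = {"killed": 4, "wounded": 12, "missing": parse_count(""), "captured": 7}

if __name__ == "__main__":
    print(total(battle_1, LOSS_FIELDS))      # complete: zero captured is known
    print(total(battle_2, LOSS_FIELDS))      # incomplete: missing is unknown
    print(total(battle_2, CASUALTY_FIELDS))  # complete for killed + wounded
```

Returning the completeness flag alongside the subtotal keeps an unknown value from silently being treated as zero in a grand total.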
4. Last but not least, almost every statistic seems to have different claimed values. The number of troops, number of casualties, etc., will often be a vague estimate, and different sources will claim different numbers. Sometimes a value is sufficiently documented to be considered irrefutable, but that is fairly rare, and often "documented" numbers still disagree.
master.xls took the easy way out with the quick and dirty method of combining irreconcilable differences into entries like "100 or 200", but that prevents calculations such as computing totals.
Perhaps a solution is to properly treat every claimed number as either an "estimate" or a documented "known", and when more than one estimate exists, show a low-high range:
It starts to get messy when trying to consolidate ranges like the above with multiple columns like the casualties vs. losses columns in suggestion #3. It also leaves open the problem of how to list multiple "knowns" claimed by different documents or sources (perhaps they should then be treated as estimates only?). Still, the more attention a database gives to these kinds of details, the more respected and useful it will be.
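One possible way to keep totals computable is to store each claimed number as a low-high range with a flag for whether it is a documented "known". The Python sketch below is only an illustration of that idea, not how master.xls works; the Claim class, merge_claims, and add are hypothetical names. It also takes the option raised above of treating conflicting "knowns" as an estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    low: int
    high: int
    known: bool = False  # True only for a single documented value

    def __post_init__(self):
        if self.low > self.high:
            raise ValueError("low must not exceed high")
        if self.known and self.low != self.high:
            raise ValueError("a known value cannot be a range")


def merge_claims(values, documented=False):
    """Collapse several claimed numbers into one low-high Claim.

    The result is a "known" only if the values are documented and all agree;
    disagreeing documented values are treated as an estimate range.
    """
    low, high = min(values), max(values)
    return Claim(low, high, known=documented and low == high)


def add(a, b):
    """Add two claims: the bounds add, and the sum is known only if both are."""
    return Claim(a.low + b.low, a.high + b.high, a.known and b.known)


if __name__ == "__main__":
    disputed = merge_claims([100, 200])          # the "100 or 200" case
    documented = merge_claims([50], documented=True)
    print(add(disputed, documented))
```

With this representation, an entry like "100 or 200" becomes the range 100 to 200, and adding a known 50 gives a total of 150 to 250 that is still marked as an estimate.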