Starting in December 1998, usage data was broken down by cluster.
NOTE: Because of various outages and migrations of the larvnet and syslog servers over the years, some portions of the raw data contained duplicate entries, that is, more than one entry for the same day and cluster. These duplicate entries were pruned. In some cases it was clear which entry to prune, either because the entries were exact duplicates of each other or because one entry's values were clearly erroneous (for example, showing 0 total machines in a cluster). In other cases, a more subjective judgment was needed. Regardless, each spreadsheet lists the pruned entries so they can be reincorporated if necessary.
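The pruning rules described in the note can be illustrated with a short Python sketch. This is not the procedure actually used to prepare the spreadsheets. It only shows one way the two clear-cut rules (exact duplicates, and entries showing 0 total machines) could be applied automatically, with everything else set aside for manual review. The field names "day", "cluster", "total" and "in_use", the cluster names, and the function name prune_duplicates are all assumptions made for illustration.

```python
from collections import defaultdict


def prune_duplicates(rows):
    """Split rows into (kept, pruned, ambiguous) by (day, cluster).

    Each row is a dict. The keys 'day', 'cluster' and 'total' are
    hypothetical names, not taken from the actual spreadsheets.
    """
    groups = defaultdict(list)
    for row in rows:
        groups[(row["day"], row["cluster"])].append(row)

    kept, pruned, ambiguous = [], [], []
    for group in groups.values():
        # Rule 1: collapse exact duplicates, keeping the first copy.
        unique = []
        for row in group:
            if row in unique:
                pruned.append(row)
            else:
                unique.append(row)

        # Rule 2: drop clearly erroneous entries (0 total machines),
        # but only when a plausible alternative remains.
        plausible = [r for r in unique if r["total"] > 0]
        if plausible and len(plausible) < len(unique):
            pruned.extend(r for r in unique if r["total"] <= 0)
            unique = plausible

        if len(unique) == 1:
            kept.append(unique[0])
        else:
            # Conflicting entries that neither rule resolves are left
            # for a human to decide (the "subjective" cases).
            ambiguous.append(unique)
    return kept, pruned, ambiguous


if __name__ == "__main__":
    sample = [
        {"day": "1998-12-01", "cluster": "A", "total": 10, "in_use": 4},
        {"day": "1998-12-01", "cluster": "A", "total": 10, "in_use": 4},
        {"day": "1998-12-02", "cluster": "A", "total": 0, "in_use": 0},
        {"day": "1998-12-02", "cluster": "A", "total": 10, "in_use": 3},
    ]
    kept, pruned, ambiguous = prune_duplicates(sample)
    print(len(kept), "kept,", len(pruned), "pruned,", len(ambiguous), "ambiguous")
```

Returning the pruned rows instead of discarding them mirrors the approach taken in the spreadsheets, where the pruned entries are preserved so they can be restored later.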
Each Excel file has two sheets: one containing the actual data, and another listing the duplicate entries that were pruned from the raw data.