Event Selection Examples
To set up a formal test, we identify an event that focuses the collective attention of large numbers of people around the world. We then fix the start and end times and specify the statistical analysis to be performed on the data. As explained in Event Selection Procedures, we choose a variety of kinds of events in order to learn what matters, and we have only gradually learned how to set adequate parameters. In particular, we had to guess what kinds of events might produce the data deviations we hoped to study, and we also had to learn how much data should be included in the event specification. Experience and analysis have helped answer these questions, and although the effect size is so small that we need dozens of events for reliable statistics, it is possible to standardize many of the event selection parameters.
The descriptions below explain how we set the time period for most events. Some are firmly fixed, and others generally so, while a few categories still demand flexibility. A standard analysis statistic has been used for almost all events since late 1999. It is a measure of network variance, calculated as the squared Stouffer's Z score for each second, accumulated over the whole event period.
It is important to add that in all cases the specification is made a priori. All formal events are completely defined and entered into the hypothesis registry before the corresponding data are extracted from the archive.
Event Specification Examples
The following list represents our selection procedure as of 2010. The specifications are guidelines rather than strict rules, but they cover most kinds of events. Much of the guidance comes from a decade of experience, during which we have developed useful rules of thumb. Analysis shows, for example, that event periods have become longer over the years, and these guidelines reflect that. Similarly, we have learned that the average effect period is a few hours, but also that the exact length of our event period is not critical; in other words, great flexibility is not required.
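The network variance statistic described earlier (the squared Stouffer's Z score for each second, accumulated over the event period) can be illustrated with a minimal sketch. It assumes that the data for each second are already available as normalized z-scores, one per reporting device; the function names and the input layout are hypothetical and are not taken from the project's own software. The final conversion to a single event z-score uses the large-sample normal approximation to the chi-square distribution, which is an assumption of this sketch rather than a documented step.

```python
import math


def stouffer_z(z_scores):
    """Combine the z-scores of all devices reporting in one second."""
    return sum(z_scores) / math.sqrt(len(z_scores))


def network_variance(z_by_second):
    """Accumulate the squared Stouffer Z over every second of the event.

    Under the null hypothesis each squared Z is chi-square distributed with
    1 degree of freedom, so the sum over T seconds has T degrees of freedom.
    Returns (chi-square sum, degrees of freedom).
    """
    squared = [stouffer_z(row) ** 2 for row in z_by_second]
    return sum(squared), len(squared)


def event_z(chisq, dof):
    """Approximate z-score for the event (assumption: dof is large).

    A chi-square variable with dof degrees of freedom has mean dof and
    variance 2 * dof, and is close to normal when dof is large.
    """
    return (chisq - dof) / math.sqrt(2 * dof)
```

Because an event period spans hours, the number of seconds is in the thousands, which is the regime where the normal approximation in `event_z` is reasonable. Rows may have different lengths, since the number of devices reporting can vary from second to second.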
About half the events in the formal series are identifiable before the fact; the accidents, disasters, and other surprises must, of course, be identified after they occur. We do not look for "spikes" in the data and then try to find what caused them. Such a procedure would be inappropriate, though many people imagine it is what we do or should do. After specification, and after the data are in, analysis for an event proceeds according to the registry specifications, yielding a test statistic relative to the null hypothesis. These individual results become the series of replications that address the general hypothesis, and ultimately they are combined to estimate its likelihood.
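The text does not say how the individual event results are combined. One common choice, shown here only as an assumed illustration, is to express each event's result as a z-score and apply Stouffer's method across events. The function names are hypothetical.

```python
import math
from statistics import NormalDist


def combine_events(event_z_scores):
    """Stouffer combination of per-event z-scores (assumed method)."""
    n = len(event_z_scores)
    return sum(event_z_scores) / math.sqrt(n)


def one_sided_p(z):
    """Upper-tail probability of z under the standard normal null."""
    return 1.0 - NormalDist().cdf(z)
```

Under this scheme, a constant average per-event z-score yields a combined score that grows with the square root of the number of events, which is why a very small effect needs dozens of replications before the composite statistic becomes reliable.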