Revitalizing Decades-Old Analog Seismograms Through Image Analysis and Digitization

By Petros Bogiatzis, Harvard University


Before the advent of digital seismographs in the 1970s, scientists relied on analog seismographs to measure seismic waves. Millions of these aging seismograms are archived in observatories around the world, constituting a vast store of valuable scientific information. Until now, however, accessing this information has been problematic because modern analytical techniques were developed for use with digital seismographs and require discretized time series data.

Professor Miaki Ishii and I at the Harvard University Seismology Group have unlocked this previously inaccessible analog data by developing an interactive software tool that converts images of analog seismograms to time-series data. The DigitSeis software uses MATLAB® image processing algorithms to identify time marks and correct image distortions to establish the timing and amplitude of every signal. Our team is using DigitSeis to digitize seismograms from the 1930s through the 1950s archived at the Harvard-Adam Dziewoński Observatory (HRV). The software continues to be developed as we apply the technique to different styles of recordings. To date, about two dozen seismograms have been digitized.

One outcome of this research will be a larger, more complete catalog of earthquakes in tectonically quiet regions, such as the Northeastern U.S., where earthquakes are uncommon. By enabling earth scientists to study individual earthquakes and seismic events that occurred before the digital era, the expanded catalog will shed new light on seismological trends.

Furthermore, using DigitSeis to digitize records from other stations around the world, especially in regions with incomplete earthquake catalogs, may have an immediate practical application by improving seismic risk assessment, thus ensuring that building codes are based upon accurate data.

Figure 1. A 1938 analog seismogram in the Harvard-Adam Dziewoński Observatory collection.

Scanning Seismograms and Preparing Images

Digitizing a seismogram is a multistep process involving both manual and automated steps. The first step is cleaning and scanning the original analog seismogram to create a high-resolution digital image. A typical seismogram in the HRV collection is about 14 inches by 36 inches, resulting in a JPG digital image file in the tens of megabytes.

To make large image files easier to work with, DigitSeis reduces the images from 24-bit color to 8-bit grayscale, which gives sufficient precision while enabling efficient processing. Then, using histogram correction algorithms developed in MATLAB, DigitSeis removes artifacts in the data that arose from factors such as exposure, long-term storage, and scanning procedures (Figure 2).
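
The color reduction and contrast correction described above can be sketched in a few lines. The Python below is illustrative only and is not the DigitSeis MATLAB code: the BT.601 luma weights, the percentile-based linear stretch (standing in for the histogram correction, whose details are not given here), and the function names to_gray8 and stretch_contrast are all assumptions made for this example.

```python
def to_gray8(rgb_rows):
    """Convert rows of (R, G, B) 8-bit tuples to 8-bit grayscale values."""
    # ITU-R BT.601 luma weights: an assumed, commonly used RGB-to-gray mapping.
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb_rows]


def stretch_contrast(gray_rows, low_pct=1.0, high_pct=99.0):
    """Linearly rescale so the chosen percentiles map to 0 and 255."""
    values = sorted(v for row in gray_rows for v in row)
    lo = values[int((len(values) - 1) * low_pct / 100.0)]
    hi = values[int((len(values) - 1) * high_pct / 100.0)]
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in gray_rows]
    scale = 255.0 / (hi - lo)
    # Clamp so values outside the percentile range stay within 0..255.
    return [[min(255, max(0, int(round((v - lo) * scale)))) for v in row]
            for row in gray_rows]
```

Clipping at percentiles rather than at the absolute minimum and maximum keeps a handful of stray bright or dark pixels from dictating the stretch.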

Figure 2. An original seismogram image (top left) that was enhanced via histogram correction to produce an image with improved contrast (bottom left). Histograms of intensity values for each image are shown on the right.

While our goal was to automate the digitization as much as possible, users can modify the images and files before or after the automatic processing. For example, after DigitSeis performs the contrast enhancement, the user can crop the image, remove background noise, fine-tune contrast settings, and adjust the orientation of the image. At this stage, the user can also remove unwanted artifacts, such as handwritten notes or stains from the original paper. Using the “remove region” tool in DigitSeis, which is based on the roipoly() function in Image Processing Toolbox™, the user can select a region of the image to exclude from the digitization process (Figure 3).
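
The idea behind a polygon-based region removal can be illustrated with a small Python sketch. It is not the DigitSeis tool or the roipoly() implementation: the ray-casting test, the function names, and the choice of 0 as the background value are assumptions for illustration. Vertices are given as (x, y) pairs, with x the column and y the row.

```python
def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does a horizontal ray from (x, y) toward +x cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def remove_region(image, vertices, background=0):
    """Return a copy of image with pixels inside the polygon set to background."""
    cleaned = [row[:] for row in image]
    for r, row in enumerate(cleaned):
        for c in range(len(row)):
            # Test the pixel centre against the polygon.
            if point_in_polygon(c + 0.5, r + 0.5, vertices):
                row[c] = background
    return cleaned
```

Working on a copy leaves the scanned image untouched, so a mistaken selection can simply be discarded.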

Figure 3. Top: A section of a seismogram showing traces with the hour note (17 and 18). The first timing note is selected (middle) and then removed (bottom).

Identifying Traces and Time Marks

The next step is to classify objects in the preprocessed image into three categories:

Seismic traces. Seismic traces record ground movement and are the main features of a seismogram.

Time mark offsets. Each trace on a seismogram is interrupted once a minute by a time mark that is offset from the main trace. These offsets help scientists determine the accurate timing of events recorded on the seismogram.

Noise. This category includes any objects that should not be digitized, such as stains and notes that were not manually removed.

DigitSeis uses MATLAB object identification algorithms to locate and then highlight traces, time marks, and noise in white, green, and red, respectively (Figure 4). A colorblind-friendly scheme is also available.
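
One simple way to picture this kind of classification is to group foreground pixels into connected objects and then label each object by its shape. The Python sketch below does that on an already binarized image. It is a toy stand-in, not the DigitSeis algorithm: the 8-connectivity, the width-only rule, and the thresholds trace_min_width and mark_min_width are hypothetical choices made for this example.

```python
from collections import deque


def label_objects(binary):
    """Group foreground pixels (value 1) into 8-connected objects."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    objects = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), []
            while queue:  # breadth-first flood fill from the seed pixel
                pr, pc = queue.popleft()
                pixels.append((pr, pc))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = pr + dr, pc + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and binary[nr][nc] == 1 and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            objects.append(pixels)
    return objects


def classify(pixels, trace_min_width=8, mark_min_width=3):
    """Label an object by horizontal extent (hypothetical thresholds)."""
    columns = [c for _, c in pixels]
    width = max(columns) - min(columns) + 1
    if width >= trace_min_width:
        return "trace"
    if width >= mark_min_width:
        return "time mark"
    return "noise"
```

A practical classifier would likely need more than width alone, for example object height, position relative to neighboring traces, or pixel count.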

Figure 4. A seismogram in which objects have been classified as traces (white), time marks (green), and noise (red).

At this stage, DigitSeis also invokes algorithms that we developed in MATLAB to quantify the image’s horizontal and vertical distortion. This distortion is corrected later in the digitization process to reduce inaccuracies in waveform timing.

Digitizing the Seismogram

The digitization algorithm uses intensity information to compute a single digital value for every point in each trace of the seismogram. DigitSeis then displays the results.
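
To make the idea of "one value per point from intensity information" concrete, the sketch below reduces each image column to an intensity-weighted mean row position. This is one plausible reading, not the documented DigitSeis algorithm: the centroid rule, the function name digitize_trace, and the assumption that the trace is bright on a dark background and confined to a known band of rows are all illustrative.

```python
def digitize_trace(image, row_start, row_end):
    """Return one amplitude (fractional row position) per image column.

    Each value is the intensity-weighted mean row index within the band
    [row_start, row_end); columns with no signal yield None.
    """
    samples = []
    for c in range(len(image[0])):
        total = sum(image[r][c] for r in range(row_start, row_end))
        if total == 0:
            samples.append(None)  # gap: no trace pixels in this column
            continue
        weighted = sum(r * image[r][c] for r in range(row_start, row_end))
        samples.append(weighted / total)
    return samples
```

Because the result is a weighted mean, the amplitude can fall between pixel rows, which gives sub-pixel resolution when a trace is several pixels thick.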

Although the digitization is automated, manual refinements are occasionally needed. For example, significant earthquakes can cause the traces to cross one another, making it difficult to distinguish the two signals algorithmically. For these cases, DigitSeis supports manual separation of the signals.

Next, DigitSeis corrects the time mark offsets, using the MATLAB fminbnd() function to realign each time mark with its trace and so produce a continuous waveform (Figure 5).
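
fminbnd() finds the minimum of a function of one variable on a bounded interval. The Python sketch below shows how such a bounded search could realign an offset segment. It is a hedged illustration rather than the DigitSeis code: it uses plain golden-section search, and the objective (squared jumps at the two ends of the segment), the max_shift bound, and the function names are assumptions.

```python
import math


def golden_section_min(f, lower, upper, tol=1e-6):
    """Minimize a unimodal scalar function f on [lower, upper]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lower, upper
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:  # minimum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:        # minimum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


def realign_time_mark(before, mark, after, max_shift=50.0):
    """Shift a time-mark segment vertically so it joins its neighbours."""
    def mismatch(shift):
        # Assumed objective: squared jumps at both ends of the shifted segment.
        return ((mark[0] + shift - before[-1]) ** 2
                + (mark[-1] + shift - after[0]) ** 2)

    shift = golden_section_min(mismatch, -max_shift, max_shift)
    return [v + shift for v in mark], shift
```

For example, with before = [10.0, 10.5], mark = [4.5, 4.7, 4.4], and after = [10.3, 10.0], the search settles on a shift of about 5.95, the value that balances the jump at each end.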

Figure 5. Results of digitization overlaid on original image. Note that the time marks have been successfully combined with the main trace to provide a continuous time series.

This part of the process is easily parallelized. We have created a version of DigitSeis that uses Parallel Computing Toolbox™ to process multiple traces simultaneously on multicore processors.

Following the digitization process, DigitSeis saves the time series data to a .MAT file or Seismic Analysis Code (SAC) data files.

Using DigitSeis to Digitize the HRV Collection

Our initial work with the HRV archive is focusing on seismically active dates. For example, several large earthquakes were recorded at HRV from November 13th through November 15th, 1938 (Figure 6). These include a magnitude 6.9 earthquake in the Kuril Islands region (number 1), a magnitude 7.0 event in Japan (number 2), and an aftershock of the latter (number 3).

Figure 6. Digitized seismogram (left) and associated spectrogram (right) for November 13th through November 15th, 1938. Numbered dashed horizontal lines and arrows in the spectrogram indicate the arrival of surface waves at HRV from major seismic events around the world.

After digitizing this seismogram in DigitSeis, we generated a spectrogram using the resulting time series data. The spectrogram revealed additional earthquakes that were hardly discernible on the raw seismogram. The spectrogram also revealed distinctive noise levels (probably due to storms in the area on November 14th) with peaks at about 0.14 and 0.25 Hz. The frequencies of these peaks are consistent with those of noise recorded by modern instruments at the same location in 2014. This finding illustrates another potential use of old analog seismograms: understanding how storm activity has changed over time.
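
For readers unfamiliar with spectrograms, the sketch below computes one from a time series by taking the power spectrum of successive tapered windows. It is a minimal illustration, not the analysis used for Figure 6: the Hann taper, the direct (slow) discrete Fourier transform, and the parameter names window and step are assumptions chosen to keep the example self-contained.

```python
import cmath
import math


def spectrogram(samples, sample_rate, window, step):
    """Return (frequencies, frames); frames[i][k] is power at frequencies[k]."""
    half = window // 2 + 1
    # Hann taper reduces spectral leakage between neighbouring bins.
    taper = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / (window - 1))
             for n in range(window)]
    freqs = [k * sample_rate / window for k in range(half)]
    frames = []
    for start in range(0, len(samples) - window + 1, step):
        seg = [samples[start + n] * taper[n] for n in range(window)]
        frame = []
        for k in range(half):
            # Direct DFT: O(window^2), fine for a demo but slow for real data.
            coeff = sum(seg[n] * cmath.exp(-2j * math.pi * k * n / window)
                        for n in range(window))
            frame.append(abs(coeff) ** 2)
        frames.append(frame)
    return freqs, frames
```

Fed a synthetic signal sampled at 1 Hz that contains sinusoids at 0.14 Hz and 0.25 Hz, with a 100-sample window, each frame shows its strongest peak in the 0.14 Hz bin. A production analysis would use a fast Fourier transform instead of the direct sum.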

Next Steps

As we continue to process seismograms from the HRV archive, we are learning more about which steps in the digitization process can be simplified through improved automation. Once we have digitized a significant portion of the archive, we plan to make the results available either on the Harvard Seismology Group website or in the Incorporated Research Institutions for Seismology (IRIS) database.

We have made DigitSeis publicly available as open-source MATLAB code. Other observatories have already expressed interest in using the software to digitize their own seismogram archives.

Acknowledgments

The following people have been involved in testing DigitSeis and in the digitization of the Harvard collection: Hiromi Ishii, Isabella Lorrainy Altoé, Alexandra Karamitrou, Thomas Lee, George Liu, and Victor Salles. I would also like to acknowledge that this project was supported by the U.S. Geological Survey Earthquake Hazards Program, Award Nos. G14AP00016 and G16AP00021.

About the Author

Petros Bogiatzis is a research associate in the Harvard Seismology Group at Harvard University. In addition to the digitization of analog seismograms, his primary research focus is seismic tomography. He holds a Ph.D. in Geophysics from Aristotle University of Thessaloniki, Greece.

Published 2016 - 93048v00
