Description
EQcorrscan Version: 0.5.0
ObsPy Version: 1.4.1
NumPy Version: 1.26.4
I'm working on applying the FMF matched filter to a swarm of about 100 events over a month of continuous data. My goal is to produce a .dt.cc file of correlations and an event catalog with magnitudes. To do this, I need to process the data and pass the whole processed stream through several steps: lag calculation, relative magnitude estimation, and writing correlations.
Right now, I load the entire stream into memory to get snippets for each detection using detection.extract_stream(). This approach is faster but requires a lot of RAM. I know I could load only the necessary data for each detection, which would save memory, but that would involve a lot of iterations and potentially redundant data loads, making it time-consuming. Unfortunately, I can't use that strategy for lag_calc, since it needs continuous data rather than snippets.
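For reference, a simplified version of what I do now looks roughly like this (file paths, window lengths, and thresholds are placeholders):

from obspy import read
from eqcorrscan import Party

# Read the detections and the whole month of continuous data up front
party = Party().read("swarm_party.tgz")  # placeholder filename
st = read("continuous/*.mseed").merge(fill_value=0)  # this is the memory hog

# lag_calc needs the continuous stream, not snippets
new_catalog = party.lag_calc(st, pre_processed=False, shift_len=0.5, min_cc=0.4)

# Snippets for relative magnitudes and write_correlations are cut from the same stream
stream_dict = {
    detection.id: detection.extract_stream(st, length=20.0, prepick=2.0)
    for family in party for detection in family
}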
I'm looking for a more efficient way to handle this processing, ideally one that mirrors the 'client_detect' processing. Is there a way to pass the waveform in smaller segments to ease the memory pressure? Alternatively, a solution could be something like detection.extract_stream() that works with a WaveBank or similar.
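To be concrete, the kind of helper I have in mind would look something like this (purely a sketch: extract_from_bank is my own hypothetical naming, and I'm assuming an obsplus WaveBank indexed over the continuous archive):

from obspy import Stream
from obsplus import WaveBank

bank = WaveBank("continuous/")  # indexed archive of the month of data

def extract_from_bank(detection, length, prepick):
    # Hypothetical: pull only the windows needed for one detection from disk
    st = Stream()
    for pick in detection.event.picks:
        wid = pick.waveform_id
        st += bank.get_waveforms(
            network=wid.network_code, station=wid.station_code,
            location=wid.location_code, channel=wid.channel_code,
            starttime=pick.time - prepick,
            endtime=pick.time - prepick + length)
    # the snippets would still need the same filtering/resampling as the templates
    return st

stream_dict = {
    detection.id: extract_from_bank(detection, length=20.0, prepick=2.0)
    for family in party for detection in family
}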
I’m also hitting memory issues when running write_correlations. This function is pretty memory-intensive, and I’ve crashed the process more often than I’d like while trying to run it.
The call looks something like this:
write_correlations(new_catalog, stream_dict, dt_length, dt_prepick, shift_len,
event_id_mapper=id_mapper, lowcut=None, highcut=None,
max_sep=8, min_link=6, min_cc=0.6, interpolate=False,
all_horiz=False, max_workers=None, parallel_process=False, weight_by_square=True)
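For scale, stream_dict holds a processed snippet per event for all ~100 events. One thing I've considered to reduce the footprint is trimming each snippet down to just the windows write_correlations can actually use before calling it, along these lines (sketch using my variable names, assuming stream_dict is keyed by event resource id):

from obspy import Stream

for event in new_catalog:
    key = event.resource_id.id
    full = stream_dict[key]
    trimmed = Stream()
    for pick in event.picks:
        # keep the pick window (dt_prepick before, dt_length long) plus the shift allowance
        trimmed += full.slice(pick.time - dt_prepick - shift_len,
                              pick.time - dt_prepick + dt_length + shift_len)
    stream_dict[key] = trimmed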
Any tips on how to make this function run more smoothly or alternative processing strategies would be super helpful!
What is your setup?
Operating System: Rocky Linux release 8.10 (Green Obsidian)
Python Version: 3.11.8
EQcorrscan Version: 0.5.0