Description
EQcorrscan Version: 0.5.0
ObsPy Version: 1.4.1
NumPy Version: 1.26.4
I'm working on applying the FMF matched filter to a swarm of about 100 events over a month of continuous data. My goal is to produce a .dt.cc file of correlations and an event catalog with magnitudes. To do this, I need to process the data and pass the whole processed stream through several steps: lag calculation, relative magnitude estimation, and writing correlations.
Right now, I load the entire stream into memory to get snippets for each detection using detection.extract_stream(). This approach is faster but requires a lot of RAM. I know I could load only the necessary data for each detection, which would save memory, but that would involve a lot of iterations and potentially redundant data loads, making it time-consuming. Unfortunately, I can't use that strategy for lag_calc, since it needs continuous data rather than snippets.
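For reference, a simplified version of what I do now looks roughly like this (file paths, window lengths, and thresholds are placeholders):

from obspy import read
from eqcorrscan import Party

# Read the detections and the whole month of continuous data up front
party = Party().read("swarm_party.tgz")  # placeholder filename
st = read("continuous/*.mseed").merge(fill_value=0)  # this is the memory hog

# lag_calc needs the continuous stream, not snippets
new_catalog = party.lag_calc(st, pre_processed=False, shift_len=0.5, min_cc=0.4)

# Snippets for relative magnitudes and write_correlations are cut from the same stream
stream_dict = {
    detection.id: detection.extract_stream(st, length=20.0, prepick=2.0)
    for family in party for detection in family
}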
I'm looking for a more efficient way to handle this processing, ideally one that mirrors the 'client_detect' processing. Is there a way to pass the waveform in smaller segments to ease the memory pressure? Alternatively, a solution could be something like detection.extract_stream() that works with a WaveBank or similar.
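To be concrete, the kind of helper I have in mind would look something like this (purely a sketch: extract_from_bank is my own hypothetical naming, and I'm assuming an obsplus WaveBank indexed over the continuous archive):

from obspy import Stream
from obsplus import WaveBank

bank = WaveBank("continuous/")  # indexed archive of the month of data

def extract_from_bank(detection, length, prepick):
    # Hypothetical: pull only the windows needed for one detection from disk
    st = Stream()
    for pick in detection.event.picks:
        wid = pick.waveform_id
        st += bank.get_waveforms(
            network=wid.network_code, station=wid.station_code,
            location=wid.location_code, channel=wid.channel_code,
            starttime=pick.time - prepick,
            endtime=pick.time - prepick + length)
    # the snippets would still need the same filtering/resampling as the templates
    return st

stream_dict = {
    detection.id: extract_from_bank(detection, length=20.0, prepick=2.0)
    for family in party for detection in family
}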
I’m also hitting memory issues when running write_correlations. This function is pretty memory-intensive, and I’ve crashed the process more often than I’d like while trying to run it.
The call looks something like this:
write_correlations(new_catalog, stream_dict, dt_length, dt_prepick, shift_len,
event_id_mapper=id_mapper, lowcut=None, highcut=None,
max_sep=8, min_link=6, min_cc=0.6, interpolate=False,
all_horiz=False, max_workers=None, parallel_process=False, weight_by_square=True)
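For scale, stream_dict holds a processed snippet per event for all ~100 events. One thing I've considered to reduce the footprint is trimming each snippet down to just the windows write_correlations can actually use before calling it, along these lines (sketch using my variable names, assuming stream_dict is keyed by event resource id):

from obspy import Stream

for event in new_catalog:
    key = event.resource_id.id
    full = stream_dict[key]
    trimmed = Stream()
    for pick in event.picks:
        # keep the pick window (dt_prepick before, dt_length long) plus the shift allowance
        trimmed += full.slice(pick.time - dt_prepick - shift_len,
                              pick.time - dt_prepick + dt_length + shift_len)
    stream_dict[key] = trimmed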
Any tips on how to make this function run more smoothly or alternative processing strategies would be super helpful!
What is your setup?
Operating System: Rocky Linux release 8.10 (Green Obsidian)
Python Version: 3.11.8
EQcorrscan Version: 0.5.0