Region exclusion
A region is a known area of the (m/z, 1/K0) plane that you want to drop wholesale, typically based on physical knowledge of the acquisition rather than on a noise estimate. The canonical example is the singly-charged / polymer contamination band in timsTOF MS1.
Region exclusion is conceptually distinct from noise filtering: region exclusion answers "which part of the data plane are we even interested in?", while noise filtering answers "of what's left, what's real signal?".
from tdfpy import ChargeStateRegion, get_raw_peaks

# Drop the typical singly-charged region
peaks = get_raw_peaks(td, frame_id, exclude=ChargeStateRegion())

# Custom line + cap
peaks = get_raw_peaks(
    td, frame_id,
    exclude=ChargeStateRegion(
        line=((400.0, 0.75), (1200.0, 1.5)),
        cap_at_upper_endpoint=True,
    ),
)
The line is converted to a per-scan TOF-index cutoff once per frame, so exclusion happens via a single vectorized integer comparison.
tdfpy.ChargeStateRegion
dataclass
ChargeStateRegion(
    line: tuple[
        tuple[float, float], tuple[float, float]
    ] = ((350.0, 0.7), (1200.0, 1.4)),
    cap_at_upper_endpoint: bool = True,
)
Drop peaks above a line in (m/z, 1/K0) space, capped at the line's upper endpoint.
The line is defined by two (m/z, 1/K0) points. A peak is in the
region (and therefore dropped) iff
1/K0 > line(m/z) OR (if cap_at_upper_endpoint)
1/K0 > max(point_1[1], point_2[1]).
The default endpoints target the singly-charged region in typical timsTOF MS1 data.
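The membership rule above can be sketched in plain NumPy. This is an illustration only, not the library's implementation: the helper name in_region is hypothetical, and it assumes the line is extended linearly beyond its two endpoints, which the description does not state.

```python
import numpy as np

def in_region(mz, inv_k0, line=((350.0, 0.7), (1200.0, 1.4)),
              cap_at_upper_endpoint=True):
    """Hypothetical helper: True where a peak falls in the excluded region."""
    (mz1, k1), (mz2, k2) = line
    mz = np.asarray(mz, dtype=float)
    inv_k0 = np.asarray(inv_k0, dtype=float)
    # 1/K0 value of the line at each peak's m/z
    # (assumption: the line extends linearly past both endpoints).
    slope = (k2 - k1) / (mz2 - mz1)
    line_at_mz = k1 + slope * (mz - mz1)
    inside = inv_k0 > line_at_mz
    if cap_at_upper_endpoint:
        # Anything above the higher endpoint's 1/K0 is also in the region.
        inside = inside | (inv_k0 > max(k1, k2))
    return inside

# Three peaks: above the line, below the line, below the line but above the cap.
mz = np.array([500.0, 500.0, 1500.0])
inv_k0 = np.array([1.0, 0.7, 1.45])
capped = in_region(mz, inv_k0)
uncapped = in_region(mz, inv_k0, cap_at_upper_endpoint=False)
```

With the default endpoints, the third peak is dropped only because of the cap: at m/z 1500 the extended line sits above 1.45, but 1.45 exceeds the upper endpoint's 1/K0 of 1.4.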
index_cutoff_per_scan
index_cutoff_per_scan(
    td: TimsData, frame_id: int, num_scans: int
) -> np.ndarray
Per-scan TOF-index cutoff implementing this region exclusion.
For each scan, peaks with mz_indices < cutoff[scan] lie above the line
(in the region) and should be dropped. Scans whose 1/K0 is above
the cap get a cutoff of +inf, so all their peaks are dropped.
Performing the comparison in integer-index space is much cheaper than converting every peak to m/z and 1/K0.
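A minimal sketch of the idea, under stated assumptions: toy_mz_to_index is a made-up monotonic stand-in for the instrument's m/z to TOF-index calibration, sketch_cutoffs is a hypothetical helper rather than the method above, the line is assumed to have a positive slope, and the largest int64 stands in for the +inf cutoff.

```python
import numpy as np

def toy_mz_to_index(mz):
    # Made-up monotonic calibration; the real one comes from the TDF data.
    return np.sqrt(np.maximum(mz, 0.0)) * 1000.0

def sketch_cutoffs(scan_inv_k0, line=((350.0, 0.7), (1200.0, 1.4)), cap=True):
    """Hypothetical helper: one integer index cutoff per scan."""
    (mz1, k1), (mz2, k2) = line
    slope = (k2 - k1) / (mz2 - mz1)  # assumed positive
    # Invert the line: the m/z at which it reaches each scan's 1/K0.
    mz_on_line = mz1 + (scan_inv_k0 - k1) / slope
    # Round up so that "index < cutoff" matches "m/z below the line's m/z".
    cutoff = np.ceil(toy_mz_to_index(mz_on_line)).astype(np.int64)
    if cap:
        # Scans above the cap lose every peak (stand-in for +inf).
        cutoff[scan_inv_k0 > max(k1, k2)] = np.iinfo(np.int64).max
    return cutoff

# Three scans (1/K0 per scan) and five peaks given as (scan, TOF index).
scan_inv_k0 = np.array([1.5, 1.05, 0.7])
cutoffs = sketch_cutoffs(scan_inv_k0)
peak_scan = np.array([0, 1, 1, 2, 2])
peak_index = np.array([50000, 20000, 30000, 18000, 19000])
# One vectorized integer comparison decides which peaks survive.
keep = peak_index >= cutoffs[peak_scan]
```

The cutoffs are computed once per scan, so the per-peak work reduces to a lookup and a comparison; no peak is converted to m/z or 1/K0.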
Source code in src/tdfpy/regions.py