Spectrum and MsnSpectrum

SpectrumType

class SpectrumType(StrEnum):
    CENTROID = "centroid"
    PROFILE  = "profile"
    DECONVOLUTED = "deconvoluted"

SpectrumType records the processing stage of the data. Several methods check or set this flag to prevent out-of-order operations (e.g., calling .decharge() on a non-deconvoluted spectrum raises ValueError).

Peak

@dataclass(frozen=True, slots=True)
class Peak:
    mz: float
    intensity: float
    charge: int | None = None
    im: float | None = None
    score: float | None = None

A frozen dataclass representing a single detected peak. charge, im, and score are optional. Peak objects are returned by .peaks, .top_peaks(), .get_peak(), and .get_peaks(). They are read-only snapshots of the data, not references into the underlying arrays.

score holds the isotopic profile score (0–1) assigned during deconvolution, or None for peaks that have not been through deconvolution.

>>> peak = Peak(mz=500.1, intensity=1e5, charge=2)
>>> repr(peak)
'Peak(mz=500.1000, int=1.00e+05, z=2)'

Charge conventions

`charge` value	Meaning
`> 0`	Peak belongs to an assigned isotope cluster with that charge state
`-1`	Singleton — no isotope neighbours found at any tested charge
`0`	After `.decharge()` — neutral mass, charge state no longer tracked
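The conventions in the table translate directly into boolean masks. The following is a minimal sketch on a made-up charge array (the values are illustrative, not library output):

```python
import numpy as np

# Hypothetical charge array, as it might look after deconvolution.
charge = np.array([2, -1, 3, -1, 1], dtype=np.int32)

assigned = charge > 0       # peaks assigned to an isotope cluster
singletons = charge == -1   # peaks with no isotope neighbours
decharged = charge == 0     # only expected after .decharge()

n_assigned = int(assigned.sum())
n_singletons = int(singletons.sum())
print(n_assigned, n_singletons, bool(decharged.any()))
```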

Spectrum

@dataclass(slots=True)
class Spectrum:
    mz: NDArray[np.float64]
    intensity: NDArray[np.float64]
    charge: NDArray[np.int32] | None = None
    im: NDArray[np.float64] | None = None
    score: NDArray[np.float64] | None = None
    spectrum_type: SpectrumType | str | None = None
    denoised: str | None = None
    normalized: str | None = None

The central data structure. mz and intensity must have the same length. charge, im, and score must also match that length when provided.

Fields:

Field	Type	Description
`mz`	`NDArray[np.float64]`	Peak m/z values, sorted ascending
`intensity`	`NDArray[np.float64]`	Parallel peak intensities
`charge`	`NDArray[np.int32] | None`	Charge state per peak. `None` before deconvolution
`im`	`NDArray[np.float64] | None`	Ion mobility per peak. `None` if not acquired
`score`	`NDArray[np.float64] | None`	Per-peak isotopic profile score (0–1). Populated after `deconvolute()`; `None` otherwise. Singletons have `score=0.0`.
`spectrum_type`	`SpectrumType | str | None`	Stage tag: `CENTROID`, `PROFILE`, or `DECONVOLUTED`
`denoised`	`str | None`	Name of the denoising method applied, or `None`
`normalized`	`str | None`	Name of the normalization method applied, or `None`

Validation rules enforced in __post_init__:

len(charge) == len(mz) when charge is not None
len(im) == len(mz) when im is not None
len(score) == len(mz) when score is not None
A charge array may only be present when spectrum_type == DECONVOLUTED

import numpy as np
from spxtacular import Spectrum

spec = Spectrum(
    mz=np.array([500.1, 800.2, 1200.5], dtype=np.float64),
    intensity=np.array([1e5, 2e5, 9e4], dtype=np.float64),
)
print(spec)
# Spectrum(n_peaks=3, type=None, denoised=None, normalized=None)
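The length checks listed under the validation rules can be pictured with a small stand-in class. This is only a sketch: MiniSpectrum is a hypothetical name, the error messages are invented, and the real __post_init__ may check more (for example the spectrum_type rule).

```python
from dataclasses import dataclass
from typing import Optional

import numpy as np


@dataclass
class MiniSpectrum:
    """Hypothetical stand-in for Spectrum; length checks only."""

    mz: np.ndarray
    intensity: np.ndarray
    charge: Optional[np.ndarray] = None

    def __post_init__(self) -> None:
        # mz and intensity are parallel arrays
        if len(self.mz) != len(self.intensity):
            raise ValueError("mz and intensity must have the same length")
        # optional arrays must match mz when provided
        if self.charge is not None and len(self.charge) != len(self.mz):
            raise ValueError("charge must match the length of mz")


try:
    MiniSpectrum(mz=np.array([500.1, 800.2]), intensity=np.array([1e5]))
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```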

Peak access

`peaks` property

@property
def peaks(self) -> list[Peak]

Returns all peaks as a list of Peak objects. Iterates the full spectrum; prefer numpy operations on .mz / .intensity for performance on large spectra.

for peak in spec.peaks:
    print(peak.mz, peak.intensity)

`top_peaks`

def top_peaks(
    self,
    n: int,
    by: Literal["intensity", "mz", "charge", "im"] = "intensity",
    reverse: bool = True,
) -> list[Peak]

Returns the top n peaks sorted by the chosen attribute.

Parameter	Description
`n`	Number of peaks to return
`by`	Sort key: `"intensity"` (default), `"mz"`, `"charge"`, `"im"`
`reverse`	`True` (default) returns highest values first

"charge" requires a charge array to be present; "im" requires an ion mobility array. Both raise ValueError otherwise.

# Five most intense peaks
top5 = spec.top_peaks(5)

# Lowest-mz three peaks
low_mz = spec.top_peaks(3, by="mz", reverse=False)

Peak finding

`has_peak`

def has_peak(
    self,
    target_mz: float,
    mz_tol: float = 0.01,
    mz_tol_type: Literal["Da", "ppm"] = "Da",
    target_charge: int | None = None,
    target_im: float | None = None,
    im_tol: float = 0.01,
) -> bool

Returns True if at least one peak matches all supplied criteria.

spec.has_peak(500.1, mz_tol=0.02)
spec.has_peak(500.1, mz_tol=10, mz_tol_type="ppm", target_charge=2)
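A ppm tolerance scales with the target m/z, whereas a Da tolerance is fixed. The sketch below shows the usual conversion on plain numpy arrays; tolerance_in_da is a hypothetical helper, and whether the library treats the window edge as inclusive is an assumption.

```python
import numpy as np


def tolerance_in_da(target_mz: float, mz_tol: float, mz_tol_type: str) -> float:
    """Convert a tolerance to Da; a ppm value scales with the target m/z."""
    if mz_tol_type == "ppm":
        return target_mz * mz_tol / 1e6
    return mz_tol


mz = np.array([500.1, 800.2, 1200.5])

window = tolerance_in_da(800.2, 10, "ppm")          # roughly 0.008 Da
found = bool(np.any(np.abs(mz - 800.2) <= window))  # inclusive edge assumed
missed = bool(np.any(np.abs(mz - 800.21) <= window))
print(window, found, missed)
```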

`get_peak`

def get_peak(
    self,
    target_mz: float,
    mz_tol: float = 0.01,
    mz_tol_type: Literal["Da", "ppm"] = "Da",
    target_charge: int | None = None,
    target_im: float | None = None,
    im_tol: float = 0.01,
    collision: Literal["largest", "closest"] = "largest",
) -> Peak | None

Returns a single matching peak, or None if no match is found. When multiple peaks fall within tolerance, collision="largest" picks the most intense; collision="closest" picks the nearest in m/z.

peak = spec.get_peak(800.2, mz_tol=5, mz_tol_type="ppm")
if peak:
    print(f"Found: {peak}")
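The two collision modes can give different answers when several peaks fall inside the window. A minimal numpy sketch with made-up values, not the library's internal code:

```python
import numpy as np

mz = np.array([800.195, 800.201, 800.208])
intensity = np.array([9e4, 2e4, 5e4])
target_mz, mz_tol = 800.2, 0.01

# Indices of all peaks inside the tolerance window
candidates = np.flatnonzero(np.abs(mz - target_mz) <= mz_tol)

# collision="largest": the most intense candidate
largest = int(candidates[np.argmax(intensity[candidates])])

# collision="closest": the candidate with the smallest m/z error
closest = int(candidates[np.argmin(np.abs(mz[candidates] - target_mz))])

print(mz[largest], mz[closest])
```

Here "largest" selects the peak at 800.195 and "closest" selects the one at 800.201.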

`get_peaks`

def get_peaks(
    self,
    target_mz: float,
    mz_tol: float = 0.01,
    mz_tol_type: Literal["Da", "ppm"] = "Da",
    target_charge: int | None = None,
    target_im: float | None = None,
    im_tol: float = 0.01,
) -> list[Peak]

Returns all peaks matching the criteria (may be empty).

Filtering and processing

All processing methods accept inplace: bool = False. When inplace=False (the default) a new Spectrum is returned, leaving the original unchanged and allowing method chaining.

`filter`

def filter(
    self,
    min_mz: float | None = None,
    max_mz: float | None = None,
    min_intensity: float | None = None,
    max_intensity: float | None = None,
    min_charge: int | None = None,
    max_charge: int | None = None,
    min_im: float | None = None,
    max_im: float | None = None,
    top_n: int | None = None,
    inplace: bool = False,
) -> Self

Removes peaks outside the given bounds. All parameters are optional and combinable. top_n is applied last — after all range filters — keeping the top_n most intense survivors.

Charge, ion mobility, and score filters are silently ignored if the spectrum lacks those arrays.

Score filter parameters (accepted by filter, although the signature above omits them):

Parameter	Type	Description
`min_score`	`float | None`	Keep peaks with score >= this value. Only effective when `score` array is present.
`max_score`	`float | None`	Keep peaks with score <= this value. Only effective when `score` array is present.

# Keep peaks between 200 and 1500 Da with intensity >= 1000
filtered = spec.filter(min_mz=200, max_mz=1500, min_intensity=1000)

# Keep only the 50 most intense peaks after m/z filtering
filtered = spec.filter(min_mz=200, top_n=50)
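The ordering matters: a very intense peak outside the m/z range never competes for a top_n slot. The sketch below reproduces that order with plain numpy, assuming inclusive bounds and an output re-sorted by ascending m/z (both are assumptions about the library's behaviour).

```python
import numpy as np

mz = np.array([150.0, 300.0, 450.0, 900.0, 1600.0])
intensity = np.array([9e5, 2e3, 8e4, 5e4, 7e5])

# Step 1: range filters (here min_mz=200, max_mz=1500)
keep = (mz >= 200) & (mz <= 1500)
mz_f, int_f = mz[keep], intensity[keep]

# Step 2: top_n applied to the survivors only
top_n = 2
strongest = np.argsort(int_f)[::-1][:top_n]
order = np.sort(strongest)  # restore ascending m/z order
mz_out, int_out = mz_f[order], int_f[order]
print(mz_out, int_out)
```

The two most intense peaks overall (150.0 and 1600.0) are removed by the range filter before top_n is evaluated.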

`normalize`

def normalize(
    self,
    method: Literal["max", "tic", "median"] = "max",
    inplace: bool = False,
) -> Self

Scales all intensities so that the chosen reference equals 1.0.

`method`	Normalization factor
`"max"` (default)	Most intense peak
`"tic"`	Total ion current (sum of all intensities)
`"median"`	Median intensity

Calling normalize on an already-normalized spectrum emits a UserWarning and returns self unchanged.

norm = spec.normalize()            # max normalization
norm = spec.normalize("tic")       # TIC normalization
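Each method divides every intensity by a single reference value. A minimal numpy sketch of the three factors, using the intensities from the earlier example:

```python
import numpy as np

intensity = np.array([1e5, 2e5, 9e4])

factors = {
    "max": intensity.max(),          # most intense peak
    "tic": intensity.sum(),          # total ion current
    "median": np.median(intensity),  # median intensity
}
normalized = {name: intensity / factor for name, factor in factors.items()}

for name, values in normalized.items():
    print(name, values)
```

After "max" the largest value is 1.0, after "tic" the values sum to 1.0, and after "median" the median is 1.0.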

`denoise`

def denoise(
    self,
    method: Literal["mad", "percentile", "histogram", "baseline", "iterative_median"]
            | float | int = "mad",
    inplace: bool = False,
) -> Self

Removes peaks below an estimated noise threshold. Peaks at or above the threshold are kept.

`method`	Threshold strategy
`"mad"` (default)	`median + 3 × 1.4826 × MAD`
`"percentile"`	5th percentile of intensities
`"histogram"`	Mode of 100-bin histogram + 3 σ (FWHM-derived)
`"baseline"`	Mean + 3 σ of the lowest 25% of intensities
`"iterative_median"`	Iteratively refines median/MAD estimate over 3 passes
`float` or `int`	Used directly as the absolute threshold

Calling denoise on an already-denoised spectrum emits a UserWarning and returns self unchanged.

denoised = spec.denoise()             # MAD (robust, recommended for most spectra)
denoised = spec.denoise("histogram")  # histogram mode estimate
denoised = spec.denoise(5000.0)       # fixed absolute threshold
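The default "mad" threshold can be reproduced by hand. The factor 1.4826 rescales the median absolute deviation so that it estimates the standard deviation of normally distributed noise. A sketch on made-up intensities:

```python
import numpy as np

intensity = np.array([100.0, 120.0, 90.0, 110.0, 105.0, 5000.0, 95.0, 8000.0])

median = np.median(intensity)
mad = np.median(np.abs(intensity - median))  # median absolute deviation
threshold = median + 3 * 1.4826 * mad

# Peaks at or above the threshold are kept
kept = intensity[intensity >= threshold]
print(threshold, kept)
```

Because the median and MAD are robust statistics, the two large peaks barely move the threshold, which is why this method suits spectra with a few dominant signals.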

`centroid`

def centroid(self, inplace: bool = False) -> Self

Converts a profile-mode spectrum to centroid mode using vectorized Gaussian fitting. Detects local maxima, fits a Gaussian to each triplet of points, and returns sub-bin peak positions. Ion mobility data is preserved at the apex value.

Calling this on an already-centroided spectrum emits a UserWarning and returns self unchanged.

centroided = profile_spec.centroid()
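Fitting a Gaussian to three points has a closed form: the logarithm of a Gaussian is a parabola, so the apex follows from the log-intensities of the maximum and its two neighbours. The sketch below shows that standard three-point interpolation; gaussian_centroids is a hypothetical helper that assumes evenly spaced m/z bins and strictly positive intensities, and the library's implementation may differ in detail.

```python
import numpy as np


def gaussian_centroids(mz: np.ndarray, intensity: np.ndarray) -> np.ndarray:
    """Three-point Gaussian interpolation at every local maximum."""
    mid = intensity[1:-1]
    # Interior local maxima, shifted by one to index the full array
    i = np.flatnonzero((mid > intensity[:-2]) & (mid >= intensity[2:])) + 1
    a = np.log(intensity[i - 1])
    b = np.log(intensity[i])
    c = np.log(intensity[i + 1])
    # Vertex of the parabola through the three log-intensities (in bin units)
    offset = 0.5 * (a - c) / (a - 2 * b + c)
    step = mz[i + 1] - mz[i]
    return mz[i] + offset * step


# Synthetic profile peak centred between two sample points
mz = np.linspace(499.9, 500.3, 41)
profile = 1e5 * np.exp(-0.5 * ((mz - 500.1034) / 0.02) ** 2)

centers = gaussian_centroids(mz, profile)
print(centers)
```

For a noise-free Gaussian the interpolation recovers the true centre even though it lies between sampled bins.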

`merge`

def merge(
    self,
    mz_tolerance: float = 0.01,
    mz_tolerance_type: Literal["ppm", "da"] = "da",
    im_tolerance: float = 0.05,
    im_tolerance_type: Literal["relative", "absolute"] = "relative",
    inplace: bool = False,
) -> Self

Merges nearby peaks using a greedy intensity-ordered strategy. Peaks are processed from most to least intense; each unused neighbour within the tolerance window is merged into the current peak. The merged peak carries the intensity-weighted average m/z (and ion mobility if present) and the summed intensity. Charge arrays are preserved — only peaks with matching charge are merged together.

merged = spec.merge(mz_tolerance=0.02, mz_tolerance_type="da")
merged = spec.merge(mz_tolerance=5, mz_tolerance_type="ppm")
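The greedy strategy can be sketched in a few lines. This version handles only m/z with an absolute Da tolerance and ignores ion mobility and charge matching; greedy_merge is a hypothetical helper, not the library's code.

```python
import numpy as np


def greedy_merge(mz: np.ndarray, intensity: np.ndarray, tol_da: float):
    """Merge peaks within tol_da, visiting peaks from most to least intense."""
    used = np.zeros(len(mz), dtype=bool)
    out_mz, out_int = [], []
    for i in np.argsort(intensity)[::-1]:
        if used[i]:
            continue
        # Unused peaks inside the window around the current peak (includes i)
        group = np.flatnonzero(~used & (np.abs(mz - mz[i]) <= tol_da))
        used[group] = True
        total = intensity[group].sum()
        # Intensity-weighted mean m/z, summed intensity
        out_mz.append(float((mz[group] * intensity[group]).sum() / total))
        out_int.append(float(total))
    order = np.argsort(out_mz)
    return np.array(out_mz)[order], np.array(out_int)[order]


mz = np.array([500.10, 500.11, 800.20])
intensity = np.array([1e5, 3e5, 2e5])
merged_mz, merged_int = greedy_merge(mz, intensity, tol_da=0.02)
print(merged_mz, merged_int)
```

The first two peaks collapse into one at about 500.1075 with intensity 4e5, pulled toward the more intense member.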

`deconvolute`

def deconvolute(
    self,
    tolerance: float = 50,
    tolerance_type: Literal["ppm", "da"] = "ppm",
    charge_range: tuple[int, int] = (1, 3),
    intensity: Literal["base", "total"] = "total",
    max_dpeaks: int = 2000,
    inplace: bool = False,
) -> Self

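Assigns each peak to an isotope cluster and records the charge state. Returns a spectrum with spectrum_type=DECONVOLUTED and populated charge and score arrays. The signature above omits min_intensity and min_score; both are described in the parameter table.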

Parameter	Description
`tolerance`	Peak matching tolerance (default 50 ppm)
`tolerance_type`	`"ppm"` (default) or `"da"`
`charge_range`	`(min_charge, max_charge)` inclusive; default `(1, 3)`
`intensity`	`"total"` (default) sums the whole cluster; `"base"` uses only the monoisotopic peak
`max_dpeaks`	Maximum output peaks (default 2000)
`min_intensity`	Absolute intensity floor for isotope detectability (`float | "min"`). The sentinel `"min"` (default) uses the spectrum's own minimum intensity as the S/N floor.
`min_score`	Clusters whose best isotopic profile score falls below this threshold (`float`) are recorded as singletons. Default `0.0` accepts all clusters.

After deconvolution the charge array follows the charge conventions table: > 0 for assigned clusters, -1 for singletons.

See Deconvolution for a detailed walkthrough.

decon = spec.deconvolute(charge_range=(1, 5), tolerance=10, tolerance_type="ppm")

`decharge`

def decharge(self, inplace: bool = False) -> Self

Converts deconvoluted m/z values to neutral monoisotopic masses using neutral_mass = (mz × charge) - (charge × proton_mass). Singletons (charge == -1) are dropped. The resulting charge array is set to all zeros (meaning "charge unknown / neutral mass").

Raises ValueError if the spectrum is not in DECONVOLUTED state.

The score array is propagated through decharge() — each surviving neutral-mass peak retains the score of its charged precursor.

neutral = decon.decharge()
# neutral.mz now contains neutral masses sorted ascending
# neutral.charge is all zeros
# neutral.score carries through from the deconvoluted spectrum
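The conversion formula can be checked by hand on a few values. In the sketch below PROTON_MASS is an assumed constant (the value the library uses may differ in its last digits) and the arrays are made up:

```python
import numpy as np

PROTON_MASS = 1.007276466  # Da; assumed value

mz = np.array([500.1, 650.3, 800.2])
charge = np.array([2, -1, 3], dtype=np.int32)

# Singletons (charge == -1) are dropped
keep = charge != -1
neutral = mz[keep] * charge[keep] - charge[keep] * PROTON_MASS

# Sort ascending and reset the charge array to zeros
order = np.argsort(neutral)
neutral_mass = neutral[order]
neutral_charge = np.zeros(len(neutral_mass), dtype=np.int32)
print(neutral_mass, neutral_charge)
```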

`update`

def update(self, inplace: bool = False, **kwargs) -> Self

Low-level helper to create a new Spectrum with arbitrary fields replaced. Prefer the named methods above for normal use.

retagged = spec.update(spectrum_type="centroid")

Compression

`compress`

def compress(
    self,
    url_safe: bool = False,
    mz_precision: int | None = None,
    intensity_precision: int | None = None,
    im_precision: int | None = None,
    compression: str = "gzip",
) -> str

Serialises the spectrum to a compact ASCII string. m/z values are delta-encoded; intensities and ion mobility use raw float32 hex encoding. The result is compressed with gzip, zlib, or brotli (requires pip install brotli) and then base85-encoded (default) or base64 URL-safe encoded when url_safe=True.

Optional *_precision parameters round the corresponding arrays before encoding, reducing compressed size at the cost of numeric precision.

blob = spec.compress()
blob_url = spec.compress(url_safe=True, mz_precision=4, compression="zlib")
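The general idea (delta-encode, pack to bytes, compress, then encode as ASCII) can be illustrated with the standard library. The byte layout below is invented for the illustration and is not the format compress() actually writes; strings produced here cannot be read by from_compressed().

```python
import base64
import gzip

import numpy as np

mz = np.array([500.1, 800.2, 1200.5])
intensity = np.array([1e5, 2e5, 9e4])

# Delta-encode m/z: first value, then successive differences
deltas = np.diff(mz, prepend=0.0)

# Illustrative layout: float64 deltas followed by float32 intensities
payload = deltas.astype("<f8").tobytes() + intensity.astype("<f4").tobytes()
blob = base64.b85encode(gzip.compress(payload)).decode("ascii")

# Reverse the steps
raw = gzip.decompress(base64.b85decode(blob))
n = len(raw) // 12  # 8 bytes per m/z delta + 4 bytes per intensity
mz_back = np.cumsum(np.frombuffer(raw[: 8 * n], dtype="<f8"))
int_back = np.frombuffer(raw[8 * n:], dtype="<f4").astype(np.float64)
print(blob)
```

Delta encoding helps because sorted m/z values produce small, similar differences that compress better than the raw values. A URL-safe variant would swap base64.b85encode for base64.urlsafe_b64encode.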

`Spectrum.from_compressed`

@classmethod
def from_compressed(cls, compressed_str: str) -> Spectrum

Reconstructs a Spectrum from a string produced by .compress().

recovered = Spectrum.from_compressed(blob)

Visualization

`plot`

def plot(
    self,
    title: str | None = None,
    show_charges: bool = True,
    show_scores: bool = True,
    **layout_kwargs,
) -> Figure

Returns a Plotly Figure (stick plot). Requires plotly (pip install plotly).

Parameter	Description
`title`	Plot title
`show_charges`	Colour sticks by charge state when a `charge` array is present
`show_scores`	Annotate scored peaks with their score value when a `score` array is present

spec.plot(title="My spectrum").show()
decon.plot(show_charges=True, show_scores=True).show()

See Visualization for mirror_plot() and annotate_spectrum().

`plot_table`

def plot_table(
    self,
    show_charges: bool = True,
    show_scores: bool = True,
) -> pd.DataFrame

Returns a pandas.DataFrame with one row per peak. Each row contains both the raw peak data (mz, intensity, charge, score, im) and all visual properties (color, linewidth, opacity, series, label, label_size, label_font, label_color, label_yshift, label_xanchor, hover). Modify the table freely, then render it with plot_from_table().

tbl = decon.plot_table()
tbl.loc[tbl["charge"] == 2, "color"] = "red"
tbl.loc[tbl["intensity"] > 1e5, "linewidth"] = 2.0
fig = plot_from_table(tbl, title="Custom plot")
fig.show()

`annot_plot_table`

def annot_plot_table(
    self,
    fragments,
    mz_tol: float = 0.02,
    mz_tol_type: Literal["Da", "ppm"] = "Da",
    peak_selection: Literal["closest", "largest", "all"] = "closest",
    include_sequence: bool = False,
) -> pd.DataFrame

Like plot_table() but matched peaks are coloured by ion series and labelled with their fragment identifier. Unmatched peaks are grey. Modify the returned table and call plot_from_table() to render.

tbl = spec.annot_plot_table(fragments, mz_tol=10, mz_tol_type="ppm")
tbl.loc[tbl["label"] != "", "label_size"] = 14
fig = plot_from_table(tbl, title="Annotated")
fig.show()

See Visualization — Plot table API for full column reference.

MsnSpectrum

MsnSpectrum extends Spectrum with instrument-level metadata. It is what the readers (DReader, MzmlReader) yield. All Spectrum methods are available unchanged.

@dataclass(slots=True, kw_only=True)
class MsnSpectrum(Spectrum):
    # Scan identification
    scan_number: int | None = None
    ms_level: int | None = None
    native_id: str | None = None

    # Timing
    rt: float | None = None          # retention time, seconds
    injection_time: float | None = None  # ion accumulation time, ms

    # Acquisition windows
    mz_range: tuple[float, float] | None = None
    im_range: tuple[float, float] | None = None
    im_type: str | None = None       # e.g. "1/K0", "drift_time_ms"

    # Instrument settings
    polarity: Literal["positive", "negative"] | None = None

    # Optional metadata
    resolution: float | None = None
    analyzer: str | None = None      # e.g. "TOF", "FTMS", "ITMS"
    ramp_time: float | None = None
    collision_energy: float | None = None
    activation_type: str | None = None
    precursors: list[TargetIon] | None = None

TargetIon

@dataclass(frozen=True, slots=True, kw_only=True)
class TargetIon(Peak):
    is_monoisotopic: bool | None

Represents a precursor ion selected for fragmentation. Stored in MsnSpectrum.precursors.

Example: inspecting an MS2 spectrum

from spxtacular import MzmlReader

reader = MzmlReader("run.mzML")
for spec in reader.ms2:
    print(f"Scan {spec.scan_number}, RT={spec.rt:.1f}s, CE={spec.collision_energy}")
    if spec.precursors:
        prec = spec.precursors[0]
        print(f"  Precursor: {prec.mz:.4f} m/z, z={prec.charge}")
    break
