Getting Started

Installation

pip install tdfpy

Requires Python 3.12+.

Detecting Acquisition Type

Before loading data, you can inspect the acquisition type of a .d folder. In all examples on this page, D_PATH stands for the path to that folder:

from tdfpy import get_acquisition_type

acq_type = get_acquisition_type(D_PATH)
# Returns one of: "DDA", "DIA", "PRM", "Unknown"
print(acq_type)
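Because the result is a plain string, it can drive the choice of reader. The sketch below is a hypothetical helper and not part of tdfpy. It maps the documented return values to the reader names used on this page, and it assumes that "PRM" and "Unknown" have no matching reader, which may not hold for your version of the library.

```python
# Hypothetical helper, not part of tdfpy: maps an acquisition type string
# to the name of the reader class used in this guide.
READER_FOR_TYPE = {"DDA": "DDA", "DIA": "DIA"}


def reader_name_for(acq_type: str) -> str:
    """Return the reader class name to use for the given acquisition type."""
    try:
        return READER_FOR_TYPE[acq_type]
    except KeyError:
        # Assumption: "PRM" and "Unknown" are treated as unsupported here.
        raise ValueError(
            f"No reader chosen for acquisition type {acq_type!r}"
        ) from None
```

Failing loudly on an unexpected type keeps a mislabeled or unsupported folder from being opened with the wrong reader.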

DDA Acquisitions

from tdfpy import DDA

with DDA(D_PATH) as dda:
    # Iterate over MS1 frames
    for frame in dda.ms1:
        print(f"Frame {frame.frame_id} at RT {frame.time:.1f}s")
        # Centroid the frame — returns shape (N, 3): [m/z, intensity, 1/K0]
        peaks = frame.centroid()
        print(f"  {len(peaks)} centroided peaks")

    # Iterate over precursors (MS2)
    for precursor in dda.precursors:
        print(f"Precursor {precursor.precursor_id}: {precursor.largest_peak_mz:.4f} m/z")
        # Raw centroided peaks from Bruker's algorithm
        peaks = precursor.peaks
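The MS1 loop above notes that frame.centroid() returns rows of [m/z, intensity, 1/K0]. As a minimal sketch of working with that layout, the hypothetical helper below picks the most intense peaks. It assumes only that the result can be iterated row by row, as a two-dimensional array can, and it uses made-up rows in place of real data.

```python
from typing import Iterable, List, Sequence, Tuple

Peak = Tuple[float, float, float]  # (m/z, intensity, 1/K0)


def top_peaks(peaks: Iterable[Sequence[float]], n: int) -> List[Peak]:
    """Return the n most intense peaks as (mz, intensity, inv_k0) tuples."""
    rows = [
        (float(mz), float(intensity), float(inv_k0))
        for mz, intensity, inv_k0 in peaks
    ]
    # Sort on the intensity column (index 1), highest first.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:n]


# Made-up rows standing in for the output of frame.centroid().
example = [
    [500.25, 1200.0, 0.85],
    [622.31, 8400.0, 0.92],
    [745.40, 300.0, 1.05],
]
strongest = top_peaks(example, 2)
```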

DIA Acquisitions

from tdfpy import DIA

with DIA(D_PATH) as dia:
    # MS1 frames
    for frame in dia.ms1:
        peaks = frame.centroid()

    # DIA windows
    for window in dia.windows:
        print(f"Window group {window.window_group}: isolation {window.isolation_mz} m/z")
        peaks = window.centroid()

Lookups and Queries

Access frames, precursors, or windows directly by ID or query by properties.

from tdfpy import DDA

with DDA(D_PATH) as dda:
    # Access by ID
    frame = dda.ms1[1]
    precursor = dda.precursors[1]

    # Query precursors by m/z and retention time
    results = dda.precursors.query(
        mz=1292.63,
        mz_tolerance=20.0,       # ppm by default
        rt=2400.0,               # seconds
        rt_tolerance=30.0,       # seconds
    )
    for p in results:
        print(p.precursor_id, p.largest_peak_mz)
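To get a feel for what a ppm tolerance means in absolute terms, the arithmetic can be worked out by hand. The helper below is illustrative only and assumes the tolerance is applied symmetrically around the target m/z; how tdfpy applies it internally is not stated here.

```python
from typing import Tuple


def ppm_window(mz: float, tolerance_ppm: float) -> Tuple[float, float]:
    """Return the (low, high) m/z bounds for a symmetric ppm tolerance."""
    delta = mz * tolerance_ppm * 1e-6  # 1 ppm is one millionth of the m/z
    return mz - delta, mz + delta


low, high = ppm_window(1292.63, 20.0)
print(f"{low:.4f} to {high:.4f}")  # prints 1292.6041 to 1292.6559
```

At m/z 1292.63, 20 ppm is about 0.026 m/z on either side of the target.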

from tdfpy import DIA

with DIA(D_PATH) as dia:
    # Get all windows in a window group
    group_windows = dia.windows[0]  # returns a list

    # Query windows by retention time
    results = dia.windows.query(rt=1200.0, rt_tolerance=60.0)
    for w in results:
        print(w.window_group, w.isolation_mz)

How Data Access Works

A .d folder contains two files: analysis.tdf (a SQLite database with metadata) and analysis.tdf_bin (a binary file with the raw spectral data).
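Given that layout, a quick sanity check before opening a reader is to confirm that both files are present. The function below is a hypothetical convenience and not part of tdfpy; it uses only the two file names given above.

```python
from pathlib import Path
from typing import List, Union

REQUIRED_FILES = ("analysis.tdf", "analysis.tdf_bin")


def missing_tdf_files(d_path: Union[str, Path]) -> List[str]:
    """Return the names of required files that are absent from a .d folder."""
    folder = Path(d_path)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]
```

An empty list means both files were found; anything else names what is missing.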

When you open a DDA or DIA reader, it immediately opens a connection to the binary file and reads all frame and precursor metadata from the SQLite database into memory.
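Since analysis.tdf is an ordinary SQLite database, the metadata can also be inspected with Python's standard sqlite3 module, independently of tdfpy. The sketch below lists the table names in a database file. It opens the file read-only and makes no assumption about which tables exist.

```python
import sqlite3
from pathlib import Path
from typing import List, Union


def list_tables(tdf_path: Union[str, Path]) -> List[str]:
    """Return the table names in a SQLite file, sorted alphabetically."""
    # Open read-only so the inspection cannot modify the file.
    uri = Path(tdf_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        con.close()
    return [name for (name,) in rows]
```

Calling list_tables on the analysis.tdf file inside a .d folder shows which metadata tables are available.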

The objects you get back — Frame, Precursor, DiaWindow, etc. — all hold a reference to that open connection. Their fields (frame_id, rt, monoisotopic_mz, etc.) are available immediately. Spectral data is fetched lazily: calling .peaks or .centroid() reads from the binary file at that moment.

This means an object cannot fetch spectral data after its reader closes:

from tdfpy import DDA

with DDA(D_PATH) as dda:
    frame = dda.ms1[1]
    peaks = frame.centroid()  # fine — connection is open

# peaks = frame.centroid()  # RuntimeError: connection is closed
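The same lifecycle can be reproduced with a toy model. ToyReader and ToyFrame below are hypothetical stand-ins and not tdfpy's implementation. They only mimic the behavior described above: metadata is available at once, spectral data is read on demand, and reads fail once the reader has closed.

```python
class ToyReader:
    """Stand-in for a reader that owns an open data source."""

    def __init__(self, data):
        self._data = data
        self.is_open = False

    def __enter__(self):
        self.is_open = True
        return self

    def __exit__(self, exc_type, exc, tb):
        self.is_open = False
        return False  # do not suppress exceptions

    def frame(self, frame_id):
        return ToyFrame(self, frame_id)

    def read(self, frame_id):
        if not self.is_open:
            raise RuntimeError("connection is closed")
        return list(self._data[frame_id])


class ToyFrame:
    """Holds metadata eagerly and a reference to its reader for lazy reads."""

    def __init__(self, reader, frame_id):
        self._reader = reader
        self.frame_id = frame_id  # metadata: available immediately

    def centroid(self):
        # Lazy: data is read from the reader only when requested.
        return self._reader.read(self.frame_id)


with ToyReader({1: [(500.25, 1200.0, 0.85)]}) as reader:
    frame = reader.frame(1)
    peaks = frame.centroid()  # copy the data out while the reader is open

# Here, `peaks` and `frame.frame_id` are still usable,
# but calling frame.centroid() would raise RuntimeError.
```

The practical rule is to pull out everything you need inside the with block and keep the results, not the objects, for later use.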

Development

The project uses uv for dependency management and just as a task runner.

just install-dev     # install with dev dependencies
just test            # run tests
just lint            # ruff linter
just check           # lint + test + type check

To serve the docs locally:

uv run --group docs mkdocs serve