swarmsort.embedding_scaler module

Embedding distance scaler for SwarmSort.

This module provides scaling methods for embedding distances, intended to improve tracking performance and numerical stability.

class swarmsort.embedding_scaler.EmbeddingDistanceScaler(method='robust_minmax', update_rate=0.05, min_samples=200, update_interval=3)

Bases: object

Embedding distance scaler that supports multiple scaling methods, selected through the method argument so that they can be compared.

get_statistics()

Return the current scaler statistics.

Return type: dict

reset()

Full reset of all statistics.

Use this when:

- Starting tracking on a new video/scene
- The embedding distribution has changed significantly
- A scene change is detected (camera switch, dramatic lighting change)

After reset, the scaler will need min_samples frames to become ready again.

restore_update_rate(rate=None)

Restore the update rate after a soft reset.

Parameters: rate (float) – Rate to restore to. If None, the default of 0.05 is used.

scale_distances(distances)

Scale distances using the selected method.

Return type: ndarray
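
The source does not spell out what each method computes. As a rough illustration only, the sketch below assumes that 'robust_minmax' means min-max scaling against low and high percentiles of previously seen distances rather than the raw extremes. The helper names, the 5th/95th percentile choice, and the use of plain Python lists instead of an ndarray are all assumptions made for this example, not SwarmSort's implementation.

```python
def percentile(sorted_vals, q):
    """Linearly interpolated percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("need at least one value")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def robust_minmax_scale(distances, history, low_q=5.0, high_q=95.0):
    """Map distances to [0, 1] using percentiles of `history` as the bounds.

    Hypothetical stand-in: the percentile bounds are an assumption.
    """
    ordered = sorted(history)
    low = percentile(ordered, low_q)
    high = percentile(ordered, high_q)
    span = max(high - low, 1e-8)  # guard against a degenerate range
    # Clip so that outliers beyond the percentile bounds stay inside [0, 1].
    return [min(1.0, max(0.0, (d - low) / span)) for d in distances]
```

Using percentile bounds means that a handful of extreme distances in the history cannot stretch the output range, which is the usual motivation for a "robust" variant of min-max scaling.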

soft_reset(faster_update_rate=0.2)

Soft reset: keep the existing statistics but temporarily increase the learning rate.

Use this when:

- The embedding distribution may be shifting gradually
- You want to adapt faster without losing all history

Parameters: faster_update_rate (float) – Temporary update rate. Defaults to 0.2, four times the default update_rate of 0.05.
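
To make the effect of a temporary rate change concrete, here is a minimal sketch that assumes the running statistics are updated as an exponential moving average controlled by update_rate. EmaTracker is a hypothetical stand-in written for this example; only the method names and default values mirror the documented API, and the actual update rule inside SwarmSort may differ.

```python
class EmaTracker:
    """Hypothetical stand-in that tracks a single running mean."""

    def __init__(self, update_rate=0.05):
        self.update_rate = update_rate
        self.mean = None

    def update(self, value):
        # The first sample initialises the mean; later samples blend in
        # at a speed set by update_rate (assumed EMA rule).
        if self.mean is None:
            self.mean = float(value)
        else:
            self.mean += self.update_rate * (value - self.mean)
        return self.mean

    def soft_reset(self, faster_update_rate=0.2):
        # Keep the accumulated mean, only adapt faster from now on.
        self.update_rate = faster_update_rate

    def restore_update_rate(self, rate=None):
        # None falls back to the documented default of 0.05.
        self.update_rate = 0.05 if rate is None else rate


tracker = EmaTracker()
tracker.update(0.30)           # mean starts at 0.30
tracker.soft_reset()           # rate becomes 0.2
fast = tracker.update(0.70)    # moves 20% of the gap: about 0.38
tracker.restore_update_rate()  # rate back to 0.05
slow = tracker.update(0.70)    # moves 5% of the remaining gap: about 0.396
```

Under this assumption a soft reset never discards history; it only changes how strongly each new sample pulls the statistics, so restoring the rate afterwards returns the tracker to its slower, more stable behaviour.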

update_statistics(distances)

Update the running statistics with new distance samples (optimized implementation).

warmup(n_samples=None)

Pre-populate scaler with synthetic data to avoid mode transition spike.

When the scaler transitions from simple scaling (sample_count < min_samples) to percentile-based scaling, there can be a performance spike as statistics are computed for the first time with real data sizes.

This method pre-populates the scaler with synthetic data that approximates typical embedding distance distributions, avoiding the cold-start spike.

Parameters: n_samples (int) – Number of synthetic samples to generate. Defaults to min_samples. Should be >= min_samples to enable percentile-based scaling immediately.
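
The warm-up idea can be sketched as follows. The source does not say which synthetic distribution is used, so the Beta(2, 5) draw below is purely an assumption chosen because it yields values in [0, 1] skewed towards small distances. WarmableScaler, synthetic_distances, and the ready property are hypothetical names for illustration, not part of SwarmSort.

```python
import random


def synthetic_distances(n_samples, seed=0):
    """Draw placeholder distances in [0, 1] (assumed Beta(2, 5) shape)."""
    rng = random.Random(seed)
    return [rng.betavariate(2.0, 5.0) for _ in range(n_samples)]


class WarmableScaler:
    """Hypothetical stand-in showing only the sample-count gating."""

    def __init__(self, min_samples=200):
        self.min_samples = min_samples
        self.samples = []

    @property
    def ready(self):
        # Percentile-based scaling is assumed to start at min_samples.
        return len(self.samples) >= self.min_samples

    def warmup(self, n_samples=None):
        # Mirror the documented default: fall back to min_samples.
        n = self.min_samples if n_samples is None else n_samples
        self.samples.extend(synthetic_distances(n))


scaler = WarmableScaler(min_samples=200)
scaler.warmup()  # the scaler is ready before any real distances arrive
```

Calling warmup() with fewer than min_samples samples would leave such a scaler below the threshold, which is why the parameter description recommends n_samples >= min_samples.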