
Research

Open datasets for space research

Derived datasets for ML research in orbital mechanics and space domain awareness. Built on public sources, processed by our algorithms.

Our approach

We build datasets by processing publicly available data through our algorithms and physics models.

Public inputs

TLEs from Space-Track, space weather from NOAA, ground station locations from public databases. Transparent provenance.

Our processing

Conjunction screening, maneuver detection, orbital mechanics computation, physics-based simulation. The value we add.

Research-ready outputs

Clean, documented datasets with train/val/test splits. Ready for ML research and benchmarking.

Available Q2 2026

We're preparing these datasets for public release. Contact us for early access or research collaboration.

From Public TLE Data

Orbital Intelligence Datasets

Derived from public Two-Line Element data via our orbital analysis algorithms.

01

Conjunction Events Dataset

Close approach events computed from public TLE catalog. For each event: time of closest approach, miss distance, relative velocity, and collision probability estimate. Enables research on conjunction screening and collision avoidance.

Source: Space-Track TLEs | Formats: Parquet / CSV

Processing: All-pairs screening of the catalog. Candidate pairs are first filtered by orbital geometry, then refined with numerical propagation.
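As an illustration of what a geometry filter can look like, here is a minimal sketch of the classic apogee/perigee check, which discards pairs whose altitude bands never come close. It is not necessarily the filter used in our pipeline. The function names, the 10 km padding, and the sample element values are assumptions made for this example, and the semi-major axis is derived from mean motion with a simple two-body relation.

```python
import math

MU_EARTH = 398600.4418  # km^3/s^2, Earth's gravitational parameter
R_EARTH = 6378.137      # km, Earth's equatorial radius


def semi_major_axis_km(mean_motion_rev_day):
    """Two-body semi-major axis from mean motion in revolutions per day."""
    n = mean_motion_rev_day * 2.0 * math.pi / 86400.0  # rad/s
    return (MU_EARTH / n ** 2) ** (1.0 / 3.0)


def altitude_band_km(mean_motion_rev_day, ecc):
    """Return (perigee altitude, apogee altitude) in km."""
    a = semi_major_axis_km(mean_motion_rev_day)
    return a * (1.0 - ecc) - R_EARTH, a * (1.0 + ecc) - R_EARTH


def may_conjunct(obj_a, obj_b, pad_km=10.0):
    """Apogee/perigee filter on (mean_motion, eccentricity) tuples.

    A pair is kept only if the gap between the two altitude bands
    is at most pad_km (hypothetical padding for TLE uncertainty).
    """
    per_a, apo_a = altitude_band_km(*obj_a)
    per_b, apo_b = altitude_band_km(*obj_b)
    gap = max(per_a, per_b) - min(apo_a, apo_b)
    return gap <= pad_km


# Hypothetical elements: two LEO objects and one GEO object.
leo_1 = (15.50, 0.0001)
leo_2 = (15.49, 0.0012)
geo = (1.0027, 0.0001)
print(may_conjunct(leo_1, leo_2), may_conjunct(leo_1, geo))
```

Pairs that pass a cheap check like this would still need the numerical propagation step to find the actual time of closest approach.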

02

Maneuver Detection Dataset

Satellite maneuvers detected from TLE sequence discontinuities. Includes maneuver timing, estimated delta-v, and classification (station-keeping, orbit raise, plane change, collision avoidance). Ground truth derived from public conjunction warnings and known events.

Source: Space-Track TLEs | Formats: Parquet / CSV

Processing: Sequential TLE analysis detecting orbital element discontinuities beyond propagation error thresholds.
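The idea of flagging discontinuities can be sketched with a simple outlier test on consecutive semi-major axis values. This is a simplified stand-in, not our detection algorithm: the function name, the robust threshold based on the median absolute deviation, and the default values of `k` and `floor_km` are all assumptions chosen for illustration.

```python
from statistics import median


def detect_discontinuities(sma_km, k=5.0, floor_km=0.05):
    """Return indices i where the step sma_km[i] - sma_km[i-1] is an outlier.

    Steps are compared against the median step (slow decay from drag),
    using a threshold of k scaled MADs, never below floor_km.
    """
    steps = [b - a for a, b in zip(sma_km, sma_km[1:])]
    if not steps:
        return []
    med = median(steps)
    mad = median(abs(s - med) for s in steps)
    threshold = max(k * 1.4826 * mad, floor_km)
    return [i + 1 for i, s in enumerate(steps) if abs(s - med) > threshold]


# Hypothetical series: steady decay, then a sudden raise at index 4.
series = [6790.00, 6789.98, 6789.96, 6789.94, 6791.50, 6791.48, 6791.46]
print(detect_discontinuities(series))
```

A real threshold would be tied to the expected propagation error between TLE epochs rather than to a fixed floor.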

03

Orbit Prediction Dataset

Historical TLE sequences with train/test splits for orbit prediction benchmarking. Given past TLEs, predict future orbital elements. Includes SGP4 baseline predictions for comparison.

Source: Space-Track TLEs | Formats: Parquet / CSV

Processing: Temporal splits ensuring no leakage, stratified by orbital regime (LEO/MEO/GEO) and object type.
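A temporal split without leakage can be sketched as a single cutoff on TLE epoch, so that every training record precedes every test record. The record layout (`norad_id`, `epoch`) and the function name are assumptions for this example, and stratification by regime and object type is left out to keep it short.

```python
from datetime import datetime


def temporal_split(records, cutoff):
    """Split records so every training epoch precedes every test epoch."""
    train = [r for r in records if r["epoch"] < cutoff]
    test = [r for r in records if r["epoch"] >= cutoff]
    # Leakage check: no training record may be as late as a test record.
    if train and test:
        latest_train = max(r["epoch"] for r in train)
        earliest_test = min(r["epoch"] for r in test)
        assert latest_train < earliest_test
    return train, test


# Hypothetical records for one object.
records = [
    {"norad_id": 25544, "epoch": datetime(2024, 1, 1)},
    {"norad_id": 25544, "epoch": datetime(2024, 1, 15)},
    {"norad_id": 25544, "epoch": datetime(2024, 2, 1)},
    {"norad_id": 25544, "epoch": datetime(2024, 2, 20)},
]
train, test = temporal_split(records, cutoff=datetime(2024, 2, 1))
print(len(train), len(test))
```

Splitting by time rather than at random matters here because consecutive TLEs of the same object are strongly correlated.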

04

Satellite Classification Dataset

Orbital behavior sequences labeled with satellite type and operational status. Labels derived from UCS Satellite Database and public catalogs. For classification research using only orbital elements as input.

Source: Space-Track + UCS Database | Formats: Parquet / CSV

Processing: TLE sequences joined with public satellite metadata, with labels cleaned and standardized.

From Orbital Mechanics

Computed Environment Datasets

Generated from orbital mechanics calculations and physics models. No hardware telemetry required.

01

Eclipse Timing Dataset

Precise eclipse entry/exit times computed for satellites across orbital regimes. Includes umbra and penumbra durations, sun angle progressions, and beta angle variations. Foundation for power and thermal modeling.

Source: TLEs + JPL Ephemeris | Formats: Parquet / CSV

Processing: Shadow geometry computation using conical Earth shadow model and high-precision sun ephemeris.
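A conical shadow test can be sketched by comparing the apparent angular radii of the Sun and Earth as seen from the satellite with the angle between their centers. This is a minimal illustration under simplifying assumptions (spherical Earth, Sun position supplied by the caller rather than read from an ephemeris). The function name and constants are chosen for the example.

```python
import math

R_SUN = 695700.0       # km, solar radius
R_EARTH = 6378.137     # km, Earth treated as a sphere (assumption)
AU_KM = 149597870.7    # km, one astronomical unit


def shadow_state(r_sat, r_sun):
    """Classify a satellite as 'sunlit', 'penumbra', or 'umbra'.

    r_sat and r_sun are Earth-centered position vectors in km.
    """
    to_sun = [s - r for s, r in zip(r_sun, r_sat)]
    dist_sun = math.sqrt(sum(x * x for x in to_sun))
    dist_earth = math.sqrt(sum(x * x for x in r_sat))
    a = math.asin(R_SUN / dist_sun)      # apparent radius of the Sun
    b = math.asin(R_EARTH / dist_earth)  # apparent radius of the Earth
    # Angle between the direction to Earth's center and the direction to the Sun.
    cos_c = -sum(r * d for r, d in zip(r_sat, to_sun)) / (dist_earth * dist_sun)
    c = math.acos(max(-1.0, min(1.0, cos_c)))
    if c >= a + b:
        return "sunlit"    # disks do not overlap
    if c <= b - a:
        return "umbra"     # Sun fully hidden behind Earth
    return "penumbra"      # Sun partially hidden


sun = (AU_KM, 0.0, 0.0)
print(shadow_state((7000.0, 0.0, 0.0), sun), shadow_state((-7000.0, 0.0, 0.0), sun))
```

Eclipse entry and exit times would then come from evaluating this state along a propagated trajectory and locating the transitions.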

02

Ground Station Visibility Dataset

Computed visibility windows between satellites and ground stations. Includes elevation profiles, azimuth tracks, and theoretical link margins. Based on public ground station network locations.

Source: TLEs + Public GS Locations | Formats: Parquet / CSV

Processing: Geometric visibility computation with terrain masking and minimum elevation constraints.
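The core of a geometric visibility check is the elevation angle of the satellite above the station's local horizon. The sketch below assumes the satellite position is already in Earth-fixed (ECEF) coordinates and uses the WGS84 ellipsoid for the station. Terrain masking is not modeled, and the function names and the 10 degree default are assumptions for this example.

```python
import math

WGS84_A = 6378.137                 # km, semi-major axis
WGS84_F = 1.0 / 298.257223563      # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)


def geodetic_to_ecef(lat_deg, lon_deg, alt_km=0.0):
    """Convert geodetic latitude/longitude/altitude to ECEF in km."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + alt_km) * math.cos(lat) * math.cos(lon)
    y = (n + alt_km) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + alt_km) * math.sin(lat)
    return x, y, z


def elevation_deg(sat_ecef, lat_deg, lon_deg, alt_km=0.0):
    """Elevation of the satellite above the station's local horizon."""
    gs = geodetic_to_ecef(lat_deg, lon_deg, alt_km)
    rho = [s - g for s, g in zip(sat_ecef, gs)]  # station -> satellite
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    up = (math.cos(lat) * math.cos(lon), math.cos(lat) * math.sin(lon), math.sin(lat))
    rng = math.sqrt(sum(x * x for x in rho))
    sin_el = sum(r * u for r, u in zip(rho, up)) / rng
    return math.degrees(math.asin(max(-1.0, min(1.0, sin_el))))


def is_visible(sat_ecef, lat_deg, lon_deg, min_elevation_deg=10.0):
    """Apply a minimum elevation constraint (hypothetical 10 degree default)."""
    return elevation_deg(sat_ecef, lat_deg, lon_deg) >= min_elevation_deg


overhead = (WGS84_A + 500.0, 0.0, 0.0)
print(round(elevation_deg(overhead, 0.0, 0.0), 3), is_visible(overhead, 0.0, 0.0))
```

A visibility window is the time span during which `is_visible` stays true along the satellite's trajectory.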

03

Radiation Environment Dataset

Modeled radiation exposure along orbital trajectories. Computed using AP-8/AE-8 trapped particle models and solar proton event data from NOAA. Includes South Atlantic Anomaly transit times and dose rate estimates.

Source: TLEs + NOAA Space Weather | Formats: Parquet / CSV

Processing: Orbital position mapped to radiation belt models, correlated with historical space weather indices.

From Our Experiments

Simulation & Experiment Datasets

Generated from our own simulations and ML experiments. Synthetic but realistic.

01

Federated Learning Experiment Logs

Training logs from federated learning experiments with simulated Earth-space network constraints. Includes gradient statistics, convergence curves, communication costs, and accuracy by synchronization strategy.

Source: Synthetic | Formats: Parquet / CSV

Processing: Actual FL training runs with injected latency, bandwidth limits, and intermittent connectivity matching orbital link profiles.

02

Model Partitioning Results

Benchmark results for neural network partitioning across distributed infrastructure. Various model architectures tested with different latency/bandwidth constraint profiles. Optimal split points and performance trade-offs.

Source: Synthetic | Formats: Parquet / CSV

Processing: Exhaustive evaluation of partition points for standard architectures (ResNet, BERT, etc.) under varied constraints.
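Exhaustive evaluation of partition points can be sketched with a simple latency model: run the first k layers on one side, ship the intermediate activation over the link, and run the rest on the other side. The cost model, the function names, and all numbers below are assumptions made for illustration, not measurements from our benchmarks.

```python
def split_latency(layers, k, input_bytes, bandwidth_bps, link_latency_s):
    """End-to-end latency when the first k layers run on the device.

    layers is a list of (device_seconds, server_seconds, output_bytes).
    """
    device = sum(layer[0] for layer in layers[:k])
    server = sum(layer[1] for layer in layers[k:])
    if k == len(layers):
        transfer = 0.0  # everything runs on the device, nothing is sent
    else:
        payload = input_bytes if k == 0 else layers[k - 1][2]
        transfer = link_latency_s + payload * 8.0 / bandwidth_bps
    return device + transfer + server


def best_split(layers, input_bytes, bandwidth_bps, link_latency_s):
    """Try every split point and return (best_k, best_latency_seconds)."""
    candidates = [
        (split_latency(layers, k, input_bytes, bandwidth_bps, link_latency_s), k)
        for k in range(len(layers) + 1)
    ]
    latency, k = min(candidates)
    return k, latency


# Hypothetical three-layer model: (device_s, server_s, output_bytes).
layers = [(0.010, 0.001, 4_000_000), (0.020, 0.002, 500_000), (2.000, 0.010, 4_000)]
print(best_split(layers, 1_000_000, 8e6, 0.05))   # slow link
print(best_split(layers, 1_000_000, 8e9, 0.001))  # fast link
```

With these made-up numbers the slow link favors splitting after the second layer, where the activation is small, while the fast link favors sending the raw input.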

03

Gradient Compression Benchmarks

Evaluation of gradient compression techniques for bandwidth-limited distributed training. Compression ratios, reconstruction error, and downstream model accuracy across compression methods and rates.

Source: Synthetic | Formats: Parquet / CSV

Processing: Systematic evaluation of quantization, sparsification, and learned compression on standard training tasks.
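Two of the technique families named above, sparsification and quantization, can be sketched in a few lines, together with the reconstruction error they introduce. This is a toy illustration on a plain Python list with hypothetical function names. Learned compression is not shown.

```python
import math


def top_k_sparsify(grad, k):
    """Keep the k largest-magnitude entries as {index: value}."""
    order = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)
    return {i: grad[i] for i in order[:k]}


def densify(sparse, length):
    """Rebuild a dense gradient, filling dropped entries with zero."""
    return [sparse.get(i, 0.0) for i in range(length)]


def quantize_uniform(grad, bits=8):
    """Symmetric uniform quantization to signed integer codes."""
    scale = max(abs(g) for g in grad) or 1.0
    levels = 2 ** (bits - 1) - 1
    step = scale / levels
    return [round(g / step) for g in grad], step


def dequantize(codes, step):
    return [c * step for c in codes]


def relative_error(original, reconstructed):
    """L2 norm of the difference divided by the L2 norm of the original."""
    num = math.sqrt(sum((o - r) ** 2 for o, r in zip(original, reconstructed)))
    den = math.sqrt(sum(o * o for o in original))
    return num / den


grad = [0.5, -0.01, 0.02, -1.0, 0.003]  # hypothetical gradient values
sparse = top_k_sparsify(grad, k=2)
codes, step = quantize_uniform(grad, bits=8)
print(sorted(sparse), relative_error(grad, densify(sparse, len(grad))))
print(codes, relative_error(grad, dequantize(codes, step)))
```

Going from 32-bit floats to 8-bit codes gives a nominal 4x compression ratio, and the benchmark question is how much downstream accuracy each such trade costs.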

Data formats

All datasets are available in two formats to suit different workflows.

Apache Parquet

Columnar format optimized for analytical queries. Best for large-scale processing with Spark, DuckDB, or pandas. Includes schema and compression.

CSV

Universal format for maximum compatibility. Works with any tool or language. Includes header row with column names.

Data licensing

License | Use Case | Cost
CC BY 4.0 | Academic and commercial use with attribution | Free
Enterprise License | Custom processing, private datasets, SLA | Contact us

Note: Our datasets are derived from public sources. Access to the original data on Space-Track requires accepting its user agreement. NOAA data is in the public domain.

Want early access?

Contact us if you're working on space research and need access to our datasets before public release.