Interactive Simulation

Earth-Space Distributed AI

Explore how RotaStellar coordinates federated learning, model partitioning, and synchronization across Earth and orbital infrastructure.

Federated Learning Simulation
Round 1,247 · 00:04:32

RotaStellar vs Naive

Bandwidth -99.0%
Convergence +23.4%
Fault tolerance +∞
Accuracy delta -0.31%
[Diagram: LEO constellation at 550 km feeding a ground AGGREGATOR holding Global Model v1247. Ground nodes: us-west-2 (computing), eu-west-1 (syncing), ap-south-1 (ready), ap-northeast (computing). Orbital nodes: orbital-1 (sunlight, ISL relay), orbital-2 (eclipse, 47 s remaining). Compressed gradient ∇′ = 42 KB via 100× TopK + quantization; throughput 1.2 MB/s vs 120 MB/s naive.]
Configuration
Live Metrics
Accuracy
94.72%
Target: 95.0%
Loss
0.0523
Bandwidth
1.2 MB/s
99% reduction
Sync Rate
2.4/min
Async aggregation
Node Status
🖥️ us-west: computing
🖥️ eu-west: syncing
🖥️ ap-south: ready
🖥️ ap-ne: computing
🛰️ orb-1: sunlight
🛰️ orb-2: eclipse
Event Stream
04:32.1 Round 1247 aggregated
04:31.8 eu-west gradient: 38 KB
04:31.2 TopK sparsification: 99.1%
04:30.5 orbital-2 entering eclipse
04:29.8 🛰 ISL handover complete
Training Convergence
RotaStellar (compressed)
Naive (full gradients)
Centralized baseline
[Chart: accuracy (0–100%) over 1,000 training rounds for the three runs above, annotated at 94.7% and "Bandwidth limited".]
Technical Deep Dive: Gradient Compression

Our gradient compression achieves 100× reduction while maintaining convergence by combining Top-K sparsification with stochastic quantization. This is critical for space links where bandwidth is measured in KB/s, not GB/s.

∇′_compressed = Q_8bit( TopK(∇ + e_accumulated, k = 0.01) )

The key insight is error feedback: compression errors are accumulated locally and added to the next round's gradients, ensuring no information is permanently lost.

# RotaStellar gradient compression
import torch

def decompress(quantized, indices, scale, shape):
    # Rebuild the dense gradient from the transmitted Top-K values
    dense = torch.zeros(shape).flatten()
    dense[indices] = quantized.float() * scale
    return dense.view(shape)

def compress_gradient(gradient, error_feedback=0.0, k_ratio=0.01):
    # Error feedback: fold in the residual left over from the previous round
    gradient = gradient + error_feedback

    # Top-K sparsification: keep only the top 1% of entries by magnitude
    flat = gradient.flatten()
    k = max(1, int(flat.numel() * k_ratio))
    _, indices = torch.topk(flat.abs(), k)
    kept = flat[indices]

    # 8-bit stochastic quantization (floor + uniform noise is unbiased)
    scale = kept.abs().max().clamp(min=1e-12) / 127
    quantized = torch.floor(kept / scale + torch.rand_like(kept)).clamp(-127, 127).to(torch.int8)

    # Residual accumulated for the next round so no information is permanently lost
    error = gradient - decompress(quantized, indices, scale, gradient.shape)
    return quantized, indices, scale, error

# Compression ratio: 32-bit × N params → 8-bit × 0.01N values + indices
# For 70B params: 280GB → 2.8GB (indices) + 0.7GB (values) ≈ 100×
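
As a usage sketch (illustrative, not taken from the page), the residual returned each round can be fed back into the next call so that compression errors are eventually transmitted. The loop below assumes the compress_gradient and decompress helpers above; the random stand-in gradient is purely for demonstration.

# Illustrative round loop threading the error-feedback residual (stand-in gradient, not a real training step)
error = 0.0                                    # accumulated residual, starts empty
for round_idx in range(3):
    gradient = torch.randn(1_000_000)          # placeholder for the real local gradient
    q, idx, scale, error = compress_gradient(gradient, error_feedback=error)
    # Only q (int8 values), idx, and scale cross the space link;
    # the aggregator rebuilds the update with decompress(q, idx, scale, gradient.shape)
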
100×
Compression Ratio
<0.5%
Accuracy Loss
+23%
Faster Convergence
Comparison: RotaStellar vs Alternatives
Metric                            | Naive Sync | FedAvg  | RotaStellar
Bandwidth per round               | 280 GB     | 280 GB  | 2.8 GB
Handles intermittent connectivity | No         | Partial | Yes (async)
Eclipse resilience                | Fails      | Stalls  | Continues
Convergence (rounds to 95%)       | N/A        | 1,800   | 1,250
Final accuracy                    | N/A        | 94.2%   | 94.7%
Supports orbital nodes            | No         | No      | Native
Model Partitioning Optimizer
Inference latency: 127ms
[Diagram: LLaMA-70B layer distribution]
GROUND (us-west-2): Embedding (4.2B params), Layers 0-23 (28.4B params), Layers 24-47 (28.4B params); 60.8B params, 45 ms compute
Link: 2.1 MB activations, 12 ms transfer
ORBITAL (orbital-1): Layers 48-63 (18.9B params), Output Head (1.2B params); 20.1B params, 70 ms compute; solar surplus +340 W available
Why this partition? Minimize activation transfer (split at a natural layer boundary) and use the orbital solar surplus for the compute-heavy final layers.
Latency: 127 ms total (vs 180 ms ground-only, vs 95 ms centralized)
Latency breakdown: 45 ms ground compute + 12 ms transfer + 70 ms orbital compute = 127 ms
Partition Config
Performance
End-to-end
127ms
vs Ground-only
-29%
Bandwidth
2.1 MB
Energy saved
+18%
Constraints
Ground: 8× A100 (80GB), 2.4 PFLOPS
Orbital: 4× H100 (80GB), 1.2 PFLOPS, solar-powered
Link: 1.2 Gbps, 12ms RTT (during pass)
Technical Deep Dive: Optimal Partitioning

Finding the optimal model partition is a constrained optimization problem. We minimize end-to-end latency subject to memory, bandwidth, and energy constraints.

min_s [ T_compute(0:s) + T_transfer(s) + T_compute(s:L) ]

Where s is the split layer and L the total number of layers; the split must satisfy the constraints below (a minimal search sketch follows the list):

  • Memory(0:s) ≤ Ground_VRAM
  • Memory(s:L) ≤ Orbital_VRAM
  • Activation_size(s) / Link_bandwidth + RTT ≤ Latency_budget
  • Compute(s:L) ≤ Orbital_energy_budget
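
A minimal brute-force sketch of this search, assuming each candidate layer carries a profiled memory footprint, activation size, per-side compute time, and orbital energy cost; the dictionary keys and the best_split name are illustrative assumptions, not RotaStellar's actual optimizer.

# Illustrative split search over per-layer profiles (assumed data layout, not RotaStellar internals)
def best_split(layers, link_bw_bps, rtt_s, ground_vram, orbital_vram, orbital_energy_j, latency_budget_s):
    # layers: list of dicts with mem_bytes, act_bytes, t_ground_s, t_orbital_s, e_orbital_j
    best = None
    for s in range(1, len(layers)):                        # layers [0:s] on ground, [s:L] in orbit
        ground, orbital = layers[:s], layers[s:]
        t_transfer = ground[-1]["act_bytes"] * 8 / link_bw_bps + rtt_s
        latency = (sum(l["t_ground_s"] for l in ground) + t_transfer
                   + sum(l["t_orbital_s"] for l in orbital))
        feasible = (sum(l["mem_bytes"] for l in ground) <= ground_vram
                    and sum(l["mem_bytes"] for l in orbital) <= orbital_vram
                    and t_transfer <= latency_budget_s
                    and sum(l["e_orbital_j"] for l in orbital) <= orbital_energy_j)
        if feasible and (best is None or latency < best[1]):
            best = (s, latency)
    return best                                            # (split layer s, end-to-end latency) or None
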
-29%
Latency vs Ground-only
+34%
vs Naive Split
340W
Solar Surplus Used
Ground Station Pass Scheduler
Next pass: 2m 34s
[Chart: 24-hour ground station pass schedule (00:00-24:00) for Svalbard, Alaska, and Singapore, with the current time marked NOW]
Sync queue (priority-ordered): P1 Gradients (2.4 MB), P2 Checkpoint (18 MB), P3 Telemetry (4 MB), P4 Logs (12 MB), Deferred (45 MB)
Optimization impact: data freshness 98.2% (vs 67% naive); bandwidth utilization 94.7% (vs 52% naive); priority adherence 100% (critical data first)
Current Pass
Station
Svalbard
Duration
8m 42s
Max elev.
72°
Bandwidth
1.8 Gbps
Queue Status
P1 Critical 2.4 MB
P2 Important 18 MB
P3 Normal 4 MB
P4 Low 57 MB
Event Stream
12:34:12 P1 gradients synced
12:34:08 Pass started (Svalbard)
12:33:45 Queue reordered (P1 priority)
12:32:20 Alaska pass skipped (weather)
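
A minimal sketch of how the priority-ordered queue might be drained within a single pass window, using the Svalbard pass figures above; the plan_pass helper and the 90% link-efficiency factor are assumptions for illustration, not the actual scheduler.

# Illustrative priority-queue drain for one ground-station pass (assumed helper, not RotaStellar's scheduler)
def plan_pass(queue, pass_seconds, link_bps, efficiency=0.9):
    # queue: list of (priority, name, size_bytes); lower priority number = more critical
    budget = pass_seconds * link_bps * efficiency / 8      # usable bytes this pass
    sent, deferred = [], []
    for prio, name, size in sorted(queue, key=lambda item: item[0]):
        if size <= budget:
            budget -= size
            sent.append(name)
        else:
            deferred.append(name)                          # waits for the next pass
    return sent, deferred

# Example with the queue shown above: an 8m42s Svalbard pass at 1.8 Gbps easily clears all four priorities
sent, deferred = plan_pass(
    [(1, "gradients", 2_400_000), (2, "checkpoint", 18_000_000),
     (3, "telemetry", 4_000_000), (4, "logs", 57_000_000)],
    pass_seconds=522, link_bps=1_800_000_000,
)
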

Built on Open Research

Every capability demonstrated here is grounded in our published research, open datasets, and benchmarks.

Models

gradient-compress, model-partition, sync-scheduler, checkpoint-optimizer, bandwidth-predict

View 15 models →

Datasets

Link Budget Archive, ISL Topology, Space Network Traces, Federated Training Logs

View 13 datasets →

Benchmarks

FedSpace, PartitionBench, SyncEfficiency, CheckpointOpt, MeshRoute

View 15 benchmarks →

Ready to build Earth-space AI?

Get early access to distributed compute capabilities and start coordinating AI workloads across ground and orbital infrastructure.

Get Early Access · Talk to Us