Commit 4946666 (parent 0f31e57): prep release

Files changed:
- README.md +176 -0
- make_datasets.sh +47 -0
- s23dr_2026_example/cache_scenes.py +121 -17
- s23dr_2026_example/make_sampled_cache.py +102 -18
- submitted_2048/README.md +35 -0
- submitted_2048/args.json +67 -0
- submitted_2048/checkpoint.pt +3 -0
README.md
ADDED
@@ -0,0 +1,176 @@
---
license: cc-by-nc-4.0
library_name: pytorch
tags:
- 3d-reconstruction
- wireframe
- building
- point-cloud
- s23dr
- cvpr-2026
datasets:
- usm3d/s23dr-2026-sampled_4096_v2
- usm3d/s23dr-2026-sampled_2048_v2
metrics:
- HSS
pipeline_tag: other
---

# S23DR 2026 Learned Baseline

A learned baseline for the **S23DR 2026** challenge (**S**tructured and **S**emantic **3D R**econstruction, or S^2 3DR), part of the [USM3D workshop](https://usm3d.github.io) at CVPR 2026. The model takes a fused point cloud of a building and predicts its wireframe as a set of 3D line segments.

**Headline result: HSS = 0.382** on the 1024-sample validation set (shipped checkpoint). For context, the handcrafted baseline scores HSS = 0.307 on the same split.

## Quick start

Run the submission pipeline directly (this matches the competition eval harness):

```bash
python script.py
```

This loads `checkpoint.pt`, fuses the input views into a 4096-point cloud, runs the model, and writes the predicted wireframe for each scene.

To reproduce the checkpoint from scratch on a single RTX 4090 (~3 hours):

```bash
bash reproduce.sh
```

Or, for a bit-identical deterministic run (~5.5 hours, slower because it disables `torch.compile`):

```bash
bash reproduce_deterministic.sh
```

Both scripts run the full three-stage recipe described below. See `REPRODUCE.md` for the exact hyperparameters and reproducibility notes.

## Architecture

A Perceiver-style transformer that ingests the point cloud as a sequence of per-point tokens and decodes a fixed set of 3D line segments via cross-attention over a latent array.

```
Perceiver: hidden=256, ff=1024
latent_tokens=256, latent_layers=7
encoder_layers=4, decoder_layers=3, cross_attn_interval=4
num_heads=4, kv_heads_cross=2, kv_heads_self=2
qk_norm=L2, rms_norm=True, dropout=0.1
segments=64, segment_param=midpoint_dir_len, segment_conf=True
behind_emb_dim=8, vote_features=True, activation=gelu
```

The decoder predicts 64 candidate segments, each parametrized as midpoint + direction + length with a confidence head. Training uses a Sinkhorn optimal-transport loss to match predicted segments to ground truth, plus a symmetric endpoint L1 term in the cooldown stage.
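
The `midpoint_dir_len` parametrization decodes into endpoints with a few lines of vector arithmetic. A minimal NumPy sketch (the function name, and the assumption that the predicted direction vector needs normalizing, are illustrative rather than taken from the repo):

```python
import numpy as np

def params_to_endpoints(mid, direction, length):
    """Decode midpoint/direction/length segment params into two 3D endpoints.

    mid:       [N, 3] segment midpoints
    direction: [N, 3] direction vectors (normalized here before use)
    length:    [N]    segment lengths
    """
    d = direction / np.linalg.norm(direction, axis=-1, keepdims=True)
    half = 0.5 * length[:, None] * d      # half-length step along the direction
    return mid - half, mid + half

a, b = params_to_endpoints(
    np.array([[0.0, 0.0, 1.0]]),   # midpoint
    np.array([[2.0, 0.0, 0.0]]),   # direction along +x (normalized internally)
    np.array([4.0]),               # length 4
)
# a = [[-2, 0, 1]], b = [[2, 0, 1]]
```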

All architecture and optimizer settings live in `configs/base.json`.

## Training recipe

The model ships with a three-stage recipe. Each stage starts from the previous stage's final checkpoint.

| Stage | Input | Steps | LR | Batch | Notes | HSS |
|---|---|---|---|---|---|---|
| 1. 2048 from scratch | 2048 pts | 0 -> 125k | 3e-4, warmup 10k | 32 | Random init, Sinkhorn only | 0.281 |
| 2. 4096 finetune | 4096 pts | 125k -> 135k | 3e-5 constant | 64 | Gentle LR preserves representations | 0.351 |
| 3. Endpoint cooldown | 4096 pts | 135k -> 170k | 3e-5 then linear decay | 64 | Adds endpoint L1 loss, tightens vertices | **0.382** |

**Why 2048 first:** training directly on 4096 overfits (1.47x train/val ratio vs. 1.19x for 2048). Starting on 2048 produces better-generalized representations that the 4096 finetune can then specialize.

**Why a gentle LR for the finetune:** LR > 1e-4 causes catastrophic forgetting of the geometry understanding learned at 2048.

**Why the endpoint loss only in stage 3:** the Sinkhorn loss operates on the midpoint/direction/length parametrization and does not directly penalize vertex position error. Adding a symmetric endpoint L1 against the detached Sinkhorn assignment tightens vertex precision during the cooldown.
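
A symmetric endpoint L1 can be sketched as follows, assuming matched prediction/GT segment pairs are already given (the repo obtains them from the detached Sinkhorn assignment; this function is illustrative, not the repo's code):

```python
import torch

def symmetric_endpoint_l1(pred_a, pred_b, gt_a, gt_b):
    """Orientation-invariant endpoint L1 over matched segment pairs.

    pred_a, pred_b, gt_a, gt_b: [M, 3] segment endpoints.
    A segment (a, b) is the same as (b, a), so take the cheaper of the
    two endpoint orderings for each pair before averaging.
    """
    fwd = (pred_a - gt_a).abs().sum(-1) + (pred_b - gt_b).abs().sum(-1)
    rev = (pred_a - gt_b).abs().sum(-1) + (pred_b - gt_a).abs().sum(-1)
    return torch.minimum(fwd, rev).mean()

# A flipped but otherwise exact prediction incurs zero loss:
ga = torch.tensor([[0.0, 0.0, 0.0]])
gb = torch.tensor([[1.0, 0.0, 0.0]])
loss = symmetric_endpoint_l1(gb, ga, ga, gb)
# loss == 0.0
```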

Full details, including the "what does not work" list (BuildingWorld pretraining, mixed training, high dropout, etc.), are in `REPRODUCE.md`.
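
For readers unfamiliar with the matching step, here is a minimal dense Sinkhorn sketch with uniform marginals. It omits the dustbin column (`sinkhorn_dustbin=0.3`) and everything loss-specific, so treat it as an illustration of the iteration, not the repo's implementation:

```python
import numpy as np

def sinkhorn_plan(cost, eps=0.1, iters=20):
    """Entropy-regularized transport plan between uniform marginals.

    cost: [N, M] pairwise matching costs (e.g. predicted-to-GT segment
    distances). Returns a soft assignment matrix P with total mass 1.
    """
    n, m = cost.shape
    K = np.exp(-cost / eps)           # Gibbs kernel
    r = np.full(n, 1.0 / n)           # uniform row marginal
    c = np.full(m, 1.0 / m)           # uniform column marginal
    u, v = np.ones(n), np.ones(m)
    for _ in range(iters):            # alternating marginal projections
        u = r / (K @ v)
        v = c / (K.T @ u)
    return u[:, None] * K * v[None, :]

P = sinkhorn_plan(np.array([[0.0, 1.0], [1.0, 0.0]]))
# P concentrates its mass on the low-cost diagonal
```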

## Evaluation

> **About the numbers:** all val scores below are HSS at confidence threshold 0.7, averaged over the 1024-sample *internal validation split* we hold out from the published training data (`usm3d/s23dr-2026-sampled_{2048,4096}_v2:validation`). They are **not** test-set numbers. The only test-set number we have is the public leaderboard score of the older 2048 submission (see `submitted_2048/` and the last table below).
>
> All numbers below were freshly measured for this release against the checkpoints in this repo.

### Shipped model and reproductions

| Model | Checkpoint | HSS @ 4096 | HSS @ 2048 |
|---|---|---|---|
| Handcrafted baseline | — | 0.307 | — |
| **Current release (shipped)** | `checkpoint.pt` | **0.3819** | 0.3734 |
| Closest compiled E2E repro (#4) | `repro_runs/e2e_repro4_hss379/` | 0.3736 | 0.3675 |
| Best compiled repro from this codebase | `repro_runs/compiled_repro_hss376/` | 0.3757 | 0.3670 |
| Deterministic E2E repro (bit-reproducible) | `repro_runs/deterministic_hss372/` | 0.3716 | 0.3665 |

All repros use the exact three-stage recipe on a single RTX 4090. The shipped `checkpoint.pt` was trained with the same recipe before this release branch was cut; the ~0.005-0.010 HSS gap between the shipped model and the repros is compiled-mode run-to-run variance (see the Reproducibility section).

### Training progression (deterministic repro, all stages measured fresh)

| Stage | Steps | HSS @ 4096 | HSS @ 2048 |
|---|---|---|---|
| 1. 2048 from scratch | 125k | 0.2755 | 0.2812 |
| 2. 4096 finetune | 135k | 0.3557 | 0.3510 |
| 3. Endpoint cooldown | 170k | 0.3716 | 0.3665 |

The stage 1 -> stage 2 jump (+0.08 HSS at 4096) is the largest single improvement and motivates the 2048 -> 4096 transfer. Stage 3 (endpoint cooldown) adds another +0.016. Note that stage 1 scores slightly better at 2048 than at 4096 (it was only trained on 2048), while stages 2 and 3 invert that ordering after the 4096 finetune.

### Previously submitted model (2048, single-stage)

The `submitted_2048/` directory holds the checkpoint we actually sent to the public leaderboard. It was trained in a single stage on 2048-point data and is a direct ancestor of the current release.

| Split | Metric | Score |
|---|---|---|
| **Public leaderboard (test)** | **HSS** | **0.427** |
| Internal val @ 2048 | HSS | 0.3692 |
| Internal val @ 4096 | HSS | 0.3665 |

We do not have a test number for the current release, but the val-to-test gap observed for this 2048 submission was about **+0.06 HSS** (0.37 val -> 0.43 test). A similar gap on the current `checkpoint.pt` (0.382 val) would suggest a test score in the low 0.44s, though this is extrapolation and unverified.

## Reproducibility

| Test | Result |
|---|---|
| Forward pass (same ckpt, same input) | bit-identical (0.00 diff) |
| Deterministic mode, 3 independent runs | bit-identical (162 tensors, max_diff=0.0) |
| Stage 3 rerun from the same stage-2 ckpt (2 runs) | HSS = 0.382, 0.384 |
| Compiled-mode E2E variance across runs | ~0.03 HSS (Triton kernel nondeterminism) |

`reproduce_deterministic.sh` produces byte-identical weights across runs with the same seed, at the cost of ~2x slower training (no `torch.compile`). Compiled mode has small run-to-run variance from Triton kernel selection that grows through chaotic SGD dynamics; E2E compiled repros land in the 0.349-0.379 range.

A subtle iteration-order effect: the shipped `bad_samples.txt` has 156 non-empty entries (the file lacks a trailing newline, so `wc -l` reports 155). Two additional bad samples were discovered after training. They are legitimately bad GT, but adding them changes the batch iteration order and costs ~0.005 HSS in deterministic mode and ~0.04 in compiled mode. See the "Reproducibility Notes" section of `REPRODUCE.md` for the full story.
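
The `wc -l` discrepancy is easy to demonstrate: `wc -l` counts newline characters, so a file whose last line has no trailing newline is under-counted by one. A quick illustration (the demo file name is arbitrary):

```shell
# Three entries, no trailing newline after the last one.
printf 'sample_a\nsample_b\nsample_c' > bad_samples_demo.txt

wc -l < bad_samples_demo.txt    # prints 2: counts newline characters only
grep -c . bad_samples_demo.txt  # prints 3: counts non-empty lines
```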

## Repository layout

```
checkpoint.pt               shipped HSS=0.382 model (step 170000), 4096-point input
script.py                   competition inference entry point (uses checkpoint.pt)
s23dr_2026_example/         training package (model, data, train loop, losses)
configs/base.json           shared training config
reproduce.sh                compiled-mode E2E reproduction (~3 hr)
reproduce_deterministic.sh  bit-reproducible E2E reproduction (~5.5 hr)
REPRODUCE.md                detailed recipe, results, ablations, notes

submitted_2048/             the model we actually sent to the public leaderboard (HSS_test=0.427)
  checkpoint.pt             single-stage 2048 model (step 160000)
  args.json                 full training args
  README.md                 training details and val/test scores

repro_runs/                 evidence that the 3-stage recipe reproduces
  e2e_repro4_hss379/        closest compiled E2E repro (val HSS=0.374)
  compiled_repro_hss376/    best compiled repro from this codebase (val HSS=0.376)
  deterministic_hss372/     bit-reproducible deterministic repro (val HSS=0.372)
```

Each directory under `repro_runs/` contains the three stage-final checkpoints (125k / 135k / 170k) plus their `args.json`, so a participant can resume from any stage. Note that the directory names carry the score at the time the directory was created, which may differ by ~0.002 from the fresh evals in the table above due to random variation in post-processing and CUDA kernel selection.

## Related branches

- `main` - this release
- `best-4096-transfer` - working branch with full commit history and internal dev notes
- `validation-archive` - cold archive of all validation runs (logs, final checkpoints, args) used to verify the release

## License

**CC-BY-NC 4.0.** The model weights and code in this repository are released under the Creative Commons Attribution-NonCommercial 4.0 International license. You are free to use, share, and adapt this work for **non-commercial** purposes, provided you give appropriate **attribution**. The training and validation datasets (`usm3d/s23dr-2026-sampled_*`) have their own terms; see the S23DR 2026 competition page for details.

## Acknowledgements

This checkpoint is released as a public learned baseline for participants of the **S23DR 2026** challenge, part of the [USM3D workshop](https://usm3d.github.io) at CVPR 2026.
make_datasets.sh
ADDED
@@ -0,0 +1,47 @@
#!/bin/bash
# Rebuild the sampled datasets from scratch, starting from the public raw
# `usm3d/hoho22k_2026_trainval` dataset. Two stages:
#
#   1. cache_scenes.py       : stream raw shards -> per-scene .pt files
#                              (runs point fusion + priority grouping)
#   2. make_sampled_cache.py : per-scene .pt -> fixed-size .npz files
#                              (priority-samples to seq_len=2048 or 4096)
#
# This reproduces the content of
#   hf://usm3d/s23dr-2026-sampled_2048_v2
#   hf://usm3d/s23dr-2026-sampled_4096_v2
# without needing the intermediate (private) cached_full_pcd dataset.
#
# ~3-4 hr on a workstation for the full train+val set (network-bound in stage 1).
set -e

OUT_ROOT="${1:-cache}"
FULL_TRAIN="$OUT_ROOT/full/train"
FULL_VAL="$OUT_ROOT/full/validation"

# ----- Stage 1: raw -> per-scene .pt -----
echo "=== Stage 1: caching train scenes from raw tars ==="
python -m s23dr_2026_example.cache_scenes --out-dir "$FULL_TRAIN" --split train --skip-existing

echo "=== Stage 1: caching validation scenes from raw tars ==="
python -m s23dr_2026_example.cache_scenes --out-dir "$FULL_VAL" --split validation --skip-existing

# ----- Stage 2: .pt -> sampled .npz -----
for split in train validation; do
    for seq_len in 2048 4096; do
        in_dir="$OUT_ROOT/full/$split"
        out_dir="$OUT_ROOT/sampled_${seq_len}/$split"
        echo "=== Stage 2: sampling $split at seq_len=$seq_len ==="
        python -m s23dr_2026_example.make_sampled_cache \
            --in-dir "$in_dir" --out-dir "$out_dir" --seq-len "$seq_len"
    done
done

echo ""
echo "All done. Sampled datasets are at:"
echo "  $OUT_ROOT/sampled_2048/{train,validation}"
echo "  $OUT_ROOT/sampled_4096/{train,validation}"
echo ""
echo "To train from these, point reproduce.sh at them via"
echo "  --cache-dir \"\$OUT_ROOT/sampled_2048/train\"  (and similar for val/4096)"
echo "instead of the default hf:// URLs."
s23dr_2026_example/cache_scenes.py
CHANGED
@@ -1,31 +1,42 @@
#!/usr/bin/env python3
"""Cache compact scenes from HoHo22k shards to training-ready .pt files.

Streams samples from the public `usm3d/hoho22k_2026_trainval` dataset, runs
`build_compact_scene` (see point_fusion.py), precomputes priority group_id
and semantic class_id, and saves one .pt per scene.

Stage 1 of the dataset pipeline. See make_sampled_cache.py for stage 2.

Usage:
    python -m s23dr_2026_example.cache_scenes --out-dir cache/full --split train
    python -m s23dr_2026_example.cache_scenes --out-dir cache/full_val --split validation

Cache format per .pt file:
    xyz:            float32 [P, 3]   all points in world space
    source:         uint8   [P]      0=colmap, 1=depth
    group_id:       int8    [P]      priority tier 0-4, -1=excluded
    class_id:       uint8   [P]      one-hot class index (0-12)
    behind_gest_id: int16   [P]      behind-gestalt id (-1 if none)
    visible_src:    uint8   [P]      1=gestalt, 2=ade
    visible_id:     int16   [P]      class id within the source space
    n_views_voted:  uint8   [P]      number of views that voted
    vote_frac:      float32 [P]      fraction of votes
    center:         float32 [3]      smart normalization center
    scale:          float32 scalar   smart normalization scale
    gt_vertices:    float32 [V, 3]   ground-truth wireframe vertices
    gt_edges:       int32   [E, 2]   ground-truth wireframe edge indices
"""
from __future__ import annotations

import argparse
import time
from pathlib import Path

import numpy as np
import torch

from .point_fusion import (
    FuserConfig, build_compact_scene,
    GEST_ID_TO_NAME, ADE_ID_TO_NAME, NUM_GEST,
)

@@ -176,3 +187,96 @@ def _compute_smart_center_scale(xyz, source, mad_k=2.5, percentile=95.0,
    return center.astype(np.float32), np.float32(scale)


# ---------------------------------------------------------------------------
# Dataset pipeline stage 1: raw HF sample -> cached .pt
# ---------------------------------------------------------------------------

def _process_one(sample, cfg):
    """Fuse a single HF sample into a cache dict. Returns (order_id, dict) or None."""
    rng = np.random.RandomState()

    n_edges = len(sample.get("wf_edges", []))
    if n_edges == 0 or n_edges > 64:
        return None

    scene = build_compact_scene(sample, cfg, rng=rng)
    if scene is None:
        return None

    gt_v = scene.get("gt_vertices")
    gt_e = scene.get("gt_edges")
    if gt_v is None or gt_e is None or len(gt_e) == 0:
        return None

    xyz = scene["xyz"]
    source = scene["source"]
    group_id, class_id = _compute_group_and_class(
        scene["visible_src"], scene["visible_id"], scene["behind_gest_id"], source)
    center, scale = _compute_smart_center_scale(xyz, source)

    gt_edge_classes = np.asarray(sample["wf_classifications"], dtype=np.int64)
    return sample["order_id"], {
        "xyz": xyz.astype(np.float32),
        "source": source.astype(np.uint8),
        "group_id": group_id,
        "class_id": class_id,
        "behind_gest_id": scene["behind_gest_id"].astype(np.int16),
        "visible_src": scene["visible_src"].astype(np.uint8),
        "visible_id": scene["visible_id"].astype(np.int16),
        "n_views_voted": scene["n_views_voted"],
        "vote_frac": scene["vote_frac"],
        "center": center,
        "scale": scale,
        "gt_vertices": gt_v.astype(np.float32),
        "gt_edges": gt_e.astype(np.int32),
        "gt_edge_classes": gt_edge_classes,
    }


def main():
    p = argparse.ArgumentParser(description="Stage 1: HoHo22k -> cached .pt files")
    p.add_argument("--out-dir", required=True, help="Output directory for .pt files")
    p.add_argument("--split", default="train", choices=["train", "validation"])
    p.add_argument("--limit", type=int, default=0, help="Stop after N samples (0 = all)")
    p.add_argument("--depth-per-view", type=int, default=8000)
    p.add_argument("--skip-existing", action="store_true")
    args = p.parse_args()

    out_dir = Path(args.out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    existing = {fp.stem for fp in out_dir.glob("*.pt")} if args.skip_existing else set()

    from datasets import load_dataset
    print(f"Streaming usm3d/hoho22k_2026_trainval split={args.split}...")
    ds = load_dataset("usm3d/hoho22k_2026_trainval",
                      streaming=True, trust_remote_code=True, split=args.split)

    cfg = FuserConfig(depth_points_per_view=args.depth_per_view)
    saved, skipped = 0, 0
    t0 = time.perf_counter()
    for i, sample in enumerate(ds):
        if args.limit > 0 and i >= args.limit:
            break
        oid = sample["order_id"]
        if oid in existing:
            skipped += 1
            continue
        result = _process_one(sample, cfg)
        if result is None:
            skipped += 1
            continue
        order_id, data = result
        torch.save(data, out_dir / f"{order_id}.pt")
        saved += 1
        if saved % 100 == 0:
            rate = saved / (time.perf_counter() - t0)
            print(f"  saved {saved} (skipped {skipped}) [{rate:.1f}/s]")

    elapsed = time.perf_counter() - t0
    print(f"Done. Saved {saved}, skipped {skipped} in {elapsed:.0f}s.")


if __name__ == "__main__":
    main()
s23dr_2026_example/make_sampled_cache.py
CHANGED
@@ -1,30 +1,30 @@
#!/usr/bin/env python3
"""Stage 2: priority-sample cached .pt scenes into fixed-size .npz files.

Reads the per-scene .pt files produced by cache_scenes.py, priority-samples
a fixed number of points (2048 or 4096), normalizes, and writes one .npz per
scene (~50KB at 2048, ~100KB at 4096).

A fixed seed is used so every scene gets one deterministic sample -- no
per-epoch sampling augmentation; every epoch sees the same points.

Usage:
    python -m s23dr_2026_example.make_sampled_cache \\
        --in-dir cache/full --out-dir cache/sampled_2048 --seq-len 2048
    python -m s23dr_2026_example.make_sampled_cache \\
        --in-dir cache/full --out-dir cache/sampled_4096 --seq-len 4096

The 3:1 colmap:depth quota ratio is fixed: at seq_len=2048 that is
colmap=1536/depth=512; at seq_len=4096 that is colmap=3072/depth=1024.
"""
from __future__ import annotations

import argparse
import time
from pathlib import Path

import numpy as np
import torch


# Priority sampling (same logic as train.py)

@@ -73,3 +73,87 @@ def _priority_sample(source, group_id, seq_len, colmap_quota, depth_quota):
    return indices[:seq_len], mask


def _process_sample(d, seq_len, colmap_q, depth_q):
    """Sample and normalize one cached scene dict into a small npz-ready dict."""
    xyz = np.asarray(d["xyz"], np.float32)
    source = np.asarray(d["source"], np.uint8)
    group_id = np.asarray(d["group_id"], np.int8)
    class_id = np.asarray(d["class_id"], np.uint8)
    vis_src = np.asarray(d["visible_src"], np.uint8)
    vis_id = np.asarray(d["visible_id"], np.int16)
    center = np.asarray(d["center"], np.float32)
    scale = float(d["scale"])
    gt_v = np.asarray(d["gt_vertices"], np.float32)
    gt_e = np.asarray(d["gt_edges"], np.int32)

    indices, mask = _priority_sample(source, group_id, seq_len, colmap_q, depth_q)
    xyz_norm = ((xyz[indices] - center) / scale).astype(np.float32)
    gt_seg = np.stack([gt_v[gt_e[:, 0]], gt_v[gt_e[:, 1]]], axis=1)
    gt_seg_norm = ((gt_seg - center) / scale).astype(np.float32)

    result = {
        "xyz_norm": xyz_norm,
        "class_id": class_id[indices].astype(np.uint8),
        "source": source[indices].astype(np.uint8),
        "mask": mask,
        "gt_segments": gt_seg_norm,
        "scale": np.float32(scale),
        "center": center,
        "gt_vertices": gt_v,
        "gt_edges": gt_e,
        "visible_src": vis_src[indices].astype(np.uint8),
        "visible_id": vis_id[indices].astype(np.int16),
    }
    if "behind_gest_id" in d:
        result["behind"] = np.asarray(d["behind_gest_id"], np.int16)[indices]
    if "n_views_voted" in d:
        result["n_views_voted"] = np.asarray(d["n_views_voted"], np.uint8)[indices]
    if "vote_frac" in d:
        result["vote_frac"] = np.asarray(d["vote_frac"], np.float32)[indices]
    if "gt_edge_classes" in d:
        result["gt_edge_classes"] = np.asarray(d["gt_edge_classes"], np.int64)
    return result


def main():
    p = argparse.ArgumentParser(description="Stage 2: cached .pt -> sampled .npz")
    p.add_argument("--in-dir", required=True, help="Directory of .pt files from cache_scenes.py")
    p.add_argument("--out-dir", required=True, help="Output directory for .npz files")
    p.add_argument("--seq-len", type=int, default=2048, help="Points per sample (2048 or 4096)")
    p.add_argument("--seed", type=int, default=7)
    args = p.parse_args()

    colmap_q = args.seq_len * 3 // 4
    depth_q = args.seq_len - colmap_q
    print(f"seq_len={args.seq_len} colmap={colmap_q} depth={depth_q}")

    out_dir = Path(args.out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    np.random.seed(args.seed)

    files = sorted(Path(args.in_dir).glob("*.pt"))
    print(f"Found {len(files)} .pt files in {args.in_dir}")

    done = 0
    t0 = time.perf_counter()
    for f in files:
        out_f = out_dir / (f.stem + ".npz")
        if out_f.exists():
            done += 1
            continue
        d = torch.load(f, weights_only=False)
        result = _process_sample(d, args.seq_len, colmap_q, depth_q)
        np.savez(out_f, **result)
        done += 1
        if done % 2000 == 0:
            rate = done / (time.perf_counter() - t0)
            print(f"  {done}/{len(files)} [{rate:.0f}/s]")

    elapsed = time.perf_counter() - t0
    print(f"Done. {done} files in {elapsed:.0f}s -> {out_dir}")


if __name__ == "__main__":
    main()
submitted_2048/README.md
ADDED
@@ -0,0 +1,35 @@
# Submitted 2048 Model (public leaderboard entry)

This is the checkpoint that was actually submitted to the S23DR 2026 public leaderboard. It was trained on the 2048-point dataset only (single stage, no 4096 transfer). The current top-level `checkpoint.pt` (HSS=0.382 val) is its direct descendant via the three-stage 2048 -> 4096 -> endpoint-cooldown recipe.

| Split | Metric | Score |
|---|---|---|
| Public leaderboard (test) | HSS | **0.427** |
| Internal val (2048, 1024 samples) | HSS_conf | 0.369 |
| Internal val (4096, 1024 samples) | HSS_conf | 0.367 |

## Training details

Single-stage training on `hf://usm3d/s23dr-2026-sampled_2048_v2:train`:

- **Architecture:** same Perceiver as the current release (hidden=256, latent_tokens=256, latent_layers=7, segments=64)
- **Input:** 2048 points
- **Steps:** 160,000
- **Final LR:** 3e-5 (after cooldown)
- **Batch size:** 32
- **Cooldown:** starts at step 140,000, lasts 20,000 steps
- **Endpoint weight:** 0.1 (used throughout, not only in cooldown)
- **Confidence weight:** 0.1
- **Seed:** 353

Full training args are in `args.json`.

## How to run inference

This checkpoint expects 2048-point input. To run it with the submission harness you would need to modify `script.py` to use `SEQ_LEN = 2048`. Alternatively, load the weights manually via `EdgeDepthSegmentsModel` in `s23dr_2026_example/model.py` and feed it a 2048-point cloud.
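
A loading sketch for the manual route. The nesting of the state dict under a `"model"` key is an assumption about this repo's checkpoint layout (verify against `script.py`), and a stand-in `nn.Linear` takes the place of the real `EdgeDepthSegmentsModel` so the round trip below is self-contained:

```python
import torch
from torch import nn

def load_state(path, module):
    """Load a checkpoint whose state_dict may be nested under "model"."""
    ckpt = torch.load(path, map_location="cpu", weights_only=False)
    state = ckpt["model"] if isinstance(ckpt, dict) and "model" in ckpt else ckpt
    module.load_state_dict(state)
    module.eval()
    return module

# Round trip with the stand-in module:
src = nn.Linear(4, 2)
torch.save({"model": src.state_dict()}, "demo_ckpt.pt")
loaded = load_state("demo_ckpt.pt", nn.Linear(4, 2))
```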

## Why it is included

The current release (`../checkpoint.pt`, HSS=0.382 val) is a strict improvement over this one, but only on the internal val split. The **0.427 public leaderboard score** is the only test-set number we have, so this checkpoint is preserved as the empirical anchor for the val-to-test gap.

Val-to-test gap observed: **0.369 val -> 0.427 test** (about +0.06). The same train/val/test relationship should roughly carry over to the current 0.382-val release, but we do not have a test number for it, since the leaderboard still reflects this older model.
submitted_2048/args.json
ADDED
@@ -0,0 +1,67 @@
{
  "cache_dir": "hf://usm3d/s23dr-2026-sampled_2048_v2:train",
  "val_cache_dir": "",
  "arch": "perceiver",
  "segments": 64,
  "hidden": 256,
  "ff": 1024,
  "latent_tokens": 256,
  "latent_layers": 7,
  "encoder_layers": 4,
  "pre_encoder_layers": 0,
  "decoder_layers": 3,
  "decoder_input_xattn": false,
  "qk_norm": true,
  "qk_norm_type": "l2",
  "learnable_fourier": false,
  "num_heads": 4,
  "kv_heads_cross": 2,
  "kv_heads_self": 2,
  "cross_attn_interval": 4,
  "dropout": 0.1,
  "steps": 160000,
  "batch_size": 32,
  "lr": 3e-05,
  "muon_lr": null,
  "adam_betas": "0.9,0.95",
  "warmup": 10000,
  "cosine_decay": false,
  "cooldown_start": 140000,
  "cooldown_steps": 20000,
  "mup": false,
  "mup_base_width": 128,
  "seed": 353,
  "varifold_weight": 0.0,
  "varifold_cross_only": false,
  "sinkhorn_weight": 1.0,
  "sinkhorn_eps": 0.1,
  "sinkhorn_eps_start": null,
  "sinkhorn_iters": 20,
  "sinkhorn_dustbin": 0.3,
  "vertex_f1_weight": 0.0,
  "soft_hss_weight": 0.0,
  "endpoint_weight": 0.1,
  "endpoint_warmup": 0,
  "aug_rotate": true,
  "aug_jitter": 0.0,
  "aug_drop": 0.0,
  "aug_flip": true,
  "gpu_dataset": false,
  "stored_seq_len": 8192,
  "rms_norm": true,
  "activation": "gelu",
  "behind_emb_dim": 8,
  "vote_features": true,
  "segment_param": "midpoint_dir_len",
  "length_floor": 0.0,
  "segment_conf": true,
  "conf_weight": 0.1,
  "conf_mode": "sinkhorn",
  "conf_clamp_min": null,
  "conf_head_wd": 0.1,
  "optimizer": "adamw",
  "out_dir": "/workspace/s23dr_2026_example/runs",
  "resume": "runs/20260322_085443/checkpoints/step125000.pt",
  "cpu": false,
  "args_from": "runs/20260322_085443/args.json"
}
submitted_2048/checkpoint.pt
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:cc38a61ff512948b1dc92a30129d6efdd093f507948fc5b538050c4a38bfbf6c
size 106460054