See What I See, Know What I Think:
Dense Latent Communication Across Heterogeneous Agents

University of Michigan · NVIDIA · University of Pennsylvania · University of Colorado Boulder · Michigan State University
Preprint 2026
Main teaser figure

See what I see, know what I think. Dense alignment enables accurate and efficient heterogeneous latent communication in context-aware and context-unaware settings.

Abstract

Multi-agent systems communicate mostly through text, paying a lossy and expensive decode and re-encode cost. KV-cache communication is a promising alternative, yet most prior work is homogeneous, using duplicate copies of the same model, and avoids the central challenge of cross-model latent alignment; existing heterogeneous methods are also restrictive, typically assuming shared input and using transferred caches mainly for steering. We study a more fundamental question: can heterogeneous agents be aligned well enough to perform real "mind reading" and transfer both what one agent sees and how it thinks? Our information-structure analysis reveals a duality: context-aware transfer is driven by sparse reasoning signals, while context-unaware transfer, where the receiver sees no input, requires dense contextual knowledge preservation. Motivated by this, we propose dense alignment for heterogeneous KV-cache communication via a lightweight cross-model cache transformation and two-phase training: reconstruction followed by generation. Across all six directions of {Qwen3-4B, 8B, 14B} and six in-domain and out-of-domain benchmarks, our method outperforms prior heterogeneous baselines, matches or exceeds text communication in context-aware settings at roughly 2 to 3× lower compute, and remains effective in context-unaware transfer where prior methods collapse.

1. Compressed Sensing Analysis (Figure + Method)

Compressed-sensing head selection

We run post-hoc compressed sensing in homogeneous self-communication (Qwen3-4B to Qwen3-4B), recover head importance with Lasso, aggregate to KV-group scores, and sweep top-K KV retention under both communication regimes.

The compressed-sensing estimator identifies which sender heads contribute most to communication quality. We solve a sparse linear inverse problem over random ablation masks:

\[ \hat{\alpha} = \arg\min_{\alpha} \frac{1}{2M}\|\tilde{y}-\Phi\alpha\|_2^2 + \lambda \|\alpha\|_1, \quad \tilde{y}=y-y_0. \]
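To make the estimator concrete, here is a minimal pure-Python sketch of the Lasso problem above, solved by cyclic coordinate descent. It is not the authors' implementation. It assumes that \(\Phi\) is the \(M \times N\) matrix of random ablation masks (encoded here as 1 = head ablated), that \(y\) holds the measured communication quality for each mask, and that \(y_0\) is the unablated baseline. The solver choice, the synthetic quality function, and all names (`lasso_cd`, `Phi`, `y_tilde`, `lam`) are illustrative.

```python
import random


def soft_threshold(x, lam):
    """Proximal operator of lam * |x|."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0


def lasso_cd(Phi, y_tilde, lam, n_sweeps=200):
    """Minimize (1/2M) * ||y_tilde - Phi a||^2 + lam * ||a||_1 by coordinate descent."""
    M, N = len(Phi), len(Phi[0])
    alpha = [0.0] * N
    residual = list(y_tilde)  # equals y_tilde - Phi @ alpha while alpha is all zeros
    col_sq = [sum(Phi[m][j] ** 2 for m in range(M)) / M for j in range(N)]
    for _ in range(n_sweeps):
        for j in range(N):
            if col_sq[j] == 0.0:
                continue  # head never ablated in any mask: no information about it
            # Correlation of column j with the residual that excludes head j's own term.
            rho = sum(Phi[m][j] * (residual[m] + Phi[m][j] * alpha[j]) for m in range(M)) / M
            new_value = soft_threshold(rho, lam) / col_sq[j]
            delta = new_value - alpha[j]
            if delta != 0.0:
                for m in range(M):
                    residual[m] -= Phi[m][j] * delta
                alpha[j] = new_value
    return alpha


# Synthetic setup (assumption): 12 heads, of which only heads 2 and 7 matter.
random.seed(0)
N_HEADS, M_TRIALS = 12, 80
true_alpha = [0.0] * N_HEADS
true_alpha[2] = -0.30  # ablating head 2 lowers quality by 0.30
true_alpha[7] = -0.20  # ablating head 7 lowers quality by 0.20
y0 = 0.90              # baseline quality with no head ablated

Phi = [[random.randint(0, 1) for _ in range(N_HEADS)] for _ in range(M_TRIALS)]
y = [y0 + sum(p * a for p, a in zip(row, true_alpha)) + random.gauss(0.0, 0.01) for row in Phi]
y_tilde = [value - y0 for value in y]

alpha_hat = lasso_cd(Phi, y_tilde, lam=0.01)
ranked = sorted(range(N_HEADS), key=lambda j: abs(alpha_hat[j]), reverse=True)
top2 = sorted(ranked[:2])
print("two most important heads:", top2)
```

In this toy setting, the heads with the largest \(|\hat{\alpha}_j|\) are the ones whose ablation changes quality the most; the \(\ell_1\) penalty pushes the remaining coefficients toward zero.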

2. Compressed Sensing Analysis Results

Sparse reasoning vs. dense context signal

Context-aware transfer reaches near-ceiling accuracy with only a few KV groups retained (sparse reasoning signal), while context-unaware transfer stays poor until a dense set of KV groups is retained (dense knowledge signal).

This result motivates the core design principle: sparse channels can be sufficient for reasoning guidance when the receiver already has context, but context-unaware transfer requires information-dense latent communication.
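The aggregation and top-K sweep behind this analysis can be sketched as follows. This is an illustration under stated assumptions rather than the paper's code: heads are assumed to be stored in a flat list where consecutive heads share a KV group, the group score is assumed to be the sum of absolute head coefficients, and the names `kv_group_scores` and `top_k_groups` as well as the example scores are hypothetical.

```python
def kv_group_scores(head_scores, heads_per_group):
    """Aggregate per-head importance into one score per KV group.

    Assumption: consecutive heads share a KV group, and a group's score is
    the sum of the absolute importances of its heads.
    """
    if len(head_scores) % heads_per_group != 0:
        raise ValueError("number of heads must be a multiple of heads_per_group")
    return [
        sum(abs(s) for s in head_scores[start:start + heads_per_group])
        for start in range(0, len(head_scores), heads_per_group)
    ]


def top_k_groups(group_scores, k):
    """Return the indices of the k highest-scoring KV groups, in ascending order."""
    order = sorted(range(len(group_scores)), key=lambda g: group_scores[g], reverse=True)
    return sorted(order[:k])


# Hypothetical head coefficients for 12 heads arranged as 3 KV groups of 4 heads.
head_scores = [0.0, -0.30, 0.0, 0.02, 0.0, 0.0, 0.0, 0.0, -0.20, 0.0, 0.01, 0.0]
group_scores = kv_group_scores(head_scores, heads_per_group=4)

# Sweep K: each setting lists the KV groups whose cache entries would be transferred.
sweep = {k: top_k_groups(group_scores, k) for k in range(1, len(group_scores) + 1)}
for k, kept in sweep.items():
    print(f"K={k}: keep KV groups {kept}")
```

Evaluating communication quality at each K, separately for the context-aware and context-unaware regimes, gives the kind of retention curve described above.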

3. Dense Alignment Methods (Setup, Training, Architecture)

Dense vs sparse alignment contrast

Dense alignment preserves both reasoning and contextual knowledge, while sparse alignment mainly carries steering signals.

Dense alignment method figure

Method: position disentanglement, layer/group transformation, and structured gating.

Training uses two phases: Phase I reconstruction aligns sender caches into receiver-native cache geometry for dense information preservation; Phase II generation jointly optimizes context-aware and context-unaware decoding so the aligned cache becomes directly actionable.

\[ \mathcal{L}_{\mathrm{rec}} = \sum_{l,g} \Big( \|\widetilde{K}_R^{(l,g)}-K_R^{(l,g)}\|_2^2 + \|\widetilde{V}_R^{(l,g)}-V_R^{(l,g)}\|_2^2 \Big). \]

\[ \mathcal{L}_{\mathrm{gen}} = -\sum_t \log p_{\mathcal{A}_R} \big( y_t \mid y_{<t}, \widetilde{\mathcal{C}}_R(X), X_R \big), \quad X_R \in \{X,\emptyset\}. \]
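As a worked illustration of the two objectives, the sketch below evaluates both losses on tiny hand-written inputs. It is a plain-Python toy, not the training code: caches are assumed to be stored as a dictionary from `(layer, group)` to a `(K, V)` pair of small matrices, the per-token probabilities stand in for the receiver's predictive distribution, and combining the context-aware and context-unaware terms by simple addition is an assumption, since the text only says they are optimized jointly. All function and variable names are hypothetical.

```python
import math


def squared_error(a, b):
    """Squared Frobenius distance between two matrices given as lists of rows."""
    return sum((x - y) ** 2 for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def reconstruction_loss(transformed, native):
    """Sum over (layer, group) of the squared K error plus the squared V error."""
    total = 0.0
    for key, (k_native, v_native) in native.items():
        k_tilde, v_tilde = transformed[key]
        total += squared_error(k_tilde, k_native) + squared_error(v_tilde, v_native)
    return total


def generation_loss(token_probs):
    """Negative log-likelihood of the target tokens under the receiver."""
    return -sum(math.log(p) for p in token_probs)


# Receiver-native cache and transformed sender cache for two (layer, group) slots.
native = {
    (0, 0): ([[1.0, 0.0]], [[0.0, 1.0]]),
    (1, 0): ([[2.0, 2.0]], [[1.0, 1.0]]),
}
transformed = {
    (0, 0): ([[0.5, 0.0]], [[0.0, 1.0]]),  # K is off by 0.5 in one entry
    (1, 0): ([[2.0, 2.0]], [[1.0, 0.0]]),  # V is off by 1.0 in one entry
}
rec = reconstruction_loss(transformed, native)  # 0.5**2 + 1.0**2

# Phase II: probabilities assigned to the gold tokens with and without the input X.
gen_aware = generation_loss([0.5, 0.5])      # receiver also sees X
gen_unaware = generation_loss([0.5, 0.25])   # receiver sees only the aligned cache
gen_joint = gen_aware + gen_unaware          # assumed combination: plain sum

print(f"L_rec = {rec:.4f}, L_gen (joint) = {gen_joint:.4f}")
```

The reconstruction term only asks the transformed cache to match the receiver-native cache entry by entry, while the generation term scores the cache by what the receiver can decode from it.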

4. Dense Alignment Results

Context-aware communication (In-domain + Out-of-domain + TFLOPs)

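| Pair | Method | GSM8K | MATH-500 | ARC-C | MMLU-Redux | MedQA | OpenBookQA | TFLOPs |
|---|---|---|---|---|---|---|---|---|
| 4B→8B | Receiver-only | 81.10 | 49.20 | 91.00 | 72.10 | 53.00 | 91.20 | 19.85 |
| 4B→8B | T2T | 88.10 | 76.00 | 91.74 | 80.75 | 67.40 | 90.40 | 37.73 |
| 4B→8B | C2C | 77.86 | 44.20 | 86.09 | 75.87 | 56.87 | 85.60 | 4.50 |
| 4B→8B | Ours | 92.95 | 82.00 | 93.69 | 78.52 | 67.24 | 91.20 | 12.56 |
| 4B→14B | Receiver-only | 83.70 | 46.40 | 92.60 | 71.80 | 64.70 | 91.80 | 31.16 |
| 4B→14B | T2T | 92.34 | 77.80 | 92.00 | 82.87 | 71.17 | 92.00 | 56.24 |
| 4B→14B | C2C | 82.34 | 44.20 | 92.43 | 72.76 | 63.00 | 87.00 | 6.64 |
| 4B→14B | Ours | 93.86 | 86.00 | 94.20 | 78.57 | 71.96 | 93.60 | 21.54 |
| 8B→4B | Receiver-only | 82.40 | 44.20 | 89.20 | 65.00 | 47.70 | 88.00 | 9.18 |
| 8B→4B | T2T | 89.39 | 63.20 | 90.96 | 79.39 | 66.46 | 87.00 | 33.91 |
| 8B→4B | C2C | 72.48 | 37.40 | 86.78 | 70.19 | 55.07 | 77.20 | 3.43 |
| 8B→4B | Ours | 91.81 | 83.40 | 93.00 | 77.86 | 66.30 | 89.60 | 7.95 |
| 8B→14B | Receiver-only | 83.70 | 46.40 | 92.60 | 71.80 | 64.70 | 91.80 | 31.16 |
| 8B→14B | T2T | 93.93 | 75.00 | 91.74 | 84.13 | 72.66 | 92.80 | 67.08 |
| 8B→14B | C2C | 82.26 | 43.20 | 92.43 | 73.81 | 64.65 | 87.00 | 7.42 |
| 8B→14B | Ours | 94.09 | 85.00 | 94.37 | 77.38 | 70.46 | 93.40 | 21.79 |
| 14B→4B | Receiver-only | 82.40 | 44.20 | 89.20 | 65.00 | 47.70 | 88.00 | 9.18 |
| 14B→4B | T2T | 90.60 | 60.20 | 89.83 | 80.61 | 69.91 | 85.00 | 43.48 |
| 14B→4B | C2C | 70.58 | 36.60 | 85.83 | 69.73 | 53.42 | 76.40 | 5.01 |
| 14B→4B | Ours | 91.13 | 82.60 | 91.89 | 77.66 | 63.00 | 88.80 | 10.18 |
| 14B→8B | Receiver-only | 81.10 | 49.20 | 91.00 | 72.10 | 53.00 | 91.20 | 19.85 |
| 14B→8B | T2T | 91.58 | 73.00 | 92.35 | 83.43 | 72.90 | 90.20 | 55.50 |
| 14B→8B | C2C | 76.35 | 43.00 | 89.39 | 75.99 | 62.69 | 85.60 | 7.83 |
| 14B→8B | Ours | 92.95 | 81.40 | 93.77 | 78.04 | 70.38 | 92.60 | 15.64 |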
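The compute comparison with text communication can be read directly off the TFLOPs column. As a small worked example, the snippet below divides the T2T cost by the cost of our method for each direction, using only the context-aware values reported above; the dictionary layout and variable names are illustrative.

```python
# (T2T TFLOPs, Ours TFLOPs) per direction, copied from the context-aware table.
tflops = {
    "4B→8B": (37.73, 12.56),
    "4B→14B": (56.24, 21.54),
    "8B→4B": (33.91, 7.95),
    "8B→14B": (67.08, 21.79),
    "14B→4B": (43.48, 10.18),
    "14B→8B": (55.50, 15.64),
}

# Ratio > 1 means text communication costs that many times more compute.
ratios = {pair: t2t / ours for pair, (t2t, ours) in tflops.items()}

for pair, ratio in ratios.items():
    print(f"{pair}: T2T / Ours = {ratio:.2f}x")

lowest = min(ratios, key=ratios.get)
highest = max(ratios, key=ratios.get)
print(f"range: {ratios[lowest]:.2f}x ({lowest}) to {ratios[highest]:.2f}x ({highest})")
```

On these rows the ratio ranges from about 2.6× (4B→14B) to about 4.3× (8B→4B and 14B→4B).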

Context-unaware communication (In-domain + Out-of-domain + TFLOPs)

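| Pair | Method | GSM8K | MATH-500 | ARC-C | MMLU-Redux | MedQA | OpenBookQA | TFLOPs |
|---|---|---|---|---|---|---|---|---|
| 4B→8B | T2T-ctx-unaware | 51.63 | 74.40 | 19.48 | 21.00 | 18.85 | 24.00 | 37.86 |
| 4B→8B | C2C-ctx-unaware | 1.90 | 3.00 | 22.52 | 21.91 | 8.96 | 27.00 | 14.94 |
| 4B→8B | Ours-ctx-unaware | 91.43 | 78.80 | 91.38 | 74.59 | 61.82 | 88.80 | 9.42 |
| 4B→14B | T2T-ctx-unaware | 56.79 | 75.20 | 23.39 | 23.06 | 27.65 | 27.20 | 60.83 |
| 4B→14B | C2C-ctx-unaware | 0.00 | 0.00 | 10.70 | 10.26 | 12.80 | 16.40 | 20.05 |
| 4B→14B | Ours-ctx-unaware | 82.26 | 70.60 | 86.86 | 62.86 | 53.26 | 82.40 | 17.15 |
| 8B→4B | T2T-ctx-unaware | 27.98 | 69.40 | 23.74 | 25.18 | 22.94 | 23.20 | 32.75 |
| 8B→4B | C2C-ctx-unaware | 0.38 | 2.40 | 10.35 | 8.20 | 7.23 | 10.00 | 14.76 |
| 8B→4B | Ours-ctx-unaware | 91.36 | 81.60 | 93.60 | 77.06 | 64.57 | 90.00 | 6.56 |
| 8B→14B | T2T-ctx-unaware | 30.86 | 70.40 | 22.96 | 22.80 | 27.81 | 27.40 | 72.06 |
| 8B→14B | C2C-ctx-unaware | 0.00 | 0.00 | 7.39 | 7.14 | 6.36 | 7.60 | 9.84 |
| 8B→14B | Ours-ctx-unaware | 81.58 | 65.00 | 88.48 | 54.92 | 57.74 | 83.80 | 15.85 |
| 14B→4B | T2T-ctx-unaware | 18.57 | 66.60 | 20.61 | 24.52 | 21.13 | 20.20 | 41.95 |
| 14B→4B | C2C-ctx-unaware | 0.68 | 0.40 | 4.96 | 6.11 | 0.63 | 6.60 | 17.93 |
| 14B→4B | Ours-ctx-unaware | 82.49 | 64.00 | 88.48 | 67.15 | 59.54 | 85.40 | 8.90 |
| 14B→8B | T2T-ctx-unaware | 18.80 | 66.40 | 22.87 | 21.41 | 25.22 | 26.20 | 53.80 |
| 14B→8B | C2C-ctx-unaware | 0.00 | 0.20 | 2.78 | 5.34 | 5.18 | 2.20 | 24.77 |
| 14B→8B | Ours-ctx-unaware | 84.15 | 68.40 | 88.99 | 67.84 | 54.20 | 88.00 | 13.28 |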

Across all six heterogeneous directions, dense alignment outperforms prior heterogeneous baselines and remains robust in context-unaware transfer.

Layer 7 PCA

Early-layer PCA (L7).

Layer 34 PCA

Late-layer PCA (L34).

Latent-space PCA: transformed sender caches overlap receiver-native manifolds, supporting dense geometric alignment.

BibTeX

@misc{chen2026denselatentcommunication,
  title={See What I See, Know What I Think: Dense Latent Communication Across Heterogeneous Agents},
  author={Siyi Chen and Xiaoyan Zhang and Meng Wu and Jonathan Tremblay and Valts Blukis and Stan Birchfield and Rene Vidal and Alvaro Velasquez and Sijia Liu and Qing Qu},
  year={2026},
  note={Preprint}
}