02

SpatiaLLM

LLM-generated Human Interface for Co-presence Experience in Mixed Reality Agent

Description

SpatiaLLM is a multi-agent framework that connects Large Language Model (LLM) capabilities with spatial understanding, enabling intelligent, context-aware content placement in Mixed Reality environments.

Keywords

Spatial Computing 空间计算

Mixed Reality (MR) 混合现实

Human–Computer Interaction (HCI) 人机交互

Large Language Models 大型语言模型

Adaptive Interfaces 自适应界面

Year

2024

Technical Details

SpatiaLLM combines language-based reasoning with real-time environment sensing to deliver adaptive 3D layouts in Mixed Reality. Large Language Models orchestrate the semantic logic, while sensor-driven agents detect planes, boundaries, and objects, so that virtual elements remain anchored, coherent, and physically consistent within the user’s real-world surroundings. Pairing language reasoning with spatial perception allows 3D information to be presented more naturally, turning traditional slides into interactive, environment-aware storytelling. Originally developed as a “3D PPT” concept, SpatiaLLM now extends to educational, industrial, and entertainment settings as a framework for contextualized, cross-platform MR design.
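To make "physically consistent" placement concrete, here is a minimal Python sketch of one geometric check such a system might run: shrinking a piece of content and clamping its position so that it stays on a detected surface. Everything in it is an illustrative assumption rather than SpatiaLLM's actual code: the names `Plane`, `Placement`, and `fit_on_plane`, the flat rectangular plane model, and the 5 cm default margin are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plane:
    """Hypothetical detected surface, centred on its own local origin (metres)."""
    width: float  # extent along local x
    depth: float  # extent along local z


@dataclass(frozen=True)
class Placement:
    x: float      # content centre on the plane, local x
    z: float      # content centre on the plane, local z
    scale: float  # uniform scale applied to the content


def fit_on_plane(plane, footprint_w, footprint_d, want_x, want_z, margin=0.05):
    """Shrink content if needed and clamp its centre inside the plane bounds."""
    usable_w = plane.width - 2 * margin
    usable_d = plane.depth - 2 * margin
    if usable_w <= 0 or usable_d <= 0:
        raise ValueError("plane too small for the requested margin")
    # Never scale up; only shrink content that would overhang the surface.
    scale = min(1.0, usable_w / footprint_w, usable_d / footprint_d)
    half_w = footprint_w * scale / 2
    half_d = footprint_d * scale / 2
    # Largest offset from the plane centre that keeps the footprint inside.
    max_x = usable_w / 2 - half_w
    max_z = usable_d / 2 - half_d
    x = max(-max_x, min(max_x, want_x))
    z = max(-max_z, min(max_z, want_z))
    return Placement(x=x, z=z, scale=scale)


# Example: a 1.8 m x 0.5 m slide requested off-centre on a 1.0 m x 0.6 m desk.
placement = fit_on_plane(Plane(width=1.0, depth=0.6), 1.8, 0.5, want_x=0.3, want_z=0.4)
```

In this example the slide is too wide for the desk, so it is scaled to half size and pulled back toward the centre until its footprint no longer overhangs the edge. A real pipeline would work with posed planes and arbitrary polygons reported by the headset; the axis-aligned rectangle is only the simplest case that shows the idea.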

Highlights

  • Multi-Agent Architecture: Specialized agents handle perception, retrieval, layout optimization, and user interaction, unifying language queries with precise geometric checks.

  • Spatial-Temporal Cognitive Map (ST-CM): Ensures continuity and realistic object behavior across multiple scenes by blending sensor-based constraints with semantic anchors.

  • Adaptive Storytelling: Transitions from static slides to immersive scenarios where 3D content can scale, reposition, and adapt in real time based on the physical environment.

  • Broad Applicability: Designed for mixed reality teaching, product showcases, and engaging industrial simulations, bridging the gap between engineering prototypes and immersive user experiences.
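The division of labour described in the highlights above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the project's implementation: `Surface`, `Asset`, `CognitiveMap`, `retrieve`, and `layout` are hypothetical names, the perception agent is represented only by the list of surfaces it would report, keyword overlap stands in for an LLM query, and the "cognitive map" is a plain dictionary that remembers which semantic anchor each asset used in a scene.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Surface:
    label: str   # semantic anchor, e.g. "desk" (hypothetical labels)
    area: float  # usable area in square metres


@dataclass(frozen=True)
class Asset:
    name: str
    tags: frozenset
    min_area: float  # smallest surface the asset can sit on


@dataclass
class CognitiveMap:
    """Toy stand-in for ST-CM: remembers asset -> anchor label per scene."""
    memory: dict = field(default_factory=dict)

    def remember(self, scene, asset, label):
        self.memory.setdefault(scene, {})[asset] = label

    def recall(self, scene, asset):
        return self.memory.get(scene, {}).get(asset)


def retrieve(query, library):
    """Retrieval agent: keyword overlap stands in for an LLM call."""
    words = set(query.lower().split())
    return [a for a in library if a.tags & words]


def layout(assets, surfaces, scene, cmap):
    """Layout agent: reuse remembered anchors, else take the smallest fit."""
    free = {s.label: s for s in surfaces}
    plan = {}
    # Place the most demanding assets first so they are not crowded out.
    for asset in sorted(assets, key=lambda a: -a.min_area):
        label = cmap.recall(scene, asset.name)
        if label not in free or free[label].area < asset.min_area:
            fits = [s for s in free.values() if s.area >= asset.min_area]
            if not fits:
                continue  # geometric check failed: leave the asset unplaced
            label = min(fits, key=lambda s: s.area).label
        plan[asset.name] = label
        cmap.remember(scene, asset.name, label)
        del free[label]  # one asset per surface in this sketch
    return plan


library = [
    Asset("engine", frozenset({"engine", "motor"}), 0.5),
    Asset("chart", frozenset({"sales", "chart"}), 0.1),
]
cmap = CognitiveMap()
picked = retrieve("show the engine and sales chart", library)
first = layout(picked, [Surface("desk", 0.8), Surface("shelf", 0.2)], "room-a", cmap)
# Revisiting the scene with a new, smaller candidate surface keeps the old anchor.
again = layout(picked, [Surface("stool", 0.6), Surface("desk", 0.8),
                        Surface("shelf", 0.2)], "room-a", cmap)
```

The second call shows the continuity idea: without the remembered anchor the engine model would move to the newly detected stool, but the map keeps it on the desk as long as the desk is still present and large enough.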

Credits

Research

Yueze Zhang

Dev

Yueze Zhang, Shucun Zhao

Advisors

Prof. Dr. Martin Werner, Prof. Tim Purdy

Appendix
