We present FAST-Splat for fast, ambiguity-free semantic Gaussian Splatting, which seeks to address the main limitations of existing semantic Gaussian Splatting methods, namely: slow training and rendering speeds, high memory usage, and ambiguous semantic object localization. In deriving FAST-Splat, we formulate open-vocabulary semantic Gaussian Splatting as the problem of extending closed-set semantic distillation to the open-set (open-vocabulary) setting, enabling FAST-Splat to provide precise semantic object localization results even when prompted with ambiguous user-provided natural-language queries. Further, by exploiting the explicit form of the Gaussian Splatting scene representation to the fullest extent, FAST-Splat retains the remarkable training and rendering speeds of Gaussian Splatting. Specifically, while existing semantic Gaussian Splatting methods distill semantics into a separate neural field or utilize neural models for dimensionality reduction, FAST-Splat directly augments each Gaussian with specific semantic codes, preserving the training, rendering, and memory-usage advantages of Gaussian Splatting over neural-field methods. These Gaussian-specific semantic codes, together with a hash table, enable semantic similarity to be measured against open-vocabulary user prompts and allow FAST-Splat to respond with unambiguous semantic object labels and 3D masks, unlike prior methods. In experiments, we demonstrate that FAST-Splat is 4x to 6x faster to train with a 13x faster data pre-processing step, achieves 18x to 75x faster rendering speeds, and requires about 3x less GPU memory, compared to the best-competing semantic Gaussian Splatting methods. Further, FAST-Splat achieves comparable or better semantic segmentation performance relative to existing methods.
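To make the described mechanism concrete, below is a minimal sketch, under our own assumptions rather than the paper's actual code, of how per-Gaussian semantic codes and a hash table of closed-set labels could resolve an open-vocabulary prompt into an unambiguous object label and a 3D Gaussian mask. The names `embed_text`, `semantic_table`, and `query_scene` are hypothetical, and the text encoder here is a deterministic placeholder standing in for a real vision-language text encoder.

```python
"""Minimal sketch of per-Gaussian semantic codes + a hash table for
open-vocabulary querying. Illustration under our own assumptions,
not the FAST-Splat implementation."""

import hashlib
import numpy as np


def embed_text(prompt: str, dim: int = 512) -> np.ndarray:
    """Placeholder text encoder: returns a deterministic unit-norm vector.
    In practice this would be a vision-language text encoder (e.g., CLIP-style)."""
    seed = int.from_bytes(hashlib.sha256(prompt.encode()).digest()[:4], "big")
    v = np.random.default_rng(seed).standard_normal(dim)
    return v / np.linalg.norm(v)


# Hash table: semantic code -> (closed-set object label, precomputed text embedding).
semantic_table = {
    0: ("chair", embed_text("chair")),
    1: ("table", embed_text("table")),
    2: ("lamp", embed_text("lamp")),
}

# Each Gaussian is augmented with one compact semantic code (toy scene, 6 Gaussians).
gaussian_codes = np.array([0, 0, 1, 2, 1, 0])


def query_scene(prompt: str):
    """Resolve an open-vocabulary prompt to an unambiguous closed-set label and
    the 3D mask of Gaussians carrying the matching semantic code."""
    q = embed_text(prompt)
    # Cosine similarity between the prompt and every closed-set label embedding.
    scores = {code: float(q @ emb) for code, (_, emb) in semantic_table.items()}
    best_code = max(scores, key=scores.get)
    label = semantic_table[best_code][0]
    mask = gaussian_codes == best_code  # boolean 3D mask over the Gaussians
    return label, mask


label, mask = query_scene("something to sit on")
print(label, mask)  # an unambiguous label plus the Gaussians it covers
```

Storing only a small code per Gaussian, rather than a high-dimensional language feature, is what would preserve the memory and speed advantages of Gaussian Splatting that the abstract highlights.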