Solr 4.3: useSmart=false mode splits words into single characters #135

GoogleCodeExporter · 2016-04-07T08:24:23Z

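Solr version: 4.3
IK version: 2012hf1
I integrated IK into Solr through IKTokenizerFactory. With useSmart=false, words are split into single characters; with useSmart=true they are not. For example: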
-------------------------------------------------------------
   In the Analysis screen on the right side of the Solr admin UI, with Field Value set to 123, the analysis result is:
HTMLSCF  text       123
IKT      text       123         1        2        3
         raw_bytes  [31 32 33]  [31]     [32]     [33]
         start      0           0        1        2
         end        3           1        2        3
         type       ARABIC      CN_WORD  CN_WORD  CN_WORD
--------------------------------------------------------------
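   The index also contains the three single characters 1, 2, and 3. It is strange that useSmart=false produces this result, especially since 1, 2, and 3 are typed CN_WORD. From a quick read of the IK source, the CN_WORD type is only assigned when CJKSegment matches a dictionary word. How can this be fixed?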

Original issue reported on code.google.com by [email protected] on 13 May 2014 at 7:57

GoogleCodeExporter · 2016-04-07T08:24:23Z

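This turned out to be a problem on my side. I took over maintenance of a predecessor's work and only then discovered that there was an extension dictionary in the class folder.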

Original comment by [email protected] on 14 May 2014 at 1:34
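
For readers who land here with the same symptom, the following toy model sketches how an extension dictionary could produce the output in the report. It is not IK Analyzer's code. It assumes that fine-grained mode (useSmart=false) emits every dictionary hit alongside the ARABIC token, while smart mode resolves overlaps in favor of the longest token. It also assumes, hypothetically, that the extension dictionary contained the single characters 1, 2, and 3; the thread does not say what the dictionary actually contained. The function name `tokenize` and its parameters are made up for this illustration.

```python
# Toy model only: NOT IK Analyzer's implementation, just an illustration
# of the behavior described in this thread under the stated assumptions.

def tokenize(text, ext_dict, use_smart):
    """Return (term, start, end, type) tuples for the given input text."""
    tokens = []

    # A run of digits is emitted as one ARABIC token.
    if text.isdigit():
        tokens.append((text, 0, len(text), "ARABIC"))

    # Dictionary matching: every substring found in the (hypothetical)
    # extension dictionary becomes a CN_WORD token.
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            if text[start:end] in ext_dict:
                tokens.append((text[start:end], start, end, "CN_WORD"))

    if use_smart:
        # Simplified disambiguation: prefer longer tokens and drop
        # any token that overlaps one already kept.
        kept = []
        for tok in sorted(tokens, key=lambda t: -(t[2] - t[1])):
            if all(tok[2] <= k[1] or k[2] <= tok[1] for k in kept):
                kept.append(tok)
        tokens = kept

    # Order by start offset, longer tokens first at the same offset.
    return sorted(tokens, key=lambda t: (t[1], -(t[2] - t[1])))


if __name__ == "__main__":
    ext = {"1", "2", "3"}  # assumed dictionary contents
    for tok in tokenize("123", ext, use_smart=False):
        print(tok)
```

In this model, removing the single-character entries from the dictionary leaves only the ARABIC token in both modes, which is consistent with the resolution described in the comment above.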

GoogleCodeExporter added Priority-Medium Type-Defect auto-migrated labels Apr 7, 2016
