Package | Description |
---|---|
cn.hutool.core.collection | Collection and Iterator wrappers, including the collection utility CollUtil and the Iterator/Iterable utility IterUtil |
cn.hutool.extra.tokenizer | Chinese word segmentation wrapper. Defines a unified interface to adapt third-party segmentation engines |
cn.hutool.extra.tokenizer.engine.analysis | Abstract wrapper for Lucene-analysis segmentation. Project: https://github.com/apache/lucene-solr/tree/master/lucene/analysis |
cn.hutool.extra.tokenizer.engine.ansj | Ansj segmentation implementation. Project: https://github.com/NLPchina/ansj_seg |
cn.hutool.extra.tokenizer.engine.hanlp | HanLP segmentation engine implementation. Project: https://github.com/hankcs/HanLP |
cn.hutool.extra.tokenizer.engine.ikanalyzer | IKAnalyzer segmentation engine implementation. Project: https://github.com/yozhao/IKAnalyzer |
cn.hutool.extra.tokenizer.engine.jcseg | Jcseg segmentation engine implementation. Project: https://gitee.com/lionsoul/jcseg |
cn.hutool.extra.tokenizer.engine.jieba | Jieba segmentation engine implementation. Project: https://github.com/huaban/jieba-analysis |
cn.hutool.extra.tokenizer.engine.mmseg | mmseg4j segmentation engine implementation. Project: https://github.com/chenlb/mmseg4j-core |
cn.hutool.extra.tokenizer.engine.mynlp | MYNLP (Chinese NLP toolkit) segmentation implementation. Project: https://github.com/mayabot/mynlp/ |
cn.hutool.extra.tokenizer.engine.word | Word segmentation engine implementation. Project: https://github.com/ysc/word |
Modifier and Type | Class and Description |
---|---|
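class | ArrayIter<E>: Iterator over an array |
class | CopiedIter<E> |
class | EnumerationIter<E>: Converts an Enumeration object to an Iterator object |
class | LineIter: Wraps a Reader as an Iterator that reads line by line. Close the object once iteration has finished; recommended usage: `LineIterator it = null; try { it = new LineIterator(reader); while (it.hasNext()) { String line = it.nextLine(); // do something with line } } finally { it.close(); }` This class is derived from Apache Commons IO |
class | PartitionIter<T>: Batch iteration utility for processing data in batches. Typical cases: calling another party's API that limits the number of input parameters, so requests must be batched; MySQL/Oracle queries using an IN clause, which can be split into batches when the list exceeds 1000 items; processing the data behind a database cursor one batch at a time |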
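
The batching scenarios listed for PartitionIter can be illustrated with a minimal sketch. The code below does not use Hutool and is not its implementation; `BatchSketch` and its `partition` helper are hypothetical names built only on the JDK, showing the general idea of draining an iterator in fixed-size batches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical JDK-only sketch of batch iteration; not Hutool's PartitionIter. */
class BatchSketch<T> implements Iterator<List<T>> {
    private final Iterator<T> source;
    private final int batchSize;

    BatchSketch(Iterator<T> source, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.source = source;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        // A further batch exists as long as the source has at least one element left.
        return source.hasNext();
    }

    @Override
    public List<T> next() {
        if (!source.hasNext()) {
            throw new NoSuchElementException();
        }
        // Pull up to batchSize elements; the final batch may be shorter.
        List<T> batch = new ArrayList<>();
        while (batch.size() < batchSize && source.hasNext()) {
            batch.add(source.next());
        }
        return batch;
    }

    /** Convenience helper (assumed name): collects all batches into a list. */
    static <E> List<List<E>> partition(Iterable<E> items, int batchSize) {
        List<List<E>> batches = new ArrayList<>();
        Iterator<List<E>> it = new BatchSketch<>(items.iterator(), batchSize);
        while (it.hasNext()) {
            batches.add(it.next());
        }
        return batches;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(1, 2, 3, 4, 5);
        // Each batch could back one IN (...) query or one remote call.
        for (List<Integer> batch : partition(ids, 2)) {
            System.out.println(batch);
        }
    }
}
```

With five ids and a batch size of 2, the loop in `main` prints `[1, 2]`, `[3, 4]`, and `[5]`. In the IN-clause case described above, the batch size would be chosen to stay within the database's limit.
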
Modifier and Type | Interface and Description |
---|---|
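interface | Result: Segmentation result interface. Implement this interface to wrap a tokenizer's segmentation result; the segmented words are obtained through the corresponding Iterator methods |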
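
The wrapping pattern described for Result can be sketched as follows. This is a JDK-only illustration and does not reproduce Hutool's Result interface: `TokenResultSketch`, `parse`, and `collect` are hypothetical names, and a whitespace split stands in for a real segmentation engine.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/**
 * Hypothetical analogue of a segmentation result: it wraps the output of an
 * engine and exposes the words through the Iterator methods.
 */
class TokenResultSketch implements Iterator<String>, Iterable<String> {
    private final Iterator<String> engineOutput;

    TokenResultSketch(Iterator<String> engineOutput) {
        this.engineOutput = engineOutput;
    }

    @Override
    public boolean hasNext() {
        return engineOutput.hasNext();
    }

    @Override
    public String next() {
        return engineOutput.next();
    }

    @Override
    public Iterator<String> iterator() {
        // Single-pass: the result itself is the iterator.
        return this;
    }

    /** Stand-in "engine" (an assumption): splits the text on whitespace. */
    static TokenResultSketch parse(String text) {
        String trimmed = text.trim();
        List<String> words = new ArrayList<>();
        if (!trimmed.isEmpty()) {
            words.addAll(Arrays.asList(trimmed.split("\\s+")));
        }
        return new TokenResultSketch(words.iterator());
    }

    /** Drains a result into a list, as a caller of a unified interface might. */
    static List<String> collect(String text) {
        List<String> out = new ArrayList<>();
        for (String word : parse(text)) {
            out.add(word);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect("unified tokenizer interface"));
    }
}
```

Because callers only depend on the Iterator methods, the engine behind the result can be swapped without changing the consuming loop, which is the point of adapting several engines behind one interface.
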
Modifier and Type | Class and Description |
---|---|
class | AbstractResult |
Modifier and Type | Class and Description |
---|---|
class | AnalysisResult: Abstract wrapper for Lucene-analysis segmentation results. Project: https://github.com/apache/lucene-solr/tree/master/lucene/analysis |
Modifier and Type | Class and Description |
---|---|
class | AnsjResult: Ansj segmentation result implementation. Project: https://github.com/NLPchina/ansj_seg |
Modifier and Type | Class and Description |
---|---|
class | HanLPResult: HanLP segmentation result implementation. Project: https://github.com/hankcs/HanLP |
Modifier and Type | Class and Description |
---|---|
class | IKAnalyzerResult: IKAnalyzer segmentation result implementation. Project: https://github.com/yozhao/IKAnalyzer |
Modifier and Type | Class and Description |
---|---|
class | JcsegResult: Jcseg segmentation result wrapper. Project: https://gitee.com/lionsoul/jcseg |
Modifier and Type | Class and Description |
---|---|
class | JiebaResult: Jieba segmentation result implementation. Project: https://github.com/huaban/jieba-analysis |
Modifier and Type | Class and Description |
---|---|
class | MmsegResult: mmseg4j segmentation result implementation. Project: https://github.com/chenlb/mmseg4j-core |
Modifier and Type | Class and Description |
---|---|
class | MynlpResult: MYNLP (Chinese NLP toolkit) segmentation result implementation. Project: https://github.com/mayabot/mynlp/ |
Modifier and Type | Class and Description |
---|---|
class | WordResult: Word segmentation result implementation. Project: https://github.com/ysc/word |
Copyright © 2025. All rights reserved.