Commit Graph

523 Commits (master)
 

Author SHA1 Message Date
Dingyuan Wang f2b7183a71 use str.splitlines to avoid losing line breaks 10 years ago
Sun Junyi b14eb329e3 Merge pull request #237 from gumblex/master
Store prefixes directly in the word-frequency dictionary
10 years ago
Dingyuan Wang 872a7039f2 Merge branch 'master' of https://github.com/fxsjy/jieba 10 years ago
Dingyuan Wang f808ea0ebb use only one dict to store words and prefixes 10 years ago
fxsjy 4d7b515801 Merge branch 'master' of https://github.com/fxsjy/jieba 10 years ago
fxsjy 5bfa43a781 fix test scripts 10 years ago
Dingyuan Wang f3a53dd2da fix print() in tests 10 years ago
Sun Junyi a229041e58 Merge pull request #234 from yanyiwu/patch-2
Update README.md
10 years ago
Yanyi Wu 5d321cbccd Update README.md 10 years ago
fxsjy 8cbb26a7b6 fix test_file.py 10 years ago
Sun Junyi 41b47b0593 Merge pull request #233 from gumblex/master
Merge jieba3k; compatible with Python 2/3
10 years ago
Dingyuan Wang 32a0e92a09 don't compile re every time; autopep8 10 years ago
Dingyuan Wang 22bcf8be7a Merge master and jieba3k, make the code Python 2/3 compatible 10 years ago
Sun Junyi caae26fbfa Merge pull request #231 from gumblex/master
Store frequency counts directly in FREQ
10 years ago
Dingyuan Wang 4197dfb8fa store int directly in FREQ; small improvements 10 years ago
Dingyuan Wang 765fd6b7f0 store int directly in FREQ; small improvements 10 years ago
Sun Junyi c95f402e2b Merge pull request #214 from aszxqw/master
add iosjieba
10 years ago
yanyiwu 1d91072498 add iosjieba 10 years ago
Sun Junyi 852a07c4f2 Merge pull request #211 from gumblex/jieba3k
Fix the repr return value of the pair class in posseg (jieba3k)
10 years ago
Dingyuan Wang 7bcb128f5f fix textrank divided by zero; fix posseg.pair.__repr__ 10 years ago
Sun Junyi b08c3f8ed7 Merge pull request #205 from lynschinzer/master
Fix divided by zero issue in case of words are not found in dict.
10 years ago
Lin fea3aec6bd Fix divided by zero issue in case of words are not found in dict. 10 years ago
Sun Junyi 8be082017a Merge pull request #204 from gumblex/jieba3k
Complete setup.py and other corresponding py3k updates
10 years ago
Sun Junyi 293dbbc390 Merge pull request #203 from gumblex/master
Fix posseg; complete setup.py
10 years ago
Dingyuan Wang 3dad899ec8 backport 2to3 scripts and changelog 10 years ago
Dingyuan Wang c6b386f65b update jieba3k 10 years ago
Dingyuan Wang 7b7c6955a9 complete the setup.py, fix #202 problem in posseg 10 years ago
Sun Junyi 8a2e7f0e7e Merge pull request #202 from nomaka/patch-1
Update __init__.py
10 years ago
Nomaka 9cb76dd8b9 Update __init__.py
The idx parameter of calc is unused
10 years ago
Sun Junyi 99748bfc17 Merge pull request #201 from skyerown/master
Add part-of-speech filtering to the keyword extraction functions
10 years ago
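walkskyer a336e26403 Add the allowPOS parameter to the textrank function, and change the allowPOS parameter of extract_tags to be consistent with textrank. 10 years ago
walkskyer bab5f362ba Convert the allowPOS parameter of extract_tags to a frozenset to reduce lookup time. 10 years ago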
Dingyuan Wang 6b0da06481 merge from upstream 10 years ago
fxsjy 5c487dbcba update version 10 years ago
fxsjy 447c1ded8c fix problem for python3.2 10 years ago
walkskyer dd62477605 Ignore PyCharm project files in .gitignore 10 years ago
Dingyuan Wang a5ecf70f71 update to v0.35 10 years ago
walkskyer d82d2c18df Add part-of-speech filtering to the keyword extraction functions 10 years ago
fxsjy 315a411e52 version update 10 years ago
fxsjy ec68c21ea0 version update 10 years ago
Sun Junyi 3eea28d6f4 Merge pull request #200 from skyerown/master
Fix stop words handling not accounting for "\r", which prevented correct matching.
10 years ago
walkskyer 5571a0337a Fix stop words handling not accounting for "\r", which prevented correct matching. 10 years ago
Sun Junyi 40c0edfd99 Merge pull request #198 from gumblex/jieba3k
Corresponding updates for Jieba3k; semi-automatic conversion script
10 years ago
Dingyuan Wang 4a6140081e fix problems in auto2to3 10 years ago
Dingyuan Wang 7a6caa0c3c port extract_tags, etc to jieba3k; add auto2to3 script 10 years ago
walkskyer 36bc9e18c6 Merge pull request #1 from fxsjy/master
pull
10 years ago
Sun Junyi 7ce63e53b7 Merge pull request #197 from skyerown/master
Fix wrong call order when outputting results in the weighted test script
10 years ago
walkskyer 6772f0282e Fix wrong call order when outputting results in the weighted test script 10 years ago
Sun Junyi a5944bb88e Merge pull request #196 from qinwf/master
Add jiebaR in README
10 years ago
Qin Wenfeng 77a831b8c1 Add jiebaR in README 10 years ago