
441 Commits (89481cfd8484b2decb75e6c3c32b070d5df8300f)
 

Author SHA1 Message Date
fxsjy 89481cfd84 version update 0.36 10 years ago
Sun Junyi 59aa8b69b1 Merge pull request #246 from gumblex/master
Add automatic word frequency
10 years ago
Dingyuan Wang 4fa2728fb6 update README about new features 10 years ago
Dingyuan Wang 4a552ca94f suggest word frequency, support passing str to add_word 10 years ago
Sun Junyi 1b4721ebb8 Merge pull request #179 from changyy/master
Add a configurable directory for the generated cache_file, allowing jieba to run on read-only file systems such as Embedded Linux, Google App Engine, and Heroku
10 years ago
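Yuan-Yi Chang 62433a3205 Allow jieba to specify the directory where cache_file is generated, so that jieba can run in read-only file system environments
1. Before calling jieba.cut() or related functions, set the directory via jieba.tmp_dir
2. When the application environment is a read-only file system, jieba can run normally by generating cache_file in advance
3. Real-world cases are Google App Engine and Heroku: the free tier of the former has only 128MB of memory and cannot run jieba, while the free tier of the latter has 512MB and runs it normally. Before deploying, generate cache_file locally, then deploy it together with the application to Google App Engine or Heroku.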
10 years ago
Sun Junyi 4b4aff6d89 Merge pull request #242 from gumblex/master
textrank details; documentation update
10 years ago
Dingyuan Wang f29430f49e details in textrank; update README 10 years ago
Sun Junyi a4fb439070 Merge pull request #241 from sing1ee/master
improve some details based on other committers' advice
10 years ago
zhangcheng 01b7f6efcf improve some details based on other committers' advice 10 years ago
Sun Junyi 4e05cde07e Merge pull request #240 from sing1ee/master
build stable sort for graph iteration
10 years ago
zhangcheng 8b8c6c85d0 remove unused import 10 years ago
zhangcheng a6d1b2479e build stable sort for graph iteration so that we get stable results, and adapt details for Python 3 10 years ago
zhangcheng 1152db7736 build stable sort for graph iteration so that we get stable results. 10 years ago
fxsjy 49657c976d make extract_tags behavior compatible with previous version 10 years ago
fxsjy abcaf3e475 fix bug: load_userdict 10 years ago
Jack a06b7d388e fix bug in __main__.py 10 years ago
Sun Junyi 9ca5b69907 Merge pull request #238 from gumblex/master
use str.splitlines to avoid losing line breaks
10 years ago
Dingyuan Wang f2b7183a71 use str.splitlines to avoid losing line breaks 10 years ago
Sun Junyi b14eb329e3 Merge pull request #237 from gumblex/master
Store prefixes directly in the word-frequency dict
10 years ago
Dingyuan Wang 872a7039f2 Merge branch 'master' of https://github.com/fxsjy/jieba 10 years ago
Dingyuan Wang f808ea0ebb use only one dict to store words and prefixes 10 years ago
fxsjy 4d7b515801 Merge branch 'master' of https://github.com/fxsjy/jieba 10 years ago
fxsjy 5bfa43a781 fix test scripts 10 years ago
Dingyuan Wang f3a53dd2da fix print() in tests 10 years ago
Sun Junyi a229041e58 Merge pull request #234 from yanyiwu/patch-2
Update README.md
10 years ago
Yanyi Wu 5d321cbccd Update README.md 10 years ago
fxsjy 8cbb26a7b6 fix test_file.py 10 years ago
Sun Junyi 41b47b0593 Merge pull request #233 from gumblex/master
Merge jieba3k; compatible with Python 2/3
10 years ago
Dingyuan Wang 32a0e92a09 don't compile re every time; autopep8 10 years ago
Dingyuan Wang 22bcf8be7a Merge master and jieba3k, make the code Python 2/3 compatible 10 years ago
Sun Junyi caae26fbfa Merge pull request #231 from gumblex/master
Store frequency counts directly in FREQ
10 years ago
Dingyuan Wang 4197dfb8fa store int directly in FREQ; small improvements 10 years ago
Dingyuan Wang 765fd6b7f0 store int directly in FREQ; small improvements 10 years ago
Sun Junyi c95f402e2b Merge pull request #214 from aszxqw/master
add iosjieba
10 years ago
yanyiwu 1d91072498 add iosjieba 10 years ago
Sun Junyi 852a07c4f2 Merge pull request #211 from gumblex/jieba3k
Fix the repr return value of the pair class in posseg (jieba3k)
10 years ago
Dingyuan Wang 7bcb128f5f fix division by zero in textrank; fix posseg.pair.__repr__ 10 years ago
Sun Junyi b08c3f8ed7 Merge pull request #205 from lynschinzer/master
Fix division-by-zero issue when words are not found in dict.
10 years ago
Lin fea3aec6bd Fix division-by-zero issue when words are not found in dict. 10 years ago
Sun Junyi 8be082017a Merge pull request #204 from gumblex/jieba3k
Improve setup.py and related files with the corresponding py3k updates
10 years ago
Sun Junyi 293dbbc390 Merge pull request #203 from gumblex/master
Fix posseg; improve setup.py
10 years ago
Dingyuan Wang 3dad899ec8 backport 2to3 scripts and changelog 10 years ago
Dingyuan Wang c6b386f65b update jieba3k 10 years ago
Dingyuan Wang 7b7c6955a9 complete the setup.py, fix #202 problem in posseg 10 years ago
Sun Junyi 8a2e7f0e7e Merge pull request #202 from nomaka/patch-1
Update __init__.py
10 years ago
Nomaka 9cb76dd8b9 Update __init__.py
The idx parameter of calc is unused
10 years ago
Sun Junyi 99748bfc17 Merge pull request #201 from skyerown/master
Add part-of-speech filtering to the keyword extraction functions
10 years ago
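walkskyer a336e26403 Add the allowPOS parameter to the textrank function, and change the allowPOS parameter of extract_tags to be consistent with textrank. 10 years ago
walkskyer bab5f362ba Convert the allowPOS parameter of extract_tags to a frozenset to reduce lookup time. 10 years ago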
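
The last entry above converts allowPOS to a frozenset so that membership tests take constant time on average. The following is a minimal sketch of that idea only, not jieba's actual code: the helper name filter_by_pos, its signature, and the sample data are all hypothetical and chosen for illustration.

```python
# Hypothetical helper illustrating part-of-speech filtering with a frozenset.
# This is NOT jieba's extract_tags; names and signature are assumptions.
def filter_by_pos(tagged_words, allow_pos=()):
    """Keep only words whose POS flag is in allow_pos.

    tagged_words: iterable of (word, flag) pairs.
    allow_pos: any iterable of POS flags; empty means "keep everything".
    """
    # Convert once, up front: each `flag in allowed` check is then an
    # average O(1) hash lookup instead of a linear scan of a list or tuple.
    allowed = frozenset(allow_pos)
    if not allowed:
        return [word for word, _ in tagged_words]
    return [word for word, flag in tagged_words if flag in allowed]


# Made-up sample data: (word, POS flag) pairs.
tagged = [("Beijing", "ns"), ("very", "d"), ("university", "n"), ("study", "v")]
nouns = filter_by_pos(tagged, allow_pos=["ns", "n"])
print(nouns)
```

Building the frozenset once per call matters mainly when the tagged input is long, since the membership test runs once per word.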