
58 Commits (0bb2ddcc1be60a455bdb065c7dac19633da76368)

Author SHA1 Message Date
Sun Junyi 81390a2d23 test_file.py: close the file object 12 years ago
fxsjy b77645b3aa modify test_file.py; use less memory 12 years ago
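Linker Lin 5d83855088 Automatically detect the number of CPUs and start a suitable number of processes. 12 years ago
Linker Lin 2ceb981da0 Automatically detect the number of CPUs and start a suitable number of processes. 12 years ago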
Sun Junyi 9d0ea771a5 fix bug; decimals & digit-english mixed 12 years ago
Sun Junyi ba5114dc95 update whoosh example 12 years ago
Sun Junyi f424862222 clean the files in tmp 12 years ago
Sun Junyi b18d56d2a3 Merge pull request #72 from linkerlin/master
Add a tmp directory so that test_whoosh.py can run.
12 years ago
miao.lin becd32b178 made test_whoosh.py happy.
Add a tmp directory so that test_whoosh.py can run.
12 years ago
Sun Junyi 45daf561c7 follow PEP8: change tab to 4 white spaces 12 years ago
Sun Junyi dbec3ad9df add some comments 12 years ago
Sun Junyi efc784312c add ChineseAnalyzer for whoosh search engine 12 years ago
Sun Junyi f08690a2df add 'search mode' for jieba.tokenize 12 years ago
Sun Junyi cb1b0499f7 unittest for jieba.tokenize 12 years ago
Sun Junyi 11a3b10755 new method: jieba.tokenize 12 years ago
Sun Junyi c0816b9bb0 more mixed words 12 years ago
Sun Junyi c9e8da9e63 add more mixed words to dict.txt 12 years ago
fxsjy 0087a4e7e3 adjust prob_trans for better support of named entities; fix some bad cases 12 years ago
Sun Junyi 4300f79788 add an example of using sklearn+jieba 12 years ago
Sun Junyi a8f902545c fix some bad cases 12 years ago
cloudaice 9ee20a5293 add generator test 12 years ago
cloudaice 0c050b5eb2 add jieba.posseg test case 12 years ago
cloudaice b0f9e6721e add cutall test case 12 years ago
cloudaice a7ff398edc add three test cases: cut, set_dictionary, cut_for_search 12 years ago
cloudaice 667203a9ae replace tabs with spaces; use join instead of a loop 12 years ago
cloudaice a2d2078465 replace tabs with spaces; use is to check whether an object is None 12 years ago
cloudaice e0434871eb reformat demo.py to conform to PEP 8 12 years ago
Sun Junyi c1bf815343 update test case 12 years ago
Sun Junyi 94d455b079 hot fix of cut_all=True 12 years ago
Sun Junyi 59d5d3b811 fix bug and change version 12 years ago
fxsjy 8666428fb0 fix a bug of changing dictionary 12 years ago
fxsjy 9bebe6120b utf-8 output is more friendly to Linux 12 years ago
Sun Junyi d3339633d5 in the speed test: initialize first to ignore the time of dict loading 12 years ago
fxsjy bc049090a5 make lazy load thread safe 12 years ago
fxsjy b46166f768 use CRLF as separator to make chunks in parallel mode 12 years ago
fxsjy 6b83593b5a rm stub.log 12 years ago
fxsjy 62cf22121f new feature: parallel segment with multiprocessing 12 years ago
Sun Junyi 8d89e8afda handle 的 12 years ago
fxsjy 45591bb9ab support flag '_'; ignore white space 12 years ago
Sun Junyi 94ad7e7035 support decimal point 12 years ago
Sun Junyi a383f035ba support decimal point: example PI=3.14159 => PI / = / 3.14159 12 years ago
Sun Junyi 8e49199993 keep punctuation marks 12 years ago
Sun Junyi 58c363655c support user defined word tag 12 years ago
Sun Junyi 6cc0e95759 rm 1.log 12 years ago
Sun Junyi d2634a049b fix a bug in pypy 12 years ago
Sun Junyi 06ebc6f71c en-chn mix words in POS 12 years ago
Sun Junyi a8ae0398b4 add one example 12 years ago
Sun Junyi 6517119110 remove 1.log 12 years ago
Sun Junyi 8c05efed68 remove tlbb.txt 12 years ago
Sun Junyi 379cd4933a support en-chn mixed words, like B超 12 years ago