
101 Commits (5e8ca02ec670e18f52361296072929fc0a93efc3)

Author SHA1 Message Date
yihua.huang 928f98dd93 auto create folder in JsonFilePipeline #122 11 years ago
yihua.huang 7fbe18b8c0 implementation of PageMapper #120 11 years ago
yihua.huang 5dc9fe95a9 interface of PageMapper #120 11 years ago
yihua.huang 186b90512e refactor redisscheduler #118 11 years ago
yihua.huang d1140b9e29 add bloom filter for scheduler #118 11 years ago
yihua.huang e8d4a9be2b fix remove duplicate error #117 11 years ago
yihua.huang 1104122979 more abstraction in scheduler 11 years ago
yihua.huang b0fb1c3e10 remove copy-dependcies plugin for m2e error 11 years ago
yihua.huang 94a67165e1 remove jmx server for simplify #98 11 years ago
yihua.huang 86a45a6643 change SpiderMonitor to singleton #98 11 years ago
yihua.huang ab4d36806e clean code 11 years ago
yihua.huang 04fde8203b add control for monitor 11 years ago
yihua.huang 2770811a10 update monitor example 11 years ago
yihua.huang 17e95f2a7f comments 11 years ago
yihua.huang 375e64e845 more monitor status 11 years ago
yihua.huang c6661899fd new thread pool #110 11 years ago
yihua.huang 179baa7a22 return when page is null 11 years ago
yihua.huang 4738ae2d14 change url find to match #94 11 years ago
yihua.huang f973889cda refactor subpageprossor etc. #94 11 years ago
yihua.huang acb63d55d7 some check and example #98 11 years ago
yihua.huang 11ba5beb42 [refactor]move monitor to webmagic-extension #98 11 years ago
yihua.huang b06aa489fb [BugFix]Only one url from sourceRegion can be extracted #107 11 years ago
yihua.huang 023c2ac84e spider config draft 11 years ago
yihua.huang a5db6cf292 some monitor and JMX support #98 11 years ago
yihua.huang aae1ab2cd6 fix compile error 11 years ago
yihua.huang 1fbfc92de2 Inherit support of Field annotation in Model #103 11 years ago
yihua.huang 3a79b1b64a [Bugfix]formatter property does not work when field is String#100 11 years ago
Yihua Huang cc9d319fd9 Merge pull request #94 from sebastian1118/master (update:PatternHandler) 11 years ago
yihua.huang 03c251237b add Json parse support 11 years ago
Tian 99e12aafaa update:PatternHandler 11 years ago
yihua.huang c1e7207869 add FileCacheQueueScheduler support for cycleRetryTimes 11 years ago
yihua.huang 969ad1766b change logger style to slf4j for cleaner code 11 years ago
yihua.huang 9b2cb43f47 ConfigurablePageProcessor #91 11 years ago
Bo LIANG 159eeea2f5 Remove unused variable to make the project cleaner. 11 years ago
yihua.huang c143fc662c add SubPageProcessor #86 11 years ago
Yihua Huang 474f785dab Merge pull request #86 from sebastian1118/master (new feature: PatternProcessor) 11 years ago
Tian 38a12f8641 new feature: PatternProcessor 11 years ago
yihua.huang dafd0b5875 [BugFix]multi model in one pageprocessor will be skipped #85 11 years ago
yihua.huang 8958d774f2 add default values for @Formatter 11 years ago
yihua.huang 6c11718566 Clean project structure #70 11 years ago
yihua.huang 0e98183f74 Change log4j to slf4j #55 11 years ago
yihua.huang fa33b15843 property loader 11 years ago
yihua.huang 362fdd0662 Merge branch 'master' of github.com:code4craft/webmagic 11 years ago
yihua.huang af809c4d55 update version to 0.5.0-snapshot 11 years ago
jon a722f9bb66 Fix the NullPointerException thrown because fileCursor in FileCacheQueueScheduler was not initialized when the file was reopened 11 years ago
yihua.huang 486d9d276f #45 Remove multi in ExtractBy 11 years ago
yihua.huang 18a3af4a0a add more sample for jsonpath #42 11 years ago
yihua.huang 59ad4cad27 #42 Add jsonpath in annotation mode for json result 11 years ago
yihua.huang cf62d707e0 #36 Spider does not exit when success 11 years ago
yihua.huang a01312930a #39 Parsing html after page.getHtml() 11 years ago