Commit Graph

1086 Commits (33906e36f48588f8d1a44331d1a21fbcd3a5f9d7)
 

Author SHA1 Message
yihua.huang e3131af856 remove user manual link
yihua.huang 7f0d79ccb0 readme
yihua.huang ab1dad3a71 remove lib
yihua.huang a6b2e307f3 docs for 0.6.0
yihua.huang d69204b919 0.6.0
yihua.huang 9bdb48b2d0 version 0.6.0
yihua.huang eeb607fd0e revert the exception throwing in Spider.processRequest() to the original logic
yihua.huang 97592d6720 Version 0.6.0
yihua.huang 42e5e623b4 remove project avalon
yihua.huang 00dfebbceb remove guava dep and add fix docs
yihua.huang c2531c6817 clean dependency
yihua.huang a960a39c44 fix compile error for example change
yihua.huang a3ee9e3d08 fix example
yihua.huang 7476ceccee more stable test
yihua.huang 5ce3fdfe5a some refactor in log
yihua.huang 98163a3e40 update examples
yihua.huang 243ebc22fa update httpclient version to 4.5.2
yihua.huang b090dcd20d specific error page for HttpClientDownloaderTest to avoid test error when local port is available
yihua.huang 8f942d6fe2 fix the issue where threads crawling https links could not terminate, leaving the process running indefinitely
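Jsbd 1b886d48a2 add a PhantomJSDownloader constructor that supports a custom crawl.js path, because when another project depends on this jar, the phantomjs command run via runtime.exec() cannot use the crawl.js inside the jar
Jsbd d1f2e65e5d add a PhantomJSDownloader constructor that supports a custom crawl.js path, because when another project depends on this jar, the phantomjs command run via runtime.exec() cannot use the crawl.js inside the jar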
Jsbd f8a2328ead Merge pull request from code4craft/master
merge
Yihua Huang 1987cd3ae1 Merge pull request from code4craft/0.6.0
groovy demo
Yihua Huang 65fe2c4487 Merge pull request from jsbd/master
add a new constructor to PhantomJSDownloader that supports a custom phantomjs command
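Jsbd ebc61363c8 add a new constructor to PhantomJSDownloader that supports a custom phantomjs command
add a new constructor to PhantomJSDownloader that supports a custom phantomjs command
 example: 
   *    phantomjs.exe supports the Windows environment
   *    phantomjs --ignore-ssl-errors=yes ignores some errors when the crawled address is https
   *    /usr/local/bin/phantomjs uses the absolute path of the command, avoiding an IOException caused by system environment variables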
yihua.huang fdf39eb99d open in new page
yihua.huang 1e74494708 add related link
yihua.huang b92e6b04f0 fix the NPE caused by setting a custom DuplicateRemover on FileCacheQueueScheduler
yihua.huang dafd2b77ff fix GithubRepoPageProcessor in example
yihua.huang cfed860fb9 Merge branch 'master' of github.com:code4craft/webmagic
yihua.huang 2189aab652 fix test
Yihua Huang 1491033534 Merge pull request from jerry-sc/monitor-bug
fix the monitor bug where the spider terminates when a seed url includes a port
Yihua Huang 228911b58c Merge pull request from gyk001/master
fixed  the problem with extracting JSON data using annotations
yihua.huang 507556d0aa fix test: ProxyTest.testProxy() do not load exist proxy config
yihua.huang 55f131e5ef update fastjson to 1.2.21
Jerry e56b8c3efc fix the monitor bug where the spider terminates when a seed url includes a port
郭玉昆 700898fe8a fixed the problem with extracting JSON data using annotations
Yihua Huang e22d6426fc Merge pull request from Salon-sai/master
add: redis scheduler with priority
Salon.sai f89a6a6826 add: redis scheduler with priority
yihua.huang 448e528140 update StringUtils to apache lang3
yihua.huang 3e33959b7a fix javadoc
yihua.huang 3a6e246350 Merge branch 'kapsterio-fix'
yihua.huang 8730e3e97a Merge branch 'fix' of git://github.com/kapsterio/webmagic into kapsterio-fix
Yihua Huang 37cb43b667 Merge pull request from lavenderx/master
add PhantomJSDownloader
yihua.huang 2400ff7e1a resolve conflict
yihua.huang 9de64ea0f2 Merge branch 'hepan-master'
yihua.huang b7f3c4bba0 Merge branch 'master' of git://github.com/hepan/webmagic into hepan-master
yihua.huang d8f978fd20 fix test in JsonPathSelectorTest
yihua.huang 25821c3c06 Merge branch 'EdwardsBean-proxy-strategy'
yihua.huang 61c28a0130 refactor on proxypool