223 Commits (8f84fb2ab5c154f8ed65d98448b897c104d08950)

Author SHA1 Message Date
yihua.huang 6c11718566 Clean project structure
yihua.huang 9606a173cd fix ZipCodePageProcessor
yihua.huang 757cc9b942 [maven-release-plugin] prepare for next development iteration
yihua.huang 63ffb5c792 [maven-release-plugin] prepare release webmaigc-0.4.3
yihua.huang 66d4d3c192 Merge branch 'master' into 0.4.x
yihua.huang af07280176 remove defend code for httpclient 4.3.1 because it is fixed in 4.3.3
yihua.huang d5a978e00f update version back to 0.4.3
yihua.huang 55368919df add attribute 'text' support for CssSelector
yihua.huang 88b50d4182 bugfix: cycleTry will not work when spawnUrl is set to false
yihua.huang 2768a1cae4 add test for cycleTriedTimes and fix cycleTriedTimes inc error
yihua.huang bbd0d7e600 update httpclient version to 4.3.3
yihua.huang 571061454a add CYCLE_TRIED_TIMES support to QueueScheduler and PriorityScheduler
yihua.huang 0e98183f74 Change log4j to slf4j
yihua.huang fa33b15843 property loader
yihua.huang af809c4d55 update version to 0.5.0-snapshot
Almark Ming 2b46b11e55 Update RegexSelector.java
Optimize regex format check

Conflicts:
	webmagic-core/src/main/java/us/codecraft/webmagic/selector/RegexSelector.java
yihua.huang b51fb2696b update ut for cookie
yihua.huang ff2f588c41 nullpointer exception
yihua.huang d274310cb2 [maven-release-plugin] prepare for next development iteration
yihua.huang e8c32a32dc [maven-release-plugin] prepare release webmagic-0.4.2
yihua.huang 6a828e923c Downloader thread hang up when timeout
shijinping 9a524aa364 fetch the httpClient content again inside the double-check
yihua.huang e7083dc39d [maven-release-plugin] prepare for next development iteration
yihua.huang ae623567b3 [maven-release-plugin] prepare release webmagic-0.4.1
yihua.huang 59ad4cad27 Add jsonpath in annotation mode for json result
yihua.huang c2d6d495b3 add getThreadAlive(),getStatus,getPageCount() to spider
yihua.huang cf62d707e0 Spider does not exit when success
yihua.huang a01312930a Parsing html after page.getHtml()
yihua.huang f63d33b457 update some comments
yihua.huang 04fcf3193f Change algorithm of SmartContentSelector
yihua.huang 296a68920e fix javadoc and add setPipelines() for spider
yihua.huang 47a0360783 add status code to page
yihua.huang bc5c30de17 update scripts
yihua.huang f9daae39cf [maven-release-plugin] prepare for next development iteration
yihua.huang fdb9441519 [maven-release-plugin] prepare release webmagic-0.4.0
yihua.huang 1d75ae7f5b rollback version to 0.4.0 because not deploy success
yihua.huang df8ca8ad09 add scripts
yihua.huang 775eb9732f [maven-release-plugin] prepare for next development iteration
yihua.huang 0b4fadc24d [maven-release-plugin] prepare release webmagic-0.4.0
yihua.huang fe6d9bb2e2 get keep-alive rework
yihua.huang fd6d2fd6f8 try to keepalive TCP connection
yihua.huang 425df08523 update version to 0.4.0
yihua.huang e046bb0723 remove useless code
yihua.huang 6e32a19f80 update api for direct download
yihua.huang 807aefe9df change EntityUtil to IOUtil because some encoding error
yihua.huang 00b0a751b4 ignore 'content-encoding' when redirect
yihua.huang 8f774afc84 add direct download
yihua.huang c18b603399 optimize long compare
yihua.huang ed3f3583cc downloader refactor
yihua.huang a37f40e6e6 add cookie support