188 Commits (74fdd1df1b20885285294c2f52ecb4e2166a7e91)

Author SHA1 Message Date
yihua.huang 16e12e3bc9 #27 customize http header for downloader 11 years ago
yihua.huang 1a2c84ea78 #27 add timeout config to site 11 years ago
yihua.huang b131878123 add example 12 years ago
yihua.huang 95ab4edec3 some bugfix 12 years ago
yihua.huang fba330872b fix a thread pool exception 12 years ago
yihua.huang 3c79d031bd fix thread pool 12 years ago
yihua.huang c17a31a21d fix null pointer exception #26 12 years ago
yihua.huang d2e0f0cd33 #25 use URL api in UrlUtils.canonicalizeUrl() 12 years ago
yihua.huang ef4cf49fee add stop method to spider #24 12 years ago
yihua.huang 692de76f86 fix issue #21 charset detect error 12 years ago
yihua.huang 1fc8e104ab add cycle retry 12 years ago
yihua.huang d141541ef3 add retry 12 years ago
yihua.huang aefd0569a5 update version 12 years ago
yihua.huang 194518fd82 add switch 12 years ago
yihua.huang 326b97c65a update 12 years ago
yihua.huang 2c3574537a refactor in selectors 12 years ago
yihua.huang 85b7cf1563 complete test 12 years ago
yihua.huang d7cd9e5747 update pom 12 years ago
yihua.huang 55d4a76ab7 newselectors 12 years ago
yihua.huang d7abbd0e4b fix compile error 12 years ago
yihua.huang 5e9e8b2541 add TextContentSelector 12 years ago
yihua.huang 0cc0ccee35 add charset specific for easy call of HttpClientDownloader 12 years ago
yihua.huang 91dcccf7b5 add a sample 12 years ago
yihua.huang 0f0f1a9bcd release notes 12 years ago
yihua.huang c1471718df extractors 12 years ago
yihua.huang 20705b34ac add more option to extractors 12 years ago
yihua.huang c70ed57025 remove PriorityScheduler to core 12 years ago
yihua.huang 7003426898 update pom 12 years ago
yihua.huang c79d6ecf09 complete all comments 12 years ago
yihua.huang 90bbe9b951 webmagic-core 12 years ago
yihua.huang 17f8ead28f update comments for selector 12 years ago
yihua.huang 77e6ca2945 update comments 12 years ago
yihua.huang 5073258237 closable 12 years ago
yihua.huang d01c0eb8ce update comments of spider 12 years ago
yihua.huang 5f1f4cbc46 update comments 12 years ago
yihua.huang 1148450ff9 update filecache to more useful 12 years ago
yihua.huang 3ba7a76f44 add combo extract to replace Extract2 Extract3... 12 years ago
yihua.huang 5cb45af3a4 +doc 12 years ago
yihua.huang ef673b985e add a method for httpclientdownloader 12 years ago
yihua.huang 067f3ea0cb add some null pointer check for httpclientdownloader 12 years ago
yihua.huang 9e82256ce3 update docs 12 years ago
yihua.huang 0a902b441c update docs 12 years ago
yihua.huang 0f2c5b5723 update redisscheduler 12 years ago
yihua.huang 787b952932 release notes and docs 12 years ago
yihua.huang 8b15f3c63d add test 12 years ago
yihua.huang ade5714d50 add https support 12 years ago
yihua.huang 21eca688e9 complete docs 12 years ago
yihua.huang 17d2d98cec remove invalid @date 12 years ago
yihua.huang 268bd8d0c4 remove saxon to extension 12 years ago
yihua.huang cff943f698 fix path format error 12 years ago
yihua.huang 36494bcfa5 add xpath2.0 api 12 years ago
yihua.huang 5c96407a3d fix a null domain error 12 years ago
yihua.huang c7005a0227 json fix 12 years ago
yihua.huang e5f4b3916f change file dir 12 years ago
yihua.huang 7d277e84d4 update lucene pipeline 12 years ago
yihua.huang b40cca1122 move model package to plugin 12 years ago
yihua.huang 4eb3d60083 fix nullpointer exception 12 years ago
yihua.huang b0af45f4bb complete redis support 12 years ago
yihua.huang f3a29d9315 fix pagedmodel bug 12 years ago
yihua.huang 629f8ac2d1 add extractors chain 12 years ago
yihua.huang 27ce3fc176 lazy init 12 years ago
yihua.huang dc9f574e27 update request 12 years ago
yihua.huang d56c681be1 add priority to request 12 years ago
yihua.huang 971e7b6ce2 add core 12 years ago
yihua.huang 619a12b303 add paged support 12 years ago
yihua.huang a5c85c3c8b add annotation ExtractByRaw 12 years ago
yihua.huang 1a50c64e33 update name 12 years ago
yihua.huang a3a868f584 rename 12 years ago
yihua.huang 04a7fa037a update pipeline 12 years ago
yihua.huang 21cae2ff2e update package 12 years ago
yihua.huang cfb8990453 update author 12 years ago
yihua.huang b393e38320 add multi entity extract 12 years ago
yihua.huang bfadac756a fix an attribute bug 12 years ago
yihua.huang 145628557d update afterextract api 12 years ago
yihua.huang aca165b132 add and or selector 12 years ago
yihua.huang 69245e8c03 fix Class.assinable bug 12 years ago
yihua.huang 65518f7672 add list support 12 years ago
yihua.huang d4de60a562 skip test 12 years ago
yihua.huang d26cd82d59 rename package 12 years ago
yihua.huang f84b53514f complete objectpipeline 12 years ago
yihua.huang 866ab0a056 update email 12 years ago
yihua.huang 7c9e9ce869 xpath2.0 12 years ago
yihua.huang 7f27c28d4c simplify api 12 years ago
yihua.huang d7899e94ae test saxon and introduce XPath2.0 support 12 years ago
yihua.huang 3fe3d8f044 update 12 years ago
yihua.huang 516ff3310d add failfast 12 years ago
yihua.huang 7a4dbb1f15 introduce notnull 12 years ago
yihua.huang 06a39af0f3 add setter support 12 years ago
yihua.huang abba3b7bff add extract by url 12 years ago
yihua.huang f08ffc34fd rename 12 years ago
yihua.huang c5cf05640a processor 12 years ago
yihua.huang 50edd22ef6 add annotation 12 years ago
yihua.huang 7020b8648d fix a thread problem 12 years ago
yihua.huang 52fd5cfc1c fix encoding 12 years ago
yihua.huang e87aabf8fd add a new method to downloader for setting the thread count 12 years ago
yihua.huang 18fefa0c0a fix a spider init problem 12 years ago
yihua.huang 54904851ea add list output support 12 years ago
yihua.huang 42508af041 add huaban processor 12 years ago
yihua.huang fe224cbf66 release resource 12 years ago
yihua.huang 86a20eabd9 fix a httpclient pool size bug 12 years ago