Project: webmagic
Coordinates (groupId:artifactId:version): us.codecraft:webmagic:0.2.0
Packaging: pom
POM model version: 4.0.0
Description:
A crawler framework. It covers the whole crawler lifecycle: downloading, URL management, content
extraction, and persistence. It simplifies the development of a specific crawler.
Project URL: https://github.com/code4craft/webmagic/

Developer:
  id: code4craft
  name: Yihua huang
  email: code4crafer@gmail.com

SCM:
  connection: scm:git:git@github.com:code4craft/webmagic.git
  developer connection: scm:git:git@github.com:code4craft/webmagic.git
  url: git@github.com:code4craft/webmagic.git

License: Apache License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

Modules:
  webmagic-core
  webmagic-extension/
  webmagic-samples

Dependencies (groupId:artifactId:version):
  junit:junit:4.7 (scope: test)
  org.apache.httpcomponents:httpclient:4.2.4
  net.sf.saxon:Saxon-HE:9.5.1-1
  log4j:log4j:1.2.17
  org.apache.commons:commons-lang3:3.1
  commons-collections:commons-collections:3.2.1
  net.sourceforge.htmlcleaner:htmlcleaner:2.5
  org.apache.commons:commons-io:1.3.2
  org.jsoup:jsoup:1.7.2

Build plugins (groupId:artifactId:version):
  org.apache.maven.plugins:maven-compiler-plugin:3.1
    configuration values: 1.6, UTF-8
  org.apache.maven.plugins:maven-dependency-plugin:2.8
    execution: copy-dependencies (phase: package, goal: copy-dependencies)
    configuration values: ${project.build.directory}/lib, false, false, true
  org.apache.maven.plugins:maven-resources-plugin:2.6
    configuration values: UTF-8
  org.apache.maven.plugins:maven-source-plugin:2.2.1
    execution: attach-sources (goal: jar)
  org.apache.maven.plugins:maven-javadoc-plugin:2.9.1
    configuration values: UTF-8
    execution: attach-javadocs (goal: jar)
  org.apache.maven.plugins:maven-release-plugin:2.4.1