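Parent: org.sonatype.oss:oss-parent:7
Group ID: us.codecraft
Version: 0.4.2
Model version: 4.0.0
Packaging: pom
Encoding properties: UTF-8, UTF-8
Artifact ID: webmagic-parent
Name: webmagic-parent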
Description: A crawler framework. It covers the whole lifecycle of a crawler: downloading, URL management,
content extraction, and persistence. It can simplify the development of a specific crawler.
URL: https://github.com/code4craft/webmagic/

Developers:
  code4craft: Yihua huang, code4crafer@gmail.com
  yuany: Ligang Yao, ligang.yao@answers.com

SCM:
  connection: scm:git:git@github.com:code4craft/webmagic.git
  developer connection: scm:git:git@github.com:code4craft/webmagic.git
  URL: git@github.com:code4craft/webmagic.git
  tag: HEAD

License: Apache License, Version 2.0 (http://www.apache.org/licenses/LICENSE-2.0)

Modules: webmagic-core, webmagic-extension/, webmagic-samples/

Dependencies (groupId:artifactId:version):
  junit:junit:4.7 (scope: test)
  org.apache.httpcomponents:httpclient:4.3.1
  com.google.guava:guava:15.0
  us.codecraft:xsoup:0.1.0
  net.sf.saxon:Saxon-HE:9.5.1-1
  log4j:log4j:1.2.17
  org.apache.commons:commons-lang3:3.1
  commons-collections:commons-collections:3.2.1
  net.sourceforge.htmlcleaner:htmlcleaner:2.5
  org.apache.commons:commons-io:1.3.2
  org.jsoup:jsoup:1.7.2

Build plugins (groupId org.apache.maven.plugins):
  maven-surefire-plugin: pertest; -Xms1024m -Xmx1024m -Xss1m
  maven-compiler-plugin 3.1: 1.6; UTF-8
  maven-dependency-plugin 2.8: execution copy-dependencies, phase package, goal copy-dependencies; output ${project.build.directory}/lib; flags false, false, true
  maven-resources-plugin 2.6: UTF-8
  maven-source-plugin 2.2.1: execution attach-sources, goal jar
  maven-javadoc-plugin 2.9.1: UTF-8; execution attach-javadocs, goal jar
  maven-release-plugin 2.4.1

Profile release-sign-artifacts (activated when performRelease is true):
  maven-gpg-plugin 1.1: execution sign-artifacts, phase verify, goal sign