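<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <parent>
        <groupId>us.codecraft</groupId>
        <artifactId>webmagic</artifactId>
        <version>0.1.0</version>
    </parent>
    <modelVersion>4.0.0</modelVersion>

    <artifactId>webmagic-core</artifactId>

    <dependencies>
        <dependency>
            <groupId>org.apache.httpcomponents</groupId>
            <artifactId>httpclient</artifactId>
        </dependency>
        <dependency>
            <groupId>junit</groupId>
            <artifactId>junit</artifactId>
        </dependency>
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-lang3</artifactId>
        </dependency>
        <dependency>
            <groupId>net.sf.saxon</groupId>
            <artifactId>Saxon-HE</artifactId>
        </dependency>
        <dependency>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
        </dependency>
        <dependency>
            <groupId>commons-collections</groupId>
            <artifactId>commons-collections</artifactId>
        </dependency>
        <dependency>
            <groupId>net.sourceforge.htmlcleaner</groupId>
            <artifactId>htmlcleaner</artifactId>
        </dependency>
        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
        </dependency>
        <dependency>
            <groupId>org.apache.commons</groupId>
            <artifactId>commons-io</artifactId>
        </dependency>
    </dependencies>
</project>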