+ Page stores the result of the most recent fetch, including the extracted results, and defines the URLs to be crawled next.
+
+ Main methods:
+ {@link #getUrl()} gets the URL of the current page
+ {@link #getHtml()} gets the HTML content of the current page
+ {@link #putField(String, Object)} saves an extracted result
+ {@link #getResultItems()} gets the extracted results; called from {@link us.codecraft.webmagic.pipeline.Pipeline}
+ {@link #addTargetRequests(java.util.List)} {@link #addTargetRequest(String)} add URLs to crawl
+
+
+
+ @author code4crafter@gmail.com
+]]>
+