Scraping WeChat Mini Programs
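How do you scrape a WeChat mini program? It is actually not very hard. There are two main problems to solve: capturing the traffic and debugging the mini program. You can use Chrome for this, which makes things much more convenient.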
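If you cannot capture a mini program's traffic, the WeChat version may be too new to allow it. In that case, try capturing with Fiddler or Charles, which should solve the capture problem.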
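Once packet capture works, many mini programs can be scraped, and what remains is the IP problem. Some mini programs also have front-end anti-scraping measures that encrypt or obfuscate the request parameters. For those, you need a way to debug the mini program.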
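A WeChat mini program is essentially a website that just cannot be opened in a browser. A mini program is built much like a web page, and its data exchange is also handled by JavaScript. So for scraping purposes, debugging a mini program mostly means debugging JavaScript.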
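To make "encrypted request parameters" a bit more concrete: once the JavaScript that builds the request has been traced, the same logic usually has to be reproduced in the scraper. Below is a minimal sketch, assuming a purely hypothetical scheme where the `sign` parameter is the MD5 hex digest of the alphabetically sorted `key=value` pairs joined with `&`, followed by `key=<secret>`. The class `SignSketch`, the method `sign`, the parameter names, and the secret are all made up for illustration. A real mini program will have its own scheme, which you find by reading its JavaScript.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class SignSketch {
    // Hypothetical scheme (an assumption, not taken from any real mini program):
    // sign = md5("k1=v1&k2=v2&...&key=SECRET"), with keys sorted alphabetically.
    public static String sign(Map<String, String> params, String secret) {
        StringBuilder sb = new StringBuilder();
        // TreeMap gives a stable, sorted key order regardless of insertion order
        for (Map.Entry<String, String> e : new TreeMap<>(params).entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('&');
        }
        sb.append("key=").append(secret);
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(sb.toString().getBytes(StandardCharsets.UTF_8));
            // Convert the 16 digest bytes to a 32-character lowercase hex string
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            // MD5 is required to be available on every Java platform
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("page", "1");   // hypothetical request parameters
        params.put("id", "42");
        System.out.println("sign = " + sign(params, "demo-secret"));
    }
}
```

The computed value would then be sent along with the other parameters, so the request looks the same as one produced by the mini program itself.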
We can test this with code. The Java example below sends a request through a proxy server:
    import org.apache.commons.httpclient.Credentials;
    import org.apache.commons.httpclient.HostConfiguration;
    import org.apache.commons.httpclient.HttpClient;
    import org.apache.commons.httpclient.HttpMethod;
    import org.apache.commons.httpclient.HttpStatus;
    import org.apache.commons.httpclient.UsernamePasswordCredentials;
    import org.apache.commons.httpclient.auth.AuthScope;
    import org.apache.commons.httpclient.methods.GetMethod;

    import java.io.IOException;

    public class Main {
        // Proxy server (product website: www.16yun.cn)
        private static final String PROXY_HOST = "t.16yun.cn";
        private static final int PROXY_PORT = 31111;

        public static void main(String[] args) {
            HttpClient client = new HttpClient();
            HttpMethod method = new GetMethod("https://httpbin.org/ip");

            // Route the request through the proxy
            HostConfiguration config = client.getHostConfiguration();
            config.setProxy(PROXY_HOST, PROXY_PORT);

            // Send the credentials up front instead of waiting for a challenge
            client.getParams().setAuthenticationPreemptive(true);

            // Proxy username and password
            String username = "16ABCCKJ";
            String password = "712323";
            Credentials credentials = new UsernamePasswordCredentials(username, password);
            AuthScope authScope = new AuthScope(PROXY_HOST, PROXY_PORT);
            client.getState().setProxyCredentials(authScope, credentials);

            try {
                client.executeMethod(method);
                if (method.getStatusCode() == HttpStatus.SC_OK) {
                    // httpbin.org/ip returns the IP address the request came from
                    String response = method.getResponseBodyAsString();
                    System.out.println("Response = " + response);
                }
            } catch (IOException e) {
                e.printStackTrace();
            } finally {
                // Always release the connection
                method.releaseConnection();
            }
        }
    }
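The example above only prints the raw response. To check that the proxy is really being used, it can help to pull the IP out of the response and compare it with your own address. Here is a small sketch of that, assuming the body has the `{"origin": "..."}` shape that httpbin.org/ip returns. The class `OriginCheck`, its methods, and the sample addresses are hypothetical helpers for illustration, not part of the original code. A regular expression is used only to keep the sketch dependency-free; a JSON library would be the more robust choice.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OriginCheck {
    // Matches "origin": "<value>" and captures the value
    private static final Pattern ORIGIN =
            Pattern.compile("\"origin\"\\s*:\\s*\"([^\"]+)\"");

    // Returns the reported IP, or empty if the body has no "origin" field
    public static Optional<String> extractOrigin(String body) {
        if (body == null) {
            return Optional.empty();
        }
        Matcher m = ORIGIN.matcher(body);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    // True only when an origin was found and it differs from the local IP
    public static boolean proxyInUse(String body, String localIp) {
        return extractOrigin(body).map(ip -> !ip.equals(localIp)).orElse(false);
    }

    public static void main(String[] args) {
        // Sample body with a documentation-range address (made up)
        String sample = "{\n  \"origin\": \"203.0.113.7\"\n}";
        System.out.println("origin = " + extractOrigin(sample).orElse("not found"));
        System.out.println("proxy in use = " + proxyInUse(sample, "198.51.100.20"));
    }
}
```

In the original example, the `response` string would be passed to `extractOrigin` in place of the sample body.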
laical
21-06-22 16:52