How does the train collector collect JS pagination and click-to-load AJAX list content?

Collecting conventional articles is simple. List pages that use JS pagination, waterfall layouts, click-to-load JavaScript, or pull-to-load AJAX are much harder for the train collector, and many beginners do not know where to start.

Here, Chen Weiliang's blog shares how the train collector collects JS pagination and click-to-load AJAX lists.

Collecting content page URLs that are written in JS with the train collector

First, capture the network traffic of the target page (packet capture). On simple sites you can grab the JSON data directly. Harder sites require the POST method and also need cookies and random values filled in, for example Mogujie ("Mushroom Street") and many others.
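
As a rough illustration of the harder case, the sketch below builds (but does not send) a POST request that carries a cookie and a random value. The endpoint, the cookie string, the field names `page` and `_`, and the helper `build_post_request` are all hypothetical; the real names have to be read from the request you capture on the target site.

```python
import random
import time
import urllib.parse
import urllib.request

# Hypothetical endpoint and cookie, for illustration only.
API_URL = "https://example.com/api/list"
COOKIE = "sessionid=PLACEHOLDER"


def build_post_request(page):
    """Build (without sending) a POST request for one page of list data."""
    form = {
        "page": page,
        # Assumed cache-busting field: millisecond timestamp plus 3 random digits.
        "_": "{}{:03d}".format(int(time.time() * 1000), random.randint(0, 999)),
    }
    body = urllib.parse.urlencode(form).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Cookie": COOKIE,
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


request = build_post_request(2)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Printing the request before sending it makes it easy to compare the method, cookie, and form body with what the browser actually sent.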

  • Some JS pagination is simple enough to analyze without capturing packets. For example, the URL of the second page of Tencent Video search results contains cur=2.
  • The number after cur=, here 2, is the page number. Set it as the [address parameter] under "Start URL Addition Wizard" → "Batch URL" → "Address Format".
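
To make the batch URL idea concrete, here is a minimal sketch of what such an address format expands to. The function `expand_batch_urls` and the example.com address are assumptions for illustration; they are not the collector's internals or the real Tencent Video URL.

```python
PLACEHOLDER = "[address parameter]"


def expand_batch_urls(address_format, first, last, step=1):
    """Replace the placeholder with each page number from first to last."""
    if PLACEHOLDER not in address_format:
        raise ValueError("address format has no " + PLACEHOLDER)
    return [
        address_format.replace(PLACEHOLDER, str(number))
        for number in range(first, last + 1, step)
    ]


# Hypothetical search URL whose page number lives in the cur parameter.
urls = expand_batch_urls(
    "https://example.com/search?q=test&cur=[address parameter]", 1, 3
)
for url in urls:
    print(url)
```

Each generated address differs only in the value of cur, which is the same pattern the batch URL setting relies on.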

Below is a simple walkthrough of how the train collector obtains content loaded by JS calls, using Guokr (the "fruit shell" site) as an example.

How does the train collector obtain content loaded by JS calls?

First, you need the Chrome browser ▼

1. On the target page, press F12 or Ctrl+Shift+C to open Inspect Element, then click the Network tab ▼

2. Click the XHR button, then trigger the AJAX load on the page. The browser records the requests and data changes the page makes ▼

The red box is the address of the captured data ▲

3. Click the data address, and the detailed information appears on the right. Pay attention to the pattern in the request URL. For example, in the figure below it contains a timestamp and a page number ▼

4. Add the captured address to the train collector and set the address rule, then continue with the regular train collector settings ▼
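
The URL pattern found in steps 3 and 4 can be checked outside the collector with a short script. The sketch below takes a captured address, swaps in a new page number, and refreshes the timestamp. The captured URL, the parameter names `page` and `_`, and the helper `rebuild_url` are assumptions for illustration; use whatever names appear in the request you captured.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical captured address with a page number and a millisecond timestamp.
CAPTURED = "https://example.com/apis/articles.json?limit=20&page=2&_=1500000000000"


def rebuild_url(captured, page, page_key="page", ts_key="_", now_ms=None):
    """Return the captured URL with a new page number and a fresh timestamp."""
    parts = urlsplit(captured)
    # dict() keeps one value per key, which is enough for simple list APIs.
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    params[page_key] = str(page)
    if ts_key in params:
        if now_ms is None:
            now_ms = int(time.time() * 1000)
        params[ts_key] = str(now_ms)
    return urlunsplit(parts._replace(query=urlencode(params)))


for page in range(1, 4):
    print(rebuild_url(CAPTURED, page))
```

If the generated addresses return the same data in the browser as the original request did, the page number is the only part that needs to become the address parameter, and the timestamp can often be left fixed or dropped. Whether a given site accepts that is something to confirm by testing.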

