Since I started using LiteSpeed, I have always warmed up the cache from my local PC with PowerShell and cURL for Windows, because it is faster than every other method. For customers, I am currently checking whether the LSCache crawler meets their requirements. While reading the code of the crawler script, I found some weak spots that cause server load, reduce crawling speed, and generate too much traffic. Not all of my improvements can be applied everywhere because they depend on which cURL version is used; with version 7.29, for example, I can only add one additional header.
Two remarks on the changes listed below:
Adding the Accept-Encoding header forces LiteSpeed to generate the compressed version of the cached URLs. Otherwise the cached copies are uncompressed, and LiteSpeed must generate a second, compressed version in addition to the existing uncompressed one.
The --http2-prior-knowledge parameter was added in cURL 7.49.0 and doesn't work with lower versions.
My improvements:
*********************************
search for:
Code:
CURLRESULT=$(curl ${CURL_OPTS} -siLk -b name="${3}" -X GET -H "${1}" ${2} | tac | tac | sed '/Server: /q')
and replace with:
Code:
CURLRESULT=$(curl ${CURL_OPTS} -siLk -b name="${3}" -X GET -H "Accept-Encoding: gzip, deflate, br" -H "${1}" ${2} | tac | tac | sed '/Server: /q')
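To check whether a crawled URL really ends up as a compressed cache hit, something like the following can be used. It is only a rough sketch and not part of the crawler script: the function name summarize_headers is made up, and it assumes the server sends the usual x-litespeed-cache and content-encoding response headers.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the LSCache crawler script.
# Reads raw HTTP response headers on stdin and prints "<cache> <encoding>",
# for example "hit br". A missing header is reported as "none".
# If several responses are piped in (redirects), the last value wins.
summarize_headers() {
  awk -F': *' '
    { sub(/\r$/, "") }                                   # strip the CR of CRLF line endings
    tolower($1) == "x-litespeed-cache" { cache = tolower($2) }
    tolower($1) == "content-encoding"  { enc = tolower($2) }
    END { print (cache ? cache : "none"), (enc ? enc : "none") }
  '
}

# Usage against a real site (the URL is a placeholder):
#   curl -sk -o /dev/null -D - -H "Accept-Encoding: gzip, deflate, br" \
#     https://example.com/ | summarize_headers
```

Running it once with and once without the Accept-Encoding header shows whether both variants are served from cache.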
*********************************
The crawler tries to use HTTP/2 if the cURL version supports it, but for a reason I can't reproduce this doesn't work all the time, and the crawler then uses HTTP/1.1. To force HTTP/2, add this parameter:
Code:
--http2-prior-knowledge
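Because the parameter depends on the cURL version, the option could be chosen at runtime. The sketch below is only one way to do it and makes several assumptions: the function names version_ge and http2_opt_for are made up, the version thresholds are taken from the curl man page and can be adjusted in the two variables, and sort -V requires GNU coreutils.

```shell
#!/usr/bin/env bash
# Assumed thresholds (from the curl man page); adjust if your build differs.
MIN_HTTP2="7.33.0"                  # first version with --http2
MIN_HTTP2_PRIOR_KNOWLEDGE="7.49.0"  # first version with --http2-prior-knowledge

# Succeeds when dotted version $1 is greater than or equal to version $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Prints the HTTP/2 option suitable for the given cURL version,
# or an empty string when the version is too old for either option.
http2_opt_for() {
  if version_ge "$1" "$MIN_HTTP2_PRIOR_KNOWLEDGE"; then
    echo "--http2-prior-knowledge"
  elif version_ge "$1" "$MIN_HTTP2"; then
    echo "--http2"
  else
    echo ""
  fi
}

# Usage (CURL_OPTS is the variable already used by the crawler call above):
#   CURL_VERSION=$(curl --version | awk 'NR==1 {print $2}')
#   CURL_OPTS="${CURL_OPTS} $(http2_opt_for "${CURL_VERSION}")"
```

The version check alone is not enough if cURL was built without HTTP/2 support; in that case "HTTP2" is missing from the Features line of curl --version.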
*********************************
The function for mobile devices should be removed and completely rebuilt, because it is very inadequate. It is not difficult to write a device detection in a few lines of code that works in up to 99% of cases and can also differentiate between cell phones and tablets, but the current solution is unusable! If LiteSpeed needs support in defining such a detection, feel free to contact me.
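To give an idea of what a few lines of detection can look like, here is a rough sketch based on common User-Agent conventions. It is an illustration only, not the detection I use and not anything from the crawler script; the function name classify_ua and the keyword list are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: classifies a User-Agent string as
# "tablet", "phone", or "desktop" using common keyword conventions.
classify_ua() {
  local ua
  ua=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$ua" in
    *ipad*|*tablet*|*kindle*|*silk*)
      echo "tablet" ;;
    *android*)
      # Android phones normally send "Mobile", Android tablets do not.
      case "$ua" in
        *mobile*) echo "phone" ;;
        *)        echo "tablet" ;;
      esac ;;
    *iphone*|*ipod*|*mobi*|*"windows phone"*)
      echo "phone" ;;
    *)
      echo "desktop" ;;
  esac
}

# Usage:
#   classify_ua "Mozilla/5.0 (Linux; Android 13; Pixel 7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/110.0.0.0 Mobile Safari/537.36"
```

A keyword check like this cannot catch everything. Newer iPads, for example, can send a desktop Safari User-Agent by default and would be classified as desktop here.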
Michael