How to warm up LiteSpeed LScache of URLs that are not in the sitemap

LiteCache

How to warm up the cache of URLs that are not in the sitemap.

If you use one of the LiteSpeed Cache plugins for WordPress, OpenCart, PrestaShop or another platform, you are certainly familiar with the problem that the included cache crawlers can only warm up the cache for URLs that are in the sitemap. Since the sitemap only contains the URLs required for SEO purposes, it inevitably misses a lot of URLs, for example pagination and filter URLs or other URLs that carry a GET parameter.

For LScache, it doesn't matter whether a URL is listed in the sitemap. As long as the URL points to a dynamically generated PHP page, it can be cached. However, URLs that are missing from the sitemap cannot benefit from cache warmup: a page must first be requested before it can be cached, and the crawler never requests them.

However, this significant problem can be solved very easily.

Before I present the solution, I first want to describe another problem that is closely related to warming up URLs that are not in the sitemap.

Almost every user of LiteSpeed LScache believes that it is necessary to warm up the cache of all URLs from the sitemap. Unfortunately, this is a common misconception, and you can easily disprove it by analyzing your site's traffic. The analysis quickly shows that up to 70% of all URLs listed in the sitemap are either never or only very rarely requested. This applies not only to human visitors, but also to bots, especially Googlebot. Google crawls on demand. This means that Google does not crawl all URLs on a site and does not index all URLs from the sitemap, but only crawls the URLs for which there is interest. Efficiency is what counts at Google, so it tries to tell from the URL itself what the content is about. This means Google doesn't have to crawl a URL and analyze the content first to determine whether the content is worth including in the search index.
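To check this on your own site, you can count how often each URL appears in the web server access log and compare that with the sitemap. The following is only a minimal sketch: it assumes the access log uses the common/combined log format, and the names `count_requests`, `rarely_requested` and `min_hits` as well as the sample data are made up for illustration.

```python
import re
from collections import Counter

# Assumption: log lines contain the request as "GET /path HTTP/1.1"
# (common/combined log format). Only GET and HEAD requests are counted.
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+"')


def count_requests(log_lines):
    """Count requests per URL path (including the query string)."""
    counts = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def rarely_requested(sitemap_paths, counts, min_hits=1):
    """Return the sitemap paths that were requested fewer than min_hits times."""
    return [path for path in sitemap_paths if counts[path] < min_hits]


# Hypothetical sample data, for illustration only.
sample_log = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /product-a/ HTTP/1.1" 200 5120',
    '203.0.113.9 - - [10/Oct/2023:13:56:02 +0000] "GET /product-a/ HTTP/1.1" 200 5120',
    '203.0.113.7 - - [10/Oct/2023:13:57:11 +0000] "GET /category/?page=2 HTTP/1.1" 200 8300',
    '203.0.113.7 - - [10/Oct/2023:13:58:40 +0000] "POST /cart/ HTTP/1.1" 200 310',
]
sample_sitemap = ["/product-a/", "/product-b/", "/about/"]

counts = count_requests(sample_log)
unused = rarely_requested(sample_sitemap, counts)
print(f"{len(unused)} of {len(sample_sitemap)} sitemap URLs were never requested: {unused}")
```

In this sample, `/category/?page=2` is requested although it is not in the sitemap, while two of the three sitemap URLs are never requested at all.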

Why waste resources on cache warmup if no one benefits from it?

Given this, the logical conclusion is that warming up the cache of URLs is a waste of resources if users and bots never or only very rarely request these URLs. You don't have to reinvent the wheel to solve this problem. It is enough to copy Google's methodology and apply it to the cache warmup strategy. The result is a better, faster and resource-saving cache warmup without any disadvantages.

After all, cache warmup is a demanding process that puts a lot of strain on shared hosting in particular. Even on a dedicated server, cache warmup is a problem because it simply takes too long if all URLs have to be crawled on every run.
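One possible way to apply this on-demand idea is to build the warmup list from actual request counts instead of from the sitemap. The sketch below is not the solution described in the linked forum thread, only an illustration under stated assumptions: per-URL request counts are already available (for example from the access log), and `select_warmup_urls`, `min_hits`, `limit` and the sample numbers are hypothetical.

```python
def select_warmup_urls(hit_counts, min_hits=5, limit=1000):
    """Pick the URLs worth warming up: requested at least min_hits times,
    most requested first, capped at limit entries.

    hit_counts maps a URL path (with query string) to its request count.
    """
    popular = [(path, hits) for path, hits in hit_counts.items() if hits >= min_hits]
    # Sort by descending hits; ties are ordered by path to keep the result stable.
    popular.sort(key=lambda item: (-item[1], item[0]))
    return [path for path, _ in popular[:limit]]


# Hypothetical request counts, for illustration only.
sample_hits = {
    "/product-a/": 40,
    "/category/?page=2": 12,  # not in the sitemap, but requested often
    "/product-b/": 1,         # in the sitemap, but almost never requested
    "/about/": 0,
}

warmup_list = select_warmup_urls(sample_hits, min_hits=5)
print(warmup_list)
```

With this selection, a frequently requested pagination URL gets warmed up even though it is missing from the sitemap, while rarely requested sitemap URLs are skipped.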

So what is the solution?

Continue reading and find the solution in the LiteSpeed support forum for LiteSpeed Enterprise:
https://www.litespeedtech.com/suppo...hat-are-not-in-the-sitemap.22835/#post-128715