Please note that this page is no longer being updated. Kindly visit the new Cookiebot support area (https://support.cookiebot.com/hc/en-us) for updated guides, articles and FAQs. If you do not find the answer to your question there, you are welcome to post it in the Cookiebot community (https://support.cookiebot.com/hc/en-us/community/topics).
The Cookiebot scanner scans all the HTML content that a website user can access. This includes both static and dynamic pages, blog posts, images, embedded videos etc.
In addition, the scanner also scans any content on your website that you yourself link to. For example, if your sitemap links to a 404 (not available) page, then the scanner will scan that page as well. For each scan, a URL list is provided.
How can I see the URL list of the pages the scanner has found?
- If you have received a price quote from Cookiebot: attached to the email is a URL list of up to 5,000 of the subpages found by our scanner.
- If you have received an email saying your account has been upgraded from a free subscription to a 1-month free trial: attached to the email is a URL list of up to 200 of the subpages found by our scanner.
- If you have signed up for a Cookiebot subscription: you can find details and a URL list with up to 10,000 of the subpages that our scanner has found on the ‘Cookies’ page. Log into your account (https://manage.cookiebot.com/goto/login) and choose ‘Cookies’ in the top menu. When you click on the number of pages identified by the scanner, the URL list of subpages will download.
Our scanner simulates a number of website users who visit your website and perform every action available to them (clicking on all links, scrolling, accessing all pages, playing any embedded videos etc.). If an area of your website sits behind a login or a cookie wall, we can configure the scanner to scan behind it as well. To set this up, please send us the credentials (username and password) at firstname.lastname@example.org.
Why does the Cookiebot scanner include old test pages that are no longer displayed?
The Cookiebot scanner only finds pages that are publicly available (see above) and can be accessed by a website user. You may have test pages and old content that are no longer actively displayed on your website but have not been unpublished or removed. You may also still be linking to such content.
If you have a Cookiebot account, log in and go to ‘Cookies’ in the menu. Click on the hyperlink with the number of subpages on your domain. A URL list will download. The second column of this list, ‘FirstParentURL’, shows where a link to each page was first found.
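To trace where unwanted pages are linked from, the downloaded list can be grouped by its ‘FirstParentURL’ column. The following is a minimal sketch, not an official Cookiebot tool. It assumes the list is comma-separated text with a header row, and that the first column is named `URL`; that name and the sample rows are hypothetical, so adjust them to match your actual download.

```python
import csv
import io
from collections import defaultdict


def pages_by_first_parent(csv_text, url_column="URL", parent_column="FirstParentURL"):
    """Group scanned URLs by the page on which a link to them was first found.

    The column name "URL" is an assumption for illustration; only
    "FirstParentURL" is named in the article.
    """
    grouped = defaultdict(list)
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        url = (row.get(url_column) or "").strip()
        parent = (row.get(parent_column) or "").strip()
        if url:
            grouped[parent].append(url)
    return dict(grouped)


# Hypothetical sample data standing in for a downloaded URL list.
sample = """URL,FirstParentURL
https://example.com/,
https://example.com/blog,https://example.com/
https://example.com/old-test-page,https://example.com/blog
"""

for parent, urls in pages_by_first_parent(sample).items():
    print(parent or "(no parent recorded)", "->", urls)
```

In this sample, the old test page is grouped under the blog page, which tells you where to remove the link.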
The Cookiebot scanner is set up not to include 404 pages in the scan. If you link to such pages, however, this overrides the general rule, and the scanner will include those 404 pages in its scan.
The primary reason for this is that some websites serve 404 pages with content where cookies and trackers can also be in use (usually because the technical setup is not optimal).
There are multiple reasons why the number of pages found by Cookiebot can differ from the number of pages indexed by Google. One is that Google also indexes PDFs, Word and Excel documents and other attachments, whereas Cookiebot only scans HTML pages because those are the pages that can set cookies and online trackers.
Google does not scan all pages all the time, so its index is not necessarily up to date. The Cookiebot scan provides a ‘here and now’ picture of your website.
Also, Google may deal with the indexing of dynamic pages in a different way.
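To get a rough sense of how much of a difference attachments make, you can split a list of URLs into likely HTML pages and likely document attachments. This is only an illustrative sketch: it classifies by file extension, the extension set is an assumption, and it does not reflect how the Cookiebot scanner or Google actually classify content.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed set of attachment extensions (PDF, Word, Excel); extend as needed.
ATTACHMENT_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx"}


def is_attachment(url):
    """Return True if the URL path ends in one of the assumed attachment extensions."""
    path = urlparse(url).path
    return PurePosixPath(path).suffix.lower() in ATTACHMENT_EXTENSIONS


def split_urls(urls):
    """Split URLs into (likely pages, likely attachments), preserving order."""
    pages = [u for u in urls if not is_attachment(u)]
    attachments = [u for u in urls if is_attachment(u)]
    return pages, attachments


# Hypothetical URLs for illustration.
urls = [
    "https://example.com/",
    "https://example.com/about.html",
    "https://example.com/files/report.PDF",
    "https://example.com/files/prices.xlsx?download=1",
]

pages, attachments = split_urls(urls)
print(len(pages), "likely pages,", len(attachments), "likely attachments")
```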
If you would like to know how many pages (URLs) your website has before signing up for a subscription, you can request a free price quote, which includes the number of subpages, a URL list of up to 5,000 of the subpages identified and the quote itself: https://www.cookiebot.com/goto/quote-input/