Crawlers

Crawlers automatically gather content from your website to build your chatbot's knowledge. This page explains how crawlers work and what to keep in mind when using them.

How crawlers work

Automatic Setup on Bot Creation:

When you create your bot and enter your website URL, getchat also sets up a crawler. The crawler finds every accessible page on your domain and creates a website source for each one.

More about website sources

Automatic Source Creation:

When you add new pages to your website, the crawler detects them during its next automated crawl and creates website sources for them, so your bot stays up to date.

Things to consider

Domain Specificity:

Each crawler stays within a single domain or sub-domain. If your content is spread across several, such as blog.yourpage.com and yourpage.com, you need a separate crawler for each one.
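
The scope rule can be illustrated with a short Python sketch. It assumes a crawler treats a URL as in scope only when its hostname exactly matches the hostname of the start URL; the function name same_crawler_scope is hypothetical, and this is not getchat's actual implementation.

```python
from urllib.parse import urlparse


def same_crawler_scope(start_url: str, candidate_url: str) -> bool:
    """Return True if both URLs share the same hostname (assumed scope rule)."""
    start_host = urlparse(start_url).hostname
    candidate_host = urlparse(candidate_url).hostname
    return start_host is not None and start_host == candidate_host


# A page on the same host is in scope for the crawler.
print(same_crawler_scope("https://yourpage.com", "https://yourpage.com/pricing"))
# A sub-domain is a different host, so it would need its own crawler.
print(same_crawler_scope("https://yourpage.com", "https://blog.yourpage.com/post"))
```

Under this assumption, the first call prints True and the second prints False, which matches the need for one crawler per sub-domain.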

Adherence to Web Permissions:

Our crawlers respect web permissions and only access content that a website allows robots to fetch.

The getchat crawler identifies itself with the user agent getchat-crawler/1.0 (+https://getchat.org). You can use this identifier in your robots.txt file to allow or restrict our crawler's access.

For a comprehensive understanding of robots.txt and its setup, Google's documentation is an excellent resource: https://developers.google.com/search/docs/crawling-indexing/robots/create-robots-txt
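
To check how a robots.txt rule would affect the getchat crawler before publishing it, you can test it locally with Python's standard urllib.robotparser module. The sketch below makes two assumptions: the robots.txt content is a hypothetical example, and the rule is addressed to the token getchat-crawler, taken from the user agent string above. Python's matching logic may also differ in detail from the crawler's own.

```python
from urllib.robotparser import RobotFileParser

# User agent string as documented for the getchat crawler.
GETCHAT_USER_AGENT = "getchat-crawler/1.0 (+https://getchat.org)"

# Hypothetical robots.txt: keeps the getchat crawler out of /private/ only.
# The "getchat-crawler" token is an assumption based on the user agent string.
ROBOTS_TXT = """\
User-agent: getchat-crawler
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def getchat_may_fetch(url: str) -> bool:
    """Return True if the parsed rules allow the getchat user agent to fetch url."""
    return parser.can_fetch(GETCHAT_USER_AGENT, url)


print(getchat_may_fetch("https://yourpage.com/pricing"))
print(getchat_may_fetch("https://yourpage.com/private/report"))
```

With these example rules, the pricing page is allowed and the page under /private/ is blocked.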

Crawlers Based on Your Plan:

The number of crawlers at your disposal and the frequency of their automated crawls are determined by the getchat plan you've selected.

For a detailed overview of our plan specifications, please visit our pricing page.

Handling Excess Content:

If the content fetched during an automated update exceeds the character limit of your plan, you'll be notified promptly.

In that case, your bot keeps the training data it had before the limit was exceeded, so it continues to work without interruption while you decide on the next steps.
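
The fallback behavior can be sketched in Python as follows. This is an illustration only: the function name apply_crawl_update, the way characters are counted (the sum of page lengths), and the data shapes are all assumptions, not getchat's internal logic.

```python
def apply_crawl_update(
    current_data: list[str],
    fetched_pages: list[str],
    char_limit: int,
) -> tuple[list[str], bool]:
    """Return (training_data, limit_exceeded) after an automated crawl.

    Assumption: usage is measured as the total character count of all pages.
    """
    total_chars = sum(len(page) for page in fetched_pages)
    if total_chars > char_limit:
        # Over the limit: keep the previous training data and flag a notification.
        return current_data, True
    # Within the limit: the freshly fetched content becomes the training data.
    return fetched_pages, False


previous = ["Welcome page"]
fetched = ["Welcome page", "A" * 500]

data, limit_exceeded = apply_crawl_update(previous, fetched, char_limit=100)
print(data == previous, limit_exceeded)
```

Here the fetched content totals 512 characters against a limit of 100, so the previous data is kept and the flag is set.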

Questions and support

For help with crawlers, ask the getchat Bot or contact our support team.