the crawler · v1.0
CureshiBot
A submit-only crawler. We don't fetch your site unless a webmaster asks us to.
How it identifies itself
CureshiBot sets this user-agent string:
CureshiBot/1.0 (+https://cureshi.com/bot.html)
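If you want to spot CureshiBot in your own request logs, a minimal sketch is below. It assumes the user-agent header is sent exactly as shown above. The function name `is_cureshibot` is hypothetical and not part of any Cureshi tooling. User-agent strings can be spoofed, so treat this as a first-pass filter only.

```python
def is_cureshibot(user_agent: str) -> bool:
    """Return True if a User-Agent header value looks like CureshiBot's."""
    # Match on the product token only, so a later version number
    # (an assumption; only 1.0 is documented) would still match.
    return user_agent.strip().lower().startswith("cureshibot/")


print(is_cureshibot("CureshiBot/1.0 (+https://cureshi.com/bot.html)"))  # True
print(is_cureshibot("Mozilla/5.0 (X11; Linux x86_64)"))                 # False
```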
What it crawls
Three sources only:
- Sitemaps that a webmaster submits on cureshi.com/search/
- Single URLs that a webmaster submits through the console
- Headline RSS feeds we read for fresh news (BBC, Reuters, Al Jazeera, Guardian, AP): title and summary only, not full articles
What it stores
- Page title, meta description, h1/h2 headings, keywords
- Up to 8 image URLs per page (with alt text)
- Up to 3 video URLs per page (<video> tags or YouTube/Vimeo embeds)
We don't store full page bodies, copy your styling, or render JavaScript.
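To make the stored fields concrete, here is one way such a record could be modelled. This is an illustrative sketch, not CureshiBot's actual schema: the class name `PageRecord`, every field name, and the choice to keep the first entries when a page exceeds a cap are all assumptions. Only the field list and the caps of 8 and 3 come from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

MAX_IMAGES = 8  # per-page image cap stated above
MAX_VIDEOS = 3  # per-page video cap stated above


@dataclass
class PageRecord:
    """Hypothetical shape of one indexed page (names are illustrative)."""
    url: str
    title: str = ""
    meta_description: str = ""
    headings: List[str] = field(default_factory=list)  # h1/h2 text
    keywords: List[str] = field(default_factory=list)
    images: List[Tuple[str, str]] = field(default_factory=list)  # (url, alt text)
    videos: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumption: when a page has more media than the cap allows,
        # keep the first entries and drop the rest.
        self.images = self.images[:MAX_IMAGES]
        self.videos = self.videos[:MAX_VIDEOS]
```

Note that the record has no field for the page body, matching the statement that full page bodies are not stored.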
Robots.txt
CureshiBot respects robots.txt. To block it specifically:
User-agent: CureshiBot
Disallow: /
Or block a section:
User-agent: CureshiBot
Disallow: /private/
Disallow: /admin/
If you already block all bots with a User-agent: * rule (Disallow: /), CureshiBot stays away too. You don't need a separate rule.
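If you want to check your rules before publishing them, one option is Python's standard-library `urllib.robotparser`. The sketch below is a local check under the assumption that CureshiBot matches rules the way this parser does; the helper name `is_allowed` and the example.com URLs are placeholders.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "CureshiBot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


section_rules = """\
User-agent: CureshiBot
Disallow: /private/
Disallow: /admin/
"""

block_everyone = """\
User-agent: *
Disallow: /
"""

print(is_allowed(section_rules, "https://example.com/private/report.html"))  # False
print(is_allowed(section_rules, "https://example.com/blog/post.html"))       # True
print(is_allowed(block_everyone, "https://example.com/blog/post.html"))      # False
```

The last call shows the wildcard case: a blanket rule for all bots also covers CureshiBot.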
Rate
CureshiBot fetches 5 URLs at a time, on demand, when a webmaster submits a sitemap. There is no aggressive recurring crawl, so we won't hammer your server.
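As a rough picture of what "5 URLs at a time" could look like, the sketch below splits a submitted sitemap's URL list into groups of five. It is an illustration only: the helper name `batches` is hypothetical, and whether each group is fetched concurrently or with a pause in between is not stated here.

```python
from typing import Iterator, List

BATCH_SIZE = 5  # "5 URLs at a time", as stated above


def batches(urls: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


# Placeholder URLs standing in for the entries of a submitted sitemap.
sitemap_urls = ["https://example.com/page-%d" % i for i in range(12)]
for group in batches(sitemap_urls):
    print(len(group), group[0])
```

With 12 URLs this yields groups of 5, 5, and 2.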
Removing your content
- Sign in to the webmaster console and click "Unindex" on a page, or "Remove all" on a domain
- Or email noc@cureshi.com with the URL. It is removed within 24 hours
Contact
Questions or complaints: cureshi.com/contact · noc@cureshi.com