FeedHall

gigabot

Gigabot is the crawler used by Gigablast. The Gigablast search engine became open source around 2015 and has scaled to over 12 billion web pages on over 200 servers. Spidering performance is about 1 page per second per core (as of 2015).

Gigabot respects robots.txt by default, although this behavior can be explicitly disabled.
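As a rough illustration of what respecting robots.txt means for a site owner, the sketch below evaluates a robots.txt file for the user-agent token "Gigabot" using Python's standard urllib.robotparser. The robots.txt content, the example.com URLs, and the helper name gigabot_allowed are assumptions made for this example; the code shows how standard robots.txt rules are interpreted, not how Gigabot itself is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: Gigabot
Disallow: /private/

User-agent: *
Disallow:
"""


def gigabot_allowed(url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt permits 'Gigabot' to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Gigabot", url)


if __name__ == "__main__":
    print(gigabot_allowed("https://example.com/index.html"))
    print(gigabot_allowed("https://example.com/private/page.html"))
```

With the sample rules above, only paths under /private/ are blocked for Gigabot. A crawler operator who disables robots.txt handling would bypass such rules entirely.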

The open source Gigablast project is no longer actively maintained: https://github.com/gigablast/open-source-search-engine

The company Privacore made a heavily modified fork of the Gigablast search engine, called Findx (https://github.com/privacore/open-source-search-engine), but closed down in November 2018.

More technical details about the crawler and the search engine can be found in the archived FAQ (https://web.archive.org/web/20170724040342/http://www.gigablast.com/faq.html), the archived developer page (https://web.archive.org/web/20170720184221/http://www.gigablast.com/developer.html), or in the git repository.

Info

Regex: ^gigabot$

Aliases

  • GigablastOpenSource
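The regex and alias above could be applied to a bot name roughly as follows. This is a minimal sketch under stated assumptions: that the pattern is matched case-sensitively against a bare bot name (not a full User-Agent header), and that aliases are compared by exact string equality. The listing does not specify either point, and the function name is_gigabot is hypothetical.

```python
import re

# Pattern and alias copied from the listing; matching rules are assumptions.
GIGABOT_PATTERN = re.compile(r"^gigabot$")
ALIASES = {"GigablastOpenSource"}


def is_gigabot(name: str) -> bool:
    """Return True if name matches the listed regex or is a listed alias."""
    if GIGABOT_PATTERN.match(name):
        return True
    # Aliases are compared by exact equality (assumption).
    return name in ALIASES


if __name__ == "__main__":
    for candidate in ("gigabot", "GigablastOpenSource", "Gigabot/3.0"):
        print(candidate, is_gigabot(candidate))
```

Because the pattern is anchored at both ends, a versioned token such as "Gigabot/3.0" does not match under these assumptions. A consumer of the listing that matches case-insensitively or against full User-Agent strings would need to adapt the check.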