200pleasebot 200PleaseBot
360spider 360Spider
abot CrawlDaddy, abot
addthis AddThis
adldxbot Microsoft Bing Ads
admantx ADmantX Platform Semantic Analyzer
adsbot-google Google AdWords
adstxtcrawler AdsTxtCrawler
advbot AdvBot
ahrefsbot Ahrefs backlinks research tool
alexa Alexa Crawler
anderspink AndersPinkBot
apache-httpclient Java http library
apachebench ApacheBench (ab)
apis-google APIs-Google
appengine-google Google App Engine
applebot Apple Bot
archive.org_bot Internet Archive (archive.org)
archiveteam archivebot ArchiveTeam ArchiveBot
ask jeeves Ask Jeeves
asynchttpclient Java http and WebSocket client library
awe.sm Awe.sm URL expander
baidu Baidu
barkrowler Barkrowler
bdcbot Big Data Corp
bingbot Microsoft Bing
bingpreview Microsoft Bing preview
bitlybot bit.ly bot
blekkobot Blekkobot
blexbot BLEXBot (webmeup)
[email protected] Linkfluence bot
bubing BUbiNG
bufferbot BufferBot
buibui-checkbot buibui
butterfly Topsy Labs
buzzbot Buzzbot
buzztalk buzztalk
catchbot CatchBot (catchbot.com)
check_http Nagios monitor
chrome-lighthouse Chrome-Lighthouse
cipacrawler CipaCrawler
cliqzbot Cliqzbot
cloudflare CloudFlare-AlwaysOnline
cmradar/0.1 CMRadar/0.1
coldfusion ColdFusion http library
commoncrawl CCBot
comodo ssl checker COMODO SSL Checker
comodo-webinspector-crawler Comodo
copypants BotPants
crowsnest Crowsnest
curabot cura.yt
curl curl unix CLI http client
dap/nethttp DAP/NetHTTP
datafeedwatch DataFeedWatch
datagnionbot datagnion.com/bot.html
datanyze Datanyze
daumoa Korean portal and search engine indexing bot
developers.google.com/+/web/snippet/ Google Plus
diffbot Diffbot
digitalpersona fingerprint software HP Fingerprint scanner
domain re-animator bot Domain Re-Animator Bot
domainsbot DomainsBot
domaintunocrawler DomainTuno
dotbot Dot Bot
duckduckbot DuckDuckGo
elb-healthchecker AWS ELB HealthChecker
embedly Embedly
eoaagent EOAAgent
everyonesocialbot EveryoneSocial
evrinid Evri bot
exabot Exalead's bot
exaleadcloudview ExaleadCloudView
ez publish eZ Publish Link Validator
facebookexternalhit Facebook Bot
facebot Facebook Bot
feedburner RSS bot
feedfetcher-google Google Feedfetcher
findxbot Findxbot
flipboardproxy FlipboardProxy
friendfeedbot FriendFeed
fyrebot Fyrebot
garlik GarlikCrawler
genieo Genieo Web filter bot
germcrawler GermCrawler
getprismatic.com getprismatic.com
gigabot Gigabot spider
gimme60bot Gimme60 (gimme60.com)
gimmeusabot Gimme60 (gimme60.com)
go http package Go http library
go-http-client Go http client
google page speed insights Google Page Speed Insights
google web preview Google Instant Previews crawler
google-site-verification Google Site Verification
google-structured-data-testing-tool Google Structured Data Testing Tool
google-structureddatatestingtool Google Structured Data Testing Tool
google-xrawler Google Shopping
googlebot Google Bot
googleimageproxy Google Image Proxy
googlestackdrivermonitoring-uptimechecks Google Stackdriver Monitoring - Uptime Checks
grapeshotcrawler GrapeshotCrawler
gravitybot Gravity Bot
hatena::bookmark Hatena::Bookmark
heritrix heritrix
https://developers.google.com/+/web/snippet Google+ Snippet Fetcher
httrack HTTrack
hubspot HubSpot
ia_archiver Internet Archive (Wayback Machine)
icoreservice iCoreService
idmarch idmarch.org/bot.html
implisensebot ImplisenseBot
inagist URL resolver
insieve Insieve Bot
insitesbot Insitesbot
instapaper Instapaper
istellabot IstellaBot
jaunt Jaunt - Java Web Scraping & JSON Querying
jetslide Jetslide
jobseeker jobseeker.com.au/bot.html
jooble Jooble
js-kit URL resolver
kemvibot Kemvi
kimengi Kimengi Bot
knows.is knows.is
kojitsubot Kojitsubot
komodiabot KomodiaBot
kraken kraken
laconica Laconica
lijit crawler Lijit
linkdexbot Linkdex Bot
linkedinbot LinkedIn
linkscrawler LinksCrawler
linode Linode Longview
lipperhey Lipperhey
livelapbot Livelapbot
loadtimebot Load Time Bot
longurl URL expander service
ltx71 ltx71.com
lumibot Lumibot
magpie-crawler magpie-crawler
mail.ru_bot Mail.ru Bot
mappydata Mappy
mastodon Mastodon URL expander
mauibot MauiBot
meanpathbot meanpath
mediapartners-google Google Adsense bot
megaindex.ru MegaIndex
memorybot mignify.com/bot.html
metauri MetaURI
mfe_expand McAfee spider
mir web crawler MIR web crawler
mj12bot Majestic-12 spider
mojeekbot Mojeek UK search crawler
ms search 6.0 robot MS Search 6.0 Robot
msnbot-media Microsoft media bot
msnbot Microsoft bot
nerdybot NerdyBot
netcraft Netcraft
netstate netEstate NE Crawler
netvibes Personalized dashboard bot
netzcheckbot netzcheck
newrelicmonitor NewRelic monitor
newrelicpinger NewRelicPinger
newsme newsme
niki-bot niki-bot
ning NING - Yet Another Twitter Swarmer
nutch Apache search spider
openhosebot OpenHoseBot
orangebot OrangeBot
paessler paessler.com - PRTG Network Monitor
pagesinventory pagesinventory.com
panopta Monitoring service
paperlibot PaperLi
peerindex peerindex
percolatecrawler PercolateCrawler
perfectmarketkwtbot PerfectMarket
phantomjs PhantomJS
pingdom Pingdom monitoring
pinterest Pinterest
plukkie botje.com/plukkie.htm
pr-cy.ru PR-CY.RU
privacyawarebot PrivacyAwareBot
proximic Proximic Spider
psbot-page Picsearch
pu_in Pu_iN Crawler
publiclibraryarchive.org publiclibraryarchive.org
pycurl Python http library
python-httplib2 Python-httplib2
python-requests Python http library
python-urllib Python http library
queryseeker QuerySeekerSpider
quick-crawler Quick-Crawler
quicklook QuickLook
re-animator Domain Re-Animator Bot
readability Readability
rebelmouse RebelMouse
redditbot Reddit Bot
relateiq RelateIQ
riddler Riddler Bot
rogerbot SEOmoz spider
rssmicro RSS/Atom Feed Robot (rssmicro.com)
scouturlmonitor ScoutURLMonitor
scrapy Scrapy
screaming frog seo spider Screaming Frog SEO Spider
searchmetricsbot SearchmetricsBot
semanticbot Semanticbot
semrushbot SEO analysis bot
seo-audit seo-audit-check-bot
seobilitybot SeobilityBot
seodiver SEOdiver
seokicks SEOKicks
seznambot SeznamBot
shopwiki ShopWiki
shortlinktranslate Link shortener
showyoubot Showyou iOS app spider
siege Joe Dog Siege
sistrix SISTRIX
sitecheck SiteCheck sitecrawl
siteuptime Site monitoring services
skypeuripreview SkypeUriPreview
slack-imgproxy Slack Image Proxy
slack-linkexpanding Slack Link Expanding
slack Slack Link Expanding
slackbot Slackbot
slurp Yahoo spider
smtbot SimilarTech
snapchat Snapchat
socialrank SocialRankIOBot
sogou Chinese search engine
spbot OpenLinkProfiler
spinn3r Spinn3r aggregator
sputnikbot SputnikBot
squider Squider
statuscake StatusCake
stripe Stripe
swiftbot Swiftype Bot
tangibleebot TangibleeBot
teeraid TeeRaidBot
test certificate info C http library?
the knowledge ai Knowledge AI Bot
tineye TinEye Bot
traackr Traackr Bot
trendictionbot Trendiction Search
trendsmap Trendsmap Resolver
turnitinbot TurnitinBot
tweetedtimes The Tweeted Times
tweetmemebot TweetMeMe Crawler
twikle Social web search bot
twitjobsearch TwitJobSearch
twitmunin Twitmunin
twitterbot Twitter URL expander
twurly Twurly
typhoeus Typhoeus
umbot uberMetrics
unwindfetch Gnip
updown Updown.io monitor
uptimerobot Uptime Robot
vagabondo Vagabondo
vb project Visual Basic
vigil Vigil
vkshare VKontakte Sharer
voilabot VoilaBot
vrcrawler Venture Radar
wasalive-bot Wasalive Bots
watchsumo WatchSumo
wbsearchbot Ware Bay Best Buys
webceo online-webceo-bot
webscout Webscout
wesee WeSEE
wget wget unix CLI http client
whatsapp WhatsApp
wikido WikiDo
woorank WooRank
wordpress WordPress spider
woriobot woriobot
wormly WormlyBot
wotbox Wotbox
xenu link sleuth Xenu Link Sleuth
xing-contenttabreceiver Xing bot
xovibot XoviBot
yacybot YaCy
yahoo-ad-monitoring Yahoo Ad monitoring
yandex Yandex
yanga Yanga WorldSearch Bot
yeti Naver Corp
yourls YOURLS
zabbix Zabbix
zelist.ro feed parser
zibb ZIBB spider
zitebot Zite
zoombot ZoomBot
zoominfobot ZoominfoBot
zyborg Zyborg
amazonbot Amazon
anthropic-ai Anthropic-AI
applebot Apple
bytespider TikTok
ccbot Common Crawl
chatgpt-user ChatGPT
claude-web Anthropic-AI
cohere-ai Cohere
facebookbot Facebook
google-extended Google
googleother Google
gptbot ChatGPT
omgili Webz.io
perplexitybot Perplexity
webz.io Webz.io
youbot You.com
httpie HTTPie
eventmachine httpclient Ruby http library
go 1.1 package http Go 1.1 package http
htmlparser HTMLParser
http_request2 HTTP_Request2
httpclient HTTPClient
jakarta commons Jakarta Commons HttpClient
java Generic Java http library
libwww-perl Perl client-server library
lwp-trivial Another Perl library
ruby Ruby