Daniela und Frank Leyhe GbR
Alte Hellersdorfer Straße 141
Berlin
Germany / Berlin / Berlin 12629
Phone: +49 (0)30 889 454 24

robots.txt

Frank

Everyone should create a robots.txt. What goes into it can vary a lot from site to site. The Browser Capabilities Project recommends the following content:

User-agent: *CFNetwork*
User-agent: *Check&Get*
User-agent: *E-Mail Address Extractor*
User-agent: *grub*
User-agent: *heritrix*
User-agent: *HTTrack*
User-agent: *ickHTTP*
User-agent: *java*
User-agent: *Larbin*
User-agent: *libwww*
User-agent: *maxamine.com--robot*
User-agent: *MSIECrawler*
User-agent: *naver*
User-agent: *Netcraft Web Server Survey*
User-agent: *Netcraft Webserver Survey*
User-agent: *Nutch*
User-agent: *PhotoStickies/*
User-agent: *research*
User-agent: *squid*
User-agent: *TweakMASTER*
User-agent: *WebGrabber*
User-agent: *WinHttpRequest*
User-agent: *www4mail/*
User-agent: *Zeus*
User-agent: 12345
User-agent: 3D-FTP/*
User-agent: 3wGet/*
User-agent: 8484 Boston Project*
User-agent: A .NET Web Crawler
User-agent: A1 Website Download/1.* (*) miggibot
User-agent: abot/*
User-agent: AcadiaUniversityWebCensusClient
User-agent: ActiveRefresh*
User-agent: Ad Muncher*
User-agent: AideRSS/2.0 (aiderss.com)
User-agent: Amico Alpha * (*) Gecko/* AmicoAlpha/*
User-agent: AndroidDownloadManager
User-agent: annotate_google; http://ponderer.org/*
User-agent: Anonymisiert*
User-agent: Anonymizer/*
User-agent: Anonymizied*
User-agent: Anonymous*
User-agent: Anonymous/*
User-agent: Artera (Version *)
User-agent: Atomic_Email
User-agent: Atomic_Email_Hunter/*
User-agent: AutoHotkey
User-agent: AutoMate5
User-agent: b2w/*
User-agent: BackStreet Browser *
User-agent: BasicHTTP/*
User-agent: BDFetch
User-agent: Beamer*
User-agent: BilgiBot/*
User-agent: BitBeamer/*
User-agent: BitTorrent/*
User-agent: BlockNote.Net
User-agent: BlueCoat ProxySG
User-agent: bot/* (bot; *bot@bot.bot)
User-agent: Busiversebot/v1.0 (http://www.busiverse.com/bot.php)
User-agent: Camcrawler*
User-agent: CAST
User-agent: CazoodleBot/*
User-agent: CE-Preload
User-agent: CerberianDrtrs/*
User-agent: CFNetwork/*
User-agent: CFSCHEDULE*
User-agent: CherryPicker*/*
User-agent: Chilkat/*
User-agent: CMS crawler (?http://buytaert.net/crawler/)
User-agent: CobWeb/*
User-agent: Cocoal.icio.us/* (*)*
User-agent: ColdFusion*
User-agent: ContactBot/*
User-agent: copyright sheriff (*)
User-agent: CopyRightCheck*
User-agent: Crawl_Application
User-agent: CTerm/*
User-agent: curl*
User-agent: Custo*
User-agent: CyberPatrol*
User-agent: CydralSpider/*
User-agent: cz32ts
User-agent: DA *
User-agent: DataCha0s/*
User-agent: DataFountains/DMOZ Downloader*
User-agent: DeepIndexer*
User-agent: Der große BilderSauger*
User-agent: Desktop Sidebar*
User-agent: DISCo Pump *
User-agent: DomainsBotBot/1.*
User-agent: DotBot/* (http://www.dotnetdotcom.org/*)
User-agent: Download Demon*
User-agent: Download Express*
User-agent: Download Master*
User-agent: Download Ninja*
User-agent: Download Wonder*
User-agent: DownloadSession*
User-agent: e-SocietyRobot(http://www.yama.info.waseda.ac.jp/~yamana/es/)
User-agent: EasyDL/*
User-agent: eCatch*
User-agent: EmailCollector*
User-agent: EMAILsearcher
User-agent: EmailSiphon*
User-agent: EmailWolf*
User-agent: envolk/* (?http://www.envolk.com/envolk*)
User-agent: envolk?ITS?spider/* (?http://www.envolk.com/envolk*)
User-agent: Epsilon SoftWorks' MailMunky
User-agent: eStyleSearch * (compatible; MSIE 6.0; Windows NT 5.0)
User-agent: Exabot-Images/1.0
User-agent: Exabot-Test/*
User-agent: Exabot/2.0
User-agent: Exabot/3.0
User-agent: exactseek-pagereaper-* (crawler@exactseek.com)
User-agent: Exalead NG/*
User-agent: ExtractorPro*
User-agent: Extreme Picture Finder
User-agent: ezic.com http agent *
User-agent: FairAd Client*
User-agent: FANGCrawl/*
User-agent: favorstarbot/*
User-agent: FDM 1.x
User-agent: Feed::Find/*
User-agent: Feedfetcher-Google*
User-agent: Feedfetcher-Google-iGoogleGadgets*
User-agent: fetch libfetch/*
User-agent: FGet*
User-agent: findfiles.net/* (Robot;test_robot@gmx-topmail.de)
User-agent: Flaming AttackBot*
User-agent: FlashGet
User-agent: FLATARTS_FAVICO
User-agent: FollowSite.com (*)
User-agent: Foobot*
User-agent: Fooky.com/ScorpionBot/ScoutOut;*
User-agent: Forschungsportal/*
User-agent: FOTOCHECKER
User-agent: Franklin Locator*
User-agent: FreshDownload/*
User-agent: FyberSpider*
User-agent: GameSpyHTTP/*
User-agent: GetRight/*
User-agent: GetRightPro/*
User-agent: GetSmart/*
User-agent: gnome-vfs/*
User-agent: Go!Zilla*
User-agent: Go-Ahead-Got-It*
User-agent: Gozilla/*
User-agent: gsa-crawler*
User-agent: Gulper Web *
User-agent: GurujiBot/1.*
User-agent: Harvest/*
User-agent: Hatena Antenna/*
User-agent: Hatena Bookmark/*
User-agent: Hatena RSS/*
User-agent: Hatena::Crawler/*
User-agent: HatenaScreenshot*
User-agent: hcat/*
User-agent: Healthbot/Health_and_Longevity_Project_(HealthHaven.com)
User-agent: HiddenMarket-*
User-agent: hitcrawler_0.*
User-agent: HLoader
User-agent: Holmes/*
User-agent: HooWWWer/*
User-agent: HTML2JPG Blackbox, http://www.html2jpg.com
User-agent: HTMLParser/*
User-agent: http generic
User-agent: http://Anonymouse.org/*
User-agent: http://arachnode.net*
User-agent: http://hilfe.acont.de/bot.html ACONTBOT
User-agent: httpclient*
User-agent: httperf/*
User-agent: HTTPFetch/*
User-agent: HTTPGrab
User-agent: HttpSession
User-agent: httpunit/*
User-agent: HyperEstraier/*
User-agent: ia_archiver*
User-agent: ICE_GetFile
User-agent: IconSurf/2.*
User-agent: iCopyright Conductor*
User-agent: IE/6.01 (CP/M; 8-bit*)
User-agent: iexplore.exe
User-agent: iGetter/*
User-agent: Inet - Eureka App
User-agent: inetbot/* (?http://www.inetbot.com/bot.html)
User-agent: INetURL/*
User-agent: InetURL:/*
User-agent: InfociousBot (?http://corp.infocious.com/tech_crawler.php)
User-agent: Inne: Mozilla/4.0 (compatible; Cerberian Drtrs*)
User-agent: Internet Exploiter/*
User-agent: Internet Explore *
User-agent: Internet Explorer *
User-agent: Internet Ninja*
User-agent: InternetArchive/*
User-agent: IP*Works!*/*
User-agent: IPiumBot laurion(dot)com
User-agent: IRLbot/*
User-agent: IrssiUrlLog/*
User-agent: IWAgent/*
User-agent: JetBrains Omea Reader*
User-agent: JPluck/*
User-agent: JUST-CRAWLER(*)
User-agent: Kapere (http://www.kapere.com)
User-agent: KBeeBot/0.*
User-agent: Kevin http://*
User-agent: Kolinka Forum Search (www.kolinka.com)
User-agent: Kontiki Client*
User-agent: KRetrieve/
User-agent: Lachesis
User-agent: LeechFTP
User-agent: LeechGet*
User-agent: LetsCrawl.com/1.0*
User-agent: lftp/3.2.1
User-agent: libcurl-agent/*
User-agent: libWeb/clsHTTP*
User-agent: Liferea/1.* (Linux; *; http://liferea.sf.net/)
User-agent: LightningDownload/*
User-agent: Lincoln State Web Browser
User-agent: Link Valet Online*
User-agent: LinkextractorPro*
User-agent: Links4US-Crawler,*
User-agent: LMQueueBot/*
User-agent: LOOQ/0.1*
User-agent: Lorkyll *.* -- lorkyll@444.net
User-agent: Lsearch/sondeur
User-agent: LucidMedia ClickSense/4.?
User-agent: lwp*
User-agent: Made by ZmEu @ WhiteHat v0.* (www.WhiteHat.ro)
User-agent: MapoftheInternet.com?(?http://MapoftheInternet.com)
User-agent: MetaProducts Download Express/*
User-agent: metatagsdir/*
User-agent: MFC Foundation Class Library*
User-agent: MFC_Tear_Sample
User-agent: MFHttpScan
User-agent: Microsoft BITS/*
User-agent: Microsoft Data Access Internet Publishing Provider Cache Manager
User-agent: Microsoft Data Access Internet Publishing Provider DAV*
User-agent: Microsoft Data Access Internet Publishing Provider Protocol Discovery
User-agent: Microsoft Internet Explorer
User-agent: Microsoft Office Existence Discovery
User-agent: Microsoft Office Protocol Discovery
User-agent: Microsoft Office/* (*Picture Manager*)
User-agent: Microsoft URL Control*
User-agent: Microsoft Visio MSIE
User-agent: Microsoft Windows Network Diagnostics
User-agent: Microsoft-WebDAV-MiniRedir/*
User-agent: Missigua Locator*
User-agent: Mister PIX*
User-agent: Mono Browser Capabilities Updater*
User-agent: Moozilla
User-agent: Morfeus Fucking Scanner
User-agent: MovableType/*
User-agent: Mozilla/* (compatible; linktiger/*; *http://www.linktiger.com*)
User-agent: Mozilla/* (compatible; OffByOne; Windows*) Webster Pro V3.*
User-agent: Mozilla/* (TuringOS; Turing Machine; 0.0)
User-agent: Mozilla/0.9* no dos :) (Linux*)
User-agent: Mozilla/2.0 (compatible; NEWT ActiveX; Win32)
User-agent: Mozilla/3.0 (compatible; Indy Library)
User-agent: Mozilla/4.0 (compatible; Advanced Email Extractor*)
User-agent: Mozilla/4.0 (compatible; BorderManager*)
User-agent: Mozilla/4.0 (compatible; BOTW Spider; *http://botw.org)
User-agent: Mozilla/4.0 (compatible; Cerberian Drtrs*)
User-agent: Mozilla/4.0 (compatible; Getleft*)
User-agent: Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)
User-agent: Mozilla/4.0 (compatible; MSIE ?.0; SaferSurf*)
User-agent: Mozilla/4.0 (compatible; MSIE 4.01; Vonna.com b o t)
User-agent: Mozilla/4.0 (compatible; MSIE 6.0; Bluecoat DRTR)
User-agent: Mozilla/4.0 (compatible; Scumbot/*; Linux/*)
User-agent: Mozilla/4.0 (compatible; Spider; Linux)
User-agent: Mozilla/4.0 (compatible; Trend Micro tmdr 1.*
User-agent: Mozilla/4.0 (compatible; Win32)
User-agent: Mozilla/5.0 (*) Gecko/* Firefox/2.0 OneRiot/1.0 (http://www.oneriot.com)
User-agent: Mozilla/5.0 (*) VoilaBot*
User-agent: Mozilla/5.0 (*http://gnomit.com/) Gecko/* Gnomit/1.0
User-agent: Mozilla/5.0 (compatible; AboutUsBot/*)
User-agent: Mozilla/5.0 (compatible; archive.org_bot*)
User-agent: Mozilla/5.0 (compatible; BuzzRankingBot/*)
User-agent: Mozilla/5.0 (compatible; Charlotte/*; *)
User-agent: Mozilla/5.0 (compatible; ClixSense; http://www.clixsense.com/)
User-agent: Mozilla/5.0 (compatible; Crawly/1.*; +http://*/crawler.html)
User-agent: Mozilla/5.0 (compatible; del.icio.us-thumbnails/*; *) KHTML/* (like Gecko)
User-agent: Mozilla/5.0 (compatible; DKIMRepBot/*)
User-agent: Mozilla/5.0 (compatible; DotBot/*; http://www.dotnetdotcom.org/*)
User-agent: Mozilla/5.0 (compatible; Exabot-Images/3.0*)
User-agent: Mozilla/5.0 (compatible; Exabot/3.0*)
User-agent: Mozilla/5.0 (compatible; IPCheck Server Monitor*)
User-agent: Mozilla/5.0 (compatible; JadynAveBot; *http://www.jadynave.com/robot*
User-agent: Mozilla/5.0 (compatible; KaloogaBot; http://www.kalooga.com/info.html?page=crawler)
User-agent: Mozilla/5.0 (compatible; LegalAnalysisAgent/1.*; http://www.legalx.net)
User-agent: Mozilla/5.0 (compatible; MJ12bot/v1.*)
User-agent: Mozilla/5.0 (compatible; NetcraftSurveyAgent/1.0; *info@netcraft.com)
User-agent: Mozilla/5.0 (compatible; nextthing.org/*)
User-agent: Mozilla/5.0 (compatible; NGBot/*)
User-agent: Mozilla/5.0 (compatible; OsO;*
User-agent: Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/)
User-agent: Mozilla/5.0 (compatible; Scrubby/*; +http://www.scrubtheweb.com/abs/meta-check.html)
User-agent: Mozilla/5.0 (compatible; Seznam screenshot-generator 2.0;*)
User-agent: Mozilla/5.0 (compatible; Speedy Spider; http://www.entireweb.com/about/search_tech/speedy_spider/)
User-agent: Mozilla/5.0 (compatible; Theophrastus/*)
User-agent: Mozilla/5.0 (compatible; Twitturls; +http://twitturls.com)
User-agent: Mozilla/5.0 (compatible; Viralheat Bot/*)
User-agent: Mozilla/5.0 (compatible; Webbot/*)
User-agent: Mozilla/5.0 (compatible; Webscan v0.*; +http://otc.dyndns.org/webscan/)
User-agent: Mozilla/5.0 (compatible; YodaoBot/1.*)
User-agent: Mozilla/5.0 (compatible;YodaoBot-Image/1.*)
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X) Excel/12.*
User-agent: Mozilla/5.0 (Macintosh; U; *Mac OS X; *) AppleWebKit/* (*) Pandora/2.*
User-agent: Mozilla/5.0 (SnapPreviewBot) Gecko/* Firefox/*
User-agent: Mozilla/5.0 (Twiceler*)
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) Speedy Spider (http://www.entireweb.com/about/search_tech/speedy_spider/)
User-agent: Mozilla/5.0 gURLChecker/*
User-agent: mp3Spider cn-search-devel at yahoo-inc dot com
User-agent: MQbot*
User-agent: MSProxy/*
User-agent: Myzilla
User-agent: naoFavicon4IE*
User-agent: Net Vampire/*
User-agent: Net_Vampire*
User-agent: NetAnts*
User-agent: NetCarta_WebMapper/*
User-agent: Netchart Adv Crawler*
User-agent: NetID.com Bot*
User-agent: Netprospector*
User-agent: NetPumper*
User-agent: NetSucker*
User-agent: NetZip Downloader*
User-agent: NewsGator/*
User-agent: NextGenSearchBot*(for information visit *)
User-agent: NexTools WebAgent*
User-agent: NG-Search/*
User-agent: ng/*
User-agent: nicebot
User-agent: Nozilla/P.N (Just for IDS woring)
User-agent: NP/*
User-agent: NPBot*
User-agent: NSO_Debugger_User/2.0
User-agent: Nudelsalat/*
User-agent: Nutch/0.? (OpenX Spider)
User-agent: Nutscrape
User-agent: Nutscrape/* (CP/M; 8-bit*)
User-agent: NV32ts
User-agent: oBot
User-agent: OCN-SOC/*
User-agent: Offline Downloader*
User-agent: Offline Explorer*
User-agent: online link validator (http://www.dead-links.com/)
User-agent: Open Web Analytics Bot*
User-agent: Oracle Enterprise Search
User-agent: OSSProxy*
User-agent: OutfoxBot/*
User-agent: P3P Client
User-agent: PageDown*
User-agent: Pageload*
User-agent: PageNest/*
User-agent: Pajaczek/*
User-agent: panscient.com
User-agent: pavuk/*
User-agent: PEAR HTTP_Request*
User-agent: Pete-Spider/1.*
User-agent: PHP*
User-agent: PicaLoader*
User-agent: PigBlock (Windows NT 5.1; U)*
User-agent: pixfinder/*
User-agent: PlantyNet_WebRobot*
User-agent: PMAFind
User-agent: Pockey*
User-agent: POE-Component-Client-HTTP/*
User-agent: polybot?*
User-agent: Privoxy/*
User-agent: ProWebWalker*
User-agent: ProxyTester*
User-agent: Prozilla*
User-agent: psbot/* (?http://www.picsearch.com/bot.html)
User-agent: PycURL/*
User-agent: Python*
User-agent: QuickFinder Crawler
User-agent: Radiation Retriever*
User-agent: RealDownload/*
User-agent: RedCarpet/*
User-agent: RepoMonkey*
User-agent: RPT-HTTPClient/*
User-agent: rssImagesBot/0.1 (*http://herbert.groot.jebbink.nl/?app=rssImages)
User-agent: SBL-BOT*
User-agent: ScollSpider/2.*
User-agent: ScoutAbout*
User-agent: searchbot admin@google.com
User-agent: sEasyDL/*
User-agent: Seeker.lookseek.com
User-agent: SeznamBot/*
User-agent: shaboyi spider
User-agent: shareaza*
User-agent: Shelob (shelob@gmx.net)
User-agent: shelob v1.*
User-agent: sherlock/*
User-agent: Shim?Crawler*
User-agent: ShowXML/1.0 libwww/5.4.0
User-agent: SilentSurf*
User-agent: Site Valet Online*
User-agent: SiteParser/*
User-agent: SiteSnagger*
User-agent: SiteSucker/*
User-agent: SiteWinder*
User-agent: SlySearch/*
User-agent: SmallProxy*
User-agent: SmartDownload/*
User-agent: sna-0.0.*
User-agent: Snapbot/*
User-agent: Snoopy*
User-agent: SOFTWING_TEAR_AGENT*
User-agent: Sogou develop spider/*
User-agent: Sogou head spider*
User-agent: sogou js robot(*)
User-agent: Sogou Orion spider/*
User-agent: Sogou Pic Agent
User-agent: Sogou Pic Spider/*
User-agent: Sogou Push Spider/*
User-agent: sogou spider
User-agent: sogou web spider*
User-agent: Sogou-Test-Spider/*
User-agent: sohu*
User-agent: Space*Bison/*
User-agent: SpankBot*
User-agent: SpeedDownload/*
User-agent: Speedy Spider (http://www.entireweb.com/about/search_tech/speedy_spider/)
User-agent: spider (tspyyp@tom.com)
User-agent: Sqeobot/0.*
User-agent: SquigglebotBot/*
User-agent: Sqworm/*
User-agent: Star*Downloader/*
User-agent: Steeler/*
User-agent: STEROID Download
User-agent: Strategic Board Bot (?http://www.strategicboard.com)
User-agent: Sunrise/0.*
User-agent: SuperBot/*
User-agent: SuperHTTP/*
User-agent: Surf Knight
User-agent: SurfControl
User-agent: SurveyBot/*
User-agent: SynapticSearch/AI Crawler 1.?
User-agent: Taiga web spider
User-agent: Talkro Web-Shot/*
User-agent: Tarantula/*
User-agent: Tasap-image-robot/0.* (http://www.tasap.com)
User-agent: Tcl http client package*
User-agent: Teleport*
User-agent: TerrawizBot/*
User-agent: TheInformant*
User-agent: Theme Spider*
User-agent: Titanium 2005 (4.02.01)
User-agent: TMCrawler
User-agent: Toata dragostea*
User-agent: TurnitinBot/*
User-agent: TutorGigBot/*
User-agent: Tutorial Crawler*
User-agent: Twingly Recon
User-agent: Twisted PageGetter
User-agent: Twitturly*
User-agent: UofTDB_experiment* (leehyun@cs.toronto.edu)
User-agent: URI::Fetch/*
User-agent: URL2File/*
User-agent: User*Agent:*
User-agent: USER_AGENT
User-agent: USyd-NLP-Spider*
User-agent: UtilMind HTTPGet
User-agent: VCI WebViewer*
User-agent: Vegas95/*
User-agent: VengaBot/*
User-agent: virus_detector*
User-agent: vobsub
User-agent: wadaino.jp-crawler*
User-agent: WAP_Browser/5.0 (compatible; YodaoBot/1.*)
User-agent: Web Downloader*
User-agent: Web Downloader/*
User-agent: Web Image Collector*
User-agent: Web Magnet*
User-agent: WebAlta Crawler/*
User-agent: WebAuto/*
User-agent: webbandit/*
User-agent: Webclipping.com
User-agent: webcollage*
User-agent: WebCopier*
User-agent: WebCorp/*
User-agent: WebDownloader*
User-agent: WebEnhancer*
User-agent: WebFetch
User-agent: webfetch/*
User-agent: WebGatherer*
User-agent: WebGet
User-agent: WebImages * (?http://herbert.groot.jebbink.nl/?app=WebImages?)
User-agent: WebMiner*
User-agent: WebPix*
User-agent: WebReaper*
User-agent: WebRipper
User-agent: WebSauger*
User-agent: Website Downloader*
User-agent: Website eXtractor*
User-agent: Website Quester
User-agent: WebsiteExtractor*
User-agent: WebSnatcher*
User-agent: Webster Pro*
User-agent: WebStripper*
User-agent: WebWhacker*
User-agent: WebZIP*
User-agent: West Wind Internet Protocols*
User-agent: Wget*
User-agent: WinHttp*
User-agent: WinScripter iNet Tools
User-agent: WinTools
User-agent: WIRE/* (Linux*Bot,Robot,Spider,Crawler)
User-agent: WISEbot/*
User-agent: WordPress-B-/2.*
User-agent: WordPress-Do-P-/2.*
User-agent: woriobot*
User-agent: WWW-Mechanize/*
User-agent: wwwster/* (Beta, mailto:gue@cis.uni-muenchen.de)
User-agent: Xaldon WebSpider*
User-agent: Xenu* Link Sleuth*
User-agent: Xerka WebBot v1.*
User-agent: XSpider*
User-agent: Y!OASIS*
User-agent: Yahoo-MMCrawler*
User-agent: YodaoBot/*
User-agent: YodaoBot/1.* (*)
User-agent: YooW!/* (?http://www.yoow.eu)
User-agent: YRL_ODP_CRAWLER
User-agent: Zao-Crawler
User-agent: Zao/*
User-agent: Zend_Http_Client
User-agent: ZIBB Crawler (email address / WWW address)
Disallow: /
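
The entries above read like wildcard patterns rather than plain robot names. To see which entry a given User-Agent string would fall under, here is a minimal sketch that assumes `*` stands for any run of characters and `?` for a single character, matched case-insensitively against the whole string. Those matching rules, the function names, and the sample strings are assumptions for illustration; the source does not spell them out.

```python
import re

# A few entries copied from the list above.
PATTERNS = ["*HTTrack*", "Wget*", "curl*", "LucidMedia ClickSense/4.?"]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a wildcard pattern into a case-insensitive regex.

    Assumption: '*' means any run of characters, '?' exactly one character.
    Everything else is taken literally.
    """
    escaped = re.escape(pattern)
    escaped = escaped.replace(r"\*", ".*").replace(r"\?", ".")
    return re.compile(escaped, re.IGNORECASE | re.DOTALL)


def first_match(user_agent: str, patterns=PATTERNS):
    """Return the first pattern matching the whole User-Agent string, or None."""
    for pattern in patterns:
        if pattern_to_regex(pattern).fullmatch(user_agent):
            return pattern
    return None


if __name__ == "__main__":
    for ua in ("Wget/1.21.3", "Mozilla/5.0 (X11; Linux x86_64) Firefox/115.0"):
        print(ua, "->", first_match(ua))
```

A check like this only tells you which line of the list describes a client. It says nothing about whether that client reads robots.txt at all.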

Whether any of the user agents listed actually honors this and stays away is an open question, but including it certainly does no harm.
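
Compliance is up to each client, but you can at least check locally how a well-behaved parser reads such a file. Below is a minimal sketch using Python's standard `urllib.robotparser`. The shortened file content, the helper name `is_allowed`, and the example.com URL are assumptions for illustration. It uses plain product names because RFC 9309 defines the user-agent value as a product token (letters, `-`, `_`) or a lone `*`, so entries with embedded wildcards may not be interpreted as intended by every parser.

```python
from urllib.robotparser import RobotFileParser

# Shortened stand-in for the file above: several User-agent lines
# share one Disallow rule, just like the full list.
ROBOTS_TXT = """\
User-agent: HTTrack
User-agent: Wget
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for ua in ("Wget/1.21", "Mozilla/5.0"):
        print(ua, "allowed:", is_allowed(ua, "http://example.com/page"))
```

A client that is not named in any group and finds no `User-agent: *` group is treated as allowed, which is why the full list ends up blocking only the robots it names.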

Source: http://browsers.garykeith.com

Edited by: Frank on 07.01.2012 - 20:41
