""" robotparser.py

    Copyright (C) 2000  Bastian Kleineidam

    You can choose between two licenses when using this package:
    1) GNU GPLv2
    2) PSF license for Python 2.2

    The robots.txt Exclusion Protocol is implemented as specified in
    http://www.robotstxt.org/norobots-rfc.txt
"""

import collections
import re
import urllib.error
import urllib.parse
import urllib.request

__all__ = ["RobotFileParser"]

RequestRate = collections.namedtuple("RequestRate", "requests seconds")


def normalize(path):
    # Re-quote a path component so that equivalent percent-encodings compare
    # equal; surrogateescape keeps undecodable bytes round-trippable.
    unquoted = urllib.parse.unquote(path, errors="surrogateescape")
    return urllib.parse.quote(unquoted, errors="surrogateescape")


def normalize_path(path):
    # Normalize the path and, if present, each token of the query string,
    # leaving the '=' and '&' separators untouched.
    path, _, query = path.partition("?")
    path = normalize(path)
    if query:
        query = re.sub(r"[^=&]+", lambda m: normalize(m[0]), query)
        path = path + "?" + query
    return path


class RobotFileParser:
    """ This class provides a set of methods to read, parse and answer
    questions about a single robots.txt file.

    """

    def __init__(self, url=''):
        self.entries = []
        self.sitemaps = []
        self.default_entry = None
        self.disallow_all = False
        self.allow_all = False
        self.set_url(url)
        self.last_checked = 0

    def mtime(self):
        """Returns the time the robots.txt file was last fetched.

        This is useful for long-running web spiders that need to
        check for new robots.txt files periodically.

        """
        return self.last_checked

    def modified(self):
        """Sets the time the robots.txt file was last fetched to the
        current time.

        """
        import time
        self.last_checked = time.time()

    def set_url(self, url):
        """Sets the URL referring to a robots.txt file."""
        self.url = url
        self.host, self.path = urllib.parse.urlsplit(url)[1:3]

    def read(self):
        """Reads the robots.txt URL and feeds it to the parser."""
        try:
            f = urllib.request.urlopen(self.url)
        except urllib.error.HTTPError as err:
            if err.code in (401, 403):
                self.disallow_all = True
            elif err.code >= 400 and err.code < 500:
                self.allow_all = True
            err.close()
        else:
            raw = f.read()
            self.parse(raw.decode("utf-8").splitlines())

    def _add_entry(self, entry):
        if "*" in entry.useragents:
            # the default entry is considered last
            if self.default_entry is None:
                # the first default entry wins
                self.default_entry = entry
        else:
            self.entries.append(entry)

    def parse(self, lines):
        """Parse the input lines from a robots.txt file.

        We allow that a user-agent: line is not preceded by
        one or more blank lines.
        """
        # states:
        #   0: start state
        #   1: saw user-agent line
        #   2: saw an allow or disallow line
        state = 0
        entry = Entry()

        self.modified()
        for line in lines:
            if not line:
                if state == 1:
                    entry = Entry()
                    state = 0
                elif state == 2:
                    self._add_entry(entry)
                    entry = Entry()
                    state = 0
            # remove optional comment and strip line
            i = line.find('#')
            if i >= 0:
                line = line[:i]
            line = line.strip()
            if not line:
                continue
            line = line.split(':', 1)
            if len(line) == 2:
                line[0] = line[0].strip().lower()
                line[1] = line[1].strip()
                if line[0] == "user-agent":
                    if state == 2:
                        self._add_entry(entry)
                        entry = Entry()
                    entry.useragents.append(line[1])
                    state = 1
                elif line[0] == "disallow":
                    if state != 0:
                        entry.rulelines.append(RuleLine(line[1], False))
                        state = 2
                elif line[0] == "allow":
                    if state != 0:
                        entry.rulelines.append(RuleLine(line[1], True))
                        state = 2
                elif line[0] == "crawl-delay":
                    if state != 0:
                        # before trying to convert to int we need to make
                        # sure that robots.txt has valid syntax otherwise
                        # it will crash
                        if line[1].strip().isdigit():
                            entry.delay = int(line[1])
                        state = 2
                elif line[0] == "request-rate":
                    if state != 0:
                        numbers = line[1].split('/')
                        # check if all values are sane
                        if (len(numbers) == 2 and numbers[0].strip().isdigit()
                                and numbers[1].strip().isdigit()):
                            entry.req_rate = RequestRate(int(numbers[0]),
                                                         int(numbers[1]))
                        state = 2
                elif line[0] == "sitemap":
                    # According to http://www.sitemaps.org/protocol.html
                    # "This directive is independent of the user-agent line,
                    #  so it doesn't matter where you place it in your file."
                    # Therefore we do not change the state of the parser.
                    self.sitemaps.append(line[1])
        if state == 2:
            self._add_entry(entry)

    def can_fetch(self, useragent, url):
        """using the parsed robots.txt decide if useragent can fetch url"""
        if self.disallow_all:
            return False
        if self.allow_all:
            return True
        # Until the robots.txt file has been read or found not
        # to exist, we must assume that no url is allowable.
        # This prevents false positives when a user erroneously
        # calls can_fetch() before calling read().
        if not self.last_checked:
            return False
        # Normalize the URL the same way RuleLine normalizes its rule path so
        # that the prefix comparison in RuleLine.applies_to() is consistent.
        parsed_url = urllib.parse.urlsplit(url)
        url = urllib.parse.urlunsplit(
            ("", "", parsed_url.path, parsed_url.query, ""))
        url = normalize_path(url)
        if not url:
            url = "/"
        # search for given user agent matches
        # the first match counts
        for entry in self.entries:
            if entry.applies_to(useragent):
                return entry.allowance(url)
        # try the default entry last
        if self.default_entry:
            return self.default_entry.allowance(url)
        # agent not found ==> access granted
        return True

    def crawl_delay(self, useragent):
        if not self.mtime():
            return None
        for entry in self.entries:
            if entry.applies_to(useragent):
                return entry.delay
        if self.default_entry:
            return self.default_entry.delay
        return None

    def request_rate(self, useragent):
        if not self.mtime():
            return None
        for entry in self.entries:
            if entry.applies_to(useragent):
                return entry.req_rate
        if self.default_entry:
            return self.default_entry.req_rate
        return None

    def site_maps(self):
        if not self.sitemaps:
            return None
        return self.sitemaps

    def __str__(self):
        entries = self.entries
        if self.default_entry is not None:
            entries = entries + [self.default_entry]
        return '\n\n'.join(map(str, entries))


class RuleLine:
    """A rule line is a single "Allow:" (allowance==True) or "Disallow:"
       (allowance==False) followed by a path."""

    def __init__(self, path, allowance):
        if path == '' and not allowance:
            # an empty value means allow all
            allowance = True
        self.path = normalize_path(path)
        self.allowance = allowance

    def applies_to(self, filename):
        return self.path == "*" or filename.startswith(self.path)

    def __str__(self):
        return ("Allow" if self.allowance else "Disallow") + ": " + self.path


class Entry:
    """An entry has one or more user-agents and zero or more rulelines"""

    def __init__(self):
        self.useragents = []
        self.rulelines = []
        self.delay = None
        self.req_rate = None

    def __str__(self):
        ret = []
        for agent in self.useragents:
            ret.append(f"User-agent: {agent}")
        if self.delay is not None:
            ret.append(f"Crawl-delay: {self.delay}")
        if self.req_rate is not None:
            rate = self.req_rate
            ret.append(f"Request-rate: {rate.requests}/{rate.seconds}")
        ret.extend(map(str, self.rulelines))
        return '\n'.join(ret)

    def applies_to(self, useragent):
        """check if this entry applies to the specified agent"""
        # split the name token and make it lower case
        useragent = useragent.split("/")[0].lower()
        for agent in self.useragents:
            if agent == '*':
                # we have the catch-all agent
                return True
            agent = agent.lower()
            if agent in useragent:
                return True
        return False

    def allowance(self, filename):
        """Preconditions:
        - our agent applies to this entry
        - filename is URL encoded"""
        for line in self.rulelines:
            if line.applies_to(filename):
                return line.allowance
        return True