g8ddlmZddlZddlZddlZddlZddlZddlZddl Z ddl Z ddl m Z ddl mZddl mZddlmZee j&ZdZdZd Zd Zd Zd Zd ZdZeeeeefZeeefZeeefZheeeZej@ejBjEej@ejFjEej@ejHjEej@ejJjEeeZ&ddZ'ddZ(ddZ)ddZ*ddZ+d dZ, d!dZ-d"dZ.d#dZ/ej`dejbejdzZ3ej`dZ4d$dZ5d%dZ6y)&) annotationsN)IO) extensions) interpreters)licenses directorysymlinksocketfile executableznon-executabletextbinaryc tj|}|j}t j |rthSt j|rthSt j|rthSth}tj|tj}|r|jt n|jt"t%tj&j)|}t+|dkDr|j-|n8|r6t/|}t+|dkDr|j-t1|dt2|zs6t5|r|jt6n|jt8t2|zsJ|t:|zsJ||S#ttf$rt|dwxYw)N does not exist.r)oslstatOSError ValueErrorst_modestatS_ISDIR DIRECTORYS_ISLNKSYMLINKS_ISSOCKSOCKETFILEaccessX_OKadd EXECUTABLENON_EXECUTABLEtags_from_filenamepathbasenamelenupdateparse_shebang_from_filetags_from_interpreter ENCODING_TAGS file_is_textTEXTBINARY MODE_TAGS)r$srmodetagsr tshebangs @/opt/hc_python/lib/python3.12/site-packages/identify/identify.pytags_from_pathr5(sn4 XXd^ ::D ||D{ ||Dy }}Tx 6D4)J    277++D12A 1vz A -d3G7|a 1'!*=> 4    HHTN HHV  4 %%  t !T!  KQ Z 4D6!12334s F00GcXtjj|\}}tjj|\}}t }|g|jdzD]8}|t j vs|jt j |nt|dkDr}|ddj}|t jvr$|jt j||S|t jvr"|jt j||S)N.r) rr$splitsplitextsetrNAMESr'r&lower EXTENSIONSEXTENSIONS_NEED_BINARY_CHECK)r$_filenameextretparts r4r#r#Vs''--%KAx WW  h 'FAs %C X^^C00 :## # JJz''- . 1  3x!|!"gmmo *'' ' JJz,,S1 2 JJ;; ; JJz>>sC D Jc|jd\}}}|r=|tjvrtj|S|jd\}}}|r=tS)N/r7) rpartitionr INTERPRETERSr;) interpreterr@s r4r)r)lsb#..s3Aq+  ,33 3,,[9 9 + 6 6s ; KA  5LrEctgdttddzttddz}t|jdj d| S)zReturn whether the first KB of contents seems to be binary. This is roughly based on libmagic's binary/text detection: https://github.com/file/file/blob/df74b09b9027676088c797528edcaae5a9ce9ad0/src/encoding.c#L203-L228 )  iN) bytearrayrangeboolread translate)bytesio text_charss r4is_textr_ys` /0%d#$ %%e$% & GLL&00zBC CCrEctjj|st|dt |d5}t |cdddS#1swYyxYw)Nrrb)rr$lexistsropenr_)r$fs r4r+r+sD 77??4 D6!1233 dD Qqz   s AAcl tj|S#t$r|jcYSwxYw)N)shlexr9r)lines r4_shebang_splitrhs5{{4  zz|s 33cr|jddk(r|j} |jd}|D]}|tvs |cSt t |j}t|ddD]\}}|dk7r ||dzf}|jddk(r|S#t$r|cYSwxYw)N#!UTF-8z-ir8) r[readlinedecodeUnicodeDecodeError printabletuplerhstrip enumerate)r]cmd next_line_b next_linec line_tokensitokens r4_parse_nix_shebangr|s ,,q/U "&&(  #**73IA ! N9??+<=> !+cr"23HAu}q1u%'C 4 ,,q/U "" J" J sB(( B65B6cP|jddk7ry|j} |jd}|D] }|tvs yt t |j}|dddk(r|dd}n |dddk(r|dd}|d k(r t||S|S#t$rYywxYw) z8Parse the shebang from a file opened for reading binary.rjrkrlN) /usr/bin/envz-Sr8)r)z nix-shell) r[rnrorprqrrrhrsr|)r] first_line_b first_linerxrus r4 parse_shebangrs||A%##%L!((1  I  z//12 3C 2Aw((!"g Ra% %!"g n!'3// J# sB B%$B%cjtjj|st|dtj|tj sy t |d5}t|cdddS#1swYyxYw#t$r(}|jtjk(rYd}~yd}~wwxYw)z$Parse the shebang given a file path.rr~raN) rr$rbrrrrcrrerrnoEINVAL)r$rdes r4r(r(s 77??4 D6!1233 99T277 # $  #    77ell "  s< B A5+ B5A>:B>B B2 B-,B--B2z^\s*(Copyright|\(C\)) .*$z\s+cztjd|}tjd|}|jS)N ) COPYRIGHT_REsubWS_RErs)ss r4 _norm_licensers0QA #qA 779rEcddl}t|d5}|j}dddt}tj }d}t jdt|z}tjD]n\}} t| } || k(r|cS|r0tt|t| z t|z dkDrL|j|| |} | |kse| |ksk| }|}p|r||kr|Sy#1swYxYw)aReturn the spdx id for the license contained in `filename`. If no license is detected, returns `None`. spdx: https://spdx.org/licenses/ licenses from choosealicense.com: https://github.com/choosealicense.com Approximate algorithm: 1. strip copyright line 2. normalize whitespace (replace all whitespace with a single space) 3. check exact text match with existing licenses 4. failing that use edit distance rNrl)encodingrg?) ukkonenrcr[rsysmaxsizemathceilr&rLICENSESabsdistance) rArrdcontentsnorm min_edit_distmin_edit_dist_spdxcutoffspdxr norm_license edit_dists r4 license_idrs h )Q668 *  "DKKM YYsSY 'F'' d$T* < K CD C $556TBSH $$T<@ v )m";%M!% ( &!!= * )s C44C=)r$strreturnset[str])rJrrr)r] IO[bytes]rrZ)r$rrrZ)rgrrz list[str])r]rrutuple[str, ...]rr)r]rrr)r$rrr)rrrr)rArrz str | None)7 __future__rrros.pathrrerfrstringrtypingridentifyrridentify.vendorr frozensetrqrrrrr!r"r,r- TYPE_TAGSr.r* _ALL_TAGSr'r>valuesr?r<rIALL_TAGSr5r#r)r_r+rhr|rr(compileI MULTILINErrrrr~rEr4rs"  !$ f&& '      !   y$8 9 z>2 3 64.) 4i 4) 4m 4  *''..01 *99@@BC *""))+, ,++2245 Y +\,  D  06"rzz6r||8KL  6 .rE