Current File : //lib/python2.7/site-packages/pip/_vendor/chardet/charsetprober.pyc

# Source form of the compiled module above. Path recorded in the bytecode:
# /usr/lib/python2.7/site-packages/pip/_vendor/chardet/charsetprober.py
import logging
import re

from .enums import ProbingState


class CharSetProber(object):

    SHORTCUT_THRESHOLD = 0.95

    def __init__(self, lang_filter=None):
        self._state = None
        self.lang_filter = lang_filter
        self.logger = logging.getLogger(__name__)

    def reset(self):
        self._state = ProbingState.DETECTING

    @property
    def charset_name(self):
        return None

    def feed(self, buf):
        pass

    @property
    def state(self):
        return self._state

    def get_confidence(self):
        return 0.0

    @staticmethod
    def filter_high_byte_only(buf):
        # Collapse every run of 7-bit ASCII bytes into a single space.
        buf = re.sub(b'([\x00-\x7F])+', b' ', buf)
        return buf

    @staticmethod
    def filter_international_words(buf):
        """
        We define three types of bytes:
        alphabet: English letters [a-zA-Z]
        international: international characters [\x80-\xFF]
        marker: everything else [^a-zA-Z\x80-\xFF]

        The input buffer can be thought to contain a series of words delimited
        by markers. This function keeps only the words that contain at least
        one international character. All contiguous sequences of markers are
        replaced by a single ASCII space character.

        This filter applies to all scripts which do not use English characters.
        """
        filtered = bytearray()

        # Each match is a word with at least one international byte,
        # optionally followed by one trailing marker byte.
        words = re.findall(b'[a-zA-Z]*[\x80-\xFF]+[a-zA-Z]*[^a-zA-Z\x80-\xFF]?',
                           buf)

        for word in words:
            filtered.extend(word[:-1])

            # A trailing marker is replaced by a space.
            last_char = word[-1:]
            if not last_char.isalpha() and last_char < b'\x80':
                last_char = b' '
            filtered.extend(last_char)

        return filtered

    @staticmethod
    def filter_with_english_letters(buf):
        """
        Returns a copy of ``buf`` that retains only the sequences of English
        alphabet and high byte characters that are not between <> characters.
        Also retains English alphabet and high byte characters immediately
        before occurrences of >.

        This filter can be applied to all scripts which contain both English
        characters and extended ASCII characters, but is currently only used by
        ``Latin1Prober``.
        """
        filtered = bytearray()
        in_tag = False
        prev = 0

        for curr in range(len(buf)):
            # Slice so that a one-byte string is compared, not an int.
            buf_char = buf[curr:curr + 1]
            if buf_char == b'>':
                in_tag = False
            elif buf_char == b'<':
                in_tag = True

            # An ASCII non-letter ends the current run of kept characters.
            if buf_char < b'\x80' and not buf_char.isalpha():
                if curr > prev and not in_tag:
                    filtered.extend(buf[prev:curr])
                    filtered.extend(b' ')
                prev = curr + 1

        # Keep the trailing run unless the buffer ends inside a tag.
        if not in_tag:
            filtered.extend(buf[prev:])

        return filtered

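The two filter docstrings describe byte-level behaviour that is easier to follow on a concrete buffer. The following is a minimal Python 3 sketch, not the vendored module itself: `keep_international_words` and `keep_text_outside_tags` are hypothetical standalone names chosen for this example, and their logic is assumed to mirror what the docstrings of `filter_international_words` and `filter_with_english_letters` describe.

```python
import re

# Hypothetical standalone helpers for illustration only; they are not part
# of chardet. A "word" is a run of letters containing at least one high
# byte, optionally followed by a single marker byte.
_INTL_WORD = re.compile(rb'[a-zA-Z]*[\x80-\xFF]+[a-zA-Z]*[^a-zA-Z\x80-\xFF]?')


def keep_international_words(buf):
    """Keep only words that contain at least one byte in 0x80-0xFF."""
    filtered = bytearray()
    for word in _INTL_WORD.findall(buf):
        filtered.extend(word[:-1])
        last_char = word[-1:]
        # A trailing marker (ASCII, non-letter) becomes a single space.
        if not last_char.isalpha() and last_char < b'\x80':
            last_char = b' '
        filtered.extend(last_char)
    return filtered


def keep_text_outside_tags(buf):
    """Keep runs of letters and high bytes that are not inside <...>."""
    filtered = bytearray()
    in_tag = False
    prev = 0
    for curr in range(len(buf)):
        buf_char = buf[curr:curr + 1]  # one-byte slice, not an int
        if buf_char == b'>':
            in_tag = False
        elif buf_char == b'<':
            in_tag = True
        # An ASCII non-letter ends the current run.
        if buf_char < b'\x80' and not buf_char.isalpha():
            if curr > prev and not in_tag:
                filtered.extend(buf[prev:curr])
                filtered.extend(b' ')
            prev = curr + 1
    if not in_tag:
        filtered.extend(buf[prev:])
    return filtered


# Pure-ASCII words ("au", "lait") are dropped; markers become spaces.
words_only = keep_international_words(b'caf\xe9 au lait, na\xefve!')

# The tag name "b" survives because it sits immediately before ">",
# while "bold" is dropped because "<" flips the tag state before the
# run is flushed.
outside_tags = keep_text_outside_tags(b'caf\xe9 <b>bold</b> na\xefve')
```

In this sketch `words_only` holds `b'caf\xe9 na\xefve '` and `outside_tags` holds `b'caf\xe9 b b na\xefve'`. The second result shows the edge case the docstring mentions: characters immediately before `>` are retained, and, as a side effect of the same state handling, text that runs directly into `<` is discarded.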