# Current File: /opt/alt/python311/lib64/python3.11/urllib/__pycache__/robotparser.cpython-311.pyc

""" robotparser.py

    Copyright (C) 2000  Bastian Kleineidam

    You can choose between two licenses when using this package:
    1) GNU GPLv2
    2) PSF license for Python 2.2

    The robots.txt Exclusion Protocol is implemented as specified in
    http://www.robotstxt.org/norobots-rfc.txt
"""

import collections
import urllib.parse
import urllib.request

__all__ = ["RobotFileParser"]

RequestRate = collections.namedtuple("RequestRate", "requests seconds")


class RobotFileParser:
    """ This class provides a set of methods to read, parse and answer
    questions about a single robots.txt file.

    """

    def __init__(self, url=''):
        self.entries = []
        self.sitemaps = []
        self.default_entry = None
        self.disallow_all = False
        self.allow_all = False
        self.set_url(url)
        self.last_checked = 0

    def mtime(self):
        """Returns the time the robots.txt file was last fetched.

        This is useful for long-running web spiders that need to
        check for new robots.txt files periodically.

        """
        return self.last_checked

    def modified(self):
        """Sets the time the robots.txt file was last fetched to the
        current time.

        """
        import time
        self.last_checked = time.time()

    def set_url(self, url):
        """Sets the URL referring to a robots.txt file."""
        self.url = url
        self.host, self.path = urllib.parse.urlparse(url)[1:3]

    def read(self):
        """Reads the robots.txt URL and feeds it to the parser."""
        try:
            f = urllib.request.urlopen(self.url)
        except urllib.error.HTTPError as err:
            if err.code in (401, 403):
                self.disallow_all = True
            elif err.code >= 400 and err.code < 500:
                self.allow_all = True
        else:
            raw = f.read()
            self.parse(raw.decode("utf-8").splitlines())

    def _add_entry(self, entry):
        if "*" in entry.useragents:
            # the default entry is considered last
            if self.default_entry is None:
                # the first default entry wins
                self.default_entry = entry
        else:
            self.entries.append(entry)

    def parse(self, lines):
        """Parse the input lines from a robots.txt file.

        We allow that a user-agent: line is not preceded by
        one or more blank lines.
        """
        # states:
        #   0: start state
        #   1: saw user-agent line
        #   2: saw an allow or disallow line
        state = 0
        entry = Entry()

        self.modified()
        for line in lines:
            if not line:
                if state == 1:
                    entry = Entry()
                    state = 0
                elif state == 2:
                    self._add_entry(entry)
                    entry = Entry()
                    state = 0
            # remove optional comment and strip line
            i = line.find('#')
            if i >= 0:
                line = line[:i]
            line = line.strip()
            if not line:
                continue
            line = line.split(':', 1)
            if len(line) == 2:
                line[0] = line[0].strip().lower()
                line[1] = urllib.parse.unquote(line[1].strip())
                if line[0] == "user-agent":
                    if state == 2:
                        self._add_entry(entry)
                        entry = Entry()
                    entry.useragents.append(line[1])
                    state = 1
                elif line[0] == "disallow":
                    if state != 0:
                        entry.rulelines.append(RuleLine(line[1], False))
                        state = 2
                elif line[0] == "allow":
                    if state != 0:
                        entry.rulelines.append(RuleLine(line[1], True))
                        state = 2
                elif line[0] == "crawl-delay":
                    if state != 0:
                        # verify the value is a valid integer before
                        # converting, otherwise parsing would crash
                        if line[1].strip().isdigit():
                            entry.delay = int(line[1])
                        state = 2
                elif line[0] == "request-rate":
                    if state != 0:
                        numbers = line[1].split('/')
                        # check if all values are sane
                        if (len(numbers) == 2 and numbers[0].strip().isdigit()
                                and numbers[1].strip().isdigit()):
                            entry.req_rate = RequestRate(int(numbers[0]),
                                                         int(numbers[1]))
                        state = 2
                elif line[0] == "sitemap":
                    # the Sitemap directive is independent of the
                    # user-agent line, so it does not change the state
                    self.sitemaps.append(line[1])
        if state == 2:
            self._add_entry(entry)

    def can_fetch(self, useragent, url):
        """using the parsed robots.txt decide if useragent can fetch url"""
        if self.disallow_all:
            return False
        if self.allow_all:
            return True
        # Until the robots.txt file has been read or found not to exist,
        # we must assume that no url is allowable.
        if not self.last_checked:
            return False
        # search for given user agent matches; the first match counts
        parsed_url = urllib.parse.urlparse(urllib.parse.unquote(url))
        url = urllib.parse.urlunparse(('', '', parsed_url.path,
            parsed_url.params, parsed_url.query, parsed_url.fragment))
        url = urllib.parse.quote(url)
        if not url:
            url = "/"
        for entry in self.entries:
            if entry.applies_to(useragent):
                return entry.allowance(url)
        # try the default entry last
        if self.default_entry:
            return self.default_entry.allowance(url)
        # agent not found ==> access granted
        return True

    def crawl_delay(self, useragent):
        if not self.mtime():
            return None
        for entry in self.entries:
            if entry.applies_to(useragent):
                return entry.delay
        if self.default_entry:
            return self.default_entry.delay
        return None

    def request_rate(self, useragent):
        if not self.mtime():
            return None
        for entry in self.entries:
            if entry.applies_to(useragent):
                return entry.req_rate
        if self.default_entry:
            return self.default_entry.req_rate
        return None

    def site_maps(self):
        if not self.sitemaps:
            return None
        return self.sitemaps

    def __str__(self):
        entries = self.entries
        if self.default_entry is not None:
            entries = entries + [self.default_entry]
        return '\n\n'.join(map(str, entries))


class RuleLine:
    """A rule line is a single "Allow:" (allowance==True) or "Disallow:"
       (allowance==False) followed by a path."""
    def __init__(self, path, allowance):
        if path == '' and not allowance:
            # an empty value means allow all
            allowance = True
        path = urllib.parse.urlunparse(urllib.parse.urlparse(path))
        self.path = urllib.parse.quote(path)
        self.allowance = allowance

    def applies_to(self, filename):
        return self.path == "*" or filename.startswith(self.path)

    def __str__(self):
        return ("Allow" if self.allowance else "Disallow") + ": " + self.path


class Entry:
    """An entry has one or more user-agents and zero or more rulelines"""
    def __init__(self):
        self.useragents = []
        self.rulelines = []
        self.delay = None
        self.req_rate = None

    def __str__(self):
        ret = []
        for agent in self.useragents:
            ret.append(f"User-agent: {agent}")
        if self.delay is not None:
            ret.append(f"Crawl-delay: {self.delay}")
        if self.req_rate is not None:
            rate = self.req_rate
            ret.append(f"Request-rate: {rate.requests}/{rate.seconds}")
        ret.extend(map(str, self.rulelines))
        return '\n'.join(ret)

    def applies_to(self, useragent):
        """check if this entry applies to the specified agent"""
        # split the name token and make it lower case
        useragent = useragent.split("/")[0].lower()
        for agent in self.useragents:
            if agent == '*':
                # we have the catch-all agent
                return True
            agent = agent.lower()
            if agent in useragent:
                return True
        return False

    def allowance(self, filename):
        """Preconditions:
        - our agent applies to this entry
        - filename is URL decoded"""
        for line in self.rulelines:
            if line.applies_to(filename):
                return line.allowance
        return True
