
Python Requests Tutorial
Last updated on 12th Oct 2020, Blog, Tutorials
HTTP (HyperText Transfer Protocol) follows a client-server model. Usually the web browser is the client and the computer hosting the website is the server. In Python, the requests module is commonly used to create HTTP requests. It is a powerful module that handles many aspects of HTTP communication beyond simple request and response data, such as authentication, compression/decompression, and chunked requests.
An HTTP client sends an HTTP request to a server in the form of a request message with the following format:
- A Request-line
- Zero or more header (General|Request|Entity) fields followed by CRLF
- An empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields
- Optionally a message-body
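To make the format above concrete, here is a minimal sketch that assembles a request message as plain text. The helper name build_request is our own, not part of any library, and the host and path are the W3C example values used later in this tutorial.

```python
CRLF = "\r\n"

def build_request(method, request_uri, headers, body=""):
    """Assemble a raw HTTP/1.1 request message as text (illustrative helper)."""
    request_line = f"{method} {request_uri} HTTP/1.1"
    header_lines = [f"{name}: {value}" for name, value in headers.items()]
    # Request-line, header fields, an empty line, then the optional body.
    return CRLF.join([request_line, *header_lines, "", body])

message = build_request("GET", "/pub/WWW/TheProject.html", {"Host": "www.w3.org"})
print(repr(message))
```

The empty string in the joined list produces the blank line that marks the end of the header fields.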
The following sections explain each of the entities used in an HTTP request message.
Request-Line
The Request-Line begins with a method token, followed by the Request-URI and the protocol version, and ending with CRLF. The elements are separated by space SP characters.
Request-Line = Method SP Request-URI SP HTTP-Version CRLF
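The grammar above can be read back with a simple split on the space character. This is only a sketch; parse_request_line is a hypothetical helper name, and a real server would validate each part more strictly.

```python
def parse_request_line(line):
    """Split a Request-Line into (method, request_uri, http_version)."""
    # Drop the trailing CRLF, then split on the single SP separators.
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed Request-Line: {line!r}")
    method, request_uri, version = parts
    return method, request_uri, version

print(parse_request_line("GET /pub/WWW/TheProject.html HTTP/1.1\r\n"))
```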
Let’s discuss each of the parts mentioned in the Request-Line.
Request Method
The request method indicates the method to be performed on the resource identified by the given Request-URI. The method is case-sensitive and should always be mentioned in uppercase. The following table lists all the supported methods in HTTP/1.1.
Method | Description |
---|---|
GET | The GET method is used to retrieve information from the given server using a given URI. Requests using GET should only retrieve data and should have no other effect on the data. |
HEAD | Same as GET, but it transfers the status line and the header section only. |
POST | A POST request is used to send data to the server, for example, customer information, file upload, etc. using HTML forms. |
PUT | Replaces all the current representations of the target resource with the uploaded content. |
DELETE | Removes all the current representations of the target resource given by URI. |
CONNECT | Establishes a tunnel to the server identified by a given URI. |
OPTIONS | Describes the communication options for the target resource. |
TRACE | Performs a message loop back test along with the path to the target resource. |
Request-URI
The Request-URI is a Uniform Resource Identifier and identifies the resource upon which to apply the request. The following are the most commonly used forms to specify a URI:
Request-URI = "*" | absoluteURI | abs_path | authority
Form | Description |
---|---|
* | The asterisk * is used when an HTTP request does not apply to a particular resource, but to the server itself, and is only allowed when the method used does not necessarily apply to a resource. For example: OPTIONS * HTTP/1.1 |
absoluteURI | The absoluteURI is used when an HTTP request is being made to a proxy. The proxy is requested to forward the request or service it from a valid cache, and return the response. For example: GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1 |
abs_path | The most common form of Request-URI is that used to identify a resource on an origin server or gateway. For example, a client wishing to retrieve a resource directly from the origin server would create a TCP connection to port 80 of the host "www.w3.org" and send two lines: GET /pub/WWW/TheProject.html HTTP/1.1 and then Host: www.w3.org. Note that the absolute path cannot be empty; if none is present in the original URI, it MUST be given as "/" (the server root). |
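As a small illustration of the abs_path form, the sketch below uses the standard library's urllib.parse to split an absolute URI into the Host value and the path a client would send to the origin server. The function name origin_form is our own label for this example.

```python
from urllib.parse import urlsplit

def origin_form(url):
    """Return (host, abs_path) for a request sent directly to the origin server."""
    parts = urlsplit(url)
    # The absolute path cannot be empty; fall back to "/" (the server root).
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return parts.netloc, path

host, path = origin_form("http://www.w3.org/pub/WWW/TheProject.html")
print(f"GET {path} HTTP/1.1")
print(f"Host: {host}")
```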
Using Python requests
We will use the module requests for learning about http requests.
pip install requests
In the example below we make a simple GET request and print the response. We print only the first 300 characters.
# How to make an HTTP request
import requests as req
r = req.get('http://www.tutorialspoint.com/python/')
# Print only the first 300 characters of the response body
print(r.text[0:300])
When we run the above program, we get the following output −
<!DOCTYPE html>
<!--[if IE 8]><html class="ie ie8"> <![endif]-->
<!--[if IE 9]><html class="ie ie9"> <![endif]-->
<!--[if gt IE 9]><!--> <html> <!--<![endif]-->
<head>
<!-- Basic -->
<meta charset="utf-8">
<title>Python Tutorial</title>
<meta name="description" content="Python Tutorial
Upon receiving a request from the client, the server generates a response and sends it back to the client in a defined format.
After receiving and interpreting a request message, a server responds with an HTTP response message:
- A Status-line
- Zero or more header (General|Response|Entity) fields followed by CRLF
- An empty line (i.e., a line with nothing preceding the CRLF) indicating the end of the header fields
- Optionally a message-body
The following sections explain each of the entities used in an HTTP response message.
Message Status-Line
A Status-Line consists of the protocol version followed by a numeric status code and its associated textual phrase. The elements are separated by space SP characters.
Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF
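A Status-Line can be taken apart in much the same way as it is defined. The sketch below is illustrative only (parse_status_line is a made-up helper); it splits at most twice because the Reason-Phrase may itself contain spaces.

```python
def parse_status_line(line):
    """Split a Status-Line into (http_version, status_code, reason_phrase)."""
    # maxsplit=2 keeps a multi-word Reason-Phrase such as "Not Found" intact.
    version, code, reason = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), reason

print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```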
HTTP Version
A server supporting HTTP version 1.1 will return the following version information:
HTTP-Version = HTTP/1.1
Status Code
The Status-Code element is a 3-digit integer where the first digit of the Status-Code defines the class of response and the last two digits do not have any categorization role. There are 5 values for the first digit:
Code | Description |
---|---|
1xx | Informational: the request was received and the process is continuing. |
2xx | Success: the action was successfully received, understood, and accepted. |
3xx | Redirection: further action must be taken in order to complete the request. |
4xx | Client Error: the request contains incorrect syntax or cannot be fulfilled. |
5xx | Server Error: the server failed to fulfill an apparently valid request. |
HTTP status codes are extensible and HTTP applications are not required to understand the meaning of all registered status codes.
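Since only the first digit carries the category, a client can classify any status code, including ones it does not recognise, with integer division. The mapping and the status_class helper below are our own sketch; http.HTTPStatus is the standard library's table of registered codes.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

def status_class(code):
    """Classify a status code by its first digit."""
    return STATUS_CLASSES.get(code // 100, "Unknown")

for code in (200, 301, 404, 503):
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```

An unregistered code such as 299 still falls into the Success class, which matches the note that applications need not understand every registered code.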
Using Python Requests
In the Python program below we use the urllib3 module to make an HTTP GET request and receive the response containing the data. The response also carries the status code. The PoolManager object handles the details of connection pooling and thread safety.
import urllib3
http = urllib3.PoolManager()
resp = http.request('GET', 'http://tutorialspoint.com/robots.txt')
# resp.data is bytes, so decode it before printing
print(resp.data.decode('utf-8'))
# get the status of the response
print(resp.status)
When we run the above program, we get the following output −
User-agent: *
Disallow: /tmp
Disallow: /logs
Disallow: /rate/*
Disallow: /cgi-bin/*
Disallow: /video tutorials/video_course_view.php?*
Disallow: /video tutorials/course_view.php?*
Disallow: /videos/*
Disallow: /*/*_question_bank/*
Disallow: //*/*/*/*/src/*
200
The request and response between client and server involves header and body in the message. Headers contain protocol specific information that appears at the beginning of the raw message that is sent over TCP connection. The body of the message is separated from headers using a blank line.
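The blank line that separates headers from the body is what makes a raw message easy to split. Here is a minimal sketch working on a hand-written response; split_message and the sample bytes are assumptions for illustration, and real messages need more careful handling (repeated headers, for instance).

```python
def split_message(raw: bytes):
    """Split a raw HTTP message into (start_line, headers, body)."""
    # The first empty line (CRLF CRLF) ends the header section.
    head, _, body = raw.partition(b"\r\n\r\n")
    start_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body

raw = (b"HTTP/1.1 200 OK\r\n"
       b"Content-Type: text/plain\r\n"
       b"Content-Length: 5\r\n"
       b"\r\n"
       b"hello")
start_line, headers, body = split_message(raw)
print(start_line, headers, body)
```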
Example of Headers
The headers in an HTTP response can be categorized into the following types. Each is described below with an example.
Cache-Control
The Cache-Control general-header field is used to specify directives that MUST be obeyed by all the caching systems. The syntax is as follows:
- Cache-Control: cache-request-directive | cache-response-directive
An HTTP client or server can use the Cache-Control general header to specify parameters for the cache or to request certain kinds of documents from the cache. The caching directives are specified in a comma-separated list. For example:
- Cache-Control: no-cache
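Because the directives form a comma-separated list, a header value can be turned into a dictionary with a few lines of Python. This is a simplified sketch with a hypothetical helper name; it assumes no directive contains a comma inside a quoted string.

```python
def parse_cache_control(value):
    """Parse a Cache-Control value into {directive: argument or None}."""
    directives = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, arg = item.partition("=")
        # Directive names are case-insensitive; directives without "=" get None.
        directives[name.lower()] = arg.strip('"') if sep else None
    return directives

print(parse_cache_control("no-cache"))
print(parse_cache_control("public, max-age=2592000"))
```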
Connection
The Connection general-header field allows the sender to specify options that are desired for that particular connection and must not be communicated by proxies over further connections. The syntax of the Connection header is as follows:
- Connection: connection-token
HTTP/1.1 defines the “close” connection option for the sender to signal that the connection will be closed after completion of the response. For example:
- Connection: close
By default, HTTP 1.1 uses persistent connections, where the connection does not automatically close after a transaction. HTTP 1.0, on the other hand, does not have persistent connections by default. If a 1.0 client wishes to use persistent connections, it uses the keep-alive parameter as follows:
- Connection: keep-alive
Date
All HTTP date/time stamps MUST be represented in Greenwich Mean Time (GMT), without exception. HTTP applications are allowed to use any of the following three representations of date/time stamps:
Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123
Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036
Sun Nov 6 08:49:37 1994 ; ANSI C’s asctime() format
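In Python, the first (preferred) representation can be produced and parsed with the standard library's email.utils module, which follows the same date grammar. A short sketch using the timestamp from the examples above:

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime

# The timestamp must be timezone-aware and in UTC for usegmt=True.
stamp = datetime(1994, 11, 6, 8, 49, 37, tzinfo=timezone.utc)

header_value = format_datetime(stamp, usegmt=True)
print(header_value)  # Sun, 06 Nov 1994 08:49:37 GMT

# Parsing the header value gives back an equal, timezone-aware datetime.
parsed = parsedate_to_datetime(header_value)
print(parsed == stamp)
```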
Transfer-Encoding
The Transfer-Encoding general-header field indicates what type of transformation has been applied to the message body in order to safely transfer it between the sender and the recipient. This is not the same as content-encoding because transfer-encodings are a property of the message, not of the entity-body. The syntax of Transfer-Encoding header field is as follows:
- Transfer-Encoding: chunked
All transfer-coding values are case-insensitive.
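To see what the chunked transfer-coding does to a body, here is a minimal sketch of an encoder and decoder. Each chunk is sent as its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the body. The function names are ours, and trailers after the last chunk are ignored for simplicity.

```python
def encode_chunked(chunks):
    """Encode an iterable of byte strings using the chunked transfer-coding."""
    out = b""
    for chunk in chunks:
        if chunk:  # a zero-length chunk would signal the end of the body
            out += f"{len(chunk):x}".encode("ascii") + b"\r\n" + chunk + b"\r\n"
    return out + b"0\r\n\r\n"

def decode_chunked(data):
    """Decode a chunked body back into the original bytes."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # The size line may carry ";extension" parameters, which we skip.
        size = int(data[pos:line_end].split(b";")[0].decode("ascii"), 16)
        if size == 0:
            return body
        start = line_end + 2
        body += data[start:start + size]
        pos = start + size + 2  # skip the CRLF that follows the chunk data

encoded = encode_chunked([b"Wiki", b"pedia"])
print(encoded)
print(decode_chunked(encoded))
```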
Upgrade
The Upgrade general-header allows the client to specify what additional communication protocols it supports and would like to use if the server finds it appropriate to switch protocols. For example:
Upgrade: HTTP/2.0, HTTP/1.3, IRC/6.9, RTA/x11
The Upgrade header field is intended to provide a simple mechanism for transition from HTTP/1.1 to some other, incompatible protocol.
Via
The Via general-header must be used by gateways and proxies to indicate the intermediate protocols and recipients. For example, a request message could be sent from an HTTP/1.0 user agent to an internal proxy code-named “fred”, which uses HTTP/1.1 to forward the request to a public proxy at nowhere.com, which completes the request by forwarding it to the origin server at www.ics.uci.edu. The request received by www.ics.uci.edu would then have the following Via header field:
Via: 1.0 fred, 1.1 nowhere.com (Apache/1.1)
Warning
The Warning general-header is used to carry additional information about the status or transformation of a message which might not be reflected in the message. A response may carry more than one Warning header.
Warning : warn-code SP warn-agent SP warn-text SP warn-date
Example
In the example below we use the urllib.request module (urllib2 in Python 2) to get a response using urlopen. Next we call the info() method to get the header information for that response.
import urllib.request
response = urllib.request.urlopen('http://www.tutorialspoint.com/python')
# info() returns the response headers
headers = response.info()
print(headers)
When we run the above program, we get the following output −
Access-Control-Allow-Headers: X-Requested-With
Access-Control-Allow-Origin: *
Cache-Control: max-age=2592000
Content-Type: text/html; charset=UTF-8
Date: Mon, 02 Jul 2018 11:06:07 GMT
Expires: Wed, 01 Aug 2018 11:06:07 GMT
Last-Modified: Sun, 01 Jul 2018 21:05:38 GMT
Server: ECS (tir/CDD1)
Vary: Accept-Encoding
X-Cache: HIT
Content-Length: 22063
Connection: close
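Header inspection can also be tried without Internet access. The sketch below is an assumption-laden toy, not part of the tutorial's original examples: it starts a small server from the standard library on the loopback interface (HelloHandler is our own class) and reads the status and headers of its response with http.client.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """Toy handler that answers every GET with a short plain-text body."""

    def do_GET(self):
        payload = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the example output quiet

# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

conn = HTTPConnection("127.0.0.1", server.server_port, timeout=5)
conn.request("GET", "/")
response = conn.getresponse()
status = response.status
content_type = response.getheader("Content-Type")
body = response.read()
conn.close()

server.shutdown()
server.server_close()
print(status, content_type, body)
```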
The Hypertext Transfer Protocol (HTTP) is a protocol used to enable communications between clients and servers. It works as a request-response protocol between a client and server. The requesting device is known as the client and the device that sends the response is known as the server.
urllib is the traditional Python library for handling HTTP requests. urllib3 is a separate library that does more than urllib, such as connection pooling. We import urllib3 to see how Python can use it to make an HTTP request and receive a response. We can customize the type of request by choosing the request method.
pip install urllib3
Example
In the example below we use the PoolManager() object, which takes care of the connection details of the HTTP request. Next we use the request() method to make an HTTP request with the POST method. Finally we use the json library to parse the response and print the form values the server received.
import urllib3
import json
http = urllib3.PoolManager()
r = http.request(
    'POST',
    'http://httpbin.org/post',
    fields={'field': 'value'})
# httpbin echoes the submitted form fields back as JSON
print(json.loads(r.data.decode('utf-8'))['form'])
When we run the above program, we get the following output −
{'field': 'value'}
URL Using a Query
We can also pass query parameters to build custom URLs. In the example below, requests.get() appends the values in the params dictionary to the URL as a query string; the resulting URL can then be used elsewhere in the program.
import requests
query = {'q': 'river', 'order': 'popular', 'min_width': '800', 'min_height': '600'}
req = requests.get('https://pixabay.com/en/photos/', params=query)
# req.url is the final URL, including the encoded query string
print(req.url)
When we run the above program, it prints the request URL with the query parameters appended after a question mark.
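How such a query string is put together can be checked offline with the standard library. This sketch uses urllib.parse.urlencode rather than requests, so it shows only the encoding step and says nothing about redirects the real site may perform.

```python
from urllib.parse import urlencode

query = {"q": "river", "order": "popular", "min_width": "800", "min_height": "600"}

# urlencode joins key=value pairs with "&" and percent-encodes where needed.
url = "https://pixabay.com/en/photos/" + "?" + urlencode(query)
print(url)
```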