HTTP stands for Hypertext Transfer Protocol. Hypertext is any document that contains hyperlinks (a hyperlink is a reference to data that a user can follow, for example by clicking or tapping on it). HTTP is the protocol that works behind the web pages we use daily, and it lets us fetch resources. A resource can be anything: an image, a video or an HTML document. A complete page is rarely made of HTML alone, though. The browser usually has to make many more requests, often to several different servers, to get images, text, styling (CSS), scripts and even ads, and only after putting them all together do we get what we finally see.
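To get a feel for how many extra resources a single page pulls in, here is a minimal sketch that lists the images, stylesheets and scripts referenced by a piece of HTML. The class name SubresourceCollector and the sample markup are made up for illustration, and a real browser looks at many more tags than these three.

```python
from html.parser import HTMLParser


class SubresourceCollector(HTMLParser):
    """Collects URLs a browser would have to fetch in addition to the HTML itself."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Images and scripts point at their resource with "src".
        if tag in ("img", "script") and attrs.get("src"):
            self.urls.append(attrs["src"])
        # Stylesheets are usually referenced with <link href="...">.
        elif tag == "link" and attrs.get("href"):
            self.urls.append(attrs["href"])


def subresources(html):
    collector = SubresourceCollector()
    collector.feed(html)
    return collector.urls


# Hypothetical sample page: one stylesheet, one script, one image, one plain link.
SAMPLE_PAGE = (
    '<html><head><link rel="stylesheet" href="style.css">'
    '<script src="app.js"></script></head>'
    '<body><img src="cat.png"><a href="/next">next</a></body></html>'
)

if __name__ == "__main__":
    print(subresources(SAMPLE_PAGE))
```

Note that the plain hyperlink (the a tag) is not collected: it is only fetched if the user decides to follow it.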
When we think of a request-response exchange, we normally assume all it needs is a requester (the client) and a responder (the server). But have you ever wondered where things like caching and content filtering happen? Besides the browser's own cache, there are often several proxies between the client and the server that perform functions like caching recently visited pages, running antivirus scans, filtering content and more. They can be transparent, meaning they pass the content through as is, or non-transparent, meaning they alter the content going through them.
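The idea of proxies sitting in the middle can be sketched with plain Python objects, no network involved. Everything here (origin_server, CachingProxy, FilteringProxy) is a hypothetical toy, not how any real proxy is implemented: one proxy passes content through unchanged but remembers it, the other alters what goes through it.

```python
def origin_server(path):
    """Stands in for the real server: returns the content for a path."""
    return f"<html>page {path} with ads</html>"


class CachingProxy:
    """Passes content through unchanged, but remembers recent answers."""

    def __init__(self, upstream):
        self.upstream = upstream
        self.cache = {}
        self.upstream_calls = 0

    def get(self, path):
        if path not in self.cache:
            # Cache miss: ask the next hop and remember its answer.
            self.upstream_calls += 1
            self.cache[path] = self.upstream(path)
        return self.cache[path]


class FilteringProxy:
    """Non-transparent: alters the content that passes through it."""

    def __init__(self, upstream, banned_word):
        self.upstream = upstream
        self.banned_word = banned_word

    def get(self, path):
        return self.upstream(path).replace(self.banned_word, "***")


if __name__ == "__main__":
    cache = CachingProxy(origin_server)
    filtered = FilteringProxy(cache.get, "ads")  # client -> filter -> cache -> server
    print(filtered.get("/home"))
    print(filtered.get("/home"))
    print("requests that reached the server:", cache.upstream_calls)
```

The client only ever talks to the outermost proxy, which is why it usually cannot tell how many hops sit in between.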
Origin-Constraint Or The Same Origin Policy
According to this policy, a web page can read data from another web page only if both have the same origin. The origin is made up of three parts: protocol (scheme), host and port. If all three are exactly the same, the data is shared; otherwise it is not. This maintains a secure separation between two websites. It is worth noting that this check is enforced by the browser rather than by HTTP itself. But the web is built on pages linking to and embedding content from other sites, and sometimes a page legitimately needs data from a different origin. Hence the constraint can be relaxed in a controlled way: a server can send HTTP headers (the CORS headers, such as Access-Control-Allow-Origin) that tell the browser which other origins may read its responses.
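The protocol, host and port comparison can be sketched in a few lines. This is a simplified illustration, assuming only http and https URLs; the helper names origin and same_origin are made up, and real browsers handle more edge cases.

```python
from urllib.parse import urlsplit

# Ports that are implied when a URL does not spell one out.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (protocol, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True only if protocol, host and port all match exactly."""
    return origin(url_a) == origin(url_b)


if __name__ == "__main__":
    print(same_origin("https://example.com/a", "https://example.com:443/b"))
    print(same_origin("http://example.com/", "https://example.com/"))
    print(same_origin("https://example.com/", "https://api.example.com/"))
```

The path plays no role in the comparison, while a subdomain such as api.example.com counts as a different host and therefore a different origin.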
HTTP, with the help of cookies, can maintain a session. In truth, HTTP itself is a stateless protocol. Since it needs to be fast, it does just what is required to complete a request, with minimal extra information, and it doesn't relate two requests from the same client to each other. The information that ties requests together into a session is carried by cookies: the server sets a cookie in a response, and the browser sends it back with each later request.
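Here is a rough sketch of how a server might use a cookie to recognise a returning client. The cookie name session_id, the in-memory SESSIONS dictionary and the handle_request function are all assumptions made for this example; real frameworks add expiry, signing and much more.

```python
import secrets
from http.cookies import SimpleCookie

# Hypothetical server-side storage: session id -> number of visits.
SESSIONS = {}


def handle_request(cookie_header=""):
    """Take the value of the Cookie request header, return
    (visit_count, value for a Set-Cookie response header or None)."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    session_id = morsel.value if morsel is not None else None

    set_cookie = None
    if session_id not in SESSIONS:
        # Unknown client: start a new session and ask the browser to remember it.
        session_id = secrets.token_hex(16)
        SESSIONS[session_id] = 0
        fresh = SimpleCookie()
        fresh["session_id"] = session_id
        fresh["session_id"]["httponly"] = True
        set_cookie = fresh["session_id"].OutputString()

    SESSIONS[session_id] += 1
    return SESSIONS[session_id], set_cookie


if __name__ == "__main__":
    count, header = handle_request()          # first visit, no cookie yet
    print(count, header)
    cookie_to_send = header.split(";")[0]     # what a browser would send back
    print(handle_request(cookie_to_send))     # second visit, recognised
```

HTTP itself still treats the two requests as unrelated; it is only the cookie value travelling back and forth that lets the server connect them.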
The basic workflow of HTTP is as follows. First the client opens a TCP connection. Then it sends an HTTP message to the server, requesting the content along with any specific requirements (these are listed in the optional part of the HTTP message called the headers). Then the client reads and interprets the response sent by the server, and the connection is either kept open for further requests or closed. To allow several messages to be in flight at once over a single connection, HTTP/2 uses multiplexing. The response includes a status code, such as 200 or 404, that we normally use to check whether the request was successful or whether an error such as "file not found" occurred.
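The workflow can be tried out by hand with a raw socket. The sketch below starts a tiny local server from Python's standard library (so nothing leaves your machine), opens a TCP connection to it, sends a GET request with a couple of headers, and pulls the status code out of the response. HelloHandler, fetch and demo are names invented for this example, and the response parsing is deliberately naive.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """A throwaway server so the example has something to talk to."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo output quiet


def fetch(host, port, path="/"):
    # Step 1: open a TCP connection.
    with socket.create_connection((host, port), timeout=5) as sock:
        # Step 2: send the HTTP message (request line, headers, blank line).
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}:{port}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        # Step 3: read the response until the server closes the connection.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    raw = b"".join(chunks)
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line = head.split(b"\r\n")[0].decode("ascii")
    status_code = int(status_line.split(" ")[1])
    return status_code, body


def demo():
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Here the client asks for the connection to be closed after one response, which keeps the reading loop simple. Keeping it open for further requests would require using the Content-Length header to know where the response ends.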
Why is TCP used?
Articles about HTTP keep mentioning a TCP connection, so let's first see what TCP is. It's a communication service that sits between the application program (or layer) and IP (the Internet Protocol). Let us see what that means. The application layer is the one closest to the end user: it is where all of the content received from different sources is combined into the one beautiful page that the user sees. IP takes care of how the content travels across the network. It uses the IP addresses in packet headers to route packets from source to destination, and it is essentially what makes communication possible at all. But IP on its own doesn't guarantee that packets arrive, or that they arrive in order, so another protocol was required for reliable transmission.
That's where TCP, the Transmission Control Protocol, comes in. It is not concerned with what the user sees or with how packets are routed from source to destination; it is concerned with how the data to be transferred is packaged and tracked. It treats the data as a stream of octets (bytes) and divides it into segments for transmission. But that's not all. TCP also ensures the content is delivered complete and in the proper order. To do so, the sender numbers the data it sends and keeps a record of it, and the receiver replies with an acknowledgment telling the sender that the data was received. If the acknowledgment does not arrive within a specified amount of time, the segment is resent. On the receiving side, the data is reassembled into its original order before being handed to the application and finally displayed to the user.
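The acknowledge-and-resend idea can be illustrated with a small simulation. This is a toy model, not real TCP: the function send_reliably and its parameters are invented for this example, loss is simulated with a seeded random number generator, and real TCP uses timers, sliding windows and congestion control on top of this.

```python
import random


def send_reliably(data, segment_size=4, loss_rate=0.3, seed=1):
    """Simulate sending bytes over a lossy, reordering network.
    Returns (reassembled_bytes, number_of_transmissions)."""
    rng = random.Random(seed)

    # Sender: split the byte stream into segments keyed by sequence number
    # (here, the offset of the segment's first byte).
    segments = {
        seq: data[seq:seq + segment_size]
        for seq in range(0, len(data), segment_size)
    }
    unacked = set(segments)   # the sender's record of what is not yet confirmed
    received = {}             # the receiver's buffer, keyed by sequence number
    transmissions = 0

    while unacked:
        # (Re)send everything still unacknowledged, in a scrambled order.
        batch = list(unacked)
        rng.shuffle(batch)
        for seq in batch:
            transmissions += 1
            if rng.random() < loss_rate:
                continue                   # lost: no acknowledgment, resend next round
            received[seq] = segments[seq]  # receiver stores the segment
            unacked.discard(seq)           # acknowledgment reaches the sender

    # Receiver: put the segments back in sequence order.
    reassembled = b"".join(received[seq] for seq in sorted(received))
    return reassembled, transmissions


if __name__ == "__main__":
    message = b"HTTP rides on top of TCP"
    result, sent = send_reliably(message)
    print(result, "after", sent, "transmissions")
```

Even though segments get lost and arrive out of order, the receiver ends up with exactly the bytes that were sent, at the cost of some extra transmissions.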
HTTP request methods
GET: This method is used to retrieve a resource from the server.
HEAD: This method is used to get a response without a body. For example, if you make a HEAD request for a file or a web page, the actual file or web page won't be sent in the response. Only the information about it, such as the date it was last modified, the size of the content or the type of the content, is delivered.
POST: It is used to send some data to the server so that the server stores it or processes it as required. It is used in cases like submitting a web form or uploading content online.
PUT: This method creates a resource at the target URL or, if one already exists, replaces it with the content of the PUT message's body. It differs from POST in that POST, if repeated, repeats the action each time, so it may end up creating several identical resources. Repeating a PUT just leaves the resource with the same content: it updates the existing resource, or creates it if it doesn't exist.
DELETE: As the name implies, it deletes the specified resource.
TRACE: This method echoes back the request we sent, as it was seen by the server. Hence it's used for debugging purposes; for example, it helps to see how the request sent by the client has been modified by proxies along the way.
OPTIONS: This method asks the server which communication options, such as the allowed methods, are available for a resource, without actually performing any of them. The server's response lists what is allowed.
CONNECT: This method asks an intermediary, usually a proxy, to set up a two-way connection (a tunnel) to the destination server. It is most commonly used to tunnel HTTPS traffic through a proxy.
PATCH: This one is somewhat of a hybrid of PUT and POST. It uses a patch document to make partial changes to a resource, unlike PUT, which replaces the whole content of the resource every time. Like POST, though, it is not guaranteed to be idempotent: the result may depend on how many times the request has been made.
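The difference between POST, PUT, PATCH and DELETE is easiest to see on a toy example. The ToyStore class below is a hypothetical in-memory stand-in for a server, and the status codes it returns (201, 200, 204, 404) follow common conventions rather than anything a particular server is guaranteed to do.

```python
import itertools


class ToyStore:
    """An in-memory stand-in for a server's resources, keyed by path."""

    def __init__(self):
        self.resources = {}
        self._ids = itertools.count(1)

    def post(self, collection, body):
        # POST: the server picks a new path every time, so repeats create duplicates.
        path = f"{collection}/{next(self._ids)}"
        self.resources[path] = dict(body)
        return 201, path

    def put(self, path, body):
        # PUT: create at the given path, or replace whatever is there.
        status = 200 if path in self.resources else 201
        self.resources[path] = dict(body)
        return status, path

    def patch(self, path, changes):
        # PATCH: change only the listed fields of an existing resource.
        if path not in self.resources:
            return 404, path
        self.resources[path].update(changes)
        return 200, path

    def delete(self, path):
        # DELETE: remove the resource if it exists.
        if path not in self.resources:
            return 404, path
        del self.resources[path]
        return 204, path


if __name__ == "__main__":
    store = ToyStore()
    print(store.post("/notes", {"text": "hi"}))      # creates /notes/1
    print(store.post("/notes", {"text": "hi"}))      # creates /notes/2, a duplicate
    print(store.put("/notes/9", {"text": "hello"}))  # creates /notes/9
    print(store.put("/notes/9", {"text": "hello"}))  # same result, nothing new
    print(store.patch("/notes/9", {"done": True}))   # partial change
    print(store.resources)
```

Sending the same POST twice leaves two resources behind, while sending the same PUT twice leaves exactly one, which is the practical meaning of PUT being safe to repeat.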
This post just aims to give a brief acquaintance with the vast topic of HTTP. There will be more posts on this interesting topic, featuring hands-on exercises, and you'll find those on thegeekyway.com along with lots of other tutorials!
Never lose the spirit!