What are cookies?
Cookies are small pieces of data that a web server sends to the user and that the browser stores on the user's computer (on disk).
They are designed to hold a modest amount of data specific to a particular client and website, and can be accessed either by the web server or the client computer. Data in a cookie is simple textual data. It is not binary data.
A website can only read its own cookies. It cannot read the cookies of another website/domain. This security is ensured by the browser. A cookie is not shared among different browsers. That is, one browser cannot read a cookie stored by another browser, even for the same domain.
Browsers limit the size of each cookie to about 4 KB (RFC 6265 asks browsers to support at least 4096 bytes per cookie). The number of cookies a web server can set for a given domain is also limited. The browser imposes this restriction to avoid excessive disk space consumption. Older browsers allowed about 20-25 cookies per domain; RFC 6265 recommends supporting at least 50.
Why are Cookies Used?
Cookies are a convenient way to carry information from one session on a website to another, or between sessions on related websites, without having to burden a server machine with massive amounts of data storage. Storing the data on the server without using cookies would also be problematic, because it would be difficult to retrieve a particular user's information without requiring a login on each visit to the website.
If there is a large amount of information to store, then a cookie can simply be used as a means to identify a given user so that further related information can be looked up on a server-side database. Subsequently, when the site is revisited, the server will read the cookie and then retrieve all the user's information from the database without it having to be re-entered.
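The lookup described above can be sketched in a few lines of Python. This is only an illustration: the in-memory dictionary `USER_DATA` stands in for a real server-side database, and the function names `start_session` and `load_session` are hypothetical.

```python
import secrets

# Hypothetical in-memory stand-in for a server-side database.
USER_DATA = {}

def start_session(profile):
    """Store the profile on the server and return the value to put in the cookie."""
    session_id = secrets.token_urlsafe(32)  # random, hard-to-guess identifier
    USER_DATA[session_id] = profile
    return session_id

def load_session(cookie_value):
    """Look up the stored profile for the identifier the browser sent back."""
    return USER_DATA.get(cookie_value)

sid = start_session({"name": "Alice", "basket": ["book", "pen"]})
print(load_session(sid))        # the profile stored above
print(load_session("unknown"))  # None: no such session
```

Only the short identifier travels in the cookie; the potentially large profile stays on the server.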
The web is built on top of HTTP, which in turn is built on TCP. Even though TCP is a stateful (connection-oriented) protocol, HTTP is a stateless protocol. Stateless protocols do not maintain any information about previous communication, so cookies are what lets a website recognize the same client across requests.
Cookie structure
By using cookies, an application can store information at the user agent (browser) across multiple HTTP requests, where the data is stored in a basic name=value format. From the Cookie RFC we can see that a cookie can carry many extra fields, but you will mostly only run into these:
- Domain – the domain the cookie can be sent to. If the domain is not present, the cookie is handled as a host-only cookie, and browsers will only send it to that exact host (so no subdomains! Also beware that IE may behave differently).
- Max-Age – the amount of time, in seconds, the cookie should be valid. Older versions of IE don't support this field, so many sites use "Expires" with a GMT timestamp instead.
- Path – a given URL path where the cookie applies (for example /auth). The default value for the Path option is the path of the URL that sent the Set-Cookie header.
- Secure – when used, the cookie can only be transferred through HTTPS; regular HTTP requests won't include this cookie.
- HttpOnly – when used, the cookie won't be accessible through JavaScript, giving you some protection against XSS attacks.
- Same-site – when used, the cookie won't be enabled for third-party usage. In the Set-Cookie header the attribute is spelled SameSite.
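To make the fields above concrete, here is a minimal sketch that builds a Set-Cookie header value with Python's standard `http.cookies` module. The cookie name `session_id` and its value are made up for illustration, and the `samesite` key assumes Python 3.8 or newer.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session_id"] = "abc123"      # the basic name=value pair (illustrative)

morsel = cookie["session_id"]
morsel["domain"] = "larapulse.com"   # Domain
morsel["path"] = "/auth"             # Path
morsel["max-age"] = 3600             # Max-Age, in seconds
morsel["secure"] = True              # Secure flag
morsel["httponly"] = True            # HttpOnly flag
morsel["samesite"] = "Lax"           # SameSite (Python 3.8+)

header_value = morsel.OutputString()
print("Set-Cookie:", header_value)

# The browser sends back only name=value pairs in the Cookie request header.
incoming = SimpleCookie("session_id=abc123; theme=dark")
print(incoming["theme"].value)
```

Note that the attributes only travel from server to browser; the request coming back carries just the names and values.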
Dealing with cookies
If you create a cookie with Domain "larapulse.com", then that cookie will be available for an application sitting at "blog.larapulse.com" as well, and basically for all other subdomains. Browsers will only send cookies with requests made to the corresponding domains. Moreover, the browser will only set cookies for domains the response actually came from (e.g. at "malicious.onion" the browser will discard Set-Cookie headers with the "larapulse.com" Domain).
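The matching rule in the paragraph above can be approximated as follows. This is a simplified sketch, not the full RFC 6265 algorithm: the function name `domain_matches` is hypothetical, and the checks browsers perform for IP addresses and public suffixes are left out.

```python
def domain_matches(request_host, cookie_domain):
    """Simplified check: the host is the cookie domain itself or one of its subdomains."""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)

print(domain_matches("larapulse.com", "larapulse.com"))       # True
print(domain_matches("blog.larapulse.com", "larapulse.com"))  # True
print(domain_matches("notlarapulse.com", "larapulse.com"))    # False
print(domain_matches("malicious.onion", "larapulse.com"))     # False
```

The leading dot in the suffix comparison is what keeps "notlarapulse.com" from matching.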
A parent domain can set cookies that are also sent to its subdomains, and a subdomain can set cookies for its parent domain. Sometimes websites use the www subdomain to restrict access to cookies to this particular host.
Browsers will discard cookies created by applications for TLDs (top-level domains, like .de, .ru or .com). The same happens with cookies created for IP addresses or for invalid domains (like "localhost" or "myserver"): the Domain, if present, has to be a valid FQDN (Fully Qualified Domain Name).
By default, a cookie lives as long as the browser session. When the browser is closed, the cookie is deleted. Such a cookie is called a Session Cookie. You can also create a Permanent Cookie by specifying Max-Age or Expires.
In order to clear out/remove a cookie, you need to create a cookie with the same name, Path and Domain (the value can be anything), and set the Expires property to a date in the past.
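The three cases above (session cookie, permanent cookie, removal) differ only in the lifetime attributes. A small sketch using the standard `http.cookies` module follows; the helper `set_cookie_header` and the cookie name `theme` are illustrative assumptions.

```python
from http.cookies import SimpleCookie

def set_cookie_header(name, value, max_age=None, expires=None):
    """Build a Set-Cookie header value with optional lifetime attributes."""
    cookie = SimpleCookie()
    cookie[name] = value
    cookie[name]["path"] = "/"
    if max_age is not None:
        cookie[name]["max-age"] = max_age   # lifetime in seconds
    if expires is not None:
        cookie[name]["expires"] = expires   # GMT timestamp string
    return cookie[name].OutputString()

# Session cookie: no lifetime attribute, gone when the browser closes.
session_cookie = set_cookie_header("theme", "dark")

# Permanent cookie: kept for 30 days.
permanent_cookie = set_cookie_header("theme", "dark", max_age=60 * 60 * 24 * 30)

# Removal: same name and path, with an expiry date in the past.
deleted_cookie = set_cookie_header(
    "theme", "deleted", expires="Thu, 01 Jan 1970 00:00:00 GMT"
)

for header in (session_cookie, permanent_cookie, deleted_cookie):
    print("Set-Cookie:", header)
```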
If you request a page, but a Set-Cookie header comes from a subsequent request to a different domain (e.g. a resource on the page such as a frame or iframe), then that cookie is considered a third-party cookie. Browsers may ignore third-party cookies based on their security settings, so watch out.
Tracking your activity: advertisement cookies
Advertising cookies (of third parties) collect information to help tailor advertising to your interests. De-selecting these cookies may result in seeing advertising that is not as relevant to you, or in not being able to link effectively with social networks or share content on social media.
Tracking cookies are usually used for advertising purposes, retargeting in particular. Retargeting is a tactic that often relies on tracking cookies to show ads to people who have previously visited a specific site or shown interest in a particular product. If you've ever bought or even looked at a particular product on the web and then started seeing ads for similar products on other websites, you've been retargeted.
Tracking cookies can record all kinds of information: search queries, purchases, device information, location, when and where you saw previous advertisements, how many times you've seen an ad, and what links you click on.
According to the GDPR and the related ePrivacy rules, websites serving users in the EU generally have to inform you about cookie usage and ask for your consent to non-essential cookies.
Why do cookies have such a bad reputation?
Right from the days of their creation, cookies have been subject to criticism due to their nature. Many opposed the idea of servers saving data on the user's computer. But eventually the benefits outweighed those privacy concerns.
However, cookies also opened many security loopholes. Developers often ignored security measures until an incident knocked on their door. Primarily, cookies suffer from three major attacks:
- MITM — Man-in-the-middle
- XSS — Cross-site scripting
- CSRF — Cross-site request forgery
Secure usage
Secure flag - cookies are exchanged only if you use an HTTPS connection.
If you have already adopted HTTPS on your website, you may think that your cookies are protected because they are transmitted over a secure connection. Unfortunately, a noticeable issue remains: when a user reaches your web server over plain HTTP (simply because they typed your URL without "https://"), the browser still attaches the cookies to that unencrypted request.
Only the Secure attribute lets you forbid a cookie from ever being transmitted over plain HTTP.
If the Secure cookie flag is set, and the application doesn't have other vulnerabilities, the cookie information is protected against a MITM attack even in that case.
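The rule a browser applies here is small enough to write down. The sketch below is a simplified model of that decision, not browser code; the function name `may_send` is hypothetical.

```python
from urllib.parse import urlsplit

def may_send(cookie_is_secure, request_url):
    """Return True if the cookie may be attached to a request for request_url."""
    scheme = urlsplit(request_url).scheme.lower()
    # A Secure cookie is only attached to HTTPS requests;
    # a cookie without the flag is attached either way.
    return scheme == "https" or not cookie_is_secure

print(may_send(True, "https://larapulse.com/"))   # True
print(may_send(True, "http://larapulse.com/"))    # False: never sent in clear text
print(may_send(False, "http://larapulse.com/"))   # True: exposed to a MITM
```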
HttpOnly flag - cookies are not accessible with JavaScript and only sent to the server.
You should note that by doing so, your own scripts also lose the ability to read cookies. It prevents the use of a cookie on the client side.
If you do not sanitize the user's input, an XSS injection can happen. If an attacker succeeds in injecting some JavaScript despite all your precautions, they still won't be able to read HttpOnly cookies. That will significantly limit the attack range.
Modern SPA frameworks escape output by default, which reduces the risk of XSS but does not eliminate it. In any case, using the HttpOnly directive/flag gives you an extra layer of protection against XSS attacks.
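The effect of HttpOnly can be modelled as two different views of the same cookie jar. This is a toy model in Python, not browser code; `StoredCookie`, `script_visible` and `request_header` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class StoredCookie:
    name: str
    value: str
    http_only: bool = False

def script_visible(jar):
    """What a page script could read: only cookies without HttpOnly."""
    return "; ".join(f"{c.name}={c.value}" for c in jar if not c.http_only)

def request_header(jar):
    """What the server receives: every cookie, HttpOnly or not."""
    return "; ".join(f"{c.name}={c.value}" for c in jar)

jar = [
    StoredCookie("session_id", "abc123", http_only=True),
    StoredCookie("theme", "dark"),
]
print(script_visible(jar))   # theme=dark
print(request_header(jar))   # session_id=abc123; theme=dark
```

An injected script sees only the first view, so the session identifier stays out of its reach.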
Same-site attribute - is used to disable third-party usage for a specific cookie.
It asks the browser to only send the cookie in a first-party context, i.e. when you are using the web application directly. When another site tries to request something from the web application, the cookie is not sent. This makes CSRF much harder, because an attacker can no longer use a user's session from their own site.
There are two restrictive values for the same-site attribute (a third value, None, turns the restriction off):
- Strict: the cookie is withheld with any cross-site usage. Even when the user follows a link from another website, the cookie is not sent.
- Lax: some cross-site usage is allowed. Specifically, the cookie is sent if the request is a GET request and the request is top-level. Top-level means that the URL in the address bar changes because of this navigation. This is not the case for iframes, images or XMLHttpRequests.
As you would expect, strict mode gives better security, but breaks some functionality. Links to protected resources won't work from other sites. Even if you are logged in and have access to some resource, your strict cookies won't be sent when coming from another site. With lax mode, this still works, while providing decent security by blocking cross-site POST requests.
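The difference between the two modes can be summarised as a decision function. The sketch below is a simplified model of the rules as described in this section, not the full browser algorithm; the function name `should_send` and its parameters are assumptions made for illustration.

```python
def should_send(same_site, cross_site, method="GET", top_level=True):
    """Decide whether a cookie is attached, following the simplified rules above."""
    if not cross_site:
        return True                 # first-party requests always carry the cookie
    mode = same_site.lower()
    if mode == "strict":
        return False                # never sent on cross-site requests
    if mode == "lax":
        # Only top-level GET navigations (e.g. following a link) qualify.
        return method.upper() == "GET" and top_level
    if mode == "none":
        return True                 # restriction switched off
    raise ValueError(f"unknown SameSite value: {same_site!r}")

# Following a link from another site to a protected page:
print(should_send("Strict", cross_site=True))                 # False
print(should_send("Lax", cross_site=True))                    # True
# A cross-site form POST, the classic CSRF vector:
print(should_send("Lax", cross_site=True, method="POST"))     # False
# An image or iframe embedded on another site:
print(should_send("Lax", cross_site=True, top_level=False))   # False
```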
The same-site attribute is a good method to protect against CSRF attacks, because it seems to the attacker as though you are no longer logged in to the website under attack.
Before the Same-site attribute was introduced, the simplest solution to prevent a CSRF attack was to check the Referer header and a CSRF token.
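A minimal sketch of those two older checks is shown below. The trusted host name, the function names and the way the token is stored are assumptions made for illustration; a real application would normally rely on its web framework's CSRF protection.

```python
import hmac
import secrets
from urllib.parse import urlsplit

TRUSTED_HOST = "larapulse.com"  # assumption: the site's own host name

def new_csrf_token():
    """Random token stored in the user's session and embedded in each form."""
    return secrets.token_urlsafe(32)

def referer_ok(referer):
    """Accept only requests whose Referer header points at our own host."""
    if not referer:
        return False
    return urlsplit(referer).hostname == TRUSTED_HOST

def csrf_ok(session_token, submitted_token, referer):
    """Both checks must pass; the token comparison is constant-time."""
    tokens_match = hmac.compare_digest(
        session_token.encode(), (submitted_token or "").encode()
    )
    return tokens_match and referer_ok(referer)

token = new_csrf_token()
print(csrf_ok(token, token, "https://larapulse.com/form"))     # True
print(csrf_ok(token, "forged", "https://larapulse.com/form"))  # False
print(csrf_ok(token, token, "https://malicious.onion/"))       # False
```

An attacker's page can make the browser send the cookie, but it cannot read the token embedded in the legitimate form, which is why the comparison fails for forged requests.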
TL;DR: Conclusion
The cookie is one of the pioneering ideas that power the modern Web. With the rise of the modern web around smartphones and other devices, the underlying concept of the cookie is evolving. Old ideas are dying and being replaced with something more complex.
To prevent various hacker attacks and protect your users, use the Same-site attribute and the Secure and HttpOnly flags.