Denial of Service (DoS) attacks, and their bigger, badder brother, Distributed Denial of Service (DDoS) attacks, are back in the news, with attacks getting larger and more frequent. Akamai's 'State of the Internet Q3 2014' report quotes average attack traffic rising from less than 3Gbps to in excess of 13Gbps over the 12 months from Q3 2013 to Q3 2014, although privately 50Gbps is no longer considered exceptional, and late in 2014 one attack surpassed the 400Gbps mark. Meanwhile, SC Magazine UK reports that the number of companies under constant cyber-attack rocketed from four percent in 2013 to 19 percent in 2014.
Attacks are becoming truly massive and are also increasing in sophistication: multiple attack vectors mixing volumetric and application attacks, and adapting fast to circumvent mitigation actions. All this puts DDoS back in people's minds, and so this article is intended to help guide you as you work out the best way to protect your online services from such attacks.
It is always worth starting by considering the threat. The 'problem' with DoS attacks is that defending against them is a little like protecting against a lightning strike. You may never be struck, or you may be struck out of the blue for no apparent reason; it really is very hard to predict.
Obviously some businesses are more likely to be attacked than others; however, unexpected events in the news and 'hacktivism' can be upon you in an instant, without your organisation having done anything. The BBC news stories above include one attack against Mole Valley District Council. Who, you might ask. What is so 'appealing' about Mole Valley District Council to a hacktivist group? It is, after all, simply a small district council to the south west of London responsible for items such as street lighting and rubbish collection.
This attack is believed to have been due to the questioning of David Miranda at London Heathrow airport. Who, you might again ask. Mr Miranda was (and maybe still is) the partner of Guardian journalist Glenn Greenwald, who was associated with the publication of the Snowden papers in the UK Guardian newspaper. He was en route from Germany to Brazil and was detained for a number of hours as he transited Heathrow. 'Someone' took exception to this questioning, and Mole Valley District Council just happened to be the UK '.gov.uk' website the group settled on attacking in 'protest'. Hence Mole Valley suffered an attack from nowhere, completely unrelated to their activities and completely outside of their control, 'targeted by association' by virtue of being a .gov.uk website located vaguely near Heathrow airport. It makes you realise you don't have to have done anything yourself for your services to be attacked.
Attackers use a variety of mechanisms in their attempt to cripple your service and their methods are evolving.
The 'first generation' of attacks was principally launched from infected workstations, the classic 'botnet', and focussed on 'lower layer' infrastructure attacks that manipulated the TCP connection to tie up your infrastructure servicing requests that weren't genuinely there. 'SYN floods' are the most typical of this style of attack, although they are just one of various mechanisms used. All these mechanisms attempt to swamp your infrastructure and deny access to your legitimate users.
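The mechanics of a SYN flood can be sketched in a few lines: the attacker sends the opening SYN of the TCP handshake but never completes it, leaving the server holding 'half-open' connections. The following is a minimal, illustrative sketch of how you might detect that pattern from a stream of handshake events; the function names, event format and threshold are assumptions for illustration, not any particular product's logic, and real packet capture is out of scope.

```python
# Minimal sketch: spotting a possible SYN flood by tracking 'half-open'
# connections (SYN seen, but no completing ACK). Events are fed in as
# (flag, client) tuples; real capture/parsing is out of scope here.

def half_open_count(events):
    """Count clients that sent a SYN but never completed the handshake."""
    pending = set()
    for flag, client in events:
        if flag == "SYN":
            pending.add(client)
        elif flag == "ACK":
            pending.discard(client)  # handshake completed normally
    return len(pending)

def looks_like_syn_flood(events, threshold=1000):
    """Crude heuristic: too many half-open connections suggests a flood."""
    return half_open_count(events) >= threshold

# A legitimate client completes the handshake; flood traffic never does.
legit = [("SYN", "10.0.0.1"), ("ACK", "10.0.0.1")]
flood = [("SYN", f"client-{i}") for i in range(2000)]

print(looks_like_syn_flood(legit))          # False
print(looks_like_syn_flood(legit + flood))  # True
```

Real mitigations (SYN cookies, connection-rate limiting) work on the same underlying signal: handshakes that start but never finish.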
The use of botnets had implications for the attack: it would typically be widely distributed and include maybe 10,000 to 30,000 nodes, or botnet agents. These infected machines might not always be available to the botnet controller, as their unsuspecting owner might have powered down the workstation and/or their broadband connection, so coordination of the attack was a little haphazard. Consequently attacks might take a few hours to 'get up steam' as all the agents came online and received instruction.
More recently, a 'second generation' of attack has switched from a huge number of infected workstations, all on comparatively low-bandwidth connections, to much smaller numbers of compromised servers, typically sitting in data centres on very high-bandwidth connections. The 'itsoknoproblembro' attack typifies this new style. Here the 'attack base' is smaller, but all the attack agents are available all the time, so attack coordination can be 'tighter', with more abrupt starting, stopping and switching of the attack. An attack can reach full effect in a matter of one or two minutes, so any response needs to be similarly fast.
This ‘fast switching’ has been used against some attack mitigation devices where the typical response time to determine the source of an attack and implement a block was around 10 minutes. Hence the attackers took to switching attack sources every 7 minutes, always switching just before the targeted site could put an effective block in place.
The 'itsoknoproblembro' attack was also a move into application/Layer 7 attacks, which are becoming more common. Radware's '2014-2015 Global Application & Network Security Report' quotes the mix of volumetric network attacks to application attacks as being about 50:50. Alternatively, the Akamai 'State of the Internet Q3 2014' report quotes 89:11 and includes a detailed breakdown of current attack types.
Whatever the actual figure, ‘Layer 7’ or application tier attacks are becoming more common and are themselves evolving in sophistication, with attackers realising that a few well-targeted requests can have as great an impact on a service as millions of static page requests. Accordingly login pages and site searches are increasingly coming under attack as these functions can consume significant amounts of application and database server processing capacity vs. simply requesting a static page from the web server many times.
As a consequence of this specific focus on individual attack targets, an attack is often preceded with several days of ‘reconnaissance’ by the attackers as they assess ‘hot spots’ in your service and so optimise how and where they will attack. This indicates the importance of continuous monitoring of activity against your service, as this may identify the low-volume ‘scouting’ and give you some forewarning of what may be coming your way.
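One simple way to surface that low-volume 'scouting' is to compare today's per-endpoint request mix against a historical baseline: a rarely-visited but expensive URL suddenly receiving attention is worth a look. The sketch below is a hedged illustration of that idea; the function, the ratio/threshold values and the endpoint names are all assumptions for the example, not a recommendation of specific numbers.

```python
# Minimal sketch: flagging endpoints whose share of traffic has grown
# sharply versus a baseline, which may indicate pre-attack reconnaissance.
# All numbers and endpoint names here are illustrative.
from collections import Counter

def unusual_endpoints(baseline, today, ratio=5.0, min_hits=20):
    """Return endpoints whose traffic share grew by more than `ratio`."""
    base_total = sum(baseline.values()) or 1
    today_total = sum(today.values()) or 1
    flagged = []
    for path, hits in today.items():
        if hits < min_hits:
            continue  # ignore noise below a minimum request count
        base_share = baseline.get(path, 1) / base_total
        if (hits / today_total) / base_share > ratio:
            flagged.append(path)
    return flagged

baseline = Counter({"/": 9000, "/login": 500, "/search": 500})
today = Counter({"/": 9000, "/login": 500, "/search": 500, "/admin/backup": 60})

print(unusual_endpoints(baseline, today))  # ['/admin/backup']
```

Sixty hits to an obscure endpoint would never trip a volumetric alarm, which is exactly why share-of-traffic comparison, rather than raw volume, is the useful signal here.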
And most recently, a 'third generation' of attack based on 'reflected attacks' is taking hold. These take advantage of legitimate Internet services such as the Domain Name System (DNS), Network Time Protocol (NTP), Simple Network Management Protocol (SNMP) and Simple Service Discovery Protocol (SSDP). Here the attacker makes a legitimate request to the service, but spoofs the source address to be that of the intended target. By making a small request of a few hundred bytes, a much larger response of several thousand bytes can be sent, not to the attacker, but to the designated target, which suddenly starts to receive large volumes of response data to requests it never made. In this way the attacker can 'magnify' their own request traffic by up to two orders of magnitude.
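The arithmetic behind that magnification is worth making concrete. The sketch below uses ballpark request/response sizes of the kind commonly cited for DNS and NTP reflection, not measured figures, to show how modest attacker bandwidth becomes a large flood at the victim.

```python
# Minimal sketch of the amplification arithmetic behind reflected attacks.
# Request/response sizes are ballpark illustrative figures, not measurements.

def amplification_factor(request_bytes, response_bytes):
    """How many bytes land on the victim per byte the attacker sends."""
    return response_bytes / request_bytes

# Small spoofed request in, large response out -- delivered to the victim.
dns_factor = amplification_factor(request_bytes=60, response_bytes=3000)   # 50x
ntp_factor = amplification_factor(request_bytes=8, response_bytes=4000)    # 500x

# 100 Mbps of attacker bandwidth, amplified ~50x, lands ~5 Gbps on the victim.
attacker_mbps = 100
victim_gbps = attacker_mbps * dns_factor / 1000
print(f"{victim_gbps:.1f} Gbps")  # 5.0 Gbps
```

This is why reflection attacks made multi-hundred-gigabit floods practical: the reflectors, not the attacker, supply almost all of the bandwidth.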
During a full-scale attack, any and all of these different mechanisms may be deployed against you at different times, with attack vectors being switched regularly in an attempt to keep the attack one step ahead of any mitigation activities.
You can visualise these attack styles as a funnel, with the greatest number of attacks and greatest volume of data per attack made against the lower TCP layers (higher up the funnel) and then progressing towards application (Layer 7) attacks with lower data volumes and reducing frequency, although conversely probably increased focus against your specific service. Right at the bottom of this funnel you have the most targeted attacks, crafted specifically at your service and based on the results of prior ‘site reconnaissance’.
The 'take-away thought' from this evolution of attacks is that you should be careful about believing protection against low-level TCP attacks will save your service. It will prevent some forms of attack, but attackers will determine which during any 'reconnaissance phase'. You might then be bypassed for an easier target, or a different attack vector selected on the basis it has been shown to be effective, prior to the all-out assault.
You should be prepared for attacks on higher-level services such as DNS and your web application itself, and currently this can also include SSL (TLS)-protected applications. A simple but effective attack could be to launch a relatively small number of 'junk' login requests. That might nicely tie up your resources and achieve the attacker's ultimate goal: disrupting access to your online services.
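One common defence against exactly this kind of 'junk login' attack is per-source rate limiting on the expensive endpoint, so repeated requests are rejected cheaply before they ever reach the application or database tier. The following is a minimal sliding-window sketch; the class name, limit and window values are illustrative assumptions, and a production deployment would sit in front of the application (e.g. at a proxy), not inside it.

```python
# Minimal sketch: per-source sliding-window rate limiting on an expensive
# endpoint such as a login page. Limit and window values are illustrative.
import time
from collections import defaultdict, deque

class LoginRateLimiter:
    def __init__(self, limit=5, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)  # source -> request timestamps

    def allow(self, source, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits[source]
        while q and now - q[0] > self.window:
            q.popleft()  # drop timestamps outside the sliding window
        if len(q) >= self.limit:
            return False  # over the limit: reject before doing real work
        q.append(now)
        return True

limiter = LoginRateLimiter(limit=5, window_seconds=60)
results = [limiter.allow("203.0.113.9", now=t) for t in range(10)]
print(results)  # first five allowed, the remaining five rejected
```

Rejecting a request this way costs almost nothing, whereas letting it through to a password hash check or database lookup is precisely the resource consumption the attacker is counting on.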
So, if you decide you ought to consider your options, we’ve prepared a decision tree to help guide you through the major questions you need to consider, the answers to which will control the options available to you.
Key points to consider are:
Is your infrastructure, and your ISP, capable of and willing to support the 13 to 400Gbps floods that could come your way? Unless you have a very large infrastructure and Internet connection, the answer will almost certainly be no, and your options will lie exclusively with the various managed service/SaaS/cloud options (take your pick of favourite term!). The scale of attacks these days means on-site mitigation is unlikely to be viable, as it is unlikely anyone would maintain such a huge excess of infrastructure capacity.
The managed services then break-up into two main types:
- Swung services
- Always-on services
Akamai's Prolexic now offer a hybrid of the two, which we'll come to.
So what is a swung service and how does it work?
The basic idea behind the 'swung services' is that they operate rather like insurance: you hope you never have to use it, but if you have a fire, you're eternally grateful you took out the policy. With these services, while there is no attack, traffic is routed direct to your own service with no involvement of the mitigation service; but should you come under attack, traffic is 'swung' to route via the mitigation service, which 'scrubs' the traffic before passing on mostly-clean traffic to your service. This model has a number of implications:
- These services usually route entire network blocks, meaning that all traffic on all addresses using all protocols are protected by the mitigation services – or to be more precise – are routed via the mitigation service which may or may not have the ability to detect and mitigate a specific attack on a non-standard port/protocol
- From a security perspective, the mitigation service has no direct access to your traffic other than during an attack. However, the service may well require 'traffic sampling' devices within your infrastructure to enable it to detect the onset of an attack, meaning it does gain 'visibility' of your secure traffic even though that traffic is not normally routed via the service infrastructure. Furthermore…
- For the service to start protecting your site, a traffic routing change must be made to ‘swing’ the traffic via the scrubbing service
- For the traffic routing change to be made, someone within your organisation has to have the ability to authorise that live service change. It’s no good the mitigation service being ready to cope with your attack within 10 minutes, if it takes your change process 10 hours to agree to enable the service
And there are some technical implications too:
- Traffic is usually 'swung' via an update to the 'BGP routing tables', which advertise to the Internet how traffic can reach your service. However, Internet routing convention means the smallest network block that can practically be advertised is 256 addresses – a 'Class C' or CIDR '/24' network. If you don't have your own dedicated /24 subnet (or larger), then you can't use 'BGP swing'
- Traffic passed back from the scrubbing service to your own site is tunnelled for security; however, this tunnel adds per-packet overhead, so there's a requirement to 'tinker' with the MTU (maximum transmission unit) and the TCP MSS (maximum segment size) so that packets can be tunnelled efficiently without fragmentation
These issues can all be overcome; they are not insurmountable in any way, shape or form. However, they are 'details' you and your network staff need to be aware of. It's also worth noting that DNS can be used to 'swing' traffic; however, this introduces its own issues, not least an attacker opting to ignore DNS and simply attack you by IP address.
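Both of those 'details' can be sketched concretely. The snippet below, using Python's standard `ipaddress` module, checks whether an address block is large enough for a BGP swing and computes an MSS that allows for tunnel overhead. The 1500-byte MTU and the 24-byte overhead figure are typical GRE-style assumptions, not universal values; check your own provider's numbers.

```python
# Minimal sketch of the two 'details' above: is the block big enough for a
# BGP swing, and what MSS keeps tunnelled packets within the link MTU?
# The 1500-byte MTU and 24-byte tunnel overhead are typical assumed values.
import ipaddress

def bgp_swingable(prefix):
    """A /24 (256 addresses) or larger is generally routable; smaller is not."""
    return ipaddress.ip_network(prefix).prefixlen <= 24

def tunnel_mss(link_mtu=1500, tunnel_overhead=24, ip_tcp_headers=40):
    """Largest TCP segment that still fits in the link MTU once tunnelled."""
    return link_mtu - tunnel_overhead - ip_tcp_headers

print(bgp_swingable("203.0.113.0/24"))   # True
print(bgp_swingable("203.0.113.0/28"))   # False
print(tunnel_mss())                      # 1436
```

In practice the MSS is usually clamped at a router or firewall rather than computed in application code; the point of the sketch is simply that the numbers have to add up, or tunnelled packets will fragment or be dropped.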
So in what way does an ‘always-on’ service differ?
Well, as the name indicates, with this style of service, traffic always traverses via the mitigation service. This has its own set of implications:
- Typically these services operate on web (HTTP/HTTPS) traffic-only
- The mitigation service will have constant access to your secure traffic (presuming you elect to protect your secure services) as traffic always passes via the service, although the ‘swung services’ will also need to sample your secure traffic anyway and so this may not be such a big difference
- There is no need to detect an attack and make any live service changes to ‘swing traffic’, as the service will simply detect and start to counter any attack as soon as it starts
- Attack mitigation is consequently instantaneous and doesn’t require any ‘middle-of-the-night authorisation’ and service technical changes
- Small-scale attacks, including Slowloris and pre-attack probing, will also be detected and quite possibly mitigated without the need for explicit mitigation actions as all traffic is always being examined and ‘scrubbed’
- Because these services may be HTTP proxy (surrogate)-based, it means you can protect services individually rather than every service in your complete address block, but...
- If the mitigation service is HTTP-only, then it won't protect other services, e.g. SMTP or FTP
As mentioned, Akamai's Prolexic now operate a mid-way option between these two modes of operation. Prolexic has 'traditionally' been a swung service; however, a new offering has traffic always routed via the Prolexic infrastructure (much like an 'always-on' service), but with the 'scrubbing' only enabled during an attack. This means there is no longer a need for a traffic (BGP) swing to mitigate an attack; the traffic is already passing through Prolexic's network, so any enabling of mitigation services is entirely internal to their own network. However, despite the enabling of 'scrubbing' being internal, Prolexic may still require your authorisation to enable the mitigation, so your change control processes may still have a part to play, and someone may still need to be available for that 'middle-of-the-night' call.
In summary therefore you should consider:
- Whether your infrastructure and ISP are capable and willing to absorb massive traffic floods
- If you then decide your only option is a specialist managed service, you must consider:
- Do you opt for a swung ‘insurance policy’ or for an always-on service? Key to this decision will be whether you see benefit from scrubbing traffic outside of a full-scale flood attack and protection against low-volume attacks and probes
- Do you need to protect specific services, e.g. HTTP(S) services, or do you want to protect an entire network block and all traffic entering your infrastructure?