Thursday, 27 October 2011

Event Correlation: There is no such thing as BAU

BAU or Business as Usual is a term that is used to define a number of different things depending on the nature of your business. For projects going through new implementations or changes, the progress through transition, transformation and testing ultimately leads to the point where it is supported by the normal business and technical management processes that will keep it going until the next major change. More generally, BAU is used to describe the steady state of any process, service or infrastructure, the point at which there are no exceptional changes or problems and where it can easily "tick over" in the same way day after day.

BAU can, however, lull you into a false sense of security. Achieving a steady state of operation should make it quicker and easier to identify issues as they arise. Problems occur when you start to treat recurring issues, or low-level, low-impact incidents that happen on a daily basis, as normal or as part of the BAU operation. Once you do that, you may be ignoring the signs of a larger problem which is bubbling away under the surface.

There are a number of shows on TV now that dissect significant incidents and disasters to examine how they were caused. Typically, these incidents are things that are in the public consciousness, were heavily reported in the news at the time and either threatened or took lives. Incidents such as plane crashes, train accidents, ferries sinking or industrial accidents of some kind are all subjects of these shows. The key point made by all of these programmes is that these things don't just happen without any warning signs and cannot be attributed to a single issue or failing. These types of incident are a chain of events which have come together to cause a far more significant incident or disaster. The reason that these issues are not identified in time to prevent a disaster is that they are each only visible to different people, are never correlated or viewed in a holistic fashion and, more often than not, are not considered to be issues at all because they are just things that happen as part of Business as Usual.

For a plane crash, the programme will talk about a number of minor factors which could contribute to an accident: the maintenance team not following proper procedures in order to get their job done quicker, the ground crew who ignore an issue with the plane, the fuel truck driver who incorrectly tries to convert litres to gallons, the pilots who don't get enough sleep and are not fully alert, the air traffic controller working long hours with too many planes to manage, the airport with out-of-date equipment to facilitate landings, the company that transports dangerous materials on the flight without appropriate controls or the airline that pushes for faster turnarounds to make or save more money. These are all typical findings, but each is either treated as a normal event or not given visibility at a level where the overall risk to the flight itself can be assessed. It's not until after the event that someone (typically the team investigating the crash) puts all the pieces together to show what led up to it. By then it's too late.

The same applies in information security. Events don't just occur and incidents don't happen without warning. However, there are often minor issues which are ignored as "acceptable failings": the patches that don't get applied in time, the ongoing virus detections which are quickly handled by the AV and not investigated, that one ID that always seems to log failed access attempts, the documentation not completed during changes because it holds the process up too much, and the demands from customers and management to respond faster and achieve more in less time. On their own, these are things that may just be treated as BAU occurrences, but they may actually be symptoms of a larger problem bubbling away under the surface. The only way you're going to identify the true risk posed by the aggregation of these events is firstly to have visibility of them and secondly to understand how these individual issues might ultimately cause a larger problem. This is where event correlation is important.

There are plenty of options for Security Information and Event Management (SIEM) tool sets to correlate event data from the many technical sources within your environment. The signature of a security event can comprise information from many sources in the network which individually may not seem significant. However, SIEM tools are only part of the solution. Although they can sift through potentially millions of alerts and log entries to give a concise and actionable picture of technical events, this then needs to be combined with other information to give you a correlation at a higher level. Process failings and incidents which are not detected through technical measures are also elements which can contribute to a security incident, and it may be that a low-level correlated event from your SIEM system, combined with additional information gathered externally, indicates a more significant threat that you are facing. Security management standards such as ISO 27001 define the importance of measuring the effectiveness of all your security controls, not just the technical ones, as an ineffective manual or procedural control can just as easily contribute to a security incident. The human element can not only be the weakest link but is typically also performing the types of controls where failings and effectiveness shortfalls are far more difficult to detect because no technical monitoring is in place.
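To make the idea of correlation concrete, here is a minimal Python sketch of the kind of rule described above. It is an illustration only and not how any particular SIEM product works: the `Event` fields, the source names, the 24-hour window and the three-source threshold are all assumptions chosen for the example. It flags any entity (a host, user ID or service) for which minor events from several different sources, technical or procedural, fall within the same time window.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    timestamp: datetime
    source: str   # hypothetical source names, e.g. "av", "auth", "patching"
    entity: str   # the host, user ID or service the event relates to
    detail: str


def correlate(events, window=timedelta(hours=24), min_sources=3):
    """Return {entity: events} for entities that have events from at least
    `min_sources` distinct sources within a single sliding time window."""
    by_entity = defaultdict(list)
    for event in events:
        by_entity[event.entity].append(event)

    flagged = {}
    for entity, evs in by_entity.items():
        evs.sort(key=lambda e: e.timestamp)
        start = 0
        for end in range(len(evs)):
            # Shrink the window from the left until it spans at most `window`.
            while evs[end].timestamp - evs[start].timestamp > window:
                start += 1
            cluster = evs[start:end + 1]
            if len({e.source for e in cluster}) >= min_sources:
                flagged[entity] = cluster
                break
    return flagged


# Made-up sample data: each event on its own looks like "BAU".
t0 = datetime(2011, 10, 27, 9, 0)
events = [
    Event(t0, "av", "srv-01", "virus detected and cleaned"),
    Event(t0 + timedelta(hours=3), "auth", "srv-01", "repeated failed logins"),
    Event(t0 + timedelta(hours=5), "patching", "srv-01", "security patch overdue"),
    Event(t0, "auth", "srv-02", "failed login"),
    Event(t0 + timedelta(hours=1), "auth", "srv-02", "failed login"),
]
flagged = correlate(events)
for entity, cluster in flagged.items():
    print(entity, [e.source for e in cluster])
```

In this sample only `srv-01` is flagged, because it has events from three different sources within a day, while `srv-02` only ever reports from one. Procedural findings, such as an incomplete change record, could be fed in as events from another source in the same way.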

The upshot is that it is important to know your operating environment and to have an overall view of both the minor incidents that may currently be treated as 'normal' and the effectiveness of all your controls, both technical and procedural. Only by correlating the risk of each event, and by understanding that the combined risk may be intolerable even when the risk of each individual event is negligible, will you be able to predict and prevent the big incidents or disasters.
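As a rough illustration of how negligible individual risks can add up, the sketch below uses a deliberately simple model that is an assumption for this example and not a method taken from any standard: each minor issue is given a small, independent probability of contributing to an incident, and the combined risk is the probability that at least one of them does. Real issues are rarely independent, so treat the figures as indicative only.

```python
from math import prod


def combined_risk(probabilities):
    """Probability that at least one issue leads to an incident,
    assuming (for illustration) that the issues are independent."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Twenty "acceptable failings", each with an assumed 2% chance of mattering.
minor_issues = [0.02] * 20
print(f"largest single risk: {max(minor_issues):.0%}")
print(f"combined risk:       {combined_risk(minor_issues):.0%}")
```

With these made-up figures no single issue exceeds 2%, yet the combined figure comes out at roughly a third, which is the kind of gap that only becomes visible when the issues are looked at together.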

Photo: David Castillo Dominici
