By Gui Alvarenga, Product Marketing Manager, published January 31, 2020
Today’s enterprises have typically reached high levels of cloud maturity, including distributed application architectures that are based on advanced cloud technologies such as containers and FaaS (Function as a Service, aka serverless). F5’s 2019 State of Application Services report found that more than half (56%) of the enterprises surveyed were using containers for a wide range of use cases, including serverless.
The latest published Cloud Native Computing Foundation (CNCF) survey indicated that 38% of the respondents were currently using serverless technology and an additional 26% were planning on doing so within the next 18 months. In fact, market researchers expect the FaaS market to grow to $7.72 billion by 2021 (up from $1.88 billion in 2016), at a remarkable compound annual growth rate of 32.7%.
Another clear trend among cloud-mature organizations is the deployment of complex hybrid and multicloud infrastructures. The RightScale (now Flexera) 2019 State of the Cloud Report indicates that 84% of enterprises have a multicloud strategy, of which 58% are hybrid (public and private), with 33% of their workloads running in a public cloud and 46% in a private cloud. On the public multicloud front, organizations that use any public cloud typically run workloads in two public clouds and are experimenting with 1.8 more.
As applications and infrastructures reach higher levels of sophistication and maturity, they generate a very high volume and diversity of monitoring data. A typical enterprise has at least four data-ingestion tools, three monitoring tools, and multiple security analytics tools, each with its own logs and dashboards. In the face of billions of event log records, many of which are simple text strings with no context, traditional log management solutions are unable to provide the visibility and context required to identify and analyze an attack. In the absence of advanced analytics and visualization capabilities, the cloud security incident response process typically takes 200+ days, during which time considerable human resources are expended.
This blog post will explore cloud visibility and monitoring challenges and provide some initial guidelines on how to mitigate them in order to enhance the organization’s security posture.
Obscured Visibility
In the shared security responsibility model, the public cloud customer is responsible for securing its data and traffic flows. This responsibility, however, is difficult to uphold if an enterprise has poor visibility into its cloud assets across complex hybrid and multicloud environments. Organizations often cannot effectively monitor basic issues such as who is accessing a cloud service or application; where the traffic is coming from; or misconfigured controls that, for example, inappropriately expose data storage resources to the internet.
It is not surprising, therefore, that Keysight’s The State of Cloud Monitoring report, based on a global survey of over 330 IT professionals, found that the number one priority for public cloud users was gaining visibility into application and data traffic. Yet less than 20% of the respondents felt their organization could properly monitor their public cloud environments, and 87% feared that this lack of cloud visibility could obscure security threats.
Enterprises must find a way to integrate fragmented (private/public, multiple public providers) monitoring and security stacks for enhanced visibility that provides:
- A comprehensive resource inventory that is automatically kept up to date so that asset and risk management is based on a complete and single source of truth.
- A real-time view of where, when, and by whom data and workloads are being accessed, backed by granular Identity and Access Management (IAM) controls.
- Dynamic change monitoring and anomaly detection via alerts.
- Continuous configuration assessment against corporate policies and industry best practices (illustrated by the sketch after this list).
- Cloud forensics in general and timely incident investigation and remediation in particular.
- A high level of visualization for at-a-glance understanding of system health and performance.
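To make the configuration-assessment point concrete, here is a minimal sketch of an automated check using boto3, the AWS SDK for Python. It only looks for S3 buckets that lack a public access block configuration; a real assessment would cover many more controls (encryption, logging, bucket policies, and so on), and the script assumes AWS credentials are already configured in the environment.

```python
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def buckets_without_public_access_block():
    """Return names of S3 buckets that have no public access block configured."""
    findings = []
    for bucket in s3.list_buckets()["Buckets"]:
        name = bucket["Name"]
        try:
            s3.get_public_access_block(Bucket=name)
        except ClientError as err:
            # This error code means no public access block exists for the bucket.
            if err.response["Error"]["Code"] == "NoSuchPublicAccessBlockConfiguration":
                findings.append(name)
            else:
                raise
    return findings

if __name__ == "__main__":
    for name in buckets_without_public_access_block():
        print(f"Bucket without public access block: {name}")
```

Run on a schedule and fed into a shared dashboard, even a simple check like this becomes part of the continuous, single-source-of-truth visibility described above.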
Dynamic Applications
Traditional monolithic, single-tiered, and self-contained applications can certainly run on cloud infrastructures, but they cannot take advantage of the agility and flexibility of cloud resources and services. Today, therefore, the trend is toward microservice architectures, whereby an application is actually a collection of small autonomous services that communicate over a lightweight protocol such as HTTP REST APIs. Each microservice is responsible for its own UI, logic and database requirements, with the development team designing and centrally managing the modular application to ensure that the components are orchestrated and optimized.
Two advanced cloud technologies have emerged in the wake of the cloud-native microservice approach: serverless functions (Functions as a Service, FaaS) and containers. FaaS abstracts the infrastructure layer from the application layer. In other words, developers build their code and specify infrastructure requirements. During runtime, a fully managed service such as AWS Lambda automatically runs the event-triggered code, scaling resources continuously and automatically as required.
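As a minimal illustration of the FaaS model, the sketch below shows an AWS Lambda handler written in Python that emits structured, queryable log records for each invocation. The event fields ("orderId", "source") are hypothetical and depend entirely on the trigger (API Gateway, S3, SQS, and so on); the point is that the developer writes only the function, while the platform handles provisioning and scaling.

```python
import json
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def handler(event, context):
    """Event-triggered Lambda handler that logs correlation-friendly context."""
    # Emit fields that monitoring tools can later correlate: the request ID,
    # remaining execution time, and selected parts of the triggering payload.
    logger.info(json.dumps({
        "request_id": context.aws_request_id,
        "remaining_ms": context.get_remaining_time_in_millis(),
        "order_id": event.get("orderId"),
        "source": event.get("source"),
    }))
    return {"statusCode": 200, "body": json.dumps({"processed": True})}
```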
Containers provide yet another level of abstraction. Docker defines a container as “…a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another…” A container image is a stand-alone executable software package that is launched at runtime by a container engine that ensures the software will run the same regardless of the host operating system and underlying infrastructure. Container orchestrators like Kubernetes automate the deployment, scaling, and management of containerized applications.
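The container engine's role as a runtime abstraction can be sketched with the Docker SDK for Python. The example assumes a local Docker engine is running and uses a generic public image purely for illustration; the same image produces the same behavior whether the host is a laptop, a CI runner, or a cloud VM.

```python
import docker

# Connect to the local Docker engine using environment configuration
# (DOCKER_HOST, TLS certificates, and so on).
client = docker.from_env()

# Run a container from a standard image; "alpine" is just an illustrative
# public image. remove=True cleans up the container after it exits.
output = client.containers.run("alpine", ["echo", "hello from a container"], remove=True)
print(output.decode().strip())
```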
The bottom line is that compute services in cloud-native applications are designed to be ephemeral and tend to have short lifespans. As a result, attackers cannot easily establish a long-term presence in a system and must, therefore, shift tactics. For example, a method that has become popular is the Groundhog Day Attack, in which an attacker leverages the automatic scaling of cloud-native applications to craft much shorter, repetitive attacks that steal just a few credit card numbers at a time. Such short-lived, repetitive attacks are difficult to detect with traditional monitoring.
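One way to surface this pattern is to aggregate many individually benign events over a long window. The sketch below assumes a simplified, hypothetical record shape (dicts with "timestamp", "source", and "action" keys standing in for normalized log records); a single invocation stealing two records looks harmless, but hundreds of identical invocations across a day do not.

```python
from collections import Counter
from datetime import datetime, timedelta

def flag_repetitive_low_volume_events(events, window=timedelta(hours=24), threshold=50):
    """Flag (source, action) pairs that recur many times within the window."""
    cutoff = datetime.utcnow() - window
    counts = Counter(
        (e["source"], e["action"])
        for e in events
        if e["timestamp"] >= cutoff
    )
    return [key for key, n in counts.items() if n >= threshold]
```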
Other security challenges posed by microservices, serverless functions, containers, and container orchestration are:
- Third-party container images and other open-source modules whose pedigrees and security postures are unknown.
- With containerized applications involving communication among many (sometimes hundreds) of microservices, there is a high proportion of east-west traffic. Although it is important to monitor these lateral flows, it is very challenging to gain visibility into the packet and application-level data.
How to Improve Cloud Visibility
In this section, we look at three approaches that can mitigate cloud visibility challenges: automated risk analysis, orchestration, and advanced data analytics using artificial intelligence (AI) and machine learning (ML).
Automated Risk Analysis
Given the scale and diversity of log data, an essential first step in risk analysis is cleaning up the data by differentiating between significant signals and noise. After careful consideration of the criteria that define what noise is, automated methods can then be used to scan log data for signals and patterns that are most relevant to assessing risk levels.
Once the log data has been filtered, other automated methods can be used to enrich that data with both external sources, such as threat intelligence, and internal sources, such as asset management systems. The result is real-time and precise risk level analysis that is relevant to the organization’s unique business and IT environments.
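A minimal sketch of this filter-then-enrich step is shown below. The field names and the in-memory lookup tables are hypothetical; in practice the lookups would be backed by a threat-intelligence feed and an asset-management system rather than hard-coded dictionaries.

```python
# Assumed low-risk event types to drop, plus stand-ins for external and
# internal enrichment sources.
NOISY_EVENT_TYPES = {"heartbeat", "health_check"}
THREAT_INTEL = {"203.0.113.7": "known scanner"}                      # hypothetical feed
ASSET_INVENTORY = {"i-0abc123": {"owner": "payments", "criticality": "high"}}

def enrich(records):
    for record in records:
        if record.get("event_type") in NOISY_EVENT_TYPES:
            continue  # filter out noise before analysis
        record["threat_intel"] = THREAT_INTEL.get(record.get("source_ip"))
        record["asset"] = ASSET_INVENTORY.get(record.get("instance_id"))
        # Risk rises when a known-bad source touches a business-critical asset.
        record["risk"] = (
            "high"
            if record["threat_intel"] and (record["asset"] or {}).get("criticality") == "high"
            else "low"
        )
        yield record
```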
Although automation can help reduce event noise, it is important that a fine balance be struck between what is truly low risk and can therefore be automatically filtered out, and what is not. These judgment calls and decisions are best made by security practitioners.
Orchestration
Many cloud operational tasks have become automated, which is a good thing. However, when automated tasks are deployed at scale, they can, in themselves, become an obstacle to visibility. Each team—security, operations, development—works in its own silo, with its own “proprietary” automation tools and scripts, leaving the company as a whole with little oversight or control.
Cloud orchestration organizes discrete automated tasks, diverse workloads, and fragmented tool stacks to work together seamlessly in a coordinated and efficient workflow. Orchestration workflows are often called playbooks, a term borrowed from the world of team sports. The orchestrator is like a coach who knows the capabilities of each “player” (i.e., a task, resource, service, environment, and so on) and how to get those players to work together in the right sequence to achieve a goal.
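The playbook idea can be sketched in a few lines of Python. This is a toy runner, not any particular orchestration product: each step is a callable, the orchestrator runs them in sequence, and it surfaces the failing "player" instead of failing silently. Real orchestrators add scheduling, retries, parallelism, and per-environment parameters.

```python
def snapshot_inventory():
    print("collecting resource inventory")

def assess_configuration():
    print("comparing configuration against policy")

def notify_security_team():
    print("publishing findings to the shared dashboard")

# The playbook: an ordered sequence of tasks the orchestrator coordinates.
PLAYBOOK = [snapshot_inventory, assess_configuration, notify_security_team]

def run_playbook(steps):
    for step in steps:
        try:
            step()
        except Exception as exc:
            print(f"playbook halted at {step.__name__}: {exc}")
            break

if __name__ == "__main__":
    run_playbook(PLAYBOOK)
```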
Orchestration can improve cloud visibility by:
- Modeling corporate policies once and then enforcing them consistently and automatically across all accounts, regions, and assets.
- Maintaining self-service portals with pre-approved and standardized provisioning and deployment templates.
- Providing single-source-of-truth dashboards and other visualizations that are shared across all relevant teams, such as security, network, development, and operations.
- Tidying up the environment by detecting and eliminating superfluous resources, such as idle instances or orphaned storage volumes (see the sketch after this list).
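As an example of the last point, the boto3 sketch below lists EBS volumes that are not attached to any instance (status "available"). Whether such a volume is truly orphaned is a judgment call for the owning team; this only surfaces candidates, and it assumes AWS credentials and a default region are configured.

```python
import boto3

ec2 = boto3.client("ec2")

def unattached_volumes():
    """Yield (volume_id, size_gib) for EBS volumes not attached to any instance."""
    paginator = ec2.get_paginator("describe_volumes")
    pages = paginator.paginate(Filters=[{"Name": "status", "Values": ["available"]}])
    for page in pages:
        for volume in page["Volumes"]:
            yield volume["VolumeId"], volume["Size"]

if __name__ == "__main__":
    for volume_id, size_gib in unattached_volumes():
        print(f"{volume_id}: {size_gib} GiB, not attached")
```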
Advanced Analytics Powered by AI and ML
One of the key challenges in cloud visibility is the diversity of monitoring tools, each generating huge volumes of event logs and performance metrics and each in its own proprietary format. Today, there are next-generation cloud monitoring solutions that aggregate and normalize these diverse data sources into a data meta-layer on which advanced data analytics methods like AI and ML can be applied.
AI and ML use big data to enhance cloud visibility in a number of ways. For example, historical big data is used to train AI models of what constitutes “normal” baseline behavior or performance. Once trained, these models are used during runtime to detect and alert to out-of-range, anomalous performance or activities.
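The baseline-then-detect pattern can be sketched with scikit-learn's IsolationForest. The training data here is synthetic (request-rate and error-rate pairs standing in for real monitoring metrics), so the example only illustrates the workflow: train on history, then score new observations at runtime.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic "historical" metrics: request rate around 100 req/s, error rate around 0.5%.
rng = np.random.default_rng(42)
history = rng.normal(loc=[100.0, 0.5], scale=[10.0, 0.1], size=(1000, 2))

# Train the baseline model on history; contamination is the assumed anomaly rate.
model = IsolationForest(contamination=0.01, random_state=42).fit(history)

# Score new observations: one near the learned baseline, one far outside it.
new_points = np.array([[102.0, 0.55], [400.0, 7.0]])
for point, label in zip(new_points, model.predict(new_points)):
    status = "anomalous" if label == -1 else "normal"
    print(f"request_rate={point[0]:.0f}, error_rate={point[1]:.2f} -> {status}")
```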
ML is also used for graphical event modeling, in which an information model interprets events into a graphical relationship model. A good example of ML-based graphical relationship modeling is the graphic representation of a kill chain of cyber events carried out by an adversary within the cloud. Instead of trying to piece together the breadcrumbs of events and logs, the graphical relationship model lets the security analyst quickly draw conclusions from an attack that is happening, or has happened, in their network.
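A minimal sketch of such a relationship model, built with the networkx library, is shown below. The event data is illustrative rather than drawn from a real log: normalized events become edges in a directed graph, and a path through the graph approximates the kill chain an analyst would otherwise reconstruct by hand.

```python
import networkx as nx

# Illustrative normalized events: (source entity, target entity, observed action).
events = [
    ("external_ip", "web_server", "brute-force login"),
    ("web_server", "iam_role", "credential theft"),
    ("iam_role", "s3_bucket", "data exfiltration"),
]

graph = nx.DiGraph()
for source, target, action in events:
    graph.add_edge(source, target, action=action)

# Reconstruct the chain from initial access to impact for the analyst.
chain = nx.shortest_path(graph, "external_ip", "s3_bucket")
print(" -> ".join(chain))
```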
In short, advanced data analytics methods like AI and ML provide insights and context that enhance visibility across even the most complex hybrid and multicloud architecture.
Summary
It is difficult to achieve actionable visibility into today's distributed cloud-native applications and complex cloud infrastructures. Ephemeral instances and microservices are spun up and down automatically and dynamically, often across multiple cloud providers and hybrid environments. Log data generated by fragmented monitoring tool stacks is of such volume, variety, and velocity that traditional log management tools usually fail to extract real-time, contextual insights into threats and risk levels.
Automation, orchestration, and advanced analytics are the keys to achieving meaningful cloud visibility in many domains, including security. Check Point's CloudGuard Log.ic, for example, combines cloud inventory and configuration information with a wide variety of real-time monitoring data sources to deliver advanced and contextualized cloud security intelligence. Log.ic's automated detection of anomalies, security posture visualizations, intuitive querying, and smart alerts dramatically shorten the time and cost of security incident response.