Anomaly detection in sequence data is becoming increasingly important for detecting cyber security intrusions. The Markov chain technique is widely accepted because it is simple to implement and requires few parameters.
A Markov chain-based detection model can be described as a discrete-time stochastic process: a set of random variables together with a rule for how those variables change over time. A Markov chain can model a series of events in which the state that occurs next depends only on the previous state. Here, a series of events represents user activity, and a state represents the conditions (i.e., sensor values, on/off status) of the sensors in a smart device. The probabilistic condition of the Markov chain is given in Equation 1, where X_t denotes the state at time t.
$$P(X_{t+1} = x \mid X_1 = x_1, X_2 = x_2, \ldots, X_t = x_t) = P(X_{t+1} = x \mid X_t = x_t) \tag{1}$$
Equation (1) holds when
$$P(X_1 = x_1, X_2 = x_2, \ldots, X_t = x_t) > 0$$
We use a modified version of the general Markov chain. Instead of predicting the next state, we determine the probability of a transition occurring between two states at a given time. We record the conditions of the sensors at times t and t+1. Let a and b be a sensor’s states at times t and t+1, respectively. We look up the probability of the transition from a to b in the transition matrix. If the transition from a to b is malicious, the probability obtained from the transition matrix will be zero, because that transition does not occur in the normal behavior the matrix was built from. For example, consider a resource usage pattern USERNAME->HOSTNAME->PROCESSNAME. The pattern “ertugrul->testserver01->java”, which corresponds to normal behavior for the user ertugrul, is reasonable and does not cause an alert. However, the resource usage ertugrul->testserver01->perl is extremely unlikely and raises an alert.
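The transition lookup described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not SureLog's implementation: the function names and the training sequences (including the host testserver02) are hypothetical, and a transition is treated as suspicious only when its estimated probability is exactly zero.

```python
from collections import defaultdict


def build_transition_matrix(sequences):
    """Estimate P(next state | current state) from observed normal sequences."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    matrix = {}
    for a, successors in counts.items():
        total = sum(successors.values())
        matrix[a] = {b: n / total for b, n in successors.items()}
    return matrix


def transition_probability(matrix, a, b):
    """Probability of moving from state a to state b; 0.0 if never observed."""
    return matrix.get(a, {}).get(b, 0.0)


def first_unseen_transition(matrix, seq):
    """Return the first (a, b) pair in seq with zero probability, or None."""
    for a, b in zip(seq, seq[1:]):
        if transition_probability(matrix, a, b) == 0.0:
            return (a, b)
    return None


# Hypothetical training data: USERNAME -> HOSTNAME -> PROCESSNAME
normal_sequences = [
    ["ertugrul", "testserver01", "java"],
    ["ertugrul", "testserver01", "java"],
    ["ertugrul", "testserver01", "java"],
    ["ertugrul", "testserver02", "java"],
]
matrix = build_transition_matrix(normal_sequences)
```

With this training data, the sequence ertugrul->testserver01->java contains only known transitions, while ertugrul->testserver01->perl is stopped at the pair (testserver01, perl). A production system would likely use a small probability threshold rather than an exact zero so that rare but legitimate transitions can be ranked instead of simply flagged.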
Markov Chain Mixture Models are a simple yet rich tool for modeling the sequences of actions a user performs during a session. Such a model can fit different user profiles and predict the likelihood of a sequence of actions during an application session.
One of SureLog SIEM's approaches to anomaly detection uses a Markov chain model, a technique from machine learning, to learn patterns of normal behavior. Once SureLog has learned a probabilistic model of normal behavior, it compares that model against observed log messages in order to detect anomalous behavior. Essentially, the learned model allows SureLog to estimate the probability of each observed log message, given the previously observed sequence of messages. SureLog then labels the messages with the lowest estimated probabilities as anomalous and alerts a system administrator to investigate further.
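As a rough illustration of ranking messages by estimated probability, the sketch below scores each observed message given only the message immediately before it (a first-order Markov assumption) and sorts the results from least to most likely. The function name and the message labels are hypothetical, and the way SureLog actually represents log messages is not described here.

```python
from collections import Counter


def score_messages(training, observed):
    """Return (probability, previous, message) tuples, lowest probability first.

    Probabilities are estimated as count(previous, message) / count(previous)
    over the training stream.
    """
    pair_counts = Counter(zip(training, training[1:]))
    prev_counts = Counter(training[:-1])
    scored = []
    for prev, msg in zip(observed, observed[1:]):
        total = prev_counts[prev]
        p = pair_counts[(prev, msg)] / total if total else 0.0
        scored.append((p, prev, msg))
    return sorted(scored, key=lambda item: item[0])


# Hypothetical message types standing in for parsed log messages.
training = ["login", "read", "logout"] * 5 + ["login", "delete", "logout"]
observed = ["login", "read", "logout", "login", "delete", "logout"]

ranked = score_messages(training, observed)
most_suspicious = ranked[0]  # the least likely transition in the observed stream
```

Here the transition from login to delete ends up at the top of the ranking because it was seen only once in six logins during training. How many of the lowest-ranked messages to report to an administrator is a tuning decision.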
An example of an event is shown in Table 1.
| | processID | UserID | Taxonomy | File | Time |
|---|---|---|---|---|---|
| Event | 10 | Administrator | Informational.Authentication.Success | C:\Windows\temp\a.dat | 28/12/2020 00:00:00 |

Table 1. Example of a SureLog event
The major computation steps are as follows:
1. Generate the new event
2. Determine the density of the new event
   a. Calculate the distance for each pattern
   b. Calculate the distance for each event
3. Decide the legitimacy of the new event
4. Add valid event to pattern
WHAT IS UEBA?
Hackers can break through firewalls, send you e-mails with malicious and infected attachments, or even bribe an employee to get past your firewalls. Old tools and systems are quickly becoming obsolete, and there are several ways to get past them.
User and entity behavior analytics (UEBA) gives you a more comprehensive way to make sure that your organization has top-notch IT security, while also helping you detect users and entities that might compromise your entire system.
User and entity behavior analytics, or UEBA, is a type of cyber security process that takes note of the normal conduct of users. In turn, it detects any anomalous behavior, that is, instances where there are deviations from these “normal” patterns. For example, if a particular user regularly downloads 10 MB of files every day but suddenly downloads gigabytes of files, the system would be able to detect this anomaly and alert the administrators immediately.
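The download example can be illustrated with a simple statistical baseline. The sketch below assumes a z-score test with a threshold of three standard deviations; the function name, the threshold, and the sample figures are illustrative choices, not a description of what any particular UEBA product applies.

```python
import statistics


def is_anomalous(history_mb, today_mb, threshold=3.0):
    """Flag today's download volume if it lies more than `threshold`
    sample standard deviations away from the user's historical mean."""
    mean = statistics.mean(history_mb)
    stdev = statistics.stdev(history_mb)  # needs at least two data points
    if stdev == 0:
        # A perfectly constant history: any change at all is a deviation.
        return today_mb != mean
    return abs(today_mb - mean) / stdev > threshold


# Hypothetical daily download volumes (in MB) for one user.
history = [10, 9, 11, 10, 12, 8, 10]

normal_day = is_anomalous(history, 11)       # a typical day
spike_day = is_anomalous(history, 2048)      # roughly 2 GB in one day
```

A user who normally downloads about 10 MB a day stays well inside the threshold at 11 MB, while a 2 GB day falls far outside it.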
UEBA uses machine learning, algorithms, and statistical analyses to know when there is a deviation from established patterns, showing which of these anomalies could result in a potential, real threat. UEBA can also aggregate the data you have in your reports and logs, as well as analyze file, flow, and packet information.
In UEBA, you do not track security events or monitor devices; instead, you track all the users and entities in your system. As such, UEBA focuses on insider threats: employees who have gone rogue, employees who have already been compromised, and people who already have access to your system and then carry out targeted attacks and fraud attempts. It also covers the servers, applications, and devices that are working within your system.
BENEFITS OF UEBA
It is the unfortunate truth that today’s cyber security tools are fast becoming obsolete, and more skilled hackers and cyber attackers are now able to bypass the perimeter defenses that are used by most companies. In the old days, you were secure if you had web gateways, firewalls, and intrusion prevention tools in place. This is no longer the case in today’s complex threat landscape, and it’s especially true for bigger corporations that are proven to have very porous IT perimeters that are also very difficult to manage and oversee.
SURELOG UEBA ANOMALY DETECTION STEPS
The entire workflow can be broken down into four distinct phases. We will describe each one of them below.
1) Data Preparation – In the first step, the workflow obtains relevant data from all the data sources. It applies all the defined filters, groups data by identified entities and prepares data for the next feature extraction stage.
2) Feature Extraction – In this step, data is obtained from all the relevant fields and grouped by entity per day, and the configured features are computed and stored.
3) Behavior Profiling – In this step, the extracted features of each entity are grouped into the configured baselines, and the machine learning model (SVD) is applied to generate a behavior profile for that entity.
4) Anomaly Detection – In the final step, the test feature values are scored against the behavior profile, and an event is generated with an associated confidence score.
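The four phases can be sketched roughly as follows. This is a minimal illustration, not SureLog's implementation: the event fields (entity, day, bytes), the two features, and the scoring rule are assumptions, and the SVD is reduced to its leading right singular vector, computed by power iteration so that the sketch needs only the standard library.

```python
import math
from collections import defaultdict


def extract_features(events):
    """Phases 1-2: group events by (entity, day) and compute the
    assumed features [event_count, total_bytes] for each group."""
    features = defaultdict(lambda: [0.0, 0.0])
    for event in events:
        row = features[(event["entity"], event["day"])]
        row[0] += 1.0
        row[1] += event["bytes"]
    return dict(features)


def build_profile(rows, iterations=100):
    """Phase 3: return the baseline mean and the dominant direction of the
    centered baseline rows (leading right singular vector, via power iteration)."""
    n = len(rows[0])
    mean = [sum(r[j] for r in rows) / len(rows) for j in range(n)]
    centered = [[r[j] - mean[j] for j in range(n)] for r in rows]
    v = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        projections = [sum(c[j] * v[j] for j in range(n)) for c in centered]
        w = [sum(c[j] * p for c, p in zip(centered, projections)) for j in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v


def anomaly_score(profile, row):
    """Phase 4: distance between a feature row and the line through the
    baseline mean along the dominant direction (larger means more unusual)."""
    mean, v = profile
    centered = [x - m for x, m in zip(row, mean)]
    along = sum(c * d for c, d in zip(centered, v))
    residual = [c - along * d for c, d in zip(centered, v)]
    return math.sqrt(sum(x * x for x in residual))


def day_events(entity, day, count, size):
    """Hypothetical helper: `count` events of `size` bytes for one entity-day."""
    return [{"entity": entity, "day": day, "bytes": size} for _ in range(count)]


# Five baseline days for a hypothetical entity "alice".
events = []
for day, count in enumerate([10, 12, 8, 11, 9]):
    events += day_events("alice", day, count, 10.0)
features = extract_features(events)
profile = build_profile([features[("alice", d)] for d in range(5)])
```

In this toy baseline, total bytes always equals ten times the event count, so a new day that keeps that ratio scores close to zero while a day with ten events of 500 bytes each scores high. The raw residual is not a confidence score; mapping it to one, normalizing features with different scales, and keeping more than one singular vector are all left out of the sketch.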
The bottom line? Preventive measures are no longer enough. Your firewalls are not going to be 100% foolproof, and hackers and attackers will get into your system at one point or another. This is why detection is equally important: when hackers do successfully get into your system, then you should be able to detect their presence quickly in order to minimize the damage.
With the SureLog UEBA module, you can detect:
New resource access: An alert is raised when a resource is accessed for the first time (for example, a user accessing a computer for the first time, a new remote connection from a client to a server, or a new process running on a server).
Sample use cases:
1. Account accessing a host for the first time
2. User creating/modifying stored procedure for the first time
3. DNS Server(s) not seen before
4. DNS Server(s) not used by Peers
5. Possible use of unauthorized devices – MAC address never seen before
6. Account authentication from a geolocation never seen before
7. Account authentication from a geolocation never used before
8. Users logging in from location (or IP) never seen before
9. Account accessing a file share never accessed before
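Most of the use cases above reduce to the same check: has this entity been seen with this resource before? A minimal sketch of that check is shown below. The class and method names, and the sample accounts and hosts, are hypothetical; a real deployment would also need a persisted store and a defined learning period.

```python
class FirstSeenTracker:
    """Remember which (entity, resource) pairs have been observed."""

    def __init__(self):
        self._seen = set()

    def learn(self, entity, resource):
        """Record a pair during the baseline period without alerting."""
        self._seen.add((entity, resource))

    def observe(self, entity, resource):
        """Return True if this pair is new (alert-worthy), then remember it."""
        key = (entity, resource)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


# Hypothetical baseline: the account "alice" normally uses host-01.
tracker = FirstSeenTracker()
tracker.learn("alice", "host-01")

known_access = tracker.observe("alice", "host-01")   # seen during baseline
new_access = tracker.observe("alice", "host-02")     # first time on this host
```

The same structure works for the other cases in the list by changing what the pair represents, for example (account, geolocation), (device, MAC address), or (account, file share).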