Anomaly detection in sequence data is increasingly important for detecting intrusions in cyber security. Principal Component Analysis (PCA) is a simple, mature statistical technique that is widely used for anomaly detection.
PCA uses the Singular Value Decomposition (SVD) to find low rank representations of the data. The robust version of PCA (RPCA) identifies a low rank representation, random noise, and a set of outliers by repeatedly calculating the SVD and applying “thresholds” to the singular values and error for each iteration.
A robust algorithm is paramount to the success of any anomaly detection system and RPCA has worked very well for detecting anomalies.
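To make the low-rank idea concrete, here is a minimal sketch that uses NumPy's SVD to build a rank-k PCA reconstruction of a small matrix and then measures how far each row sits from it. The function name `low_rank_approximation`, the toy data, and the choice k = 1 are assumptions made for illustration only.

```python
import numpy as np

def low_rank_approximation(X, k):
    """Return the rank-k PCA reconstruction of X (rows are observations)."""
    mean = X.mean(axis=0)
    Xc = X - mean                                   # PCA works on mean-centered data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Xk = (U[:, :k] * s[:k]) @ Vt[:k, :]             # keep the k largest singular values
    return Xk + mean

# Toy data (assumed): the second column is roughly twice the first,
# except in the last row, which breaks the pattern.
X = np.array([[1.0, 2.0],
              [2.0, 4.1],
              [3.0, 5.9],
              [4.0, 8.0],
              [5.0, 2.0]])

X1 = low_rank_approximation(X, k=1)
row_error = np.linalg.norm(X - X1, axis=1)          # distance of each row from the rank-1 fit
print(row_error)
```

The row that breaks the pattern ends up with the largest reconstruction error, but it also pulls the fitted direction toward itself. That sensitivity to outliers is the weakness the robust variant is meant to address.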
WHAT IS RPCA?
Robust Principal Component Analysis (RPCA) is a modified form of PCA that works with corrupted observations and outliers [25]. While PCA is susceptible to outliers, RPCA can recover a more accurate low-dimensional space, and that is why RPCA is preferred over standard PCA for anomaly detection in network data. RPCA recovers a low-rank matrix L and a sparse matrix S from corrupted measurements; in the case of network data, S holds the anomalies of the data set. Robust PCA works in conjunction with the Singular Value Decomposition (SVD), a factorization method that separates a matrix into distinct unitary and diagonal matrices, giving an optimization problem as follows:
$$\min_{L,S} \; \|L\|_* + \lambda \|S\|_1$$
$$\text{subject to} \quad |M - (L + S)| \le \varepsilon$$
In this formulation, L is the low-rank matrix, which can be factorized with an SVD; ||L||∗ is the nuclear norm of L (the sum of its singular values); λ is the coupling constant between L and S; ||S||1 is the sum of the absolute values of the entries of S; and ε is the matrix of point-wise error constants that account for the noise present in real-world data. Because of the nature of network data, traditional PCA would be too sensitive to outliers and would prove ineffective for anomaly detection. As a result, RPCA was the statistical method chosen for anomaly detection.
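The sketch below shows one common way to solve a simplified version of this problem. It is an assumption-laden illustration, not SureLog's or Surus's implementation: it handles only the noise-free case (ε = 0, so M = L + S), uses an augmented Lagrangian iteration with a growing penalty `mu`, and takes the default λ = 1/sqrt(max(m, n)). The names `shrink`, `svd_threshold`, and `rpca`, and the toy matrix, are hypothetical.

```python
import numpy as np

def shrink(X, tau):
    """Soft-threshold: move every entry toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def svd_threshold(X, tau):
    """Soft-threshold the singular values of X (shrinks the nuclear norm)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * shrink(s, tau)) @ Vt

def rpca(M, lam=None, tol=1e-7, max_iter=500):
    """Split M into low-rank L and sparse S with M ~= L + S (noise-free variant)."""
    m, n = M.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))              # assumed default for lambda
    spectral = np.linalg.norm(M, 2)                 # largest singular value of M
    mu = 1.25 / spectral                            # penalty weight, grown each iteration
    Y = M / max(spectral, np.abs(M).max() / lam)    # initial Lagrange multiplier estimate
    S = np.zeros_like(M)
    for _ in range(max_iter):
        L = svd_threshold(M - S + Y / mu, 1.0 / mu)  # threshold the singular values
        S = shrink(M - L + Y / mu, lam / mu)         # threshold the entry-wise error
        residual = M - L - S
        Y = Y + mu * residual
        mu *= 1.5
        if np.linalg.norm(residual) <= tol * np.linalg.norm(M):
            break
    return L, S

# Toy data (assumed): a rank-1 matrix with three planted outliers.
M = np.outer(np.linspace(1.0, 2.0, 20), np.linspace(1.0, 3.0, 20))
planted = [(2, 3), (7, 11), (15, 5)]
for i, j in planted:
    M[i, j] += 30.0

L, S = rpca(M)
top = np.argsort(np.abs(S), axis=None)[-3:]         # three largest entries of |S|
found = sorted((int(i), int(j)) for i, j in zip(*np.unravel_index(top, S.shape)))
print(found)
```

In this toy case the largest entries of S line up with the planted outliers, which is the property that makes S useful as an anomaly indicator. Handling the noise term ε would need an additional error matrix and threshold, which this sketch leaves out.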
SureLog implements Robust Principal Component Analysis (RPCA) with Netflix Surus.
WHAT IS UEBA?
Hackers can break into firewalls, send you e-mails with malicious and infected attachments, or even bribe an employee to gain access into your firewalls. Old tools and systems are quickly becoming obsolete, and there are several ways to get past them.
User and entity behavior analytics (UEBA) gives you a more comprehensive way to make sure that your organization has top-notch IT security, while also helping you detect users and entities that might compromise your entire system.
User and entity behavior analytics, or UEBA, is a type of cyber security process that takes note of the normal conduct of users. In turn, it detects anomalous behavior, that is, deviations from these “normal” patterns. For example, if a particular user regularly downloads 10 MB of files every day but suddenly downloads gigabytes of files, the system would detect this anomaly and raise an alert immediately.
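The download example can be sketched with a simple statistical baseline. This is only an illustration of the idea, assuming a z-score test against the user's own history; the function name `is_anomalous`, the threshold of 3 standard deviations, and the sample values are all hypothetical.

```python
from statistics import mean, stdev

def is_anomalous(history, observed, threshold=3.0):
    """Flag `observed` if it lies more than `threshold` standard deviations
    away from the mean of the user's past values."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return observed != mu                       # flat history: any change is a deviation
    return abs(observed - mu) / sigma > threshold

# Assumed history: roughly 10 MB downloaded per day.
daily_mb = [9.5, 10.2, 10.0, 11.1, 9.8, 10.4, 9.9]

print(is_anomalous(daily_mb, 10.6))     # an ordinary day
print(is_anomalous(daily_mb, 2048.0))   # a sudden 2 GB download
```

A production system would use richer models than a single z-score, but the principle is the same: learn what is normal for the user, then measure how far new activity falls from it.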
UEBA uses machine learning, algorithms, and statistical analyses to know when there is a deviation from established patterns, showing which of these anomalies could result in a potential, real threat. UEBA can also aggregate the data you have in your reports and logs, as well as analyze file, flow, and packet information.
In UEBA, you do not track security events or monitor devices; instead, you track all the users and entities in your system. As such, UEBA focuses on insider threats: employees who have gone rogue, employees who have already been compromised, and people who already have access to your system and then carry out targeted attacks and fraud attempts. It also covers servers, applications, and devices that are working within your system.
BENEFITS OF UEBA
It is the unfortunate truth that today’s cyber security tools are fast becoming obsolete, and more skilled hackers and cyber attackers are now able to bypass the perimeter defenses that are used by most companies. In the old days, you were secure if you had web gateways, firewalls, and intrusion prevention tools in place. This is no longer the case in today’s complex threat landscape, and it’s especially true for bigger corporations that are proven to have very porous IT perimeters that are also very difficult to manage and oversee.

SURELOG UEBA ANOMALY DETECTION STEPS
The entire workflow can be broken down into four distinct phases. We will describe each one of them below.
1) Data Preparation – In the first step, the workflow obtains relevant data from all the data sources. It applies all the defined filters, groups data by identified entities, and prepares data for the next feature extraction stage.
2) Feature Extraction – In this step, data is obtained from all the relevant fields and grouped by each entity per day, and the configured features are computed and stored.
3) Behavior Profiling – In this step, the extracted features for each entity are grouped into the configured baselines, and the machine learning model (SVD) is applied to generate a behavior profile for that particular entity.
4) Anomaly Detection – In the final step, the test feature values are scored against the behavior profile and an event is generated with an associated confidence score.
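The four phases above can be sketched end to end on toy data. Everything in this example is an assumption made for illustration: the event fields (`user`, `day`, `action`, `bytes`), the two features, the use of an SVD-based Mahalanobis distance as the behavior profile, and the formula that maps distance to a confidence score. It does not describe SureLog's actual implementation.

```python
from collections import defaultdict
import numpy as np

def prepare(events, allowed_actions):
    """Phase 1: apply a filter and group events by entity (here: user)."""
    grouped = defaultdict(list)
    for event in events:
        if event["action"] in allowed_actions:
            grouped[event["user"]].append(event)
    return grouped

def extract_features(entity_events):
    """Phase 2: one feature vector per day: [logon count, total bytes]."""
    daily = defaultdict(lambda: [0.0, 0.0])
    for event in entity_events:
        if event["action"] == "logon":
            daily[event["day"]][0] += 1
        daily[event["day"]][1] += event.get("bytes", 0)
    days = sorted(daily)
    return days, np.array([daily[d] for d in days])

def build_profile(baseline):
    """Phase 3: SVD of the centered baseline gives axes and variances."""
    mean = baseline.mean(axis=0)
    _, s, Vt = np.linalg.svd(baseline - mean, full_matrices=False)
    variances = s ** 2 / (len(baseline) - 1) + 1e-9   # small constant avoids division by zero
    return {"mean": mean, "axes": Vt, "variances": variances}

def score(profile, x):
    """Phase 4: Mahalanobis-style distance plus a hypothetical confidence in [0, 1)."""
    coords = profile["axes"] @ (x - profile["mean"])
    distance = float(np.sqrt(np.sum(coords ** 2 / profile["variances"])))
    confidence = 1.0 - float(np.exp(-distance ** 2 / (2 * len(coords))))
    return distance, confidence

# Assumed data: ten ordinary days of about 11 logons, then a day with 38.
logons_per_day = [11, 10, 12, 11, 9, 12, 10, 11, 13, 11, 38]
events = [
    {"user": "alice", "day": day, "action": "logon", "bytes": 100 + 10 * ((day + i) % 4)}
    for day, count in enumerate(logons_per_day, start=1)
    for i in range(count)
]
events.append({"user": "alice", "day": 1, "action": "debug"})   # removed by the filter

grouped = prepare(events, allowed_actions={"logon"})
days, features = extract_features(grouped["alice"])
profile = build_profile(features[:-1])                          # baseline: first ten days
baseline_scores = [score(profile, x)[0] for x in features[:-1]]
test_distance, test_confidence = score(profile, features[-1])
print(f"day {days[-1]}: distance={test_distance:.1f}, confidence={test_confidence:.3f}")
```

The day with 38 logons lands far outside the range of distances seen on the baseline days, so it would be the one that generates an event.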
The bottom line? Preventive measures are no longer enough. Your firewalls are not going to be 100% foolproof, and hackers and attackers will get into your system at one point or another. This is why detection is equally important: when hackers do successfully get into your system, then you should be able to detect their presence quickly in order to minimize the damage.
With the SureLog UEBA module, you can detect abnormal activity. For example, if a user logs on 38 times when the expected number is only 11, an alert is raised.
Sample use cases:
• Abnormal amount of data copied to unauthorized removable media
• Abnormal amount of data egress to CD compared to past behavior
• Abnormal amount of data egressed to competitor domains compared to past behavior
• Abnormal amount of data egressed to non-business domains compared to past behavior
• Abnormal amount of data egressed to personal email account compared to past behavior
• Abnormal amount of data egressed to removable media compared to past behavior
• Abnormal amount of data egressed to unauthorized removable media compared to past behavior
• Abnormal amount of data uploads compared to past behavior
• Abnormal amount of email DLP match count violation compared to peer behavior
• Abnormal amount of emails to competitor domain
• Abnormal amount of emails to non-business domain
• Abnormal amount of emails to personal email account
• Abnormal amount of match count copied to unauthorized removable media
• Abnormal amount of removable DLP match count violation compared to peer behavior
• Abnormal amount of removable DLP match count violation to unauthorized removable media compared to past behavior
• Abnormal amount of source code files egressed
• Abnormal amounts of data burnt to CD
• Abnormally high number of files egressed compared to peer
• Abnormally high number of policy rules violated
• Abnormally high volume of data egress
• Abnormal match count of policy rules violated
• Abnormal network share access attempts
• Abnormal number of compressed files egressed
• Abnormal number of email forwards
• Abnormal number of emails to competitor domains
• Abnormal number of emails to non-business domain
• Abnormal number of emails to personal email account
• Abnormal number of CD DLP violations compared to past behavior
• Abnormal number of compressed files egressed compared to past behavior
• Abnormal number of data uploads compared to past behavior
• Abnormal number of email DLP violations compared to peer behavior
• Abnormal number of email forwards compared to past behavior
• Abnormal number of emails sent to competitor domains compared to past behavior
• Abnormal number of emails to non-business domains compared to past behavior
• Abnormal number of emails to personal email account compared to past behavior
• Abnormal number of files burnt on CD
• Abnormal number of files copied to unauthorized removable media
• Abnormal number of files deleted
• Abnormal number of files downloaded
• Abnormal number of files egressed to removable media compared to past behavior
• Abnormal number of files egressed to unauthorized removable media compared to past behavior
• Abnormal number of files modified
• Abnormal number of files opened
• Abnormal number of files printed compared to peer
• Abnormal number of pages printed compared to peer
• Abnormal number of permission addition
• Abnormal number of removable DLP violations compared to past behavior
• Abnormal number of removable DLP violations compared to peer behavior
• Abnormal number of source code files egressed compared to past behavior
• Abnormal number of suspicious file access attempts
• Abnormal object or network share access
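Several of the use cases above compare a user against peers rather than against their own past. As a rough sketch of that idea, the example below flags users whose value sits far from the peer-group median, using a modified z-score based on the median absolute deviation (MAD). The function name `peer_outliers`, the 3.5 threshold, and the sample print counts are assumptions for illustration.

```python
from statistics import median

def peer_outliers(values_by_user, threshold=3.5):
    """Return users whose value deviates strongly from the peer-group median."""
    values = list(values_by_user.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)      # median absolute deviation
    if mad == 0:
        return [u for u, v in values_by_user.items() if v != med]
    # 0.6745 rescales the MAD so the score is comparable to a z-score
    return [u for u, v in values_by_user.items()
            if 0.6745 * abs(v - med) / mad > threshold]

# Assumed data: pages printed today by members of one peer group.
pages_printed = {"ana": 12, "bo": 15, "chen": 9, "dee": 14, "eli": 240}
print(peer_outliers(pages_printed))
```

The median and MAD are used here instead of the mean and standard deviation because a single extreme peer would otherwise distort the baseline it is being compared against.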

Published On: June 15th, 2023 / Categories: Blog /
