File Content Rules
Employing a pattern-matching rule language to analyze files.
Synonyms: File Content Signatures and File Signatures.
How it works
Rules, often called signatures, are used for both generic and targeted malware detection. The rules are usually expressed in a domain-specific language (DSL), then deployed to software that scans files for matches. Rules are either developed and broadly distributed by commercial vendors, or developed and deployed by enterprise security teams to address highly targeted or custom malware. Conceptually, there are public and private rule sets. Both use the same technology, but they are intended to detect different types of cyber adversaries.
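The idea of deploying rules to a scanner can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's engine or rule language: the `Rule` class, the two rule names, and their patterns are all hypothetical, and matching is done with plain regular expressions against raw bytes.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Rule:
    """A hypothetical signature: a name plus a compiled bytes pattern."""
    name: str
    pattern: "re.Pattern[bytes]"


# Illustrative rules only; real rule sets are far larger and more precise.
RULES = [
    Rule("eicar_test_marker",
         re.compile(re.escape(b"EICAR-STANDARD-ANTIVIRUS-TEST-FILE"))),
    Rule("vba_autoopen_macro",
         re.compile(rb"Sub\s+AutoOpen\s*\(", re.IGNORECASE)),
]


def scan_bytes(data: bytes, rules: List[Rule] = RULES) -> List[str]:
    """Return the names of all rules whose pattern occurs in the raw data."""
    return [rule.name for rule in rules if rule.pattern.search(data)]


def scan_file(path: str, rules: List[Rule] = RULES) -> List[str]:
    """Read a file from disk and scan its raw content."""
    with open(path, "rb") as handle:
        return scan_bytes(handle.read(), rules)
```

A private rule set would follow the same shape; only the contents of the rule list, and who maintains it, would differ.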
- Patterns expressed in the DSLs vary in complexity. Some scanning engines support file parsing and normalization for high-fidelity matching; others support only simple regular expression matching against raw file data. Engineers must make a trade-off among:
  - The fidelity of the matching capabilities, balancing high recall against avoiding false positives,
  - The computational load for scanning, and
  - The resilience of the engine when dealing with adversarial content presented in different forms, which in some cases is designed to exploit or defeat the scanning engine.
- Signature libraries can become large over time and impact scanning performance.
- Some vendors who sell signatures have to delete old signatures over time.
- Simple signatures against raw content cannot match against encoded, encrypted, or sufficiently obfuscated content.
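The last consideration can be shown with a short Python sketch. It assumes a hypothetical signature for the string `Invoke-Expression` and a single, naive normalization step that decodes Base64-looking runs before matching again; real engines that normalize content handle many more encodings and nesting.

```python
import base64
import re

# Hypothetical signature and a loose pattern for Base64-looking runs.
SIGNATURE = re.compile(rb"Invoke-Expression", re.IGNORECASE)
B64_RUN = re.compile(rb"[A-Za-z0-9+/]{16,}={0,2}")


def matches_raw(data: bytes) -> bool:
    """Match the signature against the raw bytes only."""
    return SIGNATURE.search(data) is not None


def matches_normalized(data: bytes) -> bool:
    """Match raw bytes, then retry on any Base64 runs that decode cleanly."""
    if matches_raw(data):
        return True
    for candidate in B64_RUN.findall(data):
        try:
            decoded = base64.b64decode(candidate, validate=True)
        except ValueError:  # binascii.Error is a subclass of ValueError
            continue
        if SIGNATURE.search(decoded):
            return True
    return False


payload = b"Invoke-Expression (New-Object Net.WebClient)"
encoded = b"blob: " + base64.b64encode(payload)
```

Here `matches_raw(encoded)` misses the content while `matches_normalized(encoded)` finds it, at the cost of extra decoding work per file. That cost is the fidelity versus computational load trade-off described above.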
The following references were used to develop the File Content Rules knowledge-base article.
(Note: the consideration of references does not imply specific functionality exists in an offering.)
Computational modeling and classification of data streams
Provides a mechanism to classify files using file signatures based on a computational model. Training data that comprises at least a portion of a file (e.g., a number of bytes) is used as input to the computational model to develop a file signature and classify the file as malware.
Detecting script-based malware
The patent describes techniques that can be implemented to detect malicious commands and command scripts and block them from being executed by scripting engines.
Script Execution Monitoring explanation
This patent describes software installed on the host system that hooks into methods of a scripting engine to intercept commands before they are executed and block commands if they are determined to be harmful. For example, regular expression checking may be used to identify commands having malicious patterns. Expression checking may be used for script files as well as interactively typed commands.
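As a rough illustration of regular expression checking on intercepted commands, the Python sketch below tests a command string against a small blocklist before it would be passed on for execution. The pattern names and patterns are hypothetical and are not taken from the patent; the hooking of a scripting engine is not shown.

```python
import re
from typing import Optional

# Hypothetical blocklist of (name, pattern) pairs for illustration only.
BLOCKLIST = [
    ("pipe_download_to_shell",
     re.compile(r"\b(curl|wget)\b[^|]*\|\s*(sh|bash)\b")),
    ("encoded_powershell",
     re.compile(r"\bpowershell(\.exe)?\b.*\s-enc", re.IGNORECASE)),
]


def check_command(command: str) -> Optional[str]:
    """Return the name of the first matching pattern, or None if allowed."""
    for name, pattern in BLOCKLIST:
        if pattern.search(command):
            return name
    return None
```

A caller would block the command when `check_command` returns a name and let it run when the result is `None`. The same check could be applied line by line to a script file.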
File Content Signatures explanation
This patent includes File Content Signatures because in the case of a script file, a hash of the file is compared against hashes of known malicious script files to determine whether the script file is malicious.
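The hash comparison can be sketched as follows. SHA-256 and the contents of the known-bad set are assumptions made for the example; the patent summary above does not name a hash algorithm, and the placeholder entry stands in for a real list of known malicious script hashes.

```python
import hashlib
from typing import Set

# Hypothetical known-bad set, seeded with the hash of a placeholder script.
KNOWN_BAD_SHA256 = {
    hashlib.sha256(b"placeholder malicious script").hexdigest(),
}


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_known_malicious(digest: str,
                       known_bad: Set[str] = KNOWN_BAD_SHA256) -> bool:
    """Report whether a hex digest appears in the known-bad set."""
    return digest.lower() in known_bad
```

A scanner would call `is_known_malicious(file_sha256(path))` for each script file. Because any change to the file changes its hash, this check only catches exact copies of known scripts.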
Distributed meta-information query in a network
Provides a mechanism to detect, monitor, locate, and control files installed on host computers. Each host has a host agent that analyzes file system activity and takes action based on policies configured on a server. The policies identify whether to block, log, allow, or quarantine actions such as file accesses and execution of executables. Examples of policies include:
- Block/log execution of new executables and detached scripts (e.g., .exe or .bat)
- Block/log reading/execution of new embedded content (e.g., macros in .doc)
- Block/log installation/modification of Web content (alteration of content in .html or .cgi files)
- Block/log execution of new files in an administratively defined 'class'; e.g., an administrator might want to block screen savers (.scr), but not the entire class of executables (.exe, .dll, .sys, etc.)
System and methods thereof for logical identification of malicious threats across a plurality of end-point devices (epd) communicatively connected by a network
This patent describes detecting suspicious files using file metadata such as the prevalence of the file on the network, file installation times, and how the file spread within the network. The combination of these factors is used to determine a risk score for the file; if the score is below a threshold, an alert is sent.