maec.analytics.distance Module

Version: 4.1.0.17


Classes

class maec.analytics.distance.Distance(maec_entity_list)[source]

Bases: object

Calculates distance between two or more MAEC entities. Currently supports only Packages or Malware Subjects.

add_log(number, log_list)[source]

Append the logarithm of a number to a list.
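The idea can be illustrated with a minimal standalone sketch. The helper name `append_log`, the use of the natural logarithm, and the handling of zero are all assumptions made for illustration; they are not taken from the library's source.

```python
import math


def append_log(number, log_list):
    """Append the natural log of `number` to `log_list` (hypothetical helper).

    Assumption: non-positive inputs are stored as 0.0, since log(0) is undefined.
    """
    log_list.append(math.log(number) if number > 0 else 0.0)
    return log_list


values = []
for n in (1, 0, 100):
    append_log(n, values)
```

Log-scaling of this kind compresses wide-ranging quantities such as file sizes so that a few very large values do not dominate a distance.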

bin_list(numeric_value, numeric_list, n=10)[source]

Bin a numeric value into a bucket, based on a parent list of values. The n parameter sets the number of buckets to use (default: 10).
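As a rough illustration of binning against a parent list, the sketch below uses equal-width buckets spanning the minimum and maximum of the parent values. The function name `bin_value` and the equal-width strategy are assumptions; the library may bin differently.

```python
def bin_value(value, values, n=10):
    """Return the index (0..n-1) of the equal-width bucket containing `value`.

    Hypothetical helper: buckets span min(values)..max(values).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All parent values are identical, so there is only one meaningful bucket.
        return 0
    width = (hi - lo) / n
    index = int((value - lo) / width)
    # Clamp so that the maximum value (and any out-of-range input) stays in range.
    return min(max(index, 0), n - 1)


parent = [0, 100]
low_bucket = bin_value(0, parent)
mid_bucket = bin_value(55, parent)
top_bucket = bin_value(100, parent)
```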

build_string_vector(string_list, superset_string_list, ignore_case=True)[source]

Build a vector from an input list of strings and superset list of strings.
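One plausible reading of this method is a binary membership vector: one position per string in the superset, set to 1 when that string appears in the input list. The sketch below follows that reading; the name `string_vector` and the 0/1 encoding are assumptions, not the library's confirmed behavior.

```python
def string_vector(strings, superset, ignore_case=True):
    """Build a 0/1 membership vector of `strings` over `superset` (hypothetical helper)."""
    def norm(s):
        # Fold case only when requested, mirroring the ignore_case parameter.
        return s.lower() if ignore_case else s

    present = {norm(s) for s in strings}
    return [1 if norm(s) in present else 0 for s in superset]


imports_superset = ["kernel32.dll", "user32.dll", "ws2_32.dll"]
vector = string_vector(["Kernel32.dll", "WS2_32.dll"], imports_superset)
strict_vector = string_vector(["Kernel32.dll"], imports_superset, ignore_case=False)
```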

calculate()[source]

Calculate the distances between the input Malware Subjects.

create_dynamic_result_vector(dynamic_vector)[source]

Construct the dynamic result (matching) vector for a corresponding feature vector

create_static_result_vector(static_vector)[source]

Construct the static result (matching) vector for a corresponding feature vector

create_superset_vectors()[source]

Calculate vector supersets from the feature vectors
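A superset in this sense can be pictured as the ordered union of the features seen across all subjects. The following sketch assumes each feature vector is a flat list of hashable values; the name `build_superset` and the first-seen ordering are illustrative assumptions.

```python
def build_superset(feature_lists):
    """Return the union of all features, in first-seen order (hypothetical helper)."""
    seen = set()
    superset = []
    for features in feature_lists:
        for feature in features:
            if feature not in seen:
                seen.add(feature)
                superset.append(feature)
    return superset


subject_features = [["a.dll", "b.dll"], ["b.dll", "c.dll"]]
superset = build_superset(subject_features)
```

Keeping a stable order matters because every subject's vector must use the same position for the same feature.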

euclidean_distance(vector_1, vector_2)[source]

Calculate the Euclidean distance between two input vectors
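For reference, the Euclidean distance between two equal-length vectors is the square root of the sum of squared element-wise differences. The standalone function below is a sketch of that formula; the name `euclidean` and the length check are assumptions and do not reflect the method's actual implementation.

```python
import math


def euclidean(vector_1, vector_2):
    """Euclidean distance between two equal-length numeric vectors (illustrative)."""
    if len(vector_1) != len(vector_2):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(vector_1, vector_2)))


distance = euclidean([0, 0], [3, 4])
```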

flatten_vector(vector_entry_list)[source]

Generate a single, flattened vector from an input list of vectors or values.
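Flattening a mixed list of vectors and scalar values can be sketched recursively, as below. The name `flatten` and the choice to treat only lists and tuples as nested vectors are assumptions for illustration.

```python
def flatten(entries):
    """Flatten nested lists/tuples into a single list (hypothetical helper)."""
    flat = []
    for entry in entries:
        if isinstance(entry, (list, tuple)):
            # Nested vector: flatten it and splice its values in place.
            flat.extend(flatten(entry))
        else:
            flat.append(entry)
    return flat


flat_vector = flatten([1, [2, 3], [[4], 5]])
```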

generate_feature_vectors(merged_subjects)[source]

Generate a feature vector for the binned Malware Subjects

normalize_numeric(numeric_value, numeric_list, normalize=True, scale_log=True)[source]

Scale a numeric value, based on a parent list of values. Return the scaled/normalized form.
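A minimal sketch of this kind of scaling follows, assuming that `scale_log` applies a natural logarithm and that `normalize` applies min-max normalization against the parent list. Both interpretations, and the name `scale_value`, are assumptions rather than the library's documented behavior.

```python
import math


def scale_value(value, values, normalize=True, scale_log=True):
    """Optionally log-scale, then min-max normalize `value` against `values`."""
    def log_scale(x):
        # Assumption: non-positive values map to 0.0 because their log is undefined.
        return math.log(x) if x > 0 else 0.0

    if scale_log:
        value = log_scale(value)
        values = [log_scale(v) for v in values]
    if normalize:
        lo, hi = min(values), max(values)
        # Guard against division by zero when all parent values are equal.
        value = 0.0 if hi == lo else (value - lo) / (hi - lo)
    return value


log_scaled = scale_value(10, [1, 10, 100])
linear_scaled = scale_value(5, [0, 10], scale_log=False)
```

With log scaling, 10 lies halfway between 1 and 100, whereas on a linear scale it would sit near the bottom of the range.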

normalize_numeric_list(value_list, numeric_list, normalize=True, scale_log=True)[source]

Scale a list of numeric values, based on a parent list of numeric value lists. Return the scaled/normalized form.

normalize_vectors(vector_1, vector_2)[source]

Normalize two input vectors so that they have similar composition.

perform_calculation()[source]

Perform the actual distance calculation. Store the results in the distances dictionary.

populate_hashes_mapping(malware_subject_list)[source]

Populate and return the Malware Subject -> Hashes mapping from an input list of Malware Subjects.

preprocess_entities(dereference=True)[source]

Pre-process the MAEC entities

print_distances(file_object, default_label='md5', delimiter=', ')[source]

Print the distances between the Malware Subjects in delimited matrix format to a File-like object.

By default, the MD5 hash of each Malware Subject is used as its label where available, and the delimiter is a comma followed by a space, for CSV-like output.
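The delimited matrix layout can be pictured with the sketch below, which writes a header row of labels followed by one row per subject. The dict-of-dicts shape of `distances`, the function name `write_distance_matrix`, and the placeholder labels are assumptions; the library's internal structure and exact output may differ.

```python
import io


def write_distance_matrix(distances, file_object, delimiter=', '):
    """Write a square distance matrix in delimited form (hypothetical helper).

    Assumes `distances` maps label -> {label: distance}.
    """
    labels = sorted(distances)
    # Header row: an empty corner cell followed by the column labels.
    file_object.write(delimiter.join([''] + labels) + '\n')
    for row in labels:
        cells = [str(distances[row].get(col, 0.0)) for col in labels]
        file_object.write(delimiter.join([row] + cells) + '\n')


# Placeholder labels stand in for MD5 hashes.
example = {'a': {'a': 0.0, 'b': 1.5}, 'b': {'a': 1.5, 'b': 0.0}}
buffer = io.StringIO()
write_distance_matrix(example, buffer)
output_lines = buffer.getvalue().splitlines()
```

Any file-like object with a `write` method works here, including an open file or `sys.stdout`.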

class maec.analytics.distance.StaticFeatureVector(malware_subject, deduplicator)[source]

Bases: object

Generate a feature vector for a Malware Subject based on its static features

create_object_vector(object, static_feature_dict, callback_function=None)[source]

Create a vector from a single Object

create_static_vectors(malware_subject)[source]

Create a vector of static features for an input Malware Subject

extract_features(malware_subject)[source]

Extract the static features from the Malware Subject

get_unique_features()[source]

Calculates the unique set of static features for the Malware Subject

class maec.analytics.distance.DynamicFeatureVector(malware_subject, deduplicator, ignored_object_properties, ignored_actions)[source]

Bases: object

Generate a feature vector for a Malware Subject based on its dynamic features

create_action_vector(action)[source]

Create a vector from a single Action

create_dynamic_vectors(malware_subject)[source]

Create a vector of unique action/object pairs for an input Malware Subject

extract_features(malware_subject)[source]

Extract the dynamic features from the Malware Subject

get_unique_features()[source]

Calculates the unique set of dynamic features for the Malware Subject

prune_dynamic_features(min_length=2)[source]

Prune the dynamic features based on ignored Object properties/Actions