lux.metrics.average_jackart
- lux.metrics.average_jackart(rule_1, rule_2, dataset, features, categorical_indicator)
Calculate the average Jaccard similarity coefficient between two sets of rules.
- Parameters:
rule_1 (dict) – A dictionary representing the first set of rules. Each key corresponds to a feature, and the corresponding value is a list of conditions applied to that feature.
rule_2 (dict) – A dictionary representing the second set of rules, with the same structure as rule_1.
dataset (pandas.DataFrame) – The dataset to which the rules are applied.
features (list) – A list of feature names in the dataset.
categorical_indicator (list) – A list indicating whether each feature is categorical or not. Each element of the list corresponds to a feature in features, with True indicating the feature is categorical and False indicating it is not.
- Returns:
The average Jaccard similarity coefficient between the rules in rule_1 and rule_2. If rule_1 or rule_2 contains no rules, the function returns 0.
- Return type:
float
- Notes:
Rules in rule_1 or rule_2 that refer to features not present in the dataset are ignored.
If either rule_1 or rule_2 is empty, the function returns 0.
The Jaccard similarity coefficient is calculated between the values of the features specified in the rules. If both rule_1 and rule_2 contain rules for the same feature, the coefficient is calculated between the corresponding values.
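For reference, the Jaccard similarity coefficient of two sets A and B is |A ∩ B| / |A ∪ B|. The snippet below is a minimal sketch of that calculation and of averaging it over features, not the library's implementation. The helper names jaccard and average_jaccard_sketch are hypothetical, the inputs are plain dictionaries mapping a feature name to the values selected for it, and averaging only over features present in both dictionaries is an assumption made for illustration.

```python
def jaccard(a, b):
    """Return |A ∩ B| / |A ∪ B| for two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption: two empty sets score 0, mirroring the empty-rule case.
        return 0.0
    return len(a & b) / len(union)


def average_jaccard_sketch(values_1, values_2):
    """Average the per-feature Jaccard coefficient (hypothetical helper).

    values_1 and values_2 map a feature name to the collection of values
    selected for that feature. Only features present in both mappings are
    compared, which is an assumption of this sketch.
    """
    if not values_1 or not values_2:
        return 0.0  # an empty input yields 0, as described in the notes
    shared = [feature for feature in values_1 if feature in values_2]
    if not shared:
        return 0.0
    total = sum(jaccard(values_1[f], values_2[f]) for f in shared)
    return total / len(shared)


if __name__ == "__main__":
    first = {"age": [1, 2, 3], "sex": ["m"]}
    second = {"age": [2, 3, 4], "sex": ["m"]}
    # "age" shares 2 of 4 distinct values (0.5); "sex" matches fully (1.0).
    print(average_jaccard_sketch(first, second))
```

In this sketch a coefficient of 1 means the two value sets for a feature are identical and 0 means they are disjoint, so the average summarises how closely the two rule sets agree across the compared features.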