lux.metrics.average_jackart

lux.metrics.average_jackart(rule_1, rule_2, dataset, features, categorical_indicator)

Calculate the average Jaccard similarity coefficient between two sets of rules.

Parameters:
  • rule_1 (dict) – A dictionary representing the first set of rules. Each key corresponds to a feature, and the corresponding value is a list of conditions applied to that feature.

  • rule_2 (dict) – A dictionary representing the second set of rules. Similar to rule_1.

  • dataset (pandas.DataFrame) – The dataset on which the rules are applied.

  • features (list) – A list of feature names in the dataset.

  • categorical_indicator (list) – A list of booleans, one per entry in features and in the same order. True marks the corresponding feature as categorical; False marks it as non-categorical.

Returns:

The average Jaccard similarity coefficient between the rules in rule_1 and rule_2. If either rule_1 or rule_2 contains no rules, the function returns 0.

Return type:

float

Notes:

  • If rule_1 or rule_2 contains rules for features that are not present in the dataset, those rules will be ignored.

  • The function handles cases where either rule_1 or rule_2 is empty by returning 0.

  • The Jaccard similarity coefficient is calculated between the values of the features specified in the rules. If both rule_1 and rule_2 contain rules for the same feature, the coefficient is calculated between the corresponding values.
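
Example (illustrative sketch):

For two sets A and B, the Jaccard similarity coefficient is |A ∩ B| / |A ∪ B|. The sketch below shows one way such a coefficient could be averaged over the features that two dictionaries have in common. It is not the lux implementation: the helper names jaccard and average_over_shared_features are hypothetical, the inputs are plain lists of values rather than rule conditions evaluated on a dataset, and scoring two empty sets as 0 is an assumption made here for simplicity.

```python
def jaccard(a, b):
    """Jaccard coefficient |a & b| / |a | b| for two collections of values."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # assumption: two empty sets score 0
    return len(a & b) / len(union)


def average_over_shared_features(values_1, values_2):
    """Average the per-feature Jaccard over features present in both dicts.

    Hypothetical helper: returns 0.0 when either dict is empty or when the
    dicts share no feature.
    """
    if not values_1 or not values_2:
        return 0.0
    shared = [f for f in values_1 if f in values_2]
    if not shared:
        return 0.0
    return sum(jaccard(values_1[f], values_2[f]) for f in shared) / len(shared)


# Toy inputs: feature name -> values associated with that feature.
values_1 = {"color": ["red", "blue"], "size": ["S", "M"]}
values_2 = {"color": ["blue", "green"], "size": ["S", "M"], "shape": ["round"]}

score = average_over_shared_features(values_1, values_2)
print(score)
```

In this toy input, "shape" appears in only one dictionary, so it does not enter the average. Whether lux.metrics.average_jackart treats unshared features the same way is not stated above.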