Run pip install exsum on the command line. Alternatively, clone the GitHub repo linked above and run pip install . inside the cloned directory.
After installation, the exsum command will be available on the command line. It should be called as
exsum model-fn [--model-var-name model --log-dir logs --save-dir saves]
The first argument model-fn specifies an exsum.Model object. It could be one of the following two:
- a Python script (*.py) that defines this object in the global namespace, with the default variable name being model, or
- a pickled object file (*.pkl) of that object. Note that since exsum.Model objects contain functions as instance variables (as described below), it needs to be pickled by the dill library.
The --model-var-name argument (default: model) specifies the name of the exsum.Model variable to load in the model-fn Python script. It is not applicable if a pickled object file (i.e. *.pkl) is given for model-fn.
The --log-dir argument (default: logs) specifies the logging directory. All GUI operations in an exsum session are saved as a timestamped log file to this directory.
The --save-dir argument (default: saves) specifies the location where the modified rules are saved by clicking the Save button on the GUI. Each time, two files (a *.txt and a *.pkl) are saved, both named with the current timestamp. The *.pkl file is a pickled exsum.Model object with the current parameters, which can be passed in as the model-fn parameter and loaded in a new exsum session. In addition, latest.(txt|pkl) files are created with the same content for the user to look up easily. These files should be renamed or moved if the user wants to keep them, as they will be overwritten on the next save.
The log files are only stored locally in log-dir, and not shared in any way.
To run the demo, first install exsum with pip (see above for instructions). Then clone the exsum-demos repository and cd into it:
git clone https://github.com/YilunZhou/exsum-demos
cd exsum-demos
Inside the exsum-demos directory, run one of the two following commands to visualize the SST or QQP rule union:
exsum sst_rule_union.py # OR
exsum qqp_rule_union.py
The GUI provides a Reset and a Save button. The Reset button discards all changes to the parameter values made in the rule (Panel D), and the Save button saves a copy of the current rule union to the --save-dir (default: saves).
The search is configured by three fields (target metric, target metric for, target value). For example, in the pop-up above, the search tries to find a parameter value that achieves a validity value of at least 0.9 for the selected rule. Note that the coverage metric is not available for selection since we are tuning a parameter of the behavior function; all three metrics can be selected for parameters of the applicability function.
The search range is defined by the start value and the stop value. Both are initialized to the current parameter value, and should be changed appropriately.
The linear search uses precision as a step size. Suppose that we have start value = 0.0, stop value = -1.0, and precision = 0.01. Then it sequentially evaluates values of 0.0, -0.01, -0.02, ..., -0.99, -1.0, and terminates when the objective is first met. Thus, it stops at the satisfying value closest to the start value.
The binary search uses precision as the smallest halving length. It requires that the stop value is feasible (i.e. satisfies the objective), and initializes the interval [left, right] to [start value, stop value]. At every iteration, if the mid-point of the interval is feasible, it uses [left, mid-point] as the new interval; if it is infeasible, it uses [mid-point, right]. The procedure stops when the interval length is smaller than precision. Thus, if the metric value is monotonically increasing in the direction from start value to stop value, the binary search also outputs the satisfying value closest to the start value, but can be much faster than the linear search. However, with non-monotonicity, it could miss the closest satisfying value, which may be undesirable.
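The two search procedures can be sketched in plain Python. The feasibility check below is a hypothetical stand-in for evaluating the selected metric against the target value; in exsum this evaluation is performed on the actual rule.

```python
def linear_search(feasible, start, stop, precision):
    """Step from start toward stop in increments of precision and
    return the first value satisfying the objective (None if none does)."""
    step = precision if stop >= start else -precision
    n_steps = int(round(abs(stop - start) / precision))
    for i in range(n_steps + 1):
        value = start + i * step
        if feasible(value):
            return value
    return None

def binary_search(feasible, start, stop, precision):
    """Repeatedly halve [left, right]; assumes feasible(stop) is True.
    Returns a feasible value within precision of the feasibility boundary."""
    left, right = start, stop
    while abs(right - left) > precision:
        mid = (left + right) / 2
        if feasible(mid):
            right = mid  # keep the half closer to the start value
        else:
            left = mid
    return right

# Hypothetical monotone objective: the metric is satisfied for values <= -0.35.
feasible = lambda v: v <= -0.35
print(linear_search(feasible, 0.0, -1.0, 0.01))  # → -0.35 (up to float rounding)
```

With this monotone objective, both procedures return (approximately) -0.35, the satisfying value closest to the start value, but the binary search needs far fewer metric evaluations.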
All classes are defined in the exsum package (e.g. the fully qualified class name for Model is exsum.Model). The three green classes represent list membership. For example, a Rule has a list of Parameters. The BehaviorRange shown in yellow is technically not contained in a Rule, but is instead produced by its behavior function; we include it here for completeness. The top-level Model object is passed to the command exsum to start the GUI visualization.
Model
The Model class is the top-level object that contains everything necessary about the rule union and the dataset on which the rule union is evaluated. It should be initialized as:
model = Model(rule_union, data)
where rule_union and data are objects of the RuleUnion and Data class respectively. The Model class is also responsible for calculating metric values and producing instance visualizations (which are queried by the GUI server), but users should not be concerned about these functionalities.
RuleUnion
A RuleUnion is specified by a list of rules and a composition structure of these rules. It should be initialized as:
rule_union = RuleUnion(rules, composition_structure)
rules is a list of Rule objects. As described below, each rule has an index, which we assume to be unique. composition_structure specifies how the rules are composed. If a rule union contains only one rule, it is an integer whose value is the rule index. Otherwise, it is a tuple of three items: the first and the third recursively represent the two constituent rules (each specified by an integer) or rule unions (each specified by a tuple), and the second is either '>' for precedence mode composition or '&' for intersection mode composition. For example, (3, '>', (1, '&', 2)) means that rules 1 and 2 are first combined in intersection mode, and then combined with rule 3 with a lower precedence. Rules can be nested arbitrarily in composition_structure, but no rule can be used more than once. The RuleUnion class is also responsible for supplying the metric values requested by the Model and for generating the counterfactual RuleUnion without a specified rule, but users should not be concerned about these functionalities.
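As a quick illustration of the recursive format (not part of the exsum API), the following sketch walks a composition_structure and collects the rule indices, checking that no rule is used more than once:

```python
def collect_indices(structure):
    """Collect rule indices from a composition structure, which is either
    an int (a single rule) or a (left, mode, right) tuple."""
    if isinstance(structure, int):
        return [structure]
    left, mode, right = structure
    assert mode in ('>', '&'), f'unknown composition mode: {mode}'
    return collect_indices(left) + collect_indices(right)

# rules 1 and 2 combined in intersection mode, then composed with rule 3
structure = (3, '>', (1, '&', 2))
indices = collect_indices(structure)
assert len(indices) == len(set(indices))  # no rule used more than once
print(sorted(indices))  # → [1, 2, 3]
```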
Rule
A Rule contains the following information: index (an integer), name (a string), applicability function a_func and its parameters a_params, and behavior function b_func and its parameters b_params. It should be initialized as:
rule = Rule([index, name, a_func, a_params, b_func, b_params])
Note that the constructor takes in a single argument, a list containing every piece of information, rather than each piece separately. a_func and b_func are native Python functions. They have the same input format: an FEU as the first input, followed by a list of current values for the parameters, as shown below for an example of an applicability function with three parameters:
def a_func(feu, param_vals):
param1_val, param2_val, param3_val = param_vals
...
a_func returns True or False to represent the applicability status of the input FEU, and b_func returns a BehaviorRange object to represent the prescribed behavior value. Sometimes a_func and b_func share many common implementation details, in which case it could be more convenient to combine them into one ab_func. The rule can then be initialized as
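For concreteness, here is a hedged sketch of what such a pair of functions might look like. The rule logic is hypothetical, the FEU is stood in by a SimpleNamespace with the instance variables described later on this page, and a plain (lo, hi) tuple stands in for a BehaviorRange object.

```python
from types import SimpleNamespace

def a_func(feu, param_vals):
    # hypothetical rule: applicable if the explanation magnitude
    # exceeds a single threshold parameter
    (threshold,) = param_vals
    return abs(feu.explanation) > threshold

def b_func(feu, param_vals):
    # prescribe a closed interval for the behavior value
    lo, hi = param_vals
    return (lo, hi)  # stand-in for BehaviorRange.simple_interval(lo, hi)

# minimal FEU stand-in for demonstration
feu = SimpleNamespace(word="great", explanation=0.4)
print(a_func(feu, [0.1]), b_func(feu, [0.2, 1.0]))  # → True (0.2, 1.0)
```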
rule = Rule([index, name, ab_func, a_params, b_params])
Note that again a list is provided, but this time with five items instead of six. The Rule constructor uses the length of the list to distinguish between these two cases. ab_func should take in the FEU, a list of current values for a_params, and a list of current values for b_params, and return a tuple of two values, True or False and a BehaviorRange object, as below:
def ab_func(feu, a_param_vals, b_param_vals): ...
When b_func or ab_func is called on a non-applicable input, an arbitrary object can be returned (but the function should not raise an exception), and the result is guaranteed not to be used. Furthermore, the name of the rule has no effect on anything else, and is only for human readability.
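A sketch of the combined form, again with hypothetical rule logic and a (lo, hi) tuple standing in for a BehaviorRange; the shared computation (the explanation magnitude) is performed only once:

```python
from types import SimpleNamespace

def ab_func(feu, a_param_vals, b_param_vals):
    (threshold,) = a_param_vals
    (margin,) = b_param_vals
    magnitude = abs(feu.explanation)       # shared by both decisions
    applicable = magnitude > threshold
    behavior = (magnitude - margin, magnitude + margin)
    return applicable, behavior            # (bool, BehaviorRange stand-in)

feu = SimpleNamespace(explanation=-0.5)
applicable, behavior = ab_func(feu, [0.2], [0.25])
print(applicable, behavior)  # → True (0.25, 0.75)
```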
Parameter
A Parameter object encapsulates everything about a parameter, including its name, value range, default value and current value. It is initialized as
param = Parameter(name, param_range, default_value)
where name is a string, param_range is a ParameterRange object, and default_value is a floating point value. The current value is set to the default_value on initialization and reverted to it whenever the user presses a Reset button on the GUI.
ParameterRange
A ParameterRange is specified by its lower and upper bounds:
param_range = ParameterRange(lo, hi)
where lo and hi are floating point values for the two bounds. Note that the interval is assumed to be a closed interval. To represent an open interval, offset the bound value by a small epsilon value (e.g. 1e-5).
BehaviorRange
Like ParameterRange objects, BehaviorRange objects are also defined as closed intervals. However, a key difference is that a BehaviorRange allows multiple disjoint intervals. For example, to represent that an attribution value should have extreme values on either side, the behavior range could be [-1.0, -0.9] ∪ [0.9, 1.0]. Thus, it is initialized as
behavior_range = BehaviorRange(intervals)
where intervals is a list of (lo, hi) tuples. In the case where a single closed interval is needed, an alternative class method is also provided in a way similar to the syntax of ParameterRange:
behavior_range = BehaviorRange.simple_interval(lo, hi)
To represent an open interval, offset the bound value by a small epsilon value (e.g. 1e-5).
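The membership semantics of multiple disjoint closed intervals can be sketched as follows (in_range is an illustrative helper, not an exsum method):

```python
def in_range(value, intervals):
    """True if value lies in any of the closed [lo, hi] intervals."""
    return any(lo <= value <= hi for lo, hi in intervals)

# an attribution value required to be extreme on either side
extreme = [(-1.0, -0.9), (0.9, 1.0)]
print(in_range(-0.95, extreme))  # → True
print(in_range(0.5, extreme))    # → False
print(in_range(0.9, extreme))    # → True (closed intervals include endpoints)
```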
Data
A Data object represents the set of instances along with their explanation values, on which the rules and rule union are evaluated. The main data are stored in a SentenceGroupedFEU object. In addition, to compute the sharpness metric, the probability measure for the marginal distribution of all explanation values needs to be used. Operations on the probability measure are enabled by the Measure object. With these two objects, a Data object can be initialized as:
data = Data(sentence_grouped_feu, measure, normalize)
where normalize is an optional boolean argument (defaulting to False) that specifies whether the explanation values should be scaled so that all are within the range of [-1.0, 1.0]. The Measure object is also scaled accordingly. The default is False to prevent any unintended effects, but we recommend normalization (or pre-normalizing the explanation values before loading them into the Data object), since the coloring used by the GUI text rendering assumes -1.0 and 1.0 as the extreme values.
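The normalization described above amounts to dividing all explanation values by the largest magnitude, as in this sketch of the idea (not the exsum internals):

```python
def normalize_explanations(explanations):
    """Scale explanation values by a common factor so the largest
    magnitude becomes 1.0; all-zero inputs are returned unchanged."""
    scale = max(abs(e) for e in explanations)
    if scale == 0:
        return list(explanations)
    return [e / scale for e in explanations]

print(normalize_explanations([0.5, -2.0, 1.0]))  # → [0.25, -1.0, 0.5]
```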
SentenceGroupedFEU
The SentenceGroupedFEU class is designed to represent a data instance with its explanation as a whole. A data instance contains a list of words, a list of features, a list of explanation values, the true label and the model prediction. The features are the inputs to a_func and b_func rather than the classifier, and thus should be human-interpretable. The object should be initialized as:
sentence_grouped_feu = SentenceGroupedFEU(words, features, explanations, true_label, prediction)
where words is a list of strings, features is a list of tuples (of arbitrary elements), explanations is a list of floats, true_label is either 0 or 1, and prediction is a float between 0 and 1. The items in words, features and explanations should be aligned with each other, and thus the three lists should have the same length.
A SentenceGroupedFEU in spirit contains a list of FEUs. However, to save space, this list is never explicitly kept, but elements of it are generated on the fly. Users should not be concerned with the details.
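The on-the-fly generation can be pictured as follows; the FEU stand-in and the function name are illustrative, not the actual exsum implementation:

```python
from types import SimpleNamespace

def iter_feus(words, explanations, true_label, prediction):
    """Yield one lightweight FEU view per word instead of storing them all."""
    L = len(words)
    for idx, (word, explanation) in enumerate(zip(words, explanations)):
        yield SimpleNamespace(idx=idx, word=word, explanation=explanation,
                              true_label=true_label, prediction=prediction, L=L)

feus = list(iter_feus(["a", "great", "movie"], [0.0, 0.8, 0.1], 1, 0.93))
print(feus[1].word, feus[1].L)  # → great 3
```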
FEU
The SentenceGroupedFEU class does not require the actual instantiation of FEUs. However, as a_func and b_func take inputs of this class, it is important to be familiar with its instance variables. An FEU object feu has the following instance variables:
- feu.context points to the SentenceGroupedFEU object from which this feu is created;
- feu.idx is an integer for the (0-based) position of the FEU in the sentence;
- feu.word is a string for the word of the FEU;
- feu.explanation is a float for the explanation of the FEU;
- feu.true_label is an integer of either 0 or 1 for the ground truth label;
- feu.prediction is a float in [0, 1] for the model's predicted probability for the positive class; and
- feu.L is an integer for the length of the whole sentence.
The last few variables can also be retrieved via feu.context from the SentenceGroupedFEU. However, they are included for convenience due to frequent use.
Measure
The Measure class represents an estimated probability measure on the marginal distribution of explanation values. It is computed by kernel density estimation, and the resulting cumulative distribution function is approximated by a dense linear interpolation for fast inference. This is a relatively expensive computation, especially for a large dataset. Thus, it should be pre-computed and provided explicitly when constructing the Data object. It is initialized as
measure = Measure(explanations, weights, zero_discrete)
- explanations is a flattened list of all explanation values for all data instances;
- weights is a flattened list of the (unnormalized) weights corresponding to the items in explanations. ExSum defines metric values by considering each data instance as equally weighted. Thus, an FEU in a longer input receives less weight than an FEU in a shorter input. The simplest way is to assign 1 / feu.L as the weight for the explanation; and
- zero_discrete specifies whether the zero explanation value should be considered as a point mass in the probability distribution (i.e. a mixed discrete/continuous distribution). Some explainers (such as LIME with LASSO regularization) produce sparse explanations where a large fraction of explanation values are strictly 0. This case should be modeled with zero_discrete = True, while generally continuous distributions should be modeled with zero_discrete = False.
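The 1 / feu.L weighting can be assembled as below (a sketch with made-up explanation values):

```python
# two instances of different lengths, with made-up explanation values
instances = [
    [0.1, -0.3],             # a 2-word instance
    [0.5, 0.0, -0.2, 0.4],   # a 4-word instance
]
explanations, weights = [], []
for sentence in instances:
    L = len(sentence)
    explanations.extend(sentence)
    weights.extend([1 / L] * L)   # each FEU weighted by 1 / feu.L

print(sum(weights))  # → 2.0 (each instance contributes total weight 1.0)
```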