Installation
To install the package, run
pip install exsum
on the command line. Alternatively, clone the GitHub repo linked above and run
pip install .
inside the cloned directory.
After installation, the exsum command will be available on the command line. It should be called as
exsum model-fn [--model-var-name model --log-dir logs --save-dir saves]
The first argument model-fn specifies an exsum.Model object. It could be one of the following two:
- a Python script (*.py) that defines this object in the global namespace, with the default variable name being model, or
- a pickled object file (*.pkl) of that object. Note that since exsum.Model objects contain functions as instance variables (as described below), it needs to be pickled by the dill library.
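For example, running exsum my_model.py would load a script like the following sketch and read the model variable from its global namespace (the file and module names here are hypothetical placeholders for code that builds the objects described in the code documentation below):

# my_model.py -- defines an exsum.Model in the global namespace.
from exsum import Model
from my_rules import rule_union, data  # hypothetical helper module
model = Model(rule_union, data)  # found via the default --model-var-name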
The --model-var-name argument (default: model) specifies the name of the exsum.Model variable to load in the model-fn Python script. It is not applicable if a pickled object file (i.e. *.pkl) is given for model-fn.
The --log-dir argument (default: logs) specifies the logging directory. All GUI operations in an exsum session are saved as a timestamped log file to this directory.
The --save-dir argument (default: saves) specifies the location where the modified rules are saved by clicking the Save button on the GUI. On each save, two files are written, both named with the current timestamp:
- a text file (*.txt) with a human-readable record of the rule union, and
- a pickled (*.pkl) exsum.Model object with the current parameters, which can be passed in as the model-fn parameter and loaded in a new exsum session.
In addition, latest.(txt|pkl) are also created with the same content for the user to look up easily. These files should be renamed or moved if the user wants to keep them, as they will be overwritten on the next save.
All files are stored locally in --save-dir and --log-dir, and not shared in any way.
Demo
To run the demos, first install exsum with pip (see above for instructions). Then clone the exsum-demos repository, and cd into it:
git clone https://github.com/YilunZhou/exsum-demos
cd exsum-demos
Inside the exsum-demos directory, run one of the two following commands to visualize the SST or QQP rule union:
exsum sst_rule_union.py # OR
exsum qqp_rule_union.py
The GUI provides a Reset and a Save button. The Reset button discards all changes to the parameter values made in the rule (Panel D), and the Save button saves a copy of the current rule union to the --save-dir (default: saves).
The GUI also supports automatically searching for a parameter value that achieves a specified objective. The objective is defined by three settings: target metric, target metric for, and target value. For example, in the pop-up above, the search tries to find a parameter value that achieves a validity value of at least 0.9 for the selected rule. Note that the coverage metric is not available for selection, since we are tuning a parameter of the behavior function; all three metrics can be selected for parameters of the applicability function.
The search range is defined by the start value and the stop value. Both are initialized to the current parameter value, and should be changed appropriately.
The linear search uses precision as the step size. Suppose that we have start value = 0.0, stop value = -1.0, and precision = 0.01. Then it sequentially evaluates values of 0.0, -0.01, -0.02, ..., -0.99, -1.0, and terminates when the objective is first met. Thus, it stops at the satisfying value closest to the start value.
The binary search uses precision as the smallest halving length. It requires that the stop value is feasible (i.e. satisfies the objective), and initializes the interval [left, right] to [start value, stop value]. At every iteration, if the mid-point of the interval is feasible, it takes [left, mid-point] as the new interval; if it is infeasible, it takes [mid-point, right]. The procedure stops when the interval length is smaller than precision. Thus, if the metric value is monotonically increasing in the direction of start value to stop value, the binary search will also output the satisfying value closest to the start value, but can be much faster than the linear search. However, with non-monotonicity, it could miss the closest satisfying value, which may be undesirable.
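The two strategies can be summarized by the following sketch (illustrative only; is_feasible is a hypothetical stand-in for checking whether the target metric meets the target value at a given parameter value):

def linear_search(start, stop, precision, is_feasible):
    # Step from start toward stop in increments of precision, and
    # return the first (i.e. closest-to-start) feasible value.
    n_steps = int(round(abs(stop - start) / precision))
    direction = 1.0 if stop >= start else -1.0
    for i in range(n_steps + 1):
        value = start + direction * i * precision
        if is_feasible(value):
            return value
    return None  # no value in [start, stop] meets the objective

def binary_search(start, stop, precision, is_feasible):
    # Requires that stop is feasible. Halve the interval until its
    # length drops below precision, always keeping a feasible right
    # endpoint, so the returned value is feasible.
    assert is_feasible(stop), 'binary search requires a feasible stop value'
    left, right = start, stop
    while abs(right - left) > precision:
        mid = (left + right) / 2
        if is_feasible(mid):
            right = mid  # mid is feasible: move toward the start value
        else:
            left = mid   # mid is infeasible: move toward the stop value
    return right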
Code Documentation
The rest of this documentation describes the classes used to define everything needed for exsum runs. The figure below shows all the classes in the package exsum (e.g. the fully qualified class name for Model is exsum.Model). The three green classes represent list membership; for example, a Rule has a list of Parameters. The BehaviorRange shown in yellow is technically not contained in a Rule, but is instead produced by its behavior function; we include it here for completeness. The top-level Model object is passed to the command exsum to start the GUI visualization.
Model
The Model class is the top-level object that contains everything necessary about the rule union and the dataset on which the rule union is evaluated. It should be initialized as:
model = Model(rule_union, data)
where rule_union and data are objects of the RuleUnion and Data classes, respectively. The Model class is also responsible for calculating metric values and producing instance visualizations (which are queried by the GUI server), but users should not be concerned about these functionalities.
RuleUnion
A RuleUnion is specified by a list of rules and a composition structure of these rules. It should be initialized as:
rule_union = RuleUnion(rules, composition_structure)
rules is a list of Rule objects. As described below, each rule has an index, which we assume to be unique. composition_structure specifies how the rules are composed. If a rule union contains only one rule, it is an integer whose value is the rule index. Otherwise, it is a tuple of three items: the first and the third recursively represent the two constituent rules (each specified by an integer) or rule unions (each specified by a tuple), and the second is either '>' for precedence mode composition or '&' for intersection mode composition. For example, (3, '>', (1, '&', 2)) means that rules 1 and 2 are first combined in intersection mode, and then combined with rule 3 with a lower precedence. Not every rule in rules needs to appear in the composition_structure, but no rule can be used more than once.
The RuleUnion class is also responsible for supplying the metric values requested by the Model and generating the counterfactual RuleUnion without a specified rule, but users should not be concerned about these functionalities.
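Concretely, the composition (3, '>', (1, '&', 2)) above would be set up as follows, assuming three Rule objects with indices 1, 2 and 3, constructed as in the next section (the variable names are illustrative):

rule_union = RuleUnion([rule1, rule2, rule3], (3, '>', (1, '&', 2)))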
Rule
A Rule contains the following information: index (an integer), name (a string), applicability function a_func and its parameters a_params, and behavior function b_func and its parameters b_params. It should be initialized as:
rule = Rule([index, name, a_func, a_params, b_func, b_params])
Note that the constructor takes in a single list that contains every piece of information, rather than each piece separately. a_func and b_func are native Python functions. They have the same input format: an FEU as the first input, followed by a list of current values for the parameters, as in the example below of an applicability function with three parameters:
def a_func(feu, param_vals):
    param1_val, param2_val, param3_val = param_vals
    ...
a_func returns True or False to represent the applicability status of the input FEU, and b_func returns a BehaviorRange object to represent the prescribed behavior value. Sometimes a_func and b_func share many common implementation details, in which case it could be more convenient to combine them into one ab_func. The rule can then be initialized as
rule = Rule([index, name, ab_func, a_params, b_params])
Note that again a list is provided, but this time with five items instead of six; the Rule constructor uses the length of the list to distinguish between these two cases. ab_func should take in the FEU, a list of current values for a_params, and a list of current values for b_params, and return a tuple of two values (True or False, and a BehaviorRange object), as below:
def ab_func(feu, a_param_vals, b_param_vals):
    ...
When b_func or ab_func is called for a non-applicable input, an arbitrary object can be returned (but the function should not raise an exception), and the result is guaranteed not to be used. Furthermore, the name of the rule has no effect on anything else, and is only for human readability.
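Putting the pieces together, below is a minimal sketch of a complete rule; the rule logic, names, and parameter range are illustrative, not part of the package:

from exsum import Rule, Parameter, ParameterRange, BehaviorRange

def a_func(feu, a_param_vals):
    # Applicable to every FEU; a real rule would typically inspect
    # feu.word, feu.idx, etc. (see the FEU section below).
    return True

def b_func(feu, b_param_vals):
    threshold, = b_param_vals
    # Prescribe an explanation value of at least the threshold.
    return BehaviorRange.simple_interval(threshold, 1.0)

rule = Rule([
    1,                       # index
    'positive attribution',  # name, for human readability only
    a_func,
    [],                      # a_params: no applicability parameters
    b_func,
    [Parameter('threshold', ParameterRange(0.0, 1.0), 0.5)],  # b_params
])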
Parameter
A Parameter object encapsulates everything about a parameter, including its name, value range, default value and current value. It is initialized as
param = Parameter(name, param_range, default_value)
where name is a string, param_range is a ParameterRange object, and default_value is a floating point value. The current value is set to the default_value on initialization, and reverted back to it whenever the user presses a Reset button on the GUI.
ParameterRange
A ParameterRange is specified by its lower and upper bounds:
param_range = ParameterRange(lo, hi)
where lo and hi are floating point values for the two bounds. Note that the interval is assumed to be a closed interval. To represent an open interval, offset the bound value by a small epsilon value (e.g. 1e-5).
BehaviorRange
Like ParameterRange, BehaviorRange objects are also defined as closed intervals. However, a key difference is that a BehaviorRange allows multiple disjoint intervals. For example, to represent that an attribution value should have extreme values on either side, the behavior range could be [-1.0, -0.9] ∪ [0.9, 1.0]. Thus, it is initialized as
behavior_range = BehaviorRange(intervals)
where intervals is a list of (lo, hi) tuples. In the case where a single closed interval is needed, an alternative class method is also provided, with syntax similar to that of ParameterRange:
behavior_range = BehaviorRange.simple_interval(lo, hi)
To represent an open interval, offset the bound value by a small epsilon value (e.g. 1e-5).
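For example, the two-sided extreme-value range above would be constructed as:

behavior_range = BehaviorRange([(-1.0, -0.9), (0.9, 1.0)])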
Data
A Data object represents the set of instances, along with their explanation values, on which the rules and rule union are evaluated. The main data are stored in a SentenceGroupedFEU object. In addition, to compute the sharpness metric, the probability measure for the marginal distribution of all explanation values needs to be used; operations on the probability measure are enabled by the Measure object. With these two objects, a Data object can be initialized as:
data = Data(sentence_grouped_feu, measure, normalize)
where normalize is an optional boolean variable (defaulting to False) that specifies whether the explanation values should be scaled so that all are within the range of [-1.0, 1.0]; the Measure object is also scaled accordingly. Its default value is set to False to prevent any unintended effects, but we recommend normalization (or pre-normalization of the explanation values before loading them into this Data object), since the coloring used by the GUI text rendering assumes -1.0 and 1.0 as the extreme values.
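With sentence_grouped_feu and measure constructed as described in the next sections, the top-level objects for a session can then be assembled as in the sketch below (rule_union is the object from the RuleUnion section above):

data = Data(sentence_grouped_feu, measure, normalize=True)
model = Model(rule_union, data)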
SentenceGroupedFEU
The SentenceGroupedFEU class is designed to represent a data instance with its explanation as a whole. A data instance contains the following information:
- the words of the sentence,
- the features of each word, which are used by a_func and b_func rather than the classifier, and thus should be human-interpretable, and
- the explanation values, along with the true label and the model prediction for the sentence.
It should be initialized as:
sentence_grouped_feu = SentenceGroupedFEU(words, features, explanations, true_label, prediction)
where words is a list of strings, features is a list of tuples (of arbitrary elements), explanations is a list of floats, true_label is either 0 or 1, and prediction is a float between 0 and 1. The items in words, features and explanations should be aligned with each other, and thus the three lists should have the same length.
A SentenceGroupedFEU in spirit contains a list of FEUs. However, to save space, this list is never explicitly kept; its elements are generated on the fly. Users should not be concerned with the details.
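For example, a hypothetical instance for a binary sentiment classifier could be constructed as below (the single-element part-of-speech feature tuples are illustrative; any human-interpretable tuples can be used):

words = ['a', 'truly', 'wonderful', 'film']
features = [('DT',), ('RB',), ('JJ',), ('NN',)]
explanations = [0.01, 0.10, 0.85, 0.05]
# true label 1 (positive), predicted positive-class probability 0.97
sentence_grouped_feu = SentenceGroupedFEU(words, features, explanations, 1, 0.97)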
FEU
A SentenceGroupedFEU does not require the actual instantiation of FEUs. However, as a_func and b_func take inputs of this class, it is important to be familiar with its instance variables. An FEU object feu has the following instance variables:
- feu.context points to the SentenceGroupedFEU object from which this feu is created;
- feu.idx is an integer for the (0-based) position of the FEU in the sentence;
- feu.word is a string for the word of the FEU;
- feu.explanation is a float for the explanation of the FEU;
- feu.true_label is an integer of either 0 or 1 for the ground truth label;
- feu.prediction is a float in [0, 1] for the model's predicted probability for the positive class; and
- feu.L is an integer for the length of the whole sentence.
The last three variables are properties of the whole sentence and are also available from the context SentenceGroupedFEU. However, they are included for convenience due to frequent use.
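For example, an applicability function that selects sentence-initial words with a sufficiently large explanation value could read as follows (an illustrative sketch, not part of the package):

def a_func(feu, param_vals):
    threshold, = param_vals
    # Applicable to the first word of the sentence when its
    # explanation value exceeds the threshold.
    return feu.idx == 0 and feu.explanation > threshold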
Measure
The Measure class represents an estimated probability measure on the marginal distribution of explanation values. It is computed by kernel density estimation, and the resulting cumulative distribution function is approximated by a dense linear interpolation for fast inference. This is a relatively expensive computation, especially for a large dataset; thus, it should be pre-computed and provided explicitly when constructing the Data object. It is initialized as
measure = Measure(explanations, weights, zero_discrete)
where
- explanations is a flattened list of all explanation values for all data instances;
- weights is a flattened list of the (unnormalized) weights corresponding to the items in explanations. ExSum defines metric values by considering each data instance as equally weighted; thus, an FEU in a longer input receives less weight than an FEU in a shorter input. The simplest way is to assign 1 / feu.L as the weight for the explanation; and
- zero_discrete specifies whether the zero explanation value should be considered as a point mass in the probability distribution (i.e. a mixed discrete/continuous distribution). Some explainers (such as LIME with LASSO regularization) produce sparse explanations where a large fraction of explanation values are strictly 0. This case should be modeled with zero_discrete = True, while generally continuous distributions should be modeled with zero_discrete = False.
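Following the weighting scheme above, the Measure inputs can be assembled from per-sentence explanation lists as in the sketch below (per_sentence_explanations is a hypothetical list of lists of floats, one inner list per data instance):

explanations, weights = [], []
for sent_explanations in per_sentence_explanations:
    L = len(sent_explanations)
    explanations.extend(sent_explanations)
    # each instance is weighted equally: every FEU gets weight 1 / L
    weights.extend([1.0 / L] * L)
measure = Measure(explanations, weights, False)  # zero_discrete = False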