canonical_sets.lucid.LUCID
- class LUCID(model, outputs, example_data, numb_of_samples=100, numb_of_epochs=200, lr=0.1, low=-1, high=1, seed=1234, index=True, extra_epoch=True, one_hot_pre=False, one_hot_post=True, log_every_n=0, prefix_sep='+')
Bases: object
Gradient-based inverse design to generate canonical sets.
This class generates a canonical set via inverse design and attributes the pd.DataFrame to results.
- results
A dataframe with the canonical inputs.
- Type
pd.DataFrame
- results_processed
A dataframe with the processed canonical inputs.
- Type
pd.DataFrame
Examples
>>> model = tf.keras.Model()
>>> outputs = pd.DataFrame([[0, 1]], columns=["No", "Yes"])
>>> example_data = train_data
>>> lucid = LUCID(model, outputs, example_data)
>>> lucid.results
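The idea behind gradient-based inverse design can be sketched without either deep learning framework. The snippet below is an illustration only, not the library's implementation: the frozen sigmoid model, the squared-error loss, the plain gradient descent update and the helper name inverse_design are all assumptions. Only the parameter names and defaults (numb_of_samples, numb_of_epochs, lr, low, high, seed) mirror the signature above.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def inverse_design(weights, bias, target, numb_of_samples=100,
                   numb_of_epochs=200, lr=0.1, low=-1.0, high=1.0, seed=1234):
    """Optimise the inputs (not the weights) of a frozen model.

    Hypothetical helper: the model is sigmoid(x @ weights + bias) and the
    loss is 0.5 * (prediction - target) ** 2 per sample.
    """
    rng = np.random.default_rng(seed)
    # Starting inputs are drawn uniformly from [low, high).
    start = rng.uniform(low, high, size=(numb_of_samples, weights.shape[0]))
    x = start.copy()
    for _ in range(numb_of_epochs):
        pred = sigmoid(x @ weights + bias)
        # Chain rule: dLoss/dx = (pred - target) * pred * (1 - pred) * weights
        grad = ((pred - target) * pred * (1.0 - pred))[:, None] * weights[None, :]
        x -= lr * grad
    return start, x


# Toy frozen model with three input features (made-up numbers).
weights = np.array([2.0, -1.0, 0.5])
bias = 0.0
start, canonical = inverse_design(weights, bias, target=1.0)

initial_pred = sigmoid(start @ weights + bias)
final_pred = sigmoid(canonical @ weights + bias)
```

Each row of canonical is one generated sample; comparing initial_pred with final_pred shows how far the inputs moved towards the requested output.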
Initialize the inverse design.
- Parameters
model (torch.nn.Module or tf.keras.Model) – The trained model to use for inverse design.
outputs (pd.DataFrame) – The outputs to use for inverse design. These are the targets/labels that have been used during training. For example, pd.DataFrame([[0, 1]], columns=["<=50K", ">50K"]) in the Adult data set.
example_data (pd.DataFrame) – The example data to infer columns, dtypes, … This is often (a part of) the training data itself, but can also be an artificial example.
numb_of_samples (int) – The number of samples to generate. The default is 100.
numb_of_epochs (int) – The number of epochs to train the model. The default is 200.
lr (float) – The learning rate for the optimizer. The default is 0.1.
low (float) – The lower bound for the random uniform distribution. The default is -1.
high (float) – The upper bound for the random uniform distribution. The default is 1.
seed (int) – The seed for the random number generator. The default is 1234.
index (bool) – If True the sample and epoch numbers are used as indices in the results pd.DataFrame. Otherwise they are just columns. The default is True.
extra_epoch (bool) – If True an additional forward pass is run after the categorical features have been one-hot encoded (post-processed). The results are saved for the last sample as the numb_of_epochs + 1 epoch. If there are no categorical features the argument is ignored. The default is True.
one_hot_pre (bool) – If True, the initial values for the categorical features are pre-processed to be one-hot. If there are no categorical features the argument is ignored. Note that the inverse design will start from this one-hot sample, hence the pre-process. If False, the inverse design will start from the randomly drawn initial vectors. The default is False.
one_hot_post (bool) – If True, the values for the categorical features are post-processed to be one-hot. Note that the predictions during the inverse design are made with the original values of the categorical features and not with the post-processed values. To run an additional forward pass with the post-processed values check the extra_epoch argument. If there are no categorical features the argument is ignored. The default is True.
log_every_n (int) – The number of epochs to log results. If 0, this argument is set equal to the numb_of_epochs argument which makes it a static analysis with only the start and end samples. The default is 0.
prefix_sep (str) – The separator for the prefix of the column names. The one-hot encoded features are grouped via the prefix. To be safe, make sure that the prefix only appears as a prefix in the column names (i.e., avoid Categorical-category-name, and opt for Categorical+category-name instead). The default is “+”.
- Raises
ValueError – If any columns are neither integer (one-hot encoded) nor float (numerical).
ValueError – If the model is neither a torch.nn.Module nor a (tf.)keras.Model.
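One way to build an example_data frame that satisfies the dtype requirement and the prefix_sep convention is pd.get_dummies. This is a sketch under assumptions: the column names and values are made up, and the dtype check at the end only mimics the documented rule; it is not the library's own validation code.

```python
import pandas as pd

# Made-up raw data: one numerical and one categorical feature.
raw = pd.DataFrame({
    "age": [0.2, -0.5, 0.9],
    "workclass": ["Private", "State-gov", "Private"],
})

# prefix_sep="+" keeps the prefix unambiguous even though the category
# "State-gov" itself contains a hyphen. dtype=int avoids boolean dummy
# columns, which newer pandas versions produce by default.
example_data = pd.get_dummies(
    raw, columns=["workclass"], prefix_sep="+", dtype=int
)

# Mimic the documented rule: every column is either integer or float.
bad_columns = [
    name for name, dtype in example_data.dtypes.items()
    if not (pd.api.types.is_integer_dtype(dtype)
            or pd.api.types.is_float_dtype(dtype))
]
```

With this input, bad_columns is empty and the one-hot columns are named workclass+Private and workclass+State-gov.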
Methods
Plot the results for a given feature.
Plot the outputs.
Process the results by applying inverse scaler and one-hot encoding to categories.
- hist(features)
Plot the results for a given feature.
- Parameters
features (str or list of str) – The feature(s) to plot (either 1, 2, 3, 4, 6 or 8).
- Raises
ValueError – If features is neither a string nor a list of strings of size 2, 3, 4, 6 or 8.
Note
If the results are not yet processed, they will be processed with process_results.
- Return type
- process_results(scaler=None)
Process the results by applying inverse scaler and one-hot encoding to categories.
- Parameters
scaler (sklearn.base.TransformerMixin, optional) – Any of the sklearn preprocessing modules. The default is None which means there is no transformation on numerical features.
- Return type
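The two processing steps described above (inverse scaling of numerical features, one-hot encoding of categories) can be illustrated roughly as follows. This is a sketch, not the method's actual code: the helper name process, the rule of picking the largest value within each prefix group, and the toy data are assumptions. MinMaxScaler stands in for whichever sklearn scaler was fitted on the training data.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler


def process(results, scaler=None, prefix_sep="+"):
    """Hypothetical stand-in for the processing step described above."""
    processed = results.copy()
    cat_cols = [c for c in processed.columns if prefix_sep in c]
    num_cols = [c for c in processed.columns if prefix_sep not in c]

    # Map numerical features back to their original scale.
    if scaler is not None and num_cols:
        processed[num_cols] = scaler.inverse_transform(processed[num_cols])

    # Within each prefix group, set the largest value to 1 and the rest to 0.
    prefixes = sorted({c.split(prefix_sep)[0] for c in cat_cols})
    for prefix in prefixes:
        group = [c for c in cat_cols if c.split(prefix_sep)[0] == prefix]
        winners = processed[group].to_numpy().argmax(axis=1)
        processed[group] = np.eye(len(group), dtype=int)[winners]
    return processed


# Toy scaler fitted on made-up training ages, scaled to [-1, 1].
scaler = MinMaxScaler(feature_range=(-1, 1)).fit(pd.DataFrame({"age": [20.0, 60.0]}))

# Toy "raw" results: scaled age plus soft values for a categorical feature.
results = pd.DataFrame({
    "age": [0.0, 1.0],
    "sex+Female": [0.7, -0.2],
    "sex+Male": [0.1, 0.4],
})
processed = process(results, scaler)
```

Passing scaler=None in this sketch leaves the numerical columns untouched, matching the documented default.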