wbia.algo.verif package¶
Subpackages¶
- wbia.algo.verif.torch package
- Submodules
- wbia.algo.verif.torch.fit_harness module
- wbia.algo.verif.torch.gpu_util module
- wbia.algo.verif.torch.lr_schedule module
- wbia.algo.verif.torch.models module
- wbia.algo.verif.torch.netmath module
- wbia.algo.verif.torch.old_harness module
- wbia.algo.verif.torch.siamese module
- wbia.algo.verif.torch.train_main module
- Module contents
Submodules¶
wbia.algo.verif.clf_helpers module¶
This module is a work in progress; as such, its concepts are subject to change.
- MAIN IDEA:
- MultiTaskSamples serves as a structure to contain and manipulate a set of samples with potentially many different types of labels and features.
-
class
wbia.algo.verif.clf_helpers.
ClfProblem
[source]¶ Bases:
utool.util_dev.NiceRepr
-
learn_deploy_classifiers
(task_keys=None, clf_key=None, data_key=None)[source]¶ Learns on data without any train/validation split
-
learn_evaluation_classifiers
(task_keys=None, clf_keys=None, data_keys=None)[source]¶ Evaluates by learning classifiers using cross validation. Do not use this to learn production classifiers.
- CommandLine:
python -m wbia.algo.verif.vsone evaluate_classifiers --db PZ_PB_RF_TRAIN --show
python -m clf_helpers learn_evaluation_classifiers
Example
>>> # ENABLE_DOCTEST
>>> from wbia.algo.verif.clf_helpers import *  # NOQA
>>> pblm = IrisProblem()
>>> pblm.setup()
>>> pblm.verbose = True
>>> pblm.eval_clf_keys = ['Logit', 'RF']
>>> pblm.eval_task_keys = ['iris']
>>> pblm.eval_data_keys = ['learn(all)']
>>> result = pblm.learn_evaluation_classifiers()
>>> res = pblm.task_combo_res['iris']['Logit']['learn(all)']
>>> res.print_report()
>>> res = pblm.task_combo_res['iris']['RF']['learn(all)']
>>> res.print_report()
>>> print(result)
-
rrr
(verbose=True, reload_module=True)¶ Special class reloading function. This function is often injected as rrr of classes.
-
-
class
wbia.algo.verif.clf_helpers.
ClfResult
[source]¶ Bases:
utool.util_dev.NiceRepr
Handles evaluation statistics for a multiclass classifier trained on a specific dataset with specific labels.
-
classmethod
combine_results
(res_list, labels=None)[source]¶ Combine results from cross validation runs into a single result representing the performance of the entire dataset
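To illustrate the idea of merging cross-validation runs, here is a minimal sketch; it is not the wbia implementation. It assumes each fold contributes the indices of its held-out samples together with their predictions. The name `combine_fold_predictions` and the `(test_idx, preds)` layout are hypothetical.

```python
def combine_fold_predictions(fold_results, n_samples):
    """Merge per-fold held-out predictions into one list aligned with the dataset.

    fold_results: iterable of (test_idx, preds) pairs (hypothetical layout).
    """
    combined = [None] * n_samples
    seen = set()
    for test_idx, preds in fold_results:
        for idx, pred in zip(test_idx, preds):
            if idx in seen:
                raise ValueError('sample %d is held out in more than one fold' % idx)
            seen.add(idx)
            combined[idx] = pred
    if len(seen) != n_samples:
        raise ValueError('some samples were never held out')
    return combined


# Three folds, each holding out two of the six samples.
folds = [([0, 3], ['a', 'b']), ([1, 4], ['b', 'b']), ([2, 5], ['a', 'a'])]
combined = combine_fold_predictions(folds, n_samples=6)
print(combined)
```

Because every sample is held out exactly once, the merged list can be scored as if it were a single prediction over the entire dataset.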
-
get_pos_threshes
(metric='fpr', value=0.0001, maximize=False, warmup=200, priors=None, min_thresh=0.5)[source]¶ Finds a threshold that achieves the desired value for the desired metric, while maximizing or minimizing the threshold.
For positive classification you want to minimize the threshold. Priors can be passed in to augment probabilities depending on support. By default a class prior is 1 for threshold minimization and 0 for maximization.
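The threshold search described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's algorithm: it ignores `warmup` and `priors`, treats a score as a positive prediction when it is greater than or equal to the threshold, and uses the hypothetical helper name `min_threshold_at_fpr`.

```python
def min_threshold_at_fpr(scores, is_positive, max_fpr=0.0001, min_thresh=0.5):
    """Return the smallest candidate threshold whose FPR is <= max_fpr.

    Hypothetical helper: a score counts as a positive prediction when
    score >= threshold. Returns None if no candidate qualifies.
    """
    n_neg = sum(1 for flag in is_positive if not flag)
    # Only observed scores at or above min_thresh are considered.
    candidates = sorted(set(s for s in scores if s >= min_thresh))
    for thresh in candidates:
        false_pos = sum(
            1 for s, flag in zip(scores, is_positive) if s >= thresh and not flag
        )
        fpr = false_pos / n_neg if n_neg else 0.0
        if fpr <= max_fpr:
            # FPR never increases as the threshold grows, so the first
            # qualifying candidate is the minimum one.
            return thresh
    return None


scores = [0.10, 0.40, 0.55, 0.60, 0.80, 0.90, 0.95]
is_positive = [False, False, False, True, False, True, True]
thresh = min_threshold_at_fpr(scores, is_positive, max_fpr=0.0)
print(thresh)
```

Minimizing the threshold keeps as many true positives as possible while still respecting the false positive budget.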
-
get_thresholds
(metric='mcc', value='maximize')[source]¶
get_metric = 'thresholds'
at_metric = metric = 'mcc'
at_value = value = 'maximize'

a = []
b = []
for x in np.linspace(0, 1, 1000):
    a += [cfms.get_metric_at_metric('thresholds', 'fpr', x, subindex=True)]
    b += [cfms.get_thresh_at_metric('fpr', x)]
a = np.array(a)
b = np.array(b)
d = (a - b)
logger.info((d.min(), d.max()))
-
hardness_analysis
(samples, infr=None, method='argmax')[source]¶
samples = pblm.samples

# TODO MWE with sklearn data

# ClfResult.make_single(ClfResult, clf, X_df, test_idx, labels,
# data_key, feat_dims=None):

import sklearn.datasets
iris = sklearn.datasets.load_iris()

# TODO: make this setup simpler
pblm = ClfProblem()
task_key, clf_key, data_key = 'iris', 'RF', 'learn(all)'
X_df = pd.DataFrame(iris.data, columns=iris.feature_names)
samples = MultiTaskSamples(X_df.index)
samples.apply_indicators({'iris': {name: iris.target == idx
    for idx, name in enumerate(iris.target_names)}})
samples.X_dict = {'learn(all)': X_df}

pblm.samples = samples
pblm.xval_kw['type'] = 'StratifiedKFold'
clf_list, res_list = pblm._train_evaluation_clf(
    task_key, data_key, clf_key)
labels = pblm.samples.subtasks[task_key]
res = ClfResult.combine_results(res_list, labels)

res.get_thresholds('mcc', 'maximize')

predict_method = 'argmax'
-
index
¶
-
classmethod
make_single
(clf, X_df, test_idx, labels, data_key, feat_dims=None)[source]¶ Make a result for a single cross-validation subset
-
rrr
(verbose=True, reload_module=True)¶ Special class reloading function. This function is often injected as rrr of classes.
-
-
class
wbia.algo.verif.clf_helpers.
IrisProblem
[source]¶ Bases:
wbia.algo.verif.clf_helpers.ClfProblem
Simple demo using the abstract clf problem to work on the iris dataset.
- Example:
>>> # ENABLE_DOCTEST
>>> from wbia.algo.verif.clf_helpers import *  # NOQA
>>> pblm = IrisProblem()
>>> pblm.setup()
>>> pblm.samples
-
rrr
(verbose=True, reload_module=True)¶ Special class reloading function. This function is often injected as rrr of classes.
-
class
wbia.algo.verif.clf_helpers.
MultiClassLabels
[source]¶ Bases:
utool.util_dev.NiceRepr
Used by samples to encode a single set of mutually exclusive labels. These can either be binary or multiclass.
import pandas as pd
pd.options.display.max_rows = 10
# pd.options.display.max_rows = 20
pd.options.display.max_columns = 40
pd.options.display.width = 160
-
gen_one_vs_rest_labels
()[source]¶ Example
>>> # ENABLE_DOCTEST
>>> from wbia.algo.verif.clf_helpers import *  # NOQA
>>> indicator = ut.odict([
>>>     ('state1', [0, 0, 0, 1]),
>>>     ('state2', [0, 0, 1, 0]),
>>>     ('state3', [1, 1, 0, 0]),
>>> ])
>>> labels = MultiClassLabels.from_indicators(indicator, task_name='task1')
>>> sublabels = list(labels.gen_one_vs_rest_labels())
>>> sublabel = sublabels[0]
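As a rough illustration of what a one-vs-rest decomposition produces, the sketch below works on plain lists rather than on MultiClassLabels objects. The encoded labels correspond to the indicator columns in the example above; the generator name `gen_one_vs_rest` and its return layout are assumptions made for this sketch only.

```python
def gen_one_vs_rest(y_enc, class_names):
    """Yield (class_name, binary_labels): 1 where the sample has that class."""
    for k, name in enumerate(class_names):
        yield name, [int(y == k) for y in y_enc]


class_names = ['state1', 'state2', 'state3']
# Encoded form of the indicator columns: samples 0 and 1 are state3,
# sample 2 is state2, and sample 3 is state1.
y_enc = [2, 2, 1, 0]
sublabels = list(gen_one_vs_rest(y_enc, class_names))
for name, y_bin in sublabels:
    print(name, y_bin)
```

Each yielded binary problem reproduces one indicator column, so a multiclass task with three states becomes three binary tasks.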
-
rrr
(verbose=True, reload_module=True)¶ Special class reloading function. This function is often injected as rrr of classes.
-
target_type
¶
-
y_bin
¶
-
y_enc
¶
-
-
class
wbia.algo.verif.clf_helpers.
MultiTaskSamples
(index)[source]¶ Bases:
utool.util_dev.NiceRepr
Handles samples (i.e. feature-label pairs) with a combination of non-mutually exclusive subclassification labels
- CommandLine:
- python -m wbia.algo.verif.clf_helpers MultiTaskSamples
Example
>>> # ENABLE_DOCTEST
>>> from wbia.algo.verif.clf_helpers import *  # NOQA
>>> samples = MultiTaskSamples([0, 1, 2, 3])
>>> tasks_to_indicators = ut.odict([
>>>     ('task1', ut.odict([
>>>         ('state1', [0, 0, 0, 1]),
>>>         ('state2', [0, 0, 1, 0]),
>>>         ('state3', [1, 1, 0, 0]),
>>>     ])),
>>>     ('task2', ut.odict([
>>>         ('state4', [0, 0, 0, 1]),
>>>         ('state5', [1, 1, 1, 0]),
>>>     ]))
>>> ])
>>> samples.apply_indicators(tasks_to_indicators)
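The relationship between tasks and states can be made concrete with a small standalone sketch. It assumes that the states within one task are mutually exclusive (exactly one flag per sample) while different tasks are independent of each other. `encode_tasks` is a hypothetical helper, not part of wbia.

```python
def encode_tasks(tasks_to_indicators):
    """Map task name -> (class_names, y_enc), checking mutual exclusivity."""
    encoded = {}
    for task, indicators in tasks_to_indicators.items():
        class_names = list(indicators)
        columns = [indicators[name] for name in class_names]
        y_enc = []
        # zip(*columns) walks the samples; each row holds one flag per class.
        for flags in zip(*columns):
            hits = [k for k, flag in enumerate(flags) if flag]
            if len(hits) != 1:
                raise ValueError('classes of %r are not mutually exclusive' % task)
            y_enc.append(hits[0])
        encoded[task] = (class_names, y_enc)
    return encoded


tasks_to_indicators = {
    'task1': {
        'state1': [0, 0, 0, 1],
        'state2': [0, 0, 1, 0],
        'state3': [1, 1, 0, 0],
    },
    'task2': {
        'state4': [0, 0, 0, 1],
        'state5': [1, 1, 1, 0],
    },
}
encoded = encode_tasks(tasks_to_indicators)
print(encoded['task1'][1], encoded['task2'][1])
```

Every sample therefore carries one encoded label per task, which is what allows a single set of samples to serve several classification problems.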
-
apply_encoded_labels
(y_enc, class_names, task_name)[source]¶ Adds labels for a specific task. Alternative to apply_indicators
Parameters:
-
apply_indicators
(tasks_to_indicators)[source]¶ Adds labels for a specific task
Parameters: tasks_to_indicators (dict) – - takes the form:
}
-
group_ids
¶
-
rrr
(verbose=True, reload_module=True)¶ Special class reloading function. This function is often injected as rrr of classes.
-
class
wbia.algo.verif.clf_helpers.
XValConfig
(**kwargs)[source]¶ Bases:
wbia.dtool.base.Config
wbia.algo.verif.deploy module¶
-
class
wbia.algo.verif.deploy.
Deployer
(dpath='.', pblm=None)[source]¶ Bases:
object
Transforms a OneVsOne problem into a deployable model. Registers and loads published models.
-
deploy
(task_key=None, publish=False)[source]¶ Trains and saves a classifier for deployment
Notes
- A deployment consists of the following information
- The classifier itself
- Information needed to construct the input to the classifier
- TODO: can this be encoded as an sklearn pipeline?
- Metadata concerning what data the classifier was trained with
- PUBLISH TO /media/hdd/PUBLIC/models/pairclf
Example
>>> # xdoctest: +REQUIRES(module:wbia_cnn, --slow)
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> params = dict(sample_method='random')
>>> pblm = OneVsOneProblem.from_empty('PZ_MTEST', **params)
>>> pblm.setup(with_simple=False)
>>> task_key = pblm.primary_task_key
>>> self = Deployer(dpath='.', pblm=pblm)
>>> deploy_info = self.deploy()
- Ignore:
pblm.evaluate_classifiers(with_simple=False)
res = pblm.task_combo_res[pblm.primary_task_key]['RF']['learn(sum,glob)']
-
find_latest_local
()[source]¶
>>> self = Deployer()
>>> self.find_pretrained()
>>> self.find_latest_local()
-
find_latest_remote
()[source]¶ Used to update the published dict
- CommandLine:
- python -m wbia.algo.verif.vsone find_latest_remote
Example
>>> # DISABLE_DOCTEST
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> self = Deployer()
>>> task_clf_names = self.find_latest_remote()
-
fname_fmtstr
= 'vsone.{species}.{task_key}.{clf_key}.{n_dims}.{hashid}'¶
-
fname_parts
= ['vsone', '{species}', '{task_key}', '{clf_key}', '{n_dims}', '{hashid}']¶
-
meta_suffix
= '.meta.json'¶
-
publish_info
= {'path': '/data/public/models/pairclf', 'remote': 'cthulhu.dyn.wildme.io'}¶
-
published
= {'giraffe_reticulated': {'match_state': 'vsone.giraffe_reticulated.match_state.RF.131.kqbaqnrdyxpjrzjd.ggr2.cPkl'}, 'zebra_grevys': {'match_state': 'vsone.zebra_grevys.match_state.RF.131.qwmzlhlnnsgzropq.cPkl'}, 'zebra_grevys+_canonical_': {'match_state': 'vsone.zebra_grevys+_canonical_.match_state.RF.107.cusnlyxbberandka.cPkl'}, 'zebra_mountain': {'match_state': 'vsone.zebra_mountain.match_state.RF.131.lciwhwikfycthvva.cPkl'}, 'zebra_plains': {'match_state': 'vsone.zebra_plains.match_state.RF.131.eurizlstehqjvlsu.cPkl'}}¶
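The class attributes above fit together as a simple naming scheme. The sketch below uses only `str.format` and `str.split` to show how a published file name relates to `fname_fmtstr`. The `.cPkl` suffix is taken from the `published` values; `parse_published_name` is a hypothetical helper written for this illustration.

```python
fname_fmtstr = 'vsone.{species}.{task_key}.{clf_key}.{n_dims}.{hashid}'
fname_parts = ['vsone', '{species}', '{task_key}', '{clf_key}', '{n_dims}', '{hashid}']

# The format string is the list of parts joined by dots.
assert '.'.join(fname_parts) == fname_fmtstr

fname = fname_fmtstr.format(
    species='zebra_plains',
    task_key='match_state',
    clf_key='RF',
    n_dims=131,
    hashid='eurizlstehqjvlsu',
) + '.cPkl'


def parse_published_name(fname, suffix='.cPkl'):
    """Hypothetical inverse of fname_fmtstr for names like those in `published`."""
    stem = fname[:-len(suffix)] if fname.endswith(suffix) else fname
    # Split into at most six fields; anything after the fifth dot (for
    # example an extra tag after the hash) stays in the last field.
    prefix, species, task_key, clf_key, n_dims, hashid = stem.split('.', 5)
    return {
        'prefix': prefix,
        'species': species,
        'task_key': task_key,
        'clf_key': clf_key,
        'n_dims': int(n_dims),
        'hashid': hashid,
    }


info = parse_published_name(fname)
print(fname)
print(info)
```

Splitting on dots works here because none of the species keys listed in `published` contains a dot.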
-
rrr
(verbose=True, reload_module=True)¶ Special class reloading function. This function is often injected as rrr of classes.
-
wbia.algo.verif.oldvsone module¶
wbia.algo.verif.pairfeat module¶
-
class
wbia.algo.verif.pairfeat.
MatchConfig
(**kwargs)[source]¶ Bases:
wbia.dtool.base.Config
-
class
wbia.algo.verif.pairfeat.
PairFeatureConfig
(**kwargs)[source]¶ Bases:
wbia.dtool.base.Config
Config for building pairwise feature dimensions
I.e., a config for distilling unordered feature correspondences into a fixed-length vector.
-
class
wbia.algo.verif.pairfeat.
PairwiseFeatureExtractor
(ibs=None, config={}, use_cache=True, verbose=1, match_config=None, pairfeat_cfg=None, global_keys=None, need_lnbnn=None, feat_dims=None)[source]¶ Bases:
object
Parameters: - ibs (wbia.IBEISController) – image analysis api
- match_config (dict) – config for building feature correspondences
- pairfeat_cfg (dict) – config for making the pairwise feat vec
- global_keys (list) – global keys to use
- need_lnbnn (bool) – use LNBNN for enrichment
- feat_dims (list) – subset of feature dimensions (from pruning) if None, then all dimensions are used
- use_cache (bool) – turns on disk based caching (default = True)
- verbose (int) – verbosity flag (default = 1)
- CommandLine:
- python -m wbia.algo.verif.pairfeat PairwiseFeatureExtractor
Example
>>> # ENABLE_DOCTEST
>>> from wbia.algo.verif.pairfeat import *  # NOQA
>>> import wbia
>>> ibs = wbia.opendb('testdb1')
>>> extr = PairwiseFeatureExtractor(ibs)
>>> edges = [(1, 2), (2, 3)]
>>> X = extr.transform(edges)
>>> featinfo = vt.AnnotPairFeatInfo(X.columns)
>>> print(featinfo.get_infostr())
-
class
wbia.algo.verif.pairfeat.
VsOneFeatConfig
(**kwargs)[source]¶ Bases:
wbia.dtool.base.Config
keypoint params
-
class
wbia.algo.verif.pairfeat.
VsOneMatchConfig
(**kwargs)[source]¶ Bases:
wbia.dtool.base.Config
wbia.algo.verif.ranker module¶
TODO: rewrite the hotspotter lnbnn algo to be a generator
Wrapper around LNBNN hotspotter algorithm
wbia.algo.verif.sklearn_utils module¶
-
class
wbia.algo.verif.sklearn_utils.
PrefitEstimatorEnsemble
(clf_list, voting='soft', weights=None)[source]¶ Bases:
object
hacks around limitations of sklearn.ensemble.VotingClassifier
-
predict
(X)[source]¶ Predict class labels for X.
Parameters: X ({array-like, sparse matrix}, shape = [n_samples, n_features]) – Training vectors, where n_samples is the number of samples and n_features is the number of features.
Returns: maj – Predicted class labels.
Return type: array-like, shape = [n_samples]
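For readers unfamiliar with soft voting, the following sketch shows the general technique on plain lists: the class probabilities of already-fitted classifiers are averaged (optionally weighted) and the class with the highest mean wins. This is a generic illustration and makes no claim about how PrefitEstimatorEnsemble is implemented; `soft_vote` is a hypothetical name.

```python
def soft_vote(prob_list, weights=None):
    """Average per-classifier probabilities and pick the argmax class.

    prob_list: one [n_samples][n_classes] table per prefit classifier.
    """
    if weights is None:
        weights = [1.0] * len(prob_list)
    total = float(sum(weights))
    n_samples = len(prob_list[0])
    n_classes = len(prob_list[0][0])
    avg = [
        [
            sum(w * probs[i][k] for w, probs in zip(weights, prob_list)) / total
            for k in range(n_classes)
        ]
        for i in range(n_samples)
    ]
    pred = [max(range(n_classes), key=row.__getitem__) for row in avg]
    return avg, pred


probs_a = [[0.9, 0.1], [0.4, 0.6]]
probs_b = [[0.6, 0.4], [0.7, 0.3]]
avg, pred = soft_vote([probs_a, probs_b])
avg_w, pred_w = soft_vote([probs_a, probs_b], weights=[3, 1])
print(pred, pred_w)
```

With equal weights the second sample follows the more confident classifier; weighting the first classifier three times as heavily flips that decision.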
-
-
class
wbia.algo.verif.sklearn_utils.
StratifiedGroupKFold
(n_splits=3, shuffle=False, random_state=None)[source]¶ Bases:
sklearn.model_selection._split._BaseKFold
Stratified K-Folds cross-validator with Grouping
Provides train/test indices to split data in train/test sets.
This cross-validation object is a variation of GroupKFold that returns stratified folds. The folds are made by preserving the percentage of samples for each class.
Parameters: n_splits (int, default=3) – Number of folds. Must be at least 2.
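The combination of grouping and stratification can be sketched with a greedy assignment: whole groups are placed into folds one at a time, always choosing the fold that keeps per-class counts most even. This is one possible strategy, shown as an assumption-laden illustration; it is not claimed to match the algorithm used by this class. `stratified_group_folds` is a hypothetical name.

```python
from collections import Counter, defaultdict


def stratified_group_folds(y, groups, n_splits=3):
    """Greedy sketch: assign whole groups to folds while balancing class counts."""
    group_counts = defaultdict(Counter)
    for label, gid in zip(y, groups):
        group_counts[gid][label] += 1
    classes = sorted(set(y))
    fold_counts = [Counter() for _ in range(n_splits)]
    group_to_fold = {}
    # Place the largest groups first; they are the hardest to balance.
    ordered = sorted(group_counts, key=lambda g: (-sum(group_counts[g].values()), g))
    for gid in ordered:
        def cost(fold):
            # Sum over classes of the spread (max - min) of per-fold counts
            # that would result from putting this group into `fold`.
            total = 0
            for c in classes:
                counts = [
                    fold_counts[f][c] + (group_counts[gid][c] if f == fold else 0)
                    for f in range(n_splits)
                ]
                total += max(counts) - min(counts)
            return total

        best = min(range(n_splits), key=cost)
        fold_counts[best].update(group_counts[gid])
        group_to_fold[gid] = best
    return [group_to_fold[g] for g in groups]


y = [0, 0, 1, 1, 0, 0, 1, 1, 0, 1, 0, 1]
groups = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6]
folds = stratified_group_folds(y, groups, n_splits=3)
print(folds)
```

Because assignment happens per group, no group is ever split between the training and test side of a fold, while the cost function pushes the class proportions of the folds toward each other.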
-
wbia.algo.verif.sklearn_utils.
classification_report2
(y_true, y_pred, target_names=None, sample_weight=None, verbose=True)[source]¶ References
https://csem.flinders.edu.au/research/techreps/SIE07001.pdf
https://www.mathworks.com/matlabcentral/fileexchange/5648-bm-cm-?requestedDomain=www.mathworks.com
Jurman, Riccadonna, Furlanello (2012). A Comparison of MCC and CEN Error Measures in MultiClass Prediction
Example
>>> # DISABLE_DOCTEST
>>> from wbia.algo.verif.sklearn_utils import *  # NOQA
>>> y_true = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3]
>>> y_pred = [1, 2, 1, 3, 1, 2, 2, 3, 2, 2, 3, 3, 2, 3, 3, 3, 1, 3]
>>> target_names = None
>>> sample_weight = None
>>> verbose = True
>>> report = classification_report2(y_true, y_pred, verbose=verbose)
- Ignore:
>>> size = 100 >>> rng = np.random.RandomState(0) >>> p_classes = np.array([.90, .05, .05][0:2]) >>> p_classes = p_classes / p_classes.sum() >>> p_wrong = np.array([.03, .01, .02][0:2]) >>> y_true = testdata_ytrue(p_classes, p_wrong, size, rng) >>> rs = [] >>> for x in range(17): >>> p_wrong += .05 >>> y_pred = testdata_ypred(y_true, p_wrong, rng) >>> report = classification_report2(y_true, y_pred, verbose='hack') >>> rs.append(report) >>> import wbia.plottool as pt >>> pt.qtensure() >>> df = pd.DataFrame(rs).drop(['raw'], axis=1) >>> delta = df.subtract(df['target'], axis=0) >>> sqrd_error = np.sqrt((delta ** 2).sum(axis=0)) >>> print('Error') >>> print(sqrd_error.sort_values()) >>> ys = df.to_dict(orient='list') >>> pt.multi_plot(ydata_list=ys)
-
wbia.algo.verif.sklearn_utils.
predict_from_probs
(probs, method='argmax', target_names=None, **kwargs)[source]¶ Predictions are returned as indices into columns or target_names
- Doctest:
>>> from wbia.algo.verif.sklearn_utils import *
>>> rng = np.random.RandomState(0)
>>> probs = pd.DataFrame(rng.rand(10, 3), columns=['a', 'b', 'c'])
>>> pred1 = predict_from_probs(probs, 'argmax')
>>> pred2 = predict_from_probs(probs, 'argmax', target_names=probs.columns)
>>> threshes = probs.loc[0]
>>> pred3 = predict_from_probs(probs, threshes.values, force=True,
>>>                            target_names=probs.columns)
-
wbia.algo.verif.sklearn_utils.
predict_proba_df
(clf, X_df, class_names=None)[source]¶ Calls sklearn classifier predict_proba but then puts results in a dataframe using the same index as X_df and incorporating all possible class_names given
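The point of incorporating all possible class names is that a classifier trained on a subset of classes still yields a table with one column per known class. The sketch below shows that alignment on plain lists. Filling unseen classes with 0.0 is an assumption of this sketch, and `align_probs` is a hypothetical helper.

```python
def align_probs(probs, clf_classes, class_names):
    """Reorder probability columns to class_names; unseen classes get 0.0."""
    col = {name: j for j, name in enumerate(clf_classes)}
    return [
        [row[col[name]] if name in col else 0.0 for name in class_names]
        for row in probs
    ]


# The classifier only saw classes 'b' and 'a' (in that column order).
probs = [[0.2, 0.8], [0.7, 0.3]]
aligned = align_probs(probs, clf_classes=['b', 'a'], class_names=['a', 'b', 'c'])
print(aligned)
```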
-
wbia.algo.verif.sklearn_utils.
predict_with_thresh
(probs, threshes, target_names=None, force=False, multi=True, return_flags=False)[source]¶ If force is True, every row receives a prediction even when no probability passes its threshold; in that case argmax is used.
If more than one class passes its threshold, the one with the highest probability is taken when multi=True; otherwise nan is returned.
- Doctest:
>>> from wbia.algo.verif.sklearn_utils import *
>>> probs = np.array([
>>>     [0.5, 0.5, 0.0],
>>>     [0.4, 0.5, 0.1],
>>>     [1.0, 0.0, 0.0],
>>>     [0.3, 0.3, 0.4],
>>>     [0.1, 0.3, 0.6],
>>>     [0.1, 0.6, 0.3],
>>>     [0.6, 0.1, 0.3],])
>>> threshes = [.5, .5, .5]
>>> pred_enc = predict_with_thresh(probs, threshes)
>>> a = predict_with_thresh(probs, [.5, .5, .5])
>>> b = predict_with_thresh(probs, [.5, .5, .5], force=True)
>>> assert np.isnan(a).sum() == 3
>>> assert np.isnan(b).sum() == 0
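The rules described for this function can be restated as a small pure-Python sketch. It is a reading of the documented behaviour, not the library code. The strict comparison (a probability must exceed its threshold) is inferred from the doctest above, which expects three nan rows. The function name `predict_with_thresh_sketch` is hypothetical.

```python
import math


def predict_with_thresh_sketch(probs, threshes, force=False, multi=True):
    """Return one class index per row, or nan when no decision is made."""
    preds = []
    for row in probs:
        passing = [k for k, (p, t) in enumerate(zip(row, threshes)) if p > t]
        if len(passing) == 1:
            preds.append(passing[0])
        elif len(passing) > 1 and multi:
            # Several classes pass: keep the most probable one.
            preds.append(max(passing, key=lambda k: row[k]))
        elif force:
            # No usable decision: fall back to a plain argmax.
            preds.append(max(range(len(row)), key=lambda k: row[k]))
        else:
            preds.append(math.nan)
    return preds


probs = [
    [0.5, 0.5, 0.0],
    [0.4, 0.5, 0.1],
    [1.0, 0.0, 0.0],
    [0.3, 0.3, 0.4],
    [0.1, 0.3, 0.6],
    [0.1, 0.6, 0.3],
    [0.6, 0.1, 0.3],
]
a = predict_with_thresh_sketch(probs, [.5, .5, .5])
b = predict_with_thresh_sketch(probs, [.5, .5, .5], force=True)
print(a)
print(b)
```

Rows 0, 1 and 3 have no probability strictly above 0.5, so they are undecided without `force` and resolved by argmax with it.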
wbia.algo.verif.verifier module¶
-
class
wbia.algo.verif.verifier.
BaseVerifier
[source]¶ Bases:
utool.util_dev.NiceRepr
-
easiness
(edges, real)[source]¶ Gets the probability of the class each edge is labeled as. Indicates how easy it is to classify this example.
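In other words, easiness is a lookup of the predicted probability at the true label. A minimal sketch on plain dictionaries follows. The class names used here are placeholders and `easiness_sketch` is a hypothetical helper, not the method itself.

```python
def easiness_sketch(probs, real):
    """probs: one {class_name: probability} dict per edge; real: true class per edge."""
    return [row[label] for row, label in zip(probs, real)]


# Placeholder class names chosen for illustration only.
probs = [
    {'pos': 0.9, 'neg': 0.1},
    {'pos': 0.3, 'neg': 0.7},
    {'pos': 0.6, 'neg': 0.4},
]
real = ['pos', 'neg', 'neg']
easy = easiness_sketch(probs, real)
print(easy)
```

The third edge is labeled `neg` but the classifier gives that class only 0.4, so it has the lowest easiness of the three.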
-
fit
(edges)[source]¶ The vsone.OneVsOneProblem currently handles fitting a model based on edges. The actual fit call is in clf_helpers.py
-
rrr
(verbose=True, reload_module=True)¶ Special class reloading function. This function is often injected as rrr of classes.
-
-
class
wbia.algo.verif.verifier.
IntraVerifier
(pblm, task_key, clf_key, data_key)[source]¶ Bases:
wbia.algo.verif.verifier.BaseVerifier
Predicts cross-validated intra-training sample probs.
Note
Requires the original OneVsOneProblem object. This classifier is for intra-dataset evaluation and is not meant to be published for use on external datasets.
-
predict_proba_df
(want_edges)[source]¶ - Predicts task probabilities in one of two ways:
- if the edge was in the training set then its cross-validated probability is returned.
- if the edge was not in the training set, then the average prediction over all cross-validated classifiers is used.
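The two cases above can be sketched as follows. The sketch assumes that held-out probabilities for training edges are available in a dictionary and that each cross-validation fold provides a callable predictor. Both structures and the name `intra_predict` are hypothetical stand-ins for the real objects.

```python
def intra_predict(want_edges, xval_probs, fold_predictors):
    """Return {edge: probability} using the two documented cases.

    xval_probs: {edge: held-out probability} for edges in the training set.
    fold_predictors: one callable (edge -> probability) per CV classifier.
    """
    out = {}
    for edge in want_edges:
        if edge in xval_probs:
            # Case 1: the edge was trained on, so use its held-out score.
            out[edge] = xval_probs[edge]
        else:
            # Case 2: unseen edge, so average over every fold's classifier.
            preds = [predict(edge) for predict in fold_predictors]
            out[edge] = sum(preds) / len(preds)
    return out


xval_probs = {(1, 2): 0.9}
fold_predictors = [lambda edge: 0.2, lambda edge: 0.4]
result = intra_predict([(1, 2), (3, 4)], xval_probs, fold_predictors)
print(result)
```

Using the held-out score for training edges avoids reporting a probability from a classifier that has already seen that edge's label.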
-
rrr
(verbose=True, reload_module=True)¶ Special class reloading function. This function is often injected as rrr of classes.
-
-
class
wbia.algo.verif.verifier.
Verifier
(ibs=None, deploy_info=None)[source]¶ Bases:
wbia.algo.verif.verifier.BaseVerifier
Notes
- deploy_info should be a dict with the following keys:
clf: sklearn classifier
metadata: another dict with keys:
class_names - classes that clf predicts
task_key - str
clf_key - str
data_info - tuple of (feat_extract_config, feat_dims)
# TODO: make feat dims part of feat_extract_config defaulted to None
Example
>>> # DISABLE_DOCTEST
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> import wbia
>>> ibs = wbia.opendb('PZ_MTEST')
>>> species = 'zebra_plains'
>>> task_key = 'match_state'
>>> verif = Deployer()._load_published(ibs, species, task_key)
-
rrr
(verbose=True, reload_module=True)¶ Special class reloading function. This function is often injected as rrr of classes.
wbia.algo.verif.vsone module¶
- CommandLine:
# Test how well out-of-the-box vsone classifiers do:
python -m wbia.algo.verif.vsone evaluate_classifiers --db DETECT_SEATURTLES
# Train a classifier for deployment
# Will output to the current working directory
python -m wbia.algo.verif.vsone deploy --db GZ_Master1
-
class
wbia.algo.verif.vsone.
AnnotPairSamples
(ibs, aid_pairs, infr=None, apply=False)[source]¶ Bases:
wbia.algo.verif.clf_helpers.MultiTaskSamples
,ubelt.util_mixins.NiceRepr
Manages the different ways to assign samples (i.e. feat-label pairs) to 1-v-1 classification
- CommandLine:
- python -m wbia.algo.verif.vsone AnnotPairSamples
Example
>>> # DISABLE_DOCTEST
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> pblm = OneVsOneProblem.from_empty()
>>> pblm.load_samples()
>>> samples = AnnotPairSamples(pblm.ibs, pblm.raw_simple_scores, {})
>>> print(samples)
>>> samples.print_info()
>>> print(samples.sample_hashid())
>>> encode_index = samples.subtasks['match_state'].encoded_df.index
>>> indica_index = samples.subtasks['match_state'].indicator_df.index
>>> assert np.all(samples.index == encode_index)
>>> assert np.all(samples.index == indica_index)
-
edge_set_hashid
()[source]¶ Faster than using ut.combine_uuids, because we condense and don’t bother casting back to UUIDS, and we just directly hash.
-
group_ids
¶ Prevents samples with the same group-id from appearing in the same cross validation fold. For us this means any pair within the same name or between the same names will have the same groupid.
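One way to obtain group ids with this property is to key each pair by the unordered pair of names it connects. The sketch below illustrates that idea with plain dictionaries. It is an assumption about how such ids could be built, not a description of the actual property, and `pair_group_ids` is a hypothetical helper.

```python
def pair_group_ids(aid_pairs, aid_to_name):
    """Give pairs the same id when they connect the same (unordered) names."""
    key_to_gid = {}
    gids = []
    for aid1, aid2 in aid_pairs:
        # Sorting makes (A, B) and (B, A) share a key; (A, A) covers pairs
        # that lie within a single name.
        key = tuple(sorted((aid_to_name[aid1], aid_to_name[aid2])))
        gid = key_to_gid.setdefault(key, len(key_to_gid))
        gids.append(gid)
    return gids


aid_to_name = {1: 'A', 2: 'A', 3: 'A', 4: 'B', 5: 'B'}
aid_pairs = [(1, 2), (2, 3), (1, 4), (5, 3), (4, 5)]
gids = pair_group_ids(aid_pairs, aid_to_name)
print(gids)
```

Passing such ids to a group-aware splitter keeps, for example, all A-to-B comparisons on the same side of every train/test split.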
-
rrr
(verbose=True, reload_module=True)¶ Special class reloading function. This function is often injected as rrr of classes.
-
class
wbia.algo.verif.vsone.
OneVsOneProblem
(infr=None, verbose=None, **params)[source]¶ Bases:
wbia.algo.verif.clf_helpers.ClfProblem
Keeps information about the one-vs-one pairwise classification problem
- CommandLine:
python -m wbia.algo.verif.vsone evaluate_classifiers
python -m wbia.algo.verif.vsone evaluate_classifiers --db PZ_PB_RF_TRAIN
python -m wbia.algo.verif.vsone evaluate_classifiers --db PZ_PB_RF_TRAIN --profile
python -m wbia.algo.verif.vsone evaluate_classifiers --db PZ_MTEST --show
python -m wbia.algo.verif.vsone evaluate_classifiers --db PZ_Master1 --show
python -m wbia.algo.verif.vsone evaluate_classifiers --db GZ_Master1 --show
python -m wbia.algo.verif.vsone evaluate_classifiers --db RotanTurtles --show
python -m wbia.algo.verif.vsone evaluate_classifiers --db testdb1 --show -a default
Example
>>> # xdoctest: +REQUIRES(module:wbia_cnn, --slow)
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> pblm = OneVsOneProblem.from_empty('PZ_MTEST')
>>> pblm.hyper_params['xval_kw']['n_splits'] = 10
>>> assert pblm.xval_kw.n_splits == 10
>>> pblm.xval_kw.n_splits = 5
>>> assert pblm.hyper_params['xval_kw']['n_splits'] == 5
>>> pblm.load_samples()
>>> pblm.load_features()
-
appname
= 'vsone_rf_train'¶
-
auto_decisions_at_threshold
(primary_task, task_probs, task_thresh, task_keys, clf_key, data_key)[source]¶
-
build_feature_subsets
()[source]¶ Try to identify a useful subset of features to reduce problem dimensionality
- CommandLine:
python -m wbia.algo.verif.vsone build_feature_subsets --db GZ_Master1
python -m wbia.algo.verif.vsone build_feature_subsets --db PZ_PB_RF_TRAIN
python -m wbia Chap4._setup_pblm --db GZ_Master1 --eval
python -m wbia Chap4._setup_pblm --db PZ_Master1 --eval
Example
>>> # DISABLE_DOCTEST
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> pblm = OneVsOneProblem.from_empty('PZ_MTEST')
>>> pblm.load_samples()
>>> pblm.load_features()
>>> pblm.build_feature_subsets()
>>> pblm.samples.print_featinfo()
-
deploy
(dpath='.', task_key=None, publish=False)[source]¶ Trains and saves a classifier for deployment
Example
>>> # DISABLE_DOCTEST
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> pblm = OneVsOneProblem.from_empty(defaultdb='PZ_MTEST',
>>>                                   sample_method='random')
>>> task_key = ut.get_argval('--task', default='match_state')
>>> publish = ut.get_argflag('--publish')
>>> pblm.deploy(task_key=task_key, publish=publish)
Notes
- A deployment consists of the following information
- The classifier itself
- Information needed to construct the input to the classifier
- TODO: can this be encoded as an sklearn pipeline?
- Metadata concerning what data the classifier was trained with
- PUBLISH TO /media/hdd/PUBLIC/models/pairclf
- Ignore:
pblm.evaluate_classifiers(with_simple=False)
res = pblm.task_combo_res[pblm.primary_task_key]['RF']['learn(sum,glob)']
-
evaluate_classifiers
(with_simple=False)[source]¶ - CommandLine:
python -m wbia.algo.verif.vsone evaluate_classifiers
python -m wbia.algo.verif.vsone evaluate_classifiers --db PZ_MTEST
python -m wbia.algo.verif.vsone evaluate_classifiers --db GZ_Master1
python -m wbia.algo.verif.vsone evaluate_classifiers --db GIRM_Master1
Example
>>> # DISABLE_DOCTEST
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> pblm = OneVsOneProblem.from_empty(defaultdb='PZ_MTEST',
>>>                                   sample_method='random')
>>> #pblm.default_clf_key = 'Logit'
>>> pblm.default_clf_key = 'RF'
>>> pblm.evaluate_classifiers()
-
evaluate_simple_scores
(task_keys=None)[source]¶
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> pblm = OneVsOneProblem.from_empty()
>>> pblm.set_pandas_options()
>>> pblm.load_samples()
>>> pblm.load_features()
>>> pblm.evaluate_simple_scores()
-
feature_importance
(task_key=None, clf_key=None, data_key=None)[source]¶ - CommandLine:
python -m wbia.algo.verif.vsone report_importance --show
python -m wbia.algo.verif.vsone report_importance --show --db PZ_PB_RF_TRAIN
Example
>>> # DISABLE_DOCTEST >>> from wbia.algo.verif.vsone import * # NOQA >>> pblm = OneVsOneProblem.from_empty('GZ_Master1') >>> data_key = pblm.default_data_key >>> clf_key = pblm.default_clf_key >>> task_key = pblm.primary_task_key >>> pblm.setup_evaluation() >>> featinfo = pblm.feature_info(task_key, clf_key, data_key) >>> ut.quit_if_noshow() >>> import wbia.plottool as pt >>> text = importances >>> pt.wordcloud(featinfo.importances) >>> ut.show_if_requested()
-
classmethod
from_aids
(ibs, aids, verbose=None, **params)[source]¶ Constructs a OneVsOneProblem from a subset of aids. Use pblm.load_samples to sample a set of pairs
-
classmethod
from_empty
(defaultdb=None, **params)[source]¶
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> defaultdb = 'GIRM_Master1'
>>> pblm = OneVsOneProblem.from_empty(defaultdb)
-
classmethod
from_labeled_aidpairs
(ibs, labeled_aid_pairs, class_names, task_name, **params)[source]¶ Build a OneVsOneProblem directly from a set of aid pairs. It is not necessary to call pblm.load_samples.
Parameters: - ibs (IBEISController) –
- labeled_aid_pairs (list) – tuples of (aid1, aid2, int_label)
- class_names (list) – list of names corresponding to integer labels
- task_name (str) – identifier for the task (e.g. custom_match_state)
-
load_features
(use_cache=True, with_simple=False)[source]¶ - CommandLine:
- python -m wbia.algo.verif.vsone load_features --profile
Example
>>> # DISABLE_DOCTEST
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> #pblm = OneVsOneProblem.from_empty('GZ_Master1')
>>> pblm = OneVsOneProblem.from_empty('PZ_PB_RF_TRAIN')
>>> pblm.load_samples()
>>> pblm.load_features(with_simple=False)
-
load_samples
()[source]¶ - CommandLine:
- python -m wbia.algo.verif.vsone load_samples --profile
Example
>>> # DISABLE_DOCTEST
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> #pblm = OneVsOneProblem.from_empty('PZ_MTEST')
>>> #pblm = OneVsOneProblem.from_empty('PZ_PB_RF_TRAIN')
>>> pblm = OneVsOneProblem.from_empty('PZ_Master1')
>>> pblm.load_samples()
>>> samples = pblm.samples
>>> samples.print_info()
-
make_graph_based_bootstrap_pairs
()[source]¶ Sampling method for when you want to bootstrap VAMP after several reviews.
Sample pairs for VAMP training using manually reviewed edges and mines other (random) pairs as needed.
- We first sample a base set via:
(1) take all manually reviewed positive edges (not in an inconsistent PCC)
(2) take all manually reviewed negative edges (not touching an inconsistent PCC)
(3) take all manually reviewed incomparable edges.
Note: it is important to ignore any PCC currently in an inconsistent state.
We can then generate additional positive samples by sampling automatically reviewed positive edges within PCCs.
We can do the same for negatives.
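The base-set step described above can be sketched on plain data structures. The sketch assumes each reviewed edge carries a decision string and that every node maps to the id of its PCC. The decision labels, the `node_to_pcc` mapping and the name `sample_base_edges` are all hypothetical, and the later step of mining automatically reviewed edges is not shown.

```python
def sample_base_edges(reviewed_edges, node_to_pcc, inconsistent_pccs):
    """Keep manually reviewed edges that avoid every inconsistent PCC.

    reviewed_edges: list of (u, v, decision) tuples; the decision strings
    are placeholders chosen for this sketch.
    """
    base = {'positive': [], 'negative': [], 'incomparable': []}
    for u, v, decision in reviewed_edges:
        if node_to_pcc[u] in inconsistent_pccs or node_to_pcc[v] in inconsistent_pccs:
            continue  # ignore anything touching an inconsistent PCC
        base[decision].append((u, v))
    return base


node_to_pcc = {1: 'p0', 2: 'p0', 3: 'p1', 4: 'p1', 5: 'p2', 6: 'p2'}
inconsistent_pccs = {'p2'}
reviewed_edges = [
    (1, 2, 'positive'),
    (3, 4, 'positive'),
    (5, 6, 'positive'),      # inside the inconsistent PCC: dropped
    (1, 3, 'negative'),
    (4, 5, 'negative'),      # touches the inconsistent PCC: dropped
    (2, 4, 'incomparable'),
]
base = sample_base_edges(reviewed_edges, node_to_pcc, inconsistent_pccs)
print(base)
```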
-
make_training_pairs
()[source]¶ - CommandLine:
- python -m wbia.algo.verif.vsone make_training_pairs --db PZ_Master1
Example
>>> # DISABLE_DOCTEST
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> pblm = OneVsOneProblem.from_empty('PZ_MTEST')
>>> pblm.make_training_pairs()
-
prune_features
()[source]¶ References
http://blog.datadive.net/selecting-good-features-part-iii-random-forests/ http://alexperrier.github.io/jekyll/update/2015/08/27/feature-importance-random-forests-gini-accuracy.html https://arxiv.org/abs/1407.7502 https://github.com/glouppe/phd-thesis
Example
>>> # DISABLE_DOCTEST
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> pblm = OneVsOneProblem.from_empty(defaultdb='PZ_MTEST')
>>> pblm = OneVsOneProblem.from_empty(defaultdb='PZ_PB_RF_TRAIN')
>>> pblm = OneVsOneProblem.from_empty(defaultdb='PZ_Master1')
- Ignore:
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> pblm = OneVsOneProblem.from_empty(defaultdb='GZ_Master1')
>>> pblm.setup_evaluation()
-
qt_review_hardcases
()[source]¶ Example
>>> # DISABLE_DOCTEST
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> pblm = OneVsOneProblem.from_empty('PZ_Master1')
>>> #pblm = OneVsOneProblem.from_empty('GIRM_Master1')
>>> #pblm = OneVsOneProblem.from_empty('PZ_PB_RF_TRAIN')
>>> pblm.evaluate_classifiers()
>>> win = pblm.qt_review_hardcases()
- Ignore:
>>> from wbia.scripts.postdoc import * >>> self = VerifierExpt('RotanTurtles') >>> self = VerifierExpt('humpbacks_fb') >>> import wbia >>> self._precollect() >>> ibs = self.ibs >>> aids = self.aids_pool >>> pblm = vsone.OneVsOneProblem.from_aids(ibs, aids) >>> infr = pblm.infr >>> infr.params['algo.hardcase'] = True >>> infr.params['autoreview.enabled'] = False >>> infr.params['redun.enabled'] = False >>> infr.params['ranking.enabled'] = False >>> win = infr.qt_review_loop()
>>> pblm.eval_data_keys = [pblm.default_data_key] >>> pblm.eval_clf_keys = [pblm.default_clf_key] >>> pblm.evaluate_classifiers()
- Ignore:
>>> # TEST to ensure we can prioritize reviewed edges without inference
>>> import networkx as nx
>>> from wbia.algo.graph import demo
>>> kwargs = dict(num_pccs=6, p_incon=.4, size_std=2)
>>> infr = demo.demodata_infr(**kwargs)
>>> infr.params['redun.pos'] = 1
>>> infr.params['redun.neg'] = 1
>>> infr.apply_nondynamic_update()
>>> edges = list(infr.edges())
>>> prob_match = ut.dzip(edges, infr.dummy_matcher.predict(edges))
>>> infr.set_edge_attrs('prob_match', prob_match)
>>> infr.params['redun.enabled'] = True
>>> infr.prioritize('prob_match', edges)
>>> order = []
>>> while True:
>>>     order.append(infr.pop())
>>> print(len(order))
-
report_evaluation
()[source]¶ - CommandLine:
- python -m wbia.algo.verif.vsone report_evaluation --db PZ_MTEST
Example
>>> # DISABLE_DOCTEST
>>> from wbia.algo.verif.vsone import *  # NOQA
>>> pblm = OneVsOneProblem.from_empty(defaultdb='PZ_MTEST',
>>>                                   sample_method='random')
>>> pblm.eval_clf_keys = ['MLP', 'Logit', 'RF']
>>> pblm.eval_data_keys = ['learn(sum,glob)']
>>> pblm.setup_evaluation(with_simple=False)
>>> pblm.report_evaluation()
-
rrr
(verbose=True, reload_module=True)¶ Special class reloading function. This function is often injected as rrr of classes.
-
class
wbia.algo.verif.vsone.
PairSampleConfig
(**kwargs)[source]¶ Bases:
wbia.dtool.base.Config
Module contents¶
-
wbia.algo.verif.
IMPORT_TUPLES
= [('clf_helpers', None), ('sklearn_utils', None), ('vsone', None), ('deploy', None), ('verifier', None), ('pairfeat', None)]¶
Regen Command:
cd /home/joncrall/code/wbia/wbia/algo/verif
makeinit.py --modname=wbia.algo.verif
-
wbia.algo.verif.
reassign_submodule_attributes
(verbose=1)[source]¶ Updates attributes in the __init__ modules with updated attributes in the submodules.
-
wbia.algo.verif.
rrrr
(verbose=1)¶ Reloads wbia.algo.verif and submodules