Grid search vs Random search
Among the many hyperparameter tuning techniques, two of the most basic and widely used are grid search and random search. In grid search, a.k.a. brute-force search, a grid of hyperparameter values is set up for evaluation, and every combination of hyperparameters is enumerated. The disadvantage of this approach is that the number of combinations grows exponentially with the number of hyperparameters.
Image adapted from Bergstra and Bengio (2012) by Sydney F.
Unlike grid search, random search evaluates randomly sampled combinations of hyperparameters. Over the same domain, it can find models that are competitive with those found by grid search in a small fraction of the computation time, because it searches a larger, higher-dimensional configuration space more effectively. It has also been shown to be efficient enough for tuning neural networks on several datasets. The two approaches are sketched side by side below.
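As an illustration (not from the original post), the contrast can be sketched with scikit-learn's GridSearchCV and RandomizedSearchCV on a toy SVM problem; the dataset, parameter ranges and evaluation budget below are arbitrary choices made for the example.

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC
from scipy.stats import loguniform

X, y = load_iris(return_X_y=True)

# grid search: enumerate every combination on a fixed grid (3 x 3 = 9 configurations)
grid = GridSearchCV(SVC(), {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1]}, cv=3)
grid.fit(X, y)

# random search: sample the same budget of 9 configurations from continuous distributions
rand = RandomizedSearchCV(
    SVC(),
    {'C': loguniform(1e-2, 1e2), 'gamma': loguniform(1e-3, 1e1)},
    n_iter=9, cv=3, random_state=42)
rand.fit(X, y)

print(grid.best_params_, grid.best_score_)
print(rand.best_params_, rand.best_score_)

With the same budget of nine configurations, grid search is locked to the fixed grid, whereas random search draws each hyperparameter from a continuous range, which is what lets it cover more distinct values per hyperparameter.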
Bayesian optimisation
To improve the efficiency of hyperparameter tuning, Sequential Model-Based Optimisation (SMBO) has been used in many applications where evaluating the fitness function is expensive. The most typical and widely used variant is Bayesian optimisation. It selects the most promising hyperparameters according to a surrogate function, which is much cheaper and easier to optimise, and then evaluates them with the actual objective function.
from hyperopt import hp

space = hp.choice('classifier_type', [
    {
        'type': 'naive_bayes',
    },
    {
        'type': 'svm',
        'C': hp.lognormal('svm_C', 0, 1),
        'kernel': hp.choice('svm_kernel', [
            {'ktype': 'linear'},
            {'ktype': 'RBF', 'width': hp.lognormal('svm_rbf_width', 0, 1)},
        ]),
    },
    {
        'type': 'dtree',
        'criterion': hp.choice('dtree_criterion', ['gini', 'entropy']),
        'max_depth': hp.choice('dtree_max_depth',
            [None, hp.qlognormal('dtree_max_depth_int', 3, 1, 1)]),
        'min_samples_split': hp.qlognormal('dtree_min_samples_split', 2, 1, 1),
    },
])
Example of defining a search space over the hyperparameters of several classification models using HyperOpt
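To actually run the optimisation over this space, HyperOpt's fmin can be used with the TPE algorithm. The following is a minimal, illustrative sketch, not part of the original example: it assumes the scikit-learn estimators GaussianNB, SVC and DecisionTreeClassifier as concrete models, the Iris dataset as a stand-in for real data, and a simple mapping from the sampled RBF 'width' to the kernel's gamma.

from hyperopt import fmin, tpe, Trials, STATUS_OK
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

def build_model(args):
    # map a configuration sampled from the space above to a concrete estimator
    if args['type'] == 'naive_bayes':
        return GaussianNB()
    if args['type'] == 'svm':
        if args['kernel']['ktype'] == 'linear':
            return SVC(C=args['C'], kernel='linear')
        # assumed mapping from the sampled kernel width to gamma, for illustration only
        return SVC(C=args['C'], kernel='rbf', gamma=1.0 / args['kernel']['width'])
    max_depth = args['max_depth']
    return DecisionTreeClassifier(
        criterion=args['criterion'],
        max_depth=None if max_depth is None else int(max_depth),
        min_samples_split=max(2, int(args['min_samples_split'])))

def objective(args):
    # hyperopt minimises the loss, so return the negated cross-validated accuracy
    score = cross_val_score(build_model(args), X, y, cv=3).mean()
    return {'loss': -score, 'status': STATUS_OK}

trials = Trials()
best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=50, trials=trials)
print(best)

fmin returns the index of the chosen branch together with the sampled values; hyperopt's space_eval(space, best) can then recover the full winning configuration.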
Evolutionary algorithm
Evolutionary algorithms are also recognised as a promising optimisation approach for hyperparameter tuning, especially surrogate-model-assisted evolutionary algorithms. Surrogate models, also called metamodels, are less expensive to run and can approximate complex objective functions, making it possible to reproduce experiments or perform many repeated runs without relying on tremendous computational resources.
Here is a piece of example code that uses Platypus' NSGAII to optimise the hyperparameters of an XGBoost classification model, maximising both the accuracy and ROC AUC scores:
from xgboost import XGBClassifier
from platypus import NSGAII, Problem, Real, Integer
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
import sys

# small synthetic binary classification dataset, used here purely for illustration
X, y = make_classification(n_samples=200, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

def ml(**kwargs) -> list:
    learner = XGBClassifier(**kwargs)
    try:
        learner.fit(X_train, y_train)
        y_predict = learner.predict(X_test)
        y_proba = learner.predict_proba(X_test)
        y_pred_prob = np.array(y_proba)
        # for binary classification, roc_auc_score expects the probability of the positive class
        if len(set(y_test)) == 2 and len(y_pred_prob.shape) > 1:
            y_pred_prob = y_pred_prob[:, 1]
        acc = accuracy_score(y_test, y_predict)
        roc = roc_auc_score(y_test, y_pred_prob, multi_class='ovr')
        return [acc, roc]
    except Exception as e:
        print("error! ", e)
        # return the smallest possible objective values if training or scoring fails
        return [sys.float_info.min, sys.float_info.min]

class HyperTuneProblem(Problem):
    def __init__(self):
        # 3 decision variables (hyperparameters), 2 objectives (accuracy and ROC AUC)
        super(HyperTuneProblem, self).__init__(3, 2)
        self.types[:] = [Integer(10, 100), Integer(0, 100), Integer(1, 100)]
        # Platypus minimises by default, so mark both objectives for maximisation
        self.directions[:] = [Problem.MAXIMIZE, Problem.MAXIMIZE]

    def evaluate(self, solution):
        param_values = solution.variables[:]
        param_names = ["n_estimators", "max_delta_step", "num_parallel_tree"]
        params = dict(zip(param_names, param_values))
        solution.objectives[:] = ml(**params)

# define the range of each parameter, here for n_estimators, max_delta_step and num_parallel_tree
platypus_parameters = [Integer(10, 100), Integer(0, 100), Integer(1, 100)]

problem = HyperTuneProblem()
problem.types[:] = platypus_parameters
algorithm = NSGAII(problem, population_size=10)
algorithm.run(10)

for solution in algorithm.result:
    print(solution.objectives)
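Note that Platypus stores Integer decision variables in an encoded form. A small follow-up (not in the original snippet) to recover the actual hyperparameter values of the resulting solutions could look like this:

# decode the encoded integer variables back into hyperparameter values
param_names = ["n_estimators", "max_delta_step", "num_parallel_tree"]
for solution in algorithm.result:
    values = [problem.types[i].decode(v) for i, v in enumerate(solution.variables)]
    print(dict(zip(param_names, values)), solution.objectives)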
Better and faster optimisation with EvoML
Our Evolutionary AI Optimisation platform, EvoML, enables you to optimise hyperparameters better and faster. Besides Bayesian optimisation, evolutionary algorithms and random search, EvoML also introduces a novel approach, the Intelligence Evolutionary Algorithm, which integrates surrogate models into the evolutionary algorithm to further shorten hyperparameter tuning time and speed up convergence. This efficient process allows you to achieve optimal model performance faster and with fewer compute resources.
About the Author
Yuxi Huan | TurinTech Research & Engineering
Passionate about data science and engineering, and interested in optimisation research. Ballet enthusiast, big fan of classical music, travelling, baking and swimming, always looking to try out new things!