Scratch work only.

Inspired by Austin Kleon’s Show Your Work, I am recording notes, snippets, and curiosities from my Data Science work. You won’t find any finished projects here, only scratch work.

Notes

Ryan Melvin

Hyperopt, part 3 (conditional parameters)

The (shockingly) little Hyperopt documentation that exists mentions conditional hyperparameter tuning. (For example, I only need a degree parameter if my SVM has a polynomial kernel.) However, after trying three different examples of how to use conditional parameters, I was ready to give up because none of them worked! Then, I found a Kaggle tutorial that explained I have to unpack the conditions myself whenever they apply; scikit-learn, for example, can’t do that for me. So, here is a working (for me, at least) example of how to use conditional hyperparameters in Hyperopt with scikit-learn classifiers.
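Roughly, the pattern looks like the sketch below (not the exact code from my experiment; the iris data and the parameter ranges are just placeholders):

```python
# Sketch of a conditional search space: the kernel choice nests a degree
# parameter that only exists for the 'poly' branch, and the objective
# unpacks it before building the SVC (scikit-learn won't do that for us).
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # placeholder data

space = {
    'C': hp.loguniform('C', -3, 3),
    'kernel': hp.choice('kernel', [
        {'ktype': 'rbf'},
        {'ktype': 'poly', 'degree': hp.quniform('degree', 2, 5, 1)},
    ]),
}

def objective(params):
    kernel = params['kernel']
    kwargs = {'C': params['C'], 'kernel': kernel['ktype']}
    if kernel['ktype'] == 'poly':
        # The conditional parameter only appears when the 'poly' branch is drawn.
        kwargs['degree'] = int(kernel['degree'])
    score = cross_val_score(SVC(**kwargs), X, y, cv=5).mean()
    return {'loss': -score, 'status': STATUS_OK}  # Hyperopt minimizes

best = fmin(objective, space, algo=tpe.suggest, max_evals=50, trials=Trials())
print(best)
```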

Read More
Ryan Melvin

Scikit-optimize

I’m continuing to explore tools that automate hyperparameter tuning for machine learning. Today’s explorations brought me to scikit-optimize, which, unlike Hyperopt, has thorough documentation. It also offers BayesSearchCV, a drop-in replacement for scikit-learn’s GridSearchCV or RandomizedSearchCV. Despite the nearly unforgivable lack of documentation, Hyperopt does seem to be something of a standard of practice in the machine learning world, and its extensibility makes it powerful and broadly applicable (i.e., not just to scikit-learn models).
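For my own reference, a minimal sketch of what the drop-in looks like (the iris data and the search ranges below are placeholders, not from an actual experiment):

```python
# Sketch of BayesSearchCV as a drop-in replacement for GridSearchCV.
from skopt import BayesSearchCV
from skopt.space import Categorical, Real
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)  # placeholder data

opt = BayesSearchCV(
    SVC(),
    {
        'C': Real(1e-3, 1e3, prior='log-uniform'),
        'gamma': Real(1e-4, 1e1, prior='log-uniform'),
        'kernel': Categorical(['linear', 'rbf']),
    },
    n_iter=32,  # number of parameter settings sampled by the optimizer
    cv=5,
    random_state=0,
)
opt.fit(X, y)
print(opt.best_params_, opt.best_score_)
```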

Read More
Ryan Melvin

Hyperopt, part 2

As expected, the base Hyperopt package for tuning machine learning parameters gives much more control than Hyperopt-sklearn. However, there is virtually no documentation for it. It is only thanks to a Kaggle tutorial and a Medium post that I was able to figure out how to do anything with Hyperopt.
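As a reminder to myself, the basic recipe is roughly the sketch below, with both the search space and the cross-validation scheme mine to choose (the breast-cancer data and the ranges are placeholders):

```python
# Sketch of tuning a random forest with base Hyperopt: I define the space,
# the objective, and the cross-validation scheme myself.
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_breast_cancer(return_X_y=True)  # placeholder data
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)  # my choice of CV

space = {
    'n_estimators': hp.quniform('n_estimators', 50, 500, 50),
    'max_depth': hp.quniform('max_depth', 2, 20, 1),
    'max_features': hp.uniform('max_features', 0.1, 1.0),
}

def objective(params):
    clf = RandomForestClassifier(
        n_estimators=int(params['n_estimators']),
        max_depth=int(params['max_depth']),
        max_features=params['max_features'],
        random_state=0,
    )
    score = cross_val_score(clf, X, y, cv=cv).mean()
    return {'loss': -score, 'status': STATUS_OK}  # Hyperopt minimizes

best = fmin(objective, space, algo=tpe.suggest, max_evals=50, trials=Trials())
print(best)
```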

Read More
Ryan Melvin

R boot

I’ve used the boot library in R before, but I did not realize just how simple it could be. I had worried about creating grids of synthetic data to sample from. Today I learned, thanks to this post on R-bloggers, that all you really need is a wrapper around your model call. I don’t think I could make a better example than they did, so I’ll simply suggest you go read it.

Read More
Ryan Melvin

Hyperopt

I am interested in automating hyperparameter tuning for machine learning models. So far I have favored grid searches, manually expanding the grid whenever a “best” parameter falls on the edge (see Jason Brownlee’s post on the topic). Today, I came across the Python package Hyperopt and its scikit-learn-specific wrapper Hyperopt-sklearn, which have built-in algorithms for strategically searching a parameter space. In a quick test, Hyperopt-sklearn returned a higher-accuracy model than my own “manual” tuning of a random forest. However, I cannot find in the Hyperopt-sklearn documentation how to specify a cross-validation method, so the comparison is likely not a fair one. Hyperopt itself (without the sklearn wrapper) seems to give more control over the search space and the cross-validation method employed. I plan to play with that soon to see what things look like with more control.
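The quick test looked roughly like the sketch below (the breast-cancer data here is a placeholder, and the hpsklearn helper names vary between versions; newer releases use random_forest_classifier instead of random_forest):

```python
# Sketch of the Hyperopt-sklearn quick test. Helper names differ between
# hpsklearn versions (newer releases use random_forest_classifier).
from hpsklearn import HyperoptEstimator, random_forest
from hyperopt import tpe
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)  # placeholder data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

estim = HyperoptEstimator(
    classifier=random_forest('rf'),  # search only random forest configurations
    preprocessing=[],                # skip the preprocessing search
    algo=tpe.suggest,
    max_evals=25,
    trial_timeout=120,
)
estim.fit(X_train, y_train)
print(estim.score(X_test, y_test))
print(estim.best_model())
```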

Read More