Talks & workshops

Events I have been invited to present at, shared along with slides, videos, and other linkable resources.

2021

Text Preprocessing in R

Invited talk at New York Open Statistical Programming Meetup

Text constitutes an ever-growing part of the data available to us today. However, it is a non-trivial task to transform text, represented as long strings of characters, into numbers that we can use in our statistical and machine learning models. This talk will focus on the {textrecipes} package and its recent advancements in the realm of text preprocessing.

Tidymodels - An Overview

Invited talk at Aalborg RUG

The tidymodels framework is a collection of R packages for modeling and machine learning using tidyverse principles. A large number of packages, from preprocessing to evaluation, come together to form a coherent framework for both statistical modeling and machine learning. This talk will walk through the landscape of packages and where each one fits, giving you a bird’s-eye view.

2020

Predictive modeling with text using tidy data principles

Invited workshop for R/Pharma Conference

Have you ever encountered text data and suspected there was useful insight latent within it but felt frustrated about how to find that insight? Are you familiar with dplyr and ggplot2, and ready to learn how unstructured text data can be used for prediction within the tidyverse and tidymodels ecosystems? Do you need a flexible framework for handling text data that allows you to engage in tasks from exploratory data analysis to supervised predictive modeling?

palette2vec: A new way to explore color palettes

There are many palettes available in various R packages. Ways to explore all of these palettes already exist in the https://github.com/EmilHvitfeldt/r-color-palettes repository and the {paletteer} package. This talk shows what happens when we take explorability one step further. Using handcrafted color features, dimensionality reduction, and interactive tools, we will create and explore a color palette embedding. In this embedded space, we will be able to interactively cluster palettes, find neighboring palettes, and even generate new palettes in a whole new way.

Looking at stop words: why you shouldn’t blindly trust model defaults

Invited talk at Salt Lake City R Users Group

Removing stop words is a fairly common step in natural language processing, and NLP packages often supply a default list. However, most documentation and tutorials don’t explore the nuances of selecting an appropriate list. Defaults for machine learning and modeling can be helpful but may be misleading or wrong. This talk will focus on the importance of checking assumptions and defaults in the software you use.

September 9, 2020

7:00 PM

Salt Lake City R Users Group

