Articles by Matt Kaye

Lessons Learned From Running R in Production

June 28, 2023 | Matt Kaye

Introduction A couple weeks ago, I wrote a high-level post on REST APIs. One thing that I noted was that I couldn’t, in good faith, recommend running R (or Plumber, a common library used to create APIs in R) in any type of high-load production sy...

How Can Someone Else Use My Model?

June 20, 2023 | Matt Kaye

This post is part of a series called The Missing Semester of Your DS Education. Introduction At this point in this series, I’ve discussed a lot of aspects of putting machine learning into production. I’ve gone over workflow orchestration...

A Gentle Introduction to Docker

June 5, 2023 | Matt Kaye

This post is part of a series called The Missing Semester of Your DS Education. Introduction If you’re doing data science work, it’s likely you’ll eventually come across a situation where you need to run your code somewhere else. Whether...

Dependency Management

May 27, 2023 | Matt Kaye

This post is part of a series called The Missing Semester of Your DS Education. Introduction When I was first learning to program, I’d face problems that would require (or at least were just made easier by using) library code. For instan...

Unit Testing Analytics Code

April 4, 2023 | Matt Kaye

This post is part of a series called The Missing Semester of Your DS Education. Introduction Unit testing was a concept I had never even heard of before I started my second data science job. It never came up in any of my college stati...

Library Code

March 31, 2023 | Matt Kaye

Introduction By definition, library code is code that’s written to be reused by programs other than itself that are unrelated to each other. For instance, dplyr (R) and pandas (Python) are common examples of library code: Instead of writing co...

Balancing Classes in Classification Problems

March 24, 2023 | Matt Kaye

Introduction In my last post I wrote about common classification metrics and, especially, calibration. With calibration in mind, this post will show why balancing your classes – which is an all-too-common practice when working on classification ...

Interpreting AUC-ROC

March 8, 2023 | Matt Kaye

AUC goes by many names: AUC, AUC-ROC, ROC-AUC, the area under the curve, and so on. It’s an extremely important metric for evaluating machine learning models and it’s an uber-popular data science interview question. It’s also, at least in my exper...

Working With Your Fitbit Data in R

June 7, 2021 | Matt Kaye

Note: This post was updated as of 3/25/2023 for fitbitr v0.3.0 Introduction fitbitr 0.1.0 is now available on CRAN! You can install it with install.packages("fitbitr") or you can get the latest dev version with ## install.packages("devtools") d...
