Articles by matloff

New qeML Plotting Function

November 5, 2024 | matloff

I’ve added a new function to qeML 1.2, qeMittalGraph, based on an idea by my student Aditya Mittal. Below is an example that I think is rather compelling. The basic idea is quite simple (and not necessarily new, just something I had not seen before): Instead of comparing several curves ...
[Read more...]

New R Package: Data Science Looks at Discrimination (dsld)

September 23, 2024 | matloff

I’m very pleased to announce a new package, dsld, available on CRAN. This is the work of eight talented undergrad students. I provided the concept and some general guidance, but this is their work. The package is aimed at dealing with discrimination — race, gender, age — in the workplace, education, ... [Read more...]

New Paper on Data Privacy

June 9, 2024 | matloff

Readers who are interested in the Data Privacy field may find our new paper (Perry, Matloff, Tendick) of interest, https://tdp.cat/issues21/tdp.a478a22.pdf…. There we introduce a new method that we call RWN, Randomization within Neighborhoods. We present a bit of supporting theory and do some ... [Read more...]

Torch for R Now in the qeML Package

February 18, 2024 | matloff

I’ve added a new function, qeNeuralTorch, to the qeML package, as an alternative to the package’s qeNeural. It is experimental at this point, but usable, and I urge everyone to try it out. In this post, I will (a) state why I felt it desirable to add such ... [Read more...]

Quantile Regression with Random Forests

January 1, 2024 | matloff

In my December 22 blog post, I first introduced the classic parametric quantile regression (QR) concept. I then showed how one could use the qeML package to perform quantile regression nonparametrically, using the package’s qeKNN function for a k-Nearest Neighbors approach. A reader then asked if this could be applied to ...
[Read more...]

qeML Example: Nonparametric Quantile Regression

December 22, 2023 | matloff

In this post, I will first introduce the concept of quantile regression (QR), a powerful technique that is rarely taught in stat courses. I’ll give an example from the quantreg package, and then will show how qeML can be used to do model-free QR estimation. Along the way, I ... [Read more...]

A Comparison of Several qeML Predictive Methods

December 3, 2023 | matloff

Is machine learning overrated, with traditional methods being underrated these days? Yes, ML has had some celebrated successes, but these have come after huge amounts of effort, and it’s possible that similar effort with traditional methods may have produced similar results. A related issue concerns the type of data. ...
[Read more...]

The “Secret Sauce” Used in Many qeML Functions

November 22, 2023 | matloff

In writing an R package, it is often useful to build up some function call in string form, then “execute” the string. To give a really simple example: Quite a lot of trouble to go to just to find that 1+1 = 2? Yes, but this trick can be extremely useful, as we’... [Read more...]

qeML Example: Issues of Overfitting, Dimension Reduction Etc.

November 21, 2023 | matloff

What about variable selection? Which predictor variables/features should we use? No matter what anyone tells you, this is an unsolved problem. But there are lots of useful methods. See the qeML vignettes on feature selection and overfitting for detailed background on the issues involved. We note at the outset ... [Read more...]

New Package, New Book!

November 18, 2023 | matloff

Sorry I haven’t been very active on this blog lately, but now that I have more time, that will change. I’ve got myriad things to say. To begin with, then, I’ll announce a major new R package, and my new book. qeML package (“quick and easy machine ... [Read more...]

New Statistics Tutorial

December 30, 2022 | matloff

I’ve recently completed fastStat, https://github.com/matloff/fastStat, a quick introduction to statistics for those who’ve had a calculus-based probability course. Many such people later need to do statistics, and this will give them quick access. It is modeled after my R tutorial, https://github.com/matloff/... [Read more...]

Just How Good Is ChatGPT in Data Science?

December 4, 2022 | matloff

Many of you may have heard of ChatGPT, a dazzling new AI tool. We are hearing lots of gushing praise for the tool. Well, how well does it do in data science contexts? I tried a few queries here, and found interesting results. I first requested, “Write an R function ...
[Read more...]

Use of Differential Privacy in the US Census–All for Nothing?

September 1, 2022 | matloff

The field of data privacy has long been of broad interest. In a medical database, for instance, how can administrators enable statistical analysis by medical researchers, while at the same time protecting the privacy of individual patients? Over the years, many methods have been proposed and used. I’ve done ... [Read more...]

Base-R and Tidyverse Code, Side-by-Side

August 24, 2022 | matloff

I have a new short writeup, showing common R design patterns, implemented side-by-side in base-R and Tidy. As readers of this blog know, I strongly believe that Tidy is a poor tool for teaching R learners who have no coding background. Relative to learning in a base-R environment, learners using ... [Read more...]

A New Approach to Fairness in Machine Learning

August 15, 2022 | matloff

During the last year or so, I’ve been quite interested in the issue of fairness in machine learning. This area is more personal for me, as it is the confluence of several interests of mine: My lifelong activity in probability theory, math stat and stat methodology (in which I ... [Read more...]

Valuable Webinar in Parallel Computing in R

August 10, 2022 | matloff

George Ostrouchov, one of R’s top parallel computing experts, will run a workshop on cluster parallel computing in R next week. BTW, even a multicore laptop is a “cluster,” so anyone can apply this material to their own work even if ... [Read more...]

Base-R Is Alive and Well

August 6, 2022 | matloff

As many readers of this blog know, I strongly believe that R learners should be taught base-R, not the tidyverse. Eventually the students may settle on using a mix of the two paradigms, but at the learning stage they will benefit from the fact that base-R is simple and more ... [Read more...]