
How to Scale Only Numeric Columns in R

[This article was first published on Data Science Tutorials, and kindly contributed to R-bloggers].

The post How to Scale Only Numeric Columns in R appeared first on Data Science Tutorials

To scale only the numeric columns of a data frame in R, use the following syntax from the dplyr package.


library(dplyr)
df %>% mutate(across(where(is.numeric), scale))

The example that follows shows how to use this syntax in practice.

Example: Use dplyr to Scale Only Numeric Columns

Let’s say we have the R data frame shown below, which contains details about several basketball players.


Let’s create a data frame

df <- data.frame(Team=c('P1', 'P2', 'P3', 'P4', 'P5'),
                 points=c(2, 3, 7, 22, 8),
                 value=c(27, 39, 49, 82, 54))

Now we can view the data frame

df
  Team points value
1   P1      2    27
2   P2      3    39
3   P3      7    49
4   P4     22    82
5   P5      8    54
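
Before scaling, it can help to confirm which columns R treats as numeric, since where(is.numeric) selects exactly those columns. The following is a minimal sketch using base R's sapply(); the name numeric_cols is only illustrative and is not part of the original example.

```r
# Recreate the example data frame so this snippet runs on its own
df <- data.frame(Team = c('P1', 'P2', 'P3', 'P4', 'P5'),
                 points = c(2, 3, 7, 22, 8),
                 value = c(27, 39, 49, 82, 54))

# One logical value per column: TRUE where is.numeric() accepts the column
numeric_cols <- sapply(df, is.numeric)
numeric_cols
```

Here Team should come back FALSE, while points and value should come back TRUE, so only those two columns would be passed to scale().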

Technical Remarks

The following fundamental syntax is used by R’s scale() function.


scale(x, center = TRUE, scale = TRUE)

where:

x: The numeric object to scale (a vector or matrix-like object)

center: Whether to subtract the mean from each value before scaling. The default is TRUE.

scale: Whether to divide the centered values by the standard deviation. The default is TRUE.
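
To see what the two arguments do, the sketch below applies scale() to the points values from the example, once with scaling switched off and once with the defaults. The variable names centered and standardized are illustrative only.

```r
x <- c(2, 3, 7, 22, 8)

# Center only: subtract the mean, keep the original spread
centered <- scale(x, center = TRUE, scale = FALSE)

# Center and scale (the defaults): subtract the mean, then divide by the sd
standardized <- scale(x)

# scale() stores the values it used as attributes on the result
attr(standardized, "scaled:center")  # the mean of x
attr(standardized, "scaled:scale")   # the standard deviation of x
```

Keeping these attributes can be useful if the same centering and scaling values need to be applied to new data later.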

With the default arguments, the function calculates scaled values using the following formula:

x_scaled = (x_original - x̄) / s

where:

x_original: The original x-value

x̄: The sample mean

s: The sample standard deviation

This process converts each original value into a z-score and is known as standardizing the data (it is sometimes loosely called normalizing).
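
As a quick check of the formula, the sketch below computes the z-scores by hand for the points values and compares them with the result of scale(). The names z_manual and z_scale are illustrative.

```r
points <- c(2, 3, 7, 22, 8)

# Apply the formula directly: subtract the sample mean, divide by the sample sd
z_manual <- (points - mean(points)) / sd(points)

# scale() returns a one-column matrix, so flatten it for comparison
z_scale <- as.numeric(scale(points))

# Should be TRUE if both approaches agree
all.equal(z_manual, z_scale)
```

Note that sd() uses the sample standard deviation (dividing by n - 1), which matches what scale() does when center = TRUE.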

Let’s say we want to scale only the numeric columns of the data frame, using R’s scale() function.


To do this, we can use the syntax shown below.

library(dplyr)

Scale only the data frame’s numeric columns:

df %>% mutate(across(where(is.numeric), scale))
   Team      points      value
1   P1 -0.79813157 -1.1284228
2   P2 -0.67342351 -0.5447558
3   P3 -0.17459128 -0.0583667
4   P4  1.69602958  1.5467175
5   P5 -0.04988322  0.1848279

The Team column has remained the same, but the values in the two numeric columns (points and value) have been scaled.
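
One detail worth knowing: scale() returns a matrix, so each scaled column in the result above is stored as a one-column matrix rather than a plain vector. If plain numeric columns are preferred, one possible variation (an assumption about what you may want, not part of the original example) is to wrap the call in as.numeric():

```r
library(dplyr)

# Recreate the example data frame so this snippet runs on its own
df <- data.frame(Team = c('P1', 'P2', 'P3', 'P4', 'P5'),
                 points = c(2, 3, 7, 22, 8),
                 value = c(27, 39, 49, 82, 54))

# Scale the numeric columns and drop the matrix structure scale() adds
scaled_df <- df %>%
  mutate(across(where(is.numeric), ~ as.numeric(scale(.x))))

# points and value should now report class "numeric"
sapply(scaled_df, class)
```

The scaled values are the same as before; only the storage type of the columns changes, which can avoid surprises in later steps that expect ordinary vectors.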


