Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Introduction
Hey, R users! 🌟 Today, we’re going to look at a great new addition to the healthyR.data package: the get_cms_meta_data() function! This function is a helpful tool for retrieving and analyzing metadata from CMS (Centers for Medicare & Medicaid Services) datasets. Whether you’re a healthcare analyst, data scientist, or R programming fan, you’ll find this function very useful. Let’s break it down and explore how it works.
Overview of get_cms_meta_data()
The get_cms_meta_data() function lets you retrieve metadata from CMS datasets easily. You can narrow your search by title, modified date, keyword, identifier, data version, and media type, so you get precisely the data you need. Here’s the syntax:
    get_cms_meta_data(
      .title = NULL,
      .modified_date = NULL,
      .keyword = NULL,
      .identifier = NULL,
      .data_version = "current",
      .media_type = "all"
    )
Arguments:
.title: Search by title.
.modified_date: Search by modified date (format: “YYYY-MM-DD”).
.keyword: Search by keyword.
.identifier: Search by identifier.
.data_version: Choose between “current”, “archive”, or “all”. Default is “current”.
.media_type: Filter by media type (“all”, “csv”, “API”, “other”). Default is “all”.
Return Value:
A tibble containing data links and relevant metadata about the datasets.
Details:
The function fetches JSON data from the CMS data URL and extracts relevant fields to create a tidy tibble. It selects specific columns, handles nested lists by unnesting them, cleans column names, and processes dates and media types to make the data more useful for analysis. The columns in the returned tibble include:
title
description
landing_page
modified
keyword
described_by
fn
has_email
identifier
start
end
references
distribution_description
distribution_title
distribution_modified
distribution_start
distribution_end
media_type
data_link
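To make those processing steps a bit more concrete, here is a minimal sketch of the same general idea: parse JSON, unnest a nested list, clean the column names, and convert dates. This is not the package’s internal code. The JSON below is a tiny invented catalog (its records and URLs are made up), the to_snake() helper is hypothetical, and the real CMS feed has many more fields.

```r
library(jsonlite)
library(dplyr)
library(tidyr)

# Toy JSON standing in for a data catalog; every record here is invented.
json_txt <- '{
  "dataset": [
    {
      "title": "Example Dataset A",
      "modified": "2024-01-29",
      "landingPage": "https://example.org/a",
      "keyword": ["Medicare", "Example"],
      "distribution": [
        {"title": "A latest", "mediaType": "text/csv", "downloadURL": "https://example.org/a.csv"},
        {"title": "A 2023", "mediaType": "text/csv", "downloadURL": "https://example.org/a-2023.csv"}
      ]
    },
    {
      "title": "Example Dataset B",
      "modified": "2024-04-23",
      "landingPage": "https://example.org/b",
      "keyword": ["Example"],
      "distribution": [
        {"title": "B latest", "mediaType": "text/csv", "downloadURL": "https://example.org/b.csv"}
      ]
    }
  ]
}'

# Hypothetical helper: turn camelCase names into snake_case.
to_snake <- function(x) tolower(gsub("([a-z0-9])([A-Z])", "\\1_\\2", x))

toy_meta <- fromJSON(json_txt)$dataset |>
  as_tibble() |>
  # One row per distribution; prefix its columns to avoid name clashes
  unnest(distribution, names_sep = "_") |>
  rename_with(to_snake) |>
  # Dates arrive as strings, so convert them
  mutate(modified = as.Date(modified))

glimpse(toy_meta)
```

The names_sep argument matters here because both the dataset and its distributions have a title field; prefixing the unnested columns keeps them apart.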
Practical Examples
Let’s see the get_cms_meta_data()
function in action with a couple of examples.
Example 1: Basic Usage
First, we’ll load the necessary libraries and fetch some metadata:
    # Library Loads
    library(healthyR.data)
    library(dplyr)

    # Get data
    cms_data <- get_cms_meta_data()
    glimpse(cms_data)
    Rows: 107
    Columns: 19
    $ title                    <chr> "Accountable Care Organization Participants",…
    $ description              <chr> "The Accountable Care Organization Participan…
    $ landing_page             <chr> "https://data.cms.gov/medicare-shared-savings…
    $ modified                 <date> 2024-01-29, 2024-04-23, 2024-01-12, 2024-01-…
    $ keyword                  <list> <"Medicare", "Value-Based Care", "Coordinate…
    $ described_by             <chr> "https://data.cms.gov/resources/accountable-c…
    $ fn                       <chr> "Shared Savings Program - CM", "Shared Saving…
    $ has_email                <chr> "SharedSavingsProgram@cms.hhs.gov", "SharedSa…
    $ identifier               <chr> "https://data.cms.gov/data-api/v1/dataset/976…
    $ start                    <date> 2014-01-01, 2017-01-01, 2021-01-01, 2021-01-…
    $ end                      <date> 2024-12-31, 2024-12-31, 2021-12-31, 2021-12-…
    $ references               <chr> "https://data.cms.gov/resources/acos-aco-part…
    $ distribution_description <chr> "latest", "latest", "latest", "latest", "late…
    $ distribution_title       <chr> "Accountable Care Organization Participants",…
    $ distribution_modified    <date> 2024-01-29, 2024-04-23, 2024-01-12, 2024-01-…
    $ distribution_start       <date> 2024-01-01, 2024-01-01, 2021-01-01, 2021-01-…
    $ distribution_end         <date> 2024-12-31, 2024-12-31, 2021-12-31, 2021-12-…
    $ media_type               <chr> "API", "API", "API", "API", "API", "API", "AP…
    $ data_link                <chr> "https://data.cms.gov/data-api/v1/dataset/976…
    # Attributes
    atb <- attributes(cms_data)
    atb$names

     [1] "title"                    "description"
     [3] "landing_page"             "modified"
     [5] "keyword"                  "described_by"
     [7] "fn"                       "has_email"
     [9] "identifier"               "start"
    [11] "end"                      "references"
    [13] "distribution_description" "distribution_title"
    [15] "distribution_modified"    "distribution_start"
    [17] "distribution_end"         "media_type"
    [19] "data_link"

    atb$class

    [1] "cms_meta_data" "tbl_df"        "tbl"           "data.frame"

    atb$url

    [1] "https://data.cms.gov/data.json"

    atb$date_retrieved

    [1] "2024-05-28 10:20:18 EDT"

    atb$parameters

    $.data_version
    [1] "current"

    $.media_type
    [1] "all"

    $.title
    NULL

    $.modified_date
    NULL

    $.keyword
    NULL

    $.identifier
    NULL
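In this example, we’re simply calling get_cms_meta_data() without any parameters, so the defaults apply: .data_version = "current" and .media_type = "all". The glimpse() function from the dplyr package provides a quick overview of the data structure. The returned object also carries attributes that record the source URL, the retrieval time, and the parameters used.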
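One thing worth noticing in the output above is that keyword is a list column, with several keywords per dataset. Here is a small sketch of how you might tally keywords across datasets. To keep it runnable without a network call, it uses mock_meta, an invented stand-in tibble with the same shape as a few of the returned columns; with real data you would swap in cms_data.

```r
library(dplyr)
library(tidyr)

# Invented stand-in for the tibble returned by get_cms_meta_data()
mock_meta <- tibble(
  title = c("Dataset A", "Dataset B", "Dataset C"),
  keyword = list(
    c("Medicare", "Value-Based Care"),
    c("Medicare", "National"),
    c("National")
  ),
  media_type = c("API", "API", "csv")
)

# Expand the list column to one row per (dataset, keyword), then count
keyword_counts <- mock_meta |>
  unnest(keyword) |>
  count(keyword, sort = TRUE)

keyword_counts
```

Unnesting first means ordinary dplyr verbs such as count() and filter() work on the keywords directly.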
Example 2: Custom Search by Keyword and Title
Now, let’s refine our search by specifying a keyword and title:
    get_cms_meta_data(
      .keyword = "nation",
      .title = "Market Saturation & Utilization State-County"
    ) |>
      glimpse()
    Rows: 1
    Columns: 19
    $ title                    <chr> "Market Saturation & Utilization State-County"
    $ description              <chr> "The Market Saturation and Utilization State-…
    $ landing_page             <chr> "https://data.cms.gov/summary-statistics-on-u…
    $ modified                 <date> 2024-04-02
    $ keyword                  <list> <"National", "States & Territories", "Countie…
    $ described_by             <chr> "https://data.cms.gov/resources/market-satur…
    $ fn                       <chr> "Market Saturation - CPI"
    $ has_email                <chr> "MarketSaturation@cms.hhs.gov"
    $ identifier               <chr> "https://data.cms.gov/data-api/v1/dataset/89…
    $ start                    <date> 2023-10-01
    $ end                      <date> 2023-12-31
    $ references               <chr> "https://data.cms.gov/resources/market-satura…
    $ distribution_description <chr> "latest"
    $ distribution_title       <chr> "Market Saturation & Utilization StateCounty"
    $ distribution_modified    <date> 2024-04-02
    $ distribution_start       <date> 2023-10-01
    $ distribution_end         <date> 2023-12-31
    $ media_type               <chr> "API"
    $ data_link                <chr> "https://data.cms.gov/data-api/v1/dataset/890…
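In this example, we filter the metadata by the keyword “nation” and the title “Market Saturation & Utilization State-County”. Note that “nation” matched the dataset’s “National” keyword in the output, so the keyword search appears to accept partial matches. The pipe operator (|>) is used to pass the result directly into the glimpse() function for a quick preview.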
Breaking Down the Code
Let’s break down the code blocks to understand what they’re doing:
Basic Usage
Load Libraries:
    library(healthyR.data)
    library(dplyr)
We load the healthyR.data package to access the get_cms_meta_data() function and the dplyr package for data manipulation.
Fetch Metadata:
    cms_data <- get_cms_meta_data()
We call get_cms_meta_data() without any parameters to get the default dataset metadata.
Preview Data:
    glimpse(cms_data)
The glimpse() function gives us a quick look at the structure and contents of the fetched metadata.
Custom Search
Custom Search Call:
    get_cms_meta_data(
      .keyword = "nation",
      .title = "Market Saturation & Utilization State-County"
    ) |>
      glimpse()
Here, we call get_cms_meta_data() with specific parameters for keyword and title to narrow down our search. The result is passed to glimpse() using the pipe operator for an immediate preview.
Conclusion
The get_cms_meta_data() function is a flexible tool for accessing CMS metadata, and it can make your data analysis tasks more efficient. Whether you’re looking for specific datasets or just exploring the available metadata, this function has got you covered.
Try out get_cms_meta_data() in your next R project and explore the potential of CMS data with ease! Happy coding! 🚀