hash-1.99.x
hash-2.0.0 has since been released.
Earlier today, hash-1.99.x was released to CRAN. This is a stable release that adds more functions to an already full-featured hash implementation. It fixes some bugs, adds some features, and improves performance and stability. You can read about the hash package in my previous blog post, The hash package: hashes come to R. All of the changes came from users who wrote in and contributed thoughts, ideas and use cases. Keep the good ideas coming. Two of the major changes are summarized below.
Matthias Buch-Kromann of the Copenhagen Business School recommended the ability to access multiple keys in a single call, and even to access the same key multiple times. This was previously allowed using the [[ method, but was deprecated. By convention, the [[ method returns only one value. (You can read about the conventions of this and other R accessors in my previous blog post, R Accessors Explained.) This behavior has returned in hash-1.99.x through the values method and its optional keys argument:
h <- hash( c('a','b','c'), 1:3 )
values(h)
values(h, keys=c('a','b','c','a','b','c' ) )
Matthias suggested calling the method mget, but there was some disparity with the mget function in base. The generic function that I needed just wouldn't play nice with base::mget.
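For comparison, base R's mget already performs this kind of multi-key lookup on a plain environment, which is why the name was tempting. The sketch below uses only base R and not the hash package; the environment e and its contents are made up for illustration.

```r
# A plain environment standing in for a hash (illustrative data only).
e <- list2env(list(a = 1L, b = 2L, c = 3L))

# base::mget takes a character vector of names and returns a named list;
# a name that appears twice is looked up twice.
res <- mget(c("a", "b", "a"), envir = e)

# Missing names raise an error unless ifnotfound is supplied.
miss <- mget(c("a", "z"), envir = e, ifnotfound = NA)
```

Note that mget returns a list, whereas the values call shown earlier is a method on the hash itself, so the two are related in spirit only.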
Another change in the behavior was prompted by Mohammad Fahim of the Department of Computer Engineering and Computer Science at the University of Louisville. He wrote me to ask if there is a way to suppress warnings when trying to access non-existent keys. When accessing hashes hundreds of thousands of times, it becomes a drag to continually see:
key: xxxx not found in the hash : hash_table_name
I have refactored the behavior to be more R-like by following na.action-type conventions. Now the default behavior is to return NA when trying to access non-existing keys.
> library(hash)
> h <- hash( c('a','b','c'), 1:3 )
> h[ letters[1:5] ]
<hash> containing 5 key-value pair(s).
a : 1
b : 2
c : 3
d : NA
e : NA
The behavior is also controllable through the na.hash.action option. Functions are provided for most use cases:
na.default.hash (default): returns NA silently
na.fail.hash (old default): errors on non-existing keys
na.warn.hash: returns NA but issues a warning
Behaviors can be set by setting the na.hash.action option. For example, to restore the old default behavior:
> options( na.hash.action = na.fail.hash )
> h$d
Error: key, d, not found in hash.
> h[[ 'd' ]]
Error: key, d, not found in hash.
And, for the [ and [[ methods, this behavior can be declared at access time:
> h[[ 'd', na.action=na.warn.hash ]]
Warning: key, d, not found in hash.
d
NA
> h[[ 'd', na.action=na.fail.hash ]]
Error: key, d, not found in hash.
> h[[ 'd', na.action=na.default.hash ]]
d
NA
If you don’t like these hash-key-miss behaviors, you are free to write your own. Functions should minimally accept arguments of the hash and the key.
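As a rough illustration of what such a function might look like, here is a minimal sketch in base R. Everything in it is an assumption made for the example: the handler name na.zero.hash, the exact (hash, key) signature, and the stand-in lookup function, which only mimics the dispatch on a plain environment and is not the hash package's own code.

```r
# Hypothetical miss handler: receives the hash and the missing key
# and returns a fallback value instead of NA.
na.zero.hash <- function(hash, key) {
  0L
}

# Stand-in lookup over a plain environment (NOT the hash package itself).
# It calls na.action(hash, key) whenever the key is absent, and falls back
# to the na.hash.action option, then to a silent NA, when none is given.
lookup <- function(hash, key,
                   na.action = getOption("na.hash.action",
                                         function(hash, key) NA)) {
  if (exists(key, envir = hash, inherits = FALSE)) {
    get(key, envir = hash, inherits = FALSE)
  } else {
    na.action(hash, key)
  }
}

h <- list2env(list(a = 1L, b = 2L, c = 3L))
lookup(h, "a")                            # existing key
lookup(h, "d", na.action = na.zero.hash)  # miss handled by the custom function
```

Passing the handler per call mirrors the access-time form shown above, while the getOption fallback mirrors setting the behavior globally.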
Thanks to both Matthias and Mohammad for your feedback.
New features are on their way, notably the ability to use any object as a key and to preserve the order of the hash. Hashes that preserve order are sometimes called Indexed Hashes. Look for these in the hash-2.00.x release. If you would like to see features added, contact me at cbrown -at- opendatagroup.com