Overview
This walkthrough of the fundamentals of shell programming with Z shell (Zsh) and Bourne Again SHell (BASH) includes a comparison of similar components and features in R and RStudio. An alternate perspective from R is provided for you to leverage while learning the fundamentals of shell programming.
It is important to be aware of the similarities and differences between Zsh and BASH when working with shell programming, particularly considering that Zsh is the default shell for Mac systems as of macOS Catalina, while BASH is the default shell of most distributions of Linux operating systems (OS). BASH is also included in the infrastructure of many remote servers.
Questions
- What are the BASH and Zsh programming languages?
- How do I write code in a command language?
- What are the components and features of BASH and Zsh?
- How can I write and run BASH and Zsh scripts?
Objectives
- Become comfortable with working in the terminal
- Learn about the syntax and common features of the BASH and Zsh programming languages
- Extend concepts of R programming to learn about complementary components used in the BASH and Zsh programming languages
- Practice writing BASH, Zsh, and R code to perform basic operations
- Discover important similarities and differences between R, BASH, and Zsh programming
Related
Make sure to check out my post on R Fundamentals – From Syntax to Control Structures for an overview of related R programming topics!
Also see my post on Programming Fundamentals – Pseudocode, Code, and Algorithms to learn more about the basics of programming and coding.
Components of a Computer
A computer is composed of many parts, and several different hardware and software components are needed to make a computer work.
Operating System
We should become familiar with the following software components of a computer operating system (OS) before learning more about shell programming:
- shell – general name for any user space program with a user interface (UI) that allows access to resources (data and devices) in the computer system
- terminal – allows a user to work with a shell interactively, using a keyboard to provide input and a display (monitor or screen) to see the output
- command prompt (command line) – enables a human operator (user) to interface with a shell that is running in the terminal
- ‘tree’ – system directories organized in a tree-like format that may be graphically displayed using the tree command
- kernel – allocates a private segment of memory and provides limited access to resources such as files and devices
Hardware
The primary computer hardware components that we need to be aware of when working in the shell include:
- memory – provides short-term storage for the state of your system in a working space, which is cleared (deleted) when you turn off the computer
- disk – provides long-term storage of saved data in the hardware drives of the computer
- CPU – carries out the set of instructions contained in a program and performs any specified operations
The Terminal
The terminal of the computer OS allows a user to interactively work with the kernel via the pseudo teletype (TTY), which reads and writes data from the port to which a terminal is connected.
The standard input (stdin) and standard output (stdout) data streams are used by the TTY to supply input processes and report outputs. After receiving input from the terminal, the shell interprets this text to perform the specified processes.
Alternate RStudio Terminal
RStudio is a very useful software program that allows you to work with the R programming language using a convenient user interface (UI). The interface for the RStudio integrated development environment (IDE) has four components: source, console, environment/history, and files/plots/packages/help.
The RStudio source component provides a convenient location to write and edit shell and R scripts, and other files.
Notice that there is a terminal tab in the RStudio window of the console component. This allows you to run shell commands through the default terminal of your computer OS in the RStudio user interface.
Windows Terminal Option
The Ubuntu terminal for Windows has many of the same features you’ll find using the terminal for Linux (e.g., Ubuntu) or Unix operating systems (OS). Linux is available in a large number of distributions, while most macOS versions are certified Unix.
As a first step the Windows Subsystem for Linux (WSL) will need to be installed for your version of Windows by following the appropriate Windows 10 or Windows 11 tutorial. After installing the WSL, you will be able to install the Ubuntu app from the Microsoft store on your computer.
Note that to install the Ubuntu app you will need an x86 PC running Windows (version 10+).
The Shell
Remember that in an OS the shell is the general name for any user space program with a UI that allows access to resources in the computer through the terminal. In other words, the shell is the computer program that interprets and runs (executes) the commands given by the user.
The execution of commands is achieved by the shell command interpreter, which reads commands from the standard input (stdin) and converts the commands into a kernel-understandable language.
What is Zsh?
The Z shell (Zsh) is a command language and a shell designed for interactive use. In addition to several original features, many of the useful features of BASH, Ksh, and TCsh have been incorporated into Zsh.
Note that Zsh is the default shell for Mac systems as of macOS Catalina (version 10.15), and some recent Linux distributions.
What is BASH?
The Bourne Again SHell (BASH) is the GNU Project’s interactive shell that includes a powerful user command language that is sh-compatible. It also incorporates useful features from the Korn shell (Ksh) and the C shell (Csh).
BASH is an important tool to know since it is the default shell in most distributions of Linux operating systems (OS), and is included in the infrastructure of many remote servers.
Zsh vs BASH
There are a few key differences between Zsh and BASH that we will encounter while learning about the basics of shell programming:
- Zsh arrays are indexed starting with 1 and BASH arrays are indexed starting with 0
- Hash data structures (associative arrays) are supported in Zsh, but are only available in BASH from version 4 onward
- = expansion of commands (for example, =ls expands to the full path of the ls command) is allowed by Zsh, but not in BASH
Simply enter the following command to check what shell is currently running in your terminal, such as BASH, Zsh, Csh, or DASH.
Shell Code
echo $0
It is possible to change your shell in the terminal between Zsh and BASH using the following commands.
Zsh Code
bash
BASH Code
zsh
Shell Scripting
A script is a text file that contains code written in a programming language that is able to be interpreted at runtime. Each language has a different file extension for scripts:
- .zsh or .sh for Zsh
- .sh for BASH
- .r or .R for R
An advantage of saving your Zsh scripts with the .zsh extension is the ability to quickly distinguish or look up these files, which can be helpful given the significant differences between BASH and Zsh.
Interpreter Directives
Note that a Shebang interpreter directive needs to be placed as the first line of a shell script to allow the OS to interpret the shell script using an alternative shell to the default. The following are example interpreter directives for Zsh and BASH.
Zsh Scripts
#!/bin/zsh
BASH Scripts
#!/bin/bash
Before adding the shebang to the top of your script files, you can check the path of your Zsh or BASH executable as follows.
Zsh Check
which zsh
BASH Check
which bash
Running Shell Scripts
A simple way to run a BASH script or Zsh script in the command line of the terminal is to use the following commands.
Zsh Scripts
zsh my_zsh_script.zsh
BASH Scripts
bash my_bash_script.sh
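The two ways of running a script can be sketched as follows. This is a minimal example written for BASH, assuming that bash is installed at /bin/bash and that files in the temporary directory are allowed to execute. The work_dir and script_path names are made up for illustration, and the temporary directory is only there so the sketch cleans up after itself.

```bash
# Create a scratch directory and a small script file inside it.
work_dir=$(mktemp -d)
script_path="$work_dir/my_bash_script.sh"

# Write a two-line script: the shebang, then one command.
printf '%s\n' '#!/bin/bash' 'echo "hello from a script"' > "$script_path"

# Option 1: pass the script to the interpreter (the shebang is not needed here).
output_via_bash=$(bash "$script_path")
echo "$output_via_bash"

# Option 2: mark the script executable and run it by its path.
# The OS reads the shebang to decide which shell interprets the file.
chmod +x "$script_path"
output_via_path=$("$script_path")
echo "$output_via_path"

# Remove the scratch directory.
rm -r "$work_dir"
```

Both options should print the same line. The second form is the one that relies on the interpreter directive at the top of the file.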
Shell vs R Programming Languages
The syntax of a programming language defines the meaning of specific combinations of words and symbols. This is why we call programming coding. Each programming language uses different combinations of words and symbols to get the computer to follow the instructions specified in your code.
Variables & Data Types
Similar to R, in the BASH and Zsh programming languages a combination of letters and symbols is used to give names (variables) to the data you are actively using in the memory of your computer system. However, in contrast to many programming languages, you do not have to declare (set) the data type of variables in BASH or Zsh. That is, shell variables are untyped and are, in essence, character strings.
We use the = operator in the shell to initialize a variable and assign it a value. This means that the variable is a name tag that points to a specific piece of data in the memory of the computer system. In contrast, the <- assignment operator is typically used to assign values to variables in the R programming language.
Key Points
There are some common conventions for how to format and initialize variables in the shell:
- exported variable names should be upper case
- do not use spaces in the naming or initialization (assignment) of a variable
- variable names can have letters, numbers, or underscores
Next we will practice assigning a value to a variable in the shell and R, followed by a discussion of why the above conventions are important.
Exercise
As a first step, let’s take a look at a simple example of assigning a value to a variable in the shell (BASH or Zsh) and in R.
Shell Code
ex_value=24
R Code
ex_value <- 24
It can be convenient to run R code in the command line of the terminal using either Zsh or BASH by entering the following command.
Shell Code
R
This will start the R program in the command line of the terminal at your current directory. It is also possible to open R by clicking the program icon in your computer user interface (UI). Remember that RStudio is an integrated development environment (IDE) that makes working with R simple, but it is not necessary for writing or running R code and scripts.
Note that to exit an R program in the command line of the terminal, you will need to use the q() command like so.
R Code
q()
Discussion
What happens when you enter the following examples of shell code in the command line?
Shell Code
8
Shell Output
zsh: command not found: 8
And what happens when you enter this piece of shell code in the command line?
Shell Code
test_value = 8
Shell Output
zsh: command not found: test_value
Next, what happens when you enter this piece of shell code in the command line?
Shell Code
# this is a comment
Zsh Output
zsh: command not found: #
BASH Output
No output
Notice that in the above examples any characters up to the first whitespace are parsed by the shell command interpreter as a command name, which is why the error message reads “command not found” followed by any text preceding the first whitespace.
Note that in the last example, however, the pound or hashtag symbol # is ignored in BASH and parsed as a command in Zsh. This is because interactive BASH treats # as the start of a comment by default, while interactive Zsh does so only when the INTERACTIVE_COMMENTS option is set (for example, with setopt interactivecomments). In BASH and Zsh scripts, lines beginning with this symbol are always interpreted as a comment and ignored.
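To contrast with the failing examples above, here is a small sketch of what the working forms might look like when saved in a BASH or Zsh script. The test_phrase variable is a made-up name added for illustration.

```bash
# Lines that start with # are comments in BASH and Zsh scripts.
test_value=8              # no spaces around = so this is an assignment
echo "$test_value"        # prints: 8

# Values that contain spaces need quotes. Without them the shell would
# treat the second word as a command name.
test_phrase="eight is great"
echo "$test_phrase"       # prints: eight is great
```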
Shell Commands
Remember that the shell interprets the commands given by a user through the terminal with a command language, such as BASH or Zsh. In its most simple form a shell command consists of the command name followed by its arguments, all of which are separated by spaces.
Note that in general commands are the strings (names) used to run a specific function, and the function contains code to perform a particular task.
Printing data to the screen (display) is one of the fundamental functions of any programming language.
Basic Syntax
The most common command to print outputs in BASH and Zsh is named echo. After searching the internet for “echo manual bash”, we can see from the man page (manual) that this command has the following syntax. We specifically include “bash” in the search since this documentation is typically easy to find.
We can use the print command in R to display data to the screen. Use the ? operator in R or RStudio to check the help documentation for the print() function, which is called (run) by the print command.
echo [options]... [String]...
print(x, …)
Note that we can use the ? operator in RStudio to check the print function documentation as follows.
R Code
?print
Key Points
From the above definition, we see that the syntax for calling (running) a function in BASH has the following features:
- function name
- white space
- options (arguments)
Exercise
Now let’s compare the shell echo command in BASH or Zsh with the R print command for printing data to the screen.
Shell Code
echo "cool cool cool"
Shell Output
cool cool cool
R Code
print("cool cool cool")
R Output
[1] "cool cool cool"
Notice how the R print function displayed the string cool cool cool along with the index [1] and surrounding quotation marks. This contrasts with the clean output from the shell echo function that includes only the string.
Achieving Clean R Output
You can use the cat function in R to achieve a similar output as the BASH echo command as follows.
Basic Syntax
The cat function has the following basic syntax in R. Notice that there are several arguments, but we are only going to use the first for the simple printing of strings.
cat(… , file = "", sep = " ", fill = FALSE, labels = NULL, append = FALSE)
Exercise
Here is a quick example of using the cat command to print the string cool cool cool to the screen in R.
R Code
cat("cool cool cool")
R Output
cool cool cool
Display Shortcuts in R
It is possible to display the data stored in an R object by simply entering the name of the object into the RStudio console, or by running a line of code with the name of the object as follows.
R Code
my_message <- "coding is fun!"
my_message
R Output
[1] "coding is fun!"
You can also display the value assigned to an R object by encompassing the line of R code with the ( and ) symbols, like in the following example.
R Code
(my_message <- "coding is fun!")
R Output
[1] "coding is fun!"
Accessing Data
In the R language the = operator can also be used to assign a value to a variable at the top level of your code, although the <- operator is the conventional choice. Inside a function call, = instead matches a value to a named argument. In the shell, = is the only assignment operator.
The way we access a stored value also differs. To use a variable in R we simply need to call it by its name. However, for shell variables we need to prepend the $ operator to the name of the variable that we have initialized. Before the shell interprets (runs) each line of code entered in the command line or shell script, it first checks to see if any variable names are present by looking for the $ operator.
Exercise
We can access data assigned to a variable and print that data to the screen in the shell using BASH or Zsh, and compared with R as follows.
Shell Code
ex_num=1
echo $ex_num
R Code
ex_num <- 1
cat(ex_num)
All Output
1
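One detail that often comes up when accessing shell variables is where the variable name ends. The following is a small sketch that should behave the same in BASH and Zsh. The ex_label, ex_joined, and ex_unbraced names are made up for illustration.

```bash
ex_num=1
ex_label="item"

# Plain $name works when the name is followed by a space or the end of the text.
echo "$ex_label $ex_num"        # prints: item 1

# Curly braces mark where the variable name ends, so text can follow directly.
ex_joined="${ex_label}s_${ex_num}"
echo "$ex_joined"               # prints: items_1

# Without braces the shell looks for a variable named ex_labels_, which is unset,
# so that part expands to an empty string.
ex_unbraced="$ex_labels_$ex_num"
echo "$ex_unbraced"             # prints: 1
```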
Line Endings in the Shell
It can be convenient to use the semicolon symbol ; to specify the end of a line of Zsh, BASH, or R code in the terminal or RStudio. You can also use the semicolon symbol as a separator between commands in scripts.
So the semicolon allows you to combine short pieces of code into one line and improve the readability of your code, as in the following example.
Shell Code
my_hello="hello there!"; echo $my_hello
Shell Output
hello there!
Discussion
What happens if you enter the following example of shell code into the command line of the terminal using BASH or Zsh?
Shell Code
test_message="test test... is this thing on?"; echo $test_message
Shell Output
test test... is this thing on?
And what happens if you enter the following shell code into the command line of the terminal?
Shell Code
test_message="test test... is this thing on?"
echo $test_message
Shell Output
test test... is this thing on?
In this last case, what happens when you open a new terminal or tab (environment) and enter the following code?
Shell Code
echo $test_message
Shell Output
No output
The above examples of shell code highlight the accessibility and scope of the data for variables created in the shell at different points while programming.
We can see that the value of a variable created in one tab or terminal instance is not available to another. This is because each terminal instance runs its own shell process, and shell variables are stored in the memory of that process rather than being shared between processes.
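A related case is a child process started from the current shell. The sketch below assumes bash is available on your PATH and uses bash -c to start a child shell. The upper case TEST_MESSAGE name follows the convention for exported variables mentioned earlier, and the child_sees_* names are made up for illustration.

```bash
# A plain shell variable is visible only in the current shell process.
test_message="is this thing on?"
child_sees_plain=$(bash -c 'echo "$test_message"')
echo "child without export: [$child_sees_plain]"
# prints: child without export: []

# export copies the variable into the environment passed to child processes.
export TEST_MESSAGE="is this thing on?"
child_sees_exported=$(bash -c 'echo "$TEST_MESSAGE"')
echo "child with export: [$child_sees_exported]"
# prints: child with export: [is this thing on?]
```

Note that a new terminal tab is not a child of your current shell, so even an exported variable will not appear there.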
Arithmetic Operations
Since variables in the shell are essentially character strings, how can we perform arithmetic operations? We can use functions in the shell to give context and perform arithmetic operations and comparisons on variables.
Operators
The let function in the BASH and Zsh programming languages allows you to perform arithmetic operations using the following operator symbols:
- addition +
- subtraction -
- multiplication *
- division /
- modulus (remainder) %
Recall that in the R programming language we have access to the following arithmetic operators:
- addition +
- subtraction -
- multiplication *
- division /
- exponentiation ^ or **
Basic Syntax
As a first step to learning how to perform arithmetic in the shell, let’s start by checking out the documentation for the let function.
let expression [expression]
Note that a specific command is not necessary in order to perform arithmetic operations in R.
Exercise
Now let’s try an example comparison of arithmetic operations in the shell using BASH or Zsh, and compared with R.
Shell Code
ex_num1=5
ex_num2=10
let "ex_result=$ex_num1 + $ex_num2"
echo $ex_result
R Code
ex_num_1 <- 5
ex_num_2 <- 10
ex_result <- ex_num_1 + ex_num_2
cat(ex_result)
All Output
15
Arithmetic Expansion in the Shell
The $(( )) syntax allows you to easily evaluate math expressions in BASH and Zsh using arithmetic expansion.
By adjusting the code from the previous example, we can achieve the same result as follows.
Shell Code
my_num1=5
my_num2=10
my_result=$(($my_num1 + $my_num2))
echo $my_result
Shell Output
15
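The same arithmetic expansion works with the other operators. The following sketch uses made-up values and variable names, and it should behave the same in BASH and Zsh since both operands are integers.

```bash
my_num1=17
my_num2=5

# Inside $(( )) the $ prefix on variable names is optional.
my_sum=$((my_num1 + my_num2))         # 22
my_difference=$((my_num1 - my_num2))  # 12
my_product=$((my_num1 * my_num2))     # 85
my_quotient=$((my_num1 / my_num2))    # 3, integer division drops the remainder
my_remainder=$((my_num1 % my_num2))   # 2

echo "$my_sum $my_difference $my_product $my_quotient $my_remainder"
# prints: 22 12 85 3 2
```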
Indexed Variables & Arrays
In the R programming language vectors, matrices, and data frames are indexed variables that contain one-dimensional (1D) and two-dimensional (2D) named collections of data. In the shell we can use arrays to create similar 1D and 2D collections of data.
There are two types of arrays in the BASH and Zsh languages:
- indexed arrays – are ordered lists of items in which the keys (indexes) are integer numbers
- associative arrays (hash tables) – are arrays in which the keys (indexes) are represented by arbitrary strings, rather than integers (in BASH these require version 4+)
Indexed Arrays
Remember that indexed arrays are ordered lists of items in which the keys (indexes) are integer numbers.
Basic Syntax
In R we can use the c function to quickly create a 1D list or vector, and in the shell we can use the declare command to create an array.
declare [-aAfFgiIlnrtux] [-p] [name[=value]]
c(…)
Note that it is the default of the c function to create a vector R object, rather than a list.
Exercise
In the following example we will make a 1D array or vector with the strings chicken, robin, and sparrow in the shell using BASH or Zsh, and compare it with R.
Zsh Code
declare -a ex_birds
ex_birds[1]="chicken"
ex_birds[2]="robin"
ex_birds[3]="sparrow"
echo $ex_birds
Zsh Output
chicken robin sparrow
BASH Code
declare -a ex_birds
ex_birds[0]="chicken"
ex_birds[1]="robin"
ex_birds[2]="sparrow"
echo $ex_birds
BASH Output
chicken
R Code
ex_birds <- c("chicken", "robin", "sparrow")
cat(ex_birds)
R Output
chicken robin sparrow
Notice that the way we can refer to (access) array data in Zsh is different from accessing array data in BASH.
Recall that we need to index arrays beginning with 0 in BASH and 1 in Zsh. Additionally, in BASH a plain reference such as $ex_birds expands only to the first position (element) of the array, so the echo command prints only chicken.
Discussion
What happens if you run the following BASH and Zsh code for creating an indexed array using the declare command?
Zsh Code
declare -a test_song
test_song[2]="to"
test_song[3]="you"
echo $test_song
Zsh Output
to you
BASH Code
declare -a test_song
test_song[1]="to"
test_song[2]="you"
echo $test_song
BASH Output
No output
Notice that we intentionally did not assign (store) a value in the starting index position of Zsh (starts with 1) or BASH (starts with 0). We can see that Zsh is still able to print the contents of an indexed array, while BASH does not print anything to the screen because $test_song refers only to the element at index 0, which is empty.
Now what happens when you enter the following shell code into Zsh and BASH?
Zsh Code
test_song[1]="happy birthday"
echo $test_song
BASH Code
test_song[0]="happy birthday"
echo ${test_song[*]}
Shell Output
happy birthday to you
As we have now seen, indexed arrays in BASH begin at the 0 index, and a plain reference such as $test_song expands only to the element stored there. This is in contrast to Zsh arrays, which begin at the 1 index and expand to all of their elements.
We also see that it is necessary to use the wildcard operator * in between the [ and ] square bracket characters to have the BASH interpreter display a similar output as Zsh and R. Furthermore, the array name and index specification need to be enclosed in { and } curly brace characters.
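Two related forms are worth knowing alongside the wildcard: the @ subscript and the # length prefix. The sketch below uses the shorthand array syntax and should run in both BASH and Zsh. The song_length, word_count, song_part, and song_line names are made up for illustration.

```bash
test_song=("happy birthday" "to" "you")

# ${#name[@]} gives the number of elements in the array.
song_length=${#test_song[@]}
echo "$song_length"                  # prints: 3

# "${name[@]}" expands to one word per element, so "happy birthday" stays together.
word_count=0
for song_part in "${test_song[@]}"; do
  word_count=$((word_count + 1))
done
echo "$word_count"                   # prints: 3

# "${name[*]}" joins all elements into a single string separated by spaces.
song_line="${test_song[*]}"
echo "$song_line"                    # prints: happy birthday to you
```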
Shortcut Indexed Arrays
We can easily create indexed 1D arrays using shorthand in BASH and Zsh without using an explicit function call (command), similar to the colon operator : in R.
Basic Syntax
The following basic syntax can be used to create quick 1D arrays in both the shell and R.
name=(value1 value2 … )
from:to
Exercise
As the next pieces of code show, this shorthand form of creating arrays can be very convenient compared with the declare approach in the previous exercise.
Zsh Code
ex_nums=(5 6 7 8)
echo $ex_nums
BASH Code
ex_nums=(5 6 7 8)
echo ${ex_nums[*]}
R Code
ex_nums <- 5:8
cat(ex_nums)
All Output
5 6 7 8
Associative Arrays
Recall that associative arrays are arrays in which the keys (indexes) are represented by arbitrary strings, rather than integers.
Basic Syntax
The following syntax is used to create an associative array in BASH (version 4+) and Zsh with the declare command and the -A flag, compared with the syntax in R for the list function.
declare [-aAfFgiIlnrtux] [-p] [name[=value]]
list(…)
Exercise
Let’s look at some example code for creating an associative array in the Zsh and BASH (version 4+) compared to the code for creating lists in R.
Zsh Code
declare -A ex_nature
ex_nature[bird]="cheep"
ex_nature[frog]="ribbit"
echo $ex_nature
BASH Code
declare -A ex_nature
ex_nature[bird]="cheep"
ex_nature[frog]="ribbit"
echo ${ex_nature[*]}
R Code
ex_nature <- list(bird = "cheep", frog = "ribbit")
ex_nature_vector <- unlist(ex_nature)
cat(ex_nature_vector)
All Output
cheep ribbit
Notice that we needed to use the unlist command in R to convert the associative array we created to a vector. This is because the cat command cannot handle list R objects created by the list command.
And again we see that it is necessary to use the wildcard operator * in between the [ and ] square brace characters to have the BASH interpreter display a similar output as Zsh and R. Furthermore, the array name and index specification need to be encompassed in { and } curly brace characters.
Combining R Code
It is possible to simplify the above R code and reduce the number of created R objects by nesting function calls as follows.
R Code
ex_nature <- list(bird = "cheep", frog = "ribbit")
cat(unlist(ex_nature))
R Output
cheep ribbit
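In the shell it can also be useful to print the keys of an associative array alongside the values. The following sketch is for BASH (version 4+) only, since the ${!name[@]} form used to list keys is BASH syntax. The ex_key and ex_pairs names are made up for illustration.

```bash
# BASH 4+ only: ${!name[@]} lists the keys of an associative array.
declare -A ex_nature
ex_nature[bird]="cheep"
ex_nature[frog]="ribbit"

# Key order is not guaranteed, so the lines are sorted for a stable result.
ex_pairs=$(
  for ex_key in "${!ex_nature[@]}"; do
    echo "$ex_key says ${ex_nature[$ex_key]}"
  done | sort
)
echo "$ex_pairs"
# prints:
# bird says cheep
# frog says ribbit
```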
Accessing Array Elements by Index & Name
In the shell we were able to store values in an associative array by specifying the numeric index or name of the key (ID) and assigning it a value. Similarly, we can access individual pieces of array data (elements) by referencing the numeric index or key ID of the particular element.
Exercise
Here is an example of creating, accessing, and printing elements of a Zsh indexed array in comparison with BASH, followed by an example in R.
Zsh Code
ex_fruit=("pineapple" "pear" "kiwi")
echo $ex_fruit[3]
echo $ex_fruit[1]
BASH Code
ex_fruit=("pineapple" "pear" "kiwi")
echo ${ex_fruit[2]}
echo ${ex_fruit[0]}
R Code
ex_fruit <- c("pineapple", "pear", "kiwi")
cat(ex_fruit[3])
cat(ex_fruit[1])
All Output
kiwi pineapple
Remember that indexed arrays in BASH begin at the 0 index, in contrast to Zsh arrays that begin at the 1 index. This is why kiwi is at index 2 in BASH and index 3 in Zsh.
Note that in BASH the array name and index specification need to be enclosed in { and } curly brace characters, as in ${ex_fruit[2]}. Otherwise only $ex_fruit is expanded and the [2] is treated as literal text.
Exercise
Here is an example of creating, accessing, and printing elements of a Zsh associative array in comparison with BASH (version 4+), followed by an example in R.
Zsh Code
declare -A ex_colors
ex_colors[green]="yellow and blue"
ex_colors[orange]="yellow and red"
echo $ex_colors[orange]
BASH Code
declare -A ex_colors
ex_colors[green]="yellow and blue"
ex_colors[orange]="yellow and red"
echo ${ex_colors[orange]}
R Code
ex_colors <- list(green = "yellow and blue", orange = "yellow and red")
cat(ex_colors$orange)
cat(ex_colors$green)
All Output
yellow and red yellow and blue
Note that the cat function in R can be used to display the contents (elements) of a list object as long as we specify the name of the associative array position. This is because it is not possible to print all the elements of a list R object with the cat command.
Discussion
What happens when you enter the following shell code into the command line with Zsh or BASH (version 4+)?
Zsh Code
declare -A test_words
test_words[one]="lonely"
test_words[two]="tango"
echo $test_words[1]
BASH Code
declare -A test_words
test_words[one]="lonely"
test_words[two]="tango"
echo ${test_words[1]}
Shell Output
No output
And what happens when you enter the following example of R code in the RStudio console or R program?
R Code
test_words <- list(one = "lonely", two = "tango")
cat(unlist(test_words[1]))
R Output
lonely
Note that it is not possible to access (refer to) data stored in an associative array in the shell using numeric indexes. However, it is possible to use numbered indexes to refer to values stored in lists in R.
Advanced Coding Challenge
Note that it is not possible to create true multi-dimensional arrays, such as 2D arrays, in BASH or Zsh. But it is possible to simulate a multi-dimensional collection of data using associative arrays, for example.
Try creating your own 2D array in the BASH or Zsh command languages!
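If you would like a starting point, here is one possible sketch for BASH (version 4+). It is only one of several approaches, and it assumes a key format of "row,col" that is simply a naming choice rather than built-in syntax. The ex_grid, ex_rows, ex_row, ex_col, and ex_line names are made up for illustration.

```bash
# BASH 4+ sketch: simulate a 2 x 2 grid with string keys of the form "row,col".
declare -A ex_grid
ex_grid[0,0]="a"
ex_grid[0,1]="b"
ex_grid[1,0]="c"
ex_grid[1,1]="d"

# Build each row by looping over the column index, then store and print it.
ex_rows=()
for ex_row in 0 1; do
  ex_line=""
  for ex_col in 0 1; do
    ex_line+="${ex_grid[$ex_row,$ex_col]} "
  done
  ex_rows+=("${ex_line% }")   # drop the trailing space
  echo "${ex_line% }"
done
# prints:
# a b
# c d
```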
Programming with Logic
A fundamental concept of computer programming, Boolean logic is the mathematical logic underlying Boolean algebra. In Boolean algebra expressions are evaluated to one of two values: TRUE or FALSE. Since an expression may only take on one of two values, Boolean logic is considered two valued logic.
Control Statements
We can combine Boolean expressions with control statements to specify how programs will complete a task. Control statements allow you to have flexible outcomes by selecting which pieces of code are executed, or not.
The three primary types of control statements are:
- Sequential statements – lines of code executed in the default ordering
- Iterative statements – control the number of times a block of code (set of lines) is executed
- Conditional statements (selection statements) – control which blocks of code are executed, and which are not
Sequential Statements
The most common control structure is a set of sequential statements: lines of code written one after another and executed line by line.
Exercise
In this example, let’s review the simple lines of code needed to assign a variable a value, and then print that value to the screen.
Pseudocode
- Assign x the character value of “hello”
- Print the value of x
Shell Code
x="hello"
echo $x
R Code
x <- "hello"
cat(x)
All Output
hello
Iterative Statements & Looping Constructs
Iterative statements allow you to execute the same piece of code a specified number of times, or until a condition is reached. The most common looping constructs (iterative statements) are defined using either FOR or WHILE loops. These are compound constructs in the BASH and Zsh programming languages, each of which begins with a reserved word or control operator and is terminated by a corresponding reserved word or operator.
FOR Loops
FOR loops are a type of iterative statement that can be used to repeatedly execute a piece of code a certain number of times. The number of times a piece of code within a FOR loop is executed may typically be specified with a constant value or a variable.
Basic Syntax
for name [in words ...]; do commands; done
for (value in vector) { statements }
Key Points
The syntax in BASH and Zsh for a FOR loop can be tricky to use while coding, particularly when using the command line interface. There are a few necessary reserved words (keywords) we need to use to construct a FOR loop in the shell:
- for
- in
- do
- done
The pieces of code that you want to have executed multiple times need to be placed in between the do and done keywords. Furthermore, a semicolon symbol ; or newline needs to be used before the do and done keywords.
Keep in mind that multiple lines of code in the body of the FOR loop need to be separated by semicolons ; or newlines as well.
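As a small sketch of the separator rule described above, the loop below has two commands in its body, written first with newlines and then with semicolons. The names count and letter are only illustrative.

```shell
# Two commands in the loop body, separated by newlines.
count=0
for letter in a b c d; do
  count=$((count + 1))
  echo "$count: $letter"
done

# The same loop on a single line, with semicolons between the commands.
count=0
for letter in a b c d; do count=$((count + 1)); echo "$count: $letter"; done
```

Either form runs the same way in BASH and Zsh; the choice is mostly a matter of readability.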
Exercise
Let’s practice writing FOR loops in the BASH and Zsh programming languages using the following example of printing each value contained in a sequence (set).
Pseudocode
- For each value in the sequence a, b, c, d
- Assign x the current value
- print the value of x
Shell Code
for i in {a..d}; do echo $i; done
R Code
for (x in letters[1:4]) {
  cat(x, "\n", sep = "")
}
All Output
a
b
c
d
Notice that we needed to pass the newline character "\n" to the cat command as a second value in order to print each value in the sequence on its own line, which allows us to achieve the same output as in BASH and Zsh. The cat command does not add a newline on its own, and the sep (separator) argument is only placed between the values given to cat, so we set sep to the empty string "" to avoid printing an extra space before each newline.
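The number of iterations of a FOR loop in the shell can also come from a variable. One caveat: BASH performs brace expansion before variable expansion, so {1..$n} is not turned into a sequence there, while Zsh does expand it. The sketch below sidesteps the difference with the seq command, assuming it is installed (it is on most Linux and macOS systems). The names n and total are hypothetical.

```shell
n=3        # number of iterations, held in a variable
total=0

# seq prints the numbers 1 through n; the loop visits each one in turn.
for i in $(seq 1 "$n"); do
  total=$((total + i))
  echo "iteration $i"
done

echo "sum of 1..$n is $total"
```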
Alternate Syntax in the Shell
We have seen that multiple lines of code in the body of the FOR loop need to be separated by semicolons ; or newlines. This means that we can arrange the lines of code associated with a FOR loop in different ways.
The following is a one line example, which is my preference while coding in the terminal and with small control structures since it is the quickest and easiest format.
Shell Code
for i in {1..3}; do echo $i; done
Next is a multi-line example that does not use any semicolons to combine lines.
Shell Code
for i in {1..3}
do
echo $i
done
Here is a mixed example, which is my preference while scripting or using large control structures since it makes the body of each statement clearly stand out while using fewer lines of code.
Shell Code
for i in {1..3}; do
  echo $i
done
Shell Input Prompt
The terminal will keep prompting for input with the greater-than symbol > as you enter the lines of code for an iterative or conditional statement, such as the FOR loop examples above, until the statement is complete. If you want to make the terminal stop asking for input, you can press Ctrl-c to cancel the unfinished statement. The related key combinations and commands for controlling a running process are:
- Ctrl-c sends the interrupt/terminate signal SIGINT to the current process running in the foreground
- Ctrl-z sends the pause signal SIGTSTP to the current process running in the foreground and returns the user to the shell prompt
- fg is the foreground command that can be used to resume a paused process; fg on its own resumes the most recent job, and fg %1 resumes job number 1 (use the jobs command to list job numbers)
WHILE Loops
WHILE loops are another type of looping construct that can be used as a control structure in your code. This type of iterative statement will continue to execute a piece of code until a condition is reached.
Basic Syntax
while test-commands; do consequent-commands; done
while (test_expression) { statement }
Key Points
The syntax of the WHILE loop in BASH is similar to the FOR loop, but with a few important differences in reserved words (keywords) and operators:
- while
- [
- white space
Of course the while keyword is used instead of for here. Also, the white space following the opening [ and preceding the closing ] is necessary for the proper interpretation of the WHILE loop. You will receive an error message, such as “bad pattern” in Zsh, if the white spaces are excluded.
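The white space matters because [ is itself a command (a synonym for test) and ] is its final argument, so both need to be separate words. The sketch below illustrates this by capturing exit statuses; the status_* variable names are only for illustration.

```shell
x=3

# [ is a command, so it needs spaces around it like any other command name.
[ "$x" -gt 0 ]
status_brackets=$?     # 0 means the test succeeded (TRUE)

# test is the same command, written without the closing bracket.
test "$x" -gt 0
status_test=$?

# A failing test returns a non-zero exit status (FALSE).
status_false=0
[ "$x" -gt 5 ] || status_false=$?

echo "$status_brackets $status_test $status_false"
```

The WHILE loop keeps running for as long as its test command returns an exit status of 0.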
Exercise
The following is an example WHILE loop where we first assign a value to a variable, then we loop over (iterate) this variable and decrease the value of the variable by 1 (decrement) until the stopping condition is reached.
Pseudocode
- Assign x the value of 3
- While x is greater than 0
- print the value of x
- decrement the value of x by 1
Shell Code
x=3; while [ $x -gt 0 ]; do echo $x; x=$(($x-1)); done
R Code
x <- 3
while (x > 0) {
  cat(x, "\n", sep = "")
  x <- x - 1
}
All Output
3
2
1
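A WHILE loop can count upward just as easily, as long as something in the body changes the variable used in the test. The following sketch, with the illustrative names i and total, adds up the numbers 1 through 4 using the $(( )) arithmetic expansion.

```shell
i=1
total=0

# Keep looping while i is less than or equal to 4.
while [ "$i" -le 4 ]; do
  total=$((total + i))   # add the current value to the running total
  i=$((i + 1))           # increment, so the condition eventually fails
done

echo "total: $total"
```

Leaving out the increment would make the condition stay true forever, which is a common cause of a loop that never ends.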
Conditional Statements
The conditional constructs (conditional statements) in BASH and Zsh are used to conditionally execute commands depending on the specifications of the user.
IF… THEN Statements
The simplest form of conditional statement is the IF… THEN form. These are statements that have two parts: hypothesis (if) and conclusion (then). The execution of the conclusion of the statement is conditional upon the state of the hypothesis expression, which must evaluate to TRUE.
Basic Syntax
if test-commands; then consequent-commands; fi
if (boolean_expression) {
  # statement(s) will execute if the boolean expression is true
}
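In the shell, the test-commands part of the syntax can be any command: the consequent commands run when that command finishes with an exit status of 0, which plays the role of TRUE. The sketch below uses the true and false built-ins and command -v to show this; the variable names first, second, and third are hypothetical.

```shell
# true always exits with status 0, so the body runs.
if true; then
  first="ran"
fi

# false always exits with a non-zero status, so the body is skipped.
second="skipped"
if false; then
  second="ran"
fi

# Any command can serve as the test; command -v succeeds when ls is found.
if command -v ls > /dev/null; then
  third="ls is available"
fi

echo "$first / $second / $third"
```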
Exercise
We’ll next practice writing IF… THEN statements (constructs) in the shell and R with the following example where we first assign a value to a variable, then we check the value of the variable before printing the value or not.
Pseudocode
- Assign x the value of “a”
- If x is equal to “a”, then print the value of x
Zsh Code
x="a"; if [ $x "==" "a" ]; then echo $x; fi
BASH Code
x="a"; if [ $x == "a" ]; then echo $x; fi
R Code
x <- "a"
if (x == "a") {
  cat(x)
}
All Output
a
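Note how in Zsh we must enclose the equality operator == in quotes " " in order for the shell to compare the value of the variable with the string. The quotes are not necessary in BASH or R. This is because Zsh treats an unquoted word that begins with = as a request to expand a command name to its full path (for example, =ls expands to the path of the ls command), so an unquoted == is read as a lookup of a command named = and fails with a “= not found” error.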
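One way to avoid this difference is the single = operator, which is the POSIX string comparison for [ ] and runs unchanged in both BASH and Zsh. The following is a minimal sketch of the exercise written that way; the result variable is only there for illustration.

```shell
x="a"

# A single = compares strings inside [ ] and needs no quoting in either shell.
if [ "$x" = "a" ]; then
  result="match"
  echo "$x"
fi
```

The double-bracket form [[ $x == "a" ]] is another option that works in both BASH and Zsh, though it is not available in a plain POSIX sh.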
IF… THEN… ELSE Statements
The next type of conditional construct adds another level of complexity with the IF… THEN… ELSE format, which allows you to write shell code that has multiple alternative conclusions to a hypothesis expression.
Basic Syntax
if test-commands; then consequent-commands; [else alternate-consequents;] fi
if (boolean_expression) {
  # statement(s) will execute if the boolean expression is true
} else {
  # statement(s) will execute if the boolean expression is false
}
Exercise
In this next example we will use an IF… THEN… ELSE conditional construct (statement) to print the value of a variable if it is equal to a given string, else print a message to the screen.
Pseudocode
- Assign x the value of “b”
- If x is equal to “a”, then print the value of x
- Else print “x is not equal to the character ‘a’”
Zsh Code
x="b"
if [ $x "==" "a" ]; then
  echo $x
else
  echo "x is not equal to the character 'a'"
fi
BASH Code
x="b"
if [ $x == "a" ]; then
  echo $x
else
  echo "x is not equal to the character 'a'"
fi
R Code
x <- "b"
if (x == "a") {
  cat(x)
} else {
  cat("x is not equal to the character 'a'")
}
All Output
x is not equal to the character 'a'
Again we see that in Zsh we must enclose the equality operator == in quotes " " in order for the shell to compare the value of the variable with the string.
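The ELSE branch is also a convenient place to catch unexpected values. The sketch below assumes the variable may be empty, and double-quotes it so that the [ command still receives an argument to compare. The message variable is hypothetical.

```shell
x=""    # an empty value, for example from missing user input

# Quoting "$x" keeps the empty string as an argument to [ ... ].
if [ "$x" = "a" ]; then
  message="x is equal to 'a'"
else
  message="x is not equal to the character 'a'"
fi

echo "$message"
```

Without the quotes, BASH would drop the empty value, see only [ = "a" ], and report an error rather than taking the ELSE branch cleanly.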
Compound Statements
A more advanced type of conditional construct (statement) combines multiple IF… THEN… ELSE statements to make a compound statement with many alternative outcomes.
Basic Syntax
if test-commands; then consequent-commands; [elif more-test-commands; then more-consequents;] [else alternate-consequents;] fi
if (boolean_expression_1) {
  # executes when boolean expression 1 is true
} else if (boolean_expression_2) {
  # executes when boolean expression 2 is true
} else {
  # executes when none of the above conditions are true
}
Exercise
Here we will practice creating a compound conditional construct with multiple alternative outcomes, which are used to determine the final string to output to the screen.
Pseudocode
- Assign x the value of “c”
- If x is equal to “a”, then print “x is equal to ‘a’”
- Else if x is not equal to “c”, then print “x is not equal to ‘a’ or ‘c’”
- Else if x is equal to “c”, then print “x is equal to ‘c’”
- Else print “last chance!”
Zsh Code
x="c"
if [ $x "==" "a" ]; then
  echo "x is equal to 'a'"
elif [ $x "!=" "c" ]; then
  echo "x is not equal to 'a' or 'c'"
elif [ $x "==" "c" ]; then
  echo "x is equal to 'c'"
else
  echo "last chance!"
fi
BASH Code
x="c"
if [ $x == "a" ]; then
  echo "x is equal to 'a'"
elif [ $x != "c" ]; then
  echo "x is not equal to 'a' or 'c'"
elif [ $x == "c" ]; then
  echo "x is equal to 'c'"
else
  echo "last chance!"
fi
R Code
x <- 'c'
if (x == 'a') {
  cat("x is equal to 'a'")
} else if (x != 'c') {
  cat("x is not equal to 'a' or 'c'")
} else if (x == 'c') {
  cat("x is equal to 'c'")
} else {
  cat("last chance!")
}
All Output
x is equal to 'c'
Here again we see that in Zsh we must enclose the equality operator == in quotes " " in order for the shell to compare the value of the variable with the string.
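Only the first branch whose test succeeds is executed in a compound statement. In the exercise, the final else can never be reached, because any value is either equal to "c" or not. The sketch below wraps a shortened version of the chain in a hypothetical helper function named classify so that it can be tried with several values.

```shell
# classify is a hypothetical helper that wraps the if/elif/else chain.
classify() {
  if [ "$1" = "a" ]; then
    echo "x is equal to 'a'"
  elif [ "$1" != "c" ]; then
    echo "x is not equal to 'a' or 'c'"
  else
    echo "x is equal to 'c'"
  fi
}

# Try the chain with three different values.
for value in a b c; do
  classify "$value"
done
```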
Nested Statements
An even more advanced concept, nested IF… THEN… ELSE conditional constructs can increase the flexibility of your code by allowing you to specify more complex conditions.
Basic Syntax
if test-commands; then consequent-commands; if more-test-commands; then more-consequents; fi; fi
if (boolean_expression_1) {
  # executes when boolean expression 1 is true
  if (boolean_expression_2) {
    # executes when boolean expression 2 is true
  }
}
Exercise
In this last example we will create a nested IF… THEN… ELSE construct to determine which string to output after the code is run.
Pseudocode
- Assign x the value of 8
- If x is less than 1, then
- AND If x is equal to ‘c’, then print “x is less than 1 and equal to ‘c’”
- Else print “x is less than 1 and not equal to ‘c’”
- Else print “x is greater than 1”
Zsh Code
x=8
if [ $x -lt 1 ]; then
  if [ $x "==" "c" ]; then
    echo "x is less than 1 and equal to 'c'"
  else
    echo "x is less than 1 and not equal to 'c'"
  fi
else
  echo "x is greater than 1"
fi
BASH Code
x=8
if [ $x -lt 1 ]; then
  if [ $x == "c" ]; then
    echo "x is less than 1 and equal to 'c'"
  else
    echo "x is less than 1 and not equal to 'c'"
  fi
else
  echo "x is greater than 1"
fi
R Code
x <- 8
if (x < 1) {
  if (x == "c") {
    cat("x is less than 1 and equal to 'c'")
  } else {
    cat("x is less than 1 and not equal to 'c'")
  }
} else {
  cat("x is greater than 1")
}
All Output
x is greater than 1
Again we see that in Zsh we must enclose the equality operator == in quotes " " in order for the shell to compare the value of the variable with the string.
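In a nested construct the inner test is only evaluated when the outer test has already succeeded. The following sketch shows this with numeric comparisons only, using a hypothetical helper function named describe so that each path through the nesting can be exercised.

```shell
# describe is a hypothetical helper holding a nested conditional.
describe() {
  if [ "$1" -lt 1 ]; then
    # The inner test only runs when the outer test succeeded.
    if [ "$1" -eq 0 ]; then
      echo "$1 is less than 1 and equal to 0"
    else
      echo "$1 is less than 1 and not equal to 0"
    fi
  else
    echo "$1 is greater than or equal to 1"
  fi
}

describe 0
describe -5
describe 8
```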
Conclusion
We have seen that it is possible to use a variety of commands and evaluate mathematical expressions with Zsh and BASH in the terminal, similar to what we have done with the R programming language. We have also seen that writing code or scripts for the shell can be particularly tricky when working with operating systems that use different shells, such as BASH or Zsh.
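When a script may be run on systems with different shells, it can help to check which shell is interpreting it. The sketch below relies on the fact that BASH sets the BASH_VERSION variable and Zsh sets ZSH_VERSION; the shell_name variable is only for illustration.

```shell
# BASH sets BASH_VERSION and Zsh sets ZSH_VERSION; other shells set neither.
if [ -n "${ZSH_VERSION:-}" ]; then
  shell_name="zsh"
elif [ -n "${BASH_VERSION:-}" ]; then
  shell_name="bash"
else
  shell_name="other"
fi

echo "running in: $shell_name"
```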
Key Points & Tips
- BASH, Zsh, and R share a lot of the same basic functionalities
- There are important differences between the BASH and Zsh command languages
- Use the -h flag to examine the description of some shell commands
- Use the ? operator to examine the description of R commands in installed packages
- Search the internet for further information about shell commands or using the terminal
- Copy and paste!