Bernard Kilonzo

Combine Data in R: Append Data

Introduction

Appending data is the process of combining datasets that have the same variables (columns) but different observations (rows) by stacking them on top of one another.

To combine datasets in R by appending (or stacking) them, you can use the rbind() function from base R or the bind_rows() function from the dplyr package.

Appending data in R can be summarized by the following view.

[Figure: appending data in R]

(By appending the data, the three tables are unified into a single table, which simplifies the analysis.)

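Note: To append data with rbind(), all the data frames MUST have the same number of columns and matching column names; otherwise, R will return an error. bind_rows() is more lenient: it matches columns by name and fills any column that is missing from one of the tables with NA.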
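To make the note concrete, here is a minimal sketch in base R. The data frames a, b, and c are made-up examples, not part of the sales workbook. The first call shows that rbind() matches columns by name even when their order differs; the second shows the error raised when the names do not match, captured with tryCatch() so the script keeps running.

```r
# Made-up example tables (not from the workbook used in this post).
a <- data.frame(Year = 2017, Sales = 100)
b <- data.frame(Sales = 120, Year = 2018)    # same names, different order
c <- data.frame(Year = 2019, Revenue = 130)  # "Revenue" instead of "Sales"

# Same column names: rbind() lines the columns up by name.
ok <- rbind(a, b)
print(ok)

# Mismatched column names: rbind() stops with an error.
# tryCatch() turns the error into its message text instead of halting.
problem <- tryCatch(rbind(a, c), error = function(e) conditionMessage(e))
print(problem)
```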

Here’s a detailed explanation of how to append data using both base R and the dplyr package.

Loading the data

In this case, I am going to load the three worksheets from the Excel workbook “Yearly Sales Data”.

[Figure: sample dataset]

(Note: the three tables have the same structure, with the same number of columns and matching column names.)

But before that, let's set the working directory and load the necessary libraries.

[Figure: loading the necessary libraries and reading the files in R]
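A minimal sketch of this step might look like the following. It assumes the readxl package is installed, that the workbook is saved as "Yearly Sales Data.xlsx" in the working directory, and that the worksheets are named "2017", "2018", and "2019". The file name, the sheet names, and the helper load_yearly_sales() are all hypothetical, so adjust them to match your own workbook.

```r
# Hypothetical file and sheet names; change these to match your workbook.
# If the file lives elsewhere, set the working directory first with setwd().
workbook_path <- "Yearly Sales Data.xlsx"
sheet_names   <- c("2017", "2018", "2019")

# Hypothetical helper: read each worksheet into its own data frame
# and return them together as a named list.
load_yearly_sales <- function(path, sheets) {
  tables <- lapply(sheets, function(s) readxl::read_excel(path, sheet = s))
  names(tables) <- sheets
  tables
}

# Only attempt the read when the workbook is actually present.
if (file.exists(workbook_path)) {
  sales <- load_yearly_sales(workbook_path, sheet_names)
  str(sales, max.level = 1)
}
```

Keeping the three tables in a named list makes it easy to pass them all to an appending function later.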

Append data using the rbind() function from base R

The basic syntax for appending data using the rbind() function is:

combined_data <- rbind(Data1, Data2)

Using this function, you can append the three datasets in our case as follows.

[Figure: appending data using the rbind() function from base R]

(Note: calling unique() on the “Year” column of the resulting table returns 2017, 2018, and 2019, which confirms that the data from the three tables has been stacked together as intended.)
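As a rough, self-contained illustration of the same idea, the sketch below uses three small made-up data frames in place of the real worksheets. The names sales_2017, sales_2018, and sales_2019 and the Region and Sales columns are assumptions; only the Year column is taken from the example above.

```r
# Small made-up tables standing in for the three worksheets.
sales_2017 <- data.frame(Year = 2017, Region = c("East", "West"), Sales = c(100, 150))
sales_2018 <- data.frame(Year = 2018, Region = c("East", "West"), Sales = c(120, 160))
sales_2019 <- data.frame(Year = 2019, Region = c("East", "West"), Sales = c(130, 170))

# rbind() accepts any number of data frames and stacks them in order.
combined_data <- rbind(sales_2017, sales_2018, sales_2019)

nrow(combined_data)         # 6 rows: 2 from each table
unique(combined_data$Year)  # 2017 2018 2019
```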

Append data using the bind_rows() function from the dplyr package

Assuming that you’ve already loaded the dplyr package, you can append data using the following syntax:

combined_data <- bind_rows(data_1, data_2)

Using the dataset above, this could be accomplished as follows.

[Figure: appending data using the bind_rows() function from dplyr]
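A minimal sketch of this step, again with made-up stand-in tables rather than the real worksheets (the data frame names and the Region and Sales columns are assumptions), could look like this:

```r
library(dplyr)

# Small made-up tables standing in for the three worksheets.
sales_2017 <- data.frame(Year = 2017, Region = c("East", "West"), Sales = c(100, 150))
sales_2018 <- data.frame(Year = 2018, Region = c("East", "West"), Sales = c(120, 160))
sales_2019 <- data.frame(Year = 2019, Region = c("East", "West"), Sales = c(130, 170))

# bind_rows() stacks the tables, matching columns by name.
combined_data <- bind_rows(sales_2017, sales_2018, sales_2019)

nrow(combined_data)         # 6 rows: 2 from each table
unique(combined_data$Year)  # 2017 2018 2019
```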

Conclusion

Both methods are effective for appending datasets in R. Use rbind() when you are sure that all the datasets match in structure. Opt for bind_rows() when dealing with datasets that may have differing structures, or when you want to keep track of which table each row came from.
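The two bind_rows() features mentioned here can be sketched as follows. The tables and the Discount column are made up for illustration. Passing a named list together with the .id argument adds a column recording which table each row came from, and any column missing from one table is filled with NA.

```r
library(dplyr)

# Made-up tables with slightly different structures.
sales_2017 <- data.frame(Year = 2017, Sales = 100)
sales_2018 <- data.frame(Year = 2018, Sales = 120, Discount = 0.1)  # extra column

# The list names become the values of the new "source" column.
combined <- bind_rows(
  list(sheet_2017 = sales_2017, sheet_2018 = sales_2018),
  .id = "source"
)

print(combined)
# The 2017 row has NA in Discount because that table had no such column.
```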

Thank you for reading!