Introduction
Bar plots, also known as bar charts, are graphical representations used to display categorical data with rectangular bars. The lengths or heights of these bars are proportional to the values they represent, making it easy to compare different categories visually.
In this short article, I demonstrate how to create bar plots using the ggplot2 package, a powerful and flexible data visualization package that lets users build plots layer by layer following the Grammar of Graphics.
Creating a basic bar plot
To create a basic bar plot in ggplot2, first you need to load the necessary packages in your R session.
Next, set the working directory and load the dataset. In this case, I will be using the Sample – Superstore dataset.
With the data loaded, you can create a basic bar plot showing the total Sales by Region using the following code.
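A minimal sketch of such a basic bar plot is given here. Because the data file itself is not reproduced in this article, the sketch assumes a tiny made-up data frame named `superstore` (a hypothetical stand-in with `Region`, `Category` and `Sales` columns; the values are illustrative only). It also assumes the totals are computed inside `geom_bar()` via `stat = "summary"` with `fun = sum`, which needs ggplot2 3.3.0 or later; summarizing the data first would work just as well.

```r
library(ggplot2)

# Hypothetical stand-in for the Superstore data (made-up values)
superstore <- data.frame(
  Region   = c("East", "East", "West", "West", "East", "West"),
  Category = c("Furniture", "Technology", "Furniture",
               "Technology", "Furniture", "Technology"),
  Sales    = c(100, 200, 150, 50, 25, 75)
)

# One bar per Region; each bar's height is the sum of Sales in that Region
p_basic <- ggplot(superstore, aes(x = Region, y = Sales)) +
  geom_bar(stat = "summary", fun = sum)

# Printing the object (for example, typing p_basic at the console) draws the plot
```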
(Note: in the above code, I have summarized Sales using sum.)
Executing this produces the following plot.
Let's customize the bar plot by adding some custom colors from the paletteer package, adjusting the bar widths, and tidying up the appearance using the following code.
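One way this customization might look is sketched here. The data frame is again a made-up stand-in, and the palette name `"RColorBrewer::Set2"`, the bar width of 0.6, and the axis and title text are assumptions chosen for illustration, not the settings used for the original figure.

```r
library(ggplot2)
library(paletteer)

# Hypothetical stand-in for the Superstore data (made-up values)
superstore <- data.frame(
  Region   = c("East", "East", "West", "West", "East", "West"),
  Category = c("Furniture", "Technology", "Furniture",
               "Technology", "Furniture", "Technology"),
  Sales    = c(100, 200, 150, 50, 25, 75)
)

p_custom <- ggplot(superstore, aes(x = Region, y = Sales, fill = Region)) +
  # Narrower bars than the default width of 0.9
  geom_bar(stat = "summary", fun = sum, width = 0.6) +
  # Discrete palette from paletteer, named as "package::palette"
  scale_fill_paletteer_d("RColorBrewer::Set2") +
  labs(title = "Total Sales by Region", x = "Region", y = "Sales") +
  # Classic theme: white background, no grid lines
  theme_classic()

# Printing p_custom draws the plot
```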
(Note: in this code, I have set the color palette, adjusted the width of the bars, and used the classic theme for a simple, clean look.)
Here is the color palette finder.
Executing this gives the nicer bar plot shown below.
Creating a grouped bar plot
With a few adjustments to the basic bar plot code, you can create a grouped bar plot by specifying position = "dodge" in the geom_bar() function, as shown below.
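As a rough sketch under the same assumptions as before (a made-up `superstore` data frame and in-plot summing with `stat = "summary"`), a grouped bar plot could be written like this:

```r
library(ggplot2)

# Hypothetical stand-in for the Superstore data (made-up values)
superstore <- data.frame(
  Region   = c("East", "East", "West", "West", "East", "West"),
  Category = c("Furniture", "Technology", "Furniture",
               "Technology", "Furniture", "Technology"),
  Sales    = c(100, 200, 150, 50, 25, 75)
)

# fill = Category splits each Region into one bar per Category;
# position = "dodge" places those bars side by side
p_grouped <- ggplot(superstore, aes(x = Region, y = Sales, fill = Category)) +
  geom_bar(stat = "summary", fun = sum, position = "dodge")

# Printing p_grouped draws the plot
```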
(In this case, I am grouping Sales by Category within each Region; to do so, notice that I have assigned Category to the fill aesthetic, fill = Category.)
Executing this produces the plot below.
Creating a stacked bar plot
To create a stacked bar chart visualizing the same fields as in the above case, all you need to do is set the position to stack (position = "stack") in the geom_bar() function, as shown below.
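A sketch of the stacked variant, again assuming the made-up `superstore` data frame and in-plot summing, differs from the grouped version only in the `position` argument:

```r
library(ggplot2)

# Hypothetical stand-in for the Superstore data (made-up values)
superstore <- data.frame(
  Region   = c("East", "East", "West", "West", "East", "West"),
  Category = c("Furniture", "Technology", "Furniture",
               "Technology", "Furniture", "Technology"),
  Sales    = c(100, 200, 150, 50, 25, 75)
)

# position = "stack" piles the Category segments on top of each other,
# so the full bar height is the Region total
p_stacked <- ggplot(superstore, aes(x = Region, y = Sales, fill = Category)) +
  geom_bar(stat = "summary", fun = sum, position = "stack")

# Printing p_stacked draws the plot
```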
Executing this stacks the categories within each region, as shown below.
Creating a 100% stacked bar plot
In this case, let’s create a 100% stacked bar plot showing the proportion of Sales by Category in different Regions.
Let’s compute the proportions and format the values using the following code.
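One possible way to compute and format these proportions is sketched here with dplyr. The data frame is a made-up stand-in, and the names `sales_share` and `label` are hypothetical; only the `percent` column name follows the text.

```r
library(dplyr)

# Hypothetical stand-in for the Superstore data (made-up values)
superstore <- data.frame(
  Region   = c("East", "East", "West", "West", "East", "West"),
  Category = c("Furniture", "Technology", "Furniture",
               "Technology", "Furniture", "Technology"),
  Sales    = c(100, 200, 150, 50, 25, 75)
)

sales_share <- superstore %>%
  # Total Sales for every Region and Category combination
  group_by(Region, Category) %>%
  summarise(Sales = sum(Sales), .groups = "drop") %>%
  # Share of each Category within its Region
  group_by(Region) %>%
  mutate(
    percent = Sales / sum(Sales),
    label   = sprintf("%.1f%%", 100 * percent)  # formatted text, e.g. "38.5%"
  ) %>%
  ungroup()

sales_share
```

Within each Region the `percent` values add up to 1, which is what lets every bar reach the same height in the plot.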
Executing the above code returns the following.
With the proportions computed, you can now plot the 100% stacked bar plot by adding the ggplot section as shown in the code below.
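Putting the two steps together, a self-contained sketch could look like the following. It assumes the same made-up data and the hypothetical `sales_share` name; the percent axis labels from `scales::label_percent()` and the classic theme are illustrative choices.

```r
library(ggplot2)
library(dplyr)

# Hypothetical stand-in for the Superstore data (made-up values)
superstore <- data.frame(
  Region   = c("East", "East", "West", "West", "East", "West"),
  Category = c("Furniture", "Technology", "Furniture",
               "Technology", "Furniture", "Technology"),
  Sales    = c(100, 200, 150, 50, 25, 75)
)

# Share of each Category within its Region
sales_share <- superstore %>%
  group_by(Region, Category) %>%
  summarise(Sales = sum(Sales), .groups = "drop") %>%
  group_by(Region) %>%
  mutate(percent = Sales / sum(Sales)) %>%
  ungroup()

# stat = "identity" plots the precomputed percent values as they are
p_full <- ggplot(sales_share, aes(x = Region, y = percent, fill = Category)) +
  geom_bar(stat = "identity", position = "stack") +
  scale_y_continuous(labels = scales::label_percent()) +
  theme_classic()

# Printing p_full draws the plot
```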
(Notice that I am plotting percent on the y axis, while stat in the geom_bar() function is set to "identity" so that the actual percent values are plotted.)
Executing this produces the 100% stacked bar plot shown below.
Conclusion
Bar plots are a powerful tool for visualizing categorical data in R. They provide an intuitive way to compare values across different categories. With the ggplot2 package, creating and customizing bar plots is straightforward and flexible: you can adjust colors, bar widths, positions, and themes to enhance clarity and visual appeal.
Thank you for reading!