
Understanding Boxplots and Violin Plots
Source: vignettes/Understanding-Boxplots-and-Violin-Plots.Rmd
When to use boxplots and violin plots
Boxplots and violin plots are foundational exploratory views. They are useful when your main question is not yet “which formal model should I run?” but instead:
- how are the cytokine values distributed?
- do some groups look shifted, wider, or more variable than others?
- are there obvious outliers or unusual shapes I should know before choosing a statistical test?
These are often the best first plots to inspect after Step 2 filtering.
When not to use boxplots and violin plots
These plots are less suitable when:
- you want a compact many-variable summary rather than a full distribution view
- you need a predictive or multivariate model
- your next decision depends on effect size thresholds or model validation rather than raw spread
In those cases, error-bar plots, volcano plots, PCA, or supervised methods may answer the question more directly.
What the app is showing
Boxplots summarize each cytokine with its median, quartiles, and potential outliers.
Violin Plots emphasize the full distribution shape. Turning on Show Boxplot Overlay draws a boxplot inside each violin, so you can see the density shape and the quartile summary together.
Both analyses can be ungrouped or split by selected categorical variables.
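As a rough illustration of what "ungrouped" versus "split by selected categorical variables" means for the underlying data, here is a minimal Python sketch. The rows, the column names (`Group`, `Treatment`, `IL6`), and the helper `split_values` are hypothetical and are not taken from the app; the app's own implementation may differ.

```python
from collections import defaultdict
from statistics import median

# Hypothetical example rows; column names are illustrative only.
rows = [
    {"Group": "Control", "Treatment": "None", "IL6": 1.2},
    {"Group": "Control", "Treatment": "None", "IL6": 1.8},
    {"Group": "Case", "Treatment": "None", "IL6": 3.1},
    {"Group": "Case", "Treatment": "LPS", "IL6": 5.4},
    {"Group": "Case", "Treatment": "LPS", "IL6": 6.0},
]


def split_values(rows, value_col, grouping_cols=()):
    """Collect one list of values per combination of grouping columns.

    With no grouping columns, every value lands under the empty key ().
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[col] for col in grouping_cols)
        groups[key].append(row[value_col])
    return dict(groups)


# Ungrouped: one distribution per cytokine.
ungrouped = split_values(rows, "IL6")

# Grouped: one distribution per combination of the selected columns.
grouped = split_values(rows, "IL6", ("Group", "Treatment"))
for key, values in grouped.items():
    print(key, "n =", len(values), "median =", median(values))
```

Each key in `grouped` corresponds to one box or violin in a grouped plot, which is why adding grouping columns quickly shrinks the number of observations behind each shape.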
Which Step 4 arguments matter most
For both methods, the most important controls are:
- Grouping Columns (Optional): whether the distributions are shown overall or split by group.
- Bin Size: how many numeric variables are shown on one page.
- Y-Axis Limits: whether to keep automatic scaling or force a shared scale across plots.
For violin plots specifically:
- Show Boxplot Overlay: adds quartile and median structure inside the violin.
These decisions matter more than stylistic choices because they change what patterns are easy to see.
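To make the effect of Bin Size and Y-Axis Limits concrete, the following Python sketch pages a set of variables and computes one shared y-range. It is an assumption-laden illustration: the helpers `paginate` and `shared_limits`, the `pad_fraction` parameter, and the example values are hypothetical, and the app may implement paging and scaling differently.

```python
def paginate(columns, bin_size):
    """Split the variable names into pages of at most `bin_size` each."""
    if bin_size < 1:
        raise ValueError("bin_size must be at least 1")
    return [columns[i:i + bin_size] for i in range(0, len(columns), bin_size)]


def shared_limits(values_by_column, pad_fraction=0.0):
    """Return one (low, high) y-range that covers every variable.

    `pad_fraction` optionally widens the range by a share of its span.
    """
    all_values = [v for values in values_by_column.values() for v in values]
    low, high = min(all_values), max(all_values)
    pad = (high - low) * pad_fraction
    return (low - pad, high + pad)


# Hypothetical cytokine measurements on very different scales.
data = {
    "IL6": [1.0, 2.0, 3.0],
    "TNF": [10.0, 20.0, 30.0],
    "IL10": [0.5, 0.7, 0.9],
}

pages = paginate(list(data), bin_size=2)
limits = shared_limits(data)
print(pages)   # variables shown together on each page
print(limits)  # one y-range applied to every panel
```

With a shared range, low-valued variables such as the hypothetical `IL10` are compressed toward the bottom of the axis, whereas automatic scaling gives each panel its own range and hides the difference in magnitude. That trade-off is what the Y-Axis Limits control decides.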
How to read the main outputs
Boxplots
Boxplots are best for quick summary reading:
- the median line marks the central value
- the box spans the interquartile range, the middle 50% of values
- the whiskers show the range of typical values, and isolated points beyond them flag potential outliers
Use them when you want a cleaner, simpler overview.
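The numbers behind a boxplot can be sketched in a few lines of Python. This sketch assumes the common Tukey convention, in which points more than 1.5 times the interquartile range beyond the box are drawn as isolated points; whether the app uses exactly this rule and this quantile method is an assumption, and the values are made up.

```python
from statistics import quantiles


def box_stats(values, whisker_factor=1.5):
    """Compute boxplot summary values using the Tukey 1.5 x IQR convention."""
    ordered = sorted(values)
    # method="inclusive" interpolates between observed values.
    q1, med, q3 = quantiles(ordered, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - whisker_factor * iqr
    upper_fence = q3 + whisker_factor * iqr
    inside = [v for v in ordered if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": med,
        "q3": q3,
        # Whiskers end at the most extreme values still inside the fences.
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        # Anything outside the fences is drawn as an isolated point.
        "outliers": [v for v in ordered if v < lower_fence or v > upper_fence],
    }


# Hypothetical cytokine values with one unusually high measurement.
values = [2.1, 2.4, 2.5, 2.7, 2.9, 3.0, 3.2, 9.8]
stats = box_stats(values)
print(stats)
```

Here the single high value falls outside the upper fence and would appear as an isolated point, while the whiskers stop at the most extreme values inside the fences.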
Violin Plots
Violin plots are best when the shape itself matters:
- wide regions indicate where values are more concentrated
- narrow regions indicate fewer observations
- asymmetry or multiple bulges can reveal skewness or multimodality
Use them when boxplots feel too compressed to describe the data shape.
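The width of a violin at a given value is a smoothed density estimate. The Python sketch below uses a plain Gaussian kernel density estimate with a hand-picked bandwidth to show how two clusters of values turn into two bulges. The bandwidth, the grid, the helper names, and the data are all illustrative assumptions; plotting libraries normally choose the bandwidth automatically, and the app's settings are not known here.

```python
import math


def gaussian_kde(values, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each grid point."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]


def count_peaks(density):
    """Count strict local maxima, i.e. the bulges a violin would show."""
    return sum(
        1
        for i in range(1, len(density) - 1)
        if density[i - 1] < density[i] > density[i + 1]
    )


# Hypothetical values forming two clusters, near 1 and near 4.
bimodal = [1.0, 1.1, 1.2, 0.9, 1.0, 4.0, 4.1, 3.9, 4.2, 4.0]
grid = [i * 0.1 for i in range(61)]  # evaluation points from 0.0 to 6.0

density = gaussian_kde(bimodal, grid, bandwidth=0.4)
peaks = count_peaks(density)
print("number of bulges:", peaks)
```

A boxplot of the same values would show a single box stretching across both clusters, which is the kind of compression that a violin plot makes visible.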
Common cautions
Keep these limits in mind:
- small groups can make violin shapes look more certain than they really are
- auto-scaled y-axes can make panels look more different than they are
- outliers may reflect biology, quality issues, or data-entry problems, so they deserve follow-up rather than automatic removal
- grouped exploratory plots are not formal hypothesis tests
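One simple safeguard against over-reading small groups is to count the observations behind each shape before interpreting it. The Python sketch below flags groups under a chosen size. The threshold of 10, the helper `flag_small_groups`, and the data are hypothetical, not a rule taken from the app.

```python
def flag_small_groups(values_by_group, min_n=10):
    """Return the names of groups with fewer than `min_n` observations."""
    return sorted(
        name for name, values in values_by_group.items() if len(values) < min_n
    )


# Hypothetical grouped measurements.
values_by_group = {
    "Control": [1.1, 1.3, 1.2, 1.5, 1.4, 1.0, 1.6, 1.2, 1.3, 1.4, 1.1, 1.5],
    "Case": [2.9, 3.4, 3.1],
}

small = flag_small_groups(values_by_group)
print("interpret with caution:", small)
```

For flagged groups, the smooth outline of a violin rests on only a handful of points, so the individual values deserve a closer look than the shape.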
How to reproduce the result in the app
- Filter the dataset to the groups and cytokines you want to inspect.
- Choose Boxplots or Violin Plots.
- Decide whether to use Grouping Columns (Optional).
- Adjust Bin Size if the page is too dense or too sparse.
- Leave Y-Axis Limits automatic unless you need consistent scales across pages.
- For violin plots, turn on Show Boxplot Overlay if you want both shape and quartile summaries.
What to read next
Related articles:
- Understanding Error-Bar Plots for a more compact group-summary view.
- Understanding Volcano Plot when the question shifts to two-group differential screening.
- Understanding Univariate Test Selection when you are ready to move from exploratory views to formal testing.
Last updated: April 28, 2026