jstats simplifies R for users who need to run social science analyses without first becoming experienced programmers. The package provides consistent syntax, sensible defaults, and protection from confusing base R behaviors, while staying close enough to base R conventions that users learn transferable skills rather than a private dialect. Output is styled after the best conventions of other statistical applications such as SPSS, Stata, and SAS, and the code syntax is designed to ease the transition from those applications into R. The package was originally built as teaching infrastructure for a university-level statistics course and has since been expanded for the broader social science research community.

Audience

The long-term primary audience is the broader social science quantitative research community – criminologists, sociologists, political scientists, psychologists, public health researchers, and others who routinely work with Likert scales, categorical variables, dichotomies, Cronbach's alpha, dummy-coded regression, and haven-imported data from SPSS, Stata, or SAS.

During the current development phase the package is being tested actively by students and colleagues at Griffith University, plus a growing community of former students and collaborating instructors. Feedback from this group shapes ongoing refinements.

Functions by purpose

Descriptive analysis

  • jdesc – univariate descriptives (mean, median, SD, range, etc.) with optional grouping

  • jfreq – frequency tables for one or more variables

  • jcorr – Pearson or Spearman correlations with significance tests

  • jalpha – Cronbach's alpha and item-total statistics for scale reliability

  • jscreen – data screening for outliers, ranges, and skew

Group comparisons and modeling

  • jt – independent or paired t-test

  • jaov – one-way analysis of variance with optional post-hoc tests

  • jcrosstab – cross-tabulation with chi-square and effect-size options

  • jlm – linear regression

  • jlogistic – logistic regression

Variable construction

  • jrecode – recode values, with optional new value labels

  • jrelabel – apply or replace value labels and the variable label

  • jsum – row-wise sum across variables, with min-valid handling

  • javg – row-wise mean across variables, with min-valid handling
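
The "min-valid" rule for jsum and javg can be pictured with a short base R sketch. It assumes, since the list above does not spell it out, that min-valid means the minimum number of non-missing items a case needs before a score is computed. The function name row_mean_min_valid, its arguments, and the example data are hypothetical and are not part of jstats.

```r
# Hypothetical helper, not a jstats function: row-wise mean that is
# kept only when a case has at least `min_valid` non-missing items.
row_mean_min_valid <- function(data, vars, min_valid = length(vars)) {
  items <- data[, vars, drop = FALSE]
  n_valid <- rowSums(!is.na(items))        # non-missing items per case
  score <- rowMeans(items, na.rm = TRUE)   # mean of the available items
  score[n_valid < min_valid] <- NA         # too few valid items: no score
  score
}

# Made-up three-item scale with scattered missing values.
survey <- data.frame(
  q1 = c(4, 5, NA, 2),
  q2 = c(3, NA, NA, 2),
  q3 = c(5, 4, 1, NA)
)

survey$scale_mean <- row_mean_min_valid(survey, c("q1", "q2", "q3"), min_valid = 2)
print(survey)
```

With min_valid = 2, cases 1, 2, and 4 receive means of 4, 4.5, and 2, while case 3 has only one valid item and is set to NA. A row-wise sum under the same assumption would swap rowMeans() for rowSums().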

Pipeline state management

  • juse – set the default data frame used implicitly by analysis functions

  • jsubset – activate a row-level case-selection expression applied to subsequent calls

  • jcomplete – activate listwise filtering on selected variables

  • jdummy – register categorical variables for dummy coding in regression

  • joutput – set session-level output verbosity (minimal / standard / full)

Data import and export

  • jload – load data from .rds, .sav, .dta, .sas7bdat, .xlsx, or .csv

  • jsave – save a data frame, with format inferred from the file extension

Visualisation

  • jplot – base histograms and bar plots for data, plus method dispatch on result objects from jt(), jlm(), etc.

For the full alphabetical listing of every exported function, run library(help = "jstats") or browse the package index.

Workflow conventions

The j-prefix. Every user-facing function starts with j, so the package's whole API can be discovered in RStudio by typing j and pressing Tab. Internal helpers begin with a dot or .jst_ and are not intended for direct use.

Formula vs data-first. Group-comparison and modeling functions follow the base R formula interface: jt(MathScore ~ Gender, data = SampleData). Descriptive and data-management functions take the data frame first, followed by unquoted variable names: jfreq(SampleData, Gender, Program). This matches the conventions of base R functions like aggregate() and cor().
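
For readers new to the formula interface that jt() follows, the base R calls below show the same shape: the outcome on the left of ~, the grouping variable on the right, and the data frame passed as data. The sketch uses only base R and the built-in sleep data set, so it does not depend on jstats or on the SampleData object mentioned above.

```r
# Base R formula interface: outcome ~ grouping variable, data = <data frame>.
# `sleep` ships with R (columns: extra, group, ID).
t_result <- t.test(extra ~ group, data = sleep)
print(t_result)

# aggregate() accepts the same formula shape to summarise the outcome by group.
group_means <- aggregate(extra ~ group, data = sleep, FUN = mean)
print(group_means)
```

A user who learns this shape through jt() can carry it over unchanged to base R modeling functions such as t.test(), aggregate(), and lm().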

The juse-first habit. A single juse(MyData) call at the start of a session sets a default data frame. Subsequent analysis calls can then omit the data argument: jfreq(Gender) works the same as jfreq(MyData, Gender). The default also scopes the pipeline-state functions, so jsubset(Age < 30) sets a filter on the current default without further specification.

Pipeline stages. jsubset(), jcomplete(), and jdummy() modify session state that subsequent analysis calls read automatically. State is explicit – calls can be inspected, deactivated, and cleared, and active state is reported in analysis output, so a script's behavior stays visible and reproducible rather than depending on hidden context.
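
The idea of explicit, inspectable session state can be illustrated with a toy base R sketch. This is not how jstats is implemented; its internals are not described here. Every name below (.toy_state, toy_use, toy_subset, toy_clear, toy_mean) is hypothetical, and the sketch only shows the general pattern: state lives in one environment, analysis calls read it, and active state is reported whenever it affects a result.

```r
# Toy illustration only; hypothetical names, not jstats internals.
.toy_state <- new.env()

# Store a default data frame for later calls.
toy_use <- function(data) {
  .toy_state$data <- data
  invisible(data)
}

# Store an unevaluated case-selection expression.
toy_subset <- function(expr) {
  .toy_state$filter <- substitute(expr)
  invisible(NULL)
}

# Clear the active case selection.
toy_clear <- function() {
  .toy_state$filter <- NULL
  invisible(NULL)
}

# An "analysis" call that reads the stored state and reports it.
toy_mean <- function(var) {
  var <- substitute(var)
  d <- .toy_state$data
  if (!is.null(.toy_state$filter)) {
    keep <- eval(.toy_state$filter, d, parent.frame())
    d <- d[!is.na(keep) & keep, , drop = FALSE]
    message("Active subset: ", paste(deparse(.toy_state$filter), collapse = " "))
  }
  mean(eval(var, d, parent.frame()), na.rm = TRUE)
}

toy_use(mtcars)
toy_mean(mpg)          # all cases
toy_subset(cyl == 4)
toy_mean(mpg)          # 4-cylinder cases only, with the active subset reported
toy_clear()
toy_mean(mpg)          # back to all cases
```

Because the filter is stored as an expression and echoed on every call that uses it, a reader of the output can always tell which cases a result was based on.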

Output verbosity. joutput() sets one of three preset levels – minimal, standard (default), or full – that control how much detail analysis functions print. This is useful for stripping output in production scripts or expanding it during exploration. Per-call arguments always override session-level settings. The Case Processing Summary table follows an auto-suppress rule at the standard tier: it prints when something happened (pipeline state, listwise drops, or a per-variable discrepancy notification) and stays silent otherwise. See ?joutput for the full toggle behavior.

Where to go next

  • For the full alphabetical listing of functions: library(help = "jstats").

  • For source, issue reports, and contribution guidelines: the package's GitHub repository.

  • For statistics and R fundamentals (in preparation): Book 1 of the companion book series.

  • For migration patterns from SPSS, Stata, or SAS, and a deeper guide to the package's design and use in real research (in preparation): Book 2, the adopter's guide.

Author

Maintainer: Jeff Ackerman <SurveyCentre@griffith.edu.au>