
A small synthetic mental-health and wellbeing intervention sample, used as the package's messy-data companion to community. Where community is the clean default, clinic deliberately carries undeclared missing-value codes, a column whose value labels were stripped on import, and an imperfect scale item, so the declare-and-clean workflow has realistic material to work on. The 70 clients and 16 variables echo the teaching structure of community – an interaction, a null variable, non-overlapping missingness, a recode dichotomy, a clean logistic outcome, a multi-category variable, and a Likert battery – in a psychology setting. The data are synthetic, but the relationships among the variables are realistic.

Usage

clinic

Format

A data frame with 70 rows and 16 variables:

ClientID

Client ID, character ("C001", "C002", ...).

Stress

Perceived stress (integer, 0-40). Carries SPSS-style missing values (-99 Refused, -98 Don't know).

SocialSupport

Perceived social support (integer, 0-24); the buffering partner in the Stress-by-SocialSupport interaction on Flourishing.

SleepHours

Average nightly sleep in hours. Carries SPSS-style missing values (-99 Refused, -98 Don't know), placed on cases that do not overlap the Stress codes, so a model using both predictors drops more cases than either alone.

Flourishing

Flourishing score (integer, 0-100); built with a Stress-by-SocialSupport interaction (the buffering hypothesis).

ScreenTime

Daily screen time in hours; deliberately near-independent of the other variables.

PriorTherapy

Received therapy before the study, dichotomy coded 1/2 (1 Yes, 2 No); recode to 0/1 before use as a logistic-regression outcome.

SoughtHelp

Sought professional help during the study, dichotomy coded 0/1 (0 No, 1 Yes). A clean logistic-regression outcome.

Medication

Currently taking medication, dichotomy coded 0/1 (0 No, 1 Yes). Carries an SPSS-style missing value (-99 Refused).

Condition

Treatment condition, 4 categories (1 Control, 2 CBT, 3 Mindfulness, 4 Support group); has a modest effect on Flourishing.

MoodRating

Mood rating (integer, 1-10). Arrives "dirty": literal -99 (Refused) and -98 (Don't know) codes are present in the data with no missing-value declaration, as is typical after a CSV or Excel import. This is the package's demonstration variable for jdeclare_udm(): summary statistics are distorted until the codes are declared.

Anxiety1

"I felt calm and relaxed." 5-point Likert (1 Not at all to 5 Extremely); reverse-keyed (the variable label ends in " R"). Reverse-code before scale scoring.

Anxiety2

"I worried about many different things." 5-point Likert. Arrives with literal -99/-98 codes present in the data and no missing-value declaration; the undeclared contrast to Anxiety4.

Anxiety3

"I felt afraid for no clear reason." 5-point Likert; arrives with its value labels stripped (a plain numeric column, as after a CSV import that dropped the labels).

Anxiety4

"I had trouble controlling my worry." 5-point Likert. Carries properly declared SPSS-style missing values (-99 Refused, -98 Don't know) – the declared contrast to Anxiety2.

Anxiety5

"I felt restless or on edge." 5-point Likert; weakly loaded (a Cronbach's-alpha drop candidate when scoring the scale).
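The cleaning steps named in the variable descriptions above can be sketched in base R. This is a minimal illustration on small hand-made vectors, not the package's own workflow: the values are hypothetical stand-ins for MoodRating, PriorTherapy, and Anxiety1, and the undeclared codes are handled with replace() because the arguments of jdeclare_udm() are not documented on this page.

```r
# Hypothetical toy values; the real columns live in the clinic data frame.
missing_codes <- c(-99, -98)

# MoodRating-style column: valid range 1-10, with undeclared codes mixed in.
mood <- c(7, 4, -99, 8, 6, -98, 5)
mean_dirty <- mean(mood)                        # dragged far below the valid range
mood_clean <- replace(mood, mood %in% missing_codes, NA)
mean_clean <- mean(mood_clean, na.rm = TRUE)    # mean of the five valid ratings

# PriorTherapy-style dichotomy: 1 = Yes, 2 = No, recoded to 1 = Yes, 0 = No.
prior <- c(1, 2, 2, 1, 2)
prior01 <- as.integer(prior == 1)

# Anxiety1-style reverse-keyed item on a 1-5 scale: new = (min + max) - old.
anx1 <- c(1, 5, 3, 2, 4)
anx1_rev <- 6 - anx1

print(c(dirty = mean_dirty, clean = mean_clean))
print(prior01)
print(anx1_rev)
```

On these toy values the undeclared codes pull the mean well below zero, while the cleaned mean falls back inside the 1-10 range. The same pattern would apply to Anxiety2 before scale scoring.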

Source

Synthetic data generated by data-raw/clinic_data_generator.R (random seed 20260614).

Details

The five Anxiety items form a single scale, with one deliberate problem per item: Anxiety1 is reverse-keyed; Anxiety2 carries literal -99/-98 codes that are not declared as missing; Anxiety3 arrives with its value labels stripped; Anxiety4 carries the same -99/-98 codes but properly declared (the clean contrast to Anxiety2); and Anxiety5 is the weak item that scale-reliability output flags for dropping.

Stress and SleepHours carry SPSS-style missing values on non-overlapping cases, so listwise deletion across both reduces the analysis sample below the per-variable counts. MoodRating and Anxiety2 are the two columns whose -99/-98 codes arrive undeclared, awaiting jdeclare_udm().

The Stress-by-SocialSupport interaction on Flourishing is the buffering hypothesis (higher social support weakens the negative association between stress and flourishing), and treatment Condition has a modest effect on Flourishing.
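Two of the points above, the effect of non-overlapping missingness on listwise deletion and the weak item as a drop candidate, can be illustrated with a short base R sketch. Everything here is an assumption made for illustration: the data frames are small hand-made toys rather than the clinic data, the missing codes are plain numbers rather than declared SPSS-style values, and cronbach_alpha() is a hypothetical helper written for this example, not a function of the package.

```r
# --- Listwise deletion with non-overlapping missingness (toy data) ---
toy <- data.frame(
  Stress     = c(12, -99, 20, 31,   8, 15),
  SleepHours = c( 7, 6.5, -98,  8, -99, 7.5)
)
# Turn the -99/-98 codes into NA so base R treats them as missing.
toy[] <- lapply(toy, function(x) replace(x, x %in% c(-99, -98), NA))

n_stress <- sum(!is.na(toy$Stress))      # cases usable for Stress alone
n_sleep  <- sum(!is.na(toy$SleepHours))  # cases usable for SleepHours alone
n_both   <- sum(complete.cases(toy))     # cases left after listwise deletion

# --- Cronbach's alpha and alpha-if-item-dropped (toy data) ---
# Hypothetical helper: alpha = k/(k-1) * (1 - sum(item variances) / var(total)).
cronbach_alpha <- function(items) {
  items <- items[complete.cases(items), , drop = FALSE]
  k <- ncol(items)
  (k / (k - 1)) * (1 - sum(sapply(items, var)) / var(rowSums(items)))
}

# Three items that move together plus one that barely tracks them.
items <- data.frame(
  i1   = c(1, 2, 3, 4, 5),
  i2   = c(2, 2, 3, 4, 5),
  i3   = c(1, 2, 3, 5, 5),
  weak = c(3, 1, 4, 1, 5)
)

alpha_all <- cronbach_alpha(items)
alpha_if_dropped <- sapply(names(items), function(nm) {
  cronbach_alpha(items[, setdiff(names(items), nm), drop = FALSE])
})
drop_candidate <- names(which.max(alpha_if_dropped))

print(c(stress = n_stress, sleep = n_sleep, both = n_both))
print(round(c(all = alpha_all, alpha_if_dropped), 3))
print(drop_candidate)
```

In the toy data the missing codes fall on different rows in each column, so the complete-case count is smaller than either per-variable count. Dropping the weakly related item is the only removal that raises alpha, which is the pattern the description attributes to Anxiety5.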

See also

community, the clean default example dataset.