Compute a row-wise sum across multiple variables

jsum() computes the sum of values across multiple variables for each case (row) in the data frame. This is typically used to create composite scores from a set of related items (e.g. summing 6 survey items into a total scale score).

By default, cases with any missing values receive NA. Use the min.valid argument to allow partial sums — for example, min.valid = 1 returns the sum of available values as long as at least one item is non-missing.

Variables can be listed individually or using colon notation to select a range of consecutive columns (e.g. Attitude1:Attitude6).

Usage

jsum(data, ..., min.valid = NULL, var.label = NULL)

Arguments

data: A data frame, or omit to use the juse() default.
...: Unquoted variable names. Use colon notation (e.g. Attitude1:Attitude6) to select a range of consecutive columns.
min.valid: Integer (optional). The minimum number of non-missing values required to compute a sum. If a case has fewer non-missing values, the result is NA. If omitted, all values must be non-missing (equivalent to setting min.valid to the number of variables).
var.label: Character string (optional). A variable label to attach to the result. If omitted, an auto-generated label is used.

Value

A numeric vector the same length as nrow(data), suitable for assigning to a new column: MyData$Total <- jsum(Var1, Var2, Var3).

Examples

# Set the default data frame (so you can omit it in function calls)
juse(community)
#> Default data frame set to: community

# Sum three variables (all must be non-missing)
community$EnvTotal <- jsum(Environment1, Environment3, Environment4)
#> Sum of 3 variables computed for 100 cases (18 set to NA due to missing values).
#> Mean of the new variable: 9.220.
#> 
#> Note: jsum() returns the totals; assign them to a column to keep them:
#>   community$<name> <- jsum(...)
#> For the full distribution (min, max, SD), run jdesc() on the new column.

# Sum with partial data allowed (at least 2 non-missing)
community$EnvTotal <- jsum(Environment1, Environment3, Environment4,
                           min.valid = 2)
#> Sum of 3 variables computed for 100 cases (min.valid = 2: 12 cases used partial data, 6 set to NA due to missing values).
#> Mean of the new variable: 8.904.
#> 
#> Note: jsum() returns the totals; assign them to a column to keep them:
#>   community$<name> <- jsum(...)
#> For the full distribution (min, max, SD), run jdesc() on the new column.

# Sum using colon range for consecutive columns
community$EnvTotal <- jsum(Environment1:Environment5)
#> Sum of 5 variables computed for 100 cases (18 set to NA due to missing values).
#> Mean of the new variable: 14.976.
#> 
#> Note: jsum() returns the totals; assign them to a column to keep them:
#>   community$<name> <- jsum(...)
#> For the full distribution (min, max, SD), run jdesc() on the new column.

# Mix colon ranges and explicit names (e.g. after reverse-coding an item)
community$Environment2R <- jrecode(community, Environment2,
                                   map = "1=5; 2=4; 3=3; 4=2; 5=1")
#> 
#> Note: jrecode() returns the recoded values; assign them to a column to keep them:
#>   community$<name> <- jrecode(...)
#> To check the recode landed correctly, compare jfreq() on the original and the new column.
community$ScaleTotal <- jsum(Environment1, Environment2R,
                             Environment3:Environment5)
#> Sum of 5 variables computed for 100 cases (18 set to NA due to missing values).
#> Mean of the new variable: 15.415.
#> 
#> Note: jsum() returns the totals; assign them to a column to keep them:
#>   community$<name> <- jsum(...)
#> For the full distribution (min, max, SD), run jdesc() on the new column.

# With a custom variable label
community$ScaleTotal <- jsum(Environment1:Environment5,
                             var.label = "Environment Scale Total")
#> Sum of 5 variables computed for 100 cases (18 set to NA due to missing values).
#> Mean of the new variable: 14.976.
#> 
#> Note: jsum() returns the totals; assign them to a column to keep them:
#>   community$<name> <- jsum(...)
#> For the full distribution (min, max, SD), run jdesc() on the new column.

# With an explicit data frame (instead of using juse default)
community$EnvTotal <- jsum(community, Environment1, Environment3,
                           Environment4)
#> Sum of 3 variables computed for 100 cases (18 set to NA due to missing values).
#> Mean of the new variable: 9.220.
#> 
#> Note: jsum() returns the totals; assign them to a column to keep them:
#>   community$<name> <- jsum(...)
#> For the full distribution (min, max, SD), run jdesc() on the new column.

# Not normally needed. You'd clear a default or registration only to
# undo a mistake, or -- as in this example -- to reset state for testing.
juse(NULL)
#> Default data frame cleared.

Compute a row-wise sum across multiple variables

Usage

Arguments

Value

See also

Examples