# Chapter 5 An Illustrative Example of Case 3 Best–Worst Scaling

Author: Hideo Aizaki

First Release: 29 March 2021

## 5.1 Before you get started

This tutorial presents the entire process of Case 3 (multi-profile case) best–worst scaling (BWS)—from constructing the choice sets to analyzing the responses using **R** (R Core Team 2021b). The demonstration assumes some familiarity with **R**. If you are completely unfamiliar with **R**, you are advised to consult R Core Team (2021a) or other general support materials before attempting the example.

## 5.2 Overview of Case 3 BWS

This section explains Case 3 BWS briefly. For complete details on the approach please refer to works such as Marley and Pihlens (2012), Lancsar et al. (2013), and Louviere, Flynn, and Marley (2015 Chapter 4). For an outline of three variants of BWS, see Section 3.2.

Similar to discrete choice experiments (see Chapter 2), a Case 3 BWS question consists of three or more profiles. Each profile has two or more attributes with each attribute having two or more levels. Consequently, the profile is expressed as a combination of attribute-levels. Respondents are asked to choose the best and worst profiles from a Case 3 BWS question (i.e., a choice set). This style of question is repeated until all the choice sets are evaluated.

There are various approaches to construct the choice sets for Case 3 BWS. This example uses a basic design approach with a fractional factorial design (FFD)/orthogonal array (OA) and balanced incomplete block design (BIBD), as explained by Louviere, Flynn, and Marley (2015 Chapter 4). This approach creates profiles from an FFD and then assigns the profiles into choice sets using a BIBD. Therefore, the FFD and BIBD used for the approach need to satisfy a condition in which the number of rows of the FFD is equal to the number of treatments in the BIBD. An example with **R** code chunks will be shown later.

There are two approaches for analyzing the responses to Case 3 BWS questions similar to the Case 1 and Case 2 BWS—the counting and modeling approaches. Since the modeling approach seems to be major in applied studies, we focus only on this approach below. The modeling approach uses discrete choice models to analyze the responses. When using the modeling approach, a model type must be selected according to the assumptions made about the respondents’ choices to the Case 3 BWS questions. Following this, the dataset must be formatted as per the selected model. There are three standard models of analysis: maximum difference (maxdiff), sequential, and rank-ordered models (Lancsar et al. 2013; Marley and Pihlens 2012). Although the three models all assume that respondents derive utility from each profile in a choice set and select the best and worst profiles in the choice set based on the subjective utilities for profiles, the assumption regarding how they select the best and worst profiles from the choice set differs across the three models. The maxdiff model assumes that respondents select profile \(i\) as the best and profile \(j\) (\(i \neq j\)) as the worst because the difference in utility between the two profiles represents the greatest utility difference among all the utility differences. The number of utility differences in a pair is equal to the number of possible pairs in which profile \(i\) is selected as the best and profile \(j\) is selected as the worst from \(P\) profiles, that is, \(P \times (P - 1)\). The sequential model assumes that respondents select profile \(i\) as the best from \(P\) profiles in the choice set since the utility for profile \(i\) is the maximum, and then select profile \(j\) as the worst from the remaining \(P - 1\) profiles since the utility for profile \(j\) is the minimum. The rank-ordered model assumes that respondents select profile \(i\) as the best among \(P\) profiles since the utility for profile \(i\) is the maximum, profile \(k\) (\(k \neq i, j\)) as the second best from the remaining \(P - 1\) profiles since the utility of profile \(k\) is the second maximum, profile \(l\) (\(l \neq i, j, k\)) as the third best from the remaining \(P - 2\) profiles since the utility of profile \(l\) is the third maximum, and so on.

The probability of selecting profiles from a choice set \(S\) for each model can be expressed using the conditional logit model. Under assumptions such as a choice set consists of four profiles (\(S = \{1, 2, 3, 4\}\)); respondent \(n\) selected Profile 2 as the first best (FB), Profile 4 as the first worst (FW), Profile 1 as the second best (SB), and Profile 3 as the second worst (SW = the third best [TB] for the rank-ordered model); and the systematic component of utility of profile \(i\) for respondent \(n\) is \(v_{in}\), the choice probability for the maxdiff, sequential, and rank-ordered models is given as follows (\(n\) is omitted for the simplicity):

\[\begin{equation} \mathrm{Pr}(\textrm{FB} = 2, \textrm{FW} = 4) = \frac{\exp{(v_{2} - v_{4})}}{\sum_{p, q \subset S, p \neq q} \exp{(v_{p} - v_{q})}}. \end{equation}\]

\[\begin{equation} \mathrm{Pr}(\textrm{FB} = 2, \textrm{FW} = 4, \textrm{SB} = 1) = \frac{\exp{(v_{2})}}{\sum_{p \subset S} \exp{(v_{p})}}\frac{\exp{(-v_{4})}}{\sum_{q \subset S - \{2\}} \exp{(-v_{q})}}\frac{\exp{(v_{1})}}{\sum_{r \subset S - \{2, 4\}} \exp{(v_{r})}}. \end{equation}\]

\[\begin{equation} \mathrm{Pr}(\textrm{FB} = 2, \textrm{SB} = 1, \textrm{TB} = 3) = \frac{\exp{(v_{2})}}{\sum_{p \subset S} \exp{(v_{p})}}\frac{\exp{(v_{1})}}{\sum_{q \subset S - \{2\}} \exp{(v_{q})}}\frac{\exp{(v_{3})}}{\sum_{r \subset S - \{2, 1\}} \exp{(v_{r})}}. \end{equation}\]

where \(v_{i} = x_{i}'\beta\); \(x_{i}\) is a vector of attribute-level variables for profile \(i\); and \(\beta\) is a vector of the coefficients to be estimated.

The following **R** code example, given in this chapter, focuses only on the maxdiff model.

## 5.3 Packages for Case 3 BWS

We use five CRAN packages **support.BWS3** (Aizaki 2019b), **support.CEs** (Aizaki 2012), **DoE.base** (Grömping 2018), **crossdes** (Sailer 2013), and **survival** (Therneau 2020) to explain how to implement Case 3 BWS in **R**. The package **support.BWS3** provides functions to convert an FFD/OA and BIBD into Case 3 BWS questions, create a dataset for the analysis from the two designs and the responses to the questions, and generate artificial responses to the questions. The function `oa.design()`

in **DoE.base** constructs OAs, whereas the function `find.BIB()`

in **crossdes** generates BIBDs. The function `clogit()`

in **survival** fits the conditional logit model. Apart from `clogit()`

, other packages such as **mlogit** (Croissant 2020b), **gmnl** (Sarrias and Daziano 2017), and **apollo** (Hess and Palma 2019, 2021) are also used for the conditional logit model analysis. The functions `gofm()`

and `mwtp()`

in **support.CEs** calculate the goodness-of-fit measures for conditional logit model fitted by `clogit()`

and marginal willingness to pays for the non-monetary variables in the fitted model, respectively. Details for the CRAN packages regarding experimental designs and discrete choice models can be found on CRAN Task View: https://CRAN.R-project.org/view=ExperimentalDesign for experimental designs; and https://CRAN.R-project.org/view=Econometrics for econometrics including discrete choice models.

Once the packages have been downloaded and installed, we then load the packages needed for the example into the current session:

```
library("support.BWS3")
library("support.CEs")
library("survival")
library("crossdes")
library("DoE.base")
```

Note that we printed out six digits in the example. To set the same appearance format in **R**, execute

## 5.4 Designing a choice situation

We explain how to use the functions on the basis of a hypothetical survey focusing on potential tourists’ valuation of agritourism, which is constructed according to the survey given in Chapter 4. Agritourism refers to various activities offered by farms to visitors, such as hands-on farm work or outdoor recreation. In the hypothetical survey, a total of 100 individuals were asked to evaluate agritourism packages provided by dairy farms.

As the first step of implementing Case 3 BWS, it is necessary to determine the features of agritourism that are potentially available. In this example, the agritourism package consists of the following three types of activities and the fee, each with three levels (names used in **R** code chunks are in parenthesis):

- Hands-on farm chore attribute (
`chore`

)- Milking a cow (
`milking`

) - Feeding a cow (
`feeding`

) - Nursing a calf (
`nursing`

)

- Milking a cow (
- Hands-on food processing attribute (
`food`

)- Butter making (
`butter`

) - Ice-cream making (
`ice`

) - Creamy caramel making (
`caramel`

)

- Butter making (
- Outdoor recreation attribute (
`outdoor`

)- Horse riding (
`horse`

) - Tractor riding (
`tractor`

) - Walking with cows (
`cow`

)

- Horse riding (
- Fee per person of the agritourism package attribute (
`fee`

) (Note that USD 1 \(\neq\) JPY 104 as of January 2021)- JPY 3000
- JPY 6000
- JPY 9000

## 5.5 Generating designs

As mentioned above, this example uses a design method for Case 3 BWS as explained by Louviere, Flynn, and Marley (2015 Chapter 4). Therefore, the second step is to prepare an FFD/OA and BIBD corresponding to the choice situation.

Firstly, we create an OA with four three-level factors (attributes) using the function `oa.design()`

in **DoE.base**. Although the function has many arguments (see the `oa.design`

help document for details), this example uses two arguments:

`nlevels`

: A vector containing the number of levels corresponding to the OMED that you wish to create.`randomize`

: A logical variable denoted by`TRUE`

when randomizing the order of the rows in the resultant design or`FALSE`

when not doing so.

The following lines of code create the OMED used in this example and store it in an object `oa`

:

```
## A B C D
## 1 1 1 1 1
## 2 1 2 3 2
## 3 1 3 2 3
## 4 2 1 3 3
## 5 2 2 2 1
## 6 2 3 1 2
## 7 3 1 2 2
## 8 3 2 1 3
## 9 3 3 3 1
## class=design, type= oa
```

The resultant design is a matrix with nine rows and four columns. The columns of the OA correspond to attributes, while the rows correspond to profiles. Therefore, the first row denotes Profile 1 that consists of level 1 of each attribute, and the last (ninth) row denotes Profile 9 that consists of level 3 of attribute 1, 2, and 3 each, and level 1 of attribute 4.

Next, we create a BIBD with nine treatments, three treatments per block, and 12 blocks using the `find.BIB()`

function in **crossdes**. Three arguments are used:

`trt`

: A number of treatments.`k`

: A number of treatments per block.`b`

: A number of blocks.

**R** code for creating the BIBD for this example is given as:

```
## [,1] [,2] [,3]
## [1,] 2 4 5
## [2,] 3 4 9
## [3,] 2 3 6
## [4,] 1 6 9
## [5,] 1 3 5
## [6,] 2 8 9
## [7,] 5 7 9
## [8,] 3 7 8
## [9,] 1 4 8
## [10,] 4 6 7
## [11,] 1 2 7
## [12,] 5 6 8
```

```
##
## [1] The design is a balanced incomplete block design w.r.t. rows.
```

As the `find.BIB()`

uses random numbers internally, a seed for random number generator is set in advance by `set.seed()`

for reproducibility. The result of executing the **crossdes** function `isGYD()`

with the output from `find.BIB()`

indicates that the resultant design is a BIBD.

As mentioned above, the resultant BIBD is expressed as a matrix with 12 rows and three columns. According to Louviere, Flynn, and Marley (2015 Chapter 4), we can generate 12 choice sets for Case 3 BWS by replacing the treatment numbers in the BIBD such as 1, 2, …, and 9 with the corresponding profiles such as Profile 1, Profile 2, …, and Profile 9 created above. In this example, a choice set corresponding to the first row (block) of the BIBD consists of Profile 2, Profile 4, and Profile 5, while one corresponding to the ninth row of the BIBD consists of Profile 5, Profile 6, and Profile 8. This assignment process can be done using the **support.BWS3** function `bws3.design()`

with the following arguments:

`bibd`

: A data frame or matrix containing a BIBD.`ffd`

: A data frame or matrix containing an FFD.`attribute.levels`

: A list containing the names of the attributes and their levels.

The following code chunk stores the names of the attributes and their levels into an object `atr`

, and then creates a Case 3 BWS design for this example from the objects `bibd`

, `oa`

, and `atr`

:

```
atr <- list(
chore = c("milking", "feeding", "nursing"),
food = c("butter", "ice", "caramel"),
outdoor = c("horse", "tractor", "cow"),
fee = c(3000, 6000, 9000))
bws3dsgn <- bws3.design(
bibd = bibd,
ffd = oa,
attribute.levels = atr)
class(bws3dsgn)
```

`## [1] "cedes" "list"`

The resultant design is an object S3 class `"cedes"`

, which is an S3 class for discrete choice experimental design generated using functions in **support.CEs** (see Chapter 2).

## 5.6 Creating questions

The third step is to create Case 3 BWS questions from the design generated above. As the design generated from `bws3.design()`

is an object S3 class `"cedes"`

, the function `questionnaire()`

in **support.CEs** is used to create Case 3 BWS questions. The following is the **R** code chunk for this example:

```
questionnaire(bws3dsgn)
Block 1
Question 1
alt.1 alt.2 alt.3
chore "milking" "feeding" "feeding"
food "ice" "butter" "ice"
outdoor "cow" "cow" "tractor"
fee "6000" "9000" "3000"
...
Question 12
alt.1 alt.2 alt.3
chore "feeding" "feeding" "nursing"
food "ice" "caramel" "ice"
outdoor "tractor" "horse" "horse"
fee "3000" "6000" "9000"
```

As you can see, the first question has three different agritourism packages. The first package (`alt.1`

), which comes at JPY 6000, consists of three activities which are milking a cow, ice-cream making, and walking with cows; the second package (`alt.2`

) including feeding a cow, butter making, and waking with cows is priced at JPY 9000; and the third package (`alt.3`

) of feeding a cow, ice-cream making, and tractor riding comes at JPY 3000. When an opt-out option (i.e., I would like to select none of these packages) is available, respondents are asked to select the best and worst alternative from among four including the opt-out option. We can the copy the resultant questions from the **R** console and then paste them into other software packages such as a word processor to create the actual questionnaire.

## 5.7 Conducting a survey/Synthesizing responses to questions

In practice, the fourth step is to conduct the actual questionnaire survey using the Case 3 BWS questions created above. As this example assumes a hypothetical survey, we skip this step and instead synthesize responses to the Case 3 BWS questions (designs) using the function `bws3.response()`

in **support.BWS3**. This function generates artificial responses on the basis of a maxdiff model and has the following arguments:

`design`

: An object of the S3 class`"cedes"`

.`categorical.attributes`

: A vector containing the names of the attributes that are treated as dummy-coded variables in the model. If there are no categorical variables, it is set as`NULL`

(default).`continuous.attributes`

: A vector containing the names of the attributes that are treated as continuous variables in the model. If there are no continuous variables, it is set as`NULL`

(default).`asc`

: A vector containing binary values that takes the value of 1 when an alternative specific constant (ASC) is included in the utility function of an alternative and 0 otherwise. The \(i\)-th element in the vector corresponds to the \(i\)-th alternative. If there are no ASCs, it is set as`NULL`

(default).`common`

: A named vector containing a fixed combination of levels corresponding to a common alternative (base/reference) in the choice sets. If the common alternative is an opt-out (no choice) option, use the argument`optout`

instead. If there is no common option, it is set as`NULL`

(default).`optout`

: A logical variable describing whether the opt-out option is included in the Case 3 BWS questions. If`TRUE`

(default), the opt-out option is included; if`FALSE`

, it is not.`b`

: A vector containing parameters of independent variables in the model. The vector is used to calculate utilities for alternatives.`n`

: An integer value showing the number of respondents in the resultant dataset.`detail`

: A logical variable. If`TRUE`

, the dataset is returned in a detailed format; and if`FALSE`

(default), the dataset is returned in a simple format.`seed`

: Seed for a random number generator.

Before using the function `bws3.response()`

, we have to consider assumptions of an opt-out option and systematic component of utility function. Assumptions for this example are given as: (i) an opt-out option is included in each choice set (i.e., respondents are asked to select one from a set of three profiles and the “none of these” option); (ii) three activities are treated as categorical (qualitative) attributes, and the third level in each activity attribute (i.e., nursing a calf for hands-on farm chore, creamy caramel making for hands-on food processing, and walking with cows for outdoor recreation) is a reference (base level) for dummy variables; (iii) the fee is treated as a continuous (quantitative) attribute; and (iv) an alternative specific constant (ASC) is included in a utility function for the opt-out option, while it is not included for agritourism packages. Accordingly, the systematic component of a utility function for three agritourism packages (\(i = 1, 2, 3\)) and that for the opt-out option (\(i = 4\)) are given as follows:
\[\begin{align}
v_{i} = &\beta_{1}milking_{i} + \beta_{2}feeding_{i} + \beta_{3}butter_{i} + \beta_{4}ice_{i} + \beta_{5}horse_{i} + \beta_{6}tractor_{i} + \beta_{7}fee_{i},
\end{align}\]
\[\begin{align}
v_{4} = ASC,
\end{align}\]
where \(milking_{i}\) is a dummy variable taking the value of 1 if agritourism package \(i\) has a “milking a cow” as the hands-on farm chore activity, 0 otherwise; \(feeding_{i}\) is a dummy variable taking the value of 1 if agritourism package \(i\) has a “feeding a cow” as the hands-on farm chore activity, 0 otherwise; \(butter_{i}\) is a dummy variable taking the value of 1 if agritourism package \(i\) has a “butter making” as the hands-on food processing activity, 0 otherwise; \(ice_{i}\) is a dummy variable taking the value of 1 if agritourism package \(i\) has a “ice-cream making” as the hands-on food processing activity, 0 otherwise; \(horse_{i}\) is a dummy variable taking the value of 1 if agritourism package \(i\) has a “horse riding” as the outdoor recreation, 0 otherwise; \(tractor_{i}\) is a dummy variable taking the value of 1 if agritourism package \(i\) has a “tractor riding” as the outdoor recreation, 0 otherwise; \(fee_{i}\) is a numerical variable for fee per person of agritourism package \(i\) (unit: JPY); \(ASC\) is an alternative specific constant; and \(\beta\)s are coefficients (parameters) for these variables. Further, this example assumes that the values of the parameters, including \(ASC\), are as follows: \(\beta_{1} = 0.5\); \(\beta_{2} = -0.2\); \(\beta_{3} = 1\); \(\beta_{4} = 0.1\); \(\beta_{5} = 0.1\); \(\beta_{6} = 0.1\); \(\beta_{7} = -0.0003\); and \(ASC = -1\). Note that the function `bws3.response()`

assumes that the opt-out option is located as the last alternative in each choice set, if the option is available.

After calculating the utility values (by adding the calculated values of the systematic component of the utility and random numbers generated from a type I extreme value distribution), the function `bws3.response()`

finds the pair with the highest difference in utility from the \(P \times (P - 1)\) differences in utility. Given the assumptions mentioned above, the code chunk for synthesizing 100 individuals’ responses to Case 3 BWS questions for this example is as follows:

```
bws3rsp <- bws3.response(
design = bws3dsgn,
b = c(0, 0, 0, -1, 0.5, -0.2, 0, 1, 0.1, 0, 1, 0.1, 0, -0.0003),
n = 100,
optout = TRUE,
categorical.attributes = c("chore", "food", "outdoor"),
continuous.attributes = c("fee"),
asc = c(0,0,0,1),
detail = F,
seed = 123)
```

Here, we provide additional explanation on the settings for arguments `b`

and `asc`

. The numerical vector assigned to the argument `b`

is `c(0, 0, 0, -1, 0.5, -0.2, 0, 1, 0.1, 0, 1, 0.1, 0, -0.0003)`

. The first, second, third, and fourth elements correspond to ASC in the first, second, third, and fourth alternatives, respectively, in a choice set. The value of 0 for ASCs means the corresponding ASC does not exist: the first, second, and third alternative has no ASC. The fifth, sixth, and seventh elements correspond to the parameter values of dummy variables regarding the first attribute: milking, feeding, and nursing. Note that the seventh element has the value of 0, meaning that nursing a calf is treated as the reference for dummy variables regarding the hands-on farm chore attribute. The eighth, ninth, and tenth elements correspond to the parameter values of dummy variables regarding the second attribute: butter, ice, and caramel (the reference for hands-on food processing attribute). The eleventh, twelfth, and thirteenth elements correspond to the third attribute: horse, tractor, and cow (the reference for outdoor recreation attribute). The last element corresponds to the variable for the fourth attribute: fee.

When the systematic component of the utility includes ASCs, the argument `asc`

is set according to the assumption on ASCs. The \(i\)-th element of a binary vector assigned to the argument `asc`

takes the value of 1 if the \(i\)-th alternative has an ASC and takes the value of 0 otherwise. For this example, since a choice set consists of four alternatives and only the fourth alternative (i.e., the opt-out option) has an ASC, a vector `c(0, 0, 0, 1)`

is assigned to the argument `asc`

.

The resultant dataset is given as follows:

```
## 'data.frame': 100 obs. of 26 variables:
## $ id : int 1 2 3 4 5 6 7 8 9 10 ...
## $ BLOCK: num 1 1 1 1 1 1 1 1 1 1 ...
## $ B1 : num 4 1 3 1 3 3 4 3 1 3 ...
## $ W1 : num 2 2 4 3 1 1 2 2 3 2 ...
## $ B2 : num 4 4 4 4 1 3 4 3 4 3 ...
## $ W2 : num 3 2 1 3 2 4 1 1 1 1 ...
## $ B3 : num 3 4 3 2 3 2 3 3 3 2 ...
## $ W3 : num 1 2 2 3 2 4 2 4 2 3 ...
## $ B4 : num 1 1 1 1 1 1 1 2 1 1 ...
## $ W4 : num 2 2 4 3 3 3 2 4 2 3 ...
## $ B5 : num 1 1 4 4 3 1 1 1 4 1 ...
## $ W5 : num 3 2 2 2 2 4 2 2 3 3 ...
## $ B6 : num 3 1 3 2 1 4 4 1 3 3 ...
## $ W6 : num 2 3 2 1 2 2 2 2 2 1 ...
## $ B7 : num 4 2 1 2 2 1 2 3 1 4 ...
## $ W7 : num 3 4 4 3 1 4 3 4 2 1 ...
## $ B8 : num 2 4 2 2 3 4 2 2 1 4 ...
## $ W8 : num 1 1 1 1 1 1 3 3 4 2 ...
## $ B9 : num 1 1 1 1 1 3 1 1 4 1 ...
## $ W9 : num 2 3 3 2 3 2 2 2 2 2 ...
## $ B10 : num 4 4 2 2 1 2 2 4 4 2 ...
## $ W10 : num 1 1 1 4 2 1 3 2 1 1 ...
## $ B11 : num 2 1 3 1 1 1 1 1 1 1 ...
## $ W11 : num 4 2 4 3 4 2 2 4 4 4 ...
## $ B12 : num 3 1 4 4 4 1 4 2 4 1 ...
## $ W12 : num 1 2 2 2 2 2 2 1 1 2 ...
```

The dataset contains 100 observations (individuals) with 26 variables. The `id`

variable is the respondents’ unique id variable. The `BLOCK`

variable is the block number of choice sets. For this example, the choice sets are not divided into sub blocks. As such, the `BLOCK`

variable takes only the value of 1. The remaining 24 variables are response variables `B1`

to `W12`

that correspond to the Case 3 BWS questions: `B`

denotes a response as the best, `W`

denotes a response as the worst, and the numbers `1`

to `12`

indicate the Case 3 BWS question number. The alternative numbers selected as the best and worst are stored in the response variables. For example, respondent 1 (`id = 1`

) selected the opt-out option as the best (`B1 = 4`

) and the second alternative as the worst (`W1 = 2`

) in the first question.

When you conduct a survey, you would have to create a respondent dataset on the basis of the completed questionnaires. The structure of your respondent dataset needs to follow the basic structure of the dataset mentioned above: it must include the `id`

, `BLOCK`

, response variables. The response variables must be constructed such that the best selection alternates with the worst by question, which is similar to the synthesized respondent dataset: it contains 24 response variables `B1`

, `W1`

, `B2`

, `W2`

, …, `B12`

, and `W12`

. Other variables in the respondent dataset are treated as the respondents’ characteristics such as gender and age.

## 5.8 Creating a dataset

As the fifth step, we need to create a dataset for the analysis by combining the Case 3 BWS design generated in the second step and the responses to the Case 3 BWS questions. A dataset for the analysis is created using the **support.BWS3** function `bws3.dataset()`

with the following arguments:

`data`

: A data frame containing a respondent dataset.`id`

: A character representing the respondent identification number variable in the respondent dataset.`response`

: A named list containing the names of response variables in the respondent dataset.`choice.sets`

: An object of the S3 class`"cedes"`

.`categorical.attributes`

: A vector containing the names of attributes treated as dummy-coded variables in the analysis, or a named vector containing the reference levels of attributes treated as effect-coded variables. If there are no categorical variables, it is set as`NULL`

(default).`continuous.attributes`

: A vector containing the names of attributes treated as continuous variables in the analysis. If there are no continuous variables, it is set as`NULL`

(default).`common`

: A named vector containing a fixed combination of levels corresponding to a common alternative (base/reference) in the choice sets. If the common alternative is an opt-out (no choice) option, use the argument`optout`

instead. If there is no common option, it is set as`NULL`

(default).`optout`

: A logical variable describing whether the opt-out option is included in the BWS questions. If`TRUE`

(default), the opt-out option is included;`FALSE`

otherwise it is not.`asc`

: A vector containing binary values that takes the value of 1 when an ASC is included in the utility function of an alternative and 0 otherwise. The \(i\)-th element in the vector corresponds to the \(i\)-th alternative. If there are no ASCs, it is set as`NULL`

(default).`model`

: A character showing a type of dataset created by the function:`"maxdiff"`

for a maximum difference model;`"sequential"`

for a sequential model; and`"rank"`

for a rank-ordered model.`detail`

: A logical variable describing whether the response variables assigned to the argument`response`

are also stored in the dataset created by the function.

As you can see, the definitions of the arguments `data`

, `choice.sets`

, `categorical.attributes`

, `continuous.attributes`

, `common`

, `optout`

, and `asc`

are similar to those of the function `bws3.response()`

. Note that the order of questions in the respondent dataset has to be the same as that in the Case 3 BWS design assigned to `choice.sets`

.

In this example, we store the names of the response variables used in the synthesized respondent dataset into the vector `bws3rsp.var`

, which will be assigned to the argument `response`

of `bws3.dataset()`

:

```
## [1] "B1" "W1" "B2" "W2" "B3" "W3" "B4" "W4" "B5" "W5" "B6" "W6"
## [13] "B7" "W7" "B8" "W8" "B9" "W9" "B10" "W10" "B11" "W11" "B12" "W12"
```

According to the settings mentioned above, the following code chunk creates the dataset for the maxdiff model (note the argument `model = "maxdiff"`

):

```
bws3dat <- bws3.dataset(
data = bws3rsp,
response = bws3rsp.var,
choice.sets = bws3dsgn,
categorical.attributes = c("chore", "food", "outdoor"),
continuous.attributes = c("fee"),
optout = TRUE,
asc = c(0, 0, 0, 1),
model = "maxdiff")
```

In this example, each question contains four alternatives, including the none-of-these option, and thus the number of possible pairs per question is 12 (\(= 4 \times 3\)). Accordingly, the resultant dataset consists of 14,400 rows (= 100 respondents \(\times\) 12 questions \(\times\) 12 possible pairs per question):

`## [1] 14400 24`

The 24 variables, corresponding to the columns of the dataset, are as follows:

```
## [1] "id" "BLOCK" "QES" "PAIR" "BEST" "WORST" "RES.B"
## [8] "RES.W" "RES" "ASC1" "ASC2" "ASC3" "ASC4" "milking"
## [15] "feeding" "nursing" "butter" "ice" "caramel" "horse" "tractor"
## [22] "cow" "fee" "STR"
```

Among these variables, the variables `RES`

and `ASC4`

to `STR`

are used for the modeling analysis explained below. The following shows the first 12 rows, which correspond to the 12 possible pairs (`PAIR`

= `1`

to `12`

) in the first question (`QES`

= `1`

) for respondent 1 (`id`

= `1`

):

```
## id QES PAIR RES ASC4 milking feeding nursing butter ice caramel horse
## 1 1 1 1 0 0 1 -1 0 -1 1 0 0
## 2 1 1 2 0 0 1 -1 0 0 0 0 0
## 3 1 1 3 0 -1 1 0 0 0 1 0 0
## 4 1 1 4 0 0 -1 1 0 1 -1 0 0
## 5 1 1 5 0 0 0 0 0 1 -1 0 0
## 6 1 1 6 0 -1 0 1 0 1 0 0 0
## 7 1 1 7 0 0 -1 1 0 0 0 0 0
## 8 1 1 8 0 0 0 0 0 -1 1 0 0
## 9 1 1 9 0 -1 0 1 0 0 1 0 0
## 10 1 1 10 0 1 -1 0 0 0 -1 0 0
## 11 1 1 11 1 1 0 -1 0 -1 0 0 0
## 12 1 1 12 0 1 0 -1 0 0 -1 0 0
## tractor cow fee STR
## 1 0 0 -3000 1010
## 2 -1 1 3000 1010
## 3 0 1 6000 1010
## 4 0 0 3000 1010
## 5 -1 1 6000 1010
## 6 0 1 9000 1010
## 7 1 -1 -3000 1010
## 8 1 -1 -6000 1010
## 9 1 0 3000 1010
## 10 0 -1 -6000 1010
## 11 0 -1 -9000 1010
## 12 -1 0 -3000 1010
```

The variable `RES`

, which will be used as a response variable in the modeling analysis below, takes the value of 1 if the corresponding pair is selected by the respondent, 0 otherwise. As such, we can understand that respondent 1 selected the pair corresponding to the 11th row in the first BWS question. Although four ASCs (i.e., `ASC1`

, `ASC2`

, `ASC3`

, and `ASC4`

; see the output from `names()`

mentioned above) are created by the function `bws3.dataset()`

in this example, only the variable `ASC4`

will be used in the following analysis. This is because the function `bws3.dataset()`

creates ASCs according to the length of a vector assigned to the argument `asc`

. This example assigned a vector with four elements; as such four ASCs are included in the resultant dataset. The ASCs, corresponding to the elements with the value of 1 in the vector assigned to the argument `asc`

, are used in the following analysis, while the remaining ASCs, which have only the value of 0 in all the observations (rows) of the resultant dataset, are not used. The nine variables `milking`

to `cow`

correspond to the dummy variables regarding three activity attributes. Note that although three dummy variables are created according to the three levels in each attribute, one of three dummy variables for each attribute is not used in the modeling analysis because the non-used one is treated as the reference level in the attribute. The variable `fee`

corresponds to the fee per person of the agritourism package. These variables regarding the four attributes are named according to the level names in the design assigned to the argument `choice.sets`

and will be used as the independent variables in the following analysis. The last variable `STR`

is a stratification variable that has different values depending on the combinations of respondents and BWS questions.

## 5.9 Measuring preferences

Finally, we fit the conditional logit model to the responses on the basis of the maxdiff model. The function `clogit()`

in **survival** is used to fit the models. Although this function has many arguments, only two are used in this example:

`formula`

: A model formula.`data`

: A data frame containing the variables used in the model formula.

The structure of the model formula is similar to that for a linear regression function, `lm()`

: the dependent variable is set on the left-hand side of `~`

, whereas the independent variables are set on the right-hand side according to a systematic component of the utility function to be fitted. Unlike `lm()`

, however, a stratification variable must be added to the end of the right-hand side of `~`

using the `strata()`

function. As shown above, the stratification variable used in `clogit()`

is automatically generated as the variable `STR`

and stored in the dataset created using the function `bws3.dataset()`

. Three level variables—`nursing`

, `caramel`

, and `cow`

—are omitted from the model as those are assumed as the reference in the corresponding attributes (see the systematic component of the utility function for this example mentioned in Section 5.7). Therefore, the model formula for `clogit()`

is described (and stored as an object `bws3mf`

) as:

and the model is fitted using `clogit()`

with the dataset `bws3dat`

as follows:

```
## Call:
## clogit(formula = bws3mf, data = bws3dat)
##
## coef exp(coef) se(coef) z p
## ASC4 -8.98e-01 4.07e-01 9.83e-02 -9.14 < 2e-16
## milking 4.35e-01 1.55e+00 7.10e-02 6.13 8.6e-10
## feeding -1.76e-01 8.39e-01 6.17e-02 -2.85 0.0044
## butter 9.73e-01 2.65e+00 7.26e-02 13.40 < 2e-16
## ice 1.79e-01 1.20e+00 6.11e-02 2.93 0.0034
## horse 1.10e+00 3.01e+00 7.27e-02 15.14 < 2e-16
## tractor 1.34e-01 1.14e+00 6.22e-02 2.16 0.0308
## fee -2.97e-04 1.00e+00 1.29e-05 -23.03 < 2e-16
##
## Likelihood ratio test=1087 on 8 df, p=<2e-16
## n= 14400, number of events= 1200
```

In this output, the column `coef`

contains the estimated coefficients, column `exp(coef)`

is the exponential function of the estimated coefficients, column `se(coef)`

is the standard error of the estimated coefficients, and columns `z`

and `p`

give the \(z\)-value and \(p\)-value under the null hypothesis, in which the coefficient is zero. According to our model assumption on references for dummy variables regarding the three activity attributes, tourists’ preference for feeding is smaller than that for nursing (the reference in the hands-on farm chore), while that for milking is larger than that for the reference; those for butter and ice cream makings are larger than that for creamy caramel making (the reference in the hands-on food processing); and those for horse and tractor ridings are larger than that for walking with cows (the reference in the outdoor recreation). The negative estimate for fee means that tourists prefer an agritourism package with a lower fee.

The summary statistics for the model fitted with `clogit()`

can be calculated using the function `gofm()`

in **support.CEs**:

```
##
## Rho-squared = 0.182222
## Adjusted rho-squared = 0.179539
## Akaike information criterion (AIC) = 4893.04
## Bayesian information criterion (BIC) = 4933.76
## Number of coefficients = 8
## Log likelihood at start = -2981.89
## Log likelihood at convergence = -2438.52
```

Further, when using `clogit()`

for fitting the conditional logit model, marginal willingness to pay (MWTP) for the levels can be calculated with the function `mwtp()`

in **support.CEs**. Three arguments are used for this example:

`output`

: An output from`clogit()`

.`monetary.variables`

: A character vector containing the monetary variable.`nonmonetary.variables`

: A character vector containing the non-monetary variables used to calculate the MWTPs.

The command for this example is given as:

```
mwtp(
output = bws3md.cl,
monetary.variables = c("fee"),
nonmonetary.variables = c("milking", "feeding", "butter",
"ice", "horse", "tractor"))
```

```
##
## MWTP 2.5% 97.5%
## milking 1465.6 1016.3 1919.8
## feeding -591.9 -1006.7 -185.6
## butter 3275.9 2805.9 3765.3
## ice 603.4 194.6 1010.5
## horse 3705.9 3221.4 4200.7
## tractor 451.9 40.3 850.1
##
## method = Krinsky and Robb
```

According to the 95% confidence intervals for the MWTPs, MWTPs of activities milking, butter, ice, horse, and tractor are significant positives at 5% level, while that of feeding is a significant negative at 5% level.

### References

Aizaki, H. 2012. “Basic Functions for Supporting an Implementation of Choice Experiments in R.” *Journal of Statistical Software, Code Snippets* 50 (2): 1–24. https://doi.org/10.18637/jss.v050.c02.

Aizaki, H. 2019b. *Support.BWS3: Tools for Case 3 Best-Worst Scaling*. https://CRAN.R-project.org/package=support.BWS3.

Croissant, Y. 2020b. “Estimation of Random Utility Models in R: The Mlogit Package.” *Journal of Statistical Software* 95 (11): 1–41. https://doi.org/10.18637/jss.v095.i11.

Grömping, U. 2018. “R Package DoE.base for Factorial Experiments.” *Journal of Statistical Software* 85 (5): 1–41. https://doi.org/10.18637/jss.v085.i05.

Hess, S., and D. Palma. 2019. “Apollo: A Flexible, Powerful and Customisable Freeware Package for Choice Model Estimation and Application.” *Journal of Choice Modelling* 32: 100170. https://doi.org/10.1016/j.jocm.2019.100170.

Hess, S., and D. Palma. 2019. “Apollo: A Flexible, Powerful and Customisable Freeware Package for Choice Model Estimation and Application.” *Journal of Choice Modelling* 32: 100170. https://doi.org/10.1016/j.jocm.2019.100170.

*Apollo: A Flexible, Powerful and Customisable Freeware Package for Choice Model Estimation and Application*. Choice Modelling Centre. http://www.ApolloChoiceModelling.com.

Lancsar, E., J. Louviere, C. Donaldson, G. Currie, and L. Burgess. 2013. “Best Worst Discrete Choice Experiments in Health: Methods and an Application.” *Social Science & Medicine* 79: 74–82. https://doi.org/10.1016/j.socscimed.2012.10.007.

Louviere, J. J., T. N. Flynn, and A. A. J. Marley. 2015. *Best–Worst Scaling: Theory, Methods and Applications*. Cambridge, UK: Cambridge University Press.

Marley, A. A. J., and D. Pihlens. 2012. “Models of Best-Worst Choice and Ranking Among Multiattribute Options (Profiles).” *Journal of Mathematical Psychology* 56: 24–34. https://doi.org/10.1016/j.jmp.2011.09.001.

R Core Team. 2021a. *An Introduction to R*. https://CRAN.R-project.org/manuals.html.

R Core Team. 2021b. *R: A Language and Environment for Statistical Computing*. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.

Sailer, M. O. 2013. *Crossdes: Construction of Crossover Designs*. https://CRAN.R-project.org/package=crossdes.

Sarrias, M., and R. Daziano. 2017. “Multinomial Logit Models with Continuous and Discrete Individual Heterogeneity in R: The Gmnl Package.” *Journal of Statistical Software, Articles* 79 (2): 1–46. https://doi.org/10.18637/jss.v079.i02.

Therneau, T. M. 2020. *Survival: Survival Analysis*. https://CRAN.R-project.org/package=survival.