Competing Interests: The authors declare no conflicts of interest.

One of the major goals of basic studies in psychiatry is to find etiological
mechanisms or biomarkers of mental disorders. A standard research strategy to
pursue this goal is to compare observations of potential factors from patients
with those from healthy controls. Classifications of individuals into patient
and control groups are generally based on a diagnostic system, such as the

One of the major challenges of basic research in psychiatry is to find etiological
mechanisms or biomarkers that can be helpful for investigating the treatment or
predicting the prognosis of a mental disorder. A standard research strategy for
achieving this goal is as follows. Researchers first classify the individuals into a
clinical population (patient group) and a nonclinical population (control group).
This classification is generally based on the current diagnostic systems, such as
the

Several methodological flaws have been recognized in the conventional category-based
approaches (e.g., Cuthbert & Kozak,

To overcome the aforementioned problems, the U.S. National Institute of Mental Health
(NIMH) established the Research Domain Criteria (RDoC; Cuthbert,

In addition, the RDoC approach does not treat symptoms as a syndrome (a group of
signs and symptoms that occur together) as the category-based approach does. The
RDoC approach encourages the investigation to focus on minimum behavioral elements
and underlying mechanisms. In this regard, the spirit of RDoC includes that of
cognitive neuropsychiatry, which is the field that attempts to understand symptoms
in mental disorders as aberrations of cognitive functions (David & Halligan,

However, how RDoC can improve research in psychiatry remains a matter of debate.
Although there are methodological flaws in the current diagnostic systems (

In this study, we propose a theoretical framework for examining how effective
research strategies, including those encouraged by RDoC, are in basic research in
psychiatry. The proposed framework evaluates how effectively each method finds a
target pathogenetic factor given some situational settings. In this framework, we
first construct a formal, hypothetical generative model of symptoms and their
causes. From the generative model, synthesized samples are generated. Then, we
evaluate how likely several research methodologies are to detect the true cause of a
symptom or pathological behavior from observed data based on standard statistical
methods, including

One issue that our framework addresses is whether a mental disorder is best
investigated using categorical or dimensional approaches (First,

The remainder of this article is organized as follows. First, we formally explain the proposed framework. Next, we provide some simulation results that highlight the properties of the proposed framework. Finally, we discuss the implications of the results and the limitations of the proposed framework.

Here we formally describe the proposed framework. We assume that there are _{j}.
All the pathogenetic factors are summarized as a column vector: _{1}, … , _{N}]^{T},
where ⋅^{T} denotes the transpose. Here the pathogenetic factors may include specific
alleles or brain connectivity, which can be predictors of risk. The factors may also
include the dysregulation of a neuromodulator or neurotransmitter, which can be a
target of medical treatment, as well as the environmental and social milieu. The
measurement of the value of variable _{j} is denoted as

We consider _{1}, … , _{M}]^{T}. Here the behavioral observations include
self-reported symptoms and signs that are used in

The next step is to define the generative model, which represents how the
pathogenetic factors are translated to behavioral observations. We assume that
the pathogenetic factors

In the following simulations and analyses (Cases 1–5), we consider a simple model
with linear mapping and Gaussian noise, called the linear Gaussian model. The
mathematical formulation of this model is presented in Katahira and Yamashita
(_{i} is the linear combination of _{i}. From this assumption, each _{i} also obeys
the Gaussian distribution, which means that behavioral
observations are continuous variables. Many inclusion criteria in the current
diagnostic systems (i.e., _{i} may be interpreted as a behavioral phenotype, based on which a
psychiatrist or a patient makes decisions regarding each symptom rather than the
symptom itself. In Case 6, rather than a linear Gaussian model, a reinforcement
learning model is used as a generative model for behavioral observations related
to psychosis.
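As a minimal sketch of the linear Gaussian model described above, the following Python snippet draws synthetic individuals. The names (`sample_linear_gaussian`, `weights`, `noise_sd`), the example weight values, and the zero-mean, unit-variance factors are illustrative assumptions rather than the notation or settings of the original formulation; the independence of factors and of noise terms follows the simplification acknowledged in the discussion of this article.

```python
import numpy as np

def sample_linear_gaussian(n_subjects, weights, noise_sd, rng):
    """Draw pathogenetic factors and behavioral observations (illustrative)."""
    weights = np.asarray(weights, dtype=float)  # shape: (observations, factors)
    n_obs, n_factors = weights.shape
    # Assumption: factors are independent standard normal variables.
    factors = rng.standard_normal((n_subjects, n_factors))
    # Assumption: independent Gaussian noise with one common standard deviation.
    noise = noise_sd * rng.standard_normal((n_subjects, n_obs))
    # Each observation is a linear combination of the factors plus noise.
    observations = factors @ weights.T + noise
    return factors, observations

rng = np.random.default_rng(0)
# Hypothetical mapping: observation 0 depends on factor 0 only,
# observation 1 mixes both factors equally.
factors, observations = sample_linear_gaussian(
    1000, [[1.0, 0.0], [0.5, 0.5]], noise_sd=1.0, rng=rng
)
```

Because the observations are sums of Gaussian variables, they are themselves Gaussian and continuous, which is the property the text relies on.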

In the proposed framework, the category-based approach first classifies the
individuals into the patient group or the control group, depending on the values
of their behavioral observation _{i} for all _{i} (_{i} ≥ _{i} ∀_{i}), the individual is
classified into the patient group (in _{i} < _{i} ∃_{i}) are classified into the control
group (the individuals indicated with gray dots in _{i} = _{i} unless otherwise stated.

The category-based approach seeks the pathogenetic factors that significantly
differ between the two groups. The estimated or measured value of _{j}, which is
denoted by _{j} to the true value _{j} (see Katahira & Yamashita,

In the simulation, the samples of subjects (_{1} subjects from the control group and _{2} subjects from the patient group, resulting in a total of _{1} + _{2} = _{j} is considered to be a pathogenetic factor relevant to the mental
disorder. When multiple candidate factors are submitted to statistical tests, a
correction should be made for multiple comparisons (e.g., Bonferroni correction)
to suppress family-wise error rates. However, for simplicity, we do not perform
such a correction in this article. Incorporating a correction is straightforward
and does not change the qualitative pattern of the results reported herein.
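The category-based procedure described above can be sketched as a small Monte Carlo simulation. This is an illustration under stated assumptions, not the authors' implementation: the function name `category_power`, the parameter values, the use of true (noise-free) factor values as measurements, and Welch's t test as the group comparison are all assumptions, because the text here does not specify them.

```python
import numpy as np
from scipy import stats

def category_power(weights, thresholds, target, n_per_group,
                   noise_sd=1.0, alpha=0.01, n_trials=200,
                   pool_size=2000, seed=0):
    """Monte Carlo power of a patient-versus-control comparison (illustrative)."""
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)        # (observations, factors)
    thresholds = np.asarray(thresholds, dtype=float)  # one cutoff per observation
    n_obs, n_factors = weights.shape
    hits = 0
    for _ in range(n_trials):
        # Hypothetical population: independent standard normal factors.
        x = rng.standard_normal((pool_size, n_factors))
        y = x @ weights.T + noise_sd * rng.standard_normal((pool_size, n_obs))
        # Patient: every observation reaches its cutoff; control: otherwise.
        is_patient = np.all(y >= thresholds, axis=1)
        patients = x[is_patient, target][:n_per_group]
        controls = x[~is_patient, target][:n_per_group]
        if len(patients) < n_per_group or len(controls) < n_per_group:
            raise ValueError("pool_size too small for the requested group size")
        # Assumption: Welch's t test stands in for the unspecified test.
        _, p_value = stats.ttest_ind(patients, controls, equal_var=False)
        hits += int(p_value < alpha)
    return hits / n_trials

# One observation driven by factor 0; factor 1 is irrelevant by construction.
power_relevant = category_power([[1.0, 0.0]], [1.0], target=0, n_per_group=20)
power_irrelevant = category_power([[1.0, 0.0]], [1.0], target=1, n_per_group=20)
```

In this sketch the rejection rate for the irrelevant factor estimates the false-positive rate, which should stay near the chosen significance level, whereas the rate for the relevant factor estimates the statistical power.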

When more than one pathogenetic factor can be observed at the same time, one may
use logistic regression, in which the objective variable is diagnostic category
(0 control, 1 patient) and the regressors (predictors) are the observed values
of potential pathogenetic factors

The dimensional approach addresses the behavioral observations, including
symptoms and pathogenetic factors, without categorical labels. This approach
utilizes the natural variation of the population. RDoC basically encourages the
dimensional approach. In the proposed framework, the dimensional approach is
simulated by sampling _{i} and _{j} is deemed to be a factor that is relevant to the
behavioral observation _{i}.
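A comparable sketch for the dimensional approach draws subjects from the whole population without inclusion criteria and tests the correlation between one factor and one behavioral observation. The function name `dimensional_power`, the parameter values, and the use of true factor values as measurements are illustrative assumptions; the Pearson correlation test is used here as one plausible reading of "correlation" in the text.

```python
import numpy as np
from scipy import stats

def dimensional_power(weights, observation, target, n_subjects,
                      noise_sd=1.0, alpha=0.01, n_trials=200, seed=0):
    """Monte Carlo power of a correlation test (illustrative)."""
    rng = np.random.default_rng(seed)
    weights = np.asarray(weights, dtype=float)  # (observations, factors)
    n_obs, n_factors = weights.shape
    hits = 0
    for _ in range(n_trials):
        # No inclusion criteria: subjects come from the whole population.
        x = rng.standard_normal((n_subjects, n_factors))
        y = x @ weights.T + noise_sd * rng.standard_normal((n_subjects, n_obs))
        # Test whether the factor correlates with the behavioral observation.
        _, p_value = stats.pearsonr(x[:, target], y[:, observation])
        hits += int(p_value < alpha)
    return hits / n_trials

# Factor 0 drives the observation; factor 1 is irrelevant by construction.
power_relevant = dimensional_power([[1.0, 0.0]], observation=0, target=0, n_subjects=40)
power_irrelevant = dimensional_power([[1.0, 0.0]], observation=0, target=1, n_subjects=40)
```

Unlike the category-based sketch, no subject is discarded here, so the test uses the natural variation of the full sample.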

When more than one potential pathogenetic factor can be observed at the same
time, one may use multiple linear regression, in which the objective variable is
each behavioral measure _{j} and the regressors are potential pathogenetic factors

RDoC is more than just a dimensional approach to mental disorders. It emphasizes constructs of mental disorders that are grounded in neurobiology. In addition, RDoC encourages investigations that target multiple units of analysis. The targeted units of analysis can all be biological ones (e.g., “cells” and “circuits”). However, relating biological mechanisms to mental disorders requires, as a starting point, at least some knowledge that links biological factors to behavioral observations. Additionally, RDoC is intended to create a novel discrete category of mental disorder after sufficient research progress. Thus the dimensional approach in the present framework is intended to capture only one aspect of the RDoC approach, particularly at the beginning phase.

Below, we provide the results of simulations based on our framework. The first simple cases (Cases 1-1 and 1-2) demonstrate what types of suggestions the proposed framework can provide. The next four cases (Cases 2–5) highlight the basic theoretical properties of the proposed models. Readers not interested in the theoretical details may wish to skip these four cases. The model settings in Cases 1–5 are intended to illustrate the common structures of basic research in psychiatry rather than to model a specific disorder. In the final case (Case 6), we present a concrete example of considering a specific disorder (i.e., psychosis) by incorporating a computational model into the proposed framework.

In general, the statistical power monotonically increases as the sample size (the
number of subjects) increases (e.g., see

The first two cases (Cases 1-1 and 1-2) consider the basic problem in psychiatric
research. Consider that there are two disorder categories, which we call
Disorder A and Disorder B (e.g., depression and psychosis, respectively). For
simplicity, we assume that each disorder has two symptoms and that one of these
symptoms is shared by both disorders (_{2} represents the common symptom between the two disorders (e.g.,
anhedonia), and _{1} and _{3} are specific symptoms for each disorder. Assume that the goal of
the researcher is to detect the pathogenetic factor of the common symptom
_{2}. An individual is assumed to be diagnosed with Disorder A if
_{1} ≥ _{2} ≥ _{2} ≥ _{3} ≥ _{2} ≥ _{2} <

Here we consider two possible etiological mechanisms for this situation. Case 1-1
assumes that there are two distinct pathogenetic factors for each disorder
category (_{1} and _{2}. _{1}. In this case, the category-based analysis using only a single
criterion (Method 1) did not yield a good result: Half of the trials failed to
obtain the significant difference in ^{1}

The reason for the difference between the statistical powers of Method 1 and
Method 2 can be understood by checking the distributions of samples for each
group in each method (_{1} ≥ _{2} ≥ _{2} ≥ 1 but _{1} < _{1} compared to the individuals marked with orange filled circles.
Thus, Method 2, which employs only the individuals indicated with orange filled
circles, improves the discriminability compared to Method 1, which includes the
individuals marked with blue triangles (dashed lines in _{1} (_{1} (_{1}).

Method 3 is a diagnostic category-based approach that utilizes the information
about both Disorder A and Disorder B: Individuals in the patient group have
Disorder A but not Disorder B, whereas individuals in the control group have
neither disorder. Method 3 provided slightly higher power than Method 2
(_{1} compared to those with both disorders. If the individuals have
Disorder B, then they tend to have a higher value of _{2}. Having a larger common symptom value _{2} may be due to this higher value of _{2} and not due to _{1}. Excluding such individuals and choosing patients only with
Disorder A helps to illuminate the effect of _{1}.

Dimensional approaches, which test the correlation between _{2} (Method 4), yielded greater power than Method 1 but less power than
the two category-based approaches (Methods 2 and 3). Including both pathogenetic
factors in the explanatory variables of multiple regression increased the power
(Method 5), but it did not exceed the power of the two category-based
approaches.

Case 1-2, which assumes a one-to-one mapping from the pathogenetic factors to the
symptoms, produced quantitatively different results. In this case, there is one
distinct pathogenetic factor for each symptom. The use of the disorder
categories with multiple criteria did not increase the power. In contrast, the
analyses using double criteria (Methods 2 and 3) yielded lower power compared to
the category-based analysis using only a single criterion (_{2} ≥ _{2} ≥ _{1} < _{2}, which are comparable to those of individuals indicated with orange
dots. Thus including the subjects marked with blue triangles pulls the
distribution of _{2} of the control group (black solid line in _{2} (_{2} (_{2} (Method 4), yielded greater power than the category-based
approaches. Here including other variables in the multiple regression did not
increase the power (Method 5).

The implications of the results are as follows. The diagnostic categories based
on the syndrome can be useful for detecting the pathogenetic factor if the
factor is related to those symptoms (as in Case 1-1). However, if there is no
such one-to-many mapping, using multiple criteria is not useful: Using a single
observation is more efficient for detecting the underlying factor (as in Case
1-2). Among the methods based on a single behavioral observation, the
dimensional (correlation) approach provides greater statistical power (compare
Method 1 and Method 4 in

In Case 2, we compared the statistical powers of the category-based approach and
the dimensional (correlation) approach for the simplest case in which there is a
single pathogenetic factor (

_{1} is detected as a function of the total number of individuals. For
this case, the statistical powers of both methods can be analytically obtained
(see Katahira & Yamashita, _{1}, whereas the category-based approach ignores the information of the
distribution within the group. If there is a margin, then the category-based
approach can outperform the dimensional approach. With a larger margin, the
category-based approach can distinguish clusters in the distribution _{1} while suppressing the impact of the noise added to _{1}. However, note that with a larger margin, it becomes more difficult
to find samples for the control group.

In the preceding results, the cutoff point was fixed to

Note that the samples to be analyzed are different between the two approaches.
The dimensional approaches randomly draw samples from the population without any
inclusion criteria. In Case 2, if the same sample (which is sampled by the
category-based approach) was used for both of the approaches, then the
dimensional (correlation) approach always provided superior power to the
category-based approaches (

As shown in Case 1-1, if more than one symptom has a common pathogenetic factor,
then including such symptoms in a single disorder category for the
category-based approach improves the power for detecting the factor compared to
the approach based on a single symptom. In Case 3, we systematically examined
the effect of using multiple criteria that share a common pathogenetic factor.
_{1} is a factor relevant to the mental disorder and is of interest, and
_{2} is irrelevant to the mental disorder (_{1} for behavioral observation _{j}(_{2} is set to zero.

The standard deviation of the noise _{ϵ} and the number of behavioral observations _{ϵ} = 2.0).

Here the samples submitted to analysis are again different between the
category-based approach and the dimensional approach. If the same category-based
samples are submitted in both of the approaches, then the dimensional
(correlation) approach provides higher power than the category-based approach,
particularly when _{i}. The categorical analysis uses _{i}, but the effect is reduced when the number of

For the irrelevant factor _{2}, the fraction of the factor deemed significant was kept to the
preset significance level of 0.01 (

We now discuss the case in which a single behavioral observation _{i} is affected by
more than one pathogenetic factor _{j}. It is conceivable that a larger mixture degree leads to difficulty in
detecting each pathogenetic factor. For simplicity, we consider the case with
two pathogenetic factors and two behavioral observations (

The transformation matrix is parametrized using a parameter _{1} and _{2} equally contribute to both behavioral observations _{1} and _{2} (complete mixture). When _{1} and _{2} independently contribute to _{1} and _{2}, respectively (no mixture). When _{1} and _{2} have opposite effects on the behavioral measures (one has a
positive impact, whereas the other has a negative impact). The effect of

We consider two cases in the category-based approach: One uses only a single
behavioral observation _{1} as a criterion, and the other uses both behavioral observations.
The resulting statistical powers are plotted in _{1} ≥ _{j} when _{1} and _{2}; _{1} and _{2}. The increase in the power for positive _{1} is higher can easily be classified into the control group because
of the inhibition from other _{2}. Distributing subjects with higher values of _{1} into two groups makes the discrimination difficult, thus reducing
the statistical power.

The additional pathogenetic factor _{2} is added to _{1} when _{2} is detected as a relevant pathogenetic factor even when the single
criterion _{1} is used (

The effect of the mixture reported in Case 4 was not drastic because there were
only two pathogenetic factors (_{j}. We varied _{1}, which is assumed to be the target of analysis.

The results are presented in

Thus far, we have considered the situation in which researchers can measure only
a single pathogenetic factor (out of many factors) in one study. Thus we have
focused on the simple correlation, or equivalently, simple linear regression
analysis. However, large-scale psychiatry studies, such as genome-wide analysis,
may measure multiple factors (alleles) in a single study (e.g., Cross-Disorder
Group of the Psychiatric Genomics Consortium, _{1} is computed based on the null hypothesis that the regression
coefficient for _{1}, suppressing the influence of other confounding factors on the
behavioral observation. We also examined the intermediate situation between
single regression and full multiple regression: The case in which only _{1}) are available. The power for the case with

Thus far, we have considered simple linear Gaussian models as models of
generative processes of behavioral observations. In reality, the generative
processes must be more complex. Using computational models may provide an
explicit and more realistic form of the generative processes (Kurth-Nelson et
al.,

The details of the model are provided in Katahira and Yamashita (_{SDT}), which can cause DA activity at inappropriate times, and (2)
decreased tonic DA level (parametrized by

_{SDT}(= 0.4), allow the prediction error signal (denoted by

A negative symptom, a diminished engagement with high-cost activities, is
examined in the simulation as follows (for details, see Katahira &
Yamashita,

_{SDT} has a direct positive influence on the development of aberrant
valuation of thoughts (dominance of a thought), whereas the tonic DA level
_{SDT} is, the less the individuals tend to show diminished engagement
with high-cost activities (

The statistical power of the approaches based on the aberrant valuation of
thoughts is relatively weak compared to that based on diminished engagement with
high-cost activities. This result occurs because the relation between the
probability of DA transients _{SDT} and the emergence of dominant thought are highly stochastic: Even
when _{SDT} is very large, more than 20% of individuals do not show an
aberrant valuation of thoughts (

Note that this simulation is intended only as a simple illustration. In reality, psychosis (including schizophrenia) must be a more complex disorder that may involve several etiological mechanisms. Nevertheless, the present demonstration illustrates the first step to incorporating computational models into the proposed framework to evaluate a psychiatric research strategy.

In this article, we proposed a novel framework for discussing the effectiveness of research strategies in psychiatry. We demonstrated the basic features of the framework considering simple cases. There are many discrepancies between the assumptions of the simple cases and the realistic situations. Before discussing the discrepancies, we discuss the implications derived from the analyses of the model properties.

The results of the computer simulations highlight the effectiveness of
dissociating a behavioral measure from other behavioral phenotypes that reflect
different pathophysiological states. If one uses a diagnostic category whose
criteria contain symptoms that can arise from different pathogenetic factors,
then the chance of finding the corresponding factor is lowered compared with
when one classifies the subjects based on a single symptom separately (as in
Cases 1-2 and 3). When two factors have opposite effects on the symptoms (i.e.,
one factor influences one symptom agonistically, another symptom
antagonistically), the statistical power would be weak if both symptoms were
used to classify the individuals (Case 4, when the mixture parameter

Meanwhile, the behavioral observations can be contaminated with noise, including errors in the subjective report and individual differences in the reactivity to pathogenetic factors. By using the behavioral observations that share common pathogenetic factors to define the category, the category-based approach can reduce the effect of the noise. Consequently, increasing the number of independent criteria can reduce the impact of such noise and make the detection of the pathogenetic factors easier, given that the errors are mutually independent (as observed in Cases 1-1 and 3).

Therefore, in some cases, the conventional diagnostic category-based approach could be more efficient in detecting a pathogenetic factor than the dimensional (RDoC) approach (as in Case 3). Which approach is better depends on the case. The proposed model provides a promising approach for designing an efficient research strategy to investigate a specific target. By incorporating detailed generative models of psychiatric diseases, the researcher can determine the better research strategy, as we demonstrated in Case 6, which suggests that separating the positive symptoms and negative symptoms is better than treating them as symptoms of a single disorder category (e.g., psychosis).

Nevertheless, the proposed framework has several limitations. The results and their implications strongly depend on the model assumptions. If invalid models are used, then the simulation may recommend a suboptimal or even counterproductive strategy. For example, the choice of the population distribution of model parameters (e.g., the DA parameters in Case 6, which we set arbitrarily) might quantitatively change the results. Although the proposed method provides a quantitative prediction, that is, statistical power, conclusions are better kept qualitative. For example, one may conclude that separating the positive and negative symptoms is better for finding the pattern of aberrant DA activity, but one should not rely on the specific value of the power to determine the number of subjects needed to obtain statistically significant results (as an ordinary power analysis does). Seeking an appropriate model of mental disorders is itself a challenging task that falls within the scope of computational psychiatry.

We have primarily considered the linear Gaussian model as a generative model.
This model assumes that the variables take continuous values and obey a Gaussian
distribution. Although this assumption makes the theoretical analysis easier, it
is an obvious oversimplification. For example, consider a genetic mutation as a
pathogenetic factor. The presence or absence of an allele is represented as a
categorical variable. The behavioral measure or symptom can also be categorical
(e.g., the existence or absence of a specific symptom). The distribution of
scores for some symptom ratings can be best explained using an exponential
distribution with a cutoff (Melzer, Tom, Brugha, Fryers, & Meltzer,

Another drastic simplification in the present model is the assumption of
independence among errors and also among pathogenetic factors. In realistic
situations, there may be considerable correlations among them. A second-order
correlation can be modeled using a multivariate Gaussian distribution, which is
a simple extension of the current model. However, there may be a higher order
correlation, for example, in gene expression (Nakahara, Nishimura, Inoue, Hori,
& Amari,

As we have emphasized, the category-based approach and dimensional approach
differ in how they sample the subjects. The category-based approach uses
diagnostic criteria for selecting the subjects (it samples the “patients”),
whereas the dimensional approach is assumed to gather subjects without prior
criteria. As shown in Case 2, when the same data (those gathered by the
category-based approach) are used, then the dimensional (correlation) approach
tends to provide greater statistical power compared to the categorical approach
(

On the basis of the proposed framework, one can optimize the inclusion criteria (cutoff point) of specific disorders. As shown in Case 2, the stricter the inclusion criteria of the patient group are, the higher the statistical power is, given our assumptions. However, there is a trade-off between the power and the difficulty of finding samples: Strict criteria make “patient” individuals difficult to find. The proposed framework can help determine the criteria in light of this trade-off for specific situations. Note that this is an issue for basic research. For practical clinical applications, the optimal criteria would differ. A treatment can be effective even for individuals with modest symptoms. For clinical diagnosis, the optimal boundary should depend on the treatment response rather than be based solely on the statistical power.

We demonstrated how to incorporate a computational model into the proposed
framework by using a reinforcement learning model with dopamine dysfunction.
Candidate mathematical or computational models can span Marr’s three levels:
computational, algorithmic, and implementational (Kurth-Nelson et al.,

Currently the most successful applications of computational approaches to
psychiatry may be those based on fitting models to behavior and/or neural
activities. For example, model parameters that are fit to the subject’s behavior
(e.g., Ahn et al.,

Recently, Flagel et al. (

The Bayesian integrative framework shares a common feature with our framework: Both frameworks use mathematical or computational models to model the generative processes of mental disorders. However, the goals of the two frameworks differ. The Bayesian integrative framework is designed to analyze individual data (including the diagnosis): It infers the latent cause of the disorder of the individuals through Bayesian inference. The Bayesian integrative framework is intended also to be used by clinicians to gather data for improving the model and nosology. The ultimate goals are aiding the diagnosis, prognosis, prevention, and treatment of mental disorders. In contrast, the scope of our framework is basic research strategies in psychiatry rather than a clinical use. In addition, the target of modeling in our framework is the population rather than the individual. Our framework is concerned with the sampling method of subjects, whereas the Bayesian integrative framework does not explicitly deal with the sampling procedure, at least currently. Thus our framework cannot provide useful information about the cause of the disorder of an individual patient, whereas the Bayesian integrative framework would provide such information.

The statistical methods considered in the two frameworks also differ. The Bayesian integrative framework uses the Bayesian method, as its name suggests, whereas our framework is formulated with classical hypothesis testing in mind, although replacing it with Bayesian hypothesis testing is straightforward. Whereas the Bayesian approach is more flexible, the classical statistical framework is more prevalent in psychiatric studies. The relation between the Bayesian integrative framework and our framework is complementary rather than competitive. Our framework would be useful in choosing the research strategy in standard psychiatric studies. As sufficient knowledge is accumulated, more detailed models and precise data become available. Submitting these models and data to the Bayesian integrative framework would help further refine the model and predict treatment outcomes for individual patients.

Psychiatry targets extremely complex processes and systems, that is, mental processes and the brain. Many factors are involved in these processes and systems. Accordingly, there should be various research strategies in psychiatry, as well as in neuroscience and psychology. A framework for evaluating the research strategies is required. Discussion at the verbal description level is limited because the target system is very complex and may not be fully described verbally. Thus computational and mathematical models could play important roles. Although there is plenty of room for modifications, the present study is a first step toward such theoretical evaluations. Our study also provides an avenue via which computational approaches can contribute to psychiatric research.

K.K. and Y.Y. conceived and designed research; K.K. conducted simulations and analysis; K.K. and Y.Y. wrote the paper.

This work was supported by JSPS KAKENHI grants JP15K12140, JP25330301, JP17H05946, and JP17H06039 and by JST CREST grant JPMJCR16E2.

The authors thank Tsukasa Okimura, Yoshihiko Kunisato, and Asako Toyama for their helpful discussion and constructive comments on this study.

The Monte Carlo simulation to obtain the power was performed 100,000 times for each condition. We confirmed that the confidence intervals of the power estimate were less than 0.01 (the confidence intervals are drawn on the figure, although they are almost invisible). Thus the estimated powers are highly reliable.
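The reliability statement above can be checked with a short calculation. The sketch below assumes a 95% normal-approximation (Wald) interval for a binomial proportion; this is an assumption made here for illustration, because the note does not state how the intervals were computed, and `power_ci_width` is a hypothetical helper name.

```python
import math

def power_ci_width(power, n_trials, z=1.96):
    """Full width of a normal-approximation interval for a power estimate."""
    # Standard error of a proportion estimated from n_trials Bernoulli trials.
    standard_error = math.sqrt(power * (1.0 - power) / n_trials)
    return 2.0 * z * standard_error

# Worst case: the variance power * (1 - power) is largest at power = 0.5.
widest = power_ci_width(0.5, 100_000)
```

Under these assumptions the widest possible interval is roughly 0.006 wide, which is consistent with the stated bound of 0.01 for every power value.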

The additional file for this article can be found as follows:

Supplementary Material. DOI: