1 Introduction

Paranoid delusions are commonly defined as unfounded beliefs that others intend to deliberately cause harm (), and they are a frequent symptom of early psychosis, occurring in about 50–70% of first-episode psychosis (FEP) patients (; ; ). While paranoid delusions are a key symptom of schizophrenia, they are also present in the general population (; ) and are frequently reported in other psychotic disorders and in affective disorders such as bipolar disorder and depression (). Importantly, paranoid delusions place a heavy burden on those afflicted by them, as they are associated with more frequent suicidal ideation in the general population () and a higher suicide risk in patients (; ).

Despite an urgent clinical need to address these symptoms, the emergence and consolidation of paranoid delusions remain a subject of debate. Recent cognitive theories suggest that ‘aberrant salience’ caused by overly precise prediction errors (PEs) – possibly mediated through dopaminergic signaling – leads to an uncertain model of the world, providing a breeding ground for delusions to form (; ; ; ; ). It has been proposed that these overly precise PEs could then be explained away by adopting more abstract, higher-order beliefs that may take the form of delusions (; ; ).

Here, we pursue a Bayesian approach that enables us to formalize the concept of ‘aberrant salience’. We will first discuss ‘aberrant salience’ in a non-hierarchical framework and then proceed to a hierarchical framework using a hierarchical Bayesian model of learning (, ) to derive competing computational mechanisms that are tested in this study.

When adopting a Bayesian framework, ‘aberrant salience’ can be understood as reduced uncertainty (i.e., variance) or increased precision (the inverse of uncertainty) that up-weights incoming sensory information (; ; ; ; ; ). In a non-hierarchical model, ‘aberrant salience’ would be expressed as relatively increased precision of the likelihood or reduced precision of the prior distribution (e.g., see Sterzer et al. ()).
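For the Gaussian case, this precision-weighting can be made concrete with a short sketch (illustrative Python, not part of the study's analysis code; all values are arbitrary): the posterior mean is a precision-weighted average of prior and observation, so inflating the likelihood precision up-weights the sensory input.

```python
# Illustrative sketch: precision-weighted Bayesian update for a Gaussian
# prior and likelihood. Increasing the likelihood precision pulls the
# posterior further toward the sensory input ('aberrant salience').

def posterior_mean(mu_prior, pi_prior, x_obs, pi_like):
    """Precision-weighted average of prior mean and observation."""
    return (pi_prior * mu_prior + pi_like * x_obs) / (pi_prior + pi_like)

mu0, x = 0.0, 1.0  # prior mean and observed input (arbitrary values)

balanced = posterior_mean(mu0, pi_prior=1.0, x_obs=x, pi_like=1.0)
aberrant = posterior_mean(mu0, pi_prior=1.0, x_obs=x, pi_like=4.0)

print(balanced, aberrant)  # the shift toward x is larger when pi_like is high
```

The same qualitative effect is obtained by lowering the prior precision instead, which is why the two accounts are formally two sides of the same precision ratio.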

However, Fletcher and Frith (), for example, have argued that beliefs may be better conceptualised hierarchically. Assuming a hierarchical structure of beliefs, in which the lower level corresponds to beliefs about sensory information and the higher level to beliefs about the volatility of the environment, and further assuming that beliefs can be expressed as Gaussian distributions, ‘aberrant salience’ can be viewed as a ratio of the precisions associated with beliefs about sensory inputs and with high-level beliefs (; ; ). An increase in this precision ratio results in exaggerated belief updates, or ‘aberrantly salient’ PEs. From here on, when we speak of high-level beliefs, we refer to beliefs about volatility.

In line with this literature, we have recently derived different hypotheses about the emergence of delusions based on simulations () using the Hierarchical Gaussian Filter (HGF; Mathys et al. (, )). Specifically, we hypothesised that different stages of psychosis may be associated with different computational mechanisms. Experiences of ‘aberrant salience’ in prodromal stages can be expressed computationally as overly precise prediction errors (i.e., an increased learning rate). In the HGF, an increased learning rate results from either (1) increased precision associated with incoming sensory prediction errors, (2) reduced precision of high-level beliefs about the volatility of the environment, or (3) a combination of the two. Furthermore, in line with others (; ), we speculated that delusional conviction during later stages of psychosis may be accompanied by a compensatory increase in the precision of high-level beliefs about volatility, which functions to explain away overly precise prediction errors. This increase in high-level belief precision may render beliefs resistant to contradictory evidence and culminate in delusional conviction. Here, we test these hypotheses and investigate the computational mechanisms of emerging paranoia in early psychosis.

2 Methods

2.1 Participants

The sample comprised 19 individuals at clinical high risk for psychosis (CHR-P), 19 healthy controls (HC) who were group-matched to CHR-P with respect to age, gender, handedness, and cannabis consumption, and 18 short-term medicated FEP (medication duration: 5.44 ± 2.79 days; median: 6; range: [0, 10]), resulting in a total sample of N = 56 participants. FEP were recruited from inpatient care and the outpatient departments of the University Psychiatric Hospital (UPK) Basel, CHR-P were recruited from the Basel Early Treatment Service (BEATS), and HC were recruited via online advertisements and advertisements in public places (supermarkets, dentist clinics). All participants provided written informed consent. The study was approved by the local ethics committee (Ethikkommission Nordwest- und Zentralschweiz, no. 2017–01149) and conducted in accordance with the latest version of the Declaration of Helsinki.

2.2 In- and exclusion criteria

All participants were required to be at least 15 years old. Specific inclusion criteria for FEP were the diagnosis of a first psychotic episode of a schizophrenia spectrum disorder, which was assessed by the treating clinicians, and a treatment recommendation to begin antipsychotic medication issued independently of the study.

We included CHR-P who fulfilled either ultra-high-risk criteria for psychosis, i.e., one or more of the following: (1) attenuated psychotic symptoms (APS), (2) brief and limited intermittent psychotic symptoms (BLIP), or (3) a trait vulnerability in addition to a marked decline in psychosocial functioning, also referred to as genetic risk and deterioration syndrome (GRD), assessed with the Structured Interview for Prodromal Symptoms (SIPS; Miller et al. ()); or basic symptom criteria (; ), i.e., cognitive-perceptive basic symptoms (COPER) or cognitive disturbances (COGDIS), assessed with the Schizophrenia Proneness Instrument, adult version (SPI-A; Schultze-Lutter et al. ()) or the Schizophrenia Proneness Instrument, child and youth version (SPI-CY; Schultze-Lutter and Koch ()), by experienced clinical raters.

Exclusion criteria for all three groups were previous psychotic episodes, psychotic symptomatology secondary to an organic disorder, any neurological disorder (past or present), premorbid IQ <70 (assessed with the Mehrfachwahl-Wortschatz-Test, Version A; Lehrl et al. ()), colour blindness, substance use disorders according to ICD-10 criteria (except cannabis), alcohol or cannabis consumption within 24 hours prior to measurements, and regular drug consumption (except alcohol, nicotine, and cannabis), which was assessed during the admission interview and confirmed with a drug screening before the initial measurement (assessments were postponed following a positive test until a negative test result was obtained).

FEP whose psychotic symptoms were associated with an affective psychosis or a borderline personality disorder at the time of the measurement were excluded. Since the data were collected as part of a larger study that included neuroimaging assessments, additional exclusion criteria were contraindications for fMRI (for CHR-P and HC) and contraindications for EEG measurements (for all three groups). However, we present only behavioural results here.

2.3 Clinical assessment

Demographic and clinical information was assessed during an interview conducted within five days of the social learning task. This interview comprised an assessment of clinical symptoms using the Positive and Negative Syndrome Scale (PANSS; Kay et al. ()), administered by trained clinical raters, and a self-assessment of paranoid thoughts (frequency, conviction, and distress) using the Paranoia Checklist (PCL; ).

2.4 Task

All participants were asked to perform a deception-free and ecologically valid social learning task (Figure 1A) (; ), which required them to learn about the intentions of an adviser that changed over time. The task comprised two phases: in the first, stable phase, participants received stable, helpful advice, whereas the adviser's intentions changed more rapidly during the second, volatile phase (see volatility schedule in Figure 1B). On each trial, participants were asked to predict the outcome of a binary lottery. To this end, they received information from two sources: a non-social cue displaying the true winning probabilities of the lottery, and the recommendation of an adviser (social cue), presented in the form of prerecorded videos extracted from trials in which a human adviser either tried to help or to deceive a player in a previous human-human interaction (see Diaconescu et al. (, ) for more details).

Figure 1 

Social learning task and volatility schedule. A Social learning task. B Volatility schedule.

Participants were truthfully informed that the adviser received privileged – but not complete – information about the upcoming outcome, that inaccurate advice could be due either to mistakes or to the adviser pursuing a different agenda than the player, and that the adviser's intentions could change over the course of the experiment. We expected patients to be more sensitive to the increasing volatility of the task than HC.

2.5 Computational modelling

2.5.1 Hierarchical Gaussian Filter

We modelled participants’ behaviour during the social learning task with a 3-level HGF (, ). The model comprises a perceptual model and a response model, which will be detailed below.

Perceptual model The standard 3-level HGF assumes that participants infer on a hierarchy of hidden states in the world, x1, x2, and x3, which cause the sensory inputs that participants perceive (, ). A participant's inference on the true hidden state of the world xi(k) at level i of the hierarchy on trial k is denoted μi(k). In the context of this task, the states that participants need to infer from the experimental inputs on each trial (non-social cue and advice) are structured as follows: the lowest-level state corresponds to the advice accuracy. On each trial k, the advice can be either accurate (x1(k) = 1) or inaccurate (x1(k) = 0). This state can be described by a Bernoulli distribution that is linked to the state at the second level x2(k) through the unit sigmoid transformation:

p(x_1^{(k)} = 1 \mid x_2^{(k)}) = s(x_2^{(k)}) = \frac{1}{1 + \exp(-x_2^{(k)})}

x2(k) represents the unbounded tendency towards helpful advice (–∞, +∞), or the adviser's fidelity, and is specified by a normal distribution whose variance depends on the volatility state at the third level:

x_2^{(k)} \sim \mathcal{N}\left(x_2^{(k-1)},\; \exp(\kappa_2 x_3^{(k)} + \omega_2)\right)
The state at the third level x3(k) expresses the (log) volatility of the adviser's intentions over time and is also specified by a normal distribution:

x_3^{(k)} \sim \mathcal{N}\left(x_3^{(k-1)},\; \vartheta\right)
The dynamics of these states are governed by a number of subject-specific parameters: the evolution rate at the second level ω2; the coupling strength between the second and third level κ2, which determines the impact of the volatility of the adviser's intentions on the belief update at the level below; and the evolution rate at the third level, or meta-volatility, ϑ, which we fixed to a value of 0.5 to reduce the number of free parameters (see Table 1 for an overview of all parameters). Additional subject-specific free parameters were the prior expectations, before seeing any input, about the adviser's fidelity μ2(0) and the volatility of the adviser's intentions μ3(0). These parameters can be understood as an individual's approximation to Bayesian inference and provide a concise summary of a participant's learning profile. Using a variational approximation, efficient one-step update equations can be derived (see Mathys et al. (, ) for more details), which take the following form:

\mu_i^{(k)} = \hat{\mu}_i^{(k)} + \frac{\hat{\pi}_{i-1}^{(k)}}{\pi_i^{(k)}}\,\delta_{i-1}^{(k)}

where μi(k) is the expectation or belief at trial k and level i of the hierarchy, π̂i−1(k) is the precision (inverse of the variance) from the level below (the hat denotes that this precision has not yet been updated and is associated with the prediction before observing a new input), πi(k) is the updated precision at the current level, and δi−1(k) is a PE expressing the discrepancy between the expected and the observed outcome.
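The generic form of this one-step update can be sketched in a few lines (illustrative Python, not the Matlab HGF toolbox used in the study; parameter values are arbitrary). The point of the sketch is that the update at level i scales with the precision ratio between the level below and the current level, so reducing high-level precision inflates updates, consistent with the 'aberrant salience' account above.

```python
# Illustrative sketch of the one-step HGF belief update at level i:
# the new expectation moves away from the prediction in proportion to
# the ratio of lower-level to current-level precision, times the PE.

def hgf_update(mu_hat_i, pi_i, pi_hat_below, delta_below):
    """mu_i = mu_hat_i + (pi_hat_below / pi_i) * delta_below"""
    return mu_hat_i + (pi_hat_below / pi_i) * delta_below

delta = 0.5  # prediction error from the level below (arbitrary)

precise_high = hgf_update(0.0, pi_i=2.0, pi_hat_below=1.0, delta_below=delta)
imprecise_high = hgf_update(0.0, pi_i=0.5, pi_hat_below=1.0, delta_below=delta)

print(precise_high, imprecise_high)  # lower high-level precision -> larger update
```

The same prediction error thus produces a four-fold larger belief update when the high-level precision πi drops from 2.0 to 0.5, which is the mechanism behind hypotheses (1)-(3) above.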

Table 1

Model parameter overview. Prior parameter values were chosen based on the references indicated next to prior means and variances. aCole et al. (). bDiaconescu et al. (). cHauke et al. ().



| Parameter | Model | Prior mean | Prior variance | Estimation space | Fixed? |
|---|---|---|---|---|---|
| κ2 | Perceptual model | 0.5^a,b,c | 1^a,b,c | logit [0, 1] | |
| ω2 | Perceptual model | –2^b,c | 4^c | (–∞, +∞) | |
| ϑ | Perceptual model | 0.5^a,b,c | 0^c | logit [0, 1] | Yes^c |
| μ2(0) | Perceptual model | 0^a,b,c | 1^a,b,c | (–∞, +∞) | |
| σ2(0) | Perceptual model | 1^a,b,c | 0^a,c | log [0, +∞) | Yes^a,c |
| μ3(0) | Perceptual model | 1^a,b,c | 1^b,c | (–∞, +∞) | |
| σ3(0) | Perceptual model | 1^b,c | 0^c | log [0, +∞) | Yes^c |
| ζ | Response model | 0.5^b,c | 1^b,c | logit [0, 1] | |
| ν | Response model | 48^a,b,c | 1^a,b,c | log [0, +∞) | |

Additional parameters of the mean-reverting HGF:

| Parameter | Model | Prior mean | Prior variance | Estimation space | Fixed? |
|---|---|---|---|---|---|
| m3 | Perceptual model | 1^a,c | 1^a,c | (–∞, +∞) | |
| ϕ3 | Perceptual model | 0.1^a,c | 0^a,c | logit [0, 1] | Yes^a,c |

We also employed a second, modified version of the HGF (), which assumed that learning about an adviser's intentions was driven not only by hierarchical PE updates but also by a mean-reverting process at the third level, formalising the idea that an altered perception of volatility may underlie learning about others' intentions. In this mean-reverting HGF, the third level can again be described by a normal distribution:

x_3^{(k)} \sim \mathcal{N}\left(x_3^{(k-1)} + \phi_3\,(m_3 - x_3^{(k-1)}),\; \vartheta\right)
where ϕ3 represents a drift rate and m3 the equilibrium point towards which the state moves over time.

In this model, we fixed the drift rate ϕ3 to a value of 0.1 and estimated the equilibrium point m3 as a subject-specific free parameter. Note that setting m3 to values lower than the prior about the volatility of the adviser's intentions, μ3(0), translates into reduced belief updates at all three levels of the hierarchy, corresponding to perceiving the environment as increasingly stable over time (Figure 2). Conversely, if m3 > μ3(0), the magnitude of belief updates increases, in line with a perception that the environment is becoming increasingly volatile and that beliefs should thus be adjusted more rapidly. Lastly, if m3 = μ3(0), agents revert back to their prior beliefs about environmental volatility over time (i.e., they "forget" the observed inputs). For this reason, we refer to the model as a mean-reverting HGF, analogous to an Ornstein-Uhlenbeck process in discrete time (). Note that introducing this drift allows us to model an altered perception of volatility that manifests not only during the first trials, as changes in the prior μ3(0) would induce (see simulations in the Supplement), but rather enables a more nuanced characterisation of changes that occur within the experimental session. Its effect also impacts belief formation at lower levels and simulated responses more strongly (see Supplement).
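The pull of the equilibrium point can be illustrated by iterating only the drift term of the third-level prediction (a simplified Python sketch: no prediction-error input, ϕ3 = 0.1 as in the model; trajectories and starting values are arbitrary):

```python
# Illustrative sketch: mean-reverting drift at the third level. Absent
# new prediction errors, the expected volatility mu3 relaxes from its
# prior toward the equilibrium point m3 at rate phi3.

def drift_step(mu3, m3, phi3=0.1):
    return mu3 + phi3 * (m3 - mu3)

def trajectory(mu3_0, m3, n_trials=50):
    mu3, path = mu3_0, []
    for _ in range(n_trials):
        mu3 = drift_step(mu3, m3)
        path.append(mu3)
    return path

up = trajectory(mu3_0=1.0, m3=3.0)     # m3 > mu3(0): increasingly volatile world
down = trajectory(mu3_0=1.0, m3=-1.0)  # m3 < mu3(0): increasingly stable world

print(up[-1], down[-1])  # both trajectories end close to their respective m3
```

With prediction errors switched back on, these drifting volatility estimates modulate the precision ratio in the update equations, which is how the same parameter can either amplify or dampen belief updates across the whole hierarchy.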

Figure 2 

Simulating an altered perception of environmental volatility. Simulations showing the effect of changing the equilibrium point m3. Increasing m3 (colder colours) corresponds to perceiving the environment as increasingly volatile and results in larger precision-weighted prediction errors, leading to stronger belief updates across all levels of the hierarchy. Note that high values of m3 also increase susceptibility to noisy inputs (e.g., trials 120–136). We hypothesised that this would be the case in early stages of psychosis. Reducing m3 (warmer colours), on the other hand, corresponds to perceiving the environment as increasingly stable and leads to reduced learning rates, rendering an agent insensitive to true changes in the environment. We hypothesised that this could correspond to explaining away overly precise prediction errors and would be associated with delusional conviction. For the simulations, all other parameter values were fixed to the values of an ideal observer given the input.

As outlined in the introduction, we expected that prodromal psychosis would be characterised by overly precise prediction errors, caused by (1) increased low-level precision, (2) decreased high-level precision, or (3) a combination of both (cf. Eq. 5). In the HGF, the dynamics of these precisions are governed by the model parameters. Based on our hypothesis and previous literature, we thus expected that increased low-level precision would be expressed as changes in the evolution rate at the low level (high ω2; Diaconescu et al. (); Reed et al. ()). Similarly, decreased high-level precision should be associated with parameters at the high level, namely the prior expectation about environmental volatility (high μ3(0); Reed et al. ()), the equilibrium point of the drift at the third level (high m3; Cole et al. (); Diaconescu et al. ()), or the coupling between levels (κ2; Diaconescu et al. (); Reed et al. ()).

To test our a-priori hypothesis () that different disease stages would be associated with distinct information processing changes, i.e. the prodrome with overly precise prediction errors (e.g., high m3) versus delusional conviction with overly precise high-level beliefs that explain away these prediction errors (low m3), we compared the mean-reverting HGF (with m3) to the standard HGF (without m3).

Response model The response model specifies how participants' inference about the hidden states translates into decisions, i.e., whether to go with or against the advice. In our case, the response model assumes that participants integrate the non-social cue c(k) (the outcome probability indicated by the pie chart) and their belief that the adviser is providing accurate advice μ̂1(k) before seeing the outcome on the current trial k:

b^{(k)} = \zeta\,\hat{\mu}_1^{(k)} + (1 - \zeta)\,c^{(k)}
where ζ is a weight associated with the advice that expresses how much participants rely on the social information compared to the non-social cue.

The probability that a participant follows the advice (y = 1) can then be described by a sigmoid transformation of the integrated belief b:

p(y^{(k)} = 1 \mid b^{(k)}) = \frac{\left(b^{(k)}\right)^{\beta^{(k)}}}{\left(b^{(k)}\right)^{\beta^{(k)}} + \left(1 - b^{(k)}\right)^{\beta^{(k)}}}, \qquad \beta^{(k)} = \frac{\nu}{\exp\left(\hat{\mu}_3^{(k)}\right)}
This relationship can be understood as a noisy mapping from the integrated belief to participants' decisions, where the noise level is determined by the current prediction of the volatility of the adviser's intentions μ̂3(k), such that decisions become more deterministic (i.e., exploitative) if the environment is currently perceived as stable, and more stochastic (i.e., exploratory) if it is perceived as volatile. Modelling the exploration-exploitation trade-off as a function of participants' perception of volatility was favoured in previous model selection results using the same task (, ). The parameter ν is a further subject-specific parameter that captures decision noise independent of the perception of volatility (lower values indicate larger decision noise).
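A rough Python sketch of this mapping follows. The functional forms here (a ζ-weighted convex combination for the integrated belief and a decision temperature ν/exp(μ̂3)) are plausible stand-ins consistent with the description above, not a verbatim transcription of the toolbox code; all numeric inputs are fabricated.

```python
import math

# Illustrative sketch of the response model described above (assumed
# functional forms): integrate social and non-social information, then
# map the integrated belief to a choice probability whose determinism
# decreases as the predicted volatility mu3_hat grows.

def p_take_advice(mu1_hat, c, zeta, nu, mu3_hat):
    b = zeta * mu1_hat + (1.0 - zeta) * c   # integrated belief (assumption)
    beta = nu / math.exp(mu3_hat)           # inverse decision noise (assumption)
    return b**beta / (b**beta + (1.0 - b)**beta)

stable = p_take_advice(0.8, 0.6, zeta=0.5, nu=5.0, mu3_hat=-2.0)
volatile = p_take_advice(0.8, 0.6, zeta=0.5, nu=5.0, mu3_hat=3.0)

print(stable, volatile)  # choices are near-deterministic when volatility is low
```

With identical beliefs, a low predicted volatility yields an almost deterministic choice, whereas a high predicted volatility pushes the choice probability toward 0.5, i.e., exploration.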

The models were implemented in Matlab (version: 2017a; https://mathworks.com) using the HGF toolbox (version: 3.0), which is made available as open-source code as part of the TAPAS () software collection (https://github.com/translationalneuromodeling/tapas/releases/tag/v3.0.0). Perceptual models were implemented using the ‘tapas_hgf_binary’ function for the standard 3-level HGF and the ‘tapas_hgf_ar1_binary’ function for the mean-reverting HGF.

2.5.2 Bayesian model selection

Based on our simulation analysis () and previous findings (; , ; ), we formulated competing hypotheses about the computational mechanisms that could underlie emerging paranoid behaviour (Figure 3). A standard 3-level HGF (Hypothesis I) was compared to the mean-reverting HGF, which assumed that learning about an adviser's intentions was driven not only by hierarchical PE updates but also by a drift process at the third level, formalising the idea that an altered perception of volatility underlies learning about others' intentions in emerging psychosis (Hypothesis II; see also Figure 2). To arbitrate between the two hypotheses, we performed random-effects Bayesian model selection (; ). Two additional control models were included, in which all parameters of the perceptual model were fixed to the values of an ideal Bayesian observer optimised on the inputs alone using the ‘tapas_bayes_optimal_binary’ function, to assess whether perceptual model parameters needed to be estimated for either of the two main models. These "null" models assume that any variation in advice-taking behaviour can be attributed solely to the response model parameters, i.e., the social bias and the decision noise.

Figure 3 

Model space. Left: Standard 3-level Hierarchical Gaussian Filter (HGF) (, ). Right: Mean-reverting HGF with a drift at the third level, which captures learning about the volatility of the adviser's intentions. This model expresses the notion that early psychosis may be characterised by an altered perception of environmental volatility.

We report protected exceedance probabilities ϕ, which measure the probability that a model is more likely than any other model in the model space (), protected against the risk that differences between models arise due to chance alone (). We also computed relative model frequencies f as a measure of effect size, which can be understood as the probability that a randomly sampled participant would be best explained by a given model. The model selection was implemented using the VBA toolbox () (https://mbb-team.github.io/VBA-toolbox/).
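The logic of random-effects model selection can be sketched in miniature (a simplified Python toy with fabricated log-evidences and a one-pass Dirichlet approximation; the actual analysis used the VBA toolbox, which additionally computes the *protected* exceedance probability):

```python
import numpy as np

# Toy sketch of random-effects Bayesian model selection: per-subject
# posterior model probabilities are summed into Dirichlet pseudo-counts,
# and the exceedance probability of model 1 is estimated by sampling.
# The log-evidences below are fabricated for illustration.

rng = np.random.default_rng(0)

# 16 subjects favour model 1 by ~3 nats, 4 subjects favour model 2.
log_evidence = np.array([[3.0, 0.0]] * 16 + [[0.0, 3.0]] * 4)

# Per-subject posterior model probabilities (softmax over models).
shifted = log_evidence - log_evidence.max(axis=1, keepdims=True)
post = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

alpha = 1.0 + post.sum(axis=0)   # Dirichlet pseudo-counts (uniform prior)
freq = alpha / alpha.sum()       # expected model frequencies f

samples = rng.dirichlet(alpha, size=10_000)
exceedance = float((samples[:, 0] > samples[:, 1]).mean())

print(freq, exceedance)  # model 1 wins in most sampled frequency vectors
```

The relative model frequency f reported in the text corresponds to the normalised Dirichlet counts here, while the exceedance probability asks how likely it is that one model's frequency exceeds all others'.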

2.5.3 Model recovery

To assess whether the models were recoverable, we conducted a series of simulations as done previously (). In brief, our model recovery analysis comprised simulating 20 synthetic datasets based on the empirical parameter estimates obtained from fitting all models to the empirical data of every participant. The sample size of each synthetic dataset was chosen to be equivalent to the empirical sample size (N = 56). The noise level was set based on the empirically estimated decision noise νest. Each simulation was initialised with a different random seed to account for the stochasticity of the simulation. This led to a total of 4 (models) × 56 (participants) × 20 (simulation seeds) = 4,480 simulations. Subsequently, we re-inverted each of the proposed models on the synthetic data to determine whether we could recover the true model under which the synthetic data were generated. To assess model recovery, we then performed random-effects Bayesian model selection on each synthetic dataset and averaged the resulting protected exceedance probabilities across the 20 simulation seeds to obtain a model confusion matrix.
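The recovery logic generalises beyond the HGF. The following Python toy reproduces it with two deliberately simple "models" (unit-variance Gaussians with different means, fabricated for illustration): simulate under each model, refit both, and tabulate how often the generating model is selected.

```python
import numpy as np

# Toy model-recovery sketch: two candidate 'models' (unit-variance
# Gaussians with different means), data simulated under each, and a
# confusion matrix of how often the true generating model wins.

rng = np.random.default_rng(1)
means = [0.0, 1.0]  # model 0 and model 1 (arbitrary)

def log_likelihood(data, mu):
    return -0.5 * np.sum((data - mu) ** 2)  # Gaussian LL up to a constant

n_datasets, n_obs = 50, 50
confusion = np.zeros((2, 2))
for true_model, mu in enumerate(means):
    for _ in range(n_datasets):
        data = rng.normal(mu, 1.0, size=n_obs)
        winner = int(np.argmax([log_likelihood(data, m) for m in means]))
        confusion[true_model, winner] += 1
confusion /= n_datasets

print(confusion)  # a strong diagonal means the models are recoverable
```

A near-identity confusion matrix licenses the empirical model comparison; off-diagonal mass, as between the two control models in the study, signals that the models make indistinguishable predictions.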

2.5.4 Parameter recovery

In line with our previous work (), we also performed a parameter recovery analysis to determine whether model parameter estimates were reliable. Using the simulation and model inversion results from the model recovery analysis (see preceding section), we assessed how accurately the parameters generating the data (‘simulated’) corresponded to the parameters estimated when re-inverting the same model on those data (‘recovered’). We report Pearson correlations and their associated p-values to quantify our ability to recover the model parameters. Since the significance of these correlations is influenced by sample size, we also computed Cohen’s f2, where f2 ≥ 0.35 can be considered a large effect size () and was interpreted as evidence for good parameter recovery.
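For a single simulated-recovered correlation, Cohen's f² follows directly from the Pearson r as f² = r²/(1 − r²), so the f² ≥ 0.35 criterion corresponds to |r| of roughly 0.51 or more. A small Python sketch with fabricated simulated/recovered values:

```python
import math

# Sketch: quantifying parameter recovery as Cohen's f2 derived from the
# Pearson correlation between simulated and recovered parameter values.

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def cohens_f2(r):
    return r**2 / (1.0 - r**2)

# Fabricated example: recovered values track the simulated ones closely.
simulated = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5, 1.9, 2.2]
recovered = [0.2, 0.3, 0.6, 1.0, 1.1, 1.6, 1.8, 2.3]

r = pearson_r(simulated, recovered)
print(r, cohens_f2(r))  # f2 >= 0.35 is read as good recovery
```

Because f² grows without bound as |r| approaches 1, it separates strong recovery from moderate recovery more sharply than r alone.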

2.6 Statistical analysis

We tested for differences in behaviour using a linear mixed-effects model with advice-taking (the number of trials in which the participant went with the advice divided by the total number of trials) as the dependent variable, fixed effects for group and task phase (stable vs. volatile) as well as a group-by-task-phase interaction as predictors of interest, and age and working memory performance as covariates of no interest. Additionally, the model included a random intercept per participant.

Note that including medication as a covariate is not recommended when comparing HC and patient groups. For completeness, however, we also report the results of a mixed-effects model with current antipsychotic dose (100 mg/day chlorpromazine equivalents) and current antidepressant dose (40 mg/day fluoxetine equivalents) as covariates. Chlorpromazine equivalents were derived from The Maudsley® Prescribing Guidelines in Psychiatry (), which are based on the literature and clinical consensus. Since paliperidone was not listed, equivalent estimates for paliperidone were based on Leucht et al. (). Fluoxetine equivalents were based on Hayasaka et al. (), with the exception of vortioxetine and citalopram, which were not listed. For these, equivalent doses were assumed to be 10 mg vortioxetine and 30 mg citalopram, respectively, based on clinical practice.

Differences in model parameters were assessed using non-parametric Kruskal-Wallis tests. All statistical analyses were conducted in R (version: 4.0.4; https://www.r-project.org/) using RStudio (version: 1.4.1106; https://www.rstudio.com/). We report both uncorrected p-values (puncorr) and Bonferroni-corrected p-values adjusted for the number of free parameters (n = 7).

3 Results

3.1 Sociodemographic and clinical characteristics

Sociodemographic and clinical characteristics are presented in Table 2.

Table 2

Demographic and clinical characteristics. All p-values are uncorrected. HC: Healthy controls. CHR-P: Individuals at clinical high risk for psychosis. FEP: First-episode psychosis patients. APS: Attenuated psychotic symptoms. BLIP: Brief and limited intermittent psychotic symptoms. GRD: Genetic risk and deterioration syndrome. COGDIS: Cognitive disturbances. COPER: Cognitive-perceptive basic symptoms. Cpz100mg/day: Antipsychotic equivalent dose for 100mg chlorpromazine per day. Flu40mg/day: Antidepressant equivalent dose for 40mg fluoxetine per day. PANSS: Positive and Negative Syndrome Scale.() PCL: Paranoia Checklist (). Bold print highlights p-values significant at: p < 0.05, uncorrected. a Assessed with the digit span backwards task from the Wechsler Adult Intelligence Scale–Revised (). bHigh risk types are not mutually exclusive.


| Measure | HC | CHR-P | FEP | Statistic | p |
|---|---|---|---|---|---|
| mean [SD] | | | | F = 18.182 | < 0.001 |
| mean [SD] | | | | F = 1.015 | 0.370 |
| Working memory^a, mean [SD] | | | | F = 1.011 | 0.371 |
| Sex [f/m] | 11/8 | | | χ² = 1.767 | 0.413 |
| Cannabis [y/n] | 7/12 | | | χ² = 0.842 | 0.656 |
| High risk type^b | | | | | |
| Psychotic disorder diagnosis | | | F20 Schizophrenia: 3; F22 Delusional disorder: 6; F23 Brief psychotic disorder: 9 | | |
| Antipsychotics [y/n] | 0/19 | | Haloperidol & Aripiprazol: 1 | χ² = 31.987 | < 0.001 |
| Antidepressants [y/n] | 0/19 | | Trazodon & Citalopram: 1; Trazodon & Sertralin: 1 | χ² = 17.268 | < 0.001 |
| Cpz100mg/day, median [25th, 75th] | 0 [0, 0] (n = 19) | 0 [0, 0] (n = 18) | 83 [33, 188] (n = 18) | η² = 0.592 | < 0.001 |
| Flu40mg/day, median [25th, 75th] | 0 [0, 0] (n = 19) | 0 [0, 30] (n = 17) | 0 [0, 0] (n = 18) | η² = 0.246 | 0.001 |
| PANSS Positive, median [25th, 75th] | 8 [7, 8] (n = 19) | 11 [10, 14] (n = 19) | 16 [11, 23] (n = 16) | η² = 0.514 | < 0.001 |
| PANSS Negative, median [25th, 75th] | 7 [7, 8] (n = 19) | 9 [8, 10] (n = 19) | 12 [9, 15] (n = 16) | η² = 0.364 | < 0.001 |
| PANSS General, median [25th, 75th] | 18 [16, 19] (n = 19) | 29 [22, 32] (n = 19) | 34 [32, 40] (n = 16) | η² = 0.674 | < 0.001 |
| PCL Frequency, median [25th, 75th] | 23 [19, 25] (n = 19) | 30 [24, 33] (n = 19) | 36 [23, 44] (n = 17) | η² = 0.202 | 0.004 |
| PCL Conviction, median [25th, 75th] | 26 [22, 31] (n = 19) | 33 [28, 39] (n = 19) | 30 [22, 55] (n = 17) | η² = 0.086 | 0.099 |
| PCL Distress, median [25th, 75th] | 26 [20, 37] (n = 19) | 29 [23, 38] (n = 19) | 30 [21, 46] (n = 17) | η² = 0.008 | 0.799 |

3.2 Behavioural results

We identified a significant group-by-task-phase interaction on the frequency of advice-taking (F = 5.275, p = 0.008; Figure 4A). To unpack this effect, we repeated the analysis with three two-group models. We found significant group-by-task-phase interactions when comparing HC vs FEP (F = 8.520, puncorr = 0.006, p = 0.018, Bonferroni-corrected for the number of comparisons, i.e., n = 3) and HC vs CHR-P (F = 7.745, puncorr = 0.009, p = 0.026), but not when comparing CHR-P vs FEP (F = 0.047, puncorr = 0.830, p = 1.000), suggesting that both CHR-P and FEP showed reduced flexibility in taking environmental volatility into account, as the difference between the stable and volatile phases was reduced compared to HC. None of the covariates significantly impacted advice-taking.

Figure 4 

Behavioural results and parameter group effects. A Behavioural results (ground truth). Black dashed lines indicate the average accuracy of advice for each of the two phases. B Model prediction. C Parameter effect for the drift equilibrium point m3. D Parameter effect for the coupling strength κ2. E Correlation between model parameters and either the Positive and Negative Syndrome Scale () (PANSS) or the Paranoia Checklist () (PCL). Note that raw scores are displayed for illustration purposes only. Statistical analyses were conducted using nonparametric Kendall rank correlations. Displayed regression lines were computed using a linear model based on the raw scores. Note that one outlier (κ2 = 0.006) was removed for displaying the effect on κ2 in D and E. This outlier lay outside 7 × the interquartile range; excluding this participant did not affect the significance of the results. P: Positive symptoms. N: Negative symptoms. G: General symptoms. F- and p-values indicate results of ANCOVAs corrected for working memory performance, antipsychotic medication, antidepressant medication, and age. Boxes span the 25th to 75th percentiles and whiskers extend from the hinges to the largest and smallest values within 1.5 × the interquartile range. Asterisks indicate significance of non-parametric Kruskal-Wallis tests at: * p < 0.05, with Bonferroni correction.

The group-by-task-phase interaction remained significant after including antipsychotic and antidepressant dose as covariates (F = 4.900, p = 0.011). Neither the effect of antipsychotic dose (F = 0.006, p = 0.939) nor antidepressant dose (F = 0.112, p = 0.739) were significant. Unpacking this model again revealed significant group-by-task-phase interactions when comparing HC vs FEP (F = 8.520, puncorr = 0.006, p = 0.018), but not when comparing CHR-P vs FEP (F = 0.671, puncorr = 0.419, p = 1.00). The group-by-task-phase interaction effect in HC vs CHR-P did not survive Bonferroni correction (F = 5.154, puncorr = 0.030, p = 0.089).

3.3 Modelling results

3.3.1 Bayesian model selection and model recovery

The model recovery analysis (Figure 6) indicated that the control models (CI and CII) could not be well distinguished. This was likely because the equilibrium point m3 in CII was optimised on the inputs alone, which resulted in a value of m3 close to the prior, rendering the predictions of the two control models very similar. Most importantly, however, the two main models associated with Hypotheses I and II could be well distinguished.

After confirming that the two hypotheses were distinguishable, we first performed Bayesian model selection including participants from all groups. The results were inconclusive (ϕ = 74.03%, f = 53.80% in favour of Hypothesis II), possibly suggesting that different groups were best explained by different models (i.e., different computational mechanisms). To assess this possibility, we repeated the model selection for each group separately (Figure 5A). In HC, the winning model was the standard 3-level HGF (Hypothesis I; ϕ = 96.63%, f = 95.93%). Conversely, in FEP the mean-reverting HGF that included a drift at the third level was selected (Hypothesis II; ϕ = 99.95%, f = 95.92%). For CHR-P, we observed a more heterogeneous result: while the mean-reverting model was favoured (Hypothesis II; ϕ = 84.50%, f = 60.24%), there was also evidence for the standard HGF, albeit to a much lesser extent (Hypothesis I; ϕ = 14.41%, f = 37.19%). Further inspection of the model attributions for individual participants revealed an interesting pattern (Figure 5B). All HC were attributed to the standard HGF with over 97% probability, whereas all FEP were attributed to the mean-reverting model with over 99% probability. Interestingly, model attributions for CHR-P were more heterogeneous, ranging from 0 to 100% probability, suggesting that some individuals were better explained by the standard HGF and others by the mean-reverting model. These results remained consistent when including other control models (Supplement).

Figure 5 

Bayesian model selection results. A Protected exceedance probabilities for within-group random-effects Bayesian model selection (; ) to arbitrate between Hypothesis I (HI; standard 3-level HGF) and Hypothesis II (HII; mean-reverting HGF with a drift at the 3rd level, in line with an altered perception of volatility). Two corresponding control models were included (CI and CII), for which the perceptual model parameters were fixed. Model selection was performed separately in healthy controls (HC), individuals at clinical high risk for psychosis (CHR-P), and first-episode psychosis patients (FEP). The dashed line indicates 95% protected exceedance probability. B Model attributions for each participant.

3.3.2 Posterior predictive checks, parameter identifiability and parameter recovery

To assess whether the mean-reverting model (Hypothesis II) captured the behavioural effects of interest, we conducted posterior predictive checks by repeating the behavioural analysis on this model’s predictions. This analysis confirmed that the mean-reverting model recapitulated the group-by-task-phase interaction effect on advice-taking frequency (F = 4.343, p = 0.018; Figure 4B). We also repeated all three two-group comparisons on the model predictions and, as in the empirical data, found a significant group-by-task-phase interaction when comparing HC vs FEP (F = 8.337, puncorr = 0.007, p = 0.020) and no significant interaction when comparing CHR-P vs FEP (F = 1.106, puncorr = 0.300, p = 0.900). The group-by-task-phase interaction did not reach significance for HC vs CHR-P (F = 3.662, puncorr = 0.064, p = 0.191).
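The logic of a posterior predictive check is to simulate behaviour from each participant’s fitted model and rerun the identical analysis on the simulated data. As a minimal illustration of the simulation step (names hypothetical; the actual check refitted the mixed ANOVA to the model predictions):

```python
import numpy as np


def simulated_advice_taking(pred_prob, phase_labels, seed=0):
    """Simulate one dataset of binary advice-taking responses from a fitted
    model's trial-wise prediction probabilities and summarise the
    advice-taking frequency per task phase.

    pred_prob    : (n_trials,) predicted probability of taking the advice
    phase_labels : (n_trials,) task-phase label for each trial
    """
    rng = np.random.default_rng(seed)
    sim = rng.binomial(1, pred_prob)  # one simulated response sequence
    return {ph: sim[phase_labels == ph].mean() for ph in np.unique(phase_labels)}
```

Repeating this per participant and feeding the simulated frequencies into the original group-by-task-phase analysis tests whether the fitted model can reproduce the behavioural effect of interest.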

When inspecting parameter identifiability, we observed no concerning correlations between any pair of parameters (all |r| ≤ 0.57, below our criterion of |r| ≤ 0.6; Figure 6).
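An identifiability check of this kind correlates the fitted parameter estimates across subjects and flags pairs exceeding the criterion, since strongly correlated estimates suggest two parameters trade off against each other during fitting. A minimal sketch (function and parameter names our own):

```python
import numpy as np


def identifiability_check(params, names, threshold=0.6):
    """Pairwise correlations of fitted parameters across subjects.

    params : (n_subjects, n_params) matrix of per-subject estimates.
    Returns the full correlation matrix and any parameter pair whose
    absolute across-subject correlation exceeds `threshold`.
    """
    r = np.corrcoef(params, rowvar=False)
    flagged = [(names[i], names[j], float(r[i, j]))
               for i in range(len(names))
               for j in range(i + 1, len(names))
               if abs(r[i, j]) > threshold]
    return r, flagged
```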

Figure 6 

Model diagnostics. A–G Parameter recovery result for one random seed for the mean-reverting HGF with drift at the 3rd level (Hypothesis II; Figure 3). H Parameter correlations computed across subjects for the mean-reverting HGF with a drift at the 3rd level (Hypothesis II; Figure 3). I Model recovery analysis. The grey scale indicates protected exceedance probability averaged across all 20 random seeds.

Our parameter recovery analysis indicated good recovery (i.e., Cohen’s f2 ≥ 0.35 in 100% of the simulations) for four out of the seven model parameters, including the drift equilibrium point m3 (Figure 6). However, recovery for μ3(0), μ2(0), and κ2 fulfilled this criterion only in 55%, 65%, and 55% of the simulations, respectively. The model selection results remained consistent when fixing non-recoverable parameters (Supplement).
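The recovery criterion can be computed by regressing recovered estimates on the simulated ground-truth values and converting the explained variance to Cohen’s f2 = R2/(1 − R2), with f2 ≥ 0.35 conventionally denoting a large effect. A minimal sketch under our reading of the criterion (names hypothetical):

```python
import numpy as np


def recovery_f2(simulated, recovered):
    """Cohen's f^2 effect size for parameter recovery.

    Regresses recovered estimates on the simulated (ground-truth) values
    and converts R^2 to f^2 = R^2 / (1 - R^2); f^2 >= 0.35 is
    conventionally interpreted as a large effect, i.e., good recovery.
    """
    r = np.corrcoef(simulated, recovered)[0, 1]
    r2 = r ** 2
    return r2 / (1.0 - r2)
```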

3.3.3 Parameter group effects

The model selection indicated that the mean-reverting model was a better explanation for the behaviour of FEP, but not of HC. In this situation, it is generally recommended to investigate parameter group effects using Bayesian model averaging (). However, we were interested in assessing why this model was selected for FEP. Specifically, we wanted to investigate whether the perception of volatility in FEP increased or decreased over time (see also simulations illustrating these two possibilities in Figure 2), because our a priori hypothesis was that individuals with emerging psychosis should perceive the environment as increasingly volatile (increased m3 compared to controls; Diaconescu et al. ()). To distinguish between these two possibilities, we compared the drift equilibrium point m3 across the three groups and found a significant group difference (η2 = 0.142, puncorr = 0.020). Post hoc tests revealed that m3 was significantly increased in FEP compared to HC, suggesting that FEP perceived the intentions of the adviser as increasingly more volatile over time (η2 = 0.212, p = 0.017, Bonferroni-corrected for the number of comparisons across groups, i.e., n = 3; Figure 4C). We also performed an exploratory analysis including all other free model parameters. This analysis revealed an additional effect on coupling strength κ2 (η2 = 0.138, puncorr = 0.022), which was driven by reduced coupling strength between the second and third level of the perceptual hierarchy in FEP compared to HC (η2 = 0.217, p = 0.016, Bonferroni-corrected for the number of comparisons across groups, i.e., n = 3; Figure 4D). However, neither the effect on m3 nor that on κ2 survived Bonferroni correction for the number of parameters, i.e., n = 7 (p = 0.140 and p = 0.157, respectively).
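The analysis pattern here — an omnibus test across the three groups followed by Bonferroni-corrected post hoc pairwise comparisons (n = 3 group contrasts) — can be sketched as follows. This simplified version uses a one-way ANOVA and t-tests rather than the study’s exact models and η2 effect sizes; group labels and values are illustrative only.

```python
import numpy as np
from itertools import combinations
from scipy import stats


def group_comparison(groups):
    """Omnibus one-way ANOVA over named groups, followed by pairwise
    t-tests with Bonferroni correction for the number of group contrasts.

    groups : dict mapping group label -> 1-D array of parameter estimates
    """
    f_stat, p_omnibus = stats.f_oneway(*groups.values())
    n_comp = len(groups) * (len(groups) - 1) // 2  # e.g., 3 for 3 groups
    post_hoc = {}
    for a, b in combinations(groups, 2):
        t_stat, p = stats.ttest_ind(groups[a], groups[b])
        post_hoc[(a, b)] = min(1.0, p * n_comp)  # Bonferroni-corrected p
    return p_omnibus, post_hoc
```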

3.3.4 Symptom-parameter correlations

Some authors (e.g., Esterberg and Compton ()) have argued that psychosis may be better conceptualised on a continuum rather than categorically, based on evidence that a significant percentage of the general population reports some psychosis symptoms (; ). In line with this proposal, we adopted a continuum perspective and investigated whether the equilibrium point m3 and the coupling strength κ2 were correlated with specific symptom subscales of the Positive and Negative Syndrome Scale (PANSS) () across all three groups, using non-parametric Kendall rank correlations (see Figure 4E).

We found a positive correlation between m3 and PANSS positive symptoms (τ = 0.203, puncorr = 0.038) and negative correlations between κ2 and PANSS negative and general symptoms (τ = –0.253, puncorr = 0.011 and τ = –0.219, puncorr = 0.022, respectively). First, this suggests that individuals who perceived the adviser’s intentions to be increasingly volatile over time also experienced more severe positive psychosis symptoms. Second, the negative correlations between κ2 and PANSS negative and general symptoms imply that individuals with more severe negative and general symptoms displayed lower κ2 values, i.e., a decoupling between the third and the second levels of the hierarchy. These correlations, however, did not survive Bonferroni correction (p = 0.228, p = 0.068, and p = 0.132, respectively, adjusted for 2 (#parameters) × 3 (#PANSS subscales) = 6 comparisons).
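Each entry of this analysis is a Kendall rank correlation with a Bonferroni adjustment over the full family of tests (here 2 parameters × 3 subscales = 6). A minimal sketch with illustrative variable names:

```python
import numpy as np
from scipy import stats


def symptom_correlation(param_values, symptom_scores, n_tests=6):
    """Kendall rank correlation between one model parameter and one symptom
    subscale, with Bonferroni correction over the full set of tests
    (2 parameters x 3 subscales = 6 by default)."""
    tau, p_uncorr = stats.kendalltau(param_values, symptom_scores)
    p_corrected = min(1.0, p_uncorr * n_tests)
    return tau, p_uncorr, p_corrected
```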

Since the PANSS () was specifically designed to assess symptom expression in clinical populations, we also calculated correlations with the Paranoia Checklist (PCL) (), an instrument more sensitive to expressions of paranoia in healthy or subclinical populations. We found a positive correlation between m3 and the PCL frequency subscale (τ = 0.201, puncorr = 0.034), indicating that individuals who perceived the adviser’s intentions to be increasingly volatile over time also reported a higher frequency of paranoid beliefs. Again, this correlation did not survive Bonferroni correction (p = 0.204, adjusted for 2 (#parameters) × 3 (#PCL subscales) = 6 comparisons).

4 Discussion

In this study, we investigated the computational mechanisms underlying emerging psychosis. Our model selection results suggest that FEP may operate under a different computational mechanism compared to HC, one characterised by perceiving the environment as increasingly volatile. A strength of our study is that this effect is unlikely to be due to long-term medication effects, as FEP were only briefly medicated. Furthermore, we observed more heterogeneity in CHR-P, possibly indicating that this modelling approach may be useful to stratify the CHR-P population and identify individuals who are more likely to transition to psychosis. Assuming a psychosis continuum perspective, we also found tentative evidence suggesting that the drift equilibrium point m3 and the coupling strength between hierarchical levels κ2 may be affected in emerging psychosis and that these parameters provide a clinically relevant description of individuals’ learning profiles. However, due to the small sample size, these results should be interpreted with caution.

Bayesian accounts of psychosis (; ; ) propose that psychosis may be characterised by overly precise PEs that provide the breeding ground for delusions to form. Our results are in line with these proposals and with the predictions of increased precision-weighted PE-learning in early psychosis derived through simulations (). Moreover, our results enable a more nuanced characterisation and point towards an altered perception of environmental volatility as a possible consequence of altered PE learning. Specifically, our finding that FEP perceived the intentions of another person as increasingly volatile over time (higher m3) suggests that larger precision-weighted PEs are related to decreased high-level precision (see Eq. 5). This finding is in line with Bayesian accounts, although, without longitudinal assessment of changes within the same participants, we cannot say whether changes in the perception of volatility are caused by overly precise PEs or vice versa. However, we note that the mean-reverting model was only conclusively selected in the FEP group and not already in the CHR-P group, although it was favoured in the model attributions for some CHR-P individuals (Figure 5B). In contrast to our a priori hypothesis (), we did not find evidence for a compensatory increase in the precision of high-level priors or reduced learning (e.g., reduced evolution rate ω2) in patients who have strong conviction in their delusional beliefs. Such a compensatory mechanism was proposed by Kapur () as a way of making sense of ‘aberrantly salient’ PEs and has been observed empirically in healthy participants with paranoid ideations (; ) as well as in patients with schizophrenia (), although Baker et al. () used a non-social probabilistic reasoning task.

Reed et al. () employed the HGF to investigate the computational mechanisms underlying paranoia in a subclinical population and schizophrenia patients using a non-social reversal learning task. They found increased expected volatility (μ3(0)) in participants with higher levels of paranoia using the standard 3-level HGF. Our model selection suggested that this model explains behaviour better in HC, whereas FEP were better characterised by a mean-reverting HGF that included a drift at the third level. It should be noted that increasing μ3(0) and including a drift at the third level towards larger values can both be interpreted as expecting the environment to be more volatile, but the drift provides a more nuanced description of changes that occur during the learning session. Our results are thus in line with previous results, but possibly provide a perspective that takes within-task dynamics more explicitly into account (see simulations in the Supplement). An interesting observation based on simulations is that artificial agents with increased m3 are quicker to adapt to volatile changes between very helpful and very misleading advice (trials 68–119), but increasing m3 also leads to greater susceptibility to noisy inputs following this period of rapid, but meaningful, changes (trials 120–136; Supplement).
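The distinction between the two accounts can be made explicit. In one common parameterisation of a mean-reverting (AR(1)) HGF level, such as the variant implemented in the TAPAS toolbox (the notation here is ours and may differ from the study’s Eq. 5), the third level evolves as

\[
x_3^{(k)} \sim \mathcal{N}\!\left( x_3^{(k-1)} + \phi_3 \left( m_3 - x_3^{(k-1)} \right),\; \vartheta_3 \right),
\]

where \(\phi_3 \in [0, 1]\) sets the speed of reversion towards the equilibrium point \(m_3\) and \(\vartheta_3\) is the evolution variance. If \(m_3\) lies above an agent’s initial volatility estimate, the belief about \(x_3\) drifts upward over the session, i.e., the agent comes to expect an increasingly volatile environment. With \(\phi_3 = 0\), the level reduces to the Gaussian random walk of the standard HGF, in which only the starting point \(\mu_3^{(0)}\) can shift volatility expectations.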

Moreover, and in contrast to our results, Reed et al. () found increased, not reduced, coupling strength κ2. This discrepancy may be related to differences in the tasks employed (a non-social three-option reversal learning task vs our social learning task), but we also note that κ2 was not always well recoverable in our simulation analysis. Therefore, we do not wish to draw strong conclusions based on the κ2 effect in our study, although we found effects suggesting that κ2 may be related to negative and general symptoms.

Even though m3 models within-task dynamics, the time-scale of the modelled effects depends on the frequency of interactions. In our study, players and advisers interacted very frequently within a brief time period. The adviser possessed incomplete information and sometimes made mistakes (although intending to help), and in other phases of the task, the adviser intentionally tried to mislead. Conceivably, in many real-world scenarios such interactions unfold over much longer time-scales of several weeks or months. A simple explanation for misleading advice is that the adviser makes mistakes because they have incomplete information about the outcome; interestingly, however, a participant who perceives the intentions of the adviser to be increasingly volatile may adopt a more sophisticated explanation for the adviser’s actions (i.e., engage in overmentalising). This aligns with a recent finding showing that agents that overinterpret the intentions behind each other’s actions (in terms of depth of mentalisation) become paranoid (). However, this interpretation needs to be tested using a truly recursive social learning task.

4.2 Is the perception of environmental volatility altered specifically in social contexts?

Here, we employed an ecologically valid social learning task (, ) to study changes in learning about others’ intentions. Some authors (; ) have raised the question of whether changes in learning like the ones observed in this study reflect a specifically social or rather a domain-general learning deficit. Here, we did not assess whether differences with respect to the perception of environmental volatility were specific to a social context, since we did not include a non-social control task. However, it will be important to address this question in future studies.

Interestingly, recent studies using non-social, two-option reversal learning tasks also identified a mean-reverting HGF with a drift towards larger volatility estimates as the winning model in a sample of CHR-P participants () and found changes in m3 to be associated with a schizophrenia diagnosis (). Others found changes in model parameters related to the perception of environmental volatility in healthy, subclinical, and schizophrenia patient populations (; ). Reed et al. () also included a social control task, which did not affect the parameter effects. Therefore, this mechanism may not be specifically tied to social contexts, but may instead be related to a more general deficit in learning under uncertainty (; ). However, we note that the social control task employed by Suthaharan et al. () was not as ecologically valid as other tasks that have been used to study paranoia, such as the dictator game (; , ) or our task, which was adapted from empirically observed human-human interactions in a previous study using videos of human advisers intending to either help or mislead players (). Finally, it is also possible that there are both domain-general and domain-specific changes, but that these can only be distinguished at the neuronal level while converging on the same behavioural model parameters.

4.3 What causes an altered perception of volatility?

Interestingly, there are at least two, possibly interacting, pathways that can lead to an altered perception of environmental volatility. First, abnormalities in monoamine systems may lead to chaotic PEs that are unpredictable and foster the expectation that the environment is very volatile (; ). In line with this pathway, Reed et al. () found that methamphetamine administration in rats induced changes in model parameters that impacted learning about environmental volatility. Moreover, Diaconescu et al. () found activation in dopaminoceptive regions such as the dopaminergic midbrain during the same social learning task that was used in the current study. Similarly, unstable dynamics in cortical circuits (related to synaptic dysfunction or indeed abnormal neuromodulation) may also increase updating in response to unexpected evidence and thus increase the perception of environmental volatility (; ). Second, external shifts in the volatility of the environment, such as the global health crisis of the COVID-19 pandemic, may also result in an altered perception of volatility and the emergence of paranoid thoughts or endorsement of conspiracy theories (). This second (environmental) pathway may also be relevant for understanding the increased incidence of schizophrenia in individuals who experience migration () and those living in urban environments (), as individuals exposed to both of these risk factors may be confronted with – in some cases drastically – changing environments. In summary, there are likely multiple, possibly interacting, pathways that could give rise to an altered perception of environmental volatility.

4.4 Clinical implications

We identified trend-level correlations between the drift equilibrium point m3 and both PANSS positive symptoms and the frequency of paranoid thoughts, and between the coupling strength κ2 and PANSS negative and general symptoms. While the evidence in this study was not conclusive, since these correlations were not significant after multiple-testing correction, we note that the effects were in the expected direction, such that perceiving the environment as increasingly volatile (higher m3) was associated with a higher frequency of paranoid thoughts and more severe positive symptoms in general. Additionally, increased decoupling of the third level from the second level of the HGF, which leads to altered learning under uncertainty, correlated with more severe negative symptoms. Future well-powered studies are needed to assess whether these effects can be confirmed in larger samples. Interestingly, we observed heterogeneous model attributions specifically in CHR-P, whereas the model selection clearly favoured the standard 3-level HGF in HC and the mean-reverting model in FEP. This finding suggests that this model may be helpful for identifying CHR-P patients who are more likely to transition to a psychotic disorder.

4.5 Limitations

Several limitations of this study merit attention. First, the sample size was small due to very selective inclusion criteria with respect to medication, which, however, enabled us to minimise the impact of long-term medication effects. Larger studies are needed to replicate our results and increase statistical power to identify correlations between model parameters and symptoms. Second, we cannot assess the specificity of our results with respect to the social domain, since we did not include a non-social control task. Lastly, we also cannot speak to specificity with respect to other diagnoses, because we did not include a clinical control group. Notably, similar results have been reported, for example, in individuals with autism (; ), who may, however, share some biological mechanisms with schizophrenia (). Disentangling the specificity of our findings with respect to related (e.g., autism) and other disorders (e.g., depression) will be an important avenue for future research.

4.6 Future directions

While we found evidence for increased uncertainty associated with higher-level beliefs about the volatility of others’ intentions, future studies will have to examine whether a compensatory increase in the precision of higher-level beliefs occurs during later stages of schizophrenia, possibly also fluctuating with the severity of psychosis, or whether other models are better suited to capture the conviction associated with delusional beliefs during acute psychotic states (e.g., Baker et al. (); Erdmann and Mathys (); Adams et al. ()). Furthermore, the neural correlates of belief updating during social learning in emerging psychosis should be examined to identify neural pathways that may underlie the changes in perception suggested by the model. Lastly, longitudinal studies are needed to assess whether model parameters can be leveraged as predictors of transition to psychosis or treatment response in individual patients.

4.7 Conclusions

In conclusion, our results suggest that emerging psychosis is characterised by an altered perception of environmental volatility. Furthermore, we observed heterogeneity in model attributions in individuals at high risk for psychosis, suggesting that this computational approach may be useful for stratifying the high-risk state and for predicting transition to psychosis in clinical high-risk populations.

Data Accessibility Statement

The analysis code for this study is publicly available at https://github.com/daniel-hauke/compi_ioio_phase. The data are publicly available at https://osf.io/6rdjc/. Note that one participant did not consent to making their data available for reuse and was excluded from the public repository. To ensure reproducibility, we report all results excluding this participant in the Supplement.

Additional File

The additional file for this article can be found as follows:

Supplementary material

Supplementary simulations, supplementary results and reproducibility information. DOI: https://doi.org/10.5334/cpsy.95.s1