A Reduced Self-Positive Belief Underpins Greater Sensitivity to Negative Evaluation in Socially Anxious Individuals

Alexandra K. Hopkins; Ray Dolan; Katherine S. Button; Michael Moutoussis

Introduction

‘We don’t see things as they are, we see things as we are’ – Anaïs Nin

We tend towards optimism instead of realism, often overestimating our competence and likeability (). This bias appears useful, allowing individuals who hold a positive self-view to benefit from better psychological well-being and mental health (; ; ). One’s self-view is theorised to be shaped by interpersonal interactions and the perceptions we think others have of us (, ; ; ). The nature of the social information individuals receive, and what they do with that information, is key to understanding how self-beliefs develop and are maintained ().

Cognitive theories of depression and social anxiety hold that repeated exposure to social adversity can teach an individual that the world is an unpredictable and hostile place, where they should expect criticism and poor social outcomes (; ). This negative learning forms the schema, a system of beliefs and expectations through which future self-relevant social information is processed (; ). Once activated, the self-schema acts as an information filter, influencing attention, perception, learning and memory, such that the dysfunctional self-views are maintained (). Schemas are disorder-specific; for social anxiety, their content relates to the core fear of being negatively evaluated by others.

It is important to understand the psychological mechanisms behind inferring evaluation of self and others, and how this integrates into our self-schema. Evidence indicates that the activation of self-beliefs, or self-schema, and the updating of such beliefs in response to social feedback is key (; ; ). However, temperamental preparedness and operant learning routes to anxiety, such as behavioural inhibition and reinforcement via safety-behaviours, are also postulated to be important ().

Individuals who show high fear of negative evaluation (FNE) display negatively biased processing of social-evaluative information () and are prone to social anxiety (). Button and colleagues (, ) demonstrated negative bias about the self in a Social Evaluation Learning Task, wherein a computer persona described either themselves or an unknown other. Those more fearful of negative evaluation selected significantly fewer positive attributes when asked to predict how the computer persona would describe them, but displayed no bias when making predictions about unknown others. The fact that this negative bias specifically manifested when evaluations are related to the self, suggests that individuals integrate social information differently depending on the context and focus of the evaluation, which is consistent with the cognitive models (; ).

Computational cognitive studies have recently addressed self-evaluation (; ). So far, studies have mostly relied on associative learning models () to capture phenomena such as healthy people giving more weight to positive, rather than negative, information about themselves. Koban et al. () analysed self-evaluation using an associative model, to test whether learning rates – association values in learning theory () – depended on social anxiety. Social Anxiety Disorder patients were found to have higher learning rates for negative attributes about themselves, compared to healthy controls. Learning-rate based models give a good description of changes in moment-to-moment evaluation of the self, but learning rates are not stable psychological characteristics, depending on a host of factors (; ; ). Clinically, this maleability is useful, opening up maladaptive learning rates to therapeutic intervention ().

Instead of focusing on behaviour assumed to be gradually reinforced, belief-based frameworks focus how evidence, here provided by social information, updates beliefs. This framework can accommodate the top-down role of self-schema/beliefs, including trait-like views about the self activated given a social context, more naturally than associationist approaches. It also explicitly accounts for the role of uncertainty, which may be especially important for social learning ().

A Bayesian approach is particularly well suited to modelling the top-down influence of beliefs (), as it has belief update at its core and explicitly represents different strengths of belief. For example, I may believe that I am ‘80–90%’ socially competent but also allow for a socially incompetent 10–20%. Alternative beliefs are then strengthened or weakened as social information accumulates. The certainty of beliefs is informed by learning throughout an individual’s history. Certainty then determines how open existing (‘prior’) beliefs are to change, i.e. determines learning rates. Intuitively, someone with a negative self-view may be more likely to integrate negative evaluations, as they are more in line with their own initial beliefs (see SI for a tutorial demonstration). Similarly biased belief-updating has been demonstrated in non-social reward-based tasks ().

We aimed to clarify the explanatory power of these two psychological frameworks in social-evaluation. We expected associative learning models to capture well the dynamics of learning, while a Bayesian cognitivist framework would provide insight into how beliefs evolve and affect learning. We were interested in mechanisms of biased learning in individuals with high fear of negative evaluation, and its potential basis in biased updating of beliefs about the self.

Methods and Materials

Measures

Published data was obtained from (). Data consisted of a Social Evaluation Learning Task (Figure 1) completed by 100 participants and a range of questionnaires, of which the primary measure was the Brief Fear of Negative Evaluation (BFNE) scale (). A higher BFNE score indicates greater fear of negative evaluation. For a full details of the task and sample please see ().

Figure 1

Each task block consisted of 32 trials. Participants had to choose between positive and negative words. There were 6 blocks in total, corresponding to 6 evaluative conditions, termed personas – Self-like, self-neutral, self-dislike, other-like, other-neutral, other-dislike. Self/other refers to who is being evaluated, like/neutral/dislike refers to the probability of a positive word being correct (0.8, 0.5, 0.2 for the like/neutral/dislike rules respectively).

Sample

In line with a dimensional approach to psychopathology, the original study recruited participants with a range of social anxiety symptoms using an efficient sampling approach to over recruit from the maximally informative extremes (high or low symptoms), ensuring a third of participants had scores in the bottom quartile of BFNE scores, a third from the top quartile and a third from the mid-range using random sampling to exclude one out of two participants with mid-range scores. Participants completed the diagnostic CIS-R (), which provides diagnoses in line with ICD-10 and DSM-IV. Seven participants met the diagnostic criteria for social phobia and 62 exceeded the cut-off for clinically significant social anxiety on the BFNE.

Associative and belief-based models

To assess how choices evolved as a function of social feedback, we used computational models. We formalised how social feedback influenced subsequent choices about the self and other using adapted Rescorla-Wagner reinforcement learning models () and novel belief-update models. Here we describe the key features of the models, with technical details to be found in the Supplement.

Associative Learning models

Associative learning models describe learning in terms of value. Here, participants learn the value of the action ‘choose the positive attribute’ or ‘choose the negative attribute’, based on feedback. These action-values Q(action,context) are updated after each outcome. A discrepancy between choice and outcome forms a ‘prediction error’, PE. The PE is then multiplied by a learning rate, λ_c, a parameter weighing the impact of new evidence on existing values, and the result added to update the existing action-value. High learning rates correspond to new evidence having a strong impact, quickly replacing old learning. The context s_t simply indexes which state, i.e. computer persona × (self vs. other), the trial t was about.

(1)

P E t = r t − Q t − 1 (a t, s t) Q t (a t, s t) = Q t − 1 (a t, s t) + λ c P E t

M1 \documentclass[10pt]{article} \usepackage{wasysym} \usepackage[substack]{amsmath} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage[mathscr]{eucal} \usepackage{mathrsfs} \usepackage{pmc} \usepackage[Euler]{upgreek} \pagestyle{empty} \oddsidemargin -1.0in \begin{document} \[ \begin{array}{*{20}{c}} {P{E_t} = {r_t} - {Q_{t - 1}}({a_t},{s_t})}\\ {{Q_t}({a_t},{s_t}) = {Q_{t - 1}}({a_t},{s_t}) + {\lambda _c}\;P{E_t}} \end{array}{\rm{ }} \] \end{document}

We focused on learning rates, as these easily characterise which conditions have a major or minor impact on learning. Following Koban et al. (), we expected that learning could be valence dependent and therefore allowed separate learning rates for trials with a positive or negative outcome word (irrespective of what choice led to it). So, people might have λ_{+ve outcome} > λ_{–ve outcome}. Based on the descriptive findings of Button et al. (), we were interested in self/other distinction and therefore considered models that had separate learning rates depending on whether the object of learning was self or other, giving λ_self,+ve, λ_other,+ve, λ_self,–ve etc. Models could include an initial value parameter, allowing starting values Q(+ve word,s_t=₀) to reflected an individuals starting tendency towards positivity.

Actions were chosen probabilistically, as a function of a propensity variable for choosing each action. This propensity was the action value Q(a,s) biased by a ‘positivity bias’ ρ, which quantified biases in favour of choosing positive attributes independent of learning (Eq. 2). Q(a,s)+ρ then entered a standard softmax function, weighed by a ‘decision noise’ parameter τ > 0:

(2)

P (a = + v e w o r d; s) = z e x p Q (a, s) + ρ τ P (a = − v e w o r d; s) = z e x p Q (a, s) τ

M2 \documentclass[10pt]{article} \usepackage{wasysym} \usepackage[substack]{amsmath} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage[mathscr]{eucal} \usepackage{mathrsfs} \usepackage{pmc} \usepackage[Euler]{upgreek} \pagestyle{empty} \oddsidemargin -1.0in \begin{document} \[ \begin{array}{*{20}{c}} {P(a = + veword;s) = z\;exp\frac{{Q(a,s) + \rho }}{\tau }}\\ {P(a = - veword;s) = z\;exp\frac{{Q(a,s)}}{\tau }} \end{array}{\rm{ }} \] \end{document}

Where z ensured that probabilities add up to 1.

Belief-update models

Belief-update models conceptualised participants as holding beliefs about how approving each computer persona was, from 0 to 1. Such beliefs do not contain just one value (‘this persona will give me 80% +ve attributions’) but also embody an uncertainty (‘but it could be 70 to 90%). They can be formalized by a beta distribution, which conveniently describes beliefs through the amount of positive evidence α and that of negative evidence β held in mind. The mean probability of approval is then the average p = α/(α + β).

The belief parameters were updated in every round by augmenting the evidence corresponding to the outcome (say, positive) by 1 piece of evidence. However, we sought to also model views about the self that participants brought to bear independent of learning. Greatly simplifying clinical theory (), we represented this as the positive and negative evidence people brought to bear. People thus held two belief components. The first was trait-like, (α_trait, β_trait), parameterized individual variability. It was fixed for the duration of the task, and represented the self- or other- view activated given the current context. The second was state-like, (α_state, β_state), and it accumulated task information.

(3)

α t = α t r a i t + α s t a t e, t β t = β t r a i t + β s t a t e, t

M3 \documentclass[10pt]{article} \usepackage{wasysym} \usepackage[substack]{amsmath} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage[mathscr]{eucal} \usepackage{mathrsfs} \usepackage{pmc} \usepackage[Euler]{upgreek} \pagestyle{empty} \oddsidemargin -1.0in \begin{document} \[ \begin{array}{*{20}{c}} {{\alpha _t} = {\alpha _{trait}} + {\alpha _{state,t}}}\\ {{\beta _t} = {\beta _{trait}} + {\beta _{state,t}}} \end{array}{\rm{ }} \] \end{document}

Next, we considered that individuals may not integrate an indefinite amount of evidence, instead gradually discarding older task information. Memory decay parameters 0 < η < 1 thus quantified a participant’s effective working memory. Belief-update models could include separate initial values α_{state, t}₌₀, β_{state, t=}₀. They could also be separated into self/other with respect to α_{trait, self}, α_{trait, other} etc., and with respect to initial values, or indeed the memory decay parameter.

Belief distributions inherently contain uncertainty, which can affect decision variability (). Hence, we considered two classes of probabilistic action choice. In the first, point estimates such as the mean of a belief distribution was used to determine policy. Here, choice variability was independent of belief uncertainty. In the second class, reduced belief uncertainty as a result of evidence accumulation resulted in reduced decision variability. We thus considered several ‘link functions’ from belief to choice (see Supplement), and determined the best by model comparison. The winning action-choice function was the one which only depended on the mean of the belief distributions (Eq. 4):

(4)

P (a = + v e w o r d; c o n t e x t = s) = z e x p α s (α s + β s) τ

M4 \documentclass[10pt]{article} \usepackage{wasysym} \usepackage[substack]{amsmath} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage[mathscr]{eucal} \usepackage{mathrsfs} \usepackage{pmc} \usepackage[Euler]{upgreek} \pagestyle{empty} \oddsidemargin -1.0in \begin{document} \[ P(a = + veword;context = s) = z\;exp\frac{{{\alpha _s}}}{{({\alpha _s} + {\beta _s})\tau }} \] \end{document}

A short summary of all models is displayed in Table 1. Detailed descriptions are given in the Supplement.

Table 1

Model families, grouped according to their defining core parameters.


MODEL FAMILY NAME	NP	CORE PARAMETERS	ADDITIONAL PARAMETERS

Valence model – 2λ	3–5	λ_+ve, λ_–ve, τ	Initial bias	Pos. bias

Self/other asymmetric valence – 3λ	4–6	λ_{self pos}, λ_self,–ve, λ_other, τ	Initial bias	Pos. bias

Self/other valence – 4λ	5–7	λ_self,+ve, λ_self,–ve, λ_other,+ve, λ_other,–ve,τ	Initial bias	Pos. bias

Belief-update	4	α, β, η, τ

Belief-update self/other	7	α_self, β_self, α_other, β_other, η_self, η_other, τ

Belief-update self/other initial bias	9	α_self, β_self, α_other, β_other, η_self, η_other, τ	α_initial	β_initial

Note: The ‘Additional parameters’ were used to optimize fit within each family and hence estimation accuracy for the parameters of core interest. NP gives the range number of parameters in each family, i.e. with or without parameters described as ‘additional’.

Modelling the relation to Fear of Negative Evaluation

We fitted all models using a hierarchical procedure that optimizes estimation of the relation between model parameters and symptomatic measures, i.e. by clinically informed model-fitting. Traditional hierarchical modelling reduces noise in parameter estimates, but we have found that empirical (population) priors which do not take adequately into account the possible correlations with external measures can increase the rates of Type 1 or Type 2 error, in subsequent correlation analyses with unmodelled psychometric measures (). Here, incorporating key psychological hypotheses in the model-fitting can give more accurate estimates of the relationship between model parameters and BFNE scores. As in traditional hierarchical modelling, individual parameters were estimated by taking into account the population distribution they came from, i.e. the ‘group prior distribution’. This was in turn estimated from the data, including BFNE scores. We embedded FNE into model-fitting by including slope parameters that estimated a linear contribution of BFNE scores on the mean of the population distribution whence individuals were sampled from, as detailed below.

Let θ be a cognitive parameter that may correlate with BFNE. We modelled this correlation as a linear relationship between BFNE and the mean of θ over people with that value of BFNE:

(5)

θ ~ N (μ θ (F N E), σ) μ θ (F N E) = w B F N E + θ 0

M5 \documentclass[10pt]{article} \usepackage{wasysym} \usepackage[substack]{amsmath} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage[mathscr]{eucal} \usepackage{mathrsfs} \usepackage{pmc} \usepackage[Euler]{upgreek} \pagestyle{empty} \oddsidemargin -1.0in \begin{document} \[ \begin{array}{*{20}{c}} {\theta \tilde\ N({\mu _\theta }(FNE),\sigma )}\\ {{\mu _\theta }(FNE) = w\;BFNE + {\theta _0}} \end{array}{\rm{ }} \] \end{document}

Where θ₀ is an intercept and in the first instance σ is taken to be independent of FNE. As a cognitive model is fitted using Eq. 5, the posterior distribution over the slope parameter w can be estimated, providing the credible interval over the dependence of of θ on FNE.

We fitted the learning models under consideration (Table 1) using RStan (). Following RStan convention, means over population-level parameters were scaled so as to be sampled from a standard normal distributions. The respective standard deviations were sampled from half-Cauchy distributions. The individual-level parameters were appropriately constrained in their native space (e.g. 0–1 for learning rates), then transformed so as to be subject to the Gaussian distributions informed by the relevant group priors. We initialised Markov-Chain Monte Carlo chains with random starting values. Posterior distributions were formed after 1000 burn-in samples from 4 chains, resulting in a total sample size of approximately 8,000. Convergence was determined by visual inspection of the trace plots and monitoring the Gelman-Rubin statistic for each parameter (), with values close to 1.00 implying convergence.

We compared the goodness of fit of different models via approximate leave-one-out cross-validation (Loo). This provides a measure of the likelihood of left-out data, suitable for estimating model-fit in hierarchical models (). We then examine the credible intervals of correlation parameters (w above) between BFNE and specifically hypothesized parameters (learning rates, beliefs about the self and others) separately in the winning associative and belief-based models. A hypothesis that a parameter correlated with BFNE was tested by determining whether the credible interval of w included zero.

Results

Model fitting and model comparison

Model comparisons using left-out likelihood (LOO) () showed that associative learning models that included separate learning rates for self outperformed ones that did not distinguish between agents. There were also big improvements in model fit upon including an initial bias parameter that allowed individuals to vary in an initial propensity to choose a positive word, and upon including a constant ‘positivity bias’ boosting the action-value of positive information. Although the best-fitting associative learning model in absolute terms was the self/other valence model, LOO model comparison indicated weak evidence for this model over the next best-fitting model with fewer parameters. We thus also took account parameter recoverability, which was enhanced by having fewer parameters. We thus selected for further work the ‘self/other asymmetric valence model’, with 3 learning rates, an initial bias parameter and a positive bias parameter (see Supplementary Information for details of the full self/other valence model).

As shown in Table 2, the best-fitting model overall was a belief-update model with separate self/other alpha, beta and memory parameters and also had free initial bias parameters which also included starting beliefs to vary between individuals. Again, LOO model comparison indicated weak evidence for this model. Following a similar rationale as for the associative models, we selected for further work a ‘separate self/other’ model with a shared memory parameter. The belief-update model without separate initial α and β parameters also performed almost as well as the best models in their respective families. However, the parameters involved might relate to our hypotheses regarding self-Other activated schemata, and hence we proceeded simply with the best-LOO models. Belief models with separate ‘trait’ parameters for self and other performed much better than models without, emphasizing a necessary distinction between self and other in learning. We include more details for all models considered above in the supplement.

Table 2

The best models from each family according to approximate leave-one-out cross-validation. Final models selected are given in bold.


MODEL FAMILY NAME	N. PARAM	LOO

Valence	3–5	–10026

Self/other valence	5–7	–9858

Self/other asymmetric valence	4–6	–9862

General learning rate	3–5	–9966

Belief-update IB	7	–9954

Belief-update self/other IB	8	–9768

Belief-update self/other full IB	9	–9762

Note: IB refers to models with Initial Bias parameters.

Although the belief-based model had better fit statistics overall, we asked whether this was because it fitted most people better than the associative models, or whether those that were better described by associative models were in the minority. To estimate this, we simply examined the distribution of the difference between maximum-likelihood (ML) estimates for the associative vs. belief-based models, shown in Figure 2. This indicates that for the majority of participants there was no clear difference between the models, but for about a fifth there was conventionally strong evidence that one or the other model gave a better account of the data. We did not find a significant correlation between BFNE score and the belief-associative ML difference. Here, we computed the difference in log-likelihoods between the two models, with larger differences indicative of one model describing the data better than the other. There was no significant correlation of log-likelihood with BFNE score when models were analysed separately either.

Figure 2

Individual log likelihoods for associative learning vs belief-update model. Positive values indicates greater evidence for the associative learning model. The horizontal bars indicate log likelihood differences of +/–3 and +/–6, conventionally mild and strong evidence in favour of one model over the other.

The relationship between BFNE and model parameters

Based on the literature () and the theory of self-schema, we examined the specific hypotheses that BFNE would relate to the trait-evidence in the self schema (α_self and/or β_self) or the corresponding learning rates λ_self,+ve and λ_self,-ve (See supplement for the theoretical derivation of this approximate correspondence). We also examined in an exploratory manner whether the other parameters of the winning models correlated with BFNE scores. We assessed each of the BFNE weight parameters to determine whether their credible interval overlapped 0, which would not support an effect of BFNE on that parameter (Table 3).

Table 3

Parameter weights on FNE, derived from clinically informed model-fitting.


Associative Learning parameter	Mean w [Lower CI – Upper CI 95%]	Belief-update parameter	Mean w[Lower CI – Upper CI 95%]

λ_self,+ve	0.01 [–0.09 0.09]	α_self	–0.47 [–0.87 –0.06]

λ_self,–ve	0.11 [0.02 0.20]	β_self	–0.24 [–1.55 1.08]

λ_other	–0.05 [–0.19 0.09]	α_other	–0.02 [–0.16 0.19]

τ	–0.07 [–0.01 0.15]	β_other	0.07 [–0.31 0.45]

Initial bias	–0.09 [–0.19 0.01]	η	–0.22 [–0.56 0.13]

Pos. bias	–0.09 [–0.19 0.01]	τ	–0.09 [–0.25 0.06]

		α_initial	–0.39 [–0.99 0.22]

		β_initial	–0.97 [–5.07 3.13]

^aNote: Mean weights and 95% credible intervals for self/other valence model and self/other belief-update model are shown, with intervals not containing zero shown in bold.

The only associative weight parameter that did not have credible intervals including zero was for the self-negative learning rate (see Table 3). This weight parameter was positive, indicating the higher the individual is in FNE, the larger the self-negative learning rate will be. Therefore, it appears that in an associative learning framework, fear of negative evaluation is specifically related to over weighting of negative information, while positive information processing appears intact.

The only belief-update weight parameter that did not have credible intervals including zero was was between BFNE score and the α_self,+ve parameter (see 3). This weight parameter was negative, indicating the higher the individual is in FNE, the lower the amount of positive evidence in the self-schema, α_trait,self, will be. The more negative balance of the self-schema then decreases the mean belief in approval in individuals with higher FNE.

We then explored whether the best fitted parameter values provided evidence for the theoretical correspondence between the two models. From the MLE fit parameters, indeed, α_trait,self was strongly anticorrelated with the λ_–ve,self, Spearman r = –0.49, raw p = 3.006e–07 and β_trait,self, Spearman r = –0.3, raw p < .01 (Spearman’s rho was used due to non-normality). λ_–ve,self was also correlated with β_trait,other, Spearman r = –0.21, p = 0.04, but none of the other parameters of the belief-model. Finally, λ_–ve,self was also strongly anticorrelated to the proportion of activated positive self-beliefs, represented by the mean of the beta distribution (Spearman r = –0.27, p < 0.01), although this is of not, of course, an independent relationship. The best fitted parameter values from the MCMC fits indicated an even stronger relationship, with the key parameters α_trait,self being strongly anticorrelated with the λ_–ve,self, Spearman r = –0.85, raw p = 1.5349e–29, giving evidence that people with larger learning rates for self-negative information also have lower positive self-belief. Again, there was a strong relationship between the λ_–ve,self parameter and the proportion of activated positive self-beliefs derived from the mean of the self beta distribution, Spearman r = –0.78, raw p = 4.4583e–22. There was also a positive correlation between the initial bias and α_trait,self parameter, suggesting they represent similar concepts (Spearman r = 0.50, p < .001) and suggesting people with lower positive self-belief have a prepotent starting tendency towards more negative responses. None of the other parameters indicated correlations.

Generative Performance

Crucially, good models not only statistically fit the data overall, but are also able to capture specific data features of interest that have not been privileged during modelling (). We therefore tested this using our best-fit models. The best associative learning model and belief-update models were used to generate pseudo-data from 100 sample datasets consisting of 1000 participants each, simulating ‘ideal experiment’ conditions, here with more subjects than resource constraints allow. We checked whether these synthetic experiments reproduced the published findings from real people Button et al. (), ran the same formal statistical tests, and examined the credible intervals of each result over simulated samples. We computed the percentage positive response for each persona from the generated data as the number of positive word choices made/32 (number of trials). We ran linear mixed effects (LME) analyses including BFNE scores, persona (like/neutral/dislike) and referential condition (self/other) as predictor variables and percent positive response as outcome variable.

As illustrated in Figures 3 and 4, the generated data reproduced most key features of the real experiment. Table 4 shows that the LME results presented in () were well reproduced. Using generated data from the belief-update model, we replicated almost all of the main and interaction effects in over 95% of the samples. The three-way interaction, however was slightly underestimated. The associative learning model did better in this regard, not only replicating all of the main and interaction effects, but also providing evidence for the significant three-way BFNE × persona × condition interaction in over 95% of the samples. Both models slightly overestimated the BFNE difference for the neutral condition.

Figure 3

Generative performance for the Associative Learning S/O asymmetric model; mean cumulative positive words chosen for actual data (in black) vs. data generated from ‘clinically informed fitting’ (cyan). Data is visualised using median-split FNE scores (lighter=lower BFNE) and shaded zones represent +/– SEM. The generated data captures the asymmetries in positive vs. negative word selection and the group differences between high and low FNE for the self-referential condition well. There is slower initial learning, especially in the like condition and this model chooses over-optimistically, especially in ‘dislike’ conditions.

Figure 4

Generative performance for the Self/Other Belief-Update model; mean cumulative positive words chosen for actual data (in grey) vs. model (mauve). Again data is visualised using median-split FNE scores, with shaded zones representing +/– SEM for high (darker shade) vs. low (lighter shade) BFNE scores. The generated data captures well the asymmetries in positive vs. negative word selection and the group differences between high and low FNE for the crucial self-referential dislike condition.

Table 4

Generative performance statistics.


CONTRAST	ASSOCIATIVE LEARNING MODEL MEAN β COEFFICIENT	% OF SIG SAMPLES	BELIEF-UPDATE MODEL MEAN β COEFFICIENT	% OF SIG SAMPLES

Main effect BFNE	–0.74 [–0.75 –0.73]	100	–0.73 [–0.74 –0.72]	100

Main effect self/other	–13.28 [–13.56 –13.00]	100	–13.52 [–13.84 –13.20]	100

Main effect persona: like	21.55 [20.98 22.11]	100	24.20 [23.53 24.88]	100

Main effect persona: neutral	19.36 [18.57 20.16]	100	15.97 [15.22 16.73]	94

BFNE × self/other	0.32 [0.32 0.33]	100	0.28 [0.27 0.29]	100

BFNE × persona: like	0.74 [0.73 0.76]	100	0.70 [0.68 0.71]	100

BFNE × persona: neutral	0.19 [0.17 0.20]	34	0.26 [0.25 0.28]	61

BFNE × self/other × persona	–0.30 [–0.31 –0.29]	100	–0.23 [–0.24 –0.21]	89

^aNote: [Lower CI Upper CI 95%].

Discussion

We aimed to understand learning about self and others in those fearful of social evaluation, by formalizing and comparing two classic psychological perspectives. This is important, as the way in which belief-based accounts used by clinicians should be formalized is unknown, as is how valid they are and whether they are distinct from associationist accounts. Using a well-established Social Evaluation Learning task, we provide evidence that reduced positive content within activated self-schemata underpins increased sensitivity to negative evaluation in socially anxious individuals. Individuals with a less positive self-schema also had a larger self-negative learning rate when investigated using the associative framework. Both associative learning and belief-based models described social learning well, with belief-models especially able to capture the interaction between task context and participant disposition.

We replicated, and also refined, influential findings on associative learning in social anxiety (). Using a task with evidence for reproducibility at the psychological level (, ), we reproduced the model results reported in (). Namely, socially anxious (high-FNE) individuals had higher learning rates governing the impact of negative information on predictions about the self. We finessed this associative account by including a ‘positivity parameter’, thus better accounting for participants’ optimism bias (). We also showed that learning rates for positive and negative feedback for the other-referential context were not distinguishable from each other, further pointing at the relevance of self-bias in social anxiety.

Detecting the dependence of task parameters on FNE in this subclinical sample was established through clinically informed model-fitting, which makes use of a fundamental property of hierarchical statistical models. These infer the characteristics of each individual not only from the data they provided, but also from the specific population from which they are drawn. Clinically informed model-fitting allowed (yet did not force) empirical priors over cognitive parameters like learning rates to be informed by clinical data, here BFNE scores (). It thus allowed more accurate estimation of the correlation between parameters and FNE. Research is starting to benefit from clinically informed model-fitting ().

To examine whether key features of successful associative learning models were understandable in terms of self-beliefs, which statistically account for improvement during therapy (), we formulated a very simple model of social belief update. We assumed that upon entering a context of evaluation of self or other, individuals activate beliefs about themselves (or others), over and above the evidence gleaned during the task. We focues on the trait-like component of activated schemata, which are constant for the duration of the task but may differ according to the contextual focus of evaluation. This activated self-schema consisted of positive and negative ‘notional’ evidence that each individual brought to mind. We hypothesised that ordinary beliefs could be modelled as Bayesian beliefs, so that the strength of belief could be quantified much like in CBT (‘I believe 70–80% that I will be judged positively’). This meant that belief change not only depended on evidence, but also on the certainty of prior beliefs (). Overall, the success of belief-based models suggested that this was indeed the case. Next, we hypothesized that the amount of evidence that each individual processed would be variable, in effect a working memory capacity. Again, the evidence supported this hypothesis. Another ‘signature’ of belief-based cognition might be that more uncertain participants would show increased decision variability. However, model comparison provided evidence against this.

Most importantly, FNE was predicted by the amount of positive evidence about the self that was held in mind independent of task feedback. The variation in this positive self-evidence accounted for almost half the variance in self-negative learning rates. This was not, however, the only important model feature, as there was also evidence for reduced negative self-evidence. Combined, these two features may mean that social anxiety is associated with greater uncertainty in one’s beliefs about the self. Such increased uncertainty would predict lesser stability of self-evaluation, reminiscent of the changeable self-evaluation found in individuals with low self-esteem (). Importantly, the proportion of positive to negative self-evidence was greater in those with lower self-negative learning rates. Thus an activated self-schema including more positive evidence correlated strongly with diminished association value for negative attributes, largely reconciling cognitivist and behaviourist perspectives.

Leave-one-out cross-validation measures suggested that belief-based models may give a better account of behaviour overall, but this finding is likely to hide important individual differences in learning mechanisms. Preliminary analyses indicated that a minority of individuals substantially favoured associative learning, while others belief-updating. Belief-based models are a simple case of model-based cognition, updating the probability of a transition in the environment (that a persona will judge one positively), while the association models are model-free, incrementally associating values to actions. Thus, some people may be more model-based, whereas others more model-free in the domain of self-evaluation, as people are in impersonal cognition (; ).

Our study has the potential to inform treatments for social anxiety. Simple tasks, like the one used here, may assess both the extent of biases and also the patient’s predominant cognitive style (belief-based or associationist). Importantly, we describe cognitive mechanisms quantifying and lending support to self-schema theories of social anxiety, reproducing several features of self- and other- evaluation between groups with high and low fear of negative evaluation. Clinically, our results point towards strengthening psycho-education by incorporating rigorous research showing that patients are excessively influenced by negative feedback. In therapy, patients may benefit by learning to activate positive evidence about themselves ‘on line’, specifically upon exposure to negative feedback, consistent with the work of Kube et al. (). Ideally, however, testing such interventions should be guided by a reliable estimate of each individual’s cognitive parameters, rather than by features of their condition in general. Here, as is often still the case with computational analyses, further progress is needed (). Being able to quantify individuals’ self-views may also prove to be useful for assessing the deeper changes that therapy has achieved, rather than just symptomatic change ().

There are important limitations to the modelling employed in this study. Our models include a number of hypothesis-driven additional parameters, which aim to capture well-known psychological phenomenon, such as the optimism bias Sharot et al. () or initial starting propensity towards positive or negative responses (). When performing simulations to assess parameter recoverability, some parameters relevant to our hypotheses were difficult to recover. Limited recovery of the ‘initial bias parameters’ from the belief-update model and ‘positivity bias’ from the associative learning model (see supplement) suggest that our study may have lacked power to detect differences with respect to FNE with respect to these parameters. Aside from reduced power, the poor recoverability of some parameters renders the model less reliable at the individual level. Nevertheless, fit measures and synthetic data studies indicated that the more complex models, though over-parameterized given our concise data at the individual level, were best in describing the subtle differences in learning associated with FNE in our population. Future studies will need data capable of more fully constraining model parameters, and possibly alternative parameterizations of key models.

Despite the decreased reliability of specific parameters and possibly because of the increased accuracy of complex models, we are able to detect our main effects of interest, and found good recoverability for the positive self-belief and self-negative learning rate. Future studies using clinical populations with larger differences at the behavioural level could observe even greater effect sizes. Thus, our study is well able to detect group level differences in learning between the high vs low FNE groups (the main objective of the study), but poor at capturing individual level differences reliably (). An important consideration for our more complex models was the ability to reproduce key behavioural statistics of the data, which () recommend as a method of model falsification. Simpler models, despite showing good fit statistics, were unable to capture the key FNE group differences between self and other conditions (see supplement), thus we preferred models with good fit statistics as well as generative performance. Finally, our modeling of evidence about the self was rudimentary compared to the sophistication of clinical research on self representations(; ). Future studies modelling self-representations could combine our hierarchical clinically informed model fitting approach with this previous work.

In conclusion, individuals who are high in fear of negative evaluation (yet not care-seeking patients) are more affected by negative social feedback, compared to those unafraid of such feedback. The robustness of typical individuals is consistent with activation of more positive beliefs about themselves independently of feedback, acting as a ‘buffer’ against developing negative expectations. If replicated, this finding can inform therapeutic interventions aiming at activating positive views of self when people are in the crucible of social judgment.

Additional File

The additional file for this article can be found as follows:

Supplementary Information

Beliefs & Associations in Social Learning. DOI: https://doi.org/10.5334/cpsy.57.s1

Computational Psychiatry

Research Articles