INTRODUCTION

Autism spectrum disorder (ASD) is a neurodevelopmental disorder that affects a broad range of cognitive functions, including perception (Simmons et al., ), action (Gowen & Hamilton, ), and social cognition (Baron Cohen, ). In particular, behavioral rigidity manifested as restricted, repetitive behavior and resistance to change is a core ASD symptom (American Psychiatric Association, ; Leekam, Prior, & Uljarevic, ; Poljac & Bekkering, ; Poljac, Hoofs, Princen, & Poljac, ), although such behavioral rigidity can also be observed in other psychiatric disorders (Lewis & Kim, ; Zandt, Prior, & Kyrios, ). Behavioral rigidity in ASD comprises various behavioral categories, such as stereotyped motor mannerisms (e.g., hand flapping) and self-injurious or compulsive behavior (Bishop et al., ; Lord & Jones, ). Although this reduced behavioral flexibility severely limits the social adaptation of patients, its cause and the underlying cognitive mechanisms remain unclear.

There have been many studies aiming to construct theories that explain the mechanisms underlying autistic symptoms (Baron Cohen, ; Happé & Frith, ; Hill, ), and recently the focus of these attempts has shifted to the idea of describing fundamental brain function as a set of computational processes (Redish & Gordon, ). In particular, theoretical explanations based on prediction error minimization frameworks, such as predictive coding (Bar, ; Den Ouden, Kok, & de Lange, ) and the free energy principle (Friston, Daunizeau, Kilner, & Kiebel, ), have been well investigated because they may be able to explain a wide range of autistic symptoms using a single, neurologically plausible principle (Friston, Lawson, & Frith, ; Lawson, Rees, & Friston, ; Pellicano & Burr, ; Van de Cruys et al., ; Van de Cruys, Van der Hallen, & Wagemans, ; van Boxtel & Lu, ; van Schalkwyk, Volkmar, & Corlett, ). The prediction error minimization mechanism explains how we acquire knowledge and skills (learning) and how we successively infer the causes of sensory inputs and recognize environments, as a single process of updating a model of the world so as to minimize the error between predicted and actual sensory inputs. Within a scheme in which prediction error drives the brain to update its model of the world, it is crucial to estimate the precision (inverse variance) of sensory information: the expected precision of a given sensory signal indicates how reliable the resulting prediction error is, and therefore how much weight that error should receive when predictions are updated. For example, prediction errors for sensory inputs that contain information refuting the current expectation (e.g., one looks around the seabed in clear water and what seems like sand suddenly moves) should cause the brain to update its expectation (one recognizes it is not sand but a flatfish), whereas errors in sensory inputs that are very noisy (one looks around the seabed in murky water and something moves) should not cause an update (one would attribute the movement to a passing wave). Estimating such context-dependent sensory precision (predicting whether information is informative or just noise) helps us remain flexible and adaptable in an uncertain world, and deficits in this estimation are expected to cause perceptual peculiarities and great difficulty in social contexts, which are filled with situations of particularly high complexity and uncertainty (Lawson et al., ; Palmer, Lawson, & Hohwy, ; Van de Cruys et al., , ; van Schalkwyk et al., ). Van de Cruys et al. () suggested that inflexibly overestimated sensory precision causes autistic symptoms and that inflexible behavior may be regarded as an attempt to minimize prediction errors; otherwise, patients are exposed to huge error signals. Lawson et al. () explained autistic behaviors as the consequences of “an imbalance of the precision ascribed to sensory evidence relative to prior beliefs.” These aberrant precision accounts of ASD are normative and testable, but only suggestive. Specifically, there is a gap between the cognitive mechanisms described in the theories and the actual generation of the symptoms.
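To make the precision-weighting idea above concrete, the following minimal sketch (purely illustrative and not part of the robot model introduced later; the function, its parameters, and the numbers are hypothetical) shows how the same prediction error shifts a belief strongly when sensory precision is estimated to be high, and hardly at all when it is estimated to be low.

```python
def update_belief(belief, observation, sensory_precision, prior_precision=1.0):
    """Precision-weighted update of a scalar belief about a sensory cause.

    The prediction error (observation - belief) is scaled by the relative
    precision of the senses versus the prior, so reliable input revises the
    belief strongly while noisy input barely moves it.
    """
    prediction_error = observation - belief
    gain = sensory_precision / (sensory_precision + prior_precision)
    return belief + gain * prediction_error


belief = 0.0        # "that patch is sand"
observation = 1.0   # "the patch just moved"
print(update_belief(belief, observation, sensory_precision=10.0))  # clear water: ~0.91, belief revised
print(update_belief(belief, observation, sensory_precision=0.1))   # murky water: ~0.09, belief mostly kept
```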

This kind of problem is broadly described in psychiatry, and there is a need to demonstrate actual generation of symptoms using formal computational models (Adams, Huys, & Roiser, ; Friston, Stephan, Montague, & Dolan, ; Huys, Maia, & Frank, ; Montague, Dolan, Friston, & Dayan, ; Teufel & Fletcher, ). Indeed, several computational simulations of psychiatric symptoms have been conducted to try to understand the processes underlying these symptoms and clarify the relationships between abnormalities at neurological and behavioral levels (Barakova & Chonnaparamutt, ; Brown, Adams, Parees, Edwards, & Friston, ; Diwadkar et al., ; Krichmar, ; O’Loughlin & Thagard, ; Powers, Mathys, & Corlett, ; Rosenberg, Patterson, & Angelaki, ; Yamashita & Tani, ). In particular, embodiment (Asada et al., ; Smith & Gasser, ) in a robot agent acting in physical environments may be useful, or even essential, for understanding the cognitive mechanisms of psychiatric disorders. That is because psychiatric disorders are characterized by behavioral and perceptual conditions observed through interaction with real environments and physical agents. In a related study, Yamashita and Tani () performed a neurorobotics experiment to investigate schizophrenic cognition by utilizing a hierarchical neural network model. Their robotic experiment showed that behaviors analogous to psychiatric symptoms, such as fictive sensations and cataleptic, stereotyped behaviors, can be generated in the coupled dynamics describing the neural networks, body, and environment due to synaptic disconnections between different levels of the neural network.

In this study, we investigated the effects of increased and decreased sensory precision on adaptive behaviors by conducting experiments using a humanoid robot implemented with a version of the predictive coding model. In the experiment, a task involving adaptive interaction between the robot and a human experimenter was considered. Initially, the neural network model inside the robot learned to generate a set of sequence patterns representing different behaviors of the robot. After the learning phase, the level of estimated sensory precision was manipulated. Then, the change in the robot’s behavior in response to the alteration of the level of sensory precision was observed through experiments in which the robot was required to appropriately recognize situations determined by the experimenter. The results show that both increased and decreased sensory precision can cause seemingly similar inflexible behavioral patterns, such as inappropriate repetitive behavior and freezing, but that these behaviors are the result of different network-level processes in the two cases. Our findings may provide a system-level account for different types of behavioral rigidity observed in ASD and other psychiatric disorders and extend computational perspectives on the cognitive mechanisms underlying psychiatric symptoms.

METHODS

Computational Framework

We used an artificial recurrent neural network (RNN) model to investigate the effects of increased and decreased sensory precision on adaptive behaviors of a robot. An RNN is a connectionist model that can process temporal sequences thanks to recurrent connections between neural units (Elman, ). Owing to their capacity to learn to reproduce complex dynamic behaviors, RNNs have been used in cognitive neurorobotics studies aiming to understand human cognition (Alnajjar, Yamashita, & Tani, ; Marocco, Cangelosi, Fischer, & Belpaeme, ). Murata, Namikawa, Arie, Sugano, and Tani (), within the cognitive robotics scheme, proposed an RNN model with a mechanism for estimating the time-varying uncertainty of sensory information in terms of variance (inverse precision) as inspired by the free energy minimization principle proposed by Friston et al. (). This RNN, called a stochastic–continuous time RNN (S-CTRNN), can learn to predict not only sensory inputs but also their variances based on negative log-likelihood minimization, which is equivalent to precision-weighted prediction error minimization. Tani, Ito, and Sugita () proposed an RNN with parametric bias (RNNPB) that has an online adaptation mechanism based on prediction error minimization. In this framework, parametric bias (PB) is encoded in a small group of neural units that works as a higher level neural representation of the network behavior, and the associations between specific patterns of PB activity and different temporal training patterns are self-organized through a learning process. Owing to this characteristic of PB, a robot driven by RNNPB can not only generate multiple learned behavioral patterns but also switch its behavior by adaptively modulating the PB states in response to a discrepancy between a prediction and actual sensory information. PB states thus can be regarded as the higher level “intention” of a robot (Figure 1). Utilizing this model, Ito, Noda, Hoshino, and Tani () demonstrated flexible switching of ball-playing behaviors by a humanoid robot in response to changes in the environment.

Figure 1.  

The S-CTRNN utilized in this study. The S-CTRNN has five groups of neural units: input, context, output, variance, and PB units. Input neural units receive current sensory inputs $x_t$. Based on the inputs, PB state $p_t$, and context state $c_t$, the S-CTRNN generates predictions about the mean $y_t$ and variance $v_t$ of future inputs in the output and variance units, respectively. Parameters, such as the synaptic weights $w_{ij}$ and the internal states of the PB units, are optimized by minimizing the negative log-likelihood calculated from the predicted sensory states, their variances, and the actual target sensory states $\hat{y}_t$.

In the present study, an S-CTRNN with PB was adopted as the computational model for simulating aberrant sensory precision because of its capacity to learn to estimate sensory variance (precision) and adapt to different environments using a prediction error minimization mechanism. The following subsections describe in detail the mathematical procedures used for the forward dynamics and parameter optimization of the S-CTRNN with PB.

Forward Dynamics

The neuronal model is a conventional firing rate model. The internal state of the $i$th neural unit at time step $t$, $u_{t,i}^{(s)}$ ($t \geq 1$), is described by

(1)
$$
u_{t,i}^{(s)} =
\begin{cases}
u_{t-1,i}^{(s)} & i \in I_P, \\[6pt]
\dfrac{1}{\tau_i}\left(\displaystyle\sum_{j \in I_I} w_{ij} x_{t,j}^{(s)} + \sum_{j \in I_C} w_{ij} c_{t-1,j}^{(s)} + \sum_{j \in I_P} w_{ij} p_{t,j}^{(s)} + b_i\right) + \left(1 - \dfrac{1}{\tau_i}\right) u_{t-1,i}^{(s)} & i \in I_C, \\[6pt]
\displaystyle\sum_{j \in I_C} w_{ij} c_{t,j}^{(s)} + b_i & i \in I_O, I_V.
\end{cases}
$$
Here, $I_I$, $I_P$, $I_C$, $I_O$, and $I_V$ are the index sets of the input, PB, context, output, and variance neural units, respectively; $w_{ij}$ is the weight of the synaptic connection from the $j$th neuron to the $i$th neuron; $x_{t,j}^{(s)}$ is the $j$th input of the $s$th sequence at time step $t$; $c_{t,j}^{(s)}$ is the $j$th context state; $p_{t,j}^{(s)}$ is the $j$th PB state; $b_i$ is the bias of the $i$th neuron; and $\tau_i$ is the time constant of the $i$th neuron. From this equation, we see that PB units can be considered a specific type of context unit whose time constant is infinite. The current study sets all initial values of the internal states of the context units to zero, while those of the PB units are optimized for each target sequence in the learning phase. This indicates that differences between target sequences are represented in the activity of the PB units.

The activation values of each neural unit are calculated as follows:

(2)
$$ p_{t,i}^{(s)} = \tanh\!\left(u_{t,i}^{(s)}\right) \qquad (0 \leq t,\; i \in I_P), $$
(3)
$$ c_{t,i}^{(s)} = \tanh\!\left(u_{t,i}^{(s)}\right) \qquad (0 \leq t,\; i \in I_C), $$
(4)
$$ y_{t,i}^{(s)} = \tanh\!\left(u_{t,i}^{(s)}\right) \qquad (1 \leq t,\; i \in I_O), $$
(5)
$$ v_{t,i}^{(s)} = \exp\!\left(u_{t,i}^{(s)}\right) \qquad (1 \leq t,\; i \in I_V). $$
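As a reading aid, one time step of the forward dynamics in Equations 1–5 might be sketched in NumPy as follows. This is a simplified illustration rather than the authors' implementation: the weight matrices (W_ci, W_cc, W_cp, W_oc, W_vc), bias vectors, and variable names are assumptions made for the example, and the PB internal state is simply held constant, as in Equation 1.

```python
import numpy as np

def forward_step(x_t, u_c_prev, u_pb, params, tau=4.0):
    """One time step of an S-CTRNN with PB (sketch of Eqs. 1-5)."""
    W_ci, W_cc, W_cp, W_oc, W_vc, b_c, b_o, b_v = params
    c_prev = np.tanh(u_c_prev)       # previous context activation (Eq. 3)
    p_t = np.tanh(u_pb)              # PB activation; PB internal state stays constant (Eqs. 1-2)
    # Leaky integration of the context internal state (Eq. 1, i in I_C)
    u_c = (1.0 / tau) * (W_ci @ x_t + W_cc @ c_prev + W_cp @ p_t + b_c) \
          + (1.0 - 1.0 / tau) * u_c_prev
    c_t = np.tanh(u_c)               # context activation (Eq. 3)
    y_t = np.tanh(W_oc @ c_t + b_o)  # mean prediction of the next sensory input (Eqs. 1 and 4)
    v_t = np.exp(W_vc @ c_t + b_v)   # predicted variance of the next sensory input (Eqs. 1 and 5)
    return u_c, y_t, v_t
```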

Parameter Optimization

The neural network performs parameter optimization based on the gradient descent method, aiming to minimize the objective function

(6)
$$ L_{t,i}^{(s)} = \frac{\ln\!\left(2\pi v_{t,i}^{(s)}\right)}{2} + \frac{\left(\hat{y}_{t,i}^{(s)} - y_{t,i}^{(s)}\right)^2}{2 v_{t,i}^{(s)}}, $$
where $\hat{y}_{t,i}^{(s)}$ is the $i$th target value of the $s$th sequence at time step $t$. Minimizing this negative log-likelihood can be regarded as minimizing the precision-weighted (inverse variance–weighted) prediction error and is formally equivalent to minimizing free energy in the active inference scheme proposed by Friston et al. ().

In the learning phase, parameters, including the synaptic weights $w_{ij}$, biases $b_i$, and initial internal states of the PB units $u_{0,i}^{(s)}$ ($i \in I_P$), are updated in an offline manner. Parameter optimization is performed by minimizing the sum of the negative log-likelihood over all dimensions, time steps, and sequences as

(7)
$$ L = \sum_{s \in I_S} \sum_{t=1}^{T^{(s)}} \sum_{i \in I_O} L_{t,i}^{(s)}, $$
where $I_S$ and $T^{(s)}$, respectively, represent the index set and the length of the $s$th target sequence. The partial derivative of $L$ with respect to each parameter, $\partial L / \partial \theta$, can be computed using the back-propagation-through-time (BPTT) method described in previous studies (Murata et al., ; Rumelhart, Hinton, & Williams, ).
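A minimal sketch of the objective in Equations 6–7 is given below, assuming the predictions y, variances v, and targets y_target for one sequence are stored as arrays of shape (time steps, output dimensions); the array and function names are illustrative.

```python
import numpy as np

def negative_log_likelihood(y, v, y_target):
    """Element-wise precision-weighted prediction error (Eq. 6)."""
    return 0.5 * np.log(2.0 * np.pi * v) + (y_target - y) ** 2 / (2.0 * v)

def total_loss(sequences):
    """Eq. 7: sum of Eq. 6 over output dimensions, time steps, and sequences.

    `sequences` is an iterable of (y, v, y_target) triples, one per training sequence.
    """
    return sum(negative_log_likelihood(y, v, y_target).sum()
               for y, v, y_target in sequences)
```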

In the adaptation phase, after learning, only the internal states of the PB units are optimized online, and other parameters are fixed. The negative log-likelihood within a short time window W is accumulated as

(8)
$$ L = \sum_{t' = t-W+1}^{t} \sum_{i \in I_O} L_{t',i}^{(s)}. $$
The time window of length $W$ moves along with the increment of the network time step $t$. Using the accumulated negative log-likelihood, the internal states of the PB units at time step $t - W$ are optimized. The partial derivatives with respect to the internal states of the PB units are also calculated by the BPTT algorithm.

In both the learning and adaptation phases, parameters that are allowed to be optimized are collected as a vector θ, and θ at the nth epoch is updated using gradient descent on the accumulated negative log-likelihood L:

(9)
$$ \theta_n = \theta_{n-1} + \Delta\theta_n, $$
(10)
$$ \Delta\theta_n = -\alpha \frac{\partial L}{\partial \theta} + \eta \, \Delta\theta_{n-1}. $$
Here, α is the learning rate and η is a coefficient representing the momentum term. In this study, α and η are set at 0.0001 and 0.9, respectively.
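The update rule of Equations 9–10 amounts to gradient descent with momentum; a minimal sketch is given below, with the gradient dL_dtheta assumed to be supplied by BPTT (all names are illustrative).

```python
import numpy as np

def update_parameters(theta, dL_dtheta, delta_prev, alpha=0.0001, eta=0.9):
    """One optimization epoch (Eqs. 9-10): gradient descent with momentum."""
    delta = -alpha * dL_dtheta + eta * delta_prev   # Eq. 10
    theta_new = theta + delta                       # Eq. 9
    return theta_new, delta
```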

Task Setting

To provide the robot with a task suitable for testing our hypothesis that aberrant sensory precision induces behavioral rigidity, we require a dynamical interaction setup in which the robot needs to perceive sensory information with intrinsic uncertainty and flexibly recognize situations determined by others. We chose a ball-playing scheme involving interaction between a robot and a human experimenter that was used in a previous study by Chen et al. (). The behavioral patterns of the robot consist of four different ball-playing behaviors (see Figure 2A). In the “right” and “left” behaviors, the robot is required to wait for the ball rolled by the experimenter and then return it. In the “self-play” behavior, the robot rolls the ball in front of itself, and in the “attract” behavior the robot performs an up–down arm motion while the partner plays with the ball alone, moving it left and right. After the S-CTRNN with PB learned to reproduce these visuo-proprioceptive temporal patterns, the behavioral performance of the robot with the trained neural network model was tested in the task of adaptive ball-playing interaction with a human experimenter.

Figure 2.  

Ball interaction tasks in the training and adaptation phases. A) Four interactive behavioral patterns learned by a robot controlled by an S-CTRNN with PB. The upper left and upper right figures show the right and left behaviors, respectively. The lower left and lower right figures show the self-play and attract behaviors. B) System overview during adaptive interaction between a robot and an experimenter. The solid lines for prediction and sensory input represent visual information about the ball position. The dotted lines represent proprioceptive information about the robot’s joint angles. The neural network generates predictions about sensory states $y_t$ and their variances $v_t$ based on current sensory inputs $x_t$ and also recognizes situations by updating PB activity online in the direction of minimizing the negative log-likelihood calculated using the predictions and the target signal (actual sensory feedback) $\hat{y}_t$.

Experimental Environment

We employed a small humanoid robot, NAO (Aldebaran), whose body corresponds to only the upper half of the human body. The robot sat in front of a workbench and engaged in a ball-playing interaction with a human experimenter standing on the opposite side of the bench. The robot’s action involved only movements of the arms, with 4 degrees of freedom for each arm (two in the shoulder and two in the elbow). In addition, a camera located in the robot’s mouth captured the center-of-gravity coordinates of the yellow ball, which were used as the two-dimensional input for ball position. Using the minimum and maximum values of each piece of sensory information, the values of the joint angles and the ball position were mapped to the range from −0.8 to 0.8. The size of the workbench and the diameter of the ball were approximately 45 × 5 × 30 cm and 9 cm, respectively.

Training

Training of the neural network was conducted in an offline manner by supervised learning using target perceptual sequences recorded in advance. The target perceptual sequences were recorded while the robot repeatedly performed each ball-playing behavior, where the arm movement was generated exactly following preprogrammed trajectories instead of the ones generated by the neural network model. Each of the four behavioral patterns was obtained as a sequence of 10-dimensional vectors (8 dimensions for joint angles and 2 dimensions for ball position). For the training, three sequences were prepared for each behavioral pattern. The time lengths of the sequences were approximately 1,600 time steps for “right,” 1,900 time steps for “left,” 1,600 time steps for “self-play,” and 1,200 time steps for “attract.”

The neural network learned to reproduce these target visuo-proprioceptive sequences. The objective of the learning is to find the optimal values of the parameters (synaptic weights, biases, and internal states of PB units) minimizing negative log-likelihood, or precision-weighted prediction error. At first, each parameter was initialized with a random value, and the network produced random sequences. The parameters were updated in the direction of minimizing negative log-likelihood accumulated through the duration of the target sequences. Repeating the update process many times, the network became able to produce visuo-proprioceptive sequences with the same stochastic properties as the target sequences. In addition, the associations between a particular pattern of target sequence and specific internal states of PB units self-organized.

Online Adaptation

After the learning process, the robot engaged in an adaptive interaction with a human experimenter by updating its PB states (intention) online. In this phase, the robot’s intention was first set to a certain state corresponding to a learned behavior, and the situation (ball dynamics pattern) was controlled by the experimenter. The goal of the robot was to flexibly recognize situations using visual cues. Real-time adaptation during task execution by the robot was performed based on an interaction between a top-down prediction generation process and a bottom-up parameter adaptation process. In the top-down prediction generation process, the network generated a temporal sequence corresponding to time steps from $t - W + 1$ to $t$, based on the sensory inputs at time step $t - W + 1$ and the constant PB states (intention). The visuo-proprioceptive sequence was generated by a “closed-loop” procedure, meaning that predictions about the mean values of the sensory states at a certain time step were used as inputs at the next step. The initial inputs for the proprioceptive states at time step $t - W + 1$ were taken from the generated mean predictions at $t - W$, and those for the vision states were taken from the vision data captured by the camera at time step $t - W + 1$. In the bottom-up adaptation process, the negative log-likelihood at each time step within the time window $W$ was calculated using the predictions about the vision states, their variances, and the actual visual feedback (see Figure 2B). The PB states (intention) were updated in the direction of minimizing the accumulated negative log-likelihood. Based on the updated PB states, the temporal sequence within the time window was regenerated. After repeating these top-down and bottom-up processes a certain number of times, the network generated its predictions for time step $t + 1$, and the predictions about the proprioceptive states were sent to the robot as the target for the subsequent joint positions. This procedure, in which recognition and prediction of the past are reconstructed based on current sensory information, is more properly regarded as a “postdiction” process (Eagleman, ; Shimojo, ), and the generated predictions for time steps from $t - W + 1$ to $t$ are more suitably referred to as postdictions of the past rather than predictions in the literal sense.
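The interplay between the top-down and bottom-up processes described above might be sketched as the loop below. The helpers generate_window (closed-loop regeneration of the last W steps under constant PB states, Eqs. 1–5) and window_grad (BPTT gradient of the windowed negative log-likelihood of Eq. 8 with respect to the PB internal states) are hypothetical placeholders passed in as arguments; the sketch only illustrates the structure of the adaptation, not the authors' code.

```python
import numpy as np

def adapt_pb(u_pb, vision_window, generate_window, window_grad,
             n_iters=20, alpha=0.0001, eta=0.9):
    """Online update of PB internal states inside the error regression window.

    generate_window(u_pb) -> (y_vis, v_vis): top-down closed-loop predictions
        for the vision dimensions over the last W steps (postdiction).
    window_grad(u_pb, y_vis, v_vis, vision_window) -> dL/du_pb: bottom-up
        gradient of the windowed negative log-likelihood (Eq. 8) via BPTT.
    """
    delta_prev = np.zeros_like(u_pb)
    for _ in range(n_iters):
        y_vis, v_vis = generate_window(u_pb)                    # top-down generation
        grad = window_grad(u_pb, y_vis, v_vis, vision_window)   # bottom-up error signal
        delta = -alpha * grad + eta * delta_prev                # Eqs. 9-10, PB states only
        u_pb = u_pb + delta
        delta_prev = delta
    return u_pb
```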

Parameter Setting for the Experiment

The numbers of input, output, and variance neural units were $N_I = N_O = N_V = 10$, corresponding to the dimension of the robot’s sensory states, and the number of PB units was $N_P = 2$. The number and time constant of the context units were $N_C = 50$ and $\tau_i = 4$, respectively. In the learning phase, the weights of the synaptic connections $w_{ij}$ ($j \in I_I, I_C$) were initialized with random values drawn from uniform distributions on the intervals $[-1/N_I, 1/N_I]$ (for $j \in I_I$) and $[-1/N_C, 1/N_C]$ (for $j \in I_C$), the biases $b_i$ were initialized from a uniform distribution on $[-1, 1]$, and the internal states of the PB units were initialized to 0. These parameters were updated offline 300,000 times in the learning phase. In the adaptation phase, the internal states of the PB units were updated online 20 times, and the length of the time window was $W = 10$.
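A minimal sketch of this initialization is shown below; matrix names and shapes are illustrative, and the range used for the PB-to-context weights is an assumption, as the text specifies ranges only for the input-to-context and context-to-context connections.

```python
import numpy as np

rng = np.random.default_rng(0)

N_I = N_O = N_V = 10       # input / output / variance units (sensory dimensions)
N_P, N_C, tau = 2, 50, 4.0
n_sequences = 12           # 4 behavioral patterns x 3 training sequences each

W_ci = rng.uniform(-1.0 / N_I, 1.0 / N_I, size=(N_C, N_I))   # input -> context
W_cc = rng.uniform(-1.0 / N_C, 1.0 / N_C, size=(N_C, N_C))   # context -> context
W_oc = rng.uniform(-1.0 / N_C, 1.0 / N_C, size=(N_O, N_C))   # context -> output
W_vc = rng.uniform(-1.0 / N_C, 1.0 / N_C, size=(N_V, N_C))   # context -> variance
W_cp = rng.uniform(-1.0 / N_P, 1.0 / N_P, size=(N_C, N_P))   # PB -> context (range assumed)
b_c, b_o, b_v = (rng.uniform(-1.0, 1.0, size=n) for n in (N_C, N_O, N_V))  # biases
u_pb = np.zeros((n_sequences, N_P))   # initial PB internal states, one per training sequence
```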

Simulating Aberrant Sensory Precision

This study simulated increased and decreased sensory precision by altering estimated sensory variance (inverse precision). After the network learned to reproduce the set of behavioral patterns, the activation values of the variance units were modified as

(11)
$$ v_{t,i}^{(s)} = \exp\!\left(u_{t,i}^{(s)} + K\right) + \epsilon \qquad (i \in I_V), $$
where K is a constant determining the level of the estimated variance and ϵ is its minimum value, set as 0.00001. K is set as 0 in the normal condition, while K is set to negative values in the decreased sensory variance conditions and to positive values in the increased sensory variance conditions (K ∈ {−8, −4, 0, 4, 8}).
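Equation 11 amounts to shifting the internal state of each variance unit by a constant K before the exponential, which scales the estimated variance up (K > 0) or down (K < 0); a minimal sketch with illustrative names and example values is given below.

```python
import numpy as np

def estimated_variance(u_v, K=0.0, eps=1e-5):
    """Variance-unit activation with the manipulation of Eq. 11."""
    return np.exp(u_v + K) + eps

u_v = np.zeros(10)              # internal states of the 10 variance units (example values)
for K in (-8, -4, 0, 4, 8):     # conditions used in the experiments
    print(K, estimated_variance(u_v, K)[0])
```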

Analysis of Robot’s Behavior

To judge whether the robot’s behavior generated during the test phase was appropriate, the generated time series of joint angles was compared with the target (learned) time series. A simple way to compare two time series is to calculate the distance between the values at each corresponding pair of time steps within a certain time window. However, this method is not necessarily appropriate for comparing the general characteristics of time series, because a phase shift will increase the distance between the series; here, this would increase the distance even when the robot generates the appropriate action. Thus, this study considered histograms of time series values within a specified time window and compared the histogram of the time series generated in the test experiment with that of the target time series. Because a histogram of time series values can be considered a probability distribution, two time series can be compared by calculating the Kullback–Leibler (KL) divergence. Although the probability distribution lacks some information regarding temporal ordering, this comparative approach is suitable for our purpose because it extracts the general characteristics of a time series. By considering the amount of state change and calculating the KL divergence from the learned time series, the behaviors observed in the experiments could be classified into one of four types: outwardly normal, freezing (maintaining one posture), unlearned movement (engaging in an unlearned action), and inappropriate learned movement (engaging in a learned action other than the target action). These are explained in more detail below.

To assess the robot’s behavior in the experiment, the eight-dimensional time series of joint angles was reduced to a two-dimensional time series by applying principal component analysis. To extract the probability distribution of the two-dimensional time series, the two-dimensional space $[-N, N] \times [-N, N]$ (with $N$ the maximum of the absolute values of the time series $S(t) = \{z_1(t), z_2(t)\}$ across all data, where $z_1$ and $z_2$ represent the first and second principal components, respectively) was divided into $N_{\mathrm{bin}}^2$ subspaces (here $N_{\mathrm{bin}} = 20$). Then, the occurrence frequencies of states within the time series were counted. Based on the acquired probability distributions of the time series, the KL divergence between the probability distribution of the time series generated in the test experiment and that of the target (learned) time series was calculated. The robot’s behavior was judged as “outwardly normal” if the KL divergence was less than a threshold $\xi$, set here as half of the minimum KL divergence between pairs of learned time series:

(12)
$$ D_{KL}(p \,\|\, q) < \xi = 0.5 \min_{\substack{q_i, q_j \in U_{\hat{s}} \\ q_i \neq q_j}} D_{KL}(q_i \,\|\, q_j). $$
Here, $p$ is the probability distribution of the time series generated in the test experiment, $q$ is the probability distribution of the target movement, and $U_{\hat{s}}$ is the set of probability distributions of the learned movements.

Atypical behaviors were classified into one of three types according to whether the movement had almost stopped and whether it was close to a learned movement other than the target. We call these freezing (if $d < 0.02$ and $\forall q \in U_{\hat{s}},\, D_{KL}(p \,\|\, q) \geq \xi$), unlearned movement (if $d \geq 0.02$ and $\forall q \in U_{\hat{s}},\, D_{KL}(p \,\|\, q) \geq \xi$), and inappropriate learned movement (if $d \geq 0.02$ and $\exists q \in U_{\hat{s}},\, D_{KL}(p \,\|\, q) < \xi$). Here, $d$ is the amount of state change, defined as

(13)
$$ d = \frac{1}{T} \sum_{t=0}^{T} \sum_{i \in I_{O_{\mathrm{joint}}}} \left| y_{i,t+1} - y_{i,t} \right|. $$
Here, $T$ is the length of the time series, $I_{O_{\mathrm{joint}}}$ is the index set of the joint outputs, and $y_{i,t}$ is the output of the $i$th output neural unit at time step $t$.
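A minimal sketch of this classification procedure is given below, assuming Z is the PCA-reduced two-dimensional time series and y_joint the joint-angle outputs; the histogram smoothing constant and the explicit fallback label for threshold combinations not covered in the text are assumptions of the sketch.

```python
import numpy as np

def histogram2d(Z, N, n_bin=20, smooth=1e-12):
    """Probability distribution of a 2-D time series over [-N, N] x [-N, N]."""
    H, _, _ = np.histogram2d(Z[:, 0], Z[:, 1], bins=n_bin, range=[[-N, N], [-N, N]])
    P = H + smooth                                    # avoid zero bins in the KL divergence
    return P / P.sum()

def kl_divergence(p, q):
    """Kullback-Leibler divergence between two discretized distributions."""
    return float(np.sum(p * np.log(p / q)))

def threshold_xi(q_learned):
    """Eq. 12: half the minimum pairwise KL divergence between learned movements."""
    return 0.5 * min(kl_divergence(qi, qj)
                     for i, qi in enumerate(q_learned)
                     for j, qj in enumerate(q_learned) if i != j)

def state_change(y_joint):
    """Eq. 13: mean absolute change of the joint outputs per transition."""
    return float(np.mean(np.sum(np.abs(np.diff(y_joint, axis=0)), axis=1)))

def classify(p, q_target, q_learned, d, xi):
    """Assign one of the four behavior types using the thresholds in the text."""
    kl_target = kl_divergence(p, q_target)
    kl_all = [kl_divergence(p, q) for q in q_learned]
    if kl_target < xi:
        return "outwardly normal"
    if d < 0.02 and all(k >= xi for k in kl_all):
        return "freezing"
    if d >= 0.02 and all(k >= xi for k in kl_all):
        return "unlearned movement"
    if d >= 0.02 and any(k < xi for k in kl_all):
        return "inappropriate learned movement"
    return "unclassified"   # combination not covered by the thresholds in the text
```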

RESULTS

Open-Ended Ball Interaction

First, we observed the effects of increased or decreased sensory variance (inverse precision) on the robot’s behavior through an open-ended ball interaction in which situations (ball dynamics patterns) were changed unpredictably by the experimenter. To assess the robot’s behaviors, the generated time series of joint angles was quantitatively assessed every 100 time steps and classified into one of the four types of movements (see Methods). Figures 3 and 4 show representative examples of the robot’s behaviors under each condition. Figure 5 focuses on the network-level processes during the trials shown in Figures 3 and 4.

Figure 3.  

Generated time series data from interacting with the experimenter under normal conditions. The robot with a normal network (K = 0) successfully adapted to the changing situations (time steps 400–499, in red box) by flexibly switching its intention (PB state) in the direction of minimizing the increased negative log-likelihood. “Output joint” indicates predictions for four selected joint-angle dimensions. “Input vision,” “variance vision,” and “negative log-likelihood vision,” respectively, indicate the two-dimensional ball position and the corresponding estimated variance and precision-weighted prediction error. The negative log-likelihood at time step t is the value after the postdiction process inside the error regression window between time steps t − W + 1 and t. PB indicates the activation values of the two PB units. The generated time series of joint angles was quantitatively assessed every 100 time steps, as described in Methods.

Figure 4.  

Generated time series data from interacting with the experimenter under increased or decreased sensory variance conditions. A) Robot’s behavior under the increased sensory variance condition (K = 8). With increased sensory variance, the robot’s intention remained invariant throughout the interaction despite a situation change (time steps 400–499, in red box) due to highly reduced precision-weighted prediction error, leading to freezing behavior. B) Robot’s behavior under the decreased sensory variance condition (K = −8). With decreased sensory variance, the robot experienced huge precision-weighted prediction error signals, and its intention first changed quickly and then became fixed at a certain point, leading to an unlearned repetitive movement. Note that the ranges for negative log-likelihood shown in the graphs for the high-variance condition and the low-variance condition are different. The generated time series of joint angles was quantitatively assessed every 100 time steps, as described in Methods. Abnormal behavioral patterns, including freezing and inappropriate repetitive behavior, were observed under both increased and decreased sensory variance conditions, and these figures show representative examples.

Figure 5.  

Dynamics of internal PB states (upper figures) and error signals (bottom bar graphs) for each condition during the interactions shown in Figures 3 and 4. Colored dots in the upper figures represent PB dynamics during different periods of time (early: time steps 0–199; middle: time steps 200–499; late: time steps 500–699). Bottom bar graphs show the corresponding mean of the negative log-likelihood per time step during each time span. A) Flexible intention switching under normal condition during the interaction shown in Figure 3. During the situation change in the middle period, generated error signals caused intention switching, and error signals were successfully reduced during interaction in the new situation in the late period. B) Deficits in intention switching for high sensory variance during the interaction shown in Figure 4A. Even when the situation changed in the middle period, PB states were almost unchanged due to the underestimated precision of prediction error. C) Large shift of network behavior for low sensory variance during the interaction shown in Figure 4B. Internal PB states first dynamically fluctuated in the early period, but after the middle period, they became almost fixed at a certain value, although generated error signals were still very large.

Figure 3 shows a successful interaction between the experimenter and the robot with a normal network. The robot and the experimenter first performed a “right” interaction during time steps 0–399, then the experimenter externally changed the situation (ball dynamics pattern) to a “left” interaction during time steps 400–499 (red box in Figure 3). The unpredictable situation switch caused conflict between the robot’s intention (PB states) and the actual situation. However, the robot’s intention was soon updated in the direction of minimizing the increased negative log-likelihood (precision-weighted prediction error) (see also Figure 5A), and the robot generated behavior appropriate to the situation. This indicates that the robot with a normal network could flexibly recognize and adapt to changing environments.

On the other hand, we observed similar patterns of abnormal overt behaviors, such as freezing or inappropriate repetitive behavior by the robot, under conditions of both increased and decreased sensory variance. Figure 4A shows freezing behavior under the increased sensory variance condition. In this case, the robot first successfully performed a “right” interaction (time steps 0–399), but the robot almost stopped and maintained a single posture after the situation was switched to “attract” (time steps 500–699). Figure 4B shows an unlearned repetitive behavior under the decreased sensory variance condition. The robot’s action in this case was initially unstable (time steps 0–199) and then converged to an unlearned periodic movement (time steps 200–399), but the robot generated the appropriate movement after the situation was changed (time steps 500–699). These abnormal behaviors, such as freezing and inappropriate repetitive behavior, were observed in both the increased and decreased sensory variance conditions. The videos of the ball interactions and graphs for abnormal behaviors under increased and decreased sensory variance conditions are attached as supplementary information.

To distinguish between the mechanisms underlying the similar abnormal behaviors observed in the increased and decreased sensory variance conditions, an analysis was performed on the network-level processes and the level of precision-weighted prediction error the robot experienced (see Figures 5B and 5C). In Figure 5B (see also Figure 4A), increased sensory variance caused highly reduced precision-weighted prediction error and consequent invariability of the robot’s intention (PB states), regardless of the situation change during time steps 400–499 (red box in Figure 4A). This caused a mismatch between the robot’s intention and the situation, leading to freezing behavior. In Figure 5C (see also Figure 4B), which shows a decreased sensory variance condition, the internal PB states first quickly but incorrectly changed, possibly because the robot experienced huge precision-weighted prediction errors, which may have included errors associated with inherent noise of the ball dynamics. However, the speed of the changes slowed down when the absolute values of the internal PB states became large. After the repetitive quick state changes and a subsequent slowing down, the internal PB states were fixed at inappropriate values, even though the robot was still exposed to error signals as large as, or even larger than, those it experienced before the intentional states became fixed. The fixation of intention caused a mismatch between the robot’s intention and the situation, leading to unlearned repetitive behavior. The fixation of the PB states may be considered the result of settling into a suboptimal local solution (suboptimal critical point) of the prediction error minimization.

The abnormal behavioral patterns characterized by resistance to change, such as freezing and inappropriate repetitive behavior, may have appeared as a result of the network dynamics converging to fixed points when there was a discrepancy between the robot’s intention and the actual situation. In addition to the behavioral abnormalities, generating appropriate behavior in a restricted situation (time steps 0–399 in Figure 4A and 500–699 in Figure 4B) was a remarkable characteristic of the observed inflexible behaviors induced by aberrant sensory variance. Thus the difficulties of the robot should not be attributed to deficits in generating organized behaviors per se but to deficits in adaptability. This behavioral rigidity characterized by resistance to change may be considered to be analogous to the characteristics of autistic behavior.

Evaluation of Adaptability and Error Signal Level

To quantitatively evaluate the frequencies of the abnormal overt behaviors described in the previous section, an additional, simpler experiment was conducted. In this experiment, the situation set by the experimenter was not changed, but there was a discrepancy between the robot’s initial intention (PB states) and the situation. For example, the intention of the robot was first set to the value for the “left” behavior, but the experimenter rolled the ball to the right. To interact flexibly with the experimenter, the robot thus needed to switch its intention using the visual cue and generate the appropriate behavior. There were six combinations of initial PB states and ball dynamics: the initial PB states were “left” or “right,” and the experimenter used one of the three other patterns of ball dynamics. Two trials were performed for each combination.

We evaluated the robot’s behavior in the five conditions (K = −8, −4, 0, 4, 8) for 10 networks trained with differently randomized initial synaptic weights. Figure 6 shows the changes in the robot’s behavior and in the negative log-likelihood (precision-weighted prediction error) per time step associated with the levels of sensory variance. Behavioral traits observed during time steps 150–250 were assessed and divided into four overt behavioral patterns (“outwardly normal,” “freezing,” “unlearned movement,” and “inappropriate learned movement”), as described in Methods. Outwardly normal behavior basically means that the robot successfully switched its intention and generated appropriate behavior. However, it also includes behaviors for which the robot’s intention was fixed in an inappropriate state due to altered sensory variance but the robot nevertheless managed to generate appropriate behavior using only lower level network processes based on sensory inputs.

Figure 6.  

Changes in the robot’s behavior and negative log-likelihood associated with various levels of sensory variance. A) The occurrence rates of each behavioral trait over 120 trials for each variance level, determined by the parameter K, are shown. Behavioral traits observed during time steps 150 to 250 were assessed (see Methods). B) Negative log-likelihood per time step for each level of sensory variance is shown. Bars in the graph correspond to mean values over 120 trials for each parameter K. One-way repeated-measures ANOVAs indicated significant differences between the five conditions for the frequencies of the sum of the three abnormal behaviors, F (4, 36) = 51.0, p < 0.05, and levels of negative log-likelihood, F (4, 36) = 110.24, p < 0.05. Adjusting for multiple comparisons using the Holm–Bonferroni method, significant differences were found between the normal condition (K = 0) and the other unusual variance conditions (K = −8, −4, 4, 8) in the frequencies of abnormal behaviors, all p < 0.05. In addition, significant differences in levels of negative log-likelihood were found between all pairs, all p < 0.05.

One-way repeated-measures ANOVAs and post hoc multiple comparison adjustments using the Holm–Bonferroni method (Holm, ) were conducted for the frequencies of the abnormal overt behaviors and the levels of negative log-likelihood. The level of statistical significance was set at p < 0.05. The repeated-measures ANOVAs indicated significant differences among the five conditions in the frequencies of the sum of the three abnormal movements, F (4, 36) = 51.0, p < 0.05, and in the levels of negative log-likelihood, F (4, 36) = 110.24, p < 0.05. In addition, adjusting for multiple comparisons using the Holm–Bonferroni method indicated significant differences between the normal condition (K = 0) and the other unusual variance conditions (K = −8, −4, 4, 8) in the frequencies of abnormal movements (all p < 0.05). Significant differences in the levels of negative log-likelihood were indicated between all pairs (all p < 0.05). These results indicate that unusual sensory variance led to unusual levels of precision-weighted prediction error, which may have directly affected the perceptual processes of the robot, which rely on a prediction error minimization mechanism, thereby reducing behavioral performance.

DISCUSSION

In this study, we tested the hypothesis that aberrant sensory precision (inverse variance) causes behavioral rigidity, a core autistic behavior. In particular, using a prediction error minimization mechanism, we investigated the effects of increased and decreased sensory variance on adaptive behaviors. We conducted experiments based on a ball interaction between a humanoid robot and a human experimenter, in which the robot was required to recognize situations determined by the experimenter. Although the robot with the normal network flexibly recognized situation changes and generated appropriate interactive behaviors, both increased and decreased sensory variance (inverse precision) led to seemingly similar abnormal behaviors characterized by resistance to change, such as freezing and inappropriate repetitive behavior. However, the analysis aiming to discriminate between the mechanisms underlying these similar abnormal behaviors shows that the two unusual variance conditions differed substantially in the network-level processes underlying the symptoms and in the levels of precision-weighted prediction error signals the robot experienced. Specifically, increased sensory variance resulted in disregarding any error signals, leading to invariability of the intentional state, while decreased sensory variance caused an excessive response to error signals, leading to an incorrect intention change and its subsequent fixation.

Our results demonstrate that increased sensory precision (decreased sensory variance) can lead to the behavioral rigidity characteristic of ASD, supporting the system-level accounts that consider increased sensory precision to be the core cognitive trait of individuals with ASD (Lawson et al., ; Palmer et al., ; Van de Cruys et al., , ). In a theoretical study, abnormal behavioral patterns and resistance to change in individuals with ASD were proposed as strategies to provide a reassuring sense of predictive success in a world otherwise filled with error (Van de Cruys et al., ). This implies that precision-weighted prediction errors should be reduced to some extent while inflexible behavior is being generated. However, in our experiment, error signals could be even larger when the robot generated inflexible behavior than they were before the robot’s intention became fixed. The symptoms observed in the experiment might therefore be understood as consequences of a suboptimal solution of prediction error minimization rather than as consequences of successfully reducing the sense of prediction error. However, this difference might be explained by the simplicity of our experimental setting. For example, in the experiment, the visual input to the robot came only from an external cause (a ball); if visual inputs from internal causes, such as the movements of the robot’s own arms, were also considered, the robot might generate characteristic behaviors aiming to minimize the total error signal from the two causes by actively changing the internal causes of its visual inputs. This consideration of visual inputs from internal causes might also lead to different effects on the robot’s behaviors under the increased and decreased sensory precision conditions.

Recently, problems with the flexible adjustment of reactions to sensory states in response to volatile environments have been suggested to be associated with psychiatric disorders (Palmer et al., ; Powers et al., ). According to these previous studies, not only an unusual level of reactions but also an unusual context-sensitive adjustment of reactions, such as adaptation of the precision weighting of prediction errors, might explain psychiatric symptoms. In particular, a recent empirical study indicated that autistic perception may be associated with overestimated volatility of the sensory environment, with less distinction between reactions to unexpected and expected situations (Lawson, Mathys, & Rees, ). In the present study, sensory precision was persistently increased or decreased, meaning that the influences of its context-dependent adjustment on behavioral flexibility were not considered. Future studies of the effects of unusual adaptation of the precision weighting of prediction errors may facilitate understanding of the finer mechanisms underlying unusual reactions to volatile environments by people with ASD. In addition, investigations will be needed into the effects of aberrant sensory precision on learning and into how aberrant sensory precision itself may be generated through development and learning.

Our study extends attempts to understand the cognitive processes underlying autistic behavior by using computational models. As part of these attempts, Rosenberg et al. () conducted neural network simulations confirming that peculiarities of vision in ASD can be induced by altered divisive normalization. Another study associated poor motor skills in ASD with the poor goal-directed movements of a physical mobile robot induced by a deficit in temporal visuo-proprioceptive sensory integration (Barakova & Chonnaparamutt, ). We have confirmed that aberrant sensory precision can induce behavioral rigidity, utilizing a humanoid robot controlled by a recurrent neural network model. The behavioral abnormality was observed through a real-time human–robot interaction in which the robot was required to flexibly recognize changing environments. In such uncertain and unpredictable situations, reduced cognitive flexibility in individuals with ASD has been widely reported (Leekam et al., ; Poljac & Bekkering, ). Furthermore, this study demonstrated the generation of dysfunction in intentional control (i.e., executive dysfunction) caused by aberrant sensory precision, clarifying the direct relationship between distinct proposed cognitive abnormalities in ASD (Hill, ; Van de Cruys et al., ).

Our results provide the perspective that autistic behavior could be considered the result of a phenomenon generally observed in natural systems. Specifically, the process leading to the qualitative shift of network behavior in the decreased sensory variance (increased sensory precision) condition may be similar to critical transitions, which are abrupt behavioral shifts observed in natural dynamical systems, including the climate, ecosystems, and cellular signaling pathways (Lenton et al., ; May, ; Scheffer et al., ). Critical transitions are suggested to have characteristic early warning signals, such as the slowing down of changes in a system (critical slowing down) and back-and-forth switches between states in response to relatively large impacts (flickering), although they can also occur suddenly due to a large external impact on the system (Scheffer et al., , ). These characteristic phenomena were observed in the network behavior in the decreased sensory variance condition. This suggests that some types of behavioral rigidity and resistance to change might result from a critical transition in the hierarchical predictive control system attributable to excessive sensory prediction errors. This perspective may be relevant because pathophysiological experiments have demonstrated that the dynamical features of network behavior in epileptic seizures, which a relatively high proportion of individuals with ASD experience (Bolton et al., ), are very similar to the process of critical transition (Jiruska et al., ; Kramer et al., ).

Finally, the findings from this study also have implications for clinical studies aiming to classify the different types of inflexible behavior observed in ASD or to understand differences between the behavioral abnormalities observed in ASD and those in other psychiatric disorders, such as obsessive–compulsive disorder and schizophrenia. Our results show that seemingly similar inflexible behaviors can result from different network-level processes, and also that abnormalities in network-level processes may not necessarily lead to external alterations of behavior. This indicates that measurements and classifications of behavioral abnormalities based on external observation might be confounded and create difficulties in understanding their etiology, as broadly described in psychiatry (Redish & Gordon, ). However, our findings also indicate that the symptoms induced by increased and decreased sensory precision differed substantially in the levels of prediction error signals experienced while generating abnormal behaviors, suggesting that there might be differences in the internal experiences of individuals. Therefore, measurements and classifications of both the internal experiences of patients and the neural activities coding prediction error signals in the biological brain could help in understanding the heterogeneous behavioral rigidity in psychiatric disorders (Redish & Gordon, ). Future studies may be able to track parameters associated with those properties underlying disrupted adaptive behavior in animal models and humans and should compare the robot model with clinical case studies.

AUTHOR CONTRIBUTIONS

HI, SM, YC, YY, JT, and TO conceived the research topic, designed the experiment, and wrote the paper. HI performed the experiment and analyzed the data.

FUNDING INFORMATION

This work was supported in part by a MEXT Grant-in-Aid for Scientific Research on Innovative Areas, “Constructive Developmental Science” (24119003), JSPS Grant KAKENHI (25330301, 17K12754), and JST CREST (Grant Number: JPMJCR16E2, JPMJCR15E3), Japan.