The Journal of Neurophysiology Vol. 86 No. 3 September 2001, pp. 1113-1130
Copyright ©2001 by the American Physiological Society
Center for Neural Science, New York University, New York, New York 10003
ABSTRACT
Malone, B. J. and M. N. Semple. Effects of Auditory Stimulus Context on the Representation of Frequency in the Gerbil Inferior Colliculus. J. Neurophysiol. 86: 1113-1130, 2001. Prior studies of dynamic conditioning have focused on modulation of binaural localization cues, revealing that the responses of inferior colliculus (IC) neurons to particular values of interaural phase and level disparities depend critically on the context in which they occur. Here we show that monaural frequency transitions, which do not simulate azimuthal motion, also condition the responses of IC neurons. We characterized single-unit responses to two frequency transition stimuli: a glide stimulus comprising two tones linked by a linear frequency sweep (origin-sweep-target) and a step stimulus consisting of one tone followed immediately by another (origin-target). Using sets of glide and step stimuli converging on a common target, we constructed conditioned response functions (RFs) depicting the variability in the response to an identical stimulus as a function of the preceding origin frequency. For nearly all cells, the response to the target depended on the origin frequency, even for origins outside the excitatory frequency response area of the cell. Results from conditioned RFs based on long (2-4 s) and short (200 ms) duration step stimuli indicate that conditioning effects can be induced in the absence of the dynamic sweep, and by stimuli of relatively short duration. Because IC neurons are tuned to frequency, changes in the origin frequency often change the "effective" stimulus duty cycle. In many cases, the enhancement of the target response appeared related to the decrease in the "effective" stimulus duty cycle rather than to the prior presentation of a particular origin frequency. 
Although this implies that nonselective adaptive mechanisms are responsible for conditioned responses, slightly more than half of IC neurons in each paradigm responded significantly differently to targets following origins that elicited statistically indistinguishable responses. The prevailing influence of stimulus context when discharge history is controlled demonstrates that not all the mechanisms governing conditioning depend on the discharge history of the recorded neuron. Selective adaptation among the neuron's variously tuned afferents may help engender stimulus-specific conditioning. The demonstration that conditioning effects reflect sensitivity to spectral as well as spatial stimulus contrast has broad implications for the processing of a wide range of dynamic acoustic signals and sound sequences.
INTRODUCTION
Dynamic binaural stimuli have dominated the environments in which mammalian auditory systems evolved, and numerous studies (Ahissar et al. 1992; Poirier et al. 1997; Sanes et al. 1998; Semple 1999; Sovijärvi and Hyvärinen 1974; Spitzer and Semple 1991, 1993, 1998; Takahashi and Keller 1992; Wagner and Takahashi 1990) have demonstrated that auditory motion can dramatically influence the responses of auditory neurons. Although one might expect that virtual azimuthal-motion stimuli created by the modulation of interaural disparities of level (ILD) or phase (IPD) would elicit instantaneous discharge rates that reflect the absolute disparity at each point in the sweep, this has proven not to be true for the typical mammalian inferior colliculus (IC) neuron (Sanes et al. 1998; Semple 1999; Spitzer and Semple 1991, 1993, 1998). Instead, the response of the neuron at a particular value of ILD or IPD may be dramatically enhanced or suppressed depending on the context in which that value occurs. Thus the responses of auditory neurons appear to contain information about the trajectory, rather than simply the instantaneous position, of the stimulus in a space defined by the binaural parameters (e.g., ILD or IPD) to which they are sensitive.
Previous demonstrations of dynamic conditioning have been confined to stimuli that involve binaural cues for the localization of sound in the azimuthal plane. At present we do not know whether conditioning phenomena are specific to stimuli that appear to move in space or whether they reflect a general sensitivity to stimuli that change in time. We also do not know whether conditioning phenomena are specific to binaural rather than monaural auditory processing. Conditioned enhancement and suppression obtained with "virtual" motion stimuli may reflect a more fundamental sensitivity of IC neurons to signals that "move" along any of the response gradients that collectively define the neuron's response area. Determining whether conditioning phenomena are specific to auditory motion would clarify their functional role and constrain hypotheses concerning the mechanisms that generate them. A major goal of this paper is to investigate the generality of conditioning phenomena by explicitly testing the hypothesis that conditioning reflects a special sensitivity to auditory motion.
We measured the responses of IC neurons to monaural stimuli involving frequency transitions. Frequency modulation (FM) is a prominent feature of human speech, the behaviorally relevant vocalizations of multiple species, and numerous environmental sounds. Because pinnae-derived spectral filtering cues to sound source location are weak for narrowband stimuli, monaural FM stimuli would not be expected to produce salient percepts of azimuthal auditory motion. On the other hand, because the cochlea contains a place map along the basilar membrane, auditory FM may be more directly analogous to motion in other sensory modalities, where moving stimuli elicit activity that traverses the receptor surface (e.g., the retina in vision or the skin in somatosensation).
Prior experiments on dynamic conditioning and interpretations of those results have focused on the importance of the dynamic component of the stimulus. In the case of IPD, responses measured during the dynamic component were compared with statically derived tuning curves. It has been implicitly assumed that the dynamic portion of the stimulus was necessary for the induction of the physiological effects. In the present study, we attempt to refine our understanding of "stimulus context" by testing frequency step stimuli that do not include FM components.
If conditioning is, as our results suggest, a general property of auditory processing, then it is possible that conditioning itself is a reflection of a general property of neural processing: response adaptation. For example, it has been proposed that apparent motion sensitivity can be explained in terms of the history of a neuron's response to the stimulus rather than the history of the stimulus itself (McAlpine et al. 2000). In the present study, we explicitly test the hypothesis that conditioned responses to the target tone can be explained by the history of action potentials fired in response to the preceding origin tone.
METHODS
Surgical and recording procedures
Adult Mongolian gerbils (Meriones unguiculatus; n = 12) with clean external and middle ears were anesthetized with pentobarbital sodium (60 mg/kg ip). The pinnae were removed, and the cerebellum was exposed by a craniotomy of the interparietal bone. The animal was transferred to a double-walled sound attenuating chamber (Industrial Acoustics), and sound-delivery speculae were sealed to the openings of the external auditory meatuses. An electric heating blanket maintained a constant rectal temperature of 38°C, and the trachea was cannulated to prevent respiratory distress. Supplemental injections of ketamine hydrochloride (30 mg · kg⁻¹ · h⁻¹ im) maintained the animal in an areflexive state throughout the experiment. This anesthetic protocol allows for direct comparison with prior work on conditioning. The potential confounding effects of anesthesia have been addressed in a separate study in which conditioning elicited by an IPD stimulus was demonstrated in an unanesthetized primate (Malone and Semple 2000).
Single-unit activity was recorded with platinum-plated, glass-insulated tungsten microelectrodes (5- to 10-µm tip exposures; Ainsworth) advanced into the IC with a stepping motor microdrive (CalTech). Electrical signals from the brain were amplified (variable gain), filtered (typically from 0.25 to 10 kHz), and passed to an oscilloscope, audio speaker, and event timer (MALab, Kaiser Instruments). The occurrence of discriminated action potentials and stimulus zero-crossings was logged with a resolution of 1 µs. Event times were then retrieved from a FIFO buffer and stored by the host computer for analysis and display. Entry into the IC central nucleus was demarcated physiologically by an abrupt transition from the diffuse, broadly tuned, and habituating responses characteristic of the dorsal mantle to increased spontaneous background activity, robust and selective responses to pure tones, and the beginnings of a clear ascending tonotopic progression. All procedures associated with this report were reviewed and approved by the New York University Institutional Animal Care and Use Committee.
Stimulus generation and data acquisition
Stimulus waveforms were generated by digital synthesizers controlled by a microprocessor and custom hardware (MALab, Kaiser Instruments). Stimulus characteristics were specified in software running on the host computer, which communicated with the dedicated microprocessor via an IEEE-488 interface. Stimuli were generated and attenuated digitally and transduced by electrostatic earphones (STAX Lambda) in custom housings (Custom Sound Systems) fitted to ear inserts. Before each experiment, the sound pressure level (SPL) expressed in decibels (dB, re: 20 µPa) at each ear was calibrated under computer control for level and phase from 40 Hz to 40 kHz, using a previously calibrated probe tube and condenser microphone (Brüel and Kjær, 4134).
The frequency glide stimuli depicted in Fig. 1 begin with a steady-state period (1 s) at an initial frequency (origin). The frequency is then linearly modulated at 8 kHz/s to a new frequency (target) and presented for a variable interval determined by the depth (and thus the duration) of the sweep. Because auditory neurons are sensitive to the rate of FM (e.g., Felsheim and Ostwald 1996; Mendelson and Cynader 1985), a constant rate of modulation rather than a constant duration of modulation was employed for glides of varying depths. The overall duration of the typical stimulus used in these experiments was 4 s, with an interstimulus interval of 1 s separating repeated presentations (e.g., Fig. 1B). The rate of modulation (8 kHz/s) was chosen to allow for the testing of a sufficient range of depths at sweep durations roughly equivalent to those used in prior experiments with trapezoidally modulated ILD (Sanes et al. 1998) and IPD (Miko et al. 1999) stimuli. The general effectiveness of the particular choice of modulation rate was confirmed in early experiments with glides.
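The origin-sweep-target structure of the glide can be sketched as a piecewise function of time. This is an illustration only, not the authors' stimulus-generation code (which ran on dedicated synthesizer hardware). The function name `glide_frequency_khz` and its parameter names are hypothetical, and the defaults simply restate the values given in the text (1-s origin, 8 kHz/s sweep, 4-s total duration).

```python
def glide_frequency_khz(t_s, origin_khz, target_khz,
                        origin_dur_s=1.0, rate_khz_per_s=8.0, total_dur_s=4.0):
    """Instantaneous frequency (kHz) of an origin-sweep-target glide at time t_s.

    Hypothetical helper; the defaults restate values quoted in the text.
    """
    if not 0.0 <= t_s <= total_dur_s:
        raise ValueError("t_s lies outside the stimulus")
    # A constant modulation rate means the sweep duration scales with depth.
    sweep_dur_s = abs(target_khz - origin_khz) / rate_khz_per_s
    if t_s <= origin_dur_s:
        return origin_khz                      # steady-state origin tone
    if t_s >= origin_dur_s + sweep_dur_s:
        return target_khz                      # steady-state target tone
    direction = 1.0 if target_khz > origin_khz else -1.0
    return origin_khz + direction * rate_khz_per_s * (t_s - origin_dur_s)
```

For a 2-kHz upward glide from 2 to 4 kHz, the sketch returns 3 kHz halfway through the 0.25-s sweep and 4 kHz from 1.25 s onward.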
To determine whether the dynamic (sweep) component of the glide was necessary for eliciting conditioning in our sample, a frequency step stimulus typically composed of two 2-s steady-state tones was also employed. In a few cases, longer-duration glide and step stimuli were employed to characterize the kinetics of conditioned responses more fully. The relatively long duration of our typical stimuli is comparable to stimulus durations in prior studies of conditioning (Sanes et al. 1998; Thornton et al. 1999). To facilitate comparison of our results with those of studies involving much shorter tone pips (i.e., durations of milliseconds, rather than seconds), we also employed a "quickstep" stimulus composed of two 200-ms tone pips and an interstimulus interval of 100 ms. Glide and step stimuli were typically repeated five times, and quickstep stimuli were repeated 10 times. To reduce spectral splatter, each of the tones in the step paradigms was shaped with a cosine-squared ramp (10 ms) at onset and offset. The glide stimulus, which is continuous, was gated with a similar ramp at the onset of the origin tone and the offset of the target tone.
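A cosine-squared gate of the kind described above can be sketched as an amplitude envelope. This is a minimal illustration under stated assumptions: the function name `cos2_gate` is hypothetical, the ramps are assumed to fall inside the stated tone duration, and the tone is assumed to last at least twice the ramp time. The source does not specify these details.

```python
import math

def cos2_gate(t_s, tone_dur_s, ramp_s=0.010):
    """Amplitude envelope (0..1) with cosine-squared onset and offset ramps.

    Assumes tone_dur_s >= 2 * ramp_s and that the ramps lie within the tone.
    """
    if t_s < 0.0 or t_s > tone_dur_s:
        return 0.0                                   # outside the tone
    if t_s < ramp_s:                                 # onset ramp: rises 0 -> 1
        return math.cos(0.5 * math.pi * (1.0 - t_s / ramp_s)) ** 2
    if t_s > tone_dur_s - ramp_s:                    # offset ramp: falls 1 -> 0
        return math.cos(0.5 * math.pi * (1.0 - (tone_dur_s - t_s) / ramp_s)) ** 2
    return 1.0                                       # steady-state plateau
```

Multiplying a tone waveform by this envelope sample by sample removes the abrupt onset and offset that would otherwise spread energy across frequencies.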
Because the glide stimuli in these experiments involved FM, it is important to consider the calibration procedure carefully. A concern arises when the maximum output at the origin and the target frequencies are different because the attenuation for the entire glide stimulus is set by the choice of origin frequency. To address this concern, a five-band parametric equalizer (Symetrix 551e) was used to equalize the magnitude transfer function of the speakers. The frequency range of glide stimuli was confined to the most effectively equalized portion of the calibration curve in each experiment (generally 0.5-10 kHz), and the selection of data points was guided by the calibration curve during the experiment itself. Additionally, all origin-target pairs where a discrepancy in attenuator values greater than 3 dB predicted the observed change in discharge were eliminated from quantitative analysis. Results from earlier experiments where adequately controlled calibration curves were unavailable were also eliminated from detailed quantitative analysis. Results obtained in the step paradigms, where the attenuator values are set independently for each tone, are not impacted by the foregoing concerns and served as an additional check on the authenticity of conditioning effects obtained with glides in the same cells.
In addition to the conditioning stimuli described in the preceding text, the best frequency (BF), best SPL, and minimum latency for each cell were determined on the basis of responses to 200-ms tone pips. This initial characterization aided the search for parameter combinations (primarily origin and target frequency and SPL) that elicited robust conditioning effects. To gauge the magnitude of conditioning in each cell, we sought parameter values that maximized the effects obtained for that cell. The dimensionality of the search space is high and includes axes for each steady state duration, target and origin frequencies, SPL, and modulation rate (glides) or temporal delay between tones (steps). Previous studies have examined in detail the effects of varying temporal delay for forward masking and forward facilitation (Boettcher et al. 1990; Brosch et al. 1999; Finlayson 1999; Finlayson and Adam 1997; Harris and Dallos 1979; Shore 1995; Smith 1977, 1979) and the rate of FM (Felsheim and Ostwald 1996; Heil et al. 1992a,b; Mendelson and Cynader 1985; Mendelson and Grasse 1992; Mendelson et al. 1993; Møller 1969, 1971; Nelken and Versnel 2000; Poon and Chen 1992; Poon et al. 1991; Rees and Møller 1983; Ricketts et al. 1998; Sinex and Geisler 1981) throughout the auditory pathway. By contrast, we focused on manipulating origin and target frequencies and SPLs and constructed conditioned response functions (RFs), which depict the variability in the response to an identical stimulus (the target) as a function of the preceding origin frequency (see Fig. 4). Typically, the BF of the cell was chosen as the target, and the SPL was chosen to be slightly lower (10-15 dB) than the level that elicited the strongest response to 200-ms tone pips (best SPL). The origin frequency was then varied in 0.5- to 1-kHz steps that spanned the range of frequencies where the calibration concerns described above were negligible for glides. Conditioned RFs based on the step paradigms cover a range of origins similar but not identical to those used for glides. This reflects the fact that the choices of origin-target pairs are not constrained by the concerns about calibration described in the preceding text. The quickstep conditioning stimuli match those of the step paradigm exactly except that the stimulus duration was an order of magnitude shorter.
Data analysis and statistical verification of results
Conditioning is operationally defined as a deviation from the firing rate associated with a particular stimulus due to a change in stimulus context. Thus the analysis of conditioning effects requires a reference firing rate. Sanes et al. (1998) compared the firing rate immediately subsequent to ILD modulation with that obtained for the equivalent ILD presented statically, at an equivalent point in time relative to the stimulus onset. For example, for a 2-kHz depth glide stimulus, the firing rate at the onset of the target tone at 1.25 s (1 s at the origin, plus the 0.25-s sweep) would be compared with the firing rate 1.25 s into the static control stimulus. We refer to this as the control-at-target reference (Fig. 1). It could be argued that stimuli whose origin frequencies lie well outside the response area of the cell have "effective" onset times defined by the point at which the frequency sweep enters the response area, rather than the onset of the origin tone. Accordingly, the response to the target tone was also compared with the response beginning at the onset of the control stimulus, which we refer to as the control-at-origin reference (Fig. 1).
Following a thorough search of the glide stimulus space for each cell, the best potential instance of a conditioning effect of each type (enhancement, n = 31; suppression, n = 11) was identified for use in the analysis of the time course and general prevalence of conditioning effects in the IC. The spike counts in each 50-ms bin of the histogram of the response to the target tone of the glide stimulus were compared with the spike counts of the corresponding bins for the unmodulated control stimulus (Wilcoxon signed ranks). The duration of the stimulus was constant (e.g., 4 s), but because the duration of the sweep component varied with the depth of modulation, the duration of the control-at-target reference decreased as the depth of modulation increased, typically ranging from 2.25 to 2.9375 s.
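The timing arithmetic behind the control-at-target reference can be checked with a short sketch. The function name `glide_timing` is hypothetical, and the defaults restate the typical values quoted in the text (1-s origin, 8 kHz/s, 4-s stimulus); cells tested with other durations would need different arguments.

```python
def glide_timing(depth_khz, origin_dur_s=1.0, rate_khz_per_s=8.0, total_dur_s=4.0):
    """Return (sweep duration, target onset, target duration) in seconds.

    Hypothetical helper; the target duration is also the length of the
    control-at-target comparison interval for a stimulus of this depth.
    """
    sweep_dur_s = depth_khz / rate_khz_per_s        # constant-rate sweep
    target_onset_s = origin_dur_s + sweep_dur_s     # e.g. 1 s + 0.25 s = 1.25 s
    target_dur_s = total_dur_s - target_onset_s     # remainder of the stimulus
    return sweep_dur_s, target_onset_s, target_dur_s
```

Under these assumptions, the 2-kHz example in the text gives a target onset of 1.25 s, and the quoted comparison intervals of 2.9375 and 2.25 s would correspond to modulation depths of 0.5 and 6 kHz, respectively.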
Statistical evaluation of the conditioned RFs, each of which involves multiple comparisons against the reference value, was performed differently to avoid compromising the statistical criterion. Dunnett's test (which is analogous to a t-test corrected for multiple comparisons against a single control value) was used to compare firing rates averaged from 0 to 500 and 500 to 1,000 ms relative to target onset to control responses during the equivalent interval. Responses to the two-tone stimuli were averaged and analyzed similarly with respect to the onset of the second "probe" tone, except that the responses to the quickstep stimuli were averaged in 100- rather than 500-ms blocks. Each control stimulus was generally presented thrice in the sequence of origin-target pairs (beginning, middle, and end), providing an estimate of the variance of the control value maximally sensitive to changes in overall responsiveness during data collection. This procedure makes our analysis conservative with respect to the statistical verification of conditioning effects. The data available for most cells permitted the construction of multiple conditioned RFs for varying SPLs or target frequencies. For each cell, a single conditioned RF containing the most robust and representative conditioning effect of each type (enhancement and/or suppression) was selected for statistical analysis.
RESULTS
The primary goal of this study was to provide strong evidence for the existence of frequency-based conditioning in the responses of comprehensively examined single units in the IC. The quality of evidence for enhancement or suppression of a neural response relative to some reference value is inherently limited by the trial-to-trial variability of the responses themselves. For this reason, we confined quantitative analysis of conditioned responses to 41 cells whose responses remained very stable throughout the relatively long recording times these experiments required. Cells in the larger sample whose responses could not unambiguously be attributed to conditioning effects, rather than changes in responsiveness or concerns related to calibration (see METHODS), were excluded. Because nearly all cells showed evidence of conditioning (both those included and excluded), the restriction of our sample did not compromise our estimate of the incidence of the conditioning effects.
The BFs of this population ranged from 1.4 to 10 kHz, with a median BF of 4 kHz. The predominance of low-BF cells reflects a deliberate restriction of the sample (see METHODS). The distribution of best levels at the BF was bimodal, with a primary peak (the mode) at 50 dB and a secondary peak at 80 dB, which, with few exceptions, was the maximum SPL tested. Nonmonotonicity was clearly common in the population: about half (21/41) of the sample had best SPLs below 50 dB. The SPL at which the largest conditioning effect was obtained (for cells tested with both glides and steps, the glide value was used) was typically lower (mean 11 dB) than the best SPL. The best conditioning SPL was less than the best SPL in 75% of cases. Minimum latencies in the sample spanned a broad range, from 6 to 86 ms, with a median of 12 ms. Minimum latency exceeded 25 ms in 25% of cases. Conditioned enhancement and suppression were each observed for neurons spanning the range in latency. The likelihood of observing a particular category of response (enhancement, suppression, or no significant effect) was not statistically related to any of these simple characterizations of the cells in the population (Kruskal-Wallis, P > 0.05).
Prevalence of conditioned responses
The stimulus configuration that elicited the most robust conditioned enhancement and/or suppression from each cell was identified during recording. Of cells tested comprehensively with glides (n = 31), all exhibited significant conditioned enhancement or suppression for at least one identified stimulus configuration. Examples of conditioned enhancement and suppression are shown in Fig. 1, A and B, respectively. Relative to the control-at-target reference, 28 cells showed enhancement, 8 showed suppression, and 5 exhibited both effects (P < 0.01). Most (29/36) instances of conditioning were significant by a substantially more stringent statistical criterion (P < 0.0001). Relative to the control-at-origin reference, 19 cells showed enhancement, 9 suppression, and 5 both effects.
The glide stimulus was chosen to allow for a fairly direct comparison with responses to a periodic trapezoidally modulated ILD stimulus (Sanes et al. 1998). An example of responses to a periodic trapezoidal FM stimulus is shown in Fig. 2B. The half-trapezoidal stimuli presented in this study were chosen to simplify the analysis of data and interpretation of results. We confirmed in a few cells that responses were consistent across these two stimulus paradigms as is evident by comparing the responses to the half-trapezoidal stimulus, shown in Fig. 2A, to the responses to the periodic FM stimulus wrapped by the modulation period in Fig. 2C.
Time course and magnitude of conditioned responses
One aim of this analysis was to determine whether the time course of conditioning obtained with monaural frequency sweeps was similar to that observed when modulating ILD (Sanes et al. 1998). To allow for direct comparison with effects obtained with ILD modulation, splines were fitted to the binned spike counts of the difference (or the negative of the difference in the case of suppression) of the conditioned and control responses throughout the duration of the comparison interval. The time corresponding to the intersection of the fit with zero was considered to mark the end of the effect. The typical length of the entire glide stimulus was 4 s, and most (23/34; 2 cases were excluded because the trial length was insufficiently long for this analysis) conditioned responses persisted throughout the duration of the comparison interval, which lasted slightly more than 2.5 s on average. The difference between the conditioned and control response did reach zero in the remaining 11 cases, with a mean effect duration of 1.64 s. The fact that most effects in our sample have a duration that exceeds 2.5 s when calculated by the same procedure utilized by Sanes et al. (1998) confirms that conditioning effects obtained in both paradigms persist for a number of seconds after the modulation has ceased.
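The end-of-effect criterion (the first intersection of the fitted difference with zero) can be illustrated with a simplified sketch. Note the substitution: the authors fitted splines, whereas this example uses linear interpolation between bin centres, so it is a stand-in for the idea rather than a reproduction of their procedure. The function name `effect_end_time` is hypothetical; the 50-ms bin width follows the histogram binning described in METHODS.

```python
def effect_end_time(diff_counts, bin_s=0.050):
    """Time (s from target onset) at which the binned difference first reaches zero.

    diff_counts: conditioned minus control spike counts per bin (negate the
    values beforehand when analysing suppression). Returns None if the
    difference stays positive throughout the comparison interval.
    Linear interpolation here stands in for the spline fit used in the paper.
    """
    centres = [(i + 0.5) * bin_s for i in range(len(diff_counts))]
    for i, d in enumerate(diff_counts):
        if d <= 0:
            if i == 0:
                return centres[0]
            prev = diff_counts[i - 1]          # strictly positive at this point
            frac = prev / (prev - d)           # fraction of the way to the crossing
            return centres[i - 1] + frac * bin_s
    return None
```

A return value of `None` corresponds to the majority case reported above, in which the conditioned response persisted for the whole comparison interval.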
To facilitate comparisons to the broader literature, single exponentials were fit to the binned differences in spike counts for conditioned responses. We also fit the temporal profile of the control response at the onset of the origin tone with a single exponential that included an additional parameter (the "pedestal") for the steady-state firing rate. In 14 cells, the time courses of the response at the onset of the control stimulus and the decay of the conditioned effect magnitude could both be well fit by decaying exponential functions with median time constants of 257 and 864 ms, respectively. The time constants obtained from each of these fits in each cell were not significantly correlated (Spearman-rho, P > 0.05).
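The two fitted forms described above can be written compactly as follows. The symbols are assumptions introduced here for illustration and are not the authors' notation: $\Delta(t)$ is the binned difference between conditioned and control responses at time $t$ after target onset, $r(t)$ is the control response at time $t$ after stimulus onset, $A_c$ and $A_o$ are amplitudes, $\tau_c$ and $\tau_o$ are time constants, and $P$ is the pedestal (steady-state firing rate).

```latex
\Delta(t) = A_{c}\, e^{-t/\tau_{c}},
\qquad
r(t) = P + A_{o}\, e^{-t/\tau_{o}}
```

In this notation, the medians reported in the text are $\tau_o = 257$ ms for the control onset response and $\tau_c = 864$ ms for the decay of the conditioned effect. A buildup response corresponds to $A_o < 0$, so that $r(t)$ rises toward $P$.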
The temporal profile of the response to the control tone at onset could only rarely be fit in those cells whose responses showed significant suppression for glides. Of the eight cells exhibiting conditioned suppression (6 of these also exhibited conditioned enhancement for a different glide stimulus), five had buildup responses (the fit converged on a rising exponential) to the control. The remaining three had nearly constant firing rates that were not well described by a decaying exponential. By contrast, most cases of enhancement (17/28) were associated with exponentially decaying responses to the onset of the control stimulus. This relationship between conditioning effect class (enhancement or suppression) and response profile (decaying, buildup, poorly fit) was significant (χ², P = 0.0008). In the instances where the magnitude of conditioned suppression over time could be fit (5/8), the time constants were qualitatively quite similar to those obtained for conditioned enhancement.
Figure 3, B and D, shows the typical temporal evolution of conditioned enhancement and suppression respectively. Firing rates were normalized by taking the ratio of the response in each 50-ms bin to the average response for the duration of the control-at-target reference. These ratios were then averaged over all cells, and splines fit to these ratios were plotted. The control responses, in gray, were calculated and plotted similarly. All the responses that were analyzed quantitatively are included, regardless of significance. In the first second subsequent to the sweep, enhanced responses typically decay from 3.8 to 1.6 times the response to the control average, whereas suppressed responses rise from 0.32 to 0.59 times that average.
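The normalization and population averaging described above can be sketched in a few lines. The function names are hypothetical, and truncating to the shortest response when averaging across cells is an assumption made here to handle comparison intervals of unequal length; the source does not say how this was handled. The spline smoothing applied before plotting is omitted.

```python
def normalize_to_reference(binned_rates, reference_rates):
    """Express each 50-ms bin as a ratio to the mean control-at-target response."""
    ref_mean = sum(reference_rates) / len(reference_rates)
    if ref_mean == 0:
        raise ValueError("reference response is zero; ratio undefined")
    return [r / ref_mean for r in binned_rates]

def average_across_cells(ratio_lists):
    """Bin-by-bin mean of normalized responses over cells.

    Assumption: lists are truncated to the shortest one so every bin
    averages over the same set of cells.
    """
    n_bins = min(len(r) for r in ratio_lists)
    return [sum(r[i] for r in ratio_lists) / len(ratio_lists) for i in range(n_bins)]
```

A ratio of 1 means the conditioned response equals the control average, so values such as 3.8 and 0.32 read directly as fold enhancement and fractional suppression.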
Figure 3, A and C, show the average response that preceded the conditioned and control responses shown in B and D. The gray curves in A and C, representing the responses during the first second of the control tone, indicate that firing rates decay substantially during the presentation of the unmodulated control stimulus for cells exhibiting enhancement (A), but not for cells exhibiting suppression (C). This is consistent with the fact that the latter cases were poorly fit with decaying exponentials as described in the preceding text. The curves in black (A and C) representing the average response to the origin tone that preceded the common target tone indicate that conditioned enhancement tends to follow a relatively weak response to the origin tone, but conditioned suppression occurs after a stronger response. The medians (dashed lines) of response history for conditioned enhancement (A) and suppression (C), however, are both zero throughout the presentation of the control tone because most origin tones did not elicit action potentials.
Instances of conditioned suppression can be divided neatly into two categories on the basis of discharge history. Cases where the response to the origin tone exceeds the response to the target tone because the glide stimulus modulates away from the cell's BF were less common (4/11) because we generally chose the BF of the cell as the target. The second, somewhat larger (7/11) category includes cases of conditioned suppression in the absence of a significant response to the origin tone.
Effects of varying the origin frequency
For glide stimuli that modulate to a common target, it is possible to construct a conditioned RF that represents the response to an identical stimulus (the target) as a function of the frequency of the preceding origin tone (Fig. 4). Responses averaged over 500-ms intervals following the onset of the origin and target tones were used to construct frequency and conditioned RFs, respectively. The frequency RF is both a measurement of the frequency tuning of the cell and a record of the history of the cell's response prior to the presentation of the target tone. Examples of such functions for two cells exhibiting conditioned enhancement and suppression are shown in Figs. 5 and 6. The dashed lines indicate the response to the target predicted by the control-at-origin (gray) and control-at-target (black) references. The distance between them indexes the adaptation (or buildup) of the response to the unmodulated control. If IC neurons were insensitive to the context in which the target tone appeared, then the conditioned RFs in black should be flat lines (the d's in Fig. 4), subject to the intrinsic variability of the measured response. It is clear from Figs. 5 and 6 that the presentation of the origin tone and origin-to-target sweep can profoundly affect the response to the target even when the origin tone lies well outside the excitatory frequency response area.
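The construction of paired frequency and conditioned RFs can be sketched as follows. This is a minimal illustration, not the authors' analysis code. The data layout (a dictionary mapping each origin frequency to a target onset time and a list of per-trial spike-time lists) and the function names are assumptions made for the example; the 500-ms averaging window is taken from the text.

```python
def mean_rate(trials, start_s, window_s=0.5):
    """Mean firing rate (spikes/s) in [start_s, start_s + window_s) across trials."""
    n_spikes = sum(1 for spikes in trials for t in spikes
                   if start_s <= t < start_s + window_s)
    return n_spikes / (window_s * len(trials))

def conditioned_rf(responses, window_s=0.5):
    """Map origin frequency -> (frequency-RF value, conditioned-RF value).

    responses: {origin_khz: (target_onset_s, trials)}, where trials is a list
    of spike-time lists (seconds from stimulus onset). Hypothetical layout.
    """
    rf = {}
    for origin_khz, (target_onset_s, trials) in sorted(responses.items()):
        origin_rate = mean_rate(trials, 0.0, window_s)             # frequency RF
        target_rate = mean_rate(trials, target_onset_s, window_s)  # conditioned RF
        rf[origin_khz] = (origin_rate, target_rate)
    return rf
```

If the neuron were insensitive to context, the second value in each pair would be the same for every origin frequency, apart from trial-to-trial variability.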
It is evident from Figs. 1-3 that conditioned responses converge on the steady-state response associated with a particular choice of target. Figures 7 and 8 illustrate how entire conditioned RFs converge on the steady-state response predicted by the control-at-target reference for enhancement and suppression respectively. Each curve is based on an average of the responses in a 750-ms window, and each successive curve is based on an interval beginning 250 ms later. The histograms show the time course of the response to the target preceded by the origin frequency indicated by the asterisk on each figure. The bars under the stimulus icon in each histogram indicate the intervals where responses were averaged to create the series of conditioned RFs below. Although neither curve flattens completely in the time shown, it is nevertheless clear that the conditioning effects that create the variability in the conditioned RFs diminish with time. Consideration of the curve in Fig. 8 reveals that conditioned suppression for an origin frequency proximal to the peak of the frequency RF (1.5 kHz) appears to decay less rapidly than does suppression of the response induced by the presentation of the 8-kHz origin tone. Variation in the kinetics of conditioned responses elicited by different stimuli, but delivered to the same cell, suggests that the mechanisms responsible for conditioning phenomena are either diverse or operate differently at different synapses of the recorded neuron.
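The overlapping averaging intervals behind the successive curves (750-ms windows, each starting 250 ms after the previous one) can be generated with a short helper. The function name `sliding_windows` is hypothetical, and starting the first window at target onset (`start_s=0.0`) is an assumption; the source gives only the window width and step.

```python
def sliding_windows(n_windows, width_s=0.750, step_s=0.250, start_s=0.0):
    """Return (start, end) times in seconds, relative to target onset,
    for a series of overlapping averaging windows."""
    return [(start_s + i * step_s, start_s + i * step_s + width_s)
            for i in range(n_windows)]
```

Because consecutive windows share 500 ms of data, adjacent conditioned RFs in such a series are not independent estimates, which is worth keeping in mind when judging how quickly the curves flatten.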
Is continuous FM required to elicit "dynamic" conditioning?
The term "dynamic conditioning" (Sanes et al. 1998) implies that a dynamic stimulus component is necessary to elicit the phenomenon. By presenting sweep-target stimuli that omitted the origin, we obtained anecdotal evidence that the sweep alone was sufficient to elicit conditioned responses. Elimination of the sweep, on the other hand, allows us to test directly the hypothesis that the dynamic component of the stimulus is necessary for conditioning to occur. Using sets of tone pairs ("probe" and "masker" tones with no delay) and varying the frequency of the origin tone, we created conditioned RFs for step stimuli analogous to those described previously for glides. Examples of conditioned RFs exhibiting conditioned enhancement are shown in Figs. 9 and 10. In Fig. 9, the response to the 5-kHz target is suppressed by the presentation of the 6-kHz origin tone (Fig. 9C) which, although proximal to the BF, does not by itself elicit a response. In Fig. 10, conditioned suppression relative to the control-at-origin reference (gray dashed line) occurs for origin frequencies near the 7-kHz target, but not for origin frequencies at or just below the 4-kHz BF (Fig. 10B). Clearly the elimination of the modulated component did not eliminate evidence of conditioning. Moreover the conditioned RFs for these cells are not simply the inverse of the frequency RFs.
We examined 28 conditioned RFs based on glides and 28 conditioned RFs based on steps for evidence of significantly conditioned responses. Because these stimuli are much longer than those typically used in studies of forward masking and sequence selectivity (Brosch et al. 1999; Calford and Semple 1995), we collected an additional 14 conditioned RFs based on 200-ms tone pips and interstimulus intervals of 100 ms. Conditioned RFs based on these "quickstep" stimuli were exactly matched in frequency and SPL to step conditioned RFs obtained in the same cells and test the hypothesis that conditioning effects require long-duration tones for their induction. Significance of conditioned responses was assessed for each choice of reference (control-at-target and control-at-origin) and for each block of data (firing rates for quicksteps were averaged over 100- rather than 500-ms comparison intervals). Results of this analysis appear in Table 1. A cell was considered to exhibit conditioning if at least one response was significantly enhanced or suppressed relative to the reference. The sum of enhanced points and suppressed points can exceed the number of conditioned RFs if there are cells that exhibit both enhancement and suppression in the same conditioned RF.
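A hypothetical tally shows why the enhanced and suppressed counts need not sum to the number of conditioned RFs. The sketch assumes each point of a conditioned RF has already been labeled by the significance test. The labels, the function name `tally_conditioning`, and the example data are invented for illustration and are not values from Table 1.

```python
def tally_conditioning(conditioned_rfs):
    """Count conditioned RFs showing any conditioning, enhancement, or suppression.

    Each conditioned RF is a list of per-point labels: "enhanced",
    "suppressed", or "none" (hypothetical label names).
    """
    counts = {"conditioned": 0, "enhanced": 0, "suppressed": 0}
    for labels in conditioned_rfs:
        has_enhanced = "enhanced" in labels
        has_suppressed = "suppressed" in labels
        counts["enhanced"] += int(has_enhanced)
        counts["suppressed"] += int(has_suppressed)
        counts["conditioned"] += int(has_enhanced or has_suppressed)
    return counts


# Three invented conditioned RFs; the second shows both effect types.
example_rfs = [
    ["none", "enhanced", "none"],
    ["suppressed", "none", "enhanced"],
    ["none", "none", "none"],
]
example_tally = tally_conditioning(example_rfs)
```

In this made-up case, two RFs count as conditioned, yet the enhanced and suppressed tallies sum to three because one RF contributes to both.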
Averaged across the three stimulus paradigms, 93% of conditioned RFs included at least one significantly conditioned response relative to the control-at-target reference for the first comparison interval. Most (73%) of these effects persisted into the second comparison interval. Using the alternative control-at-origin reference, 70% of conditioned RFs showed evidence of conditioning during the first comparison interval.
It should also be stressed that although we explicitly searched for stimuli that elicited robust conditioning effects, several different stimuli were effective in eliciting conditioned responses in most cells. Although the total number of curves was the same (28), many more points were tested in the step paradigm, and a higher proportion (58 vs. 36%) showed enhancement. Although differences in the duration of the origin tone between the glide (1 s) and step (2 s) paradigms could explain the discrepancy in effect prevalence, quickstep stimuli also readily and robustly conditioned the responses of IC neurons with origin tones lasting only 200 ms. Since the modulated component was evidently not necessary to produce these effects, the frequency sweep probably diminishes the likelihood of detecting conditioned responses during the steady states because the contextual effects exerted by the origin tone wane during the course of the sweep. The similarity of effect prevalence across time scale indicates that the induction of conditioned responses is relatively rapid and occurs on a time scale of a few hundred milliseconds as well as seconds. The observation that conditioning effects appear to decay more rapidly in the quickstep than step and glide paradigms suggests that the duration of the eliciting stimulus affects the duration of the conditioned effect.
Can the conditioned RFs be explained by response history?
No particular origin or target frequencies or modulation depths (expressed in either kilohertz or octaves) were systematically related to either the incidence or magnitude of conditioned effects in our sample. Although the structure of the conditioned RFs was not conserved across cells, conditioned RFs were consistent within cells, across both stimulus type and time scale. For all cells tested with matched origin-target pairs across different stimulus paradigms, we computed the ratio of each point in the conditioned RF to the control-at-target/-origin reference. The correlations of these ratios were significant across both the stimulus type (r = 0.329/0.208; P < 0.0001) and time scale (r = 0.149/0.383; P < 0.0001).
While the foregoing analysis can establish that the structure of the conditioned RF is anchored to the response area of each cell, it cannot specify whether the conditioned responses reflect the particular origin stimulus that preceded the target, or the response to that stimulus. For a neuron very sharply tuned to 4 kHz, the "effective" onset of a glide stimulus with an origin of 8 kHz might not occur until very near the end of the sweep to the 4-kHz target, when the cell begins to respond. For a 4-s stimulus followed by a 1-s interstimulus interval, the duty cycle of the control stimulus over repeated presentations would be 4/5. For the remote-origin glide stimulus, however, the "effective" duty cycle of the stimulus (the fraction of time that the stimulus spends at frequencies that drive the cell to fire action potentials) would be ~2/5. From the standpoint of response adaptation, the cell would be recovering from adaptation not only during the interstimulus interval but also during the presentation of the origin tone and much of the sweep. In addition to recovering from adaptation of excitation during the presentation of the origin stimulus, cells with inhibitory sidebands may also be actively inhibited. For the purposes of our analysis here, however, inhibitory events and their consequences are treated as aspects of stimulus history because by the history of the cell's response we mean only what we can measure extracellularly (i.e., the discharge history).
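The duty-cycle arithmetic in this example can be written out as a short Python sketch. The function name `effective_duty_cycle` and its argument names are hypothetical, and the 2 s of "effective" stimulation assigned to the remote-origin glide is an assumption chosen to reproduce the approximate value of 2/5 given in the text.

```python
from fractions import Fraction


def effective_duty_cycle(effective_s, stimulus_s, isi_s):
    """Fraction of each presentation period in which the stimulus drives the cell.

    effective_s: whole seconds spent at frequencies that excite the cell
    stimulus_s:  total stimulus duration in whole seconds
    isi_s:       interstimulus interval in whole seconds
    """
    if effective_s > stimulus_s:
        raise ValueError("effective time cannot exceed stimulus duration")
    return Fraction(effective_s, stimulus_s + isi_s)


# Control tone: effective for the full 4 s of a 5-s presentation period.
control_cycle = effective_duty_cycle(4, 4, 1)
# Remote-origin glide: assumed to drive the cell for only about 2 of the 4 s.
remote_glide_cycle = effective_duty_cycle(2, 4, 1)
```

Exact fractions are used here only so that the values can be compared directly with the ratios quoted in the text.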
Our results indicate that enhanced responses typically exceed the response even at the onset of the control tone (i.e., the control-at-origin reference). Figures 11 and 12 show two examples of conditioned responses (Fig. 11, B and C, and Fig. 12B) that clearly exceed the responses at the onset of the control stimuli (Figs. 11A and 12A) for very-long-duration tones. In both instances, the delay between the tones was varied, and even a temporal separation of 2 s (Fig. 11C; data not shown for the cell in Fig. 12) did not produce any change in the magnitude or character of the response to the target. For tone durations of 5 s and an interstimulus interval of 2 s, the "effective" duty cycle of the control stimulus in Fig. 11 is 10/12, while that of the stimuli shown in B and C is 5/12. The insensitivity of the conditioned response to the introduction of long delays (1-3 s) between the origin and target tones is evidence that the enhanced response to the target results from a change in "effective" duty cycle rather than the presentation of a particular (e.g., 12 kHz) stimulus in the immediate past.
If a nonspecific (and in our case, monaural) mechanism of response adaptation accounts for our results, then it should be possible to demonstrate that in instances where the responses to different origin tones are equivalent, the responses to the ensuing target tone are likewise equivalent. The neuron whose responses are illustrated in Fig. 12 shows robust enhancement of the response to the 7-kHz target when preceded by a 2-kHz origin tone. This cell cannot be said to have recovered from adaptation during the presentation of the 2-kHz origin tone any more than it recovers during the presentation of the target tone during the control stimulus: the neuron responds slightly but not significantly more strongly to the 2-kHz origin tone (Fig. 12B) than it does to the presentation of the target tone for the control stimulus (Fig. 12A). It could be said, however, that the cell's afferents tuned to 7 kHz recover from adaptation during the presentation of the 2-kHz origin tone and that the "effective" duty cycle for these afferents differs substantially across the control and conditioning stimuli. A similar argument applies to the responses to the 6- and 4-kHz origin tones in Fig. 2A.
Figure 13B is a cartoon of a conditioned RF based purely on adaptation of excitation. The response to the target presented alone (i.e., the origin tone is replaced with silence) is indicated (*). For every spike fired in response to the origin tone, there is a proportional reduction in the response to the target so that as the origin tone moves out of the cell's excitatory response area, the conditioned RF asymptotes to the response to the target tone alone (e.g., <3 and >10 kHz). Figure 14 depicts a conditioned RF obtained with very-short-duration tone pairs (20 ms) that conforms to this simple model. The second excitatory peak of the frequency RF (8-9 kHz) is matched by a second trough in the conditioned RF. The data derived from the conventional stimulus paradigms, however, did not typically conform to the predictions of the adaptation of excitation hypothesis, which predicts that the peak of the frequency RF should align with the trough of the conditioned RF. Instances like that schematized in Fig. 13C, where ineffective origin tones suppress the response to the subsequent target, were not uncommon. In Fig. 15, for example, the minimum of the conditioned RF occurs at 4 kHz, where the origin response was no different from that of a number of origins that preceded robust target responses. In addition, the control target response is near the maximum of the conditioned RF despite the prior presentation of the BF at 4.5 kHz.
Careful consideration of the conditioned RFs in earlier figures reveals numerous examples of the failure of the discharge history of the cell, embodied in the frequency RF, to account for the response to the subsequent target tone. As has been noted, misalignment of the frequency RF peak and conditioned RF valley is evident in Figs. 9 and 10. In the latter case, the response to the 7-kHz target is not substantially diminished by the robust response to the preceding 4-kHz origin tone, while the response to the target is powerfully suppressed by the relatively weak response to the preceding origin tone of the same frequency. Again these results are consistent with frequency-specific adaptation occurring in the afferents to the recorded IC neuron. As Figs. 9 and 15 demonstrate, decrements in the response to the target tone can occur in the absence of a significant extracellularly observable response to the origin tone. Although response adaptation subsequent to the robust response to the BF of 4 kHz in Fig. 6 clearly accounts for the trough of the conditioned RF there, the suppression of the responses to targets which follow origin tones that do not elicit responses above the spontaneous rate (e.g., 2.5, 8 kHz) cannot be so explained.
Finally, we also encountered instances of stimulus-specific conditioned enhancement like that schematized in Fig. 13D. The response following a particular origin stimulus is enhanced relative to the response to origin frequencies more remote from the BF, where the "effective" duty cycle of the stimulus would be minimized. The tuned peak of the conditioned RF at 7 kHz in Fig. 16 is incompatible with a nonspecific adaptation of excitation mechanism because the discharge history (and thus the "effective" duty cycle) for the origin tones from 7 to 10 kHz is exactly the same: there was no response prior to the onset of the 4-kHz target. Similarly, the cells considered in Figs. 5 and 9 effectively respond only to 7 and 5 kHz, respectively, but the target responses following other equivalently ineffective origin tones differ significantly from one another.
Evidence for sensitivity to stimulus history in the conditioned RFs
We assessed the prevalence of sensitivity to stimulus rather than discharge history by restricting each of the measured conditioned RFs to origin frequencies that elicited statistically indistinguishable responses. Because the range of tested origin frequencies greatly exceeded the excitatory response area of each cell, there were generally several origin-target pairs that could be equated in terms of discharge history and "effective" duty cycle: none of those origin tones elicited a response when presented (e.g., 4-6 and 8-10 kHz in Fig. 5). In a few cases, target responses that followed relatively robust but equivalent responses to different origin tones were also retained and analyzed separately. The equivalence of the responses to the remaining origin tones was verified by obtaining a nonsignificant result in an ANOVA of repeated trials by origin frequency (P > 0.10). Firing rates <2 Hz, where 2 Hz represented <10% of the response to the BF, were considered not to be significantly different from zero regardless of the outcome of the test. After eliminating differences in discharge history from the conditioned RFs in this way, an ANOVA by origin frequency was then performed on the responses to the target tones. A significant (P < 0.01) result was taken as evidence for a cell being specifically sensitive to stimulus history. Overall, 57% of the conditioned RFs in each paradigm responded significantly differently to the same target tone when it was preceded by origin tones that elicited statistically indistinguishable responses.
Context sensitivity attributable solely to stimulus history was estimated by examining the effect of controlling for discharge history on the ranges of the conditioned RFs. The range of firing rates spanned by the conditioned RF indicates the variability in the response to an identical stimulus wrought by changing the context in which that stimulus appeared. This range has the advantage of being independent of the choice of control reference used elsewhere to estimate the incidence and magnitude of conditioned effects. To normalize for differences in overall firing rate across cells, we divided the range of each conditioned RF by the range of the associated frequency RF. These ranges are indicated by the brackets to the right of conditioned RFs on earlier figures. A value of unity for this "context sensitivity index" indicates that response variability to the same target occurring in different contexts is as great as the variability in the responses to origin tones spanning the excitatory response area of the cell. Context sensitivity indices did not differ overall across stimulus type or time scale, but individual cells tested in more than one paradigm yielded values that were consistent across both stimulus type (n = 20; correlation coefficient = 0.552; P = 0.0116) and time scale (n = 14; correlation coefficient = 0.641; P = 0.0135).
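The context sensitivity index defined above reduces to a ratio of two ranges, as the following minimal Python sketch shows. The function names and the firing rates in the example are invented for illustration and are not data from the study.

```python
def response_range(rates):
    """Range of firing rates (max minus min) spanned by a response function."""
    return max(rates) - min(rates)


def context_sensitivity_index(conditioned_rf, frequency_rf):
    """Range of the conditioned RF divided by the range of the frequency RF."""
    frequency_range = response_range(frequency_rf)
    if frequency_range == 0:
        raise ValueError("frequency RF is flat; index is undefined")
    return response_range(conditioned_rf) / frequency_range


# Invented firing rates (spikes/s), one value per origin frequency.
example_frequency_rf = [0.0, 5.0, 40.0, 8.0]     # responses to the origin tones
example_conditioned_rf = [10.0, 30.0, 20.0, 15.0]  # responses to the common target
example_index = context_sensitivity_index(example_conditioned_rf, example_frequency_rf)
```

In this made-up case the target response varies over half the range of the origin responses, giving an index of 0.5. An index of 1 would mean that context alone produces as much variability as the frequency tuning of the cell.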
Stimulus-specific context sensitivity was then calculated from the ranges of conditioned RFs restricted to those origin-target pairs where the history of the response to the origin tone was statistically indistinguishable as described in the preceding text. The distribution of these indices also did not differ across paradigm. The graph in Fig. 17A shows context sensitivity with response history controlled plotted against the context sensitivity index for the full conditioned RF. The vertical distance from the diagonal indicates the reduction in the range of the conditioned RF when response history is controlled. In a number of conditioned RFs, most conditioning effects were induced by origin tones outside the excitatory frequency response area of the cell and thus cannot be explained by nonspecific adaptation of excitation. The error index in Fig. 17B estimates the degree of apparent "context sensitivity" attributable to variability over repeated trials. It was computed by normalizing the mean standard error of each conditioned RF by the range of the frequency RF. The median values of the context sensitivity indices when discharge history was (0.43) or was not controlled (0.81) far exceeded the median value of the error index (0.074).
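The error index can be sketched in the same way. The code below is a minimal illustration under stated assumptions: the per-origin standard error is taken as the sample standard deviation across repeated trials divided by the square root of the number of trials, and the function names and example values are hypothetical rather than taken from the study.

```python
import math
import statistics


def standard_error(trial_rates):
    """Standard error of the mean across repeated trials (needs >= 2 trials)."""
    return statistics.stdev(trial_rates) / math.sqrt(len(trial_rates))


def error_index(conditioned_trials, frequency_rf):
    """Mean standard error of the conditioned RF, normalized by the
    range of the frequency RF.

    conditioned_trials: one list of per-trial target firing rates per origin
    frequency_rf:       mean response to each origin tone
    """
    mean_sem = statistics.mean(standard_error(t) for t in conditioned_trials)
    return mean_sem / (max(frequency_rf) - min(frequency_rf))


# Invented per-trial target rates (spikes/s) for two origin frequencies.
example_trials = [[10.0, 12.0], [20.0, 24.0]]
example_error_index = error_index(example_trials, [0.0, 30.0])
```

Because the index shares its denominator with the context sensitivity index, the two can be compared directly, as in the median values reported above.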
It should be noted that the foregoing analysis is conservative with respect to the estimate of stimulus-specific conditioning effects because it effectively assumes that the variability in the responses to target tones preceded by origin tones within the excitatory response area is entirely due to discharge history.
Adaptation and conditioning in the IC
If nonspecific adaptation of excitation were the dominant force in conditioning responses to the common target tone, the firing rates elicited by the origin and target tones should be negatively correlated. Although some of these correlations were negative, none attained significance for any paradigm. Because the inclusion of origin frequencies that did not elicit significant conditioning effects could obscure the underlying relationship between adaptation and conditioning phenomena, we restricted the analysis to origin-target pairs that showed significant conditioning relative to the control-at-target reference. Although data from glides and steps failed to show a significant correlation, the response to the origin tone does predict the decrement in the response to the target for the quicksteps (correlation coefficient = 0.287; P = 0.0133), suggesting that nonspecific adaptation of excitation increasingly dominates context effects for short-duration stimuli (cf. Fig. 14).
It is possible that cells with particular adaptation characteristics have particular conditioning characteristics. To investigate the relation of monaural response adaptation to conditioning magnitudes obtained with monaural frequency glides and steps, we took the ratio of the responses to the control stimuli 500-1,000 and 0-500 ms after onset (when analyzed in 500-ms blocks, the responses in our sample reach a statistically "steady" firing rate by the 2nd block). This ratio was extremely consistent across repeated presentations of the control stimulus for each cell included in the analysis, and the average of these ratios was considered the adaptation ratio for the cell. This ratio was then compared with the context sensitivity index described in the preceding text (Fig. 17A). The lack of a significant correlation between these two measures suggests that the factors that control firing rate adaptation for constant stimuli are not necessarily the factors that regulate conditioning effects induced by frequency transitions.
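The adaptation ratio described above can be sketched as follows. This is an illustrative sketch, not the analysis code used in the study: spikes are assumed to be stored as times in milliseconds after control-stimulus onset, the function name `adaptation_ratio` is hypothetical, and skipping presentations with no spikes in the first block is an assumption made here to avoid division by zero.

```python
def adaptation_ratio(trials_ms, block_ms=500.0):
    """Average over presentations of (spikes in 2nd block) / (spikes in 1st block).

    trials_ms: one list of spike times (ms after stimulus onset) per presentation
    block_ms:  block length; 500 ms gives the 500-1,000 vs. 0-500 ms comparison
    """
    ratios = []
    for spikes in trials_ms:
        early = sum(1 for t in spikes if 0 <= t < block_ms)
        late = sum(1 for t in spikes if block_ms <= t < 2 * block_ms)
        if early == 0:
            continue  # assumption: ratio undefined, so the presentation is skipped
        ratios.append(late / early)
    if not ratios:
        raise ValueError("no presentation had spikes in the first block")
    return sum(ratios) / len(ratios)


# Two invented presentations of a control stimulus.
example_trials_ms = [
    [10, 100, 200, 300, 600, 700],
    [50, 150, 250, 450, 550, 900, 950],
]
example_ratio = adaptation_ratio(example_trials_ms)
```

Because both blocks have the same length, the ratio of spike counts equals the ratio of firing rates. Values below 1 indicate a decline in firing rate from the first block to the second, and values above 1 indicate buildup.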
We measured the decay in firing rate to the target tones by taking the ratio of firing rates measured 500-1,000 and 0-500 ms after target onset (for quicksteps, 100-200 and 0-100 ms). Median values of this ratio for the glide, step, and quickstep stimuli were 0.83, 0.80, and 0.81, respectively. We extracted the contributions of conditioning effects by analyzing the distributions of this ratio for each paradigm by response category (enhancement, suppression, and no effect). The firing rates decrease more from one interval to the next for enhanced points, and increase more for suppressed points, relative to origin-target pairs that did not exhibit conditioning, for all paradigms (Kruskal-Wallis, P < 0.0001). Thus it appears that conditioning effects are superimposed on, rather than determined by, the adaptation profile of each cell.
Effects of varying level on conditioning
Our exploration of the effects of level was generally limited to tests during the initial search for stimuli that elicited robust conditioning. Once an effective pair of origin and target frequencies had been identified, we typically varied SPL over a 30-dB range in an attempt to maximize the magnitude of the conditioned effect. The level that proved best was then set, and the additional points necessary to construct a conditioned RF were selected. In several instances, particularly in later experiments, we obtained entire conditioned RFs at multiple SPLs. The primary value of these data was to verify the reliability of origin-specific conditioning effects because the shapes of conditioned RFs generally changed gradually as the SPL was varied. Qualitatively, the nature of these changes appeared to be as specific to individual cells as the conditioned RFs themselves. For example, although it was often clear that the magnitude of conditioned enhancement could be limited by response saturation to the target tone, regardless of context, the high incidence of nonmonotonicity in our sample complicates generalizations about the SPL range that saturated the control response. It was possible for the sign of the conditioned effect associated with a particular origin-target pair to change as the SPL was varied over a sufficiently large range, as shown in Fig. 18. Nevertheless, as we have noted, SPLs 10-15 dB below the best SPL for the cell were most likely to be associated with large conditioning effects.