INNOVATIVE METHODOLOGY
1Department of Neurobiology and the Interdisciplinary Center for Neural Computation, The Hebrew University, Jerusalem 91904, Israel; and 2University Laboratory of Physiology, University of Oxford, Parks Road, Oxford OX1 3PT, United Kingdom
Submitted 19 March 2004; accepted in final form 13 May 2004
| ABSTRACT |
|---|
| INTRODUCTION |
|---|
Illumination with long wavelengths (>700 nm) is preferred for studying visual cortex because these maximize the contribution of changes in hemoglobin oxygenation relative to the contributions of blood volume changes to the modulation of the reflectance (Frostig et al. 1990; Malonek et al. 1997; Shtoyerman et al. 2000). Because stimulus-related changes in reflectance at these wavelengths are apparently not observed, most imaging studies of auditory cortex have been conducted with green light (546 nm), which is thought to reflect mostly changes in blood volume (Frostig et al. 1990). However, at these shorter wavelengths, slow, local, stimulus-independent oscillations are present that can be many times larger in amplitude than the stimulus-evoked changes (Versnel et al. 2002). Consequently, maps generated using green light from auditory cortex by normalizing the averaged frames for a particular stimulus with respect to those obtained for a "null" or "blank" stimulus require long data collection times to average out the nonstimulus-related signals. For example, Versnel et al. (2002) developed a successful paradigm for mapping the frequency sensitivity of auditory cortex, but this paradigm requires about 2.5 h of data collection to measure the responses to just 4 different tone frequencies.
Most of the data reported in this study were acquired using a new paradigm, similar to that used in visual cortex by Kalatsky and Stryker (2003), where images are acquired continuously and stimuli are delivered in a series of constantly repeated sequences, with each sequence running through a particular stimulus parameter of interest (e.g., up the frequency scale). This approach enabled us to use much shorter data-acquisition times and to generate functional maps from both primary and nonprimary cortical areas at much higher resolution in stimulus parameter space. The technique relies, however, on the assumption that there is a constant lag between stimulus and response, and that the major determinants of the responses are independent of the stimulation sequence. These assumptions have been recently tested in a number of studies (Martindale et al. 2003; Nemoto et al. 2004; Sheth et al. 2004) and are critically assessed here.
| METHODS |
|---|
All animal procedures were performed under license from the UK Home Office in accordance with the Animals (Scientific Procedures) Act 1986. Seven adult pigmented female ferrets (Mustela putorius) were used in this study. Two of those were used in preliminary experiments; the results discussed here are from the remaining 5. Two otoscopic examinations, on the day of the experiment and 2 days before, were carried out to ensure that both ears were clean and disease free.
Anesthesia was induced by 2 ml/kg intramuscular injection of alphaxalone/alphadolone acetate (Saffan; Schering-Plough Animal Health, Welwyn Garden City, UK) and maintained, during the surgery, by intravenous injection of supplementary doses when required. Once surgery was complete, anesthesia was switched to halothane (0.5–1.5%; Merial Animal Health, Harlow, UK) with a carrier gas mixture of oxygen (50%) and nitrous oxide (50%).
Usually, the left radial vein was cannulated and a continuous infusion (5 ml/h) of saline supplemented by 5% glucose, dexamethasone (0.5 mg kg⁻¹ h⁻¹; Dexadreson; Intervet UK, Milton Keynes, UK), doxapram hydrochloride (4 mg kg⁻¹ h⁻¹; Dopram-V; Fort Dodge Animal Health, Southampton, UK), and atropine sulfate (0.06 mg kg⁻¹ h⁻¹; C-Vet Veterinary Products, Leyland, UK) was maintained throughout the experiment. A tracheal cannula was implanted for artificial ventilation and gas anesthesia administration.
The animal was placed in a stereotaxic frame and the temporal muscles of both sides were retracted to expose the dorsal and lateral parts of the skull. On the right side of the skull a metal bar was cemented and screwed in place, to hold the head without further need of a stereotaxic frame. This freed the ear canals for the insertion of 2 specula into which earphones (RPHV297, Panasonic, Bracknell, UK) were fixed for acoustic stimulation. On the left side, the temporal muscle was retracted to gain access to the auditory cortex that lies ventrally to the suprasylvian sulcus (Kelly et al. 1986). The most dorsal parts of the suprasylvian and pseudosylvian sulci were exposed by a craniotomy and a stainless steel chamber (16 mm diameter) was cemented and sealed around it (Fig. 1A). The overlying dura was removed and the chamber filled with silicone oil and covered with a glass plate according to procedures described by Bonhoeffer and Grinvald (1996).
Recordings were performed in a purpose-built, double-walled, sound-attenuated chamber. After imaging data collection was completed, the glass cover and silicone oil were removed from the chamber and agar (2% in saline) was placed over the surface of the cortex for electrophysiological recordings with glass-coated tungsten electrodes.
Stimulation protocols
Stimuli constituted sequences of short (30- to 50-ms) tone bursts. Each burst consisted of a fixed-frequency tone with 5 ms ON- and OFF-ramps, and the tone frequency changed slowly from burst to burst over the sequence duration. Typically, the frequency rose continuously over a period of several seconds (4–28, most commonly 12–14 s). Identical sequences were repeated in a continuous loop to produce continuous stimulation. Because of technical limitations, data were collected in sets of 600 frames, which, at the frame rate used most commonly (240 ms/frame), lasted 144 s. Usually, these continuous stimulation periods were repeated 10 times. Thus a total of approximately 25 min of data were collected for each sequence.
The number of frequencies used varied somewhat between experiments. Frequencies were uniformly distributed on a logarithmic scale, with 12 frequencies per octave, between 500 Hz and about 30 kHz (the precise upper limit depended on the quality of the acoustic calibration in each specific animal, and varied between 28 and 32 kHz). The rate of presentation of individual tone pips then depended on the total duration of the sequence. Tone sequences of rising frequency were used in all experiments and, in some experiments, downward sequences were used as well.
Tone level was 75 dB SPL, independent of frequency. To achieve this tone level at all frequencies, a calibration was performed in each ear against a precalibrated microphone. Tone levels were adjusted using the calibration curve to achieve the nominal level.
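As a rough illustration of the stimulus grid described above, the following sketch lists tone frequencies spaced at 12 per octave upward from 500 Hz and derives the pip onset interval for a given sequence duration. The 30-kHz upper limit, the 12-s duration, the even spacing of pips in time, and the function names are illustrative assumptions; the actual upper limit varied between animals.

```python
import math

def tone_frequencies(f_low_hz=500.0, f_high_hz=30000.0, steps_per_octave=12):
    """Log-spaced frequencies from f_low_hz up to (at most) f_high_hz."""
    # Number of whole steps that fit below the upper limit (small epsilon guards rounding).
    n_steps = math.floor(steps_per_octave * math.log2(f_high_hz / f_low_hz) + 1e-9)
    return [f_low_hz * 2.0 ** (k / steps_per_octave) for k in range(n_steps + 1)]

def pip_interval_s(n_tones, sequence_duration_s):
    """Onset-to-onset interval, assuming the tones fill the sequence evenly."""
    return sequence_duration_s / n_tones

freqs = tone_frequencies()
print(len(freqs), "tones; highest:", round(freqs[-1]), "Hz")
print("pip interval for a 12-s sequence:", round(pip_interval_s(len(freqs), 12.0), 3), "s")
```

With longer sequences the same frequency grid is simply presented more slowly, which is why the pip rate depends on the total sequence duration.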
Data collection
Intrinsic optical signals were acquired using Imager 2001VSD+ (Optical Imaging, Mountainside, NJ). In all animals, most of the data were collected while the cortex was illuminated by narrow-band green light (λ = 546 nm; 50%-bandwidth, 10 nm; Coherent-Ealing filter; Ealing Electro-Optics, Holliston, MA) directed through 2 fiber-optic light guides. In all experiments, at least one block of data was also collected with red illumination (λ = 700 nm), and in one experiment optical signals were also collected using orange light (λ = 610 nm). The data collected with red illumination did not show any significant responses, and the data collected in orange illumination showed small areas of significant responses with much weaker stimulus sensitivity than that under green illumination. Therefore all the data presented herein were collected under green illumination.
Images were acquired using a video camera (CS8310C, Tokyo Electronic Industries, Tokyo, Japan), mounted above the cortex and perpendicular to its surface. The area over which data were collected measured approximately 8 × 6 mm, at 1/4 or 1/9 of the maximal resolution (758 × 568 pixels). Blood vessel artifacts at the cortical surface were reduced using a macro double-lens configuration [2 Nikon, 50-mm SLR camera lenses (Nikon, Tokyo, Japan) mounted front to front] with a shallow depth of field and focused 500 µm below the cortical surface. We used the VDAQ/NT data-acquisition software (v1.5, Optical Imaging).
To synchronize the auditory stimulation and the optical image acquisition, we used a separate computer that collected all the necessary timing information using AlphaMap (Alpha Omega, Nazareth, Israel). The same computer also recorded the times of every stroke of the respirator and the waveform of the ECG. The nominal period between images was 80 ms in one experiment and 240 ms in the other 4 experiments. These acquisition rates were sufficiently fast for the slow dynamics of the intrinsic optical signals (Martindale et al. 2003). The precise times of image acquisitions fluctuated somewhat around these values. Usually, this jitter was small. Thus in the 4 experiments with a period of 240 ms, the median interval was 241.8 ms (against a nominal value of 240 ms), and in all blocks, 99% of the intervals differed from each other by <17 ms (about 7% of the nominal interval). The larger intervals usually occurred at the beginning of a data collection run, and were about twice as long as the nominal interval. Thus although the jitter had to be corrected, simple interpolation at equally spaced time points was sufficient for this purpose.
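The jitter correction by interpolation at equally spaced time points can be sketched as follows. This is a minimal illustration, assuming frame times in seconds sorted in ascending order and a start time no earlier than the first frame; the function name and arguments are hypothetical and do not describe the acquisition software.

```python
from bisect import bisect_right

def resample_uniform(times, values, period, t0=None):
    """Linearly interpolate (times, values) at t0, t0 + period, ... within the record."""
    if t0 is None:
        t0 = times[0]
    out = []
    k = 0
    while True:
        t = t0 + k * period
        if t > times[-1]:
            break
        i = bisect_right(times, t) - 1           # index of last frame at or before t
        if i >= len(times) - 1:
            out.append(values[-1])               # t coincides with the final frame
        else:
            w = (t - times[i]) / (times[i + 1] - times[i])
            out.append(values[i] + w * (values[i + 1] - values[i]))
        k += 1
    return out

# Jittered frame times around a nominal 240-ms period (made-up numbers).
frame_times = [0.0, 0.25, 0.47, 0.72, 1.0]
reflectance = [2.0 * t for t in frame_times]
uniform = resample_uniform(frame_times, reflectance, 0.24)
```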
In 3 experiments (F0230, F0234, and F0242), data were also collected using fixed stimuli, according to the paradigm developed by Versnel et al. (2002). Four frequencies (1, 2.8, 8, and 16 kHz) were used in these experiments, interleaved with a fifth, silent, condition. Each trial lasted 17.5 s, during which 14 data frames were acquired. A total of 64 repeats were averaged for each stimulus. The stimuli consisted of four 250-ms tone pips presented once every 500 ms and were triggered 1.5 s after the beginning of the acquisition of the optical signal.
Electrophysiology
Extracellular recordings were made using fixed arrays of 4 tungsten-in-glass electrodes. The signals were band-pass filtered (500 Hz–5 kHz), amplified (10,000×), and digitized at 25 kHz. Data were collected and analyzed using BrainWare (TDT, Alachua, FL). Single units and small multiunit clusters were isolated from the digitized signal. Pure tones ranging from about 0.5–30 kHz and about 10–90 dB SPL were presented pseudorandomly at a repetition rate of once per second with 5–10 repetitions per recording.
All data presented are from units whose spike counts during the stimulus were significantly different from the spike counts in windows of the same duration just before stimulus onset (P < 0.05, paired t-test). Units were classified as tuned if a one-way ANOVA on the evoked spike counts showed a significant (P < 0.05) effect of frequency. Best frequency (BF) was defined as the frequency with the lowest threshold, as measured by spike counts.
Data analysis
The images acquired in response to the continuous stimulation sequences were analyzed on a pixel-by-pixel basis. We refer to the time series of reflectance values observed at one particular pixel as the pixel's "optical waveform." Because the optical reflectance changed only slowly over the cortical surface, the images were spatially downsampled by a factor of 3. The first step in the analysis of the optical waveform was to correct for respiration artifacts. For the experiment with the fast sampling rate, heart rate artifacts were corrected as well. For the experiments with the slower frame-acquisition rate, heart rate (250–300/min, equivalent to about 4–5/s) was about as fast as the frame rate and therefore no correction was needed. Because the algorithm used in both cases was the same, it will be described for the respiration artifacts only.
For each pixel, an average postinspiratory optical signal was determined by averaging the segments between successive ventilator strokes (about 30/min, equivalent to 0.5/s or about 8–9 frames between successive ventilator strokes). To correct for the jitter in the intervals between frames, the optical signal was resampled (using linear interpolation between nearby sample points) at the nominal frame rate, with the zero time of the resampled signal always precisely at an inspiratory event.
The respiratory averaged waveform was then convolved with the sequence of inspiratory event times (considered as delta functions) and subtracted from the raw optical waveform. Because of the jitter in the intervals between frames, the convolution was performed by resampling the average postinspiratory signal at the actual delays at which the frames occurred after an inspiratory event.
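The two steps above (template estimation, then subtraction of the template after every inspiratory event) can be sketched in simplified form. The sketch assumes a uniformly sampled waveform and inspiratory events given as frame indices, so it omits the jitter-aware resampling described in the text. Removing the template mean before subtraction, so that the baseline is left unchanged, is a choice made here for illustration. All names are hypothetical.

```python
def respiration_template(signal, event_idx, seg_len):
    """Average the seg_len samples that follow each inspiratory event."""
    segments = [signal[i:i + seg_len] for i in event_idx if i + seg_len <= len(signal)]
    return [sum(col) / len(segments) for col in zip(*segments)]

def remove_respiration(signal, event_idx, seg_len):
    """Subtract the mean-removed template after every inspiratory event.

    Equivalent to convolving the template with a train of delta functions
    at the event times and subtracting the result from the waveform.
    """
    template = respiration_template(signal, event_idx, seg_len)
    mean_t = sum(template) / len(template)
    corrected = list(signal)
    for i in event_idx:
        for k, v in enumerate(template):
            if i + k < len(corrected):
                corrected[i + k] -= v - mean_t
    return corrected

# Synthetic check: a pure respiratory artifact riding on a constant reflectance.
artifact = [0, 1, 2, 1, 0, -1, -2, -1]
waveform = [100 + artifact[i % 8] for i in range(40)]
events = [0, 8, 16, 24, 32]
clean = remove_respiration(waveform, events, 8)
```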
After correction for respiratory (and, where necessary, heart-beat) artifacts, the optical waveforms were normalized, by expressing them as a proportion of the average reflectance value for each pixel across the entire recording time. All further analysis was performed on the corrected, normalized frame sequences.
From the normalized signal we then calculated the stimulus-dependent reflectance modulation function (RMF) by averaging all the signal segments after the start of a stimulus sequence. Because of the jitter in the intervals between frames, the RMFs were calculated by resampling the normalized optical signal at fixed times beginning at the start of each stimulus sequence, with a resampling period approximately equal to the nominal sampling period of the optical signal (either every 80 or every 240 ms, depending on the experiment). Because the times between stimulus sequence starting points were generally not an exact multiple of the sampling period of the optical signal, the resampling rate of the optical waveform was adjusted slightly, either up or down, to accommodate an exact number of sampling points between 2 successive starts of the stimulus sequence. To judge the statistical significance of the stimulus-evoked modulation, a one-way ANOVA was performed, with the time after the start of the sequence as the factor. In all figures, only pixels with ANOVA F values >2 are displayed. The F-value represents the ratio of the variance of the RMF as a function of time over the average variance of the responses to individual cycles around this mean. It is therefore a "signal to noise ratio", and the value of 2 is a reasonable cutoff point. The significance of this F-value with respect to the null hypothesis of no modulation is <0.01 for all tests performed here, and <0.0001 for the typical tests (sequences of 12 s). Because the parameter maps most often contain about 5,000 pixels, the use of such significance levels would, on average, result in at most 50 pixels (but most commonly <1 insignificant pixel) being displayed.
Figure 1 illustrates how the RMF is extracted from the optical waveform. A short segment of the optical waveform is shown in Fig. 1B. The individual waveform segments between successive starts of the stimulus sequence are shown in Fig. 1C, with the mean waveform and SDs superimposed. The SDs are essentially independent of the lag during the stimulus sequence, justifying the use of one-way ANOVA for detecting significant modulation of the means. In this case, the modulation was highly significant [F(24,3600) = 56.1, P < 0.001].
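A minimal sketch of the RMF calculation and of the one-way ANOVA with lag as the factor is given below. It assumes a waveform that has already been corrected, normalized, and resampled so that every stimulus cycle contains the same whole number of samples; the function name is hypothetical.

```python
def rmf_and_f(signal, cycle_len):
    """Return (RMF, F) for a signal made of whole stimulus cycles of cycle_len samples."""
    n_cycles = len(signal) // cycle_len
    cycles = [signal[c * cycle_len:(c + 1) * cycle_len] for c in range(n_cycles)]
    # RMF: mean across cycles at each lag after sequence onset.
    rmf = [sum(col) / n_cycles for col in zip(*cycles)]
    grand = sum(rmf) / cycle_len
    # Between-lag mean square: variance of the RMF across lags, scaled by n_cycles.
    ms_between = n_cycles * sum((m - grand) ** 2 for m in rmf) / (cycle_len - 1)
    # Within-lag mean square: scatter of the single cycles around the RMF.
    ss_within = sum((cyc[k] - rmf[k]) ** 2 for cyc in cycles for k in range(cycle_len))
    ms_within = ss_within / (cycle_len * (n_cycles - 1))
    return rmf, ms_between / ms_within

# Synthetic example: a 4-sample modulation plus a small cycle-to-cycle offset.
pattern = [0.0, 1.0, 0.0, -1.0]
sig = [pattern[k] + 0.1 * (-1) ** c for c in range(6) for k in range(4)]
rmf, f_value = rmf_and_f(sig, 4)
keep_pixel = f_value > 2          # display criterion used in the figures
```

The degrees of freedom are cycle_len - 1 for the numerator and cycle_len × (n_cycles - 1) for the denominator, as in a balanced one-way design.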
For the experiments in which fixed stimuli were used, the data were analyzed as in Versnel et al. (2002). Significant response in a pixel was defined as a decrease in reflectance of more than 1 SD. The SD was first computed for each stimulus and each pixel separately, and the median of all stimuli and pixels was used for the significance test. Each pixel was assigned to the frequency that elicited the highest peak reflectance change, provided that at least one frequency gave rise to a significant response. However, in many cases more than one frequency gave rise to a significant response, and sometimes the responses to 2 or more frequencies were of comparable magnitudes. An additional way of quantifying the sensitivity of a pixel was therefore used: all the frequencies that gave rise to a significant response in a pixel were averaged, with weights proportional to the peak activation they elicited.
Statistical tests are considered significant when P < 0.05. For tests resulting in extreme values of the statistics, smaller bounds on the P-values are reported.
| RESULTS |
|---|
Optical data were collected in 5 ferrets. Because the use of continuous stimulation is new, we will start by reporting on a number of methodological issues, including the statistical stability of the stimulus-locked signal and the effect of varying stimulation parameters, before describing the spatial distribution of frequency sensitivity observed.
Nature of the stimulus-locked signal
Under green (λ = 546 nm) illumination, the acoustic stimuli could evoke a modulation of the optical reflectance signal of 15% of the mean value. However, modulations of 1–5% were more typical, even for the pixels that exhibited the strongest modulation. The stimulus-locked signal had a complex dependency on the stimulation parameters, as will be shown below. Therefore as a first step in the analysis of this signal, we studied its stability over time within and across blocks.
To study the stability of the stimulus-locked signal during a data collection block, partial RMFs were computed on short sections of about 1 min and compared with the full RMF computed over the whole recording time, which is the average of the partial RMFs. Figure 2 shows 4 examples of the evolution of the partial RMFs in single pixels from 3 animals. These data span the range of behaviors that we observed.
Figure 2B shows a case in which a number of partial RMFs at the beginning of the block had a shape different from that of the rest. These partial RMFs are drawn in green lines, whereas the rest are in black. The correlation coefficient between the full RMF and the partial RMFs is small at the beginning of the block, but after 2 min it is already >0.75 and it remains large to the end of the block. Clearly, the optical signal was not locked to the stimulus at the beginning of the block. After a few cycles, however, it was entrained by the acoustic stimulation and remained highly entrained until the end of the block.
Figure 2C shows a case where the partial RMFs at the end of the block (green) had a shape different from that of the rest.
Finally, Fig. 2D shows a case in which there was no entrainment of the optical signal by the acoustic signal. The lack of entrainment is mirrored by the small F-value in this case (F = 0.5). The correlation coefficients between the partial RMFs and the full RMF are all <0.75.
To quantify these phenomena, we examined the correlation coefficients between the partial and full RMFs (blue lines in the insets in Fig. 2) using 2 measures, the mean of the correlation coefficients and their SD, computed on a pixel-by-pixel basis. Pixels with consistent RMFs are expected to have a large mean correlation coefficient and a small SD. An increase in SD is expected to occur in cases such as those illustrated in Fig. 2, B and C.
To illustrate this, Fig. 3 shows population distributions for mean correlation coefficients and SDs. The data are taken from all frequency sequence blocks from all 5 experiments. Each panel shows a histogram computed for pixels with F > 2 (blue), corresponding to significant modulation of the RMFs, and a histogram computed for pixels with F < 2 (green). The mean correlation coefficients between the partial RMFs and the full RMF (Fig. 3A) are clearly shifted to larger values in those pixels that had significant RMF modulation. There was also a clear, although smaller, shift of the SD toward smaller values (Fig. 3B).
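The two stability measures can be sketched as follows, assuming the partial RMFs of one pixel are available as equal-length lists and that the full RMF is their average, as stated above. The function names are hypothetical.

```python
import statistics

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den

def rmf_stability(partial_rmfs):
    """Mean and SD of the correlations between each partial RMF and the full RMF."""
    n = len(partial_rmfs)
    full = [sum(col) / n for col in zip(*partial_rmfs)]
    r = [pearson(p, full) for p in partial_rmfs]
    return statistics.fmean(r), statistics.stdev(r)

# Three partial RMFs with the same shape but different amplitudes.
partials = [[0, 1, 0, -1], [0, 2, 0, -2], [0, 1.5, 0, -1.5]]
mean_r, sd_r = rmf_stability(partials)
```

A pixel whose partial RMFs change shape part way through a block would yield a lower mean and a larger SD than this idealized case.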
2–5 h between repetitions. Figure 4, A–E show examples of RMFs from pairs of such blocks. Each panel displays RMFs recorded along a line on the cortical surface, chosen to cover a large extent of the significantly modulated area. The top panel in each pair corresponds to the data collected in the earlier block and the bottom panel to the data collected in the later block. Although there are differences in detail, there is good general agreement in the shapes and positions of the RMFs.
Phase shifts in the RMFs between blocks were generally small. Figure 4G shows a histogram of all the shift values, from all pixels with significant modulation in both blocks (purple) and in the complementary pixels (green). Phase shifts close to 0, indicating the same temporal position of the RMFs in both blocks, were by far the most common, but negative shifts of up to about 3 s were also not uncommon. In contrast, positive time shifts were rarer. Negative shifts indicate a delay of the RMF in the later block relative to the earlier one. Thus it seems that, over time, the RMFs either kept their temporal position or tended to be somewhat delayed in the later block. Examples of such delays are also apparent in Fig. 4, B and C. An average time shift of 1–2 s, over a total stimulus sequence duration of 12 s and a frequency range of about 6 octaves, corresponds to a shift of about 0.5–1 octaves in the presumed "trigger frequency" to which the RMF was locked. Because of these limitations in the stability of the RMF over long time periods, it is unrealistic to expect the optical data to line up precisely with electrophysiologically determined tonotopic maps that are recorded much later in the same experiment. Misalignments of as much as one octave on average may be expected between the optical and the electrophysiological measurements. However, the shifts in optical trigger frequency themselves were highly correlated across the cortical surface (e.g., Fig. 4, B and C). Thus the large-scale tonotopic organization was nevertheless reflected faithfully in the optical maps.
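The conversion between a time shift and a shift in trigger frequency can be written out explicitly. Assuming, as in the text, a 12-s sequence that sweeps about 6 octaves at a constant rate on a logarithmic frequency axis (the symbols Δt and Δf are introduced here only for illustration):

$$ \Delta f \approx \frac{6\ \text{octaves}}{12\ \text{s}}\,\Delta t = 0.5\ \text{octaves/s} \times \Delta t $$

so that Δt = 1 s corresponds to about 0.5 octave and Δt = 2 s to about 1 octave.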
Properties of the RMFs
The RMF, by its construction, has the same period as the stimulation sequence. The RMF can therefore be decomposed into sinusoidal components at the period of the stimulation sequence and its harmonics. Kalatsky and Stryker (2003) argued that at a sufficiently fast stimulation rate, the RMF should be dominated by the fundamental frequency and therefore be roughly sinusoidal. They used the phase of the fundamental as their temporal reference point for the RMF. In our hands, for sequence durations longer than 6 s, the shape of the RMFs was frequently not sinusoidal and depended on the period of the stimulus sequence (Fig. 5A). The asymmetry in the RMFs can be quantified by the relative contributions of the frequency components at the sequence duration (the fundamental H0, with a period of D seconds, where D is the sequence duration) and its 1st harmonic (H1, with a period of D/2) (Fig. 5B). For a sequence duration of 4 s, the first harmonic was on average about 30 dB below the fundamental (amplitude ratio of about 3%), whereas for sequence durations of 10–14 s, at which most of the data were collected, the energy ratio was between −10 and −6 dB, corresponding to amplitude ratios of 30–50%. Thus although the fundamental component was dominant on average, the 1st harmonic (and also higher harmonics) made substantial contributions to the shape of the RMF at the longer sequence durations.
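The relative level of the two components can be computed directly from an RMF with a discrete Fourier sum. In this sketch, `k = 1` denotes the component with period D (H0 in the text) and `k = 2` the component with period D/2 (H1); the function names are hypothetical.

```python
import cmath
import math

def harmonic_amplitude(rmf, k):
    """Amplitude of the component with k cycles per sequence (k = 1: period D)."""
    n = len(rmf)
    coeff = sum(v * cmath.exp(-2j * math.pi * k * i / n) for i, v in enumerate(rmf))
    return 2.0 * abs(coeff) / n

def harmonic_ratio_db(rmf):
    """Level of the D/2-period component relative to the D-period component, in dB."""
    return 20.0 * math.log10(harmonic_amplitude(rmf, 2) / harmonic_amplitude(rmf, 1))

# Synthetic RMF: fundamental of amplitude 1 plus a D/2 component of amplitude 0.5.
n = 50
rmf = [math.sin(2 * math.pi * i / n) + 0.5 * math.sin(4 * math.pi * i / n) for i in range(n)]
ratio_db = harmonic_ratio_db(rmf)
```

An amplitude ratio r corresponds to 20·log10(r) dB, so ratios of 3%, 30%, and 50% correspond to roughly −30, −10, and −6 dB.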
To quantify this observation, the gradient of the maps was computed at each pixel that showed significant modulation. The gradient was estimated as the vector of differences between each pixel and its neighbors along the x- and y-axes, and the Euclidean length of this vector was used as the magnitude of the gradient. The maxima of the RMFs tended to stay roughly constant and then move in rather large jumps. Therefore the gradient magnitudes of these maps are expected to have an excess of both very small and very large values relative to the gradient magnitudes of the zero-crossing maps. Because zero-crossing maps are smoother, medium gradient values are expected to dominate.
In the data, the gradients had a highly skewed distribution. To compare the gradients of the maxima and of the zero-crossing maps, the average gradient magnitudes and their SDs were computed for each map. The gradients computed for the maxima maps tended to be larger on average (t = 2.6, df = 89, P < 0.05, paired t-test), and more dispersed (t = 3.4, df = 89, P < 0.05, paired t-test), than the gradients computed for the zero-crossing maps. Figure 5G shows a coarse histogram of the 2 distributions. The bin widths have been selected to achieve near-uniform counts for the histogram of the gradients computed for the maxima maps (black). In the same bins, the counts for the gradients of the zero-crossing maps (red) were much more concentrated in the second bin, with lower probabilities for both smaller and larger values. These findings fully justify the use of zero crossings of the RMFs, rather than their maxima, as the temporal reference points.
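The gradient magnitude used in this comparison can be sketched as follows. The text does not specify which neighbors were used, so forward differences (to the right-hand and lower neighbors) are an assumption made here, as is the use of `None` to mark pixels without significant modulation.

```python
import math

def gradient_magnitude(grid):
    """Forward-difference gradient magnitude; None marks non-significant pixels."""
    rows, cols = len(grid), len(grid[0])
    out = [[None] * (cols - 1) for _ in range(rows - 1)]
    for y in range(rows - 1):
        for x in range(cols - 1):
            c, right, down = grid[y][x], grid[y][x + 1], grid[y + 1][x]
            if c is not None and right is not None and down is not None:
                # Euclidean length of the (dx, dy) difference vector.
                out[y][x] = math.hypot(right - c, down - c)
    return out

# Tiny example map of zero-crossing times (arbitrary units).
zc_map = [[0.0, 3.0, 3.0],
          [4.0, 7.0, None]]
grad = gradient_magnitude(zc_map)
```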
The fixed-trigger model for the RMFs
To be able to derive estimates of preferred stimulus parameters from the RMF it is necessary to estimate the temporal relationship between the RMF and the start of the stimulus sequence. In the example shown in Fig. 1 the start of the stimulus sequence happened to coincide with a minimum in the RMF, but that was not always the case: for other pixels or stimulus conditions the temporal relationship between the RMF and the sequence onset was quite different, as seen in Fig. 2. Our data analysis is based on the assumption that the RMF is the signature of a stereotyped response of a cortical pixel, which is "triggered" when the stimulus sequence crosses the tuning curves of the neurons in the pixel. Consequently, if we understand the (presumably fixed) temporal relationship between an identifiable reference point on the RMF (e.g., the downward zero crossing) and the "trigger point," then we can deduce which stimulus parameter triggered the response. The slow time course of the optical signal implies that the analysis method presented here cannot be used to deduce the "selectivity" of the optical signal recorded from a given pixel to the stimulus parameter, given that it is very difficult to relate the time course of the signal following the trigger point to the properties of the stimulus sequence. Instead, we identify here the "sensitivity" of the optical signal to the stimulus parameter, as indicated by the parameter value that caused the triggering of the optical response. Hereafter, this parameter value will be called the trigger parameter.
In the simplest version of this model, the "temporal position" of the RMF relative to the start of the stimulus sequence is determined by 2 factors. One is a fixed delay T, which depends on the dynamics of the reflectance signal, and is expected to be on the order of a second or so (Devor et al. 2003; Martindale et al. 2003; Nemoto et al. 2004). This delay could, in principle, also depend on the total latency between sound presentation and the response of the cortical neurons. However, neuronal response latencies, on the order of a few tens of milliseconds, are substantially shorter than the latency arising from the dynamics of the reflectance signal itself and can therefore be ignored. The second factor determining the temporal position of the RMF is the time from the beginning of the sequence until the trigger parameter value. If we take downward zero crossings as points of reference on the RMF, then these temporal relationships are given by the following equation (see Fig. 6 for schematic representation)
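$$ \mathrm{zc}(x, y) = T + D \cdot \mathrm{freq}(x, y) \tag{1} $$
where zc(x, y) is the time of the downward zero crossing of the RMF at pixel (x, y), T is the fixed delay, D is the duration of the stimulus sequence, and freq(x, y) is the trigger frequency at that pixel, expressed as a fraction of the sequence duration.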
The simple "trigger parameter" model described here is, of course, likely to be an oversimplification. In the data described below, it is clear that either the delay or the trigger frequency (and probably both) depend on the exact stimulation paradigm used (ascending vs. descending frequencies, sequence duration, and so on). Consequently, the tests of the trigger model described below are unlikely to be borne out with precision, but insofar as they yield at least approximately correct results they do lend support to the validity of the derived estimates of sensitive ("trigger") parameter values.
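Under the fixed-trigger assumption, a zero-crossing time can be converted into a trigger frequency by stepping back by the delay and reading off the frequency being played at that moment. The sketch below assumes an upward sweep that is continuous and uniform on a logarithmic frequency axis (the actual stimuli were discrete tone bursts) and uses illustrative frequency limits; all names and default values are hypothetical.

```python
def trigger_frequency_hz(zc_s, delay_s, duration_s, f_low_hz=500.0, f_high_hz=30000.0):
    """Frequency played delay_s before the zero crossing of an upward log sweep."""
    # Fraction of the sequence elapsed at the trigger time, wrapped into [0, 1).
    position = ((zc_s - delay_s) % duration_s) / duration_s
    return f_low_hz * (f_high_hz / f_low_hz) ** position

# A zero crossing 6 s after the delay has elapsed, in a 12-s sequence,
# corresponds to the geometric midpoint of the swept range.
f_mid = trigger_frequency_hz(zc_s=7.7, delay_s=1.7, duration_s=12.0)
```

The modulo operation handles zero crossings that occur earlier in the cycle than the delay, which belong to a trigger near the end of the preceding sequence.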
The fixed-trigger model: sequences of varying durations
One way of verifying the suitability of the model is by comparing the results of measurements with different stimulus sequence durations. Assuming that freq(x, y) is independent of duration, it is easily shown that
$$ \frac{\mathrm{zc}(x, y)}{D} = \mathrm{freq}(x, y) + \frac{T}{D} \tag{2} $$
Figure 7 shows the results of these tests carried out with data collected at different sequence durations. Figure 7, A and B show RMFs collected in pixels along a line on the cortical surface, for sequence durations of 6 and 10 s (animal F0256). The zero crossings consist of 2 segments interrupted by some pixels with nonsignificant modulation. Figure 7C is a plot of the zero crossings, divided by the duration of each sequence as in Eq. 2; the zero crossings for the other sequence duration tested in this animal, 14 s, are also displayed. The zero crossings have been referred backward by 3.4 s (14 samples), to position the phase jump at approximately the same location for the data at all 3 sequence durations. In this plot, the 3 lines should be parallel to each other (their distance being determined by the delay, see Eq. 2). The data in Fig. 7C are roughly, though not precisely, consistent with this prediction. Figure 7D displays the scatter plot of the zero-crossing times at the 2 shorter durations (6 and 10 s), normalized by the duration (as in Eq. 2) and corrected for phase ambiguity, at all pixels in which the RMFs had significant modulation for both sequences. The scatter plot showed a roughly monotonic relationship between zc(x, y) and D, with a correlation coefficient of 0.84. Correlation coefficients were computed for 40 pairs of maps measured with different durations. The average correlation coefficient was 0.75 ± 0.13 (mean ± SD), and thus the example in Fig. 7D, although somewhat above average, is typical.
We examined this quantitatively using the gradients of the zero-crossing maps. For maps that contain regions of constant values interrupted by discontinuities, as in Fig. 7F, it is expected that the magnitude of the gradients will show an excess of both small and large values, whereas maps with smoother variation would show mainly intermediate gradient magnitudes. Figure 7G shows a coarse histogram of the gradient magnitudes. The gradient magnitudes, computed for all bins with significant modulation of the RMF, have been separately collected for maps generated with frequency sequences of durations shorter and longer than 14 s, respectively. The bins in Fig. 7G have been selected to give equal counts in the histogram of the gradient magnitudes for the maps computed with long durations (black). Using these bins, there is clearly an overrepresentation of intermediate gradient magnitudes in the maps computed for shorter durations (red), confirming the presence of smoother changes in the zero-crossing values across the cortical surface at those durations.
The fixed-trigger model: upward versus downward sequences
Another way of testing the fixed-trigger model is by comparing responses to upward- and downward-frequency sequences (Kalatsky and Stryker 2003). In these cases, the fixed-trigger model predicts that
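$$ \mathrm{zc}_{\mathrm{up}}(x, y) + \mathrm{zc}_{\mathrm{down}}(x, y) = D + 2T \tag{3} $$
where the subscripts denote the zero-crossing times measured with the upward and downward sequences, respectively.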
To quantify the general relationships between the zero-crossing times for the upward- and downward-frequency sequences, the correlation coefficients between them were calculated for all such pairs of maps. Because phase wrapping could affect the quality of the fit, the maps were shifted cyclically by all possible values, and the best (most negative) correlation coefficient was determined. Figure 8 shows 3 examples of such scatter plots. Figure 8A is the best case in the whole data set. The inverse relationship is clearly apparent, in that regions of early zero crossings in one map correspond to regions of late zero crossings in the other map, and the slope of the scatter plot is approximately −1. However, only 2 map pairs out of 10 tested, both from the same animal (F0242), showed this type of behavior.
Ten pairs of maps were compared with this approach. The best correlation coefficients ranged from −0.8 (the data shown in Fig. 8A) to −0.12 (the data shown in Fig. 8C). The mean correlation coefficient was −0.42 (±0.23 SD). The data in Fig. 8B had a correlation coefficient of −0.51, and are therefore typical.
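The cyclic-shift search described above can be sketched as follows. This is a minimal illustration, assuming that zero-crossing times are expressed in integer frame units within one sequence period and that only one of the two maps is shifted; `pearson` and `best_cyclic_correlation` are hypothetical names, not the original analysis code.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def best_cyclic_correlation(up, down, period):
    """Shift `down` cyclically by every possible value (modulo `period`) and
    return the most negative correlation with `up`, plus the shift used."""
    best_r, best_shift = None, None
    for shift in range(period):
        shifted = [(t + shift) % period for t in down]
        r = pearson(up, shifted)
        if math.isnan(r):
            continue
        if best_r is None or r < best_r:
            best_r, best_shift = r, shift
    return best_r, best_shift


# Toy example (assumed values): a perfectly inverse relationship that is
# hidden by phase wrapping in the unshifted data.
up = [1, 2, 3, 4, 5, 6]
down = [2, 1, 0, 9, 8, 7]
r_raw = pearson(up, down)
r_best, shift = best_cyclic_correlation(up, down, period=10)
```

In this toy case the unshifted correlation is positive purely because of the wrap, whereas the search over shifts recovers the underlying inverse relationship.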
Tonotopic maps
To compute maps of trigger parameters from the RMFs at each pixel, the temporal reference point on the RMF was determined as the lag corresponding to the downward zero crossing of the RMF. This choice is justified by the data presented in Fig. 5. Furthermore, in all animals except one (F0242), only upward-frequency sequences were used because of the general incompatibility between the upward and downward maps, as shown in Fig. 8. Animal F0242 was exceptional in that it was the only one with reasonably compatible upward and downward zero-crossing maps (Fig. 8A). Both of these choices are different from those made by Kalatsky and Stryker (2003), but are justified by the character of the data as described above.
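As an illustration of the temporal reference point, the following sketch locates the first downward zero crossing of a sampled RMF by linear interpolation. It assumes uniform sampling in lag and ignores any crossing that falls between the last and first samples of a periodic RMF; the function name `downward_zero_crossing` and the sample values are hypothetical.

```python
def downward_zero_crossing(rmf, dt):
    """Lag (in the units of dt) at which the RMF first passes from positive
    to non-positive values, using linear interpolation between samples.
    Returns None if no such crossing exists."""
    for i in range(len(rmf) - 1):
        a, b = rmf[i], rmf[i + 1]
        if a > 0 and b <= 0:
            # Fraction of the sampling interval at which the line hits zero.
            return (i + a / (a - b)) * dt
    return None


# Toy RMF sampled every 0.5 s (assumed values).
lag = downward_zero_crossing([-1.0, 1.0, 3.0, 1.0, -1.0, -2.0], dt=0.5)
```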
Using the data collected with a variety of pure-tone sequences, trigger-frequency maps were created for each possible value of the delay T. For each value of T, the similarity between the maps from all pure-tone sequences was estimated by computing their variance around the mean, averaged over all pixels that had significantly modulated RMFs in at least 2 conditions. The delay T that corresponded to the least-variable map was selected. This procedure is similar to linear regression of the zero-crossing times against stimulus duration, but with the intercept fixed across all pixels and with weighting that emphasizes the shorter sequences. The emphasis on the shorter sequences is justified by the data presented in Fig. 7.
This procedure is illustrated in Fig. 9. The data for sequence durations of 8 and 16 s, collected in the same animal (F0234), are shown in Fig. 9, A and B, at a delay of 1.7 s (the best delay for this data set). The consensus map, based on all the available data for this animal (durations of 4, 8, 10, 12, 14, 16, and 20 s), is shown in Fig. 9C, and Fig. 9D shows the SDs in the frequency estimates at each pixel. The data are shown only for pixels where at least 2 valid estimates of the zero crossings (F > 2, as derived from the ANOVA) were available. The maps were not spatially smoothed in any way. The maps estimated from single-sequence durations are rather similar to each other, and this is expressed in the relatively low SD of the frequency estimates over most of the cortical surface (<0.5 octaves in 56% of the pixels, and larger than one octave in only 16% of the pixels).
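The delay-selection procedure can be sketched under stated assumptions: the sequence is taken to rise linearly in log frequency over its duration, trigger frequencies are expressed in octaves above the lowest tone, and non-significant pixels are marked with `None`. All names (`trigger_octaves`, `mean_variance`, `best_delay`) and the toy values are hypothetical and are not the original analysis code.

```python
def trigger_octaves(zc, delay, duration, span_octaves):
    """Position of the sequence (octaves above the lowest tone) at time
    zc - delay, assuming a linear rise in log frequency over `duration`."""
    return ((zc - delay) % duration) / duration * span_octaves


def mean_variance(zc_maps, delay, span_octaves):
    """Variance of the trigger-frequency estimates across sequence durations,
    averaged over pixels with at least 2 valid estimates.
    zc_maps maps duration -> list of zero-crossing times (None if invalid)."""
    n_pixels = len(next(iter(zc_maps.values())))
    variances = []
    for p in range(n_pixels):
        est = [trigger_octaves(m[p], delay, d, span_octaves)
               for d, m in zc_maps.items() if m[p] is not None]
        if len(est) >= 2:
            mean = sum(est) / len(est)
            variances.append(sum((e - mean) ** 2 for e in est) / len(est))
    return sum(variances) / len(variances) if variances else float("inf")


def best_delay(zc_maps, candidate_delays, span_octaves):
    """Candidate delay that yields the least-variable set of maps."""
    return min(candidate_delays,
               key=lambda t: mean_variance(zc_maps, t, span_octaves))


# Synthetic data: 3 pixels triggered at fixed fractions of the sequence,
# with a true delay of 1.5 s (assumed values).
true_delay, span = 1.5, 6.0
fractions = [0.2, 0.5, 0.8]
zc_maps = {d: [(f * d + true_delay) % d for f in fractions]
           for d in (4.0, 8.0, 16.0)}
zc_maps[16.0][2] = None  # one non-significant pixel
chosen = best_delay(zc_maps, [0.0, 0.5, 1.0, 1.5, 2.0, 2.5], span)
```

Because a fixed timing error translates into a frequency error proportional to 1/duration, a variance computed in frequency units weights the shorter sequences more heavily, which is in line with the weighting noted in the text.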
Generally, the frequencies assigned to each pixel using the standard paradigm of presenting one tone frequency at a time (Fig. 11) are higher than those assigned using the frequency sequences (Fig. 10). This difference may be attributable to the use of upward sequences for estimating the trigger frequencies. Using such sequences, it is expected that the neurons will be activated when the sequence enters their sensitive areas from below, at frequencies that are below the BF. This would be true even in the presence of lower inhibitory sidebands, given that metabolic demand, which determines blood flow, is thought to depend on the total synaptic activity rather than on the spiking activity (Logothetis et al. 2001).
Large-scale organization of the tonotopic map
The tonotopic arrangement in ferret auditory cortex based on the optical maps (Fig. 10) consists of a central low-frequency area with flanking high-frequency regions. In all maps, a high-frequency focus was found in the dorsal part of the MEG (see Fig. 1A, marked with thick arrows in Fig. 10). In 4 animals, additional high-frequency areas (marked with thin arrows in Fig. 10, A–C and E) were also found ventral to the low-frequency central area. The high-frequency focus located at the dorsal part of the MEG (thick arrows) most likely corresponds to the high-frequency end of areas A1 and AAF. A frequency gradient extending ventrally from the tip of the MEG is consistent with the A1 frequency gradient as usually defined in the ferret, and was seen in all animals. In no case did we observe a clear frequency reversal within this high-frequency area that could be interpreted as a border between high-frequency A1 and AAF. In addition, except for one animal, we did not observe any clear discontinuity within the central low-frequency region that could be interpreted as the border between low-frequency A1 and AAF (as in Fig. 10, A, B, and E). In one case (Fig. 10C, animal F0242), a middle-frequency ridge was present that could indicate the border between the low-frequency A1 and AAF regions. Such a region has also been observed in tonotopic maps generated using electrophysiological recordings (e.g., Kelly et al. 1986) and was also seen in the map generated from the standard stimulation paradigm in this animal (Fig. 11C). Thus in terms of tonotopic organization, A1 and AAF appear to share a continuous gradient that runs approximately dorsoventrally from the MEG to the PEG and AEG, respectively.
Beyond the A1/AAF area, the optical maps consistently showed a frequency reversal and high-frequency sensitivity on the AEG, PEG, or both (Fig. 10, A, B, C, and E). The only exception is one animal (F0253, Fig. 10D) in which the RMFs in response to tone sequences outside A1/AAF were not significant. The observed frequency reversals are indicative of borders with additional, presumably higher-order, auditory fields. Thus our studies reveal at least 2 new fields: one on the AEG, ventral to A1 and AAF and anterior to the pseudosylvian sulcus, and another on the PEG, ventral to the primary fields and posterior to the pseudosylvian sulcus. There may be additional fields between these two, lying inside the pseudosylvian sulcus itself, but such fields cannot be visualized with optical signals.
Single-unit recording data (Phillips et al. 1994) suggest that the representation of sound frequency in A1 may break down at high sound levels. However, in keeping with other imaging studies (Harrison et al. 1998; Spitzer et al. 2001; Versnel et al. 2002), we found that the frequency gradients in A1 and AAF were preserved at high levels, and that the RMFs remained significantly modulated by acoustic stimulus parameters. Because the optical signals are thought to reflect synaptic activity rather than the spiking output of the cortex (Nemoto et al. 2004), this finding may imply that the patchiness observed in the electrophysiological tone-frequency maps is a result of cortical processing of inputs that have a much stricter tonotopic order, even at high sound levels.
Relationship between optical and electrophysiological data
To compare the frequency maps derived from the RMFs with electrophysiological measures of neural tuning, a small number of microelectrode recordings were performed in each animal after the optical recordings had been completed. The responses of 63 multiunit clusters were recorded, of which 53 were recorded at locations to which a frequency could be assigned based on the optical recordings; for 48 of these clusters, a best frequency could be assigned to the electrophysiological data. The responses of the other 5 clusters were not tuned for frequency.
Figure 12 shows examples of frequency response areas derived from the electrophysiological recordings. Three types of behavior can be distinguished. In 15/48 cases, the best frequency of the cluster and the optical frequency were less than half an octave apart. Examples for these cases are shown in Fig. 12, A and B. These cases were about twice as likely to occur inside A1/AAF (11/15) as outside A1/AAF (4/15).
Finally, in the third type of clusters, the optical frequency was located more than half an octave above the cluster BF (12/48). These included all 8 clusters with BFs < 2 kHz (Fig. 12E). At these low BFs, the optical frequency estimates may be distorted because of the large frequency jump that occurred in the stimulus sequence between the very high frequencies at the end of one repeat of the sequence and the very low frequencies at the beginning of the next repeat. Of the remaining 4 clusters with this behavior (4/48, Fig. 12F), 2 occurred within A1/AAF and the other 2 outside A1/AAF.
All the examples in Fig. 12, except that in Fig. 12C, are from A1/AAF. The recording location of Fig. 12C was at the high-frequency anterior area of F0230, just dorsal to the tip of the pseudosylvian sulcus (Fig. 10A).
The distributions of the different classes inside and outside A1/AAF were not statistically different (χ² = 5.6, df = 2, ns).
To verify that the relationship between cluster BFs and optical frequencies was not random, we assumed a model in which the BFs estimated from the electrophysiological recordings are given, and the optical frequencies are randomly distributed across the cortical surface. In this model, it is possible to estimate the probability of finding an optical frequency within a given interval from a cluster BF simply as the ratio of the length of this interval and the frequency range spanned by the stimulus sequence. Using this approach, the expected number of penetrations for which the optical frequencies would be within half an octave from the BF was 8.3, whereas the actual number of cases, 15, was significantly higher (χ² = 10.5, df = 1, P < 0.01). So although BFs and optical trigger frequencies often differed by more than half an octave, their relationship was much closer than would be expected by chance.
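The chance model amounts to a simple ratio, which the following sketch makes explicit. The 6-octave span used in the example is a hypothetical value chosen for illustration, not the span of the actual stimulus sequences, and the sketch ignores truncation of the interval for BFs near the ends of the sequence, so it is not expected to reproduce the value of 8.3 reported above.

```python
def expected_matches(n_sites, window_octaves, span_octaves):
    """Expected number of sites whose optical frequency falls inside a window
    of total length `window_octaves` around the BF, if optical frequencies
    were uniformly distributed over `span_octaves`."""
    p = min(1.0, window_octaves / span_octaves)
    return n_sites * p, p


# "Within half an octave" of the BF corresponds to a window of 1 octave
# in total (0.5 octave on either side).
expected, p = expected_matches(n_sites=48, window_octaves=1.0, span_octaves=6.0)
```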
| DISCUSSION |
|---|
The use of continuous sequences is new, and results obtained with this method have so far been published only for the rat visual cortex (Kalatsky and Stryker 2003). Thus in this study, we carried out a detailed examination of the properties of the optical signal evoked during continuous stimulation and its relationships to the parameters of the stimulation sequence. We will discuss here 1) the validity of the fixed-trigger model; 2) optimal sets of stimulation parameters for generating the optical maps; and 3) the validity of the resulting frequency maps.
The fixed-trigger model
The interpretation of the data collected with continuous-sequence stimulation assumes that the optical signal is generated when a stimulus-frequency sequence crosses some trigger value, which is representative of the frequency selectivity of the neural elements in the tissue. The optical signal and the crossing of the trigger frequency are thought to be separated by an unknown but constant delay. Finding this constant delay is important for fixing the scale, but not the shape, of the frequency map.
Our results show that the fixed-trigger model is at best only an approximation, and that this approximation may hold sufficiently well over only a narrow range of parameters. Two main departures from the fixed-trigger model are documented here. The first is the change in the structure of zero-crossing maps at long-sequence durations, when the maps lose their smoothness and become a mosaic of regions of rather fixed values (Fig. 7). The second, and much larger, departure from the fixed-trigger model is the general lack of concordance between the zero-crossing maps derived from upward- and downward-frequency sequences (Fig. 8).
The partial failures of the fixed-trigger model could result from both sensory and nonsensory factors that contribute to the generation of the optical signal. It has been shown a number of times (e.g., Heil et al. 1992; Nelken and Versnel 2000) that frequency-modulated sweeps produce responses when they cross the border of the tuning curve. In many cases, the response can be shown to be triggered at a frequency that is independent of the velocity of the sweep, with the trigger frequency below the best frequency for upward sweeps and above the best frequency for downward sweeps. A similar phenomenon could occur here, producing different zero-crossing maps for frequency sequences of opposite directions.
More generally, a large number of studies have shown that the responses of neurons in A1 depend on their stimulation history (e.g., Brosch and Schreiner 1997; Calford and Semple 1995; Malone et al. 2002; Ulanovsky et al. 2003). It could be that other adaptive mechanisms, specific to the type of frequency sequences that we used, are responsible for the large differences between the results from the upward and downward sequences. Such adaptation could also affect the results of optical imaging of more complex parameters, and the resulting maps could reflect the interplay of pure sensory responses with specific adaptation mechanisms.
However, under this scenario, it would still be expected that the maps derived from upward- and downward-frequency sequences would show negative correlation. Furthermore, even in the cortex, the high-frequency borders of auditory tuning curves tend to be steeper than the low-frequency borders (e.g., Fig. 12). Consequently, it would be expected that downward-frequency sequences would produce more consistent and representative zero-crossing maps than upward sequences. Instead, the more consistent maps were produced using upward sequences.
Fur