The Journal of Neurophysiology Vol. 86 No. 6 December 2001, pp. 2789-2806
Copyright ©2001 by the American Physiological Society
1Keck Center for Integrative Neuroscience and Department of Physiology, 2Sloan-Swartz Center for Theoretical Neurobiology, and 3Department of Otolaryngology, University of California, San Francisco, California 94143-0444
ABSTRACT
Liu, Robert C., Svilen Tzonev, Sergei Rebrik, and Kenneth D. Miller. Variability and Information in a Neural Code of the Cat Lateral Geniculate Nucleus. J. Neurophysiol. 86: 2789-2806, 2001. A central theme in neural coding concerns the role of response variability and noise in determining the information transmission of neurons. This issue was investigated in single cells of the lateral geniculate nucleus of barbiturate-anesthetized cats by quantifying the degree of precision in and the information transmission properties of individual spike train responses to full field, binary (bright or dark), flashing stimuli. We found that neuronal responses could be highly reproducible in their spike timing (~1-2 ms standard deviation) and spike count (~0.3 ratio of variance/mean, compared with 1.0 expected for a Poisson process). This degree of precision only became apparent when an adequate length of the stimulus sequence was specified to determine the neural response, emphasizing that the variables relevant to a cell's response must be controlled to observe the cell's intrinsic response precision. Responses could carry as much as 3.5 bits/spike of information about the stimulus, a rate that was within a factor of two of the limit the spike train could transmit. Moreover, there appeared to be little sign of redundancy in coding: on average, longer response sequences carried at least as much information about the stimulus as would be obtained by adding together the information carried by shorter response sequences considered independently. There also was no direct evidence found for synergy between response sequences. These results could largely, but not entirely, be explained by a simple model of the response in which one filters the stimulus by the cell's impulse response kernel, thresholds the result at a fairly high level, and incorporates a postspike refractory period.
INTRODUCTION
To understand the coding of information by neurons, it is important to quantify the variability in their responses. When this variability is driven by changes in the stimulus, the neurons can use this to distinguish between stimuli. On the other hand, when this variability occurs in repeated responses to the same stimulus, it acts as noise that reduces the neurons' potential capacity to code information.
The study of neuronal variability has recently seen a rebirth of interest in association with the renewed use of information-theoretic techniques for analyzing neural coding (Bair 1999; Borst and Theunissen 1999; Buracas and Albright 1999; de Ruyter van Steveninck et al. 1997; Meister and Berry 1999; Rieke et al. 1997; Victor 1999). In the visual system, the precision of spike times and counts has been investigated in several neural areas, although only a few studies have looked at the lateral geniculate nucleus (LGN) (Guido and Sherman 1998; Hartveit and Heggelund 1994; Kara et al. 2000; Keat et al. 2001; Reich et al. 1997; Reinagel and Reid 2000; Sestokas and Lehmkuhle 1988). In this paper, we further explore the degree of precision found in LGN neurons of barbiturate-anesthetized cat by examining both spike count and timing measures. We go on to quantify the amount of information transmitted by neurons about the stimulus and to determine the degree to which models of response based on linear integration of inputs can account for the observed precision.
A unique feature of the present approach is that we closely examined
the dependence of neuronal variability on the degree of specification
of the stimulus. To do this, we employed a pseudorandom binary stimulus
known as an M-sequence (Sutter 1992). We focused only on
characterizing the neurons' response to temporally varying stimuli by
showing full-field bright and dark frames, ignoring the center-surround
spatial structure of LGN neurons. M-sequences provide a statistically
efficient and convenient method for analyzing responses because they
have the nice property that every sequence of bright and dark frames of
a given length (up to some limit) is repeated the same number of times
somewhere throughout the sequence (see METHODS). This
allowed us to simultaneously examine the responses (both the mean
response and the variability in the response) to every
sequence of a given length, giving us a detailed characterization of
the neural code for such sequences. By varying this length, we examined
how much of the stimulus had to be specified to maximize the precision
of a neuron's response: e.g., if the neuron's response was influenced
by the last 10 frames and only 5 frames were specified, then the
response would be averaged over the unspecified frames, causing the
neuron's responses to appear more variable than they would be if the
stimulus were fully specified. The variability remaining when the
stimulus was fully specified reflected the neuron's intrinsic response variability.
It is common to characterize a cell's response by its linear temporal
kernel, which (as computed from an M-sequence stimulus and neglecting
normalization; see METHODS) is the difference between its
mean response to a single bright frame and its mean response to a
single dark frame. We found that average responses to a single bright
or dark frame within a sequence showed Poisson-like spike count
variability and temporal dispersion over tens of milliseconds, and the
kernel was correspondingly temporally broad. But by specifying more of
the stimulus (e.g., specifying eight consecutive frames), the response
could become far more precise, with sub-Poisson spike count variability
and temporal precision of 1-2 ms. The information conveyed by the
neuron correspondingly increased, containing as much as 3.5 bits/spike
about longer stimulus sequences. We found that this information
depended on the specification of spike times down to 1-ms resolution
and that the information in consecutive spikes showed little redundancy
or synergy. Finally, we determined that the precision obtained when
multiple frames were specified could be largely, but not entirely,
explained if the spike rate arose from a filtering of the stimulus by
the cell's temporal kernel followed by thresholding, along with
imposition of a postspike refractory period.
Some of this work was previously presented in abstract form (Liu et al. 2000; Tzonev et al. 1997).
METHODS
Experiments
We performed experiments on adult cats under a protocol approved
by the University of California, San Francisco Committee on Animal
Research. Cats were initially anesthetized with isoflurane (1-5%),
and placed on a feedback-controlled heating pad to maintain body
temperature at 37.5-38°C. We established an intravenous line and
thereafter maintained anesthesia via thiopental sodium or pentobarbital
sodium (the latter was given once anesthesia was stable). The heart
rate, respiratory rate, core temperature, O2 saturation, expiratory CO2, and lung pressure
were all continually monitored. After performing a tracheotomy, the
animal was respirated with nitrous oxide in a 1:1 ratio with oxygen. We
performed a craniotomy, and then paralyzed the animal by infusing
gallamine (10 mg · kg⁻¹ · h⁻¹ in lactated dextrose Ringers). The
electroencephalogram (EEG) was subsequently monitored continuously. We
reflected the optic disk onto a white background using a fiber optic
light source, and inserted contact lenses to focus the eyes at a
distance of 35-40 cm.
We recorded extracellularly using tetrodes (Gray et al. 1995) advanced through a guide tube inserted to within a few
millimeters of the LGN. The LGN was recognized by the small (relative
to surrounding structures) and monocular visual receptive fields, and
by the match of topography across repeated penetrations to published accounts (Sanderson 1971). The electrodes were
constructed from 13-µm-diam nickel chromium insulated wire (~20
µm including the insulation). The tips were beveled and gold-plated,
and the typical impedance was in the range of 0.8-1.5 MΩ. Tetrode
signals were amplified and then digitized at 20 or 30 kHz with 12-bit
resolution. The digitized data were continuously streamed to the disk.
To separate signals from different neurons, we sorted based on the spike amplitudes measured at the four tetrode wires. Clustering was
done manually using different two-dimensional projections of the
four-dimensional space.
Stimulus
For visual stimulation, sequences of full-field bright and dark
frames were presented on a computer monitor at the rate of 120 Hz,
yielding a frame duration of $t_f \approx 8.3$ ms. Each frame varied randomly between bright or dark, with a
photopic mean luminance; contrast [measured as $(L - D)/(L + D)$, where L and D
were the luminances of bright and dark frames, respectively] for each
full sequence was chosen from 6, 14, 20, 40, or 80%.
We generated random frames using a binary M-sequence, which is essentially a stream of pseudorandom bits having some special properties (see following text). A bit value of 1 corresponded to a bright frame, and 0 corresponded to a dark frame.
An M-sequence of order n consists of $2^n - 1$ bits. The full sequence can be viewed as a collage of overlapping k-bit sequences, $k \le n$, drawn from the list of all possible binary combinations of k bits. For example, for k = 2, the possible binary combinations are: (0) 00, (1) 01, (2) 10, and (3) 11. Thus a portion of the full sequence consisting of the bits 0110100 can be decomposed as the overlapping combination of the sequences (1), (3), (2), (1), (2), (0). The same decomposition procedure can be applied for any k. The M-sequence has the convenient property that all subsequences of length $k \le n$ randomly appear within the full sequence the same number of times, namely $2^{n-k}$ occurrences (except that the all-zero sequence of length k appears $2^{n-k} - 1$ times). Because of this statistical regularity of the M-sequence, it is an excellent tool for the investigation of a cell's neural code.
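The counting property can be checked numerically. The following is a minimal sketch, assuming a small order-4 shift-register sequence with the recurrence a[t+4] = a[t] XOR a[t+1]; the generator actually used for the order-14 stimulus is not given in the text, and the function names `m_sequence`, `window_counts`, and `decompose` are illustrative only. Windows are counted cyclically, that is, wrapping around the end of the sequence.

```python
from collections import Counter

def m_sequence(n, offsets):
    """Return 2**n - 1 bits of a shift-register sequence.

    The next bit is the XOR of the register entries at the given offsets.
    Whether the result is maximal-length depends on the offsets chosen.
    """
    state = [1] + [0] * (n - 1)              # any nonzero seed works
    out = []
    for _ in range(2 ** n - 1):
        out.append(state[0])
        new_bit = 0
        for j in offsets:
            new_bit ^= state[j]
        state = state[1:] + [new_bit]
    return out

def window_counts(bits, k):
    """Count every k-bit window, wrapping around the end of the sequence."""
    n_bits = len(bits)
    counts = Counter()
    for start in range(n_bits):
        counts[tuple(bits[(start + j) % n_bits] for j in range(k))] += 1
    return counts

def decompose(bits, k):
    """Label each overlapping k-bit window by its integer value (no wrap)."""
    return [int("".join(str(b) for b in bits[i:i + k]), 2)
            for i in range(len(bits) - k + 1)]

seq = m_sequence(4, (0, 1))                  # order n = 4, so 15 bits
pair_counts = window_counts(seq, 2)          # expect 2**(4-2) = 4 per pattern
labels = decompose([0, 1, 1, 0, 1, 0, 0], 2) # the worked example in the text
```

For this order-4 sequence each nonzero two-frame pattern occurs four times and the all-zero pattern three times, mirroring the $2^{n-k}$ and $2^{n-k} - 1$ counts stated above.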
Analysis
Cells were selected for analysis based on the following
criteria. To ensure single cell isolation, we chose only cells with clearly isolated clusters in the various two-dimensional projections of
the four-electrode amplitude space; clusters with clipped responses due
to amplifier saturation were avoided. To achieve reasonable estimates
of the information rates, at least 1,000 spikes were required during the whole
stimulus. Finally, only cells with ON or OFF linear temporal kernels (see following text) were studied, since this
formed the basis for the definition of response events. In total, 12 cells (4 ON, 8 OFF) in one cat were studied at
five contrast levels [80% (9 cells), 40% (6 cells), 20% (3 cells),
14% (1 cell), and 6% (2 cells)], yielding a total of 21 trials.
Response events and precision analysis
To study the precision of spikes, we attempted to classify each
individual spike as part of a spike event evoked in response to a
specific sequence of k frames. This was done by applying the
following algorithm, described here for an OFF cell. We
determined the average stimulus before a spike, and defined the cell's
mean conditional latency (conditioned on a spike) as the time to the zero-crossing between peak and trough in the spike-triggered-average stimulus (illustrated in Fig. 1). Then,
as shown in Fig. 2, for each spike in
the train, we looked back in time from the spike by the mean
conditional latency and found the closest OFF transition (bright frame followed by dark frame) within a window of ±1.5 frames;
the spike was assigned to that transition. If there was no such
transition, the spike was unclassified. We characterized sequences by
their length k and the location t of the
transition within the sequence (e.g., k = 8, t = 3 labeled an 8-frame sequence with a transition at
the onset of the 3rd frame, that is, between the 4th and 3rd frames,
where the 1st frame was the latest in time). For a given choice of
k and t, a given transition was uniquely associated with a surrounding sequence, and the spike was assigned to
that sequence. All spikes associated with the same sequence were
labeled as part of the same event. The percentage of total spikes that
were unclassified served as a measure of the level of "spontaneous"
activity that was not driven by transitions.
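The assignment rule for an OFF cell can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the transition time is taken as the onset of the dark frame, the ±1.5-frame window is treated as inclusive, and the names `off_transitions` and `assign_spikes` as well as the toy frame and spike values are hypothetical.

```python
def off_transitions(frames, frame_ms):
    """Onset times (ms) of dark frames that follow a bright frame."""
    return [i * frame_ms for i in range(1, len(frames))
            if frames[i - 1] == 1 and frames[i] == 0]

def assign_spikes(spike_ms, frames, frame_ms, latency_ms, window_frames=1.5):
    """Map each spike to the nearest OFF transition, or None if unclassified."""
    transitions = off_transitions(frames, frame_ms)
    half_window = window_frames * frame_ms
    assigned = []
    for t_spike in spike_ms:
        target = t_spike - latency_ms        # look back by the mean latency
        best = min(transitions, key=lambda t: abs(t - target), default=None)
        if best is not None and abs(best - target) <= half_window:
            assigned.append(best)
        else:
            assigned.append(None)            # no transition inside the window
    return assigned

frame_ms = 1000.0 / 120.0                    # 120-Hz frame rate
frames = [1, 1, 0, 0, 1, 0]                  # 1 = bright, 0 = dark (toy data)
result = assign_spikes([48.0, 100.0], frames, frame_ms, latency_ms=32.0)
unclassified_fraction = result.count(None) / len(result)
```

In this toy case the first spike is assigned to the transition at the onset of the third frame, and the second spike is left unclassified because no OFF transition lies within the window.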
Once the events were identified for a given choice of k and
t, the probability that a specific sequence produced an
event was computed by dividing the number of times some spike response
(at least 1 spike) was obtained for that sequence by the total number of
presentations of that sequence (i.e., $2 \times 2^{14-k}$ times). This quantity was called the
event probability.
We assessed the timing precision of the first spike in an event for
each sequence consisting of a specified number of frames, k, with transition location t. A
distribution for the times to the first spike in an event (of 1 or more
spikes) was obtained from the numerous presentations of a particular
k-frame sequence. A jackknife estimate of the standard
deviation of this first-spike time was used as the index of the timing
precision (Thomson and Chave 1991), and the error was
taken as the square root of its variance. We approximated the overall
first-spike timing jitter for a given k and t by
the median standard deviation across all k-frame sequences
with transition location t. The timing jitter was then
studied as a function of k and t.
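A jackknife estimate of this kind can be sketched as below. The code uses the standard leave-one-out jackknife (bias-corrected estimate plus the square root of the jackknife variance); whether this matches every detail of the procedure in Thomson and Chave (1991) is an assumption, and the latency values are invented for illustration.

```python
import math
import statistics

def jackknife(values, statistic):
    """Return (bias-corrected estimate, standard error) for a statistic."""
    n = len(values)
    full = statistic(values)
    # Recompute the statistic with each observation left out in turn.
    leave_one_out = [statistic(values[:i] + values[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    estimate = n * full - (n - 1) * mean_loo
    variance = (n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out)
    return estimate, math.sqrt(variance)

first_spike_ms = [31.2, 32.0, 33.1, 30.8, 32.4, 31.9]   # made-up latencies
jitter, jitter_error = jackknife(first_spike_ms, statistics.stdev)
```

The overall jitter for a given k and t would then be the median of `jitter` across all sequences, for example with `statistics.median`.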
To determine whether the timing jitter was correlated with the event
probability, we computed the Spearman rank-order correlation (Press et al. 1992, p. 639-642) for eight-frame
sequences that had t = 3, the transition position that
generally resulted in the smallest timing jitter. In several cases,
there were sequences with very small event probabilities and hence very
few event responses from which to estimate the timing jitter. This
could result in particularly large or particularly small jitters. To
test whether this may have biased our estimate of the correlation, we
calculated the Spearman rank-order correlation under two conditions:
using all sequences and using only those sequences with event
probabilities above a minimum probability. This minimum probability was
arbitrarily taken to be 1/
We also assessed the spike count precision of the events for each sequence of a specified k and t. In this case, we generated a histogram of the number of spikes in the event responses for each sequence, allowing for the possibility of no spikes. A jackknife estimate of the variance of that distribution was used as the index of that sequence's count precision. The error was again taken as the square root of the variance of this estimate. To summarize the results across all sequences of length k with a given t, the Fano factor (variance divided by the mean) for each sequence was also estimated by jackknife. The median spike count Fano factor was then used to show the dependence of spike count precision on k for a given t.
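The event probability and the spike count Fano factor are simple to compute from per-presentation spike counts. The sketch below uses the plain sample variance rather than the jackknife estimate described above, and the spike counts and sequence names are invented for illustration.

```python
import statistics

def event_probability(counts):
    """Fraction of presentations that evoked at least one spike."""
    return sum(1 for c in counts if c >= 1) / len(counts)

def fano_factor(counts):
    """Sample variance of the spike count divided by its mean."""
    mean = statistics.mean(counts)
    if mean == 0:
        return float("nan")      # undefined if the sequence never evoked a spike
    return statistics.variance(counts) / mean

# Hypothetical spike counts, one entry per presentation of each sequence.
counts_by_sequence = {
    "sequence_a": [1, 1, 1, 1, 0, 1, 1, 1],
    "sequence_b": [0, 2, 0, 2, 1, 1, 0, 2],
}
fanos = {name: fano_factor(c) for name, c in counts_by_sequence.items()}
median_fano = statistics.median(fanos.values())
```

A Poisson process would give a Fano factor near 1.0, so values well below 1 indicate sub-Poisson count variability.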
Information analysis
The information in the spike train about the stimulus was quantified using the "direct" method (de Ruyter van Steveninck et al. 1997; Strong et al. 1998a,b). This method estimates the mutual information between stimulus and response "directly" from the spike trains without regard to the details of the stimulus/response relationship and with very few assumptions about the coding strategy. This method relies on the fact that the mutual information between the stimulus and response can be written as the difference of two spike train entropies. First, the maximum amount of information that a spike train response $\mathcal{R}$ can provide about the stimulus is just given by the entropy of the spike train itself, $H(\mathcal{R})$. This is estimated from the probability distribution of spike responses over the course of the whole experiment without specific knowledge of the stimulus. Second, the information the spike train carries about the stimulus is reduced from this maximum by the degree to which there is variability or noise $\mathcal{N}$ in the repeated responses to an identical stimulus, as measured by the spike train noise entropy, $H(\mathcal{N})$. This is estimated from the probability distribution of spike responses to multiple, identical presentations of the same stimulus, averaged over stimuli.
With the M-sequence, responses to the repeated presentations of each k-frame stimulus sequence were easily obtained. For each occurrence of a specific k-frame sequence, the response beginning at a delay $\delta$ (ranging from 0 to 130 ms) relative to the onset of the initial frame of the sequence was divided into bins of size $\Delta\tau$ (usually 1 ms) containing the number of spikes in each bin. These bins were combined to form spike "words" of length $T = M\Delta\tau$, where M was an integer number of bins. For example, for M = 3, the joining of three bins containing 2, 0, and 1 spikes, respectively, would yield the word 201 (note that the absence of spikes in a bin can be informative, and its contribution was included).
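Word construction amounts to counting spikes in consecutive bins. A minimal sketch, with the hypothetical helper `spike_word` and invented spike times, reproduces the 201 example:

```python
def spike_word(spike_ms, start_ms, bin_ms, n_bins):
    """Count spikes in n_bins consecutive bins starting at start_ms."""
    word = [0] * n_bins
    for t in spike_ms:
        index = int((t - start_ms) // bin_ms)
        if 0 <= index < n_bins:              # ignore spikes outside the word
            word[index] += 1
    return tuple(word)

# Two spikes in the first 1-ms bin, none in the second, one in the third.
word = spike_word([10.2, 10.7, 12.5], start_ms=10.0, bin_ms=1.0, n_bins=3)
```

Empty bins are kept as zeros, so a silent response still yields a well-defined word.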
We then computed the entropies for each choice of k, T, and bin size $\Delta\tau$ by building the probability distribution of these words across the whole experiment for the total entropy $H_{k,\Delta\tau,T}(\mathcal{R})$ and across the multiple repeats of the ith k-frame stimulus sequence ($i = 1, \ldots, 2^k$) at time-shift $\delta$ for the noise entropy $H_{i,\delta,k,\Delta\tau,T}(\mathcal{N})$. Note that the location of a transition, t, within the k-frame sequence was now irrelevant and not specified; instead all k-frame sequences contributed equally to this analysis. Both T and $\Delta\tau$ were varied to obtain estimates of the entropy on different time scales. For a given T and $\Delta\tau$, the average information about the k-frame sequence that began at time $\delta$ before a response word was then given by $H_{k,\Delta\tau,T}(\mathcal{R}) - \langle H_{i,\delta,k,\Delta\tau,T}(\mathcal{N}) \rangle_i$, where $\langle H(\mathcal{N}) \rangle_i$ was the average noise entropy across all k-frame stimulus sequences (i.e., average over i). We assigned the information about k-frame sequences, for the given T and $\Delta\tau$, as the maximum information across $\delta$ (see following text).
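The entropy difference can be illustrated with plug-in (empirical frequency) entropy estimates. The sketch below omits the finite-data correction and the search over delays, and the word lists are toy data rather than recorded responses; the function names are illustrative.

```python
import math
from collections import Counter

def entropy_bits(words):
    """Plug-in entropy (bits) of the empirical word distribution."""
    counts = Counter(words)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def direct_information(words_by_sequence):
    """Total entropy minus the noise entropy averaged over stimulus sequences."""
    all_words = [w for words in words_by_sequence.values() for w in words]
    total_entropy = entropy_bits(all_words)
    noise = [entropy_bits(words) for words in words_by_sequence.values()]
    return total_entropy - sum(noise) / len(noise)

# Toy case 1: each stimulus sequence always evokes its own word (1 bit).
reliable = {"00": [(0, 0)] * 4, "10": [(1, 0)] * 4}
# Toy case 2: the response is unrelated to the stimulus sequence (0 bits).
unreliable = {"00": [(0, 0), (1, 0)] * 2, "10": [(0, 0), (1, 0)] * 2}

info_reliable = direct_information(reliable)
info_unreliable = direct_information(unreliable)
```

Pooling all words gives the total entropy here because every sequence is presented equally often, which is the situation the M-sequence provides.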
First though, for each combination of T, bin size $\Delta\tau$, k, and delay $\delta$, we corrected for finite-data errors. This was done by computing the mutual information for different partitions of the data: the whole data set, and the average over each half of the set, over each third, and each fourth. This average information was then plotted as a function of the number of partitions N, and fit to the functional form $I = I_0 + I_1/N + I_2/N^2$ (Strong et al. 1998b). $I_0$ therefore represented the true information rate extracted from the limit of infinite data for a given T, $\Delta\tau$, k, and $\delta$. Note, however, that when the amount of data was too small, even this correction failed. Empirically, this occurred when the ratio of $I_2$ to $I_0$ became large. We used a ratio of $2 \times 10^{-3}$ as the border between sufficient and insufficient data and show results only for cases in which data were sufficient by this criterion. In practice, the corrections for finite data were typically tiny, and the point of this procedure was primarily to screen out cases (e.g., too-large k or too-large T) for which data were insufficient.
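A least-squares fit of the quadratic form can be sketched in a few lines. The code takes the functional form at face value, with N supplied by the caller exactly as it appears in the formula; the data points are synthetic, and the small Gauss-Jordan solver is only there to keep the example free of external libraries.

```python
def solve_linear(a, b):
    """Solve a small dense system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def extrapolate_information(n_values, info_values):
    """Least-squares fit of I = I0 + I1/N + I2/N**2; returns [I0, I1, I2]."""
    basis = [[1.0, 1.0 / n, 1.0 / n ** 2] for n in n_values]
    normal = [[sum(row[i] * row[j] for row in basis) for j in range(3)]
              for i in range(3)]
    rhs = [sum(row[i] * y for row, y in zip(basis, info_values))
           for i in range(3)]
    return solve_linear(normal, rhs)

n_values = [1, 2, 3, 4]
# Synthetic information values generated from known coefficients.
info_values = [2.0 + 0.5 / n + 0.1 / n ** 2 for n in n_values]
i0, i1, i2 = extrapolate_information(n_values, info_values)
```

The fitted constant term `i0` plays the role of the extrapolated information, and the ratio `i2 / i0` is the quantity screened against the sufficiency criterion.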
Given the corrected information, we assigned the information about k-frame sequences as follows. For the given k, T, and bin size $\Delta\tau$, we determined the delay $\delta$ that maximized the information. The information, I, was then assigned to be the average information over the bins within ±4 ms around this maximum. (We chose this to correspond to about a frame width, so that averaging smoothed out any frame-related artifacts.) The information rate of the spike train, in units of bits/time, was $I/(M\Delta\tau)$. We converted this to units of bits/spike, $I_{sp}$, by dividing by the neuron's average spike rate, r, assessed over the entire two-M-sequence stimulus: $I_{sp} = I/(rM\Delta\tau)$.
This method worked well only for relatively short response words. Long response words required long stimulus sequences to minimize the randomizing effect of different stimulus contexts on early or late portions of the response word. However, since each sequence repeated $2 \times 2^{14-k}$ times, as k increased, our estimate of the entropies degraded due to sampling problems. Thus to consider very long response words, we employed a different strategy: we estimated a lower bound on the information carried by the spike train about the stimulus by applying the direct method to the two repeats of the full M-sequence. Assuming that the only thing in common between the two presentations of the M-sequence was the stimulus itself and that therefore the noise in the two cases was uncorrelated, the information that one response $\mathcal{R}_1$ carried about the second response $\mathcal{R}_2$, $I_{\Delta\tau,T}(\mathcal{R}_1, \mathcal{R}_2)$, should be a lower bound to the information between either response $\mathcal{R}$ and the stimulus $\mathcal{S}$, $I_{\Delta\tau,T}(\mathcal{R}, \mathcal{S})$ (Strong et al. 1998b). We took each response to be the spike train generated by each full M-sequence, minus the first and last 200 ms. We then computed each spike train's entropy, $H_{\Delta\tau,T}(\mathcal{R}_i)$, i = 1, 2, for words of length T, and the joint entropy, $H_{\Delta\tau,T}(\mathcal{R}_1, \mathcal{R}_2)$, for the co-occurrence of words in the two spike trains. These were computed from the probability distributions for words by using overlapping intervals (incremented by the bin size $\Delta\tau$, to increase the effective number of samples). To correct for finite-data errors, data size scaling was applied in this case directly to the entropy estimations (rather than to the mutual information as in the data size scaling described above); an example is shown in Fig. 3A. The mutual information between the two responses was then

$$I_{\Delta\tau,T}(\mathcal{R}_1, \mathcal{R}_2) = H_{\Delta\tau,T}(\mathcal{R}_1) + H_{\Delta\tau,T}(\mathcal{R}_2) - H_{\Delta\tau,T}(\mathcal{R}_1, \mathcal{R}_2) \qquad (1)$$

was small. Hence, to summarize
the dependence for a particular bin size, the infinite-word-length limit was taken by obtaining a linear fit to the plots of the (infinite
data limit) entropies versus 1/T, and using the y-intercept as the (infinite word limit) entropy rates in the calculation of the information rate. The fit was performed only over the range of
1/T where sufficient data were available to accurately
estimate the entropy rates, as illustrated in Fig. 3B. In
practice, T's ranged from 8 to 48 ms. Finally, the
information per second from words of spikes was converted into the
information per spike by dividing by the mean spike rate across the
whole experiment.
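The information shared by two repeats follows from the two single-train entropies and the joint entropy, assuming the usual identity I = H(1) + H(2) - H(1, 2). The sketch below uses plug-in entropies on toy binned spike trains and leaves out the data-size scaling and the extrapolation in 1/T; the function names are illustrative.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Plug-in entropy (bits) of the empirical distribution of symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def overlapping_words(counts_per_bin, n_bins):
    """All words of n_bins bins, stepping forward one bin at a time."""
    return [tuple(counts_per_bin[i:i + n_bins])
            for i in range(len(counts_per_bin) - n_bins + 1)]

def repeat_information(train_1, train_2, n_bins):
    """Information between time-aligned words of two binned spike trains."""
    w1 = overlapping_words(train_1, n_bins)
    w2 = overlapping_words(train_2, n_bins)
    joint = list(zip(w1, w2))                # co-occurring word pairs
    return entropy_bits(w1) + entropy_bits(w2) - entropy_bits(joint)

train = [0, 1, 0, 0, 1, 1, 0, 1]             # toy spike counts per bin
silent = [0] * len(train)
info_identical = repeat_information(train, train, 2)
info_silent = repeat_information(train, silent, 2)
```

Two identical repeats share all of the word entropy, while a repeat that never varies shares none of it.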
Models
We constructed quasi-linear threshold models of driven LGN
spiking activity to investigate whether the observed precision could be
explained by simple mechanisms. All models convolved the full
M-sequence stimulus, binned at one-sixth the frame period, with the
cell's temporal kernel to generate a firing function, f(t) (linear part). These responses were
thresholded and perhaps squared (nonlinear part) to generate firing
rates r(t), as follows. We defined
$r(t) = \alpha\,([f(t) - \theta]_+)^p$, where
$[x]_+ = x$ for $x > 0$ and 0 otherwise; p = 1 for a linear function and
p = 2 for a quadratic function; and the scale factor
$\alpha$ was chosen to make the mean of
r(t) equal to the observed mean firing rate. The
value of the threshold $\theta$
was fit as described in the following text. Finally, spikes were generated as a Poisson process from these rates,
perhaps along with a refractory period, as will be described in the
following text.
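The threshold nonlinearity and the normalization to the mean rate can be sketched as follows. The filter output values are invented, and the helper name `rate_from_filter_output` and its parameter names are illustrative rather than taken from the text.

```python
def rate_from_filter_output(f, threshold, p, target_mean_rate):
    """Rectify f(t) above threshold, raise to power p, rescale to the mean."""
    rectified = [max(x - threshold, 0.0) ** p for x in f]
    mean_rectified = sum(rectified) / len(rectified)
    if mean_rectified == 0.0:
        raise ValueError("threshold exceeds every value of f(t)")
    gain = target_mean_rate / mean_rectified   # fixes the mean firing rate
    return [gain * x for x in rectified]

f = [-1.0, 0.5, 2.0, 3.5, 1.0, -0.5]           # made-up filter output
rate = rate_from_filter_output(f, threshold=1.0, p=1, target_mean_rate=10.0)
mean_rate = sum(rate) / len(rate)
```

Raising the threshold concentrates the predicted firing into fewer, larger peaks while the gain keeps the mean rate fixed, which is how a high threshold can sharpen the modeled response.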
The temporal kernel was determined as the spike-triggered-average
stimulus, divided by the autocorrelation (or in Fourier space, the
power spectrum) of the M-sequence stimulus (the power in the M-sequence
at frequency f is proportional to $[\sin(\pi f/r_f)/f]^2$,
where $r_f = 120$ Hz is the frame rate). This
division yields the linear filter that, applied to the stimulus, gives
the best estimate of the response in the sense of least mean-square
error (Rieke et al. 1997). The spike-triggered average
and temporal kernel for one cell can be seen in Fig. 1. The division is
done in Fourier space, where it simplifies to a frequency-by-frequency
division; otherwise it would involve multiplying one matrix by the
inverse of another matrix. However, one does not want to continue
dividing up to arbitrarily high frequencies where the power in the
stimulus approaches zero, as this will just amplify high-frequency
noise. We chose to do the division up to some cutoff frequency, and to set all power above that cutoff frequency to zero. To choose a cutoff
frequency, we tried cutoffs from 75 to 100 Hz in 5-Hz steps. For each
cutoff, we applied the corresponding filter to the M sequence to obtain
the output f(t), converted this to a rate
function r(t) as described in the preceding text
using p = 1, and chose the threshold $\theta$ as that which
minimized the mean-square error difference between the predicted
Poisson rate function and the eight-frame PSTH for the actual data. We
then chose the cutoff frequency that gave the least mean-square error;
this best cutoff was 90 Hz. This kernel was used subsequently in all
models to draw actual spikes for PSTH comparison (see following text).
The conversion from r(t) to spikes was as
follows. We interpolated r(t) to achieve a
temporal resolution of 1/60 of a frame (the spike-triggered average and
temporal kernel had been computed in bins of 1/6 of a frame or ~1.39
ms). For the simple Poisson case, spikes were then generated in each
time bin $\Delta t$ with probability $r(t)\,\Delta t$, using $\Delta t = 139$ µs. For the case of a Poisson process with a refractory period, a
free firing rate, q (Berry and Meister 1998),
was generated assuming a specific refractory period, µ, by taking
$q(t) = r(t)/[1 - r(t)\mu]$. Spikes were then drawn as in the
Poisson case but using q(t) rather than
r(t). In the case of only an absolute refractory
period, the probability of a spike was set to zero for µ ms after
each spike. We also tried adding an exponential recovery after the
absolute refractory period, setting µ = µabs + µrel, where
µabs was the absolute refractory period and
µrel was the exponential recovery of the
probability from zero up to q(t). This
implementation for a relative refractory period is reasonable when
µrel is smaller than the characteristic time
over which the firing rate remains relatively constant.
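Spike generation with an absolute refractory period can be sketched as below. The sketch covers only the absolute refractory case (no exponential recovery), uses a constant 100-Hz rate instead of a model-derived r(t), and relies on a seeded generator so that the run is repeatable; all numerical values other than the bin width are illustrative.

```python
import random

def free_rate(r_hz, refractory_s):
    """q = r / (1 - r * mu); only meaningful while r * mu < 1."""
    return r_hz / (1.0 - r_hz * refractory_s)

def simulate_spikes(rate_hz, dt_s, refractory_s, rng):
    """Draw spikes bin by bin, silencing the train for refractory_s after each."""
    spikes = []
    last_spike = None
    for i, r in enumerate(rate_hz):
        t = i * dt_s
        if last_spike is not None and t - last_spike < refractory_s:
            continue                         # inside the absolute refractory period
        if rng.random() < free_rate(r, refractory_s) * dt_s:
            spikes.append(t)
            last_spike = t
    return spikes

dt = (1.0 / 120.0) / 60.0                    # 1/60 of a frame, about 139 us
rng = random.Random(0)
spikes = simulate_spikes([100.0] * 72000, dt, refractory_s=0.002, rng=rng)
intervals = [b - a for a, b in zip(spikes, spikes[1:])]
```

Using q(t) rather than r(t) compensates for the time lost to refractoriness, so the simulated mean rate stays close to the intended 100 Hz over the 10-s run.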
For each of the models, an optimal threshold and refractory period(s) (if applicable) were selected simultaneously to minimize the mean-square error between the real data and the model of the segment of the eight-frame PSTHs defined by the 18 ~1.39 ms bins before and the 7 bins after the end of the eight-frame sequence. This was done by trying every threshold from 1 to 5 in steps of 0.2, (if applicable) absolute refractory periods from 1 to 4 ms and relative refractory periods from 0.5 to 4 ms in steps of 0.5 ms for which q(t) remained positive, and then selecting the combination of threshold and refractory periods that gave the least mean-square error. These ranges seem reasonable because in no case was the optimum parameter at an extreme of the range explored for that parameter. The mean firing rate over the whole stimulus in the model was typically matched to within a few percent of the data's mean.
RESULTS
Full-frame, binary, 14-bit M-sequence stimuli were presented at different contrast levels. In general, this stimulus drove cells in the LGN well. Average spike rates across all cells and stimulus conditions ranged from 4.6 to 25.3 Hz. Neural responses were usually triggered by transitions from either bright to dark frames (OFF cell), or vice versa (ON cell); we referred to two-frame sequences of bright/dark or dark/bright as an OFF or ON transition, respectively. Each cell's polarity was determined by reverse correlating the spike train with the M-sequence stimulus. Figure 1 presents the spike-triggered-average stimulus for one of our good OFF cells (cell 4, 80% contrast) that had a strongly driven response producing nearly 7,000 spikes. We use this cell to illustrate the main results of our analysis. A spike at time 0 for this cell was generally preceded by a transition from bright to dark ~32 ms earlier. This time delay was referred to as the cell's mean conditional latency. Figure 1 also illustrates the cell's temporal kernel (see METHODS), which represents the cell's temporal receptive field and has the same 32 ms mean conditional latency; we will return to this later.
An initial 1,200 frames (10 s) from the beginning of the M-sequence
were presented to adapt the cells to the stimulus ensemble before
showing the M-sequences used in data analysis. After the conditioning,
two repeats of the full M-sequence were displayed without delay. A
total of $2 \times 2^{14-k}$ repetitions of
each k-frame sequence ($k \le 14$) occurred,
e.g., 128 repeats of each eight-frame sequence. Because of this
convenient property, it was natural to focus on responses to the set of
k-frame sequences for different k.
Mean response: the PSTH matrix
The M-sequence stimulus presented frames of random stimuli in series rather than in isolation. To obtain an average response to a specific stimulus sequence, we extracted the individual spike responses to the multiple presentations of that sequence in the full M sequence. Consider first the case of one-frame stimuli. The average response to single bright or dark frames of stimuli was generated in the form of a matrix of PSTHs (Fig. 4). The shading in each 1-ms bin corresponds to the total number of spikes from all presentations of this sequence at that time relative to the frame onset. Note that there was a nonzero spike rate even at the time origin that was nearly the same for both bright and dark frames. This reflects the fact that at early times, the spikes were responses to earlier frames over which we had averaged. The response to the particular bright or dark frame was most clear at ~32 ms as expected from the cell's mean conditional latency.
One advantage of visualizing a PSTH matrix is in the ability to display the neuron's average responses to stimuli more complex than just a single frame, as shown in Fig. 5 for two-frame sequences. This clearly shows that spikes tended to be generated near the mean conditional latency in response to an OFF transition (stimulus 2), whereas spiking was clearly suppressed near the mean conditional latency by an ON transition (stimulus 1). Note that the response to a dark frame (stimulus 0 in Fig. 4) was now broken down according to whether the preceding frame was dark or bright (stimuli 0 and 2, respectively, in Fig. 5).
Figure 6 displays the PSTH matrix (with 1-ms time bins) for the response to seven-frame sequences, sorted according to the rightmost two frames, f1 and f2 (we usually numbered frames in a k-frame sequence consecutively as fn, n = 1, ... , k, with f1 the latest in time and fk the earliest). This grouped together all responses to sequences with an OFF transition in the most recent two frames. As expected, a large vertical band of spikes centered at ~32 ms appeared in response to the OFF transition. One striking feature was the slight slant in time of the OFF response band near 32 ms. Qualitatively, for this cell, the time to the first spike was correlated with the amount of time the stimulus had been bright prior to the final transition to dark: the longer this time, the earlier the occurrence of the first spike in the response.
Moreover, the spikes in this band were noticeably isolated in time on both sides by regions of virtually no spikes, suggesting that there was a high degree of temporal precision in the response when seven frames of the stimulus were specified. To examine this, each spike should ideally be classified as part of a response to a particular sequence. In the PSTH matrix though, each spike occurred multiple times, each time associated with a different time frame and sequence. Hence, echoes of the main OFF response appeared in the other quadrants of the PSTH matrix where an OFF transition occurred earlier in the sequence.
Event classification
To classify a spike to a unique sequence, a search was performed to find the OFF transition that was most likely to be responsible for a given spike. All spikes classified to the same transition were then grouped together as the spike "event" in response to the sequence containing that transition (see METHODS). In practice, this algorithm reproduced the event structure quite well, as can be seen from the comparison of Figs. 7 and 8. These show the PSTH matrix and the extracted unique spike events, respectively, for the 1/4 of eight-frame sequences having an OFF transition in their final two frames. The band of spikes near 32 ms was clearly reproduced in the spike events. Virtually all spikes in the train were accounted for by this technique; only 1.8% of the spikes were unclassified. (Note that spikes placed at random would show 5/16, or 31%, unclassified.)
In general, for the group data across all cells, 10 of 21 trials had unclassified percentages <5%, while for the remaining 11 trials this was larger than 5%. Qualitatively, the unclassified percentage was correlated with the degree to which spikes were locked to the stimulus as evidenced by visual isolation of spikes around the mean conditional latency in the PSTH matrix. When the spikes around the mean conditional latency could be visibly isolated (10 of 21 trials), the algorithm appeared to yield fairly low unclassified percentages (9 of those 10 trials). The one exception was a 40% contrast trial for an ON cell in which the events in response to an ON transition were fairly well isolated yet the unclassified percentage was nevertheless high (26%), probably because spikes were also produced without a transition when the stimulus had been bright for several frames. In cases when locking was evident but poor (5 of 21 trials had bands of increased spiking, but these were not well isolated) or when spiking was more indiscriminate (6 of 21 trials had poorly distinguishable bands), the unclassified percentage tended to be larger (10 of these 11 trials had unclassified percentages above 5%). The one exception was a 6% contrast trial for an OFF cell with a weak linear kernel; its events were not well isolated, but its unclassified percentage was nevertheless low (3.5%).
For each sequence, we defined its event probability to be the percentage of its occurrences that evoked an event of one or more spikes.
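As a small illustration of this definition, the sketch below computes the event probability from the spike count observed on each occurrence of one sequence. The function name and the example counts are hypothetical.

```python
def event_probability(spike_counts_per_occurrence):
    """Percentage of occurrences of a sequence that evoked one or more spikes."""
    n = len(spike_counts_per_occurrence)
    if n == 0:
        raise ValueError("need at least one occurrence of the sequence")
    n_events = sum(1 for c in spike_counts_per_occurrence if c >= 1)
    return 100.0 * n_events / n


# Illustrative counts only, not recorded data: 4 of 8 occurrences evoked spikes.
print(event_probability([0, 1, 2, 0, 1, 0, 3, 0]))  # 50.0
```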
Response variability
SPIKE TIMING PRECISION. Using the binary k-frame sequences to characterize the stimulus, and the spike events to characterize the response, we turn to the next issue of this paper: a study of the reliability and precision of responses and their dependence on the stimulus. The timing precision of these events was examined by determining the jitter in the time of the first spike in the events associated with a particular sequence. This is shown in Fig. 9A for the only possible two-frame sequence with an OFF transition. This sequence generated a spike response 49% of the time, and the time of the first spike had a standard deviation of 3.25 ± 0.04 ms. Because the responses to all possible combinations of stimulus frames before and after the two frames of the transition were averaged together, this standard deviation represented the precision achieved by the two frames of the OFF transition alone, when the other frames were unspecified. Its value was already less than the standard deviation expected (7.2 ms) if the first spike times were distributed uniformly over the three-frame search window that defined events.
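The comparison with random placement can be sketched as follows. For first-spike times spread uniformly over a window of width W, the standard deviation is W/sqrt(12). The frame duration used below (1/120 s, about 8.3 ms) is not stated in this section; it is an assumption, chosen because a three-frame window of that size reproduces the quoted 7.2 ms. Function names are hypothetical.

```python
import math
import statistics


def first_spike_jitter(first_spike_times_ms):
    """Sample standard deviation (ms) of first-spike times for one sequence."""
    return statistics.stdev(first_spike_times_ms)


def uniform_sd(window_ms):
    """Standard deviation of a uniform distribution over the given window."""
    return window_ms / math.sqrt(12.0)


FRAME_MS = 1000.0 / 120.0  # assumed frame duration, not given in this section

# Baseline for first spikes scattered uniformly over a three-frame window.
print(round(uniform_sd(3 * FRAME_MS), 1))  # 7.2

# Illustrative first-spike times (ms), not recorded data.
print(first_spike_jitter([30.5, 32.0, 33.1, 31.4, 32.6]))
```

A measured jitter well below the uniform baseline indicates locking to the stimulus rather than an artifact of the window that defines events.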
mean of 1.56 ± 0.39 (SD) ms (n = 9, excluding the outlier). Two trials (1 cell at 40% contrast, another at
80% contrast) had high unclassified percentages (26 and 22%,
respectively) but nevertheless had small median standard deviations
(1.97 ± 0.06 and 2.28 ± 0.13 ms, respectively). The remaining nine trials that had >5% unclassified spikes clustered at
~4.01 ± 0.58 ms. Many of these trials were less well driven, as
evidenced by their generally lower firing rate, as shown in Fig.
11B. Because this group of trials often responded more
diffusely in time, making classification of spikes difficult, their
poorer precision was not surprising. However, given that their
precision was well below the 7.2 ms expected from random placement of
spikes, it seems likely that this reflected a true property of the
cells rather than an artifact of the classification method.
SPIKE COUNT PRECISION.
The timing precision analysis focused on how the stimulus affected the
jitter of a single spike (namely the 1st spike in an event). To study
the precision of the remaining spikes in an event, we analyzed the
precision of the number of spikes in the events evoked by a stimulus.
This spike count precision was characterized by examining the variance
in the number of spikes per event versus the mean number of spikes in
an event. In the case of a Poisson process, the variance is equal to
the mean. At the other extreme, the minimum possible variance for a
discrete counting process with a given mean m is obtained if
the number of spikes in every event is either ceil(m) (the
smallest integer ≥ m) or floor(m) (the
largest integer ≤ m). This minimum variance varies
periodically with the mean, dropping to zero at each integer and
forming a scalloped curve between integers.
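The two reference curves can be written down directly. If every count is either floor(m) or floor(m) + 1, then the higher count must occur with probability f = m - floor(m), and the variance is f(1 - f), which vanishes at each integer and peaks at 0.25 midway between integers. The sketch below computes this bound and the variance/mean ratio; the function names and example counts are hypothetical.

```python
import math


def min_count_variance(mean_count):
    """Minimum variance of an integer-valued count with the given mean.

    Counts are restricted to floor(m) and floor(m) + 1; the fractional
    part f of the mean fixes their proportions, giving variance f * (1 - f).
    """
    f = mean_count - math.floor(mean_count)
    return f * (1.0 - f)


def variance_to_mean_ratio(counts):
    """Variance/mean ratio of spike counts (1.0 expected for a Poisson process)."""
    n = len(counts)
    mean = sum(counts) / n
    if mean == 0:
        raise ValueError("mean spike count is zero")
    variance = sum((c - mean) ** 2 for c in counts) / n  # population variance
    return variance / mean


print(min_count_variance(1.5))  # 0.25
print(min_count_variance(2.0))  # 0.0

# Illustrative counts that sit exactly on the minimum-variance curve.
print(variance_to_mean_ratio([1, 1, 2, 2]))
```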
Information transmission
The spike timing and count variability measures discussed above gave some indication of the precision of LGN neurons. How much information did this level of precision allow the cells to transmit?
To address this, we changed our analysis method. The preceding analyses of variability depended on defining events that associated each spike with a unique sequence that evoked it. This required specifying both sequence length and the location within the sequence of the transition (because spikes were associated with transitions and these 2 facts uniquely linked transitions to sequences). For the information analysis, we instead considered all sequences of a given length, without regard for the presence of a transition, and simply examined the response at some fixed time interval after the initiation of the sequence.
We computed information using the direct method (see
METHODS). We binned time into discrete bins of fixed width,
typically 1 ms, and defined the "letters" of the response
"alphabet" as the number of spikes in a bin (0 or 1 for 1-ms bins).
A string of M such letters formed a response "word";
for M = 1, the word was simply the number of spikes in a
single bin. Ideally, the choice of bin size should reflect the degree
of temporal resolution in the code, while the word size should reflect
the longest time scale of temporal correlations in the code. The timing
precision analysis suggested that a reasonable bin size was ~1 ms.
Initially ignoring correlations between bins, we calculated the
information about k-frame stimuli by considering only
single-bin words at this resolution (Fig.
15). The information grew with time
from the onset of the stimulus sequence, provided that further stimulus frames continued to be specified, up to at least nine frames. At this
point, the maximum information was ~3.5 bits/spike and appeared to be
nearing a plateau. The existence of a plateau was reasonable since a
given response time bin should give little or no information about
stimulus frames that occurred far in the past. For longer sequences
(k ≥ 4), the information began to drop from its peak
at ~24-26 ms after the onset of the last frame in the sequence, or
~16-18 ms after the onset of the first unspecified frame. This
suggests that 16-18 ms was the minimum delay for a frame to
significantly influence the response. This was in rough agreement with
our previous results that one and perhaps two frames after the
transition frames can influence the spike count by vetoing or allowing
spikes induced by the transition; if the response occurs 32 ms after
the transition, then these frames would have onsets ~15 and 24 ms
before the response that they influence.
We also compared the maximal observed information rate of 3.5 bits/spike to the cell's maximum possible information rate, as measured by the entropy of its spike train. Achieving this maximum would imply that all of the cell's response variability (as measured in single 1-ms bins) was used to encode the stimulus. In fact, the coding efficiency, the ratio of the actual information coded to that which could possibly be encoded, was ~51% (for k = 9), so that the cell transmitted information in individual 1-ms bins at a level that was within a factor of two of its limit.
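For single-bin words, these quantities reduce to binary entropies. The sketch below is a simplified illustration, not the full direct method described in METHODS: it assumes the spike probability in the chosen 1-ms bin has already been estimated for each k-frame sequence, that all sequences are equally likely unless weights are supplied, and it omits any finite-sample bias correction. Function and key names are hypothetical.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a 0/1 letter with spike probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def single_bin_information(spike_prob_by_sequence, sequence_prob=None):
    """Information carried by one response bin about the stimulus sequence.

    total entropy  = entropy of the bin averaged over all sequences
    noise entropy  = average entropy of the bin given each sequence
    information    = total entropy - noise entropy
    """
    n = len(spike_prob_by_sequence)
    if sequence_prob is None:
        sequence_prob = [1.0 / n] * n  # assumed equiprobable sequences
    p_mean = sum(w * p for w, p in zip(sequence_prob, spike_prob_by_sequence))
    total = binary_entropy(p_mean)
    noise = sum(w * binary_entropy(p)
                for w, p in zip(sequence_prob, spike_prob_by_sequence))
    info = total - noise
    return {
        "bits_per_bin": info,
        "bits_per_spike": info / p_mean if p_mean > 0 else 0.0,
        "coding_efficiency": info / total if total > 0 else 0.0,
    }


# Illustrative case: one of four sequences always evokes a spike in this bin.
print(single_bin_information([1.0, 0.0, 0.0, 0.0]))
```

In this noiseless toy case the coding efficiency is 1; variability in the response to a fixed sequence raises the noise entropy and lowers the efficiency toward values such as the ~51% reported above.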
We next examined the role of time resolution in information encoding by
varying the binwidth. We considered 8-ms words of the spike train, and
binned these words using either 1-, 2-, 4-, or 8-ms resolution. If the
precise timing of the spikes at these resolutions within the word were
important for transmitting information, then we expected more
information at smaller bins than larger bins. Finer resolution
increases the possible information the spike train can code; if the
actual information coded also grows, then the coding efficiency would
not significantly change with increasing resolution. On the other hand,
a fall-off of the coding efficiency would indicate that the increased
resolution is not being used to code information. We computed maximum
information rates for eight-frame sequences to ensure that there were
sufficient repeats of each sequence to allow us to estimate the
information for multiple-bin response words. The information rate
increased from 2.4 bits per spike at 8-ms bins to 3.1 bits per spike at 2-ms bins, a 29% increase (Fig.
16A), while the spike train
entropy increased by 36% over the same range. That is, 0.29/0.36 = 81% of the increase in entropy associated with this increase in
resolution was used to encode information. As a result, the coding
efficiency stayed relatively flat, decreasing only ~5% from a bin
size of 8 to 2 ms. Thus the position of spikes at
2-ms resolution was significant for coding information. Improving the resolution by a
factor of 2 from 2 to 1 ms yielded an additional 3% increase in
information to 3.2 bits per spike, compared with an increase in entropy
of 15%, suggesting that only 20% of the entropy change encoded
information. Thus while more information was encoded at this finer
resolution, there was a diminishing return as the noise became a
proportionately larger contributor to the cell's increased variability.
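The effect of resolution on the response alphabet can be illustrated by rebinning the same 8-ms words at coarser bin widths and recomputing the word entropy. This is a minimal sketch with a naive plug-in entropy estimate and hypothetical function names; the example words are illustrative, not recorded data.

```python
import math
from collections import Counter


def rebin_word(bins_1ms, width):
    """Rebin a word of 1-ms spike counts into bins of `width` ms."""
    if len(bins_1ms) % width != 0:
        raise ValueError("word length must be a multiple of the bin width")
    return tuple(sum(bins_1ms[i:i + width])
                 for i in range(0, len(bins_1ms), width))


def word_entropy(words):
    """Plug-in entropy (bits) of the observed word distribution."""
    counts = Counter(words)
    n = len(words)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Four illustrative 8-ms words at 1-ms resolution.
words_1ms = [
    (1, 0, 0, 0, 0, 0, 0, 0),
    (0, 1, 0, 0, 0, 0, 0, 0),
    (0, 0, 0, 0, 1, 0, 0, 0),
    (0, 0, 0, 0, 0, 0, 0, 0),
]
for width in (1, 2, 4, 8):
    rebinned = [rebin_word(w, width) for w in words_1ms]
    print(width, word_entropy(rebinned))
```

Coarser bins merge words that differ only in fine spike timing, so the entropy available for coding falls; whether the information falls proportionately is what the coding efficiency measures.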