No edit summary
Reverted revision 1829824 made by 217.73.129.54 (talk)
Tag: Undo
 
Line 1:
{{H:h Ndihmë|Lexuesi toc}}
SIMULATION 2: LEXICAL DECISION
 
As mentioned in the Introduction, the distinction between words and nonwords is of fundamental importance to the lexical system, and many researchers incorporate this distinction into their models by representing words as explicit structural entities such as logogens or localist word units. Such units provide a natural account of how skilled readers can accurately distinguish written words from nonwords in lexical decision (LD) tasks. Given that the current distributed connectionist approach to lexical processing does not contain word-specific representations, it becomes important to establish that distributed models can, in fact, perform lexical decision accurately and that, in doing so, they are influenced by properties of the words and nonwords in the same way as human readers.
This page explains the "tani" (diff) function, that is, the difference between revisions shown through the "tani" (cur) and "fund" (last) links.
Seidenberg and McClelland (1989) attempted to demonstrate that distributed representations can provide a sufficient basis for lexical decision. They assumed that subjects make word/nonword decisions based on some measure of the “familiarity” of the stimulus (Atkinson & Juola, 1973; Balota & Chumbley, 1984). This familiarity measure could potentially be computed from a variety of types of information derived by the lexical system: orthographic, phonological and semantic. Subjects are assumed to adopt a specific strategy and decision criterion that allows fast responding with acceptable error rates depending on the composition of the word and nonword stimuli. Given that orthographic information is provided directly in the input, subjects rely on orthographic familiarity when this provides a sufficient basis for distinguishing words from nonwords, for example if the nonwords are orthographically illegal consonant strings (e.g. PSLR). If the nonword stimuli include very word-like, orthographically legal nonwords
 
(e.g. NUST), subjects may consider phonological familiarity—whether the stimulus sounds like a word. If the nonwords include so-called “pseudohomophones” (e.g. BRANE) that have the same pronunciation as a word, subjects may have to base their decision on whether the orthography generates a familiar meaning.
The '''tani''' function ('''diff''' in the English versions) is the difference between versions of a page. It can be viewed from the [[Ndihmë:Historiku i faqes|Page history]], which lists the versions of that page. The column of radio buttons on the left selects the older version, and the column on the right selects the newer version. Selecting two versions and pressing the "Krahasoni versionet e zgjedhura" (Compare selected versions) button loads a page showing the differences between those two versions.
Although Seidenberg and McClelland’s general framework includes semantics, their specific implemented network did not. Accordingly, they focused on demonstrating that, under some conditions, lexical decision can be performed by a distributed network using only orthographic and/or phonological information. In their simulation, the orthographic familiarity of a stimulus was measured by the discrepancy between the orthographic pattern presented to the network and the one regenerated by the network from its hidden representation. Seidenberg and McClelland showed that this orthographic error score tended to be smaller for trained stimuli (words) than for novel stimuli (nonwords), although there was some overlap in the distributions. Moreover, the degree of overlap depended on properties of the words and nonwords in a way that corresponded to the effects of these properties on LD accuracy and latency in empirical studies. For example, the overlap is increased and, hence, lexical decision is slower and less accurate when the words are of lower frequency (e.g. Gordon, 1983) or include so-called “strange” words with unusual spelling patterns (e.g. AISLE, GAUGE; Waters & Seidenberg, 1985), or when the nonwords include pseudohomophones (e.g. Coltheart, Davelaar, Jonasson, & Besner, 1977; McCann, Besner, & Davelaar, 1988). Seidenberg and McClelland assumed that, as the overlap in orthographic error scores for words and nonwords increases, the network (and subjects) would have to turn to phonological and/or semantic information to perform lexical decision accurately. In particular, the presence of strange words causes subjects to rely on phonological information, giving rise to effects of spelling–sound consistency in lexical decision (Waters & Seidenberg, 1985).
<!--A '''diff''' is the difference between two versions of a page. It can be viewed from the [[Help:Page history|page history]]: for every version there are potentially two [[w:radio button|radio button]]s: the left column is for selecting the older version, the right column for selecting the newer one. Pressing "Compare selected versions" gives the difference between the two versions.-->
However, Besner and colleagues (Besner et al., 1990; Fera & Besner, 1992) challenged the claim that Seidenberg and McClelland’s model provides an adequate account of LD performance, even under conditions in which orthographic information is deemed sufficient (i.e. in the absence of strange words). Besner et al. (1990) reported that, when a decision criterion over orthographic error scores is set to yield an error rate on words of 5.2% (as observed by Waters & Seidenberg, 1985, experiment 2), the false-positive rate on nonwords exceeded 28%. Although Waters and Seidenberg did not report nonword error rates, this value is certainly higher than would be expected of subjects, especially if given unlimited time to respond. Moreover, Fera and Besner (1992) demonstrated that, for human subjects, the degree of overlap in orthographic error scores for words and nonwords had no effect on LD accuracy and latency, nor on the magnitude of a pseudohomophone effect among nonwords. These negative findings concerning Seidenberg and McClelland’s implemented model and its predictions call into question the more general claim that words can be distinguished from nonwords by a distributed system lacking word-specific representations.
 
The goal of the current simulation was to demonstrate that lexical decision can be performed accurately when based on a familiarity measure applied to semantics and that, moreover, such a measure exhibits the appropriate sensitivity to properties of the words and nonwords in the task. It is important to be clear at the outset that this demonstration should not be interpreted as implying that lexical decision always relies on semantics. The perspective taken is exactly that of Seidenberg and McClelland (1989): subjects performing lexical decision may adopt a variety of strategies, including those that depend only on orthographic and/or phonological information, as a function of the composition of the stimuli. In fact, subjects are assumed to rely on these more surface representations as much as possible. It is acknowledged, however, that in some contexts, orthographic and phonological representations will not support sufficiently accurate LD performance on their own. The current work aimed to establish that semantic familiarity provides a sufficient basis for lexical decision within a distributed lexical system, and supports the hypothesis that subjects can perform lexical decision accurately by relying on semantics when necessary.
On special pages such as [[Ndihmë:Ndryshime së fundmi|Recent changes]], the two versions listed there can also be compared.
Method
<!--For special cases (the diff for a single edit or between an old and the current version) other possibilities are clicking ''cur'' or ''last'' in the page history or on the [[Help:Recent changes|Recent Changes page]]. The diff is also shown during an [[Help:Edit conflict|edit conflict]] so you can see exactly what you need to reintegrate.-->
A feedforward network was trained to map from the orthographic representations of the 2998 monosyllabic words in Plaut et al.’s (1996) corpus to their phonological representations and to newly created semantic representations. The architecture of the network, shown in Fig. 6, corresponded to an instantiation of the full framework for lexical processing in Fig. 1, except that the depicted hidden units between orthography and
 
A similar comparison function is also built into editing. In that case, pressing the "Trego ndryshimet" (Show changes) button compares the latest version of the page with the edit being made.
FIG. 6. The network architecture used to model naming and lexical decision. Large arrows represent full connectivity between layers except where indicated. Small arrows indicate input units (incoming arrow) or output units (outgoing arrow).
 
semantics and between phonology and semantics were combined. Specifically, 108 grapheme units were connected to 100 hidden units, which, in turn, were connected to 61 phoneme units. The grapheme and phoneme units were also connected to a second group of 1000 hidden units, which were then connected to 200 semantic units such that each hidden unit had a probability of 0.5 of being connected to each semantic unit. A much larger number of hidden units was needed to map to semantics than to map from orthography to phonology because there is no systematicity between the surface forms of words and their meanings, and connectionist networks find unsystematic mappings particularly difficult to learn (see Hinton, McClelland, & Rumelhart, 1986, for discussion). Including bias weights, the network contained a total of 283,998 connections.
<!--From [[m:MediaWiki 1.5|MediaWiki 1.5]] diff works also in preview, showing the difference between the currently stored version and the current version in the edit box-->
The fact that the network has a feedforward architecture should not be interpreted as a theoretical claim about the structure of the human reading system. The underlying theory incorporates interactivity, and its use in forming attractors, as an important computational principle. Thus, the current feedforward network should be thought of as an approximation to a fully recurrent one. Training a recurrent version of the network was, however, infeasible due to limitations in the available computational resources. Nonetheless, there are certain aspects of the feedforward approximation that are important to point out. In particular, processing is unidirectional from phonology to semantics, so that phonological representations (derived from orthography) influence semantics in performing lexical decision, but semantic representations cannot influence phonology in naming. As a result, unlike the first simulation, the naming performance of the current network does not reflect a contribution of semantics, and thus constitutes only a coarse approximation of human performance. Even so, Plaut et al. (1996) demonstrated that, with regard to normal performance, networks trained without semantics produce patterns of accuracy and latency results that are very similar to those of networks trained with semantics.
<br clear=all>
The semantic representations for words were designed to capture only the most abstract characteristics of word meanings, namely that they cluster into categories and are arbitrarily related to orthographic and phonological representations. They were created in the following way. First, 120 random prototype patterns were generated over 200 semantic features such that each feature had a probability Pa = 0.1 of being active. Then each prototype was used to generate 25 exemplars that cluster around it by regenerating each semantic feature (using Pa = 0.1) with a probability Pr = 0.05, under the constraint that each exemplar had to differ from every other exemplar by at least three features. This procedure created 3000 semantic exemplars, each with an average number of active features of 20.12 (SD 1.712, range 14–26; the mean is slightly greater than 20 due to the constraint on the minimum difference between semantic patterns). Fifteen of these exemplars were chosen at random and discarded, and the remaining 2985 patterns were assigned randomly to the words in the training corpus (to avoid the difficulties involved in learning one-to-many mappings from orthography-to-semantics, the 13 pairs of homographs in the corpus, such as WIND and READ, were assigned only one meaning). The orthographic and phonological representations for the words were the same as used in Simulation 1 and by Plaut et al. (1996).
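The prototype-and-exemplar procedure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the original implementation: the function names, the bitmask encoding of patterns, the seed and the reject-and-retry handling of the minimum-difference constraint are all hypothetical choices. Only the parameter values (120 prototypes, 25 exemplars each, 200 features, Pa = 0.1, Pr = 0.05, a minimum difference of three features) come from the text.

```python
import random


def hamming(a, b):
    """Number of features on which two bitmask patterns differ."""
    return bin(a ^ b).count("1")


def random_pattern(rng, n_features, p_active):
    """Random binary pattern; each feature is active with probability p_active."""
    bits = 0
    for i in range(n_features):
        if rng.random() < p_active:
            bits |= 1 << i
    return bits


def regenerate(rng, prototype, n_features, p_active, p_regen):
    """Copy the prototype, re-sampling each feature with probability p_regen."""
    bits = prototype
    for i in range(n_features):
        if rng.random() < p_regen:
            bits &= ~(1 << i)            # clear the feature ...
            if rng.random() < p_active:  # ... and re-draw it with Pa
                bits |= 1 << i
    return bits


def make_semantic_patterns(n_prototypes=120, n_exemplars=25, n_features=200,
                           p_active=0.1, p_regen=0.05, min_diff=3, seed=0):
    """Generate n_prototypes * n_exemplars clustered binary patterns."""
    rng = random.Random(seed)  # the seed is an arbitrary, hypothetical choice
    patterns = []
    for _ in range(n_prototypes):
        prototype = random_pattern(rng, n_features, p_active)
        accepted = 0
        while accepted < n_exemplars:
            candidate = regenerate(rng, prototype, n_features, p_active, p_regen)
            # Assumed handling of the constraint: reject and retry any candidate
            # that is closer than min_diff features to an accepted exemplar.
            if all(hamming(candidate, p) >= min_diff for p in patterns):
                patterns.append(candidate)
                accepted += 1
    return patterns
```

With the default arguments the function returns 3000 patterns; discarding 15 of them at random would leave the 2985 patterns that the text assigns to words.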
==What it looks like==
Although the semantic patterns used in the simulation do not reflect the relative similarities among the actual meanings of these words, their random assignment to words ensures that there is no systematic relationship between the written and spoken form of each word and its meaning. On the current approach, it is only this property that is critical for demonstrating that semantics can support lexical decision. In fact, abstract semantic representations like those used in the current simulation have been used successfully to model empirical phenomena in a number of psycholinguistic domains, including word learning (Chauvin, 1988), inflectional morphology (Cottrell & Plunkett, 1991; Hoeffner, 1992), lexical ambiguity resolution
<!--The two versions are shown side-by-side. In the old version paragraphs which differ are yellow and in the new version they are green. In left-to-right languages, the old version is on the left. This is reversed in [[m:RTL|right-to-left]] scripts. Text removed within a paragraph is shown in red on the old version. New text within a paragraph is shown in red on the new version. If a whole paragraph was removed or added, the text is not red but just black, while the other side is blank (white). Unchanged text is black on grey, only parts before and after changed text is shown.
(Joordens & Besner, 1994; Kawamoto, 1988, 1993; Kawamoto, Farrar, & Kello, 1994), semantic and associative priming (Masson, 1995; Plaut, 1995b) and rehabilitation of impaired reading via meaning (Plaut, 1996).
 
Given that the number of incoming connections to units varied widely in parts of the network, some sets of connections were initialised differently than others. Specifically, weights in the network were initialised to random values drawn uniformly between ±1, except that hidden-to-phoneme weights were initialised between ±0.2, hidden-to-semantics weights were initialised between ±0.1, and the bias weights for phoneme and semantic units were pre-set to −2.94444 (the magnitude of input producing an initial state of 0.05). These conditions ensured effective learning at the very outset of training, but are otherwise irrelevant to the results reported below.
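The preset bias value can be checked with a short calculation. The sketch below assumes the standard logistic activation function, which this passage does not state explicitly; under that assumption, the net input that yields a state of 0.05 is the logit of 0.05. The function names are illustrative only.

```python
import math


def logistic(x):
    """Standard logistic (sigmoid) activation, assumed here."""
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    """Inverse of the logistic: the net input that produces state p."""
    return math.log(p / (1.0 - p))


# Net input needed for an initial state of 0.05 under the logistic assumption.
bias = logit(0.05)
print(f"bias = {bias:.5f}, resulting state = {logistic(bias):.2f}")
```

Starting output units near 0.05 keeps their initial states close to the low overall proportion of active features, which fits the text's remark that these settings only matter at the very outset of training.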
The diff shows differences per line. Some editors find that adding manual line breaks improves the diff function.
The network was trained with back-propagation using a learning rate of 0.0001, momentum of 0.9 (set to 0.0 for the first 10 epochs), adaptive connection-specific learning rates (Jacobs, 1988, increment of 0.1, decrement of 0.9), and the cross-entropy error function without weight decay (see equation 4). During training, each presentation of a word generated both a phonological pattern and a semantic pattern as output. The error computed over the phoneme units was back-propagated to the grapheme units and used to calculate derivatives for weights in the orthography-to-phonology pathway, exactly as in Simulation 1. The error computed over the semantic units was back-propagated both to the grapheme units and to the phoneme units and used to calculate derivatives for the orthography-to-semantics and phonology-to-semantics weights. Semantic error was not back-propagated through the phoneme units to influence the derivatives for the orthography-to-phonology weights, however, to prevent the network from trading off phonological accuracy for semantic accuracy. These error derivatives were scaled by a factor proportional to the logarithm of the word’s Kucera and Francis (1967) frequency and accumulated for each word before being used to change the weights at the end of each epoch.
 
After 1300 epochs of training, the network was tested for its performance both at naming and at lexical decision, using the words in the training corpus and two lists of nonwords. The first list consists of 591 pronounceable nonwords created by Seidenberg et al. (1994) from each unique body in Seidenberg and McClelland’s (1989) corpus, and is referred to below as the “body-matched” nonword list. It was used by Seidenberg and colleagues to compare the nonword reading performance of 24 human subjects with the performance of Plaut and McClelland’s (1993) network (a precursor to the simulations of Plaut et al., 1996) and the performance of the pronunciation rules in Coltheart and co-workers’ (1993) Dual Route Cascaded model. This list was chosen because its size and diversity allow a thorough evaluation of overall LD accuracy. The second list of nonwords (Seidenberg, Petersen, MacDonald, & Plaut, 1996) contains 64 pseudohomophones and 64 nonpseudohomophone control nonwords, and is referred to below as the “PH/nonPH” nonword list. The two sets of items are very closely matched orthographically because they were constructed in groups of four by exchanging onsets and bodies (e.g. pseudohomophones JOAK and HOAP, nonpseudohomophones HOAK and JOAP). Thus, these nonwords provide a stringent test for the existence of pseudohomophone effects in the absence of orthographic confounds (see Seidenberg et al., 1996, for discussion).
As well as showing the difference between versions, the diff page has links to the user page and talk page of the users who edited both the last and current versions. Links to the users' contribution lists are also shown. For sysops, a [[Help:Reverting a page to an earlier version|rollback]] button is shown allowing them to revert from the new version to the old one. Note however that this is even shown when viewing the diff between the recent version of a page and a version older than the last version by an author other than the one of the most current version, in which case the rollback would not undo the change that is displayed. Thus if user A vandalized a page, and user B partially reverted that vandalism, the diff of the two together shows the remaining vandalism, but rollback reverts the partial repair by user B!
Lexical decision was based on a measure of the familiarity of the semantic pattern generated by a word or nonword. A commonly used measure of familiarity in distributed networks is the negative of the “energy”, $\sum_{i<j} s_i s_j w_{ij}$ (Hopfield, 1982), and a number of researchers (e.g. Besner & Joordens,
 
1995; Borowsky & Masson, 1996; Joordens & Becker, 1997; Masson & Borowsky, 1995; Rueckl, 1995) have proposed recently that it may be possible to perform lexical decision in distributed models on the basis of differences in the energy for words versus nonwords. A serious drawback of this measure, however, is that it requires decision processes to have explicit access to the weights between units (analogous to the size and number of synapses between neurons), which is far less neurobiologically plausible than a procedure that need only access unit states.
[[Help:Edit summary|Edit summaries]] are also shown on the diff page. These appear in the row beneath the user names. If the user has used links in their edit summary, these act as links on the diff page as well.
An appropriate alternative measure, termed stress, is based only on the states of units. Specifically, the stress $S_j$ of unit $j$ is a measure of the information content (entropy) of its state $s_j$, corresponding to the degree to which it differs from rest:
 
$$S_j = s_j \log_2(s_j) + (1 - s_j)\log_2(1 - s_j) - \log_2(0.5) \qquad (5)$$
In 1.4 there are also links to both versions, and the previous and next diff.
The stress of a unit is 0 when its state is 0.5 and approaches 1 as its state approaches either 0 or 1. The target semantic patterns for words are binary, and thus have maximal stress. Because, over the course of training, the semantic patterns generated by words increasingly approximate their target patterns, the average stress of semantic units approaches 1 for words. By contrast, nonwords are novel stimuli that share graphemes with words that have conflicting semantic features. As a result, nonwords will typically fail to drive semantic units as strongly as words do, producing semantic patterns with much lower average stress. Accordingly, the average stress of semantic units, here termed simply semantic stress, should provide an adequate basis for performing lexical decision. We assume that LD responses are actually generated by a stochastic decision process (e.g. Ratcliff, 1978; Usher & McClelland, 1995) in which a decision criterion is adopted such that stimuli with stress values farther from this criterion are responded to more quickly.
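The stress measure and the criterion-based decision can be sketched as follows. This is an illustration under an explicit assumption: unit stress is taken to be one minus the binary entropy of the unit's state (in bits), which is the form consistent with the properties stated above (0 at a state of 0.5, approaching 1 at states of 0 or 1). The function names and the clipping constant `eps` are hypothetical, and the default criterion of 0.955 is the value quoted in the lexical decision results.

```python
import math


def unit_stress(s, eps=1e-12):
    """Assumed form: 1 minus the binary entropy of state s, in bits."""
    s = min(max(s, eps), 1.0 - eps)  # clip to avoid log2(0) at the extremes
    return s * math.log2(s) + (1.0 - s) * math.log2(1.0 - s) - math.log2(0.5)


def semantic_stress(states):
    """Average stress over a pattern of semantic unit states."""
    return sum(unit_stress(s) for s in states) / len(states)


def lexical_decision(states, criterion=0.955):
    """Accept the stimulus as a word if semantic stress exceeds the criterion."""
    return semantic_stress(states) > criterion


# Hypothetical patterns: near-binary states (word-like) vs. weakly driven states.
word_like = [0.001] * 180 + [0.999] * 20
nonword_like = [0.3] * 200
print(lexical_decision(word_like), lexical_decision(nonword_like))
```

Because the measure reads only unit states, a decision process built on it needs no access to connection weights, which is the property the text contrasts with energy-based measures.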
 
Results and Discussion
This example shows the top of the diff page, with the links described above.-->
Naming. After the 1300 epochs of training, the network pronounced all of the 2998 words in the training corpus correctly (where homographs were considered correct if they elicited either correct pronunciation). Moreover, the cross-entropy error the network produced when tested on Patterson and Hodges’ (1992) high- and low-frequency regular and exception words replicated the standard empirical finding of a frequency × consistency interaction in naming latency [means: high-frequency regular = 0.0022, low-frequency regular = 0.0051, high-frequency exception = 0.0051, low-frequency exception = 0.0203; F(1,164) = 20.56, P < 0.001].
This example shows one such view (for MediaWiki):
With regard to nonword reading, 91.4% (540/591) of body-matched nonwords were given a pronunciation that either matched the pronunciation given by at least one of Seidenberg and co-workers’ (1994) 24 human subjects, or was consistent with the pronunciation of a word in the training corpus with the same body. For the PH/nonPH nonwords, the network’s pronunciations of 96.9% (62/64) of the pseudohomophones and 95.3% (61/64) of the nonpseudohomophones matched that of some word in the training corpus with the same body.
{| width=100%
Thus, overall, the network exhibited the appropriate pattern of skilled performance in naming words and nonwords.
|- align="center" bgcolor="#cccccc"
Lexical Decision. Semantic stress values were calculated using equation (5) for the body-matched nonwords and for sets of high-, medium- and low-frequency words. The word sets consisted of the 600 words in the training corpus with the highest Kucera and Francis (1967) frequency (mean 653.4, median 158), the 600 with median frequency (mean 8.807, median 9) and the 600 with the lowest frequency (mean 0.6298, median 1). Figure 7 displays the distributions of semantic stress for these words and nonwords.
| width=50%|<strong>Revision as of 22:32, Aug 03, 2003</strong><br/>
As Fig. 7 shows, there is very little overlap between the semantic stress values for words and those for nonwords. In fact, if a decision criterion is adopted such that a stimulus is accepted as a word if it generates semantic stress in excess of 0.955, then the network produces error rates of 1.5% on both words (27/1800) and nonwords (9/591) and a d′ value of 4.33. Considering only low-frequency words, a criterion of 0.95 yields 2.5% misses (15/600) and 4.4% false-alarms (26/591) and a d′ of 3.67. For high-frequency words, a criterion of 0.965 yields 0.167% misses (1/600) and false-alarms (1/591) and a d′ of 5.87. These very high levels of discriminability between words and nonwords should be interpreted as reflecting asymptotic
[[User:Angela|Angela]] ([[User_talk:Angela|Talk]] | [[Special:Contributions/Angela|contribs]])<br/>
[[Help:Edit summary|Edit summaries]] in diffs are great
FIG. 7. Distributions of semantic stress values for the body-matched nonwords (n = 591; Seidenberg et al., 1994) and for high-, medium- and low-frequency words from the training corpus (n = 600 for each).
|
performance in the absence of time pressure. We assume that subjects can trade accuracy for reduced latency under conditions in which they are encouraged to respond as quickly as possible while keeping error rates acceptably low (e.g. by adjusting the response threshold in Ratcliff’s, 1978, diffusion model).
<strong>Revision as of 00:10, Aug 18, 2003</strong><br/>
Moreover, Fig. 7 illustrates that the distributions of stress values for words vary systematically as a function of their frequency. Specifically, the overlap with nonwords increases as word frequency decreases. This property provides an account of the frequency blocking effect in lexical decision. Gordon (1983; see also Glanzer & Ehrenreich, 1979) compared LD latency to high-, medium- and low-frequency words when presented in mixed blocks versus when blocked by frequency. The blocking manipulation did not influence the error rates for words in each frequency band, nor the latencies to low-frequency words. By contrast, LD latencies to medium- and high-frequency words were faster (by 19 and 40 msec, respectively) when presented in pure blocks than when mixed with low-frequency words. These findings make sense in the context of the distributions of semantic stress values shown in Fig. 7 if subjects adjust their decision criterion to optimise their performance within each block given the composition of the stimuli. A conservative criterion is required to achieve acceptable levels of accuracy when low-frequency words are present in the block (either mixed or pure). However, when only higher-frequency words are present, a more aggressive criterion can be adopted that, for the same error rate, produces faster responding.
[[User:Tim Starling|Tim Starling]] ([[User_talk:Tim Starling|Talk]] | [[Special:Contributions/Tim_Starling|contribs]])&nbsp;&nbsp;&nbsp;&nbsp;'''['''''[[Main Page|rollback]]''''']'''<br/>
The network also produced reliably higher semantic stress values for the pseudohomophones in the PH/nonPH list compared with their nonpseudohomophone control nonwords [means: PH = 0.9246, nonPH = 0.9184; paired t(63) = 2.408, P = 0.019]. The network tends to produce greater semantic stress for a pseudohomophone (e.g. HOAP) compared with a control nonword (e.g. JOAP) because the pronunciation it derives for the pseudohomophone (HOPE) was trained to help drive semantic units to extreme values, specifically the (binary) semantic representation of the base word. Given that higher stress values for nonwords decrease their discriminability from words, this result corresponds to the empirical finding of increased latency and/or error rates to pseudohomophones compared with control nonwords in lexical decision under time pressure (e.g. Coltheart et al., 1977; McCann et al., 1987). Nonetheless, in the absence of time pressure, both types of nonwords are easily discriminated from even the lowest-frequency words. For example, a decision criterion of 0.95 produces 2.5% errors to words (15/600), 6.25% errors to pseudohomophones (4/64) and no errors to nonpseudohomophones (0/64), corresponding to a d′ of 3.82.
Reverted edits by Angela to last version by Anthere
In summary, a network that maps orthography to semantics both directly and via phonology performed lexical decision accurately if word/nonword decisions were based on a measure of semantic familiarity, termed stress, that reflects the degree to which generated semantic patterns are binary. Moreover, the distributions of stress values for different types of words and nonwords accounted for empirical findings concerning the effects of word frequency and nonword pseudohomophony on LD performance. In this way, the findings establish clearly that words can be distinguished from nonwords based on the functional properties of distributed semantic representations without recourse to word-specific structural representations.
|}
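The discriminability (d′) values reported in the lexical decision results can be approximately recovered from the stated error counts. The sketch below assumes the conventional equal-variance signal-detection definition, d′ = z(hit rate) − z(false-alarm rate). The text does not say how d′ was computed, so this is an assumption, as is pooling the two nonword types of the PH/nonPH list into a single false-alarm rate. The function name is illustrative.

```python
from statistics import NormalDist


def d_prime(miss_rate, false_alarm_rate):
    """Equal-variance signal-detection d': z(hit rate) - z(false-alarm rate)."""
    z = NormalDist().inv_cdf
    return z(1.0 - miss_rate) - z(false_alarm_rate)


# Error counts as given in the text (misses on words, false alarms on nonwords).
all_words = d_prime(27 / 1800, 9 / 591)     # text reports d' = 4.33
low_freq = d_prime(15 / 600, 26 / 591)      # text reports d' = 3.67
high_freq = d_prime(1 / 600, 1 / 591)       # text reports d' = 5.87
ph_list = d_prime(15 / 600, (4 + 0) / 128)  # text reports d' = 3.82 (pooling assumed)

for label, value in [("all words", all_words), ("low frequency", low_freq),
                     ("high frequency", high_freq), ("PH/nonPH list", ph_list)]:
    print(f"{label}: d' = {value:.2f}")
```

Under these assumptions the computed values should fall close to the reported ones, which suggests the reported figures follow the standard definition.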
GENERAL DISCUSSION
 
The traditional view of the lexical system stipulates rather complicated and domain-specific structures and processes, including those that apply to individual words but not to nonwords, or to some words (regulars) but not to others (exceptions). The current article adopts an alternative view in which lexical knowledge and processing develop through the operation of general learning principles as applied to distributed representations of written and spoken words and their meanings. Distinctions between words and nonwords, and among different types of words, are not reified in the structure of the system, but rather reflect the functional implications of the statistical structure relating orthographic, phonological and semantic information. The structural divisions within the system are assumed to arise from the neuroanatomic localisation of input and output modalities, not from differences in representational content (for similar arguments, see Farah, 1994; Farah & McClelland, 1991).
<!--When moving or copying a piece of text within a page or from another page, and also making other edits, it is useful to separate these edits. This way the diff function can be usefully applied for checking these other edits.-->
This distributed view of lexical processing has been championed most explicitly by Van Orden et al. (1990) and by Seidenberg and McClelland (1989). Seidenberg and McClelland supported the view with an explicit computational model, but this support was limited by inadequacies in the model’s ability to account for skilled performance in nonword reading and in lexical decision (Besner et al., 1990) and impaired performance in fluent surface dyslexia (Patterson et al., 1989). Plaut et al. (1996) provided a more adequate account of nonword reading and of surface dyslexia by improving the orthographic and phonological representations and by incorporating an influence of semantics on word reading. However, the claimed role of semantics in reading, as reflected in the account of surface dyslexia, has been challenged by the performance of patients whose word reading is unaffected by semantic impairment (e.g. Cipolotti & Warrington, 1995; Lambon Ralph et al., 1995). Moreover, the limitations of Seidenberg and McClelland’s model in performing lexical decision remained unaddressed.
 
The current simulations provide additional support for a distributed theory of lexical processing by addressing these two challenges. The approach taken to both involves a reconsideration and elaboration of the role of semantics in naming and lexical decision.
<!--==Width==
Semantics in Word Naming: Individual Differences
 
The first simulation demonstrated that parametric variations within reading models give rise to individual differences in the overall competence and division of labour between the semantic and phonological pathways, such that individuals who are able to pronounce low-frequency exception words without semantic support fall at one end of a continuum. In particular, the semantic and phonological pathways together learn to support skilled word reading, but the specific division of labour between the two, particularly for low-frequency exception words, depends on factors that either improve the competence of the semantic pathway or impede the competence of the phonological pathway. Two factors were investigated. The first, the asymptotic strength of semantic support for phonology, summarises a variety of factors that would be expected to influence learning in the semantic pathway, particularly the mapping from orthography to semantics. The second factor, weight decay in the phonological pathway, can be thought of as reflecting the degree to which the underlying physiology can support large numbers of synapses and, hence, strong interactions between neurons. Simulations demonstrated that the phonological pathway can pronounce low-frequency exception words without semantics if either semantic strength or weight decay is particularly low during training, but that, in general, semantic damage leads to some degree of reading impairment (surface dyslexia).
After the table of differences, the latest of the two compared versions is shown fully.
On the one hand, this explanation would seem to be at odds with claims advanced by Patterson and colleagues (Graham et al., 1994, 1995; Patterson et al., 1994a, b; Patterson & Hodges, 1992) that semantic support is critical to the integrity of phonological representations. The current findings suggest that this property may still hold for most individuals but not for those with highly developed phonological pathways. On the other hand, the explanation for individual differences in division of labour relies critically on the supposition that the semantic and phonological pathways combine to support skilled reading, and that the semantic contribution to phonology has important implications for the nature of learning in the phonological pathway. In this way, even though the current account allows for the possibility that semantics plays a minimal role in the skilled reading performance of some individuals (and, hence, the preservation of reading performance despite semantic damage), it nonetheless incorporates the fundamental insight of Patterson and colleagues—that a consideration of semantic–phonological interactions is critical for a general understanding of the organisation and operation of the reading system. The account also retains the ability to explain why, in individuals who do exhibit surface dyslexia following semantic damage, there is a close relationship between the comprehension and correct pronunciation of individual exception words (Funnell, 1996; Graham et al., 1994; Hillis & Caramazza, 1991).
 
The introduction of individual differences into the current explanation raises the question of whether Plaut and co-workers’ (1996) account of surface dyslexia is underconstrained. After all, a variety of types and degrees of impairment to the models give rise to the qualitative pattern of surface dyslexia. It should be pointed out, though, that the approach is no different from the traditional dual-route model in this respect (see Coltheart & Funnell, 1987). To be clear, what is in question about the current account is the specific claim regarding the relationship between premorbid differences in division of labour and the quantitative severity of surface dyslexia following semantic or semantic-to-phonological damage. More extensive testing of the non-reading capabilities of surface dyslexic patients is needed to address this concern and to establish independent patterns of performance that predict the severity of their reading impairment.
In the case of the Classic skin with quickbar, the diff page does not have the quickbar, to provide more space. Therefore the diff page is also useful for viewing the page on full screen width, without changing the preferences.
What is not left underspecified by the approach is the computational basis for the surface dyslexia pattern itself. This pattern arises directly from the intrinsic sensitivity of learning in distributed networks to word frequency and spelling–sound consistency, as expressed by the frequency–consistency equation (equation 1) for a two-layer Hebbian network. In the normal system, this sensitivity manifests as a frequency × consistency interaction in naming latency. Low-frequency exception words (e.g. PINT) are named disproportionately slowly because their vowel pronunciations (I → /aI/) have the least support from friends and suffer the greatest competition from enemies (e.g. I → /I/ in PRINT, HINT, MINT, etc.). However, when the performance of the system is limited (e.g. by weight decay and/or strong semantic support during learning, or by direct damage), these weakly supported vowels may now lose the competition to their enemies, resulting in regularisation errors (PINT → “pihnt”). High-frequency exceptions (e.g. HAVE) are more immune to these influences because their own frequency (F[t] in equation 1) counterbalances the effect of enemies (e.g. GAVE, SAVE, etc.). However, with greater impairment, the vowels in these items also begin to lose the competition and are regularised (HAVE → “haive”). With still greater impairment, low-frequency regular words (e.g. SOUR) begin to elicit errors. If they have inconsistent bodies, they are also subject to competition from enemies (e.g. POUR, FOUR) and their errors reflect this competition (SOUR → “sore”)—so-called LARC errors (Patterson et al., 1996). Thus, on the current account, the errors made by surface dyslexic patients to all types of words and the pattern of naming latencies exhibited by skilled readers have a common underlying cause: the inherent sensitivity of distributed networks to frequency and consistency.
This is why the simulations of surface dyslexia, in particular, account for the full pattern of performance across a range of levels of severity of impairment, even when the specific factors that lead to a particular level of impairment in a specific patient may be in need of further specification.
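
As an illustrative sketch only, the frequency and consistency sensitivity described above can be made concrete with a small calculation. Equation 1 is not reproduced in this section, so the form used below (a logistic function of the word's own frequency, plus frequency-weighted overlap with friends, minus frequency-weighted overlap with enemies) is an assumption based on the verbal description. The function name `vowel_support`, the `scale` parameter, and all numerical values are hypothetical.

```python
import math


def vowel_support(freq_target, friends, enemies, scale=1.0):
    """Squashed net support for a word's own (trained) vowel pronunciation.

    friends / enemies: lists of (frequency, overlap) pairs, overlap in [0, 1].
    Assumed form: own frequency + support from friends - competition from enemies.
    """
    net = freq_target
    net += sum(freq * overlap for freq, overlap in friends)
    net -= sum(freq * overlap for freq, overlap in enemies)
    return 1.0 / (1.0 + math.exp(-scale * net))


# Made-up values: two enemies, no friends (an exception word such as PINT or HAVE).
enemies = [(2.0, 0.75), (2.0, 0.75)]

low_freq_exception = vowel_support(1.0, friends=[], enemies=enemies)   # net = 1 - 3
high_freq_exception = vowel_support(5.0, friends=[], enemies=enemies)  # net = 5 - 3

print(round(low_freq_exception, 3), round(high_freq_exception, 3))
```

With these made-up values the low-frequency exception falls below 0.5 (its enemies win, analogous to a regularisation error), whereas the high-frequency exception stays above 0.5 because its own frequency outweighs the same enemies.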
 
It is important to emphasise that the postulated individual differences in division of labour do not give rise to all possible patterns of reading impairment following semantic damage, allowing any observed pattern to be explained. For instance, no parametric variation in the network will cause it to be severely impaired on low-frequency exception words without also showing some impairment on high-frequency exception words (see Fig. 5), or to be completely unable to read exception words while remaining unimpaired on regular words. (Note that the standard dual-route theory has no difficulty exhibiting either of these patterns.) Rather, the patterns of performance that result from semantic impairment are highly constrained. That is, they all correspond to the specific pattern of fluent surface dyslexia—a frequency × consistency interaction in accuracy with poor reading of low-frequency exception words, high rates of regularisations, and normal nonword reading. Individual differences serve only to locate patients along a single continuum—severity of reading impairment—with the possibility that some, like D.R.N. and D.C., fall at one end of the continuum and are essentially unimpaired. This extreme pattern, however, belies the commonality of the effects that are observed when the full distribution of patients is considered. This commonality derives, on the current account, from inherent properties of connectionist learning applied to distributed representations. Essentially, the effects of word frequency and spelling–sound consistency are inextricably related because they have the same underlying cause: frequency-weighted sensitivity to input–output similarity, with frequency reflecting the most similar item (the stimulus itself) and consistency reflecting the less similar items (its friends and enemies).
With the Monobook skin the panels on the left are also on the diff page.
Semantics in Lexical Decision: Stress as a Measure of Familiarity
[[w:en:Page widening|Page widening]] is more likely on a diff page, because there are two columns, but also because URLs (especially long ones) are not hidden.
The second simulation addressed a further challenge to a distributed theory of lexical processing, by demonstrating that, in a network that mapped orthography to semantics directly and via phonology, a measure of familiarity derived from semantic activation—termed semantic stress—provides a sufficient basis for distinguishing words from nonwords. Moreover, this measure was shown to account for some aspects of how lexical decision performance is influenced by the nature of the words and nonwords in the task; namely, the frequency blocking effect (Gordon, 1983) and the pseudohomophone effect (Coltheart et al., 1977; McCann et al., 1988).
 
The stress measure reflects the strength with which semantic units are driven from their “resting” activation level (0.5) towards an extreme value (0 or 1). The reason why semantic stress distinguishes words from nonwords stems from the lack of systematicity in the mapping from orthography to semantics. This lack of systematicity means that orthographic similarity is unpredictive of semantic similarity. Thus, during training, the network must learn to map visually similar words (e.g. HAVE, GAVE, SAVE) to completely unrelated sets of semantic features. The presentation of a nonword (e.g. MAVE) partially engages the mappings for visually similar words, but because these mappings are inconsistent with each other, they generate conflicting input to the semantic units, resulting in only weak (non-binary) semantic activation. By contrast, the considerable systematicity between orthography and phonology enables the mappings for visually similar words to cooperate and collectively produce strongly active, correct pronunciations for nonwords.
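
The following is a minimal sketch of how such a stress measure might be computed. The exact formula is not given in this section, so the entropy-based form below is an assumption chosen only because it satisfies the stated properties: it is zero at the resting level of 0.5 and approaches 1 as a unit's activation approaches 0 or 1. The function names and the example activation patterns are hypothetical.

```python
import math


def unit_stress(s, eps=1e-12):
    """Stress of one unit: 0 at the resting level 0.5, approaching 1 at 0 or 1."""
    s = min(max(s, eps), 1.0 - eps)  # keep log2 finite at the extremes
    return 1.0 + s * math.log2(s) + (1.0 - s) * math.log2(1.0 - s)


def semantic_stress(activations):
    """Average stress over a pattern of semantic unit activations."""
    return sum(unit_stress(s) for s in activations) / len(activations)


# Hypothetical patterns: near-binary activation for a word,
# weak (non-binary) activation for a nonword such as MAVE.
word_pattern = [0.95, 0.05, 0.90, 0.10]
nonword_pattern = [0.60, 0.40, 0.55, 0.45]

word_stress = semantic_stress(word_pattern)
nonword_stress = semantic_stress(nonword_pattern)
print(word_stress > nonword_stress)
```

Only unit activations enter the calculation, which is the property the discussion relies on: no access to connection weights is needed.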
==URL==
Other researchers (e.g. Besner & Joordens, 1995; Borowsky & Masson, 1996; Joordens & Becker, 1997; Masson & Borowsky, 1995; Rueckl, 1995) have proposed performing lexical decision in distributed networks based on the negative of the “energy” in the network ($\sum_{i,j} s_i s_j w_{ij}$; Hopfield, 1982). The current stress measure has the advantage of being based entirely on unit activations rather than requiring decision processes to have access to weights on connections between other units. However, it should be pointed out that the two measures are closely related: To the extent that unit states are on the appropriate side of “rest” given the sign of the weight between them, then increasing stress (by moving states towards their extreme values) will also decrease energy. Thus, the present results would also be expected to hold, at least qualitatively, if lexical decision were based on the negative of energy.
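
The relationship between the two measures can be illustrated with a small sketch. It assumes that states are expressed relative to the resting level (an assumption made here so that "the appropriate side of rest" has a direct numerical meaning) and that each pair of units is counted once. The function name `negative_energy`, the `rest` parameter, and the weights and states are all hypothetical.

```python
def negative_energy(states, weights, rest=0.5):
    """Negative energy: sum over unit pairs of s_i * s_j * w_ij,
    with states centred on the resting level (an assumption of this sketch)."""
    centred = [s - rest for s in states]
    n = len(centred)
    return sum(
        weights[i][j] * centred[i] * centred[j]
        for i in range(n)
        for j in range(i + 1, n)
    )


# Two units joined by a positive weight, both above rest.
positive = [[0.0, 1.0], [1.0, 0.0]]
weak_pos = negative_energy([0.6, 0.6], positive)
strong_pos = negative_energy([0.9, 0.9], positive)

# Two units joined by a negative weight, on opposite sides of rest.
negative = [[0.0, -1.0], [-1.0, 0.0]]
weak_neg = negative_energy([0.6, 0.4], negative)
strong_neg = negative_energy([0.9, 0.1], negative)

print(strong_pos > weak_pos, strong_neg > weak_neg)
```

In both cases the states sit on the side of rest that agrees with the sign of the weight, so moving them towards their extremes (higher stress) raises the negative energy, that is, lowers the energy.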
 
Although the simulation establishes that lexical decision can be performed accurately based on semantics, it should not be interpreted as implying that subjects always do so. In fact, following Seidenberg and McClelland (1989), we assume that subjects can base their decisions on any available information in the lexical system, and that they adopt a strategy that optimises their performance given the composition of the stimuli. We also assume that subjects will rely on orthographic information when this suffices, given that such information is more reliable and more rapidly available than either phonological or semantic information. Thus, the simulation is not intended as a fully adequate account of lexical decision under all conditions. In particular, it is not intended to account for the influence of orthographic factors, such as neighbourhood density, on lexical decision (see, e.g. Andrews, 1992; Sears, Hino, & Lupker, 1995). However, the simulation does serve to demonstrate that subjects can fall back on semantic information when necessary—for example, when orthographically strange words like AISLE and GAUGE or highly word-like pseudohomophones like HOAP and JOAK are included among the stimuli.
To do a comparison with the older page rendered below the table of differences, provide the URL as follows.
An important aspect of LD performance that is not treated in detail in the current work is a specification of the actual processing mechanism that computes semantic stress and uses it to make word/nonword decisions in real time. One of the advantages of stress over an alternative measure of familiarity like energy (Hopfield, 1982) is that the computation of stress does not require access to the values of connection weights between units (which are presumably inaccessible to other units). It is fairly straightforward for a decision process to compute semantic stress if it has access to the semantic units as input. Moreover, Usher and McClelland (1995) have demonstrated recently that competition between linear, stochastic, time-averaging units representing alternative responses gives rise to a number of basic properties of empirical findings in standard choice reaction time tasks, including the shapes of time-accuracy functions, latency-probability functions, hazard functions and reaction-time distributions. Their approach could be applied to generate LD latencies in the current context by creating a “yes” unit whose input is the current level of semantic stress and a “no” unit whose input constitutes a decision criterion that is adjusted to optimise performance, and having the network respond when one of the units exceeds a response criterion.
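
A rough sketch of such a decision process is given below. It is a simplified pair of leaky, mutually inhibiting accumulators written in the spirit of the proposal, not the Usher and McClelland (1995) model itself; the function name `lexical_decision` and every parameter value (leak, inhibition, threshold, criterion, step size) are hypothetical choices made for illustration.

```python
import random


def lexical_decision(stress, criterion=0.5, threshold=1.0, leak=0.1,
                     inhibition=0.2, noise=0.0, dt=0.1, max_steps=10000, seed=0):
    """Race between a "yes" unit driven by semantic stress and a "no" unit
    driven by a fixed decision criterion. Returns (response, steps taken)."""
    rng = random.Random(seed)
    yes = no = 0.0
    for step in range(1, max_steps + 1):
        # Each unit integrates its input, leaks, and is inhibited by its rival.
        d_yes = (stress - leak * yes - inhibition * no) * dt
        d_no = (criterion - leak * no - inhibition * yes) * dt
        # Optional Gaussian noise; noise=0.0 makes the run deterministic.
        d_yes += noise * rng.gauss(0.0, 1.0) * dt ** 0.5
        d_no += noise * rng.gauss(0.0, 1.0) * dt ** 0.5
        yes = max(0.0, yes + d_yes)
        no = max(0.0, no + d_no)
        if yes >= threshold or no >= threshold:
            return ("yes" if yes >= no else "no"), step
    return None, max_steps  # no unit reached the response criterion


print(lexical_decision(0.9))  # high stress, as for a familiar word
print(lexical_decision(0.1))  # low stress, as for a nonword
```

In this sketch the number of steps to threshold plays the role of decision latency, so stimuli with higher stress yield faster "yes" responses; raising `criterion` makes the process more conservative about responding "yes".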
 
CONCLUSION
Open the revision of one page that you wish to compare to another, for example <code><nowiki>http://meta.wikimedia.org/w/index.php?title=Help:Diff&oldid=78722</nowiki></code>, and the revision of the other page that you wish to compare, for example <code><nowiki>http://meta.wikimedia.org/w/index.php?title=Main_Page&oldid=98420</nowiki></code>. Copy the oldid number of one page (<code>&oldid=78722</code> in the first example) and replace the text <code>oldid</code> with <code>diff</code>: <code>&diff=78722</code>. Paste this string into the URL of the other page between that page's title and its oldid (<code>&oldid=98420</code> in the second example), so you have something like this:
The simulations described in this article, in conjunction with those developed previously (Plaut et al., 1996; Seidenberg & McClelland, 1989), illustrate how connectionist computational principles—distributed representation, structure-sensitive learning, and interactivity—can provide insight into central empirical phenomena in normal and impaired lexical processing. Moreover, they make it clear that distinctions in the function of the lexical system—as manifest in the behaviour of experimental subjects—need not reflect corresponding distinctions in the structure of the system. Thus, networks exhibit word-frequency effects and word/nonword discrimination without word representations, and spelling–sound consistency effects without separate mechanisms for regular and exception items. In this way, gaining insight into the structure and function of the cognitive system by observing its normal and impaired behaviour—the central goal of cognitive psychology and neuropsychology—may depend critically on developing theories and explicit simulations in the context of a specific computational framework that relates structure to function.
 
The current work demonstrates how the distributed connectionist approach can provide an effective theoretical framework for understanding word naming and lexical decision. This is not to say that the existing distributed models are fully adequate and account for all of the relevant data in sufficient detail—this is certainly not the case. In fact, given that they are models, they are abstractions from the actual processing system and are certainly wrong in their details. Nonetheless, their relative success at reproducing key patterns of data in the domain of word reading, and the fact that the very same computational principles are being applied successfully across a wide range of linguistic and cognitive domains, suggest that these models capture important aspects of representation and processing in the human language and cognitive system.
:<code><nowiki>http://meta.wikimedia.org/w/index.php?title=Main_Page&diff=78722&oldid=98420</nowiki></code>
REFERENCES
 
Andrews, S. (1982). Phonological recoding: Is the regularity effect consistent? Memory and Cognition, 10, 565–575.
You may remove the page title (<code>title=Main_Page</code> in the example above) from the URL if you wish, but this is not necessary. The resulting diff will compare the given versions of the two pages [http://meta.wikimedia.org/w/index.php?title=Main_Page&diff=78722&oldid=98420].
Andrews, S. (1992). Frequency and neighborhood effects on lexical access: Lexical similarity or orthographic redundancy? Journal of Experimental Psychology: Learning, Memory and Cognition, 18, 234–254.
 
Atkinson, R.C., & Juola, J.F. (1973). Factors influencing speed and accuracy of word recognition. In S. Kornblum (Ed.), Attention and performance IV, pp. 583–612. New York: Academic Press.
To compare the current version of the page and a given oldid, you can put "current" after "diff=" instead of an oldid. For example,
Balota, D.A., & Chumbley, J.I. (1984). Are lexical decisions a good measure of lexical access? The role of word frequency in the neglected decision stage. Journal of Experimental Psychology: Human Perception and Performance, 10, 340–357.
<code><nowiki>http://meta.wikimedia.org/w/index.php?title=Help:Diff&diff=current&oldid=124558</nowiki></code>
Behrmann, M., & Bub, D. (1992). Surface dyslexia and dysgraphia: Dual routes, a single lexicon. Cognitive Neuropsychology, 9, 209–258.
 
Besner, D., & Joordens, S. (1995). Wrestling with ambiguity—Further reflections: Reply to Masson and Borowsky (1995) and Rueckl (1995). Journal of Experimental Psychology: Learning, Memory and Cognition, 21, 515–301.
would compare the current version of this page with the version that has oldid 124558.-->
Besner, D., & Smith, M.C. (1992). Models of visual word recognition: When obscuring the stimulus yields a clearer view. Journal of Experimental Psychology: Learning, Memory and Cognition, 18, 468–482.
 
Besner, D., Twilley, L., McCann, R.S., & Seergobin, K. (1990). On the connection between connectionism and data: Are a few words necessary? Psychological Review, 97, 432–446.
{{h:f Ndihmë}}
Borowsky, R., & Masson, M.E.J. (1996). Semantic ambiguity effects in word identification. Journal of Experimental Psychology: Learning, Memory and Cognition, 22, 63–85.
Bub, D., Cancelliere, A., & Kertesz, A. (1985). Whole-word and analytic translation of spelling-to-sound in a non-semantic reader. In K. Patterson, M. Coltheart, & J.C. Marshall (Eds), Surface dyslexia, pp. 15–34. Hove: Lawrence Erlbaum Associates Ltd.
Chauvin, Y. (1988). Symbol acquisition in humans and neural (PDP) networks. PhD thesis, University of California, San Diego, CA.
Cipolotti, L., & Warrington, E.K. (1995). Semantic memory and reading abilities: A case report. Journal of the International Neuropsychological Society, 1, 104–110.
Coltheart, M. (1978). Lexical access in simple reading tasks. In G. Underwood (Ed.), Strategies of information processing, pp. 151–216. New York: Academic Press.
Coltheart, M. (1985). Cognitive neuropsychology and the study of reading. In M.I. Posner & O.S.M. Marin (Eds), Attention and performance XI, pp. 3–37. Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Coltheart, M., & Funnell, E. (1987). Reading writing: One lexicon or two? In D.A. Allport, D.G. MacKay, W. Printz, & E. Scheerer (Eds), Language perception and production: Shared mechanisms in listening, speaking, reading and writing, pp. 313–339. New York: Academic Press.
Coltheart, M., Davelaar, E., Jonasson, J., & Besner, D. (1977). Access to the internal lexicon. In S. Dornic (Ed.), Attention and performance VI, pp. 535–555. Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Coltheart, M., Curtis, B., Atkins, P., & Haller, M. (1993). Models of reading aloud: Dual-route and parallel-distributed-processing approaches. Psychological Review, 100, 589–608.
Coslett, H.B. (1991). Read but not write “idea”: Evidence for a third reading mechanism. Brain and Language, 40, 425–443.
Cottrell, G.W., & Plunkett, K. (1991). Learning the past tense in a recurrent network: Acquiring the mapping from meaning to sounds. In Proceedings of the 13th Annual Conference of the Cognitive Science Society, pp. 328–333. Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Cummings, J.L., Houlihan, J.P., & Hill, M.A. (1986). The pattern of reading deterioration in dementia of Alzheimer type: Observations and implications. Brain and Language, 29, 315–323.
Dunn, L.M. (1965). The Peabody picture vocabulary test. Circle Pines: American Guidance Service.
Farah, M.J. (1994). Neuropsychological inference with an interactive brain: A critique of the locality assumption. Behavioral and Brain Sciences, 17, 43–104.
Farah, M.J., & McClelland, J.L. (1991). A computational model of semantic memory impairment: Modality-specificity and emergent category-specificity. Journal of Experimental Psychology: General, 120, 339–357.
Fera, P., & Besner, D. (1992). The process of lexical decision: More words about a parallel distributed processing model. Journal of Experimental Psychology: Learning, Memory and Cognition, 18, 749–764.
Franklin, S., Howard, D., & Patterson, K. (1995). Abstract word anomia. Cognitive Neuropsychology, 12, 549–566.
Friedman, R.B., Ferguson, S., Robinson, S., & Sunderland, T. (1992). Dissociation of mechanisms of reading in Alzheimer’s disease. Brain and Language, 43, 400–413.
Funnell, E. (1996). Response biases in oral reading: An account of the co-occurrence of surface dyslexia and semantic dementia. Quarterly Journal of Experimental Psychology, 49A, 417–314.
Glanzer, M., & Ehrenreich, S.L. (1979). Structure and search of the internal lexicon. Journal of Verbal Learning and Verbal Behavior, 18, 381–398.
Glushko, R.J. (1979). The organization and activation of orthographic knowledge in reading aloud. Journal of Experimental Psychology: Human Perception and Performance, 5, 674–691.
Gordon, B. (1983). Lexical access and lexical decision: Mechanisms of frequency sensitivity. Journal of Verbal Learning and Verbal Behavior, 22, 24–44.
Graham, K.S., Hodges, J.R., & Patterson, K. (1994). The relationship between comprehension and oral reading in progressive fluent aphasia. Neuropsychologia, 32, 299–316.
Graham, K.S., Patterson, K., & Hodges, J.R. (1995). Progressive pure anomia: Insufficient activation of phonology by meaning. Neurocase, 1, 25–38.
Hillis, A.E., & Caramazza, A. (1991). Category-specific naming and comprehension impairment: A double dissociation. Brain, 114, 2081–2094.
Hinton, G.E. (1989). Connectionist learning procedures. Artificial Intelligence, 40, 185–234.
Hinton, G.E., McClelland, J.L., & Rumelhart, D.E. (1986). Distributed representations. In D.E. Rumelhart, J.L. McClelland, & the PDP Research Group (Eds), Parallel distributed processing: Explorations in the microstructure of cognition. Vol. 1: Foundations, pp. 77–109. Cambridge, MA: MIT Press.
Hodges, J.R., Patterson, K., Oxbury, S., & Funnell, E. (1992). Semantic dementia: Progressive fluent aphasia with temporal lobe atrophy. Brain, 115, 1783–1806.
Hoeffner, J. (1992). Are rules a thing of the past? The acquisition of verbal morphology by an attractor network. In Proceedings of the 14th Annual Conference of the Cognitive Science Society, pp. 861–866. Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Hopfield, J.J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences, USA, 79, 2554–2558.
Jacobs, R.A. (1988). Increased rates of convergence through learning rate adaptation. Neural Networks, 1, 295–307.
Joordens, S., & Becker, S. (1997). The long and short of semantic priming effects in lexical decision. Journal of Experimental Psychology: Learning, Memory and Cognition, 23, 1083–1105.
Joordens, S., & Besner, D. (1994). When banking on meaning is not (yet) money in the bank: Explorations in connectionist modeling. Journal of Experimental Psychology: Learning, Memory and Cognition, 20, 1051–1062.
Kawamoto, A. (1988). Distributed representations of ambiguous words and their resolution in a connectionist network. In S.L. Small, G.W. Cottrell, & M.K. Tanenhaus (Eds), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann.
Kawamoto, A.H. (1993). Nonlinear dynamics in the resolution of lexical ambiguity: A parallel distributed processing approach. Journal of Memory and Language, 32, 474–516.
Kawamoto, A.H., Farrar, W.T., & Kello, C.T. (1994). When two meanings are better than one: Modeling the ambiguity advantage using a recurrent distributed network. Journal of Experimental Psychology: Human Perception and Performance, 20, 1233–1247.
Kay, J., Lesser, R., & Coltheart, M. (1992). Palpa: Psycholinguistic assessments of language processing in aphasia. Hove: Lawrence Erlbaum Associates Ltd.
Kucera, H., & Francis, W.N. (1967). Computational analysis of present-day American English. Providence, RI: Brown University Press.
Lambon Ralph, M., Ellis, A.W., & Franklin, S. (1995). Semantic loss without surface dyslexia. Neurocase, 1, 363–369.
Marshall, J.C., & Newcombe, F. (1973). Patterns of paralexia: A psycholinguistic approach. Journal of Psycholinguistic Research, 2, 175–199.
Masson, M.E.J. (1995). A distributed memory model of semantic priming. Journal of Experimental Psychology: Learning, Memory and Cognition, 21, 3–23.
Masson, M.E.J., & Borowsky, R. (1995). Unsettling questions about semantic ambiguity in connectionist models: Comment on Joordens and Besner (1994). Journal of Experimental Psychology: Learning, Memory and Cognition, 21, 509–514.
McCann, R.S., Besner, D., & Davelaar, E. (1988). Word recognition and identification: Do word frequency effects reflect lexical access? Journal of Experimental Psychology: Human Perception and Performance, 14, 693–706.
McCarthy, R., & Warrington, E.K. (1986). Phonological reading: Phenomena and paradoxes. Cortex, 22, 359–380.
McClelland, J.L., & Rumelhart, D.E. (1981). An interactive activation model of context effects in letter perception: Part 1. An account of basic findings. Psychological Review, 88, 375–407.
Meyer, D.E., Schvaneveldt, R.W., & Ruddy, M.G. (1974). Functions of graphemic and phonemic codes in visual word recognition. Memory and Cognition, 2, 309–321.
Morton, J. (1969). The interaction of information in word recognition. Psychological Review, 76, 165–178.
Morton, J., & Patterson, K. (1980). A new attempt at an interpretation; or, an attempt at a new interpretation. In M. Coltheart, K. Patterson, & J.C. Marshall (Eds), Deep dyslexia, pp. 91–118. London: Routledge and Kegan Paul.
Paap, K.R., & Noel, R.W. (1991). Dual route models of print to sound: Still a good horse race. Psychological Research, 53, 13–24.
Patterson, K., & Hodges, J.R. (1992). Deterioration of word meaning: Implications for reading. Neuropsychologia, 30, 1025–1040.
Patterson, K., Coltheart, M., & Marshall, J.C. (Eds) (1985). Surface dyslexia. Hove: Lawrence Erlbaum Associates Ltd.
Patterson, K., Seidenberg, M.S., & McClelland, J.L. (1989). Connections and disconnections: Acquired dyslexia in a computational model of reading processes. In R.G.M. Morris (Ed.), Parallel distributed processing: Implications for psychology and neuroscience, pp. 131–181. Oxford: Oxford University Press.
Patterson, K., Graham, N., & Hodges, J.R. (1994a). The impact of semantic memory loss on phonological representations. Journal of Cognitive Neuroscience, 6, 57–69.
Patterson, K., Graham, N., & Hodges, J.R. (1994b). Reading in Alzheimer’s type dementia: A preserved ability? Neuropsychology, 8, 395–412.
Patterson, K., Plaut, D.C., McClelland, J.L., Seidenberg, M.S., Behrmann, M., & Hodges, J.R. (1996). Connections and disconnections: A connectionist account of surface dyslexia. In J. Reggia, R. Berndt, & E. Ruppin (Eds), Neural modeling of cognitive and brain disorders, pp. 177–199. New York: World Scientific.
Plaut, D.C. (1995a). Double dissociation without modularity: Evidence from connectionist neuropsychology. Journal of Clinical and Experimental Neuropsychology, 17, 291–321.
Plaut, D.C. (1995b). Semantic and associative priming in a distributed attractor network. In Proceedings of the 17th Annual Conference of the Cognitive Science Society, pp. 37–42. Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Plaut, D.C. (1996). Relearning after damage in connectionist networks: Toward a theory of rehabilitation. Brain and Language, 52, 25–82.
Plaut, D.C., & McClelland, J.L. (1993). Generalization with componential attractors: Word and nonword reading in an attractor network. In Proceedings of the 15th Annual Conference of the Cognitive Science Society, pp. 824–829. Hillsdale, NJ: Lawrence Erlbaum Associates Inc.
Plaut, D.C., McClelland, J.L., Seidenberg, M.S., & Patterson, K. (1996). Understanding normal and impaired word reading: Computational principles in quasi-regular domains. Psychological Review, 103, 56–115.
Ratcliff, R. (1978). A theory of memory retrieval. Psychological Review, 85, 59–108.
Raymer, A.M., & Berndt, R.S. (1994). Models of word reading: Evidence from Alzheimer’s disease. Brain and Language, 47, 479–482.
Rueckl, J.G. (1995). Ambiguity and connectionist networks: Still settling into a solution. Comment on Joordens and Besner (1994). Journal of Experimental Psychology: Learning, Memory and Cognition, 21, 501–508.
Rumelhart, D.E., Hinton, G.E., & Williams, R.J. (1986a). Learning internal representations by error propagation. In D.E. Rumelhart, J.L. McClelland, & the PDP Research Group (Eds), Parallel distributed processing: Explorations in the microstructure of cognition. Vol. 1: Foundations, pp. 318–362. Cambridge, MA: MIT Press.
Rumelhart, D.E., Hinton, G.E., & Williams, R.J. (1986b). Learning representations by back-propagating errors. Nature, 323, 533–536.
Schwartz, M.F. (Ed.) (1990). Modular deficits in Alzheimer-type dementia. Cambridge, MA: MIT Press.
Schwartz, M.F., Marin, O.S.M., & Saffran, E.M. (1979). Dissociations of language function in dementia: A case study. Brain and Language, 7, 277–306.
Schwartz, M.F., Saffran, E.M., & Marin, O.S.M. (1980). Fractioning the reading process in dementia: Evidence for word-specific print-to-sound associations. In M. Coltheart, K. Patterson, & J.C. Marshall (Eds), Deep dyslexia, pp. 259–269. London: Routledge and Kegan Paul.
Sears, C.R., Hino, Y., & Lupker, S.J. (1995). Neighborhood size and neighborhood frequency effects in word recognition. Journal of Experimental Psychology: Human Perception and Performance, 21, 876–900.
Seidenberg, M.S. (1992). Beyond orthographic depth: Equitable division of labour. In R. Frost & K. Katz (Eds), Orthography, phonology, morphology, and meaning, pp. 85–118. Amsterdam: Elsevier.
Seidenberg, M.S., & McClelland, J.L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523–568.
Seidenberg, M.S., Waters, G.S., Barnes, M.A., & Tanenhaus, M.K. (1984). When does irregular spelling or pronunciation influence word recognition? Journal of Verbal Learning and Verbal Behavior, 23, 383–404.
Seidenberg, M.S., Plaut, D.C., Petersen, A.S., McClelland, J.L., & McRae, K. (1994). Nonword pronunciation and models of word recognition. Journal of Experimental Psychology: Human Perception and Performance, 20, 1177–1196.
Seidenberg, M.S., Petersen, A., MacDonald, M.C., & Plaut, D.C. (1996). Pseudohomophone effects and models of word recognition. Journal of Experimental Psychology: Learning, Memory and Cognition, 22, 48–62.
Shallice, T., Warrington, E.K., & McCarthy, R. (1983). Reading without semantics. Quarterly Journal of Experimental Psychology, 35A, 111–138.
Snodgrass, J.G., & Vanderwart, M. (1980). A standardized set of 260 pictures: Norms for name agreement, image agreement, familiarity, and visual complexity. Journal of Experimental Psychology: Learning, Memory and Cognition, 6, 174–215.
Snowden, J.S., Goulding, P.J., & Neary, D. (1989). Semantic dementia: A form of circumscribed cerebral atrophy. Behavioral Neurology, 2, 167–182.
Strain, E., Patterson, K., & Seidenberg, M.S. (1995). Semantic effects in single-word naming. Journal of Experimental Psychology: Learning, Memory and Cognition, 21, 1140–1154.
Taraban, R., & McClelland, J.L. (1987). Conspiracy effects in word recognition. Journal of Memory and Language, 26, 608–631.
Usher, M., & McClelland, J.L. (1995). On the time course of perceptual choice: A model based on principles of neural computation. Technical Report PDP.CNS.95.5. Pittsburgh, PA: Carnegie Mellon University, Department of Psychology.
Van Orden, G.C., & Goldinger, S.D. (1994). Interdependence of form and function in cognitive systems explains perception of printed words. Journal of Experimental Psychology: Human Perception and Performance, 20, 1269.
Van Orden, G.C., Pennington, B.F., & Stone, G.O. (1990). Word identification in reading and the promise of subsymbolic psycholinguistics. Psychological Review, 97, 488–522.
Waters, G.S., & Seidenberg, M.S. (1985). Spelling–sound effects in reading: Time course and decision criteria. Memory and Cognition, 13, 557–572.
Watt, S., Jokel, R., & Behrmann, M. (1997). Surface dyslexia in progressive aphasia. Brain and Language, 56, 211–233.
Wickelgren, W.A. (1969). Context-sensitive coding, associative memory, and serial order in (speech) behavior. Psychological Review, 76, 1–15.