Journal-notes on cognitive science

[From 2015.] Reading a bunch of articles on cognitive development. A fascinating field.


Human infants who receive little touching grow more slowly [and] release less growth hormone… Throughout life, they show larger reactions to stress, are more prone to depression, and are vulnerable to deficits in cognitive functions commonly seen in depression or during stress. Touch plays a powerful role for human infants in promoting optimal development and in counteracting stressors. Massaging babies lowers their cortisol levels and helps them gain weight… Thus, besides ‘simple touch’ being able to calm our jitters and lift our spirits, the right kind of touch regularly enough early in life can improve cognitive development, brain development, bodily health throughout life, and gene expression. (From “Contributions of Neuroscience to Our Understanding of Cognitive Development” (2008).)

From Jean Mandler’s article “A New Perspective on Cognitive Development in Infancy” (1990) it seems my earlier guess about when concepts, in some primitive sense, start to develop in the infant (I said five months or so) was just about right. Piaget was wrong that there’s a prolonged stage of exclusively sensorimotor processing, in which thinking, or conceptualizing, is absent. (In fact, he was wrong about the whole stage theory itself (except in the truistic sense that cognition progressively grows more sophisticated and abstract), as well as about the purported sensorimotor foundation of thought. This idea has some merit, but as a general explanation of all thought it’s too empiricist (notwithstanding Piaget’s criticism of empiricism).) Experiments have shown that the rudiments of conceptual thinking and perceiving exist by, at the latest, six months. (I’d say earlier, actually. In all likelihood.) One possible origin of concepts is that they develop out of the young infant’s perceptual schemas,[1] perhaps from “a process by which one perception is actively compared to another and similarities or differences between them are noted.” More specifically,

image schemas—notions derived from spatial structure, such as trajectory, up-down, container, part-whole, end-of-path, and link—form the foundation of the conceptualizing capacity. [Lakoff and Johnson] suggest that image schemas are derived from preconceptual perceptual structures, forming the core of many of our concepts of objects and events and of their metaphorical extensions to abstract realms. They demonstrate in great detail how many of our most complex concepts are grounded in such primitive notions. I would characterize image schemas as simplified redescriptions of sensorimotor schemas, noting that they seem to be reasonably within the capacity of infant conceptualization.

Eric Margolis and Stephen Laurence’s article “In Defense of Nativism” (2013) makes you seriously wonder how anyone could be perverse enough to oppose nativism (i.e. a kind of rationalism, all the Chomskyan stuff about innateness, poverty of the stimulus, the modularity of the mind, etc.). Here’s an interesting piece of evidence:

One especially vivid type of poverty of the stimulus argument draws upon the results of isolation experiments. These are empirical studies in which, by design, the experimental subjects are removed from all stimuli that are related to a normally acquired trait. For example, Irenäus Eibl-Eibesfeldt showed that squirrels raised in isolation from other squirrels, and without any solid objects to handle, spontaneously engage in the stereotypical squirrel digging and burying behavior when eventually given nuts. Eibl-Eibesfeldt notes that, “the stereotypy of the movement becomes particularly obvious in captivity, where inexperienced animals will try to dig a hole in the solid floor of a room, where no hole can be dug. They perform all the movements already described, covering and patting the nut, even though there is no earth available.” Since the squirrels were kept apart from their conspecifics, they had no exposure to this stereotypical behavior prior to exhibiting it themselves—the stimulus was about as impoverished as can be. Reliable acquisition of such a complex and idiosyncratic behavior under these circumstances provides extremely good evidence against [the empiricist hypothesis of] a general-purpose learning mechanism [which is supposed to explain the acquisition of all cognitive traits, as opposed to the nativist hypothesis of specialized acquisition systems each operative in a different domain, whether that of language, numbers, music, object representation, facial recognition, etc.].

Are humans supposed to be the only species that isn’t “programmed” to exhibit specific types of behavior and acquire specific types of knowledge??

Apparently the name of that argument is “the argument from animals”: “The argument from animals is grounded in the firmly established fact that animals have a plethora of specialized learning systems… But human beings are animals too.” What’s true of other animals should be true of us. And, indeed, it is: just as other animals tend to be pretty stupid in many ways, human beings tend to be pretty stupid too, as shown by the fact that large numbers of them are empiricists.

Being sensible, the authors argue that some concepts, too, are, loosely speaking, innate. “Likely contenders include concepts associated with objects, causality, space, time, and number, concepts associated with goals, functions, agency, and meta-cognitive thinking, basic logical concepts, concepts associated with movement, direction, events, and manner of change, and concepts associated with predators, prey, food, danger, sex, kinship, status, dominance, norms, and morality.” Yes! I love bold statements like this, especially when they’re highly plausible.

From Alison Gopnik’s “The Post-Piaget Era” (1996):

Piaget’s thesis that there are broad-ranging, general stages of development seems increasingly implausible. Instead, cognitive development appears to be quite specific to particular domains of knowledge. Moreover, children consistently prove to have certain cognitive abilities at a much earlier age than Piaget proposed, at least in some domains. Also contra Piaget, even newborn infants have rich and abstract representations of some aspects of the world. Moreover, Piaget underestimated the importance of social interaction and language in cognitive development. Finally, Piaget’s account of assimilation and accommodation as the basic constructivist mechanisms now seems too vague.

I’m skeptical of neo-Piagetian constructivism, such as “the theory theory,” “the idea that cognitive development is the result of the same mechanisms that lead to theory change in science.” Evidence, counterevidence, testing of the theory (or representation), the invention of auxiliary hypotheses, the construction of a new theory, etc. The child’s brain is supposed to be like a scientist revising theories in the light of evidence. In a loose sense, this must be true: children constantly ask questions, they interpret and try to understand the world, they learn to make inferences about what’s going on in other people’s heads, they revise their beliefs, and so on. In fact, humans continue to do this throughout life. Organized science is but a more sophisticated version of cognitive processes (of inquisitiveness, ‘theory’-formation, informal testing) we frequently engage in even without being reflectively aware of it. But there’s at least one obvious difference between science, even folk science, and early cognitive maturation: the former is a highly conscious activity, while the latter takes place even before the young child is (self-)conscious at all or remotely capable of advanced conceptualizing. The infant’s and toddler’s brain unconsciously constructs representations of the world, not through conscious reflective processes but through utterly mysterious, natural, ‘spontaneous’ [to use a misleading word] perceptual-cognitive-neural processes that ‘grow’ unbelievably complex and ordered structures of vision, object perception, linguistic syntax and semantics, musical sensitivity, etc. Such processes of even physical brain-growth—neurons forming trillions of connections with each other so as eventually to build stable, coherent, and integrated representations of the world—appear to have virtually nothing in common with scientists’ self-conscious construction of theories.[2]

Much more plausible than the theory theory is Fodor’s (nativist) modularity conception of the mind, or Chomsky’s principles and parameters approach.

And then in Nora S. Newcombe’s paper “The Nativist-Empiricist Controversy in the Context of Recent Research on Spatial and Quantitative Development” (2002), you have the claim that “the key question in the nativist-interactionist controversy is whether environmental interactions are or are not required to produce mature competence. Nativists believe that adaptation was ensured by highly specific built-in programs, whereas interactionists suggest that inborn abilities do not need to be highly specific because, for central areas of functioning, the environmental input required for adaptive development is overwhelmingly likely. Evidence favors the latter solution to the adaptational problem.” Of course. You need some sort of interactions with the environment in order for capacities to mature in the way they’re ‘programmed’ to do! I don’t see how that contradicts nativism.

Reading articles on David Marr’s theories of vision. Not very easy. But stimulating and plausible. (Three levels: computation, algorithm, implementation. I don’t see how the whole computational-representational conception of the mind, at least on some interpretation, can fail to be true. In order for the human being to interact with the environment, the brain has to deal with various levels and types of representations, and it has to carry out complex computations on the raw data it receives, so as to work up the data into something usable. Mathematics is the language of nature; calculations (computations) are how the brain processes and interprets the nature it ‘receives’ through the senses. (E.g., in order for us to walk, the brain has to constantly calculate where external objects are in relation to the body.) Maybe I’m just irredeemably stupid, but I don’t see what real alternative there is to this picture. I have to admit I don’t have the foggiest idea of how neural activities could carry out calculations, or even what it means to say they do, but clearly they must. Actually, the whole thing is one big goddamn mess, and scientists/philosophers haven’t even got clear on the basics yet.)

Reading the Chomsky-recommended Memory and the Computational Brain: Why Cognitive Science Will Transform Neuroscience (2010), by C. R. Gallistel and Adam Philip King. A brilliant and illuminating book, though not light reading. “Our central claim…is that the function of the neurobiological memory mechanism is to carry information forward in time in a computationally accessible form.” They argue against the non-representational, associative theories of learning that go back centuries, and claim that contemporary neuroscience is handicapped by its fidelity to this tradition, particularly in its position that the physical basis of memory is synaptic plasticity (the ability of synapses to strengthen or weaken over time, in response to increases or decreases in their activity).

Another central claim: “There must be an addressable read/write memory mechanism in brains that encodes information received by the brain into symbols (write), locates the information received by the brain (addresses), and transports it to computational machinery that makes productive use of the information (reads).” An academic review of the book elaborates: “In short, living organisms are made in the image of a digital computer of the sort described by John von Neumann and envisioned by Alan Turing.” I’ve always been skeptical of the abundance of parallels that thinkers like to draw between brains and computers, but some of the parallels must be justified. (Not, however, the idea that computers are conscious. That’s just stupid.)

Wow, Claude Shannon, “the father of information theory,” was remarkably brilliant. In general, I’ve always been impressed by mathematicians, by the very idea of mathematicians; but Shannon was incredible. Finding a way to quantify information! It’s almost miraculous.

From the book:

This brings us to a brief consideration of how Shannon’s analysis applies to the brain. The essential point is that the brain is a receiver of signals that, under the proper conditions, convey to it information about the state of the world. The signals the brain receives are trains of action potentials propagating down sensory axons. Neurophysiologists call these action potentials spikes, because they look like spikes when viewed on an oscilloscope at relatively low temporal resolution. Spikes are analogous to electrical pulses that carry information within electronic systems. Sensory organs (eyes, ears, noses, tongues, and so on) and the sensory receptors embedded in them convert information-rich stimulus energy to spike trains. The stimuli that act directly on sensory receptors are called proximal stimuli. Examples are the photons absorbed by the rods and cones in the retina, the traveling waves in the basilar membrane of the cochlea, which bend the underlying hair cells, the molecules absorbed by the nasal mucosa, and so on. Proximal stimuli carry information about distal stimuli, sources out there in the world. The brain extracts this information from spike trains by processing them. This is to say that much of the signal contains data from which useful information must be determined. [But the signal also contains ‘noise’ that the brain somehow has to filter out.]
…The modern approach to a neurobiological understanding of sensory transduction and the streams of impulses thereby generated relies heavily on Shannon’s insights and their mathematical elaboration. In a few cases, it has been possible to get evidence regarding the code used by sensory neurons to transmit information to the brains of flies and frogs. The use of methods developed from Shannon’s foundations has made it possible to estimate how many bits are conveyed per spike and how many bits are conveyed by a single axon in one second. The answers have been truly revolutionary. A single spike can convey as much as 7 bits of information and 300 bits per second can be transmitted on a single axon. [This is fascinating. And how were they able to estimate it at all?!]
Given our estimates above of how many bits on average are needed to convey English words when an efficient code is used (about 10 per word), a single axon could transmit 30 words per second to, for example, a speech center…
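
The arithmetic in that last paragraph is easy to check. A minimal Python sketch, assuming only the figures quoted above (300 bits per second per axon, about 10 bits per word); the function name `entropy_bits` and the toy 1,024-word vocabulary are my own illustrative choices, not anything from the book:

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy H = -sum(p * log2(p)), in bits per symbol."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Figures quoted from the book.
BITS_PER_SECOND_PER_AXON = 300
BITS_PER_WORD = 10

# Words per second that one axon could carry at that rate.
words_per_second = BITS_PER_SECOND_PER_AXON / BITS_PER_WORD

# Illustrative assumptions: a fair coin flip, and a toy vocabulary of
# 1,024 equally likely words.
coin_bits = entropy_bits([0.5, 0.5])
vocab_bits = entropy_bits([1 / 1024] * 1024)

print(words_per_second, coin_bits, vocab_bits)
```

Ten bits is exactly what it takes to single out one of 1,024 equally likely alternatives, just as one bit settles a fair coin flip.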

In short, “Understanding how the brain works requires an understanding of the rudiments of information theory, because what the brain deals with is information.”

“For decades, neuroscientists have debated the question whether neural communication is analog or digital or both, and whether it matters… Our hunch is that information transmission and processing in the brain is likewise ultimately digital. A guiding conviction of ours – by no means generally shared in the neuroscience community – is that brains do close to the best possible job with the problems they routinely solve, given the physical constraints on their operation. Doing the best possible job suggests doing it digitally, because that is the best solution to the ubiquitous problems of noise, efficiency of transmission, and precision control.” They make this observation because the modern theory of computation is cast entirely in digital terms (digital means information is carried by a set of discrete symbols, rather than something continuously variable), so in order for the theory to be relevant to the brain, the brain has to work digitally. Incidentally, the genetic coding of information is a digital, not an analog, system.
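
The noise point can be illustrated with a toy simulation. A hedged Python sketch, assuming a signal passed through a chain of relay stages that each add a small bounded random error; the `relay` function, the noise level, and the number of stages are all made up for illustration and have nothing to do with real neurons:

```python
import random

def relay(value, stages, regenerate, noise=0.05, seed=0):
    """Pass a value through noisy stages.

    If regenerate is True, each stage snaps the value back to the nearest
    discrete symbol (0.0 or 1.0), which is the 'digital' case. Otherwise
    the errors simply accumulate, which is the 'analog' case.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    for _ in range(stages):
        value += rng.uniform(-noise, noise)
        if regenerate:
            value = 1.0 if value >= 0.5 else 0.0
    return value

analog_out = relay(1.0, stages=1000, regenerate=False)
digital_out = relay(1.0, stages=1000, regenerate=True)

print("analog: ", analog_out)   # has drifted away from 1.0
print("digital:", digital_out)  # still exactly 1.0
```

Because the per-stage error here is always smaller than the distance to the decision threshold, the discrete symbol survives any number of stages, while the continuous value wanders. That is only the noise part of the authors' argument; the sketch says nothing about efficiency or precision control.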

Sometimes I like to just sit in awe. The brain has to interpret the world in such a way that the organism can interact with its environment. So what it does, necessarily, is to give an appearance to something that in fact, in itself, has no appearance. What perceptual appearances are (of course) is just a vast amount of information about the world, information that enables us to survive. Information presented in the form of colors, sounds, tastes, smells, etc. There are structures in the world that correspond to what we perceive, while yet being so radically different as to lack the defining element of what we perceive, namely appearance. (Our complete incapacity to imagine something that (in itself) lacks an appearance only shows how inadequate our cognitive apparatus is to truly understand the world.) Anyway, I just find it mind-bending to contemplate that the brain, by carrying out computations on gigabytes of data it continuously receives, gives an appearance to the world. From billions of action potentials you get…a breathtaking sunset.

So far, most of the book I find mind-numbing and almost incomprehensible. All the details about information theory, functions, symbolization, all the mathematics…my brain doesn’t ‘function’ in this way. Here’s a refreshing and useful paragraph:

The anti-representational behaviorism of an earlier era finds an echo in contemporary connectionist and dynamic-systems work. Roughly speaking, the more committed theorists are to building psychological theory on neurobiological foundations, the more skeptical they are about the hypothesis that there are symbols and symbol-processing operations in the brain. We will explore the reasons for this in subsequent chapters, but the basic reason is simple: the language and conceptual framework for symbolic processing is alien to contemporary neuroscience. Neuroscientists cannot clearly identify the material basis for symbols – that is, there is no consensus about what the basis might be – nor can they specify the machinery that implements any of the information-processing operations that would plausibly act on those symbols (operations such as vector addition). Thus, there is a conceptual chasm between mainline cognitive science and neuroscience. Our book is devoted to exploring that chasm and building the foundations for bridging it.

Useful paper: “Doing Cognitive Neuroscience: A Third Way” (2006), by Frances Egan and Robert J. Matthews. Defends the dynamic-systems approach Gallistel criticizes, but has information on two other dominant paradigms as well. The first is what most neuroscientists do: study what neurons are doing, what goes on biologically at the lowest levels. This is what Gallistel, David Marr, Chomsky, and other cognitive scientists criticize on the following grounds:

The persistent complaints of computational theorists such as Marr seem borne out, that merely mucking around with low-level detail is unlikely to eventuate in any understanding of complex cognitive processes, for the simple reason that bottom-up theorists don’t know what they are looking for and thus are quite unlikely to find anything. Sometimes the problem here is presented as a search problem: How likely it is that one can find what one is looking for, given the complexity of the brain, unless one already knows, or at least has some idea, what one is looking for. At other times the problem is presented less as an epistemological problem than as a metaphysical problem: cognitive processes, it is claimed, are genuinely emergent out of the low level processes, such that no amount of understanding of the low-level processes will lead to an understanding of the complex cognitive processes.

That seems plausible to me.

As Marr said, “trying to understand perception by studying only neurons is like trying to understand bird flight by studying only feathers. It simply cannot be done.” The neural level only implements higher-level cognitive principles and operations. So the other approach, of course, is Marr and Gallistel’s, the top-down one. The authors of the article criticize these “cognitivists,” on the other hand, for assuming that their account of mental processes will be found someday to “smoothly reduce to neuroscience”—or rather that neural mechanisms of implementation will be discovered—which Egan and Matthews think is false.

If both paradigms are mistaken, there must be a third one, ‘between’ the two. “The neural dynamic systems approach is based on the idea that complex systems can only be understood by finding a mathematical characterization of how their behavior emerges from the interaction of their parts over time.” One possible manifestation of the dynamic systems approach is dynamic causal modeling (DCM).

DCM undertakes to construct a realistic model of neural function, by modeling the causal interaction among anatomically defined cortical regions (such as various Brodmann’s areas, the inferior temporal fusiform gyrus, Broca’s area, etc.), based largely on fMRI data. The idea is to develop a time-dependent dynamic model of the activation of the cortical regions implicated in specific cognitive tasks. In effect, DCM views the brain as a dynamic system, consisting of a set of structures that interact causally with each other in a spatially and temporally specific fashion. These structures undergo certain time-dependent dynamic changes in activation and connectivity in response to perturbation by sensory stimuli, where the precise character of these changes is sensitive not only to these sensory perturbations but also to specific non-stimulus contextual inputs such as attention. DCM describes the dynamics of these changes by means of a set of dynamical equations, analogous to the dynamical equations that describe the dynamic behavior of other physical systems such as fluids in response to perturbations of these systems.

And so on, in the same vein. Eventually:

Let us summarize, briefly, some implications of the neural dynamics systems approach, as illustrated by DCM, for the problem of explaining cognition. Notice first, and most importantly, it explains cognition in the sense that it gives an account of the phenomena in neural terms, emphasizing a break with the ‘cognitivist’ assumption that to be an explanation of cognition is necessarily to be a cognitive explanation, where by the latter one means an explanation that traffics in the usual semantic and intentional internals [internal states] dear to cognitivists. Second, the account leaves open the question of whether there is any interesting mapping between cognitive explanations and neural explanations of cognitive phenomena—maybe there is, but then again maybe not…

But if supposedly you can explain cognition without mentioning representations, computations, and all that cognitivist stuff…well then is cognitive science totally misguided and unnecessary? That hardly seems possible, given its successes. If there isn’t a way to bridge the divide between neuroscience and cognitivism, I don’t see how we’ll ever really understand cognition. But I guess this is just another manifestation of the mind-body problem, and as such might be insuperable.

Here’s something interesting from Gallistel’s book: apparently it’s plausible that the brain is able to store a trillion gigabytes of memory, “the equivalent of well over 100 billion DVDs.” Gives you some idea of only one of the differences between human brains and the most powerful modern computers.
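
A quick check of that comparison in Python. The DVD capacities (4.7 GB single-layer, 8.5 GB dual-layer) are standard nominal figures that I am supplying; the book's quoted sentence doesn't say which it has in mind:

```python
# Figure quoted from the book: a trillion gigabytes.
BRAIN_CAPACITY_GB = 1e12

# Assumed nominal DVD capacities (not from the book).
DVD_SINGLE_LAYER_GB = 4.7
DVD_DUAL_LAYER_GB = 8.5

dvds_single = BRAIN_CAPACITY_GB / DVD_SINGLE_LAYER_GB
dvds_dual = BRAIN_CAPACITY_GB / DVD_DUAL_LAYER_GB

print(f"single-layer: {dvds_single / 1e9:.0f} billion DVDs")
print(f"dual-layer:   {dvds_dual / 1e9:.0f} billion DVDs")
```

Either way the count comes out above 100 billion, so the quoted comparison holds up.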

Nice: “We suspect that part of the motivation for the reiterated claim that the brain is not a computer or that it does not ‘really’ compute is that it allows behavioral and cognitive neuroscientists to stay in their comfort zone; it allows them to avoid mastering…confusion [about all things computational], to avoid coming to terms with what we believe to be the ineluctable logic of physically realized computation.” I’m sure that’s right. Computer science, or rather information science, is damn difficult, and people who have chosen to spend their careers on something different don’t want to have to get into it. (I doubt Gallistel is popular among neuroscientists. “He’s challenging the way we do things! Criticizing us! Interloper!”)

Summing up chapter 10: “A mechanism for carrying information forward in time is indispensable in the fabrication of an effective computing machine. Contemporary neurobiology has yet to glimpse a mechanism at all well suited to the physical realization of this function.” So neurobiology is very far from understanding memory. It doesn’t even have an inkling of the mechanisms yet (much less a fleshed-out understanding from a macro perspective).

Chapter 11 is about the nature of learning. Currently (in cognitive science and its related disciplines) there are two different conceptual frameworks for thinking about learning. “In the first story about the nature of learning, which is by far the more popular one, particularly in neurobiologically oriented circles, learning is the rewiring by experience of a plastic brain so as to make the operation of that brain better suited to the environment in which it finds itself. In the second story, learning is the extraction from experience of information about the world that is carried forward in memory to inform subsequent behavior. In the first story, the brain has the functional architecture of a neural network. In the second story, it has the functional architecture of a Turing machine.” Needless to say, the authors favor the second story.

They have a nice little summary of the philosophical/psychological doctrine of associationism, and later behaviorism, that provides the historical and theoretical context for the first story about the nature of learning:

[In Pavlov’s classic experiments on dogs,] he was guided by the congruence between what he seemed to have observed and one of the oldest and most popular ideas in the philosophical literature on the theory of mind, the notion of an associative connection. In the seventeenth century, the English philosopher and political theorist John Locke argued that our thoughts were governed by learned associations between “ideas.” Locke understood by “ideas” both what we might now call simple sense impressions, for example, the impression of red, and what we might now call concepts, such as the concept of motherhood. He called the first simple ideas and the second complex ideas. Whereas rationalists like Leibnitz believed that ideas were connected by some kind of preordained intrinsic logical system, Locke argued that the connections between our ideas were in essence accidents of experience. One idea followed the next in our mind because the stimulus (or environmental situation) that aroused the second idea had repeatedly been preceded by the stimulus (or situation) that aroused the preceding idea. The repeated occurrence of these two ideas in close temporal proximity had caused a connection to grow up between them. The connection conducted excitation from the one idea to the other. When the first idea was aroused, it aroused the second by way of the associative connection that experience had forged between the two ideas. Moreover, he argued, the associative process forged complex ideas out of simple ideas. Concepts like motherhood were clusters of simple ideas (sense impressions) that had become strongly associated with each other through repeated experience of their co-occurrence.
There is enormous intuitive appeal to this concept. It has endured for centuries. It is as popular today as it was in Locke’s day, probably more popular…
The influence on Pavlov of this well-known line of philosophical thought was straightforward. He translated the doctrine of learning by association into a physiological hypothesis. He assumed that it was the temporal proximity between the sound of food being prepared and the delivery of food that caused the rewiring of the reflex system, the formation of new connections between neurons. These new connections between neurons are the physiological embodiment of the psychological and philosophical concept of an association between ideas or, as we would now say, concepts. He set out to vary systematically the conditions of this temporal pairing – how close in time the neutral (sound) stimulus [like a bell ringing] and the innately active food stimulus had to be, how often they had to co-occur, the effects of other stimuli present, and so on. In so doing, he gave birth to the study of what is now called Pavlovian conditioning. The essence of this approach to learning is the arranging of a predictive relation between two or more stimuli and the study of the behavioral changes that follow the repeated experience of this relationship. These changes are imagined to be the consequence of some kind of rewiring…

The theory of rewiring isn’t representational. There is no place for symbolic memory, which means no mechanism for carrying forward in time the information gleaned from past experience.

The authors argue, on the other hand, that “learning is the extraction from experience of symbolic representations, which are carried forward in time by a symbolic memory mechanism, until such time as they are needed in a behavior-determining computation.” They argue for this by considering the phenomenon of dead reckoning, a simple computation that (from behavioral evidence) is almost certainly implemented in the brains of a wide variety of animals.
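
Dead reckoning (path integration) is simple enough to write down. A minimal Python sketch, assuming a flat plane, headings in degrees measured counterclockwise from the x-axis, and noiseless measurements; the function names are hypothetical, and this illustrates the computation itself, not the authors' account of how any animal implements it:

```python
import math

def dead_reckon(steps, start=(0.0, 0.0)):
    """Integrate (heading_degrees, distance) steps into a current position."""
    x, y = start
    for heading_deg, distance in steps:
        heading = math.radians(heading_deg)
        # The running position is the stored quantity that each step updates.
        x += distance * math.cos(heading)
        y += distance * math.sin(heading)
    return x, y

def home_vector(position, start=(0.0, 0.0)):
    """Bearing (degrees) and distance from the current position back to start."""
    dx = start[0] - position[0]
    dy = start[1] - position[1]
    distance = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dy, dx)) % 360
    return bearing, distance

# Outbound trip: 3 units along the x-axis, then 4 units along the y-axis.
outbound = [(0, 3.0), (90, 4.0)]
position = dead_reckon(outbound)
bearing, distance = home_vector(position)

print(position, bearing, distance)
```

The sketch makes the point the authors stress: the running position has to be stored and updated at every step, that is, carried forward in time, or the homeward computation is impossible.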

In a later chapter they also take issue with the popular, empiricist-inspired idea that there’s a single basic learning mechanism—a “general-purpose learning process”—or at any rate a small number of such mechanisms. They prefer the modular hypothesis rooted in zoology and evolutionary biology, the hypothesis that there are adaptive specializations. “Each such mechanism constitutes a particular solution to a particular problem. The foliated structure of the lung reflects its role as the organ of gas exchange, and so does the specialized structure of the tissue that lines it. The structure of the hemoglobin molecule reflects its function as an oxygen carrier…” Adaptive specialization of mechanism is ubiquitous and obvious in biology, simply taken for granted. From this perspective, it’s strange that most past and present theorizing about learning is opposed to the idea.

A computational/representational approach to learning, which assumes that learning is the extraction and preservation of useful information from experience, leads to a more modular, hence more biological conception of learning. For computational reasons, learning mechanisms must have a problem-specific structure, because the structure of a learning mechanism must reflect the computational aspects of the problem of extracting a given kind of representation from a given kind of data. Learning mechanisms are problem-specific modules – organs of learning – because it takes different computations to extract different representations from different data.

Again, all this seems so obvious to me I’m puzzled (sort of) that it’s a minority view. The details of the authors’ reasoning are sometimes hard to follow, but intuitively their position seems extremely plausible.

[1] As Jerome Kagan defines it in his paper “In Defense of Qualitative Changes in Development” (2008), a schema is “a representation of the patterned perceptual features of an event… The young infant’s representation of a caretaker’s face is an obvious example. The primary features of schemata for visual events are spatial extent, shape, color, contrast, motion, symmetry, angularity, circularity, and density of contour.”

[2] In another paper, Gopnik observes that “to many philosophers the very idea that children could be employing the same learning mechanisms as scientists generates a reaction of shocked, even indignant, incredulity.” She continues: “For some reason, I’ve found this initial reaction to be stronger in philosophers than in psychologists or, especially, practicing scientists, who seem to find the idea appealing and even complimentary.” Hm. Maybe philosophers do, after all, tend to be relatively perceptive thinkers? Personally, I find it hard to fathom how someone could be so wrongheaded as to think that infants—whose brains are, in a sense, forming!—and scientists are using the very same learning mechanisms. Incidentally, here’s another example of stupidity: some of Chomsky’s critics have objected that language is socially constructed, part of a rich social context, not an individual psychological phenomenon. Half a moment’s reflection should have suggested to them that even if language does depend crucially on social interactions, it “nevertheless ultimately depends on individual cognitive processes in individual human minds,” to quote Gopnik. The idea that Chomsky’s “Cartesianism” forbids recognition of the central importance of interactions with others is, well, stunningly dumb.

