#cogsci25 a great talk by Ellie Pavlick on ‘emergent compositionality in neural networks’:

Compositionality in language and thought has been one of the long-running debates in cognitive science. It refers to the way complex meanings are built from component parts. Specifically, it's the idea that the meaning of a complex unit can be derived solely from the meanings of its parts: e.g., the meaning of "black cat" can be built up directly from the meanings of "black" and "cat".
🧵

2/ Compositionality is a hallmark of symbolic computation. The difficulty that connectionist networks had with compositionality was marshalled as a key reason for rejecting them as candidate cognitive models (see e.g., Fodor & Pylyshyn, 1988 https://uh.edu/~garson/F&P1.PDF )

So the question of whether 3rd-generation neural networks (e.g., large language models) fare better has the potential to be hugely informative for how we think about human cognition.

3/ Ellie distinguished two kinds of compositionality, structural and functional, and provided evidence of both in 3rd-generation models.
‘Structural compositionality’ refers to the way, on symbolic accounts, the constituent parts are part of the overall representation (our representations of ‘pink’ and ‘elephant’ are part of our representation of ‘pink elephant’). For neural networks, this would translate into the activations and weights of the network being organised into identifiable and re-combinable parts.

Evidence for exactly this can be found in studies like Lepori, Serre & Pavlick (2023), which shows evidence of structured representations in the weights as networks break tasks down into subroutines.

https://proceedings.neurips.cc/paper_files/paper/2023/hash/85069585133c4c168c865e65d72e9775-Abstract-Conference.html
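The idea of parts being identifiable inside the whole can be sketched in a toy way (this is not the paper's method, which probes trained subnetworks; the vectors and the additive composition here are purely illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for learned activation vectors.
pink = rng.normal(size=64)
elephant = rng.normal(size=64)

# A structurally compositional representation keeps the parts identifiable:
# here, the phrase vector is literally built from its constituents.
pink_elephant = pink + elephant

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# The constituent 'pink' is recoverable from the whole by similarity,
# unlike from an unrelated random vector.
unrelated = rng.normal(size=64)
print(cosine(pink_elephant, pink))   # high
print(cosine(unrelated, pink))       # near zero
```

In a real network nothing guarantees this additive structure; the empirical question Lepori et al. ask is whether trained weights nonetheless decompose into reusable subroutines.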

4/ The second type of compositionality she distinguished is ‘functional compositionality’. Imagine a function like y = g(f(x)). As a universal function approximator, a network could learn to approximate this function's output directly, or it could derive it via intermediate computation of f(x). In support of the latter, she described new work that shows evidence of such intermediate computation.
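The distinction can be made concrete with a toy example (my illustration, not the setup from the talk; f and g here are arbitrary placeholder functions):

```python
def f(x):
    # some intermediate quantity
    return x * x

def g(z):
    return z + 1

# Compositional route: the intermediate value f(x) exists as an explicit
# step of the computation, which is what probing studies try to detect.
def compositional(x):
    intermediate = f(x)
    return g(intermediate), intermediate

# Monolithic route: identical input-output behaviour, but no point in the
# computation where f(x) is represented on its own.
def monolithic(x):
    return x * x + 1

y1, mid = compositional(3)
y2 = monolithic(3)
print(y1, y2, mid)  # same outputs; only one route exposes f(3)
```

The behavioural equivalence of the two routes is exactly why input-output tests alone can't settle the question, and why the evidence has to come from looking inside the model.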

5/ In short, LLMs can learn representations that are structured and modular in both activation and weight space. But at the same time, they remain context-sensitive, so they capture ways in which human cognition deviates from purely symbolic architectures. In this way, they can move this long-standing debate forward by providing an example of a computational system that combines these properties.
