

Meaning Machine – Visualize how LLMs break down and simulate meaning (meaning-machine.streamlit.app)

33 points by jdspiral 6 hours ago | 12 comments

wrs 1 hour ago [-]

Is there evidence that modern LLMs identify parts of speech in an observable way? This explanation sounds more like how we did it in the 90s before deep learning took over.

synapsomorphy 47 minutes ago [-]

This is somewhat disingenuous IMO. Language models do NOT explicitly tag parts of speech, or construct grammatical trees of relationships between words [1].

It also feels like motivated reasoning to make them seem dumb because in reality we mostly have no clue what algorithms are running inside LLMs.

> When you or I say "dog", we might recall the feeling of fur, the sound of barking [..] But when a model sees "dog", it sees a vector of numbers

When o3 or Gemini sees "dog", it might recall the feeling of fur, the sound of barking [..] But when a human says "dog", it sees electrical impulses in neurons

The stochastic parrot argument has been had a million times over and this doesn't feel like a substantial contribution. If you think vectors of numbers can never be true meaning then that means either (a) no amount of silicon can ever make a perfect simulation of a human brain, or (b) a perfectly simulated brain would not actually think or feel. Both seem very unlikely to me.

There are much better resources out there if you want to learn our best idea of what algorithms go on inside LLMs [2][3]. It's a whole field called mechanistic interpretability, and it's way, way, way more complicated than tagging parts of speech.

[1] Maybe attention learns something like this, but it's doing a whole lot more than just that.

[2] https://transformer-circuits.pub/2025/attribution-graphs/bio...

[3] https://transformer-circuits.pub/2022/toy_model/index.html

P.S. The explainer has em dashes aplenty. I strongly prefer to see disclaimers (even if it's a losing battle) when LLMs are used heavily for writing, especially for more technical topics like this.

jdspiral 6 hours ago [-]

I built a tool called Meaning Machine to let you see how language models "read" your words.

It walks through the core stages — tokenization, POS tagging, dependency parsing, embeddings — and visualizes how meaning gets fragmented and simulated along the way.

Built with Streamlit, spaCy, BERT, and Plotly. It’s fast, interactive, and aimed at anyone curious about how LLMs turn your sentence into structured data.

Would love thoughts and feedback from the HN crowd — especially devs, linguists, or anyone working with or thinking about NLP systems.

GitHub: https://github.com/jdspiral/meaning-machine
Live Demo: https://meaning-machine.streamlit.app

dz0707 1 hour ago [-]

I'm wondering if this could turn into some kind of prompt tuning tool - like to detect weak or undesired relationships, "blur" in embeddings, etc.

georgewsinger 5 hours ago [-]

Is this really how SOTA LLMs parse our queries? To what extent is this a simplified representation of what they really "see"?

jdspiral 3 hours ago [-]

Yes, tokenization and embeddings are exactly how LLMs process input—they break text into tokens and map them to vectors. POS tags and SVOs aren't part of the model pipeline but help visualize structures the models learn implicitly.
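As a rough illustration of the "break text into tokens and map them to vectors" step, here is a minimal sketch in plain Python. It is not the tool's code or any real model's tokenizer: the vocabulary, the greedy longest-match splitting rule, and the 4-dimensional random embedding table are all made up for the example. Real models use learned subword tokenizers and learned embeddings with far more dimensions.

```python
import random

# Hypothetical toy vocabulary; real tokenizers learn tens of thousands of pieces.
VOCAB = ["[UNK]", "the", "dog", "bark", "s", "ed", "ing", "loud", "ly"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}

EMBED_DIM = 4
_rng = random.Random(0)  # fixed seed so the table is reproducible
# One vector per vocabulary entry; random here, learned in a real model.
EMBEDDINGS = [[_rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM)] for _ in VOCAB]


def tokenize_word(word):
    """Split one word into known pieces by greedy longest match."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate piece until it is in the vocabulary.
        while end > start and word[start:end] not in TOKEN_TO_ID:
            end -= 1
        if end == start:  # no known piece starts at this position
            return ["[UNK]"]
        pieces.append(word[start:end])
        start = end
    return pieces


def tokenize(text):
    """Lowercase, split on whitespace, then split each word into pieces."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(tokenize_word(word))
    return tokens


def embed(tokens):
    """Look up one vector per token in the embedding table."""
    return [EMBEDDINGS[TOKEN_TO_ID[tok]] for tok in tokens]


tokens = tokenize("The dog barks loudly")
ids = [TOKEN_TO_ID[t] for t in tokens]
vectors = embed(tokens)
print(tokens)  # ['the', 'dog', 'bark', 's', 'loud', 'ly']
print(ids)     # [1, 2, 3, 4, 7, 8]
```

Note that nothing in this path assigns a part of speech or builds a parse tree: the model only ever receives the list of vectors.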

Der_Einzige 1 hour ago [-]

UMAP is far superior to PCA for these kinds of visualizations, and a fast GPU version has been available in cuML for a while.
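For readers unfamiliar with the baseline being compared here, the following is a minimal sketch of a 2-D PCA projection in plain Python. It is an illustration under stated assumptions, not the tool's implementation: the five 3-D points are made-up stand-ins for embeddings, and power iteration with deflation is used only to keep the example dependency-free.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenvector(matrix, iters=500, seed=0):
    """Power iteration: dominant eigenvector of a symmetric PSD matrix."""
    rng = random.Random(seed)
    d = len(matrix)
    v = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # matrix has no variance left in any direction
            break
        v = [x / norm for x in w]
    return v


def pca_2d(rows):
    """Return (projected points, the two principal directions)."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for k in range(2):
        v = top_eigenvector(cov, seed=k)
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        lam = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
        components.append(v)
        # Deflate: remove the variance already explained by v.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(r[j] * c[j] for j in range(d)) for c in components]
                 for r in centered]
    return projected, components


# Hypothetical 3-D "embeddings" that lie roughly along one line.
points = [
    [1.0, 2.1, -0.9],
    [2.0, 3.9, -2.1],
    [3.0, 6.2, -2.8],
    [4.0, 7.8, -4.1],
    [5.0, 10.1, -5.2],
]
projected, components = pca_2d(points)
for p in projected:
    print([round(x, 3) for x in p])
```

PCA is a linear projection onto the directions of greatest variance, which is why it can flatten nonlinear structure that a neighbor-graph method like UMAP tends to preserve. In practice either would come from a library (for example scikit-learn for PCA, umap-learn or cuML for UMAP) rather than hand-rolled code like this.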

andai 4 hours ago [-]

See also: explainer post: https://theperformanceage.com/p/how-language-models-see-you

sherdil2022 4 hours ago [-]

Great job! Do you have any plans to visualize/explain how machine translation - between human languages - works?

jdspiral 3 hours ago [-]

Thanks! Yes — that’s on the roadmap, along with some other cool visualizations I’m working on. Machine translation is definitely something I want to work on: showing how models align meaning across languages using shared embeddings and attention patterns. I’d love to make that interactive too.

sherdil2022 2 hours ago [-]

I would love to get involved with that (I speak a handful of human languages). Let me know if you are looking for collaborators.

Dwedit 3 hours ago [-]

Send tokens to model, model goes brrrr, get output tokens back.


XTXinverseXTY 5 hours ago [-]

[flagged]
