dandelionv1bes

an hour ago
If you’re interested in an easy explainer for backprop, I highly recommend Math for Deep Learning by Kneusel. I finished it recently; you get to apply the chain rule backwards by hand on a tiny NN, and code one too.

helloplanets

2 days ago
For the visual learners, here's a classic intro to how LLMs work: https://bbycroft.net/llm

vivzkestrel

a day ago
- while impressive, it still doesn't tell me why a neural network is architected the way it is, and that, my bois, is where this guy comes in: https://threads.championswimmer.in/p/why-are-neural-networks...

- make a visualization of the article above and it would be the biggest aha moment in tech

stuxnet79

a day ago
Regarding architecture, I don't believe a satisfying "why" is in the cards.

Conceptually neural networks are quite simple. You can think of each neural net as a daisy chain of functions that can be efficiently tuned to fulfill some objective via backpropagation.
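To make the "daisy chain of functions" idea concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than anything from the linked visualization: a two-link scalar network (linear, tanh, linear), a squared-error objective, a toy sin(x) target, and hypothetical helper names (`linear`, `forward`, `grads`, `train`). The backward pass is just the chain rule applied link by link, from the loss back to each parameter.

```python
import math

# Each layer is a plain function; the network is their composition.
def linear(w, b, x):
    return w * x + b

def forward(params, x):
    w1, b1, w2, b2 = params
    z = linear(w1, b1, x)   # first link in the chain
    h = math.tanh(z)        # nonlinearity between the links
    y = linear(w2, b2, h)   # second link
    return z, h, y

def grads(params, x, target):
    """Backprop for one sample with loss = (y - target)^2."""
    w1, b1, w2, b2 = params
    z, h, y = forward(params, x)
    dy = 2.0 * (y - target)     # d(loss)/dy
    dw2, db2 = dy * h, dy       # gradients of the second link
    dh = dy * w2                # pass the gradient back through the second link
    dz = dh * (1.0 - h * h)     # tanh'(z) = 1 - tanh(z)^2
    dw1, db1 = dz * x, dz       # gradients of the first link
    return [dw1, db1, dw2, db2]

def loss(params, data):
    return sum((forward(params, x)[2] - t) ** 2 for x, t in data) / len(data)

def train(params, data, lr=0.1, steps=2000):
    """Plain full-batch gradient descent on the mean squared error."""
    params = list(params)
    for _ in range(steps):
        total = [0.0] * 4
        for x, t in data:
            total = [a + g for a, g in zip(total, grads(params, x, t))]
        params = [p - lr * g / len(data) for p, g in zip(params, total)]
    return params

# Toy objective (an assumption for illustration): fit sin(x) on [-1, 1].
data = [(x / 4.0, math.sin(x / 4.0)) for x in range(-4, 5)]
init = [0.5, 0.0, 0.5, 0.0]
trained = train(init, data)
print(loss(init, data), loss(trained, data))
```

Real networks swap the scalars for matrices and let a framework derive `grads` automatically, but the structure (compose functions forward, apply the chain rule backward, nudge the parameters) is the same.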

Their effectiveness (in the dimensions we care about) is more a consequence of the explosion of compute and data that occurred in the 2010s.

In my view, every hyped architecture was simply what yielded the best accuracy given the compute resources available at the time. It's not a given that these architectures are optimal, and we certainly don't always fully understand why they work. Most of the innovations in this space over the past 15 years have come from private companies that have lacked a strong research focus but are resource rich (endless compute and data capacity).

Lovely visualization. I like the very concrete depiction of middle layers "recognizing features", which makes the whole machine feel more plausible. I'm also a fan of visualizing things, but I think it's important to appreciate that some things (like a 10,000-dimension vector as the input, or even a 100-dimension vector as an output) can't be concretely visualized, and you have to develop intuitions in more roundabout ways.

I hope you make more of these; I'd love to see a transformer presented more clearly.

Super cool visualization. Found this vid by 3Blue1Brown super helpful for visualizing transformers as well: https://www.youtube.com/watch?v=wjZofJX0v4M&t=1198s
This is just scratching the surface -- where neural networks were thirty years ago: https://en.wikipedia.org/wiki/MNIST_database

If you want to understand neural networks, keep going.