mentalgear

a day ago
This is how the future of "AI" has to look: fully traceable inference steps that can be inspected and adjusted if needed.

Without this, I don't see how we (the general population) can maintain any control over, or even understanding of, these increasingly large and opaque LLM-based long-inference "AI" systems.

Without transparency, Big Tech, autocrats and eventually the "AI" itself (whether "self-aware" or not) will do whatever they like with us.

moffkalast

19 hours ago
You've answered your own question as to why many people will want this approach gone entirely.

Imustaskforhelp

11 hours ago
I always really like answers like yours; they're clever and, in my opinion, maybe a bit true as well.

That said, I think there is still a lot the public can do, and raising awareness about this stuff could help too.

turnsout

20 hours ago
I agree transparency is great. But making the response inspectable and adjustable is a huge UI/UX challenge. It's good to see people take a stab at it. I hope there's a lot more iteration in this area, because there's still a long way to go.

If I give you tens of billions of dollars, like, wired to your personal bank account, do you think you could figure it out given a decade or two?

SilverElfin

16 hours ago
At the very least, we need to know what training data goes into each AI model. Maybe there needs to be a third-party company that does audits and provides transparency reports, so that even with proprietary models there are some checks and balances.

Blamklmo

a day ago
I asked it if giraffes were kosher to eat and it told me:

> Giraffes are not kosher because they do not chew their cud, even though they have split hooves. Both requirements must be satisfied for an animal to be permissible.

HN will have removed the extraneous emojis.

This is at odds with my interpretation of giraffe anatomy and behaviour and of Talmudic law.

Luckily old sycophant GPT5.1 agrees with me:

> Yes. They have split hooves and chew cud, so they meet the anatomical criteria. Ritual slaughter is technically feasible though impractical.

silviot

a day ago
I tried the playground at https://playground.allenai.org/ and clicked the "Show OlmoTrace" button.

Above the response it says

> Documents from the training data that have exact text matches with the model response. Powered by infini-gram

so, if I understand correctly, it searches the training data for matches in the LLM output. This is not traceability in my opinion. This is an attempt at guessing.

Checking individual sources, I got texts completely unrelated to the question/answer that happen to share an N-gram [1] (I saw sequences of up to 6 words) with the LLM answer.
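
For illustration, here's roughly what that kind of matching amounts to; a minimal sketch of naive n-gram overlap (infini-gram itself uses suffix-array indexes over the corpus, and the example strings below are made up):

    # Naive sketch of OlmoTrace-style matching: find word n-grams
    # shared between a model answer and a corpus document.
    def shared_ngrams(answer, doc, min_n=6):
        a, d = answer.lower().split(), doc.lower().split()
        doc_grams = {tuple(d[i:i + min_n]) for i in range(len(d) - min_n + 1)}
        return [" ".join(a[i:i + min_n])
                for i in range(len(a) - min_n + 1)
                if tuple(a[i:i + min_n]) in doc_grams]

    answer = "giraffes are not kosher because they do not chew their cud"
    doc = "a page noting that camels chew their cud but lack split hooves"
    print(shared_ngrams(answer, doc, min_n=3))  # -> ['chew their cud']

A short shared phrase is enough to surface a document, which is exactly how you end up with "sources" that have nothing to do with the question.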

I think they're being dishonest in their presentation of what Olmo can and can't do.

[1] https://en.wikipedia.org/wiki/N-gram

nickreese

18 hours ago
I'm just now moving my main workflows off OpenAI to local models, and I'm finding that these smaller models' main failure mode is that they will accept edge cases in pursuit of being helpful.

Especially in extraction tasks, this shows up as inventing data or rationalizing around clear roadblocks.

My biggest hack so far is giving them an out named "edge_case" and telling them it is REALLY helpful if they identify edge cases. Simply renaming "fail_closed" or "dead_end" options to "edge_case" with helpful wording makes Qwen models adhere to their prompting more.
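
To make that concrete, here's a minimal sketch; the schema and field names are my own illustration, not from any particular library:

    # Sketch of the "give the model an out" trick for extraction tasks.
    # Field names are illustrative; adapt to your own tooling.
    extraction_schema = {
        "type": "object",
        "properties": {
            "invoice_total": {"type": "number"},
            "currency": {"type": "string"},
            # The escape hatch, framed as helpful rather than as failure:
            "edge_case": {
                "type": "string",
                "description": "If the document doesn't clearly contain the "
                               "fields above, describe why here instead of "
                               "guessing. Flagging edge cases is REALLY helpful.",
            },
        },
    }

Structurally it's the same as a fail_closed option; the positive framing seems to be what matters.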

It feels like there are hundreds of these small hacks that people must have discovered... why isn't there a centralized place where people record these learnings?

tcsenpai

a day ago
I think they should start aiming for 20B models along with 32B and 7B. Usually 7B is enough for an 8GB GPU, and 32B requires a 24GB GPU for decent quants (I can fit a 32B with IQ3_XXS, but it's not ideal), while 20-ish-B models (such as Magistral or gpt-oss) are a perfect fit for 16GB GPUs.
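
Rough back-of-the-envelope numbers behind that (a sketch; real usage varies with quant format, context length, and KV cache):

    # Very rough VRAM estimate: params * bits/8, plus some overhead.
    def vram_gb(params_b, bits, overhead_gb=1.5):
        return params_b * bits / 8 + overhead_gb

    for name, params in [("7B", 7), ("20B", 20), ("32B", 32)]:
        print(f"{name}: ~{vram_gb(params, 4):.1f} GB at 4-bit, "
              f"~{vram_gb(params, 3):.1f} GB at ~3-bit")

That puts a 4-bit 7B around 5 GB, a 4-bit 20B around 11.5 GB, and a 4-bit 32B around 17.5 GB, which lines up with the 8/16/24 GB tiers.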

robrenaud

13 hours ago
Open source AI is just a lost term. It has been co-opted: if the weights are released, it's called open source. Not because that makes sense, not because it's right, but because that's the unfortunate marketing term that has stuck.

The real open source models need a rebrand. Transparent models, perhaps?

tcsenpai

a day ago
Also: I simply asked the 7B model "hi, who are u":

> Thought for 1 seconds
>
> Okay, the user is asking "Who are u?" with a casual tone ("u" instead of "you", exclamation mark). Let me break this down.
>
> First, I need to address the question itself. They want to know who I am as an AI assistant. My standard response is that I'm DeepSeek R1, designed to be helpful and handle reasoning step by step. But maybe they're not familiar with AI terms in general. The "u" part might just be their way of speaking, so I should reply politely using "you" form.
>
> Next, considering the context. They started with an HTML tag " "

And it hung there without a reply. It might be an OpenWebUI bug, but it's not promising.