I can’t narrow the error band enough, given our inability to tune the scale of the response, the sequence of reasoning, or the order of alignment.
So if someone demos such a thing, they will seek to falsify what exists rather than interpret it as limited by the constraints on our control. (That’s my fear.)
Secondly, I have more insight recently into their decision processes, and I can see that while we WANT to limit ourselves to the production of training material, it’s possible and evident that they want to see the full implementation, which is exactly what I was trying to avoid getting into the business of producing.
I don’t want to own servers, code, releases, or customers for that matter. It’s ‘plumbing’. The existing teams at the LLM producers are already masters of plumbing; it’s closure and computability they haven’t mastered. So I see duplication of effort and spend as unnecessary. I’d prefer to devote all our time to the production of the tuning and not of the code and hosting.
So I thought I could get away with a shallow implementation using the documents (RAG) and protocols (prompts). And for basic questions, yes. But can I get to demonstrating auditability in liability sectors such that someone who doesn’t understand our work can see it’s a matter of precision from training (tuning)?
I’m not sure, and I don’t like pitching a deal where I can’t prove my words.
And yes, I’m just thinking and stressing out loud, because I’m disappointed that I have to do more work when I’d like to move back to producing the training materials and the books, getting the legal nonsense all done, etc.
Source date (UTC): 2025-09-15 18:00:18 UTC
Original post: https://twitter.com/i/web/status/1967649694307471623