It's really hard to distinguish this from satire, because it's so detached from reality. I deduce it's not satire from the other posts here.
Labeling 300+ million people as "the good guys", grouping them by nation (I assume by "Americans" you mean US American citizens and not, for example, Mexicans?), but then trying to detach a nation from its politics is wild, and the notion of "they are the good guys even when they do terrible things" is some weird circular or contradictory argument (depending on how one wants to play that).
From what I've heard, personas give a greater chance that the LLM will answer confidently, and also a greater chance it'll hallucinate something when the data is sparse. Supposedly "grounding" the personas in real documents/web searches is the best approach. Anecdotal though.
Instead of overwhelming development teams with bug reports, people should be offering fixes.
Months ago I started designing a system that, like those of other companies, finds critical security bugs in software developed at Apple, but instead of sending bug reports and stopping there, the system also offers pull requests to fix said issues. The acceptance rate is 94% as I write this; the remaining merge requests failed only at identifying kill switches that some projects use and that weren’t documented beforehand.
Development teams are happier with this approach than with plain bug reports, and I think every security company in the world should adopt something similar.
I think we should see this as simply silly behavior by a government. ... We saw some of this behavior in the last administration too.
So it's silly behavior, as typified by the last decade of American governance? Is there "serious" American leadership we should be expecting to see soon, e.g. 2029 AOC elected on a platform of unlimited 10GW datacenters and universal basic Mythos 8 models?
It may seem subjectively silly to you, but e.g. getting executed for refusing to point at a deer and call it a horse is pretty silly stuff as well, at least for those not living in the Qin Dynasty.
>But when people think of decentralized training, they don’t first think of gigantic datacenters, owned by the same company, training models across large distances. Instead, they imagine thousands of small datacenters, or individual consumers, pooling their spare compute over the internet to orchestrate a training run larger than any single actor could manage alone.
Many companies are pursuing this vision: Pluralis Research, Prime Intellect and Nous Research have already successfully decentrally trained models at scale. But in practice, training decentrally over the internet has lagged far behind more centralized training. Even their largest models (Pluralis’ 8B Protocol Model, Prime Intellect’s INTELLECT-1, and Nous’ Consilience 40B) have been trained with 1,000x less compute than today’s frontier models (such as xAI’s Grok 4).
https://epoch.ai/gradient-updates/how-far-can-decentralized-...
Most startups get a lot of warning before being sued, and the worst that happens to most that do get sued is that they have to shut down. Sure, if you piss off Disney they may come after you, but they'll send a cease and desist before they sue.
If you're successful enough to get noticed and sued, you're also likely to be successful enough to get a lawyer.
I'd recommend you launch. Learn everything you can about your product idea, your approach to thinking. Experiment and figure out what makes people fall off, what they really want and don't want. The vast majority of projects aren't financially successful but you DO learn a lot by shipping and getting real behavior.
My first thought is that this government-Anthropic feud is good publicity for both of them.
- Anthropic is seen as a victim/hero
- They get government-endorsed model hype
- The government gets to appear like they're ahead of the curve
- The government gets to appear forceful and weapons-conscious (and maybe earn some right-wing points)
I’ve been doing pentesting with LLMs for a while and only hit a few “nope I won’t do that” and one “this conversation is flagged for being against the TOS”. No idea what the guardrails are, but they are trivially bypassed.
You’re completely overrating these benchmarks and it’s landing you at a nonsense opinion. Just actually use the models and you will see that the gap is significant.
Anthropic seems f’ed and not f’ed all at the same time. All this will come to light, but the hunch from the model reviews is that it’s all PR. Their model isn’t even close to good enough for the govt to shit their pants over.
It's literally just writing a spec.md and reviewing it in a loop, fanning out to many agents using "reviewer -> [findings] -> validator -> judge (on conflict)" passes. Before I had it collect a kernel facts document from sources and a bunch of other stuff using the same kind of loop. It's got all it needs.
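For anyone who wants the shape of that loop in code, here is a rough sketch. Every name in it is hypothetical (`review_pass`, `review_loop`, the toy agents), and the agents are plain functions standing in for LLM calls; it only shows the "reviewer -> [findings] -> validator -> judge (on conflict)" control flow, not how the actual setup is implemented.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Finding:
    reviewer: str
    text: str


# Hypothetical agent signatures; in a real setup each would wrap an LLM call.
Reviewer = Callable[[str], List[str]]                # spec -> finding texts
Validator = Callable[[str, Finding], bool]           # spec, finding -> valid?
Judge = Callable[[str, Finding, List[bool]], bool]   # called only on split votes
Reviser = Callable[[str, List[Finding]], str]        # spec, findings -> new spec


def review_pass(spec: str, reviewers: Dict[str, Reviewer],
                validators: List[Validator], judge: Judge) -> List[Finding]:
    """One pass: fan out to reviewers, validate each finding, judge on conflict."""
    findings = [Finding(name, text)
                for name, review in reviewers.items()
                for text in review(spec)]
    accepted = []
    for finding in findings:
        votes = [validate(spec, finding) for validate in validators]
        if all(votes):
            accepted.append(finding)
        elif any(votes) and judge(spec, finding, votes):
            # Validators disagreed, so the judge broke the tie in favor.
            accepted.append(finding)
    return accepted


def review_loop(spec: str, reviewers: Dict[str, Reviewer],
                validators: List[Validator], judge: Judge,
                revise: Reviser, max_rounds: int = 5) -> str:
    """Repeat review passes until nothing is accepted or the round budget ends."""
    for _ in range(max_rounds):
        accepted = review_pass(spec, reviewers, validators, judge)
        if not accepted:
            break
        spec = revise(spec, accepted)
    return spec


# Toy agents so the sketch runs without any model behind it.
def todo_reviewer(spec: str) -> List[str]:
    return [line for line in spec.splitlines() if "TODO" in line]


def in_spec_validator(spec: str, finding: Finding) -> bool:
    return finding.text in spec


def strict_validator(spec: str, finding: Finding) -> bool:
    return finding.text.strip().startswith("TODO")


def drop_flagged_lines(spec: str, findings: List[Finding]) -> str:
    flagged = {f.text for f in findings}
    return "\n".join(line for line in spec.splitlines() if line not in flagged)
```

The judge only runs when the validators split, which keeps the expensive tie-break call off the common path; that is an assumption about why one would structure it this way, not something stated above.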
Also I'm doing this because I find it amusing and somewhat educational on a meta level. If I'd written this myself without a spec it would've been done last month and likely been more correct than what Claude is likely to do once it gets to implementing it (again, the first spec-free attempt failed miserably). This is way too complex an integration for the poor thing. I had some hopes Fable would get it unstuck, but now we'll never know.
But to answer your question, there is something less nuclear. You can cycle multiple modes with SHIFT+TAB.
Could it be done by making a sparse MoE of thousands, or tens of thousands, of smaller experts in very niche domains? Maybe a tree-like structure of experts which can delegate from relatively general but inaccurate to extremely niche but accurate? Also, could these experts be plug-and-play, so you can easily swap out an inferior expert for a stronger one in the future without having to redo the whole pile?
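To make the tree idea concrete, here is a toy sketch of the delegation and swap logic. It is not a real sparse MoE (those route per token inside one network with a learned gate); the keyword overlap in `score` is a stand-in for a learned router, and `ExpertNode`, `route`, `swap_expert` and `build_demo_tree` are all made-up names for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Set, Tuple


@dataclass
class ExpertNode:
    name: str
    keywords: Set[str]                 # stand-in for a learned gating function
    answer: Callable[[str], str]       # the expert itself; replaceable in place
    children: List["ExpertNode"] = field(default_factory=list)

    def score(self, query: str) -> int:
        # Count how many of this expert's keywords appear in the query.
        return len(self.keywords & set(query.lower().split()))


def route(root: ExpertNode, query: str) -> Tuple[List[str], ExpertNode]:
    """Descend from general to niche; stop when no child matches the query."""
    node, path = root, [root.name]
    while node.children:
        best = max(node.children, key=lambda child: child.score(query))
        if best.score(query) == 0:
            break  # no niche expert fits, so the more general one answers
        node = best
        path.append(node.name)
    return path, node


def swap_expert(root: ExpertNode, name: str,
                new_answer: Callable[[str], str]) -> bool:
    """Replace one expert's answer function without touching the rest of the tree."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.name == name:
            node.answer = new_answer
            return True
        stack.extend(node.children)
    return False


def build_demo_tree() -> ExpertNode:
    quantum = ExpertNode("quantum", {"quantum", "qubit"}, lambda q: "quantum-v1")
    organic = ExpertNode("organic", {"molecule", "carbon"}, lambda q: "organic-v1")
    science = ExpertNode("science", {"physics", "chemistry", "quantum", "molecule"},
                         lambda q: "science-v1", [quantum, organic])
    law = ExpertNode("law", {"contract", "tort"}, lambda q: "law-v1")
    return ExpertNode("general", set(), lambda q: "general-v1", [science, law])
```

The swap works here only because each expert sits behind a fixed interface (query in, answer out). Whether separately trained experts could share such an interface inside one trained model is exactly the open question.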
As a non-US person, I will use whatever is the best and reasonably priced. I could not care one iota about who makes or hosts these models. The origin or political leanings of these models mean nothing in my usage calculus.
As long as these models require a lot of computing power, the best models, open source or not, will be served by corporations who can afford the infra.