Hacker News | sinuhe69's comments

IMO, using AI to map each keyword to a broader group of strictly synonymous keywords would make the comparison much more helpful.

In general we care more about the trend of a category than of a single word, so a query for “auto pilot”, for example, should also include “self driving”, FSD, etc.


I would not like this. This is the kind of change that made Google search so annoying. (E.g., what if I want to track the history of 'self-driving' vs 'auto pilot' in sales pitches? Or more basically, what if the system interprets me wrongly?) Better to support | or similar old-fashioned search engine syntax and dwis and not dwim.

Synonym functionality is good as long as there's an easy way to disable it, either globally or by wrapping the term in quotes.
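
A minimal sketch of how these suggestions could fit together, assuming a hand-written synonym table: unquoted terms expand to their group, quoted terms are matched literally, `|` is an explicit OR, and a global switch turns synonyms off. The names `SYNONYM_GROUPS`, `expand_term` and `expand_query` are hypothetical and are not taken from the tool under discussion.

```python
# Hypothetical synonym groups; a real system would have to curate or learn these.
SYNONYM_GROUPS = {
    "autonomous driving": {"auto pilot", "autopilot", "self driving", "self-driving", "fsd"},
}


def expand_term(term, synonyms_enabled=True):
    """Return the set of keywords that one query term should match."""
    term = term.strip()
    # A term wrapped in double quotes is always matched literally.
    if len(term) >= 2 and term[0] == '"' and term[-1] == '"':
        return {term[1:-1].lower()}
    key = term.lower()
    if synonyms_enabled:
        for members in SYNONYM_GROUPS.values():
            if key in members:
                return set(members)
    return {key}


def expand_query(query, synonyms_enabled=True):
    """Split the query on '|' (explicit OR) and expand each alternative."""
    keywords = set()
    for part in query.split("|"):
        if part.strip():
            keywords |= expand_term(part, synonyms_enabled)
    return keywords
```

With this sketch, `expand_query('auto pilot')` returns the whole group, `expand_query('"auto pilot"')` returns only the literal term, and `expand_query('self-driving | FSD', synonyms_enabled=False)` returns exactly the two terms that were typed.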

Is moving many drones in formation really that difficult? I’ve seen drones fly in formation all the time for drone shows and fireworks. Hundreds and even thousands of them, and most likely they are not remote-controlled but programmed to do so.

Why was the account painted as something extraordinary?


It’s not extraordinary from a technological standpoint, it’s extraordinary as a military tactic. This has historically not been how drones are used in wartime.

I think the takeaway is that swarming behavior is not difficult. It's relatively simple, it's cheap, and it's going to grow in effectiveness and smaller actors are going to gain access to it. That's going to be a problem for war-makers who don't take it into account. I think the pilot expressed awe at it because it was part of what beat him.

Beyond the shape they produce, I assume that drones for entertainment are not the same as military ones.

You forgot that the SAT requirement is not exclusive but an additional data point. While I agree that it could narrow the path for truly good employees, I’d argue that an additional data point like the SAT (+ GPA) could tell the employer a lot about an applicant's consistency. Or at least provide an interesting talking point (“I see you got a very high SAT score but your GPA was lower, what happened?”), if they care.

I think it could serve the purpose of hiring fresh/young graduates. However, it’s still weird to request it from people who have already been in the industry for 5-10 years or more.


Search and seize your laptops and personal phones without probable cause or a warrant.

Compel you to reveal your secrets, including your passwords, by threatening to arrest and detain you without legal proceedings for an unspecified period.

Deny your basic human rights, particularly at the borders, especially if you aren’t a citizen.

And more.


5% is a very low probability of getting a hit. I tried ChatGPT, Gemini, Claude and Grok, but they all answered correctly :’(

Too sad, I wanted to have some fun.


Doing 50 API queries of this length to Sonnet is not that expensive.

How is it price dumping if they give you the model to run for free on your own hardware? By that logic, isn't all OSS price dumping that should be blocked as well? I remember Steve Ballmer once called for that!

I’m pretty sure the digital lords like that proposal a lot. The serfs themselves, not so much.


First of all, you are assuming I'm saying they should block it. I'm not.

Secondly, releasing the model weights for free and selling hosted inference are completely different markets. Open source itself is obviously not price dumping. But a hosted API can still be dumped if it is sold below cost to buy market share or squeeze competitors. Which it is.


Not just Sanders; Trump has also expressed his wish to exchange concessions and privileges for shares in AI labs. Which is IMO even more problematic.

These games are so far outside the normal training corpus and purpose of the AI that I think different prompts could bring vastly different results.

Too bad the author didn’t leave the playground open for anyone to try their hand at it.

Yes, it’s fun and it could justify the conclusion “each model for its task”. But aren't coding benchmarks designed for the same purpose? The current benchmarks are certainly not perfect, and hyper-tuning for the tests can always happen. However, I don’t think a battle royale result can tell much about coding performance or how helpful the AI could be for me in my daily work.


I get what you mean. But for many people, AI coding is not about solving complex problems; they mostly do that themselves. For many, AI coding is a productivity tool that helps with mundane but laborious tasks.

In my setup, I use a daily workhorse model for such things. It should be fast, cheap and work reasonably well. I don’t expect it to be smart, but I need it to follow instructions perfectly and handle tool calling well.

For architectural work or debugging help, I use the top models instead.

That works reasonably well for me at a low cost.


The recent incident with the Rio 3.5 model clearly shows that many coding models are specifically trained/fine-tuned for the benchmarks.

That's what I thought
