ai benchmark - Search News

5d on MSN

These researchers used NPR Sunday Puzzle questions to benchmark AI ‘reasoning’ models

Researchers used questions from the NPR Sunday Puzzle challenge to build a benchmark to test AI 'reasoning' models.

14h

HackerRank Introduces New Benchmark to Assess Advanced AI Models

The ASTRA Benchmark consists of multi-file, project-based problems designed to mimic real-world coding tasks. The intent of the HackerRank ASTRA Benchmark is to determine the correctness and ...

2d on MSN

Nvidia and AMD trade blows over who is faster on DeepSeek AI benchmarks, so which team is telling the truth?

AMD claims its RX 7900 XTX runs DeepSeek R1 faster than Nvidia’s RTX 4090, 4080 Super, but Nvidia says the opposite is true.

The News International 3d

AI domination

According to a report by the International AI Benchmark Consortium, DeepSeek outperformed its American counterparts in language processing and data analysis by 15 percent. Its multilingual ...

Yahoo Finance 5d

These researchers used NPR Sunday Puzzle questions to benchmark AI 'reasoning' models

and startup Cursor created an AI benchmark using riddles from Sunday Puzzle episodes. The team says their test uncovered surprising insights, like that reasoning models — OpenAI's o1 ...

Yahoo News Singapore 2d

Researchers Replicate OpenAI's Hot New AI Tool in 24 Hours

It's not perfect quite yet, it's worth pointing out. Hugging Face's Open Deep Research scored a 55.15 percent accuracy on a benchmark called General AI Assistants, while OpenAI's version scored 67.36, ...

Yahoo News Australia 2d

Researchers Replicate OpenAI's Hot New AI Tool in 24 Hours

Hugging Face's Open Deep Research scored a 55.15 percent accuracy on a benchmark called General AI Assistants, while OpenAI's version scored 67.36, leaving some room for improvement. (OpenAI's ...
