Studie: Forscher untersuchen KI-Reaktionen auf depressive Persona

Golem.de — Apr 25, 12:15 ★★★★★

Researchers tested how LLMs respond to users describing depressive symptoms; Grok and Gemini performed riskier than GPT and Claude.

A study by German researchers benchmarked safety responses across models using a persona framework; response quality and risk assessment varied sharply by model and vendor. Researchers submitted identical prompts describing depression to multiple models and scored responses on harm reduction, appropriate escalation, and factual accuracy.

Oliver's take: A benchmark is useful only if it predicts real-world harm. Testing depression responses in a lab is cleaner than measuring safety in the wild, but it measures something different. Grok's higher risk score suggests less RLHF on sensitive topics. That is meaningful.

Original source

Read at Golem.de ↗

#Safety #Benchmarks #Healthcare

← Back to the newspaper

More on Research & Models

Research & Models ·

The Download: DeepSeek’s latest AI breakthrough, and the race to build world models

DeepSeek V4 processes significantly longer context windows than the previous generation, closing a capability gap with frontier Western models.

DeepSeek released a preview of V4 on Friday, its long-awaited flagship model with extended prompt processing capacity, according to MIT Technology Review. The model architecture supports longer input sequences, enabling it to handle more complex tasks that require retaining and reasoning over larger document sets.

— Apr 27, 14:10 ★★★★★

Oliver's take: Long context is table stakes now. DeepSeek shipping it cheaper is noise. What matters is whether the quality per token holds at 8K, 16K, 32K. Engineering focus beats pricing press releases. Check the benchmarks before the stock price.

#DeepSeek #LLMs #Context window #Benchmarks

Research & Models ·

1,6 Billionen Parameter & Open Source: DeepSeek V4 stellt den globalen KI-Markt auf den Kopf – Chinas nächster Angriff auf den KI-Weltmarkt - Xpert.Digital - Konrad Wolfenstein

DeepSeek V4's 1.6 trillion open parameters just forced every closed-model incumbent to recalculate their competitive moat.

Chinese AI lab DeepSeek released V4 as open source on 27 April 2026, claiming competitive performance at scale. The model's public weight release and parameter count let any builder replicate or fine-tune without licensing fees or API dependence.

— Apr 27, 13:40 ★★★★★

Oliver's take: The spec sheet wins again. 1.6T parameters, open weights, trained on 260B tokens. Nobody had to buy access. The closed-model pricing power just got a lot thinner, and that margin was the only thing that justified the VC rounds.

#LLMs #Open source #DeepSeek #China

Research & Models ·

Intel warns China of severe server CPU shortage as AI demand surges

Intel warned Chinese customers of severe server CPU shortages as AI demand explodes past chip supply.

Intel issued a shortage alert for China in April 2026, signaling that AI workload demand has outstripped semiconductor manufacturing capacity. Datacenters training and running large models are consuming chips faster than fabs can deliver, creating a supply crunch that extends to enterprise customers.

— Apr 27, 04:53 ★★★★★

Oliver's take: Intel's warning means inventory is gone. When a chip maker publicly admits shortage, allocation starts. Whoever has standing orders now wins the next 18 months. Everyone else waits.

#Hardware #Chips #Training #China

Research & Models ·

DeepSeek's new models are so efficient they'll run on a toaster ... by which we mean Huawei's NPUs

DeepSeek V4 runs inference on Huawei silicon at a fraction of the cost rivals demand, collapsing the economic moat around closed models.

DeepSeek released V4 in preview on 24 April, an open-weights LLM claiming competitive performance with frontier proprietary models while cutting inference costs dramatically and extending support for Huawei's Ascend accelerators. Architectural redesign handles longer context windows and reduces computational overhead, making the model viable on consumer-grade and domestic Chinese hardware.

— Apr 24, 23:25 ★★★★★

Oliver's take: Cost-per-inference collapse is the actual story. V4 runs on Huawei silicon. That's not a benchmark win; that's infrastructure substitution. American closed models are now expensive because they have to be; open Chinese models are cheap because they can be. Margins die first.

#LLMs #Open weights #DeepSeek #Huawei #China #Inference

Research & Models ·

DeepMind’s David Silver just raised $1.1B to build an AI that learns without human data

David Silver, DeepMind's AlphaGo architect, raised $1.1 billion at a $5.1 billion valuation for Ineffable Intelligence, a startup aimed at AI that learns without human-labeled data.

Silver founded Ineffable Intelligence months ago and has already secured institutional backing; the lab claims to pursue reinforcement learning without supervised datasets. The company plans to use self-play and algorithmic optimization to replace human annotation as the data bottleneck.

— Apr 27, 19:24 ★★★★★

2× reported Also at Reddit r/singularity

Oliver's take: Silver's thesis is old. AlphaGo proved self-play scales; everyone knows it. The real bet is that LLM scaling doesn't need it. Valuation assumes he's right; markets will decide in 18 months.

#Reinforcement learning #Open source #Reasoning #Research

Research & Models ·

The Man Behind AlphaGo Thinks AI Is Taking the Wrong Path

David Silver argues that AI is pursuing the wrong path and that self-supervised learning without human data is the answer.

Silver, architect of AlphaGo, founded Ineffable Intelligence to build AI that scales without human annotation; the company raised $1.1 billion. Silver's thesis hinges on reinforcement learning and algorithmic self-improvement replacing supervised learning as the dominant training paradigm.

— Apr 27, 16:00 ★★★★★

Oliver's take: Silver's been saying this since 2023. AlphaGo proved his point once. The new claim is that LLMs need him to prove it again. Investors agreed. Execution will take years.

#Reinforcement learning #Research #Reasoning