

DeepSeek R1's recipe to replicate o1 and the future of reasoning LMs


R1 uses a training method called direct reinforcement learning, which forgoes labelled data and explicit worked solutions. Instead, the model explores various approaches and generates multiple candidate answers, which are grouped and scored with a reward. This score acts as a fitness function, letting the model adjust its strategies over time: approaches that score well are reinforced, so R1 progressively improves its problem-solving ability. The process is similar to how humans learn to solve problems through trial and error.
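The group-and-score loop described in this post can be illustrated with a toy sketch. This is not DeepSeek's implementation: the "model" here is just a softmax over four hypothetical strategies, and `toy_reward`, `group_advantages`, and `train` are names made up for illustration. It assumes each answer is judged relative to the other answers in its own group, and that answers scoring above the group average are made more likely.

```python
import math
import random


def softmax(logits):
    """Turn raw preference scores into sampling probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def group_advantages(rewards):
    """Score each answer relative to its own group: (r - mean) / std."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    if std == 0.0:
        # Every answer scored the same, so there is nothing to learn from.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]


def train(reward_fn, n_strategies, steps=200, group_size=8, lr=0.1, seed=0):
    """Toy policy-gradient loop over a handful of discrete strategies."""
    rng = random.Random(seed)
    logits = [0.0] * n_strategies
    for _ in range(steps):
        probs = softmax(logits)
        # Sample a group of attempts from the current policy.
        group = rng.choices(range(n_strategies), weights=probs, k=group_size)
        rewards = [reward_fn(s) for s in group]
        advantages = group_advantages(rewards)
        for s, adv in zip(group, advantages):
            for j in range(n_strategies):
                # Gradient of log softmax w.r.t. logit j: onehot(s)[j] - probs[j]
                grad = (1.0 if j == s else 0.0) - probs[j]
                logits[j] += lr * adv * grad / group_size
    return softmax(logits)


def toy_reward(strategy):
    """Hypothetical reward: only strategy 2 'solves' the problem."""
    return 1.0 if strategy == 2 else 0.0


final_probs = train(toy_reward, n_strategies=4)
print(final_probs)
```

No labelled solution is ever shown to the policy; it only sees which of its own attempts scored better than the rest of the group, which is the trial-and-error idea from the post.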



“Failure of the Israeli army”: Chief of General Staff Halevi resigns exxpress.at/politik/versagen-d… More than 15 months after the start of the Gaza war, Israeli Chief of General Staff Herzi Halevi has announced his resignation. The move is set to take effect on March 6. #news #press



I just unsubscribed from #TheContrarian (contrarian.substack.com/), #JenniferRubin and #NormEisen's new anti-Trump endeavor, after subscribing the day it launched.
I gave it long enough to be certain that it's neoliberal garbage through and through. Just a bunch of status-quoniks who aspire for things to be how they were before Trump. As if that will solve anything. Clueless idiots, the lot of them. Or maybe just clickbait grifters. (Why not both?)
#journalism #neoliberalism

Dan Gillmor reshared this.

in reply to Jonathan Kamens

That seems pretty harsh. I find it better than a WaPo subscription.
contrarian.substack.com/p/trum…
in reply to John Breen

@jab01701mid
Condemning Trump and Trumpism is a low bar. There are plenty of people I can read who are doing that who are far more well-informed and informative than anyone at The Contrarian.
To be any use whatsoever, people need to go farther than that. They need to actually be advocating for progressivism, not just a return to the time before Trump.
The time before Trump is what got us Trump.
Rubin will never do that. She still thinks the GOP was fine before Trump came along and ruined it.


Imagine there's an election and nobody goes tichyseinblick.de/daili-es-sen… Currently, 28 percent of eligible voters plan not to take part in the Bundestag election on February 23, according to a survey by the Forsa institute. At the beginning of December the figure was still 22 percent. That is an increase of more than a quarter over the holidays. “That is atypical and shows how unsettled people are, that they no longer know,
The






Vietnam, Belarus to waive visa requirements for citizens starting Jan 30 byteseu.com/673652/ #Belarus #BelarusToWaiveVisaRequirementsForCitizensStartingJan30VnExpressInternational #Vietnam


Fire inferno: At least 66 dead in hotel fire at Turkish ski resort de.rt.com/gesellschaft/233836-… The fire in the Kartalkaya ski resort has left Turkey in a state of shock. Over the course of the day the death toll rose from ten to more than 60. More than 50 people were injured. The cause of the inferno in the eleven-story hotel remains unclear. #news #press


Donald Trump just pardoned and set free a man serving life, with no chance of parole!

Ross Ulbricht, aka Dread Pirate Roberts, sentenced to life in federal prison for creating, operating ‘Silk Road’ website

More than $200 million in illegal drugs and other illicit goods were bought and sold on the website

Ulbricht was a drug dealer and criminal profiteer who contributed to the deaths of at least six young people.

#AureFreePress #News #press #breaking #breakingnews

ice.gov/news/releases/ross-ulb…

in reply to Aure Free Press

I have a serious question: what if this man starts another drug trafficking website? Could he receive such a sentence again?


Ukraine’s General Staff launches investigation into 156th Brigade byteseu.com/673650/ #Ukraine


Science doesn't lie, unless it DOES

Research to Ruin: The Worsening Spectre of Academic Fraud

"Universities’ willingness to tolerate sloppy or phony research in order to protect their own reputations is largely to blame for the “replication crisis” currently undermining public confidence in science generally. Very briefly, replication is a cornerstone of the scientific method because it allows the validity of experiments to be independently confirmed, in turn strengthening the underlying hypothesis. But more and more reported results from experiments in the natural, medical and social sciences cannot be replicated by other scientists. The ongoing replication crisis, now well into its second decade, furnishes further indirect evidence that a shocking amount of scientific research is shoddy or outright bogus."

c2cjournal.ca/2025/01/research…



I'm looking for a venue to post my astrophotography instead of Instagram because I don't want to support misinformation and sites that streamline radicalization. Not sure if Mastodon is the place yet. But here's a smart telescope image of the Orion Nebula (Messier 42).
in reply to GQ Martinez

And another one with the same smart telescope (an inherited Vaonis Stellina), this time a mosaic to increase the field of view. The previous image was 15 minutes of exposure time; this one is about an hour and 10 minutes. A mosaic means less integration time on each section, so you can see slightly less detail in the Orion Nebula even though I imaged it for longer. The nebula to the left is barely noticeable.
in reply to GQ Martinez

A month after those last images, I had a single clear night to try one of my wide-field imaging setups (AT60ED scope with ASI2600MC Duo camera) on my Celestron CGX mount. I did a two-panel mosaic to get the Orion and Running Man nebulas (one panel) in the same image as the Horsehead Nebula (second panel). I've been wanting this image for years, but weather and location have prevented it from happening. Not my best image, but I'm excited to finally get these together. #astrophotography


Freedom of expression does not come filtered ansage.org/meinungsfreiheit-gi… Speech bans and language control: tools for the intellectual amputation of a people (symbolic image: Imago) Hesse's Interior Minister Roman Poseck (CDU) complains that “unfiltered opinions, including deliberately false news” accumulate on social media and that disinformation is being spread by Big Tech from abroad and via artificial intelligence (AI). These, for him, “unwelcome”




Jellyfin Server/Web release: 10.10.4


https://forum.jellyfin.org/t-new-jellyfin-server-web-release-10-10-4



Trump pardons dark web marketplace creator Ross Ulbricht


Ulbricht operated the anonymous digital marketplace known as Silk Road when law enforcement arrested him. The pardon fulfills a campaign pledge Trump made to Ulbricht's Libertarian supporters.

#news #npr #publicradio #usa
posted by pod_feeder_v2