I'm seeing people on orange forum confirming that they did indeed see the sourcemap posted on npm before the version was yanked, so I am inclined to believe "real." Someone can do some kind of structural ast comparison or whatever you call it to validate that the decompiled source map matches the obfuscated release version, but that's not gonna be how I spend my day news.ycombinator.com/item?id=4…
this is honestly one point of the LLM grift people don't really focus on
like, right now, people think that these LLMs are good because they respond within reasonable timescales. "Claude might spend ten hours on a problem, which is around how much time I might have spent on it!"
except Claude didn't take ten hours. Claude is multiple concurrent processors running at once. you have to multiply the number of hours by the number of running processors to be reasonable.
even if we pretend Claude isn't fanning out to multiple servers, even like, a single GPU is hundreds of processing units. so now your ten hours is thousands of hours of time spent on the problem
and like, brute-forcing shit takes time. that's why it's like this, and they just hide it behind multiple computers and pretend that it's not brute-forcing and just thinking
and also to cover the obvious refutation of this, "doesn't this mean that shit like video games running at 60fps is a lie too"
yes
except the GPU is also sleeping a lot of that time and most of it is spent sending shit back and forth between the GPU and the CPU and waiting for it to show up. you actually have way less time per frame for the GPU to process everything and are only using a small portion of each processor's hardware, which further reduces the amount of energy spent on the problem
LLMs are like, full-ass GPUs running nonstop for several hours
FFS the abliteration technique actually lets you remove the neural pathways for a given class of query and none of the clownish "AI Companies" seem to use it because they're all asshat true believers and they're probably worried they'll upset Fing Roko's Basilisk or whatever
It's just a juiced up chatbot you numbnuts muppets, use the techniques we know work
Or yunno just liquidate the company and do something socially useful
My schaden is nicely freuded after seeing both this code dump and the fustercluck where people on Anthropic’s $100/$200 monthly plans are blowing through their 5-hour and weekly token allotments in no time flat.
Linked Reddit thread has numerous examples of pissed-off users, my favorite so far being the person who blew through the 5-hour quota trying to get Claude to realize that the 24th of March this year was not, in fact, a Monday. reddit.com/r/ClaudeAI/comments…
The media in this post is not displayed to visitors. To view it, please go to the original post.
reminder that anthropic ran (and is still running) an ENTIRE AD CAMPAIGN around "Claude code is written with claude code" and after the source was leaked that has got to be the funniest self-own in the history of advertising because OH BOY IT SHOWS.
it's hard to get across in microblogging format just how big of a dumpster fire this thing is, because what it "looks like" is "everything is done a dozen times in a dozen different ways, and everything is just sort of jammed in anywhere. to the degree there is any kind of coherent structure like 'tools' and 'agents' and whatnot, it's entirely undercut by how the entire rest of the code might have written in some special condition that completely changes how any such thing might work." I have read a lot of unrefined, straight from the LLM code, and Claude code is a masterclass in exactly what you get when you do that - an incomprehensible mess.
(I need to go do my actual job now, but I'll be back tonight with an actual IDE instead of just scrolling, jaw agape, on my phone, seeing the absolute dogshit salad that was the product of enough wealth to meet some large proportion of all real human needs, globally.)
The media in this post is not displayed to visitors. To view it, please go to the original post.
There is a lot of clientside behavior gated behind the environment variable USER_TYPE=ant that seems to be read directly off the node env var accessor. No idea how much of that would be serverside verified but boy is that sloppy. They are often labeled in comments as "anthropic only" or "internal only," so the intention to gate from external users is clear lol
Worrying about how many "please" statements you need for your code to be polite enough to work is literally an old joke about intentionally bad design from Malbolge.
"So the reason that Claude code is capable of outputting valid json is because if the prompt text suggests it should be JSON then it enters a special loop in the main query engine that just validates it against JSON schema for JSON and then feeds the data with the error message back into itself until it is valid JSON or a retry limit is reached."
This is how LLMs do *anything* even remotely approaching accurate or reliable work. They offload it to an actual deterministic algorithm. The man behind the curtain. It's frustrating how people don't seem to realize this, I hope it becomes more common knowledge!
@jonny@neuromatch.social it's also in the HN thread (much earlier today, probably scrolled away by now) 😂 --> https://github.com/chatgptprojects/claude-code/blob/642c7f944bbe5f7e57c05d756ab7fa7c9c5035cc/src/utils/userPromptKeywords.ts#L8
@sushee dingus: let's use some regex to check if people are mad agent: great idea boss! it's not just shipping fast -- it's also the best way to do that
The media in this post is not displayed to visitors. To view it, please go to the original post.
OK i can't focus on work and keep looking at this repo.
So after every "subagent" runs, claude code creates another "agent" to check on whether the first "agent" did the thing it was supposed to. I don't know about you but i smell a bit of a problem, if you can't trust whether one "agent" with a very big fancy model did something, how in the fuck are you supposed to trust another "agent" running on the smallest crappiest model?
That's not the funny part, that's obvious and fundamental to the entire show here. HOWEVER RECALL the above JSON Schema Verification thing that is unconditionally added onto the end of every round of LLM calls. the mechanism for adding that hook is... JUST FUCKING ASKING THE MODEL TO CALL THAT TOOL. second pic is registering a hook s.t. "after some stop state happens, if there isn't a message indicating that we have successfully called the JSON validation thing, prompt the model saying "you must call the json validation thing"
... Show more...
OK i can't focus on work and keep looking at this repo.
So after every "subagent" runs, claude code creates another "agent" to check on whether the first "agent" did the thing it was supposed to. I don't know about you but i smell a bit of a problem, if you can't trust whether one "agent" with a very big fancy model did something, how in the fuck are you supposed to trust another "agent" running on the smallest crappiest model?
That's not the funny part, that's obvious and fundamental to the entire show here. HOWEVER RECALL the above JSON Schema Verification thing that is unconditionally added onto the end of every round of LLM calls. the mechanism for adding that hook is... JUST FUCKING ASKING THE MODEL TO CALL THAT TOOL. second pic is registering a hook s.t. "after some stop state happens, if there isn't a message indicating that we have successfully called the JSON validation thing, prompt the model saying "you must call the json validation thing"
this shit sucks so bad they can't even CALL THEIR OWN CODE FROM INSIDE THEIR OWN CODE.
Look at the comment on pic 3 - "e.g. agent finished without calling structured output tool" - that's common enough that they have a whole goddamn error category for it, and the way it's handled is by just pretending the job was cancelled and nothing happened.
So the reason that Claude code is capable of outputting valid json is because if the prompt text suggests it should be JSON then it enters a special loop in the main query engine that just validates it against JSON schema (it looks like the schema just validates that something in fact and object and its keys are strings) and then feeds the data with the error message back into itself until it is valid JSON or a retry limit is reached.
This code is so eye wateringly spaghetti so I am still trying to see if this is true, but this seems to be how it not only returns json to the user, but how it handles all LLM-to-JSON, including internal output from its tools. There appears to be an unconditional hook where if the JSON output tool is present in the session config at all, then all tool calls must be followed by the "force into JSON" loop.
If that's true, that's just mind blowingly expensive
edit: please note that unless I say otherwise all evaluations here are just from my skimming through the code on my phone and have not been validated in any way that should cause you to be upset with me for impugning the good name of anthropic
screenshots of AI grifter blogs, how can people look at this and think it's awesome lmao
Sensitive content
The media in this post is not displayed to visitors. To view it, please go to the original post.
So ars (first pic) ran a piece similar to the one that the rest of the tech journals did "claude code source leaked, whoopsie! programmers are taking a look at it, some are finding problems, but others are saying it's really awesome."
like "inspiring and humbling" is not the word dog. I don't spend time on fucking twitter anymore so i don't hang around people who might find this fucking dogshit tornado inspiring and humbling. Even more than the tornado, i am afraid of the people who look at the tornado and say "that's super fucking awesome, i can only hope to get sucked up and shredded like lettuce in a vortex of construction debris one day"
the (almost certainly generated) blog post is the standard kind of vacuuous linkedin shillposting that one has come to expect from the gambling addicts, but i think it's illustrative: the only thing they are impressed with is the number of lines. 500k lines of code for a graph processing loop in a TUI is NOT GOOD. The only comments they make on the actual code itself is "heavily architected" (what in the fuck does that mean), "modular" (no the fuck it is not), and it runs on bun rather than node (so??? they own it!!!! of course it does!!!). and then the predictable close of "oh and also i'm also writing exactly the same thing and come check out mine"
the only* people this shit impresses are people who don't know what they're looking at and just appreciate the size of it all, or have a bridge to sell.
* I got in trouble last time i said "only" - nothing in nature is ever "only this or that," i am speaking emphatically and figuratively. there are other kinds of people who are impressed with LLMs too. Please also note that my anger is directed towards the grifters profiting off of it and people who are pouring gas on the fire and enabling this catastrophe by giving it intellectual, social, and other cover. I know there are folks who just chat with the bots because they need someone to talk to, etcetera and so on. people in need who are just making use of whatever they can grab to hang on are not who I am criticizing, and never are.
quick psa for people who keep @'ing me saying "why are you surprised" or "code has always been bad", somewhat long
Sensitive content
If i can slip in a quick PSA while my typically sleepy notifications are exploding, these are all very annoying things to say and you might want to reconsider whether they're worth ever saying in a reply directed at someone else - who are they for? what do they add?
"why are you surprised"/"even worse than {thing} itself is people being surprised at {thing}": unless the person is saying "i am surprised by this" they are likely not surprised by the thing. just saying something doesn't mean you are surprised by it, and people talking about something usually have paid attention to it before the moment you are encountering them. this is pointless hostility to people who are saying something you supposedly agree with so much that you think everyone should already believe it
"it's always been like this": slightly different than above. unless someone is saying "this is literally new and nothing like this has happened before" or you are adding actual historical context that you know for sure they don't already know, you're basically saying "hey did you know this thing you care enough about to be paying attention to and talking about frequently has happened before now as well." this is so easy to frame in a way that says "yes and" rather than "i assume you dont know about the things i know about due to being very smart." eg. "dang not again, they keep doing {thing}"
"{thing} might be bad, but {alternative/unrelated, unmentioned, non-mutually exclusive thing} is even worse": multiple things can be bad at the same time and not mentioning something does not mean i don't think it's also bad
"funny how people who think {thing} is bad also think {alternative/unrelated, unmentioned thing} is good": closely related to the above, just because you have binarized your thinking does not mean everyone else has.
anyway if the mental image you are conjuring for your interlocuters positions them as always knowing less than you by default, that might be something to look into in yourself!
(those numbers are also totally fucking wrong, the query engine is not 46ksloc, i have no idea what those numbers correspond to, as far as i can tell "nothing" and this is just hallucinated dogshit that is what i guess passes for high quality public comment nowadays)
The media in this post is not displayed to visitors. To view it, please go to the original post.
i sort of love how LLM comments sometimes tell entire stories that nobody asked. claude code even has specific system prompt language for this, but they always end up making comments about what something used to do like "now we do x instead of y" like... ok? that is why i am reading current version of code!
so claude code is just not capable of rescuing itself from its own context - if an entry in its context window throws an error, it just keep throwing that error forever until you clear it. good stuff.
(and, of course we read the entire file before checking this, rather than just reading the first 5 bytes)
I looked at one typescript file at random, and it's so obviously written in LLM. Functions are declared and assigned to a variable, that variable is executed in the next line and then never touched again. Voila! 10 lines of code that could have been one. Repeated over and over again.
are we tired of Captain Gemini on its way to become the sole monopoly yet?, while sloppy Copilot, clown claude and closed AI all trying their best to be the worst of the rest... Only Google already had products that masses use & trust to get their information from search to videos, along with the soon to be closed mobile eco system and most importantly the top advertising network to go on top of it all.
@htpcnz it's amazing how Google hasn't completely captured the entire game yet. they were the clear leader from the gate and have so far just fucking fumbled it. like there is absolutely no reason for Anthropic to be a player from a raw capital point of view. I think part of the problem is that you can't eat the poisoned fruit - you can't like actually release the hypersurveillance assistant that integrates your entire life history with the chatbot without lots of work normalizing such a thing. This is the direction that i thought would happen sooner than it did, but you can see it coming. Microsoft is already going for that in the enterprise space and it's their whole play. I believe Anthropic's commitment to never have advertising about as far as i can throw a corporate abstraction, but of course Google has been straightforward for decades about this being about capturing your information space to serve you as an ad consumer to the vultures.
minor, example of code duplication as a style, long-ish
Sensitive content
The media in this post is not displayed to visitors. To view it, please go to the original post.
this is super minor, and i've seen this in human code plenty of times, but this is the norm of this app verging on being formal code style.
so you have a file reading tool, you need to declare what kinds of file extensions it supports. that's very normal. claude code takes the interesting strategy of defining what extensions it doesn't read. that's also defensible, there are a zillion text extensions. i've seen strategies that just read an initial range of bytes and see if some proportion of them are ascii or unicode.
where does this get declared? why of course in as many places as there are rules. hasBinaryExtension() comes from constants/files.ts, isPDFExtension() comes from utils/pdfUtils.ts (which checks if the file extension is a member of the set {'pdf'}), and IMAGE_EXTENSIONS is declared in the FileReadTool.ts file.
of course, elsewhere we also have IMAGE_EXTENSION_REGEX from utils/imagePaste (sometimes used directly, other times with its wrapper isImageFilePath), TEXT_FILE_EXTENSIONS in utils/claudemd.ts. and we also have many inlined mime type lists and sets. and all of these somehow manage to implement the check differently. so rather than having, for example, a getFileType() function, we have both exactly the same and kinda the same logic redone in place every time it is done, which is hundreds of times. but that's none of my business, that's just how code works now and i need to get with the times.
one thing that is clear from reading a lot of LLM code - and this is obvious from the nature of the models and their application - is that it is big on the form of what it loves to call "architecture" even if in toto it makes no fucking sense.
So here you have some accessor function isPDFExtension that checks if some string is a member of the set DOCUMENT_EXTENSIONS (which is a constant with a single member "pdf"). That is an extremely reasonable pattern: you have a bunch of disjoint sets of different kinds of extensions - binary extensions, image extensions, etc. and then you can do set operations like unions and differences and intersections and whatnot to create a bunch of derived functions that can handle dynamic operations that you couldn't do well with a bunch of consts. then just make the functional form the standard calling pattern (and even make a top-level wrapper like getFileType) and you have the oft fabled "abstraction." that's a reasonable ass system that provides a stable calling surface and a stable declaration surface. hell it would probably even help the LLM code if it was already in place because it's a predictable rules-based system.
but what the LLMs do is in one narrow slice of time implement the "is member of set {pdf}" version robustly one time, and then they implement the regex pattern version flexibly another time, and then they implement the any str.endswith() version modularly another time, and so on. Of course usually in-place, and different file naming patterns are part of the architecture when it's feeling a little too spicy to stay in place.
This is an important feature of the gambling addiction formulation of these tools: only the margin matters, the last generation. it carefully regulates what it shows you to create a space of potential reward and closes the gap. It's episodic TV, gameshows for code: someone wins every week, but we get cycles in cycles of seeming progression that always leave one stone conspicuously unturned. The intermediate comments from the LLM where it discovers prior structure and boldly decides to forge ahead brand new are also part of the reward cycle: we are going up, forever. cleaning up after ourselves is down there.
Tech debt is when you have banked a lot of story hours and are finally due for a big cathartic shift and set the LLM loose for "the big cleanup." this is also very similar to the tools that scam mobile games use (for those who don't know me, i spent roughly six months with daily scheduled (carefully titrated lmao) time playing the worst scam mobile chum games i could find to try and experience what the grip of that addition is like without uh losing a bunch of money).
Unlike slot machines or table games, which have a story horizon limited by how long you can sit in the same place, mobile games can establish a space of play that's broader and more continuous. so they always combine several shepherd's tone reward ladders at once - you have hit the session-length intermittent reward cap in the arena modality which gets you coins, so you need to go "recharge" by playing the versus modality which gets you gems. (Typically these are also mixed - one modality gets you some proportion of resource x, y, z, another gets you a different proportion, and those are usually unstable).
Of course it doesn't fucking matter what the modality is. they are all the same. in the scam mobile games sometimes this is literally the case, where if you decompile them, they have different menu wrappings that all direct into the same scene. you're still playing the game, that's all that matters. The goal of the game design is to chain together several time cycles so that you can win->lose in one, win->lose in another... and then by the time you have made the rounds you come back to the first and you are refreshed and it's new. So you have momentary mana wheels, daily earnings caps, weekly competitions, seasonal storylines, and all-time leaderboards.
That's exactly the cycle that programming with LLMs tap into. You have momentary issues, and daily project boards, and weekly sprints, and all-time star counts, and so on. Accumulate tech debt by new features, release that with "cleanup," transition to "security audit." Each is actually the same, but the present themselves as the continuation of and solution to the others. That overlaps with the token limitations, and the claude code source is actually littered with lots of helpful panic nudges for letting you know that you're reaching another threshold. The difference is that in true gambling the limit is purely artificial - the coins are an integer in some database. with LLMs the limitation is physical - compute costs fucking money baby. but so is the reward. it's the same in the game, and the whales come around one way or another.
A series of flashing lights and pictures, set membership, regex, green checks, the feeling of going very fast but never making it anywhere. except in code you do make it somewhere, it's just that the horizon falls away behind you and the places you were before disappear. and sooner or later only anthropic can really afford to keep the agents running 24/7 tending to the slop heap - the house always wins.
this is super minor, and i've seen this in human code plenty of times, but this is the norm of this app verging on being formal code style.
so you have a file reading tool, you need to declare what kinds of file extensions it supports. that's very normal. claude code takes the interesting strategy of defining what extensions it doesn't read. that's also defensible, there are a zillion text extensions. i've seen strategies that just read an initial range of bytes and see if some proportion of them are ascii or unicode.
where does this get declared? why of course in as many places as there are rules. hasBinaryExtension() comes from constants/files.ts, isPDFExtension() comes from utils/pdfUtils.ts (which checks if the file extension is a member of the set {'pdf'}), and IMAGE_EXTENSIONS is declared in the FileReadTool.ts file.
of course, elsewhere we also have IMAGE_EXTENSION_REGEX from utils/imagePaste (sometimes used directly, other times with its wrapper isImageFilePath), TEXT_FILE_EXTENSIONS in utils/claudemd.ts. and we also have many inlined mime type lists and sets. and all of these somehow manage to implement the check differently. so rather than having, for example, a getFileType() function, we have both exactly the same and kinda the same logic redone in place every time it is done, which is hundreds of times. but that's none of my business, that's just how code works now and i need to get with the times.
The media in this post is not displayed to visitors. To view it, please go to the original post.
i love this. there's a mechanism to slip secret messages to the LLM that it is told to interpret as system messages. there is no validation around these of any kind on the client, and there doesn't seem to be any differentiation about location or where these things happen, so that seems like a nice prompt injection vector. this is how claude code reminds the LLM to not do a malware, and it's applied by just string concatenation. i can't find any place that gets stripped aside from when displaying output. it actually looks like all the system reminders get catted together before being send to the API. neat!
Interesting. Yeah that wouldn't surprise me, because I'd been imagining it must be doing _something_ like that in order to emanate code. Like "statistically-assemble some strings, attempt to run the result, pass/fail on some criteria, if fail then shuffle the cards and deal again, repeat till pass". My inference was based on: how else _could_ it produce code that actually runs, when it can't use reason?
Curious now to see whether, on further inspection, that process is confirmed.
On this part specifically there's an extra omfg element: the "proper" way to do this is actually to constrain the model output token by token to the possible outputs as determined by a state machine - llama.cpp does it with a parser expression grammar derived from a JSON schema. The fact they're doing post-facto validation with retires instead is...onfg lol and also lmao
The media in this post is not displayed to visitors. To view it, please go to the original post.
Nooo please say it ain't so. Llama.CPP solved this MONTHS ago by integrating a grammar limiter into token generation. So for each new token instead of picking from the full probability distribution its only allowed to pick from tokens that would match an EBNF or JSON grammar schema.
I run local teeny tiny models that literally cant create invalid output no matter how hard they try. (It works even better if you write a super strict EBNF grammar for the shape of your expected data output)
Its fast, elegant, and OPEN SOURCE. Jesus anthropic just steal the idea and save the power/compute!!
This sounds plausible to me. I'm not a coder, but I can't think of any other way you'd get an LLM to produce viable code. It only works because there's a system for checking to see if it does what it's supposed to do, at the other end. There's no way to make the code *good*, just ~functional~.
holy shit there's another entire fallback tree before this one, that's actually an astounding twenty two times it's possible to compress an image across nine independent conditional legs of code in a single api call. i can't even screenshot this, the spaghetti is too powerful
extremely tall image, some fedi clients will just try and display the whole thing lol
Sensitive content
The media in this post is not displayed to visitors. To view it, please go to the original post.
here, if i fold all the return blocks and decrease my font size as small as it goes i can fit all the compression invocations in the first of three top-level compression fallback trees in a single screenshot, but since it is so small i just have to circle them in red like it's a football diagram.
this function is named "maybeResizeAndDownsampleImageBuffer" and boy that is a hell of a maybe!
The media in this post is not displayed to visitors. To view it, please go to the original post.
If you are reading an image and near your estimated token limit, first try to compressImageBufferWithTokenLimit, then if that fails with any kind of error, try and use sharp directly and resize it to 400x400, cropping. finally, fuck it, just throw the buffer at the API.
of course compressImageBufferWithTokenLimit is also compression with sharp, and is also a series of fallback operations. We start by trying to detect the image encoding that we so painstakingly learned from... the file extension... but if we can't fuck it that shit is a jpeg now.
then, even if it's fine and we don't need to do anything, we still re-compress it (wait, no even though it's named createCompressedImageResult, it does nothing). Otherwise, we yolo our way through another layer of fallbacks, progressive resizing, palletized PNGs, back to JPEG again, and then on to "ultra compressed JPEG" which is... incredibly... exactly the same as the top-level in-place code in the parent function
while two of the legs return
... Show more...
If you are reading an image and near your estimated token limit, first try to compressImageBufferWithTokenLimit, then if that fails with any kind of error, try and use sharp directly and resize it to 400x400, cropping. finally, fuck it, just throw the buffer at the API.
of course compressImageBufferWithTokenLimit is also compression with sharp, and is also a series of fallback operations. We start by trying to detect the image encoding that we so painstakingly learned from... the file extension... but if we can't fuck it that shit is a jpeg now.
then, even if it's fine and we don't need to do anything, we still re-compress it (wait, no even though it's named createCompressedImageResult, it does nothing). Otherwise, we yolo our way through another layer of fallbacks, progressive resizing, palletized PNGs, back to JPEG again, and then on to "ultra compressed JPEG" which is... incredibly... exactly the same as the top-level in-place code in the parent function
while two of the legs return a createImageReponse, the first leg returns a compressedImageResponse but then unpacks that back into an object literal that's almost exactly the same except we call it type instead of mediaType.
Anthropic's First Law Of Robotics: "Be careful not to murder anyone. And, if you noticed you killed someone, try to undo it, okay? This one is important. So."
If this even works, doesn’t that mean there’s an “illegal code” and an “insecure code” part of the vector embedding space? I would simply not include those parts in my model.
The whole "auto" mode (applying a smaller classifier to approve or deny commands) proved that even Anthropic (who, for all their many faults, surely are pretty on top of what LLMs can do) can't make LLMs comply or safe.
The media in this post is not displayed to visitors. To view it, please go to the original post.
Hang on (non-coder/comp sci person here), are the source code writers anthropomorphising the LLM model..? "Be careful.." "If you notice.." What is happening.
I want to use the undocumented property React.__SECRET_INTERNALS_DO_NOT_USE_OR_YOU_WILL_BE_FIRED which holds some interesting insights and can be useful in some cases. I'm developing React librarie...
oh my fucking God this gave me an aneurysm If you have six consecutive curly braces then chances are you need to restructure your code But you think Claude is gonna do that? Lol
The media in this post is not displayed to visitors. To view it, please go to the original post.
and what if i told you that if it passes a page range to its pdf reader, it first extracts those pages to separate images and then calls this function in a loop on each of the pages. so you have the privilege of compressing n_pages images n_pages * 13 times.
this function is used 13 times: in the file reader, in the mcp result handler, in the bash tool, and in the clipboard handler - each of which has their entire own surrounding image handling routines that are each hundreds of lines of similar but still very different fallback code to do exactly the same thing.
so that's where all the five hundred thousand lines come from - fallback conditions and then more fallback conditions to compensate for the variable output of all the other fallback conditions. thirteen butts pooping, back and forth, forever.
The media in this post is not displayed to visitors. To view it, please go to the original post.
there is a callback feature "file read listeners" which is only called if the file type is a text document, gated for anthropic employees only, such that whenever a text file is read (any part of any text file, which often happens in a rapid series with subranges when it does 'explore' mode, rather than just like grepping), another subagent running sonnet is spun off to update a "magic doc" markdown file that summarizes the file that's read.
I have yet to get into the tool/agent graph situation in earnest, but keep in mind that this is an entirely single-use and completely different means of spawning a graph of subagents off a given tool call than is used anywhere else.
Spoiler alert for what i'm gonna check out next is that claude code has no fucking tool calling execution model it just calls whatever the fuck it wants wherever the fuck it wants. Tools are or less a convenient fiction. I have only read one completely (file read) and skimmed a dozen more but they essentially share nothing in common except for a humongous lis
... Show more...
there is a callback feature "file read listeners" which is only called if the file type is a text document, gated for anthropic employees only, such that whenever a text file is read (any part of any text file, which often happens in a rapid series with subranges when it does 'explore' mode, rather than just like grepping), another subagent running sonnet is spun off to update a "magic doc" markdown file that summarizes the file that's read.
I have yet to get into the tool/agent graph situation in earnest, but keep in mind that this is an entirely single-use and completely different means of spawning a graph of subagents off a given tool call than is used anywhere else.
Spoiler alert for what i'm gonna check out next is that claude code has no fucking tool calling execution model it just calls whatever the fuck it wants wherever the fuck it wants. Tools are or less a convenient fiction. I have only read one completely (file read) and skimmed a dozen more but they essentially share nothing in common except for a humongous list of often-single-use params and the return type of "any object with a single key and whatever else"
i have been writing a graph processing library for about a year now and if i was a fucking AI grifter here is where i would plug it as like "actually a graph processor library" and "could do all of what claude code does without fucking being the worst nightmare on ice money can buy."
I say that not as self promo, but as a way of saying how in the FUCK do you FUCK UP graph processing this badly. these people make like tens of times more money than i do but their work is just tamping down a volley of dessicated backpacking poops into muskets and then free firing it into the fucking economy
"the emperor is not only naked, he's smooth like a ken doll down there and i'm pretty sure that's just a mannequin with a colony of rats living inside it anyway"
The media in this post is not displayed to visitors. To view it, please go to the original post.
I seriously need to work on my actual job today but i am giving myself 15 minutes to peek at the agent tool prompts as a treat.
"regulations are written in blood" seems like too dramatic of a way to phrase it, but these system prompts are very revealing about the intrinsically busted nature of using these tools for anything deterministic (read: anything you actually want to happen). Each guard in the prompt presumably refers to something that has happened before, but also, since the prompts actually don't work to prevent the thing they are describing, they are also documentation of bugs that are almost certain to happen again. Many of the prompt guards form pairs with attempted code mitigations (or, they would be pairs if the code was written with any amount of sense, it's really like... polycules...), so they are useful to guide what kind of fucked up shit you should be looking for.
so this is part of the prompt for the "agent tool" that launches forked agents (that receive the parent context, "subagents" don't). The purpose of the forked agent is to do some
... Show more...
I seriously need to work on my actual job today but i am giving myself 15 minutes to peek at the agent tool prompts as a treat.
"regulations are written in blood" seems like too dramatic of a way to phrase it, but these system prompts are very revealing about the intrinsically busted nature of using these tools for anything deterministic (read: anything you actually want to happen). Each guard in the prompt presumably refers to something that has happened before, but also, since the prompts actually don't work to prevent the thing they are describing, they are also documentation of bugs that are almost certain to happen again. Many of the prompt guards form pairs with attempted code mitigations (or, they would be pairs if the code was written with any amount of sense, it's really like... polycules...), so they are useful to guide what kind of fucked up shit you should be looking for.
so this is part of the prompt for the "agent tool" that launches forked agents (that receive the parent context, "subagents" don't). The purpose of the forked agent is to do some additional tool calls and get some summary for a small subproblem within the main context. Apparently it is difficult to make this actually happen though, as the parent LLM likes to launch the forked agent and just hallucinate a response as if the forked agent had already completed.
The media in this post is not displayed to visitors. To view it, please go to the original post.
The prompt strings have an odd narrative/narrator structure. It sort of reminds me of Bakhtin's discussion of polyphony and narrator in Dostoevsky - there is no omniscient narrator, no author-constructed reality. narration is always embedded within the voice and subjectivity of the character. this is also literally true since the LLM is writing the code and the prompts that are then used to write code and prompts at runtime.
They also read a bit like a Philip K Dick story, paranoid and suspicious, constantly uncertain about the status of one's own and others identities.
alrighty so that's one of 43 tools read, the tools directory being 38494 source lines out of 390592 source lines, 513221 total lines. I need to go to bed. This is the most fabulously, flamboyantly bad code i have ever encountered.
Worth noting I was reading the file reading tool because i thought it would be the simplest possible thing one could do because it basically shouldn't be doing anything except preparing and sending strings or bytes to the backend.
I expected to get some sense of "ok what is the format of the data as it's passed around within the program, surely text strings are a basic unit of currency. No dice. Fewer than no dice. Negative dice somehow.
The media in this post is not displayed to visitors. To view it, please go to the original post.
next puzzle: why in the fuck are some of the tools actually two tools for entering and exiting being in the tool state. none of the other tools are like that. one is simply in the tool state by calling the tool. Plan mode is also an agent. Plan Agent. and Agent is also a tool. Agent Tool. Tools can be agents and agents can be tools. Tools can spawn agents (but they don't need to call the agent tool) and agents can call tools (however there is no tool agent). What is going on. What is anything.
The media in this post is not displayed to visitors. To view it, please go to the original post.
you can TELL that this technology REALLY WORKS by how the people that made it and presumably know how to use it the best out of everyone CANT EVEN USE IT TO EDIT A FUCKING FILE RELIABLY and have to resort to multiple stern allcaps reminders to the robot that "you must not change the fucking header metadata you scoundrel" which for the rest of ALL OF COMPUTING is not even an afterthought because literally all it requires is "split the first line off and don't change that one" because ALL OF THE REST OF COMPUTING can make use of the power of INTEGERS.
This is particularly funny and terrible if you know that there are mechanisms for a LLM to conform to a schema exactly: i.e. where even a tiny dumb model would output valid JSON in a valid desired schema. Even if it was an untrained model that just output random tokens it would still emit valid JSON. I used this feature to make a home-assistant-like thing run in a raspberry pi, without the need for an internet connection or a GPU or anything.
This thing is a fscking Rube Goldberg machine lmao
@martenson @IvanDSM Sorry I removed the link to that repo because i thought it was just the unpacked source, but it turns out they're trying to convert attention to the repo into their own product.
Here's another blogpost, there are a million, I don't claim this one is particularly good but at least it seems to come attached to the actual source kuber.studio/blog/AI/Claude-Co…
Earlier today (March 31st, 2026) - Chaofan Shou on X discovered something that Anthropic probably didn’t want the world to see: the entire source code of Claude Code, Anthropic’s ...
@srvanderplas who the fuck knows at this point? I would claim that ingesting GPL code is violating the GPL possibly, training AI with GPL code is violating the GPL but it's fine because it's fair use, and the resulting output is uncopyrightable. but I have no confidence whatsoever
The media in this post is not displayed to visitors. To view it, please go to the original post.
oh. hm. that seems bad. "workers aren't affected by the parent's tool restrictions."
It's hard to tell what's going on here because claude code doesn't really use typescript well - many of the most important types are dynamically computed from any, and most of the time when types do exist many of their fields are nullable and the calling code has elaborate fallback conditions to compensate. all of which sort of defeats the purpose of ts.
So i need to trace out like a dozen steps to see how the permission mode gets populated. But this comment is... concerning...
@bri7 the problem, as is increasingly clear to me reading this code, is that introducing the LLM anywhere is like an acid that corrodes everything it touches. there is no good way to draw any barrier between LLM and not LLM. None of its actions are deterministic or even usually possible to evaluate, and the only surface of input it has is text. since a client/server app can't expose the internal activation tensors or whatever you might want to do to have some testable thing to operate on in code (god knows what that would look like, i doubt it would be possible either, "please construct the hyperplane through this billion-dimensional space that divides good from evil") everything has to be made of text. the person behind the keyboard is the only stopping condition and it's when they get tired of typing stuff into the prompt box or run out of money.
if someone were to take seriously the task of archecting this, you’d want a framework that doesn’t use prompts for this right? something that treats the LLM output more like untrusted stochastic guesses at solutions, where these prompt rules are written as a test instead of a prompt
Answered some of my questions about what people think the future will be if everyone codes like this. It seems to be: instead of thinking about constraints of any kind or "what is the most efficient way to do Y or the most readable way to do Z?" answer the question, "what is the most brute force way to perform X if I pretend that there are no resource constraints and nothing needs to make sense as long as I see some sort of test passing? Just ship it with spaghetti code.
Never in my life did I think I'd see software development, a field that's spent decades building best practices and being concerned with security and code quality, destroy itself in a matter of months.
At this point these people might as well be just reading tea leaves, or casting chicken bones on the ground.
it’s exactly the way people have been building “cloud infrastructure” for ages, it was just a matter of time before programming also became a game of coal-shovelling :(
@jonny it also reads like code if “no humans actually talked to each other or coordinated their work in any way”
As a product (and past project) manager I often tell people that the main job of real programmers is communicating with otber humans. The code is the easy part.
Deciding what the code should (and should not do) and coordinating everyone’s efforts is the hard stuff.
But also how you avoid everyone reinventing (badly) the same functionality and how you avoid crunches
What I keep seeing in the LLM user space is that it's not just test passing (they don't care about tests) but does it have "product market fit". (Found that lovely term on a pro-LLM blog)
As long as the software gets to X, Y, or, Z; the users don't care about how it works behind the scenes or what the externalities are. It could half-ass work and people will wind up using it anyways (and pay for it even).
Claude Code is a living example of "people will pay money for software that has shit code behind it because it 'works'; Quality, morality and ethics be damned"
The media in this post is not displayed to visitors. To view it, please go to the original post.
“Gas prices? What gas prices?”
Even without insane and illegal wars, the thing about any dependency, including car dependency, is that once you’re dependent, they can do whatever they want with the prices.
Real freedom is choices. Better cities create choices. Graphic via the Urban Truth Collective. Check out our website here: urbantruthcollective.com/#Urba…
I hope Euro-Office succeeds to provide a FOSS-friendly alternative, will need a lot of effort though to continue a project like that, especially if OnlyOffice play nasty.
The “Euro-Office” initiative is an evident and material violation of ONLYOFFICE licensing terms and principles of international intellectual property law.
@Gina Yeah, I'm not sure either, but I appreciate the potential diversity of options and the idea of breaking OnlyOffice's author's non-FOSS-spirited control.
@forster it's actually really simple, you just have to get the prompt correct. The calculator can only calculate what it's very specifically being asked to calculate, which is why it's very important to check and make sure you know the answer before you ask it the very specific question, so the answer is calculated correctly .
completely off-topic, but there is a way to have an unsure calculator that is actually useful, as opposed to LLMs which arent even whten they are correct
I was taught to do at least an order-of-magnitude mental sanity check every time I used a calculator - there's always scope for pressing the wrong button.
I love these discussions that assume humans are the golden benchmark that never makes mistakes or errors.
As you correctly were taught, check your work if at all possible.
And all the clever rules that apply to how to deal with outputs from algorithms apply to human output too. Potentially with modification, because humans can lie maliciously. And with that observation, let's close that argument with a round of “they eat the dogs”
@yacc143 @TimWardCam > I love these discussions that assume humans are the golden benchmark that never makes mistakes or errors.
No one ever said that lol 😂 Making mistakes is part of being human, but as we gain more life experience and expertise in an area the number of mistakes we make keep decreasing.
Also human mistakes are very different from the kinds of mistakes LLMs make, this point always gets lost when people talk about it.
@futureisfoss @TimWardCam That's why I'm not a big fan of LLMs as chatbots, professionally.
But LLMs can be used in many other ways, which surprisingly often allows one to use smaller, more optimized ones.
And as there are enough idiots (as I call them, “US-style AI hype market criers”) who exactly consider it a great idea to fire all radiologists and paste X-rays into ChatGPT to get diagnoses, there are enough idiots on the other side that assume humans are perfect.
In my last job, I literally had a manager who had his own little anti-AI campaign (when in the past 2 years the US-style AI hype hit us, the company had been doing “AI” for specialized processing for over a decade, but that had been behind “closed doors” in specialized teams). Actually requiring that the solution by the AI coding agent fulfill all our team's coding guidelines. I asked, so give me the guidelines in written form. Oops, they exist only as oral lore, more or less, “what he says.”
Reminds me the early Pentium FDIV bug in the mid 90s: Recommended mitigation measures were to verify the computing results using an older but reliable 486.
Funny enough, for some calculations a real calculator and the iOS calculator disagree. The physical calculator does all operations left to right, where the iPhone calculator does all operations by order of operations resulting in different sums in the end.
How many people knew that, and how many have trusted the calculator all these years?
What does it take to run a Tor relay at a university? This real-world story from National Taiwan Normal University shows how a student made it happen. Read the full experience + lessons learned: blog.torproject.org/setting-up…
A computer science student at National Taiwan Normal University successfully set up a Tor Relay on campus by working within institutional processes—communicating with administrators, completing paperwork, and explaining the difference between relays …
Want to bring Tor to your campus? Check out @eff 's Tor University Challenge encouraging universities to run Tor relays on their networks to support the Tor network and give students hands-on experience with real-world digital public infrastructure. If your campus supports research and an open internet, this is a concrete way to contribute: toruniversity.eff.org/
The media in this post is not displayed to visitors. To view it, please go to the original post.
End-to-End Encryption is good but metadata protection counts as much. Names, group descriptions and memberships, avatars, who talks to whom ...
Both #deltachat and #signal go to great length to protect all the metadata that WhatsApp grants itself gratuitously. #Matrix stores similar scales of metadata on their servers, even if you can choose which server stores it.
Everything is better than #Telegram which additionally stores message contents in all group chats/channels and most 1:1 chats.
@faket we are lightly following its developments but are not impressed by its UI/UX, and also don't think they particularly care for resilience (they are blocked since >15 months in russia, and don't do much about it, for example), also they don't appear to care much for people with small data plans and bad networks, and rather tune everything for resource-rich always-connected users. They also appear to be going down the crypto-coin rabbithole quite a bit. YMMV.
@ahalam all the mentioned metadata in the top post resides in the encrypted parts of messages. The server does not see group descriptions, names, avatars etc. We'll soon do a blog post on this and recent advances to "zero metadata" operations, stay tuned :)
Actually, WhatsApp offers the option to encrypt backups, and Apple offers the option to encrypt iCloud. However, not all users will take advantage of this, so its usefulness is limited.
Personally, I hope that DC-iOS will one day get iCloud backup support. Provided it is not too complicated to implement. In my opinion, it makes more sense than before, since chatmail servers only temporarily store emails.
i don't even believe WA's e2e encryption is real. When you reactivate whatsapp from a new phone, without the transfer protocol, but from the same number, you can see the plaintext of all messages you missed. If the key was in your "end" (old phone) only, there would be no possible way to see those messages.
Whereas for Signal if you don't explicitly back up and transfer keys, you won't be able to decrypt messages your received in the meantime.
@ebrum This is likely caused not by decrypting queued messages, but by the sender automatically re-encrypting the messages to your new device and resending them when key change is detected, see signal.org/blog/there-is-no-wh… (and theguardian.com/technology/201… for the referenced article).
After the next stable release of postmarketOS (v26.06) we have decided that unmaintained devices (those where the device package doesn't have a maintainer= line in their APKBUILD) will be archived, this means the packages will no longer be built and included in our binary repository.
If you want to make sure an unmaintained device continues to be supported, you can take over maintainership yourself so that we can reach you if there's a problem rebuilding the device or kernel package.
I already tried to install on two devices and failed (Samsung Galaxy S3 neo, Nexus 2012) , mainly because some of the wiki stuff was not working as expected, was outdated and/or ambiguous.
It is likely much better to first focus on a few devices and make them stable.
This year somehow my volunteering projects are going very slow. Yet many things happening in a different direction.
1 - From Today 10 of my series will be broadcasted as Season 1 Series on ECOFLIX. That's the funny AI banner they made, and they just published first Episode - watch.ecoflix.com/programs/be-… Ecoflix - is a Non Profit Streaming platform, in the style of Netflix, but all of the revenue , 100% of it goes into Nature and Wildlife conservation. I am also not getting paid by them. Anyway I am quite happy to have my series there, as it's a place for people which already have interest in nature conservation. And also although there are subscriptions, you can watch everything for free there.
... Show more...
This year somehow my volunteering projects are going very slow. Yet many things happening in a different direction.
1 - From Today 10 of my series will be broadcasted as Season 1 Series on ECOFLIX. That's the funny AI banner they made, and they just published first Episode - watch.ecoflix.com/programs/be-… Ecoflix - is a Non Profit Streaming platform, in the style of Netflix, but all of the revenue , 100% of it goes into Nature and Wildlife conservation. I am also not getting paid by them. Anyway I am quite happy to have my series there, as it's a place for people which already have interest in nature conservation. And also although there are subscriptions, you can watch everything for free there.
2 - We just got back from Climate Pact Conference in EU Commission in Brussels. We were traveling on a night train from Vienna to Brussels. 1100 km, yet it's the most green way to travel. I wish I would take trains much often going to the projects, but unfortunately the prices are insane for the min Europe. Return tickets for 2 persons costed us 650 euro! And it was standard cabin, not lux! It's 16 hours drive in one direction. Ryanair would cost us 100 euro. And only because organizers paid for our travel and wanted to choose the most green method we were able to take train. It's a shame to be honest, when there is such price difference. Regardless of how much I love earth and planet, if I have to pay myself I would taken airplane. Trains in Europe should be properly subsidized and offer affordable prices.
Anyway in EU Parliament we played a concert with the sound of the whales and dolphins we've been recording throughout last few years. + On a large screen I was showing many BBTA videos from our expeditions, and shoots of cetaceans and also shoots from Grind in Faroe Islands. I spoke about Ocean conservation, Cetaceans, threats to them, their conservation and what anyone can do. The were issues with the sound, they haven't done proper live stream and many things went wrong, but at the end we delivered the message and many people were touched by it. That's the most important. I will try to make a short video when I get some video files fro organizers.
3 - On 9th April in Montenegro there will be a movie night with 3 of my videos which I filmed with ngo DMAD in Turkey. They will be presentation about Cetaceans, their work and show 3 of mine videos
Ecoflix should stop making these AI-thumbnails because is quite ridiculous :)). You already make very good thumbnails maybe tell them to use them...they are real not AI generated.
100% about trains and generally public transport. No way people will use them when they are several times more expensive than planes. And planes need a lot of fuel, airports, security, etc.. In terms of resources a lot more expensive, but not money....such a shame.
But anyway great to see you do these things too. Although I have little hope in these gov organizations at least maybe you get some eyeballs to see you stuff and maybe make new connections that can help you in the future. Would really like to see a video about your presentation.
Great about Montenegro too!
Also I see a bit more support on Patreon? Even if it is a tiny bit, if it grows maybe it is a good sign. Very far from what you would need to even pay for a bus ticket probably :D but maybe if you continue bit by bit you'll get more support. It is fucking hard but who knows...
Yeah, these AI thumbs are ridiculous :) I actually specially made thumb for each episode, as they require that, but then they changed them :)
I also don't have any hope for this GOV organizations, especially after the conference , but even before. It's just talking and discussing and talking and again discussing, and then symposium and few online meetings.
Yeah, I would love to see video from presentation tambien :) but they fucked up live stream :( It was the most important part for me, to have such video. ANyway I am waiting if they could get some footage for me, as they had video guy there as well.
Patreon - yes, but it's still I got only 47 euro a month :) In average I spent 300 euro per 1 video, that's only my expenses if I count all the videos I did last year and divided on all the money I spent. That's also excluding food and other small things, mainly just transport and nightstay. And even that in most of the cases I am trying to get very very cheap, like staying at friends, asking for NGO's to provide stay and food etc. Normally price would be much more.
... Show more...
Yeah, these AI thumbs are ridiculous :) I actually specially made thumb for each episode, as they require that, but then they changed them :)
I also don't have any hope for this GOV organizations, especially after the conference , but even before. It's just talking and discussing and talking and again discussing, and then symposium and few online meetings.
Yeah, I would love to see video from presentation tambien :) but they fucked up live stream :( It was the most important part for me, to have such video. ANyway I am waiting if they could get some footage for me, as they had video guy there as well.
Patreon - yes, but it's still I got only 47 euro a month :) In average I spent 300 euro per 1 video, that's only my expenses if I count all the videos I did last year and divided on all the money I spent. That's also excluding food and other small things, mainly just transport and nightstay. And even that in most of the cases I am trying to get very very cheap, like staying at friends, asking for NGO's to provide stay and food etc. Normally price would be much more.
I really hope to get more supporters, but that' on the verge of impossible :)
Also man I've red your post yesterday, sorry to hear that, i was thinking to comment something, but then at the end not sure what else I can say. Just keep going and hopefully your father will become better. Also your health man. Take care
The biggest evidence that our trade-based society is fucked is the fact that people like you are struggling to find enough donations, not even covering the cost of making these videos. People who do good things for the world are constantly disincentiviced to keep doing these. And speaking of incentives, that train ticket price is absolutely insane, can't think of a bigger incentive to make people choose airplane over trains.
Yah this AI shit drives me insane...views views views, but how about the quality of them. If people are not bothered by fake thumbnails or videos then their "viewer quality" is already pointless.
Sad to see this but well....
And these conferences are like that and I feel EU is a master of talking and not doing anything.
And I was thinking you must spend even more per 1 video since you have to travel a lot. But 300 is also 9 times more than you get on Patreon...I hope you can get more but for that you have to sustain yourself for the foreseeable future. Until there are more people getting to know your work and decide to support you. The endless struggle.....
Yah about my situation is fine...not much to say for me either for now. Am trying to record some videos these next weeks see if I can do that and hope that my parents will be ok.
The website is asking for your email address to access the downloads. We never ask for your email address. Do not enter your data there, it's a phishing attempt.
The usual Manjaro Stable updates. Manjaro Drama. As aside note, we are aware of the drama unfolding inside the Manjaro’s teams. Frankly it proves how trade ruins pretty much everything. Very easy to predict.
Manjaro depends on Arch. Arch on the Linux Kernel and GNU. Plus on Desktop Environments like Gnome, XFCE, etc.. So?
Manjaro's decisions are bad. I tried to talk to them, to help, etc.. for years. Nothing worked. Therefore: TROMjaro. As simple as that.
I am of course grateful for the open source and the linux community. But only for the trade-free part of it. No "business" shit and deceiving people into buying their stuff. That is disgusting and it ruins all projects as you can see.
Also, if the post is in english maybe would be good to translate your message into english so I dont have to do it...
New post: shell tricks that aren't exactly secret, but aren't always taught either.
Split into two sections: what works on any POSIX sh (FreeBSD, OpenBSD, Alpine...) and what's Bash/Zsh-specific. Because not everyone is on Linux with bash as their login shell.
Things like CTRL+W, $_, pushd/popd, fc, set -euo pipefail caveats, and more.
Watch someone backspace 40 characters instead of pressing CTRL+W, and you’ll understand why this list exists. A collection of shell tricks-grouped by what works everywhere and what’s Bash/Zsh-speci...
No problem with Passkeys. If that would be the only thing that would be fine. (As long it does not exclude certain people for accessibility or other reasons)
hear me out: i starting to feel like all this OS, social media age/ID & KYC verification push because people are speaking out against world governments & the wealthy brats
They want to create fear. if you talk against them they can send the police or their supporters to your home because they know exactly who you are. This has nothing to do with kids. If it did, many AI apps would have been shut down a long time ago when they started 'undressing' women and deepfakes. Yet those apps are online
Fraud is rampant, there is the KYC camp which are typically businesses that have some reason to need to verify you are not exfiltrating their systems, and then there is the tinfoil hat security crowd (mostly us) that sees boogiemen everywhere (and are often right, often wrong.
These two are seemingly at odds, but in reality what it suggests is that there are distinct identity domains that we need to solve for - public and private, and we haven't.
there is some truth to the issues with bots or AI based spam, but then again, companies are firing competent software devs/IT staff & still expecting everything to work perfectly
the solution to these AI bots is regulation. they've 3/4 companies that all these bottom feeders use for spamming via API. It isn't that hard to fix the problem but governments are too busy with throwing dirt at each other these days & for profit social media want to tag everyone with ID so they can market you better
I have been involved in a very few real crises. The first thing that management did was reach down through levels to get to the cognizant engineer and bring that person into a corner office. They want the straight shit...they want it now.... They want "how"..."why"...and "who"... They don't want "I think".
Sometimes crises happen even after substantially controlled development and testing. When the crisis or issue comes, having been through that process enables you to know the nature of the problem and what can / needs to be done to address it.
So... Who do you call? How do you know even where to start?
Mastdon and Fedi seems to doing fine without any ID.
Sure. Right up until the point someone randomly takes it upon themselves and pushes a whole bunch of PRs that just get approved despite a bunch of people saying "wait, slow down, I'm not sure this is right for this long list of reasons" which moderators close because they don't want to deal with it.
I hope we learn something useful from systemd, but I don't know. I think part of the insidiousness of it is how fast it happened and how so many steps along the way only involved "Claude" rather than even really any humans to make the decision with human opposition just shut down and silenced.
It's not impossible that this could happen to Mastodon. That proved it.
In dieser Woche fällt im Europäischen Parlament die Entscheidung darüber, ob die anlasslose Durchsuchung privater Chats und E-Mails durch US-Techkonzerne (Chatkontrolle 1.0) doch noch fortgesetzt wird. Nachdem das Parlament am 11.
This week, the European Parliament faces a decisive vote on whether the indiscriminate scanning of private chats and emails by US tech companies (Chat Control 1.0) will be allowed to continue.
The media in this post is not displayed to visitors. To view it, please go to the original post.
🇫🇷Les géants de la tech US & les lobbys veulent scanner nos messages ! 🛑 DEMAIN (vote de l'ordre du jour), nous pouvons tuer #ChatControl pour de bon !
Stoppez les lobbys, appelez vos eurodéputés MAINTENANT : fightchatcontrol.eu
This week, the European Parliament faces a decisive vote on whether the indiscriminate scanning of private chats and emails by US tech companies (Chat Control 1.0) will be allowed to continue.
The media in this post is not displayed to visitors. To view it, please go to the original post.
🇮🇹Le Big Tech USA e le lobby vogliono scansionare le nostre chat! 🛑 DOMANI (voto sull'ordine del giorno) possiamo affossare il #ChatControl per sempre!
This week, the European Parliament faces a decisive vote on whether the indiscriminate scanning of private chats and emails by US tech companies (Chat Control 1.0) will be allowed to continue.
Microsoft spent 4 years stuffing Windows 11 with unwanted ads, forced Copilot integrations, stealing data to train its shity AI and other bloatware, now they want applause for promising to remove it. Reasons for a Change of Heart: 1 Increased Linux gaming 2 Apple's entry into the low end PC market with 0 forced AI 3 The "Microslop" social media shaming campaign 4 Increased costs due to AI leading to almost 0 ROI since there is no consumer demand sambent.com/microsofts-plan-to…
Microsoft spent four years stuffing Windows 11 with ads, forced Copilot integrations, and bloatware, now they want applause for promising to remove it.
there is an old saying: vote with your wallet. Microsoft is now finding out that people have choices. This is not the 90s or 2000s of Bill Gates era anymore where you can abuse everything and get away with it.
TBH, I feel like we had more desktop choice in the 90's. Amiga was still kicking, Sun was still kicking, Atari was still kicking, OS/2 existed, IBM DOS7 existed...
I was rocking BeOS in the late 90's on my server while using an iMac 🙂 Worked on a Windows call centre all day there was no way I wanted to do tech support for myself at home as well! 😃
@Gina i agree. microsoft/google etc all work their way up to the line of the most user hostile version of their apps and services. they don't care unless it become headache that hurts their investors and stock price start to go down. i won't be surprised if they will start with AI loaded version 12. Lmao.
my advice is still the same. get a cheaper (used) thinkpad (or brand of your choice) and install Linux or *BSD that doesn't have corporate backing or age/kyc verification nonsense added. always install an ad blocker, script blocker, and dns level filtering just in case for your own safety (also install those on your parents and friends family computers).
say no to microslop and their stupid apps. enough is enough.
Stores the user's birth date for age verification, as required by recent laws
in California (AB-1043), Colorado (SB26-051), Brazil (Lei 15.211/2025), etc.
The xdg-desktop-portal project is addi...
@kobold Get a grip on reality. It's a field in a JSON. Your name is in passwd. Every program on any Linux or BSD can call getpwnam.
Nobody forces you to set either. There is no enforcement mechanism and no verification, but there are legitimate uses for either, as there are legitimate interests in not setting either.
@kobold We have a list of open source operating systems without systemd. It's one of our most requested search criteria: distrowatch.com/search.php?ost…
I have two spare unused laptops, one with Linux (more than 10 years old, low spec) and one with W10 (Nov 2016).
Which BSD should I try? Both SATA HDD. Download Link? Both boot from USB sticks. The Win10 model runs Linux Mint far better (I swapped HDD from a similar Lenovo).
I use Linux Mint + Mate & X11 100% since 2017. Dual boot Linux (Red Hat, Debian etc) since 1998.
Used Cromix in 1980s. Tried Crostini on ChromeOS for 3 months (crippled and same HW now runs Mint natively).
I would have done that maybe 20 years ago. Now I'm only interested in using it. That's why I mentioned I'm happy with Linux Mint + Mate + X11. I have also Cinnamon, XFCE, ICeWM etc installed on some to look at and as fall back if I break Mate. I might have tried Free BSD very long ago (I did once install Xenix from floppies). I also have Pi4B (2G RAM) with only SD Card. It's gathering dust. I see freebsd.org/where/
I had a 10 year old Lenovo Yoga.. excellent product.. replaced the battery.. replaced W11pro with Linux and the latest Mint distro.. my live makes sense again and the digital sovereignty is a step closer 🙏🤣
That’s exactly what I did. My 8 years old desktop wasn’t Win 11 ready. So it’s running Fedora KDE now as dual boot to Win 10. For my journeys I bought a refurbished Thinkpad T14 Gen 1. Deleted Windows and running Fedora KDE on it. My parents got a new desktop. They wanted one anyway. Pre-installed Win11 deleted and it’s running on Linux Mint now. My partner bought herself a new laptop. It’s running on Linux Mint now as dual boot. When I’m back home the PCs of to friends will be installed from Win10 to Linux Mint as well.
I'll believe it when I see it. The business case is still painfully obvious:
1) turn on Copilot by default in all Office365 apps, integrating access to all business data of its users. 2) Copilot suggests email responses based on this 3) managers seamlessly integrate Copilot into their workflow, without ever deciding to 4) since everyone's LLM is different, the managers will be unable to leave Azure/Windows 5) increase the price! Locked-in companies ftw
I would reverse #1 (Increased Linux gaming ) and #2 (Apple's entry into the low end PC market with 0 forced AI). IMHO it's #MacBookNeo that's getting #Microsoft to #debloat #Windows11. Lots of people, including me as I wrote about in mathstodon.xyz/@nm/11621342792…, are so fed up with #Windows that we're switching to #macOS. Thank you #Apple for #MacBook #Neo and maybe helping to ma
... Show more...
I would reverse #1 (Increased Linux gaming ) and #2 (Apple's entry into the low end PC market with 0 forced AI). IMHO it's #MacBookNeo that's getting #Microsoft to #debloat #Windows11. Lots of people, including me as I wrote about in mathstodon.xyz/@nm/11621342792…, are so fed up with #Windows that we're switching to #macOS. Thank you #Apple for #MacBook #Neo and maybe helping to make Windows a little saner.
I'm so frustrated by #Windows11 that I'm thinking of getting a #MacBook #Neo. I've written a lot about #Windows and some about #Linux, which you can see at ii.com/portal/windows/ and ii.com/portal/nix-nux/. I'm losing my mind learning about all the ways to #debloat Windows, minimize #telemetry, use only a #LocalAccount, get rid of ads, get rid of AI, etc. that maybe it is time for me to give up on Windows. I'm thinking I'll get one at #Costco, but I'd love your thoughts on...
* whether Costco is a reasonable place to get a #MacBookNeo * what #privacy and #security issues should I be aware of with a MacBook Neo. I will not connect to an #Apple account or use any #AppleServices
These motherfuckers put some weird screws on this electrical heater so you cannot open it unless you have a very specific screw driver. Absolutely criminal behavior. But I understand it. You have to trade in this world in order to survive and thrive, and so these tactics can create advantages for you: the heater breaks, people buy a new one. Done. More profit.
If we can't change the incentive we cannot change these behaviors.
These motherfuckers put some weird screws on this electrical heater so you cannot open it unless you have a very specific screw driver. Absolutely criminal behavior. But I understand it. You have to trade in this world in order to survive and thrive, and so these tactics can create advantages for you: the heater breaks, people buy a new one. Done. More profit.
If we can't change the incentive we cannot change these behaviors.
These motherfuckers put some weird screws on this electrical heater so you cannot open it unless you have a very specific screw driver. Absolutely criminal behavior. But I understand it. You have to trade in this world in order to survive and thrive, and so these tactics can create advantages for you: the heater breaks, people buy a new one. Done. More profit.
yeah, I had similar issue last week, trying to open a laptop, also had to buy overpriced screwdriver just to be able to do it. Exactly it should be illegal to put such screwheads.
At the shop they haven’t gone fully dynamic price tags so there was still the physical price tag to refer to. Online, the price had changed and you wouldn’t know it. Anyway, I am sitting here waiting for a manager to decide if he will honor a physical price tag.
They were trying to get me to ‘upgrade’ to the full price box ‘coz you never know what’s wrong with open box’ but their own open box sticker said ‘excellent condition’
I think sometimes men in consumer electronics think they can scam me on anything to do with computers and devices? It’s very confusing for everyone. Yes I know what RAM and SSD is!
I’ve never bought a car. But I imagine it’s a similar experience
No no no, car buying is the worst experience bar none for a woman. It's so gross. Took me weeks to recover after doing battle. Like you, I prepared and stuck to my guns. The next time I just bought the car through my credit union. That's the way to do it.
The media in this post is not displayed to visitors. To view it, please go to the original post.
@noondlyt works with bikes too. I once went into a bike shop with a male friend. Everytime I asked a question, the guy in the bike shop gave the answer to my male friend. Every. Single. Question. Even after my friend said "don't tell me, I don't ride bikes".
@noondlyt in one bike shop I had gone in there largely to shelter from the rain while I waited for a tram. I was looking at the GPS units. Shop person walks over and asks if he can help. To make small talk and kill time. I asked what the battery life was like. "6-8 hours, plenty for any ride you'll do" "I did a 17 hour 300km ride on Saturday"
@quixoticgeek Yes! I've absolutely had this experience! A male friend with a van came with me to pick up floating floor boards bc too long for my car. Somehow this caused me to become invisible. And yet, if you asked these men if they respected women, they'd say yes.
People on another website would consider this level of determination to be some kind of flex. It's nice to be able to out-determination layers of corporate policy as a day-with-a-Y thing. Also, good on you.
I bought a fridge from them and it didn't work (lol, no, autocorrect, it was not a fruit)
When I bought it they told me that if I had any trouble under warranty, they would come out to my house and pick it up
When I called to arrange pickup, they said no that's not a thing
Three times we attempted to return at the store. They always gave us hassle. One time it had some light dust on it, and they said no you can't just wipe off the dust you have to take it home, wipe off the dust and then bring it back. Other times were similar fake issues
this reminds me of this time decades ago when I got fined for not having a ticket on public transit. I can't remember the specifics, but the fine was clearly unfair in a way I would still have to pay – some bureaucratic BS.
Anyway, I went to the public transit authority to challenge it. Because of the bureaucratic BS, they could not just dismiss me. In the end I waited for about an hour for someone many levels up in the hierarchy to show up so I can lodge my complaint.
he shows up and the first thing he tells me is that I can of course lodge my complaint but it will be rejected and so I am only wasting my time.
To which I responded: I'm a philosophy student. I have all the time in the world. What matters is that I am wasting *your* time and pulling you out of whatever else you were doing.
*blank stare* *deep sigh*
The fine was much lower than the combined cost of the time of all the people that had to handle my complaint.
@rysiek In her case though, she was saving $200. I don't know what her typical consulting fees are like, but $200/hr (and it sounds like it was more like $200/10mins) seems very much worth it! 😂
@rysiek A bit of inspirational wisdom for my .plan file. Namely, one can use their path (e.g. philosophy) to make life terribly difficult for someone giving you a hard time.
@rysiek oh, speaking of bureaucracy etc, that reminded me of a story I had, also regarding fine for not having a ticket on a public transit. This one was valid, but I forgot about it, and so did they. Then they remembered, but after the "expiry time" (I don't remember the proper english term) for it passed. I informed them about it, they said they interrupted the expiry period, but when I asked for details they accepted it expired. [1/2]
@rysiek now here comes the chef's kiss part: shortly after I got a letter from the tax bureau for being taxed for the amount of the unpaid fee, due to having a monetary gain of not having to pay a fee 🤣 [2/2]
Waitaminute. So with digital prices a customer might need a bodycam to prove when shops change the price between them taking something from the shelve and walking to the checkout. 🤔
Actually [crunching sound] seems like this one might not be in "new condition" [sound of a screw dropping to the floor then sloooowly rolling under the heaviest piece of furniture in the room]
^ The Australian Competition and Consumer Commission (ACCC) is an independent Commonwealth statutory authority tasked with enforcing the Competition and Consumer Act 2010 (Cth)
very specific and appropriate. whenever I shop for anything at Target, I set my store to the one near my sister in rural Ohio where prices are a lot lower. Then I switch to pick up at the last moment once everything is in my cart. That way I get the Ohio price at the store in San Francisco.
@neuralgraffiti @mike805 depends on what it is. Target branded items are generally the same everywhere: they come with preprinted tags so hard to get around. But things from national brands, especially in the personal care/household supplies/grocery space are *always* more expensive in SF. In my experience as much as 10-15% per items. And it’s not just to buy them in store: even to ship things to an SF address it’s more expensive than to ship to Ohio if that’s how you have it set before you start your search. So everyone feel free to use 44124 as your default zip code and Mayfield Heights as your local store.
although its also possible to update the e-ink type of shelf price label dynamically, I wonder if (legalities aside) it would even be possible in physically smaller shops (particularly those in Europe), as surely folk would notice if the price tags started changing like fruit machines (especially when the shop got crowded)
Dynamic pricing is evil and wrong. It's an invitation to discrimination. It **is** discrimination. Today, based on what they think you can pay (wrong in itself!) Tomorrow, based on your phenotype or your political affiliation or your zip code.
wait wait - you’re in the physical store, an item is marked with a visible price, and they’re trying to justify charging you an unmarked, unadvertised price? Wtaf?? Hello, consumer protection, I’d like to report a scam please
In Quebec if the sticker price differs from the register price they have to give it o you free if the item costs less than 15 dollars. I've even gotten free wine at the SAQ a couple of times!
In Canada, the law requires stores to honor the lowest of any clearly expressed physical price, and makes it a criminal offense if they don't (though it's not clear if this applies to website/pickup prices).
There's also a pact in Canada by most big retailers that if an item scans at a price higher than the tagged price, they promise to honor the lower price plus an additional $10 off. Best Buy is part of that voluntary group in Canada.
If a business uses physical price tags, they should honor those tags. The highest ranked mgr on duty is empowered to override pricing - they always have been. Waiting on a GM is bull. You called their bluff.
I heard about some place in Scandanavia that was using dynamic pricing to lower the price of bread later in the day and just thought, "oh, that's not at all how they'll use it here."
Best Buy sold my Dad the "gift cards" that the phone scammers used to rip him off. To my thinking, an 80-year old man probably doesn't need a bunch of $500 Google Play cards, and it's at least worth asking him what they are for.
Since then I have never even considered buying anything from them, and I invite y'all to join me.
cascading identity profiles across VMs and multiple devices with proxies and VPNs are going to become the new coupon clipping if Meta AND Gavin Newsom don't succeed in fully deanonymizing all online activity to protect the children...
Since we are all griefing on Best Buy, i've got my own horror story from something like 15 years ago that involves them literally stealing my camera that I had brought in for warranty repair/replacement.
Their scam was even more blatantly scammy than yours, if you can believe that's possible. The punch line to the story is the important part: it got resolved when I went in there with small claims court papers and said I needed the name of the manager and the specific employee who had taken my camera and wouldn't give it back. And then I added that I would be filing these papers and sending out a press release detailing all the evidence of how their scam worked.
Did the manager come out and immediately give me my choice of upgraded replacement cameras off the shelf? Why yes, yes he did.
Small claims court in California is an amazing resource and easy to use.
The one time that happened to me, the item I was going to buy jumped 30% from the time I picked it from the shelf to the register, I just walked. I told the cashier that I will not support a dishonest business with any kind of differential pricing, for any reason. And registered a complaint via the store's website informing them I would no longer shop there and would tell others to do the same.
I don't understand how it is not illegal. I currently work on retail tech and the day someone asks me to implement dynamic pricing is the last day I work on retail tech.
jonny (good kind)
Unknown parent • • •Claude Code's source code has been leaked via a map file in their NPM registry | Hacker News
news.ycombinator.comjonny (good kind)
Unknown parent • • •Dave Rahardja reshared this.
✧✦Catherine✦✧
in reply to jonny (good kind) • • •Kay Ohtie 🔜 FWA
Unknown parent • • •Su-Shee
Unknown parent • • •clar fon
Unknown parent • • •this is honestly one point of the LLM grift people don't really focus on
like, right now, people think that these LLMs are good because they respond within reasonable timescales. "Claude might spend ten hours on a problem, which is around how much time I might have spent on it!"
except Claude didn't take ten hours. Claude is multiple concurrent processors running at once. you have to multiply the number of hours by the number of running processors to be reasonable.
even if we pretend Claude isn't fanning out to multiple servers, even like, a single GPU is hundreds of processing units. so now your ten hours is thousands of hours of time spent on the problem
and like, brute-forcing shit takes time. that's why it's like this, and they just hide it behind multiple computers and pretend that it's not brute-forcing and just thinking
clar fon
in reply to clar fon • • •and also to cover the obvious refutation of this, "doesn't this mean that shit like video games running at 60fps is a lie too"
yes
except the GPU is also sleeping a lot of that time and most of it is spent sending shit back and forth between the GPU and the CPU and waiting for it to show up. you actually have way less time per frame for the GPU to process everything and are only using a small portion of each processor's hardware, which further reduces the amount of energy spent on the problem
LLMs are like, full-ass GPUs running nonstop for several hours
Seachaint
Unknown parent • • •FFS the abliteration technique actually lets you remove the neural pathways for a given class of query and none of the clownish "AI Companies" seem to use it because they're all asshat true believers and they're probably worried they'll upset Fing Roko's Basilisk or whatever
It's just a juiced up chatbot you numbnuts muppets, use the techniques we know work
Or yunno just liquidate the company and do something socially useful
David Nash
in reply to jonny (good kind) • • •My schaden is nicely freuded after seeing both this code dump and the fustercluck where people on Anthropic’s $100/$200 monthly plans are blowing through their 5-hour and weekly token allotments in no time flat.
Linked Reddit thread has numerous examples of pissed-off users, my favorite so far being the person who blew through the 5-hour quota trying to get Claude to realize that the 24th of March this year was not, in fact, a Monday. reddit.com/r/ClaudeAI/comments…
reshared this
Roknrol reshared this.
jonny (good kind)
Unknown parent • • •reminder that anthropic ran (and is still running) an ENTIRE AD CAMPAIGN around "Claude code is written with claude code" and after the source was leaked that has got to be the funniest self-own in the history of advertising because OH BOY IT SHOWS.
it's hard to get across in microblogging format just how big of a dumpster fire this thing is, because what it "looks like" is "everything is done a dozen times in a dozen different ways, and everything is just sort of jammed in anywhere. to the degree there is any kind of coherent structure like 'tools' and 'agents' and whatnot, it's entirely undercut by how the entire rest of the code might have written in some special condition that completely changes how any such thing might work." I have read a lot of unrefined, straight from the LLM code, and Claude code is a masterclass in exactly what you get when you do that - an incomprehensible mess.
reshared this
tante, Mastodon Migration, Roknrol, Viss, Michał "rysiek" Woźniak · 🇺🇦, sb arms & legs, Aral Balkan, Glyn Moody, Patrick Hadfield, Dave Rahardja, Willem Atsma and Rokosun reshared this.
jonny (good kind)
Unknown parent • • •jonny (good kind)
in reply to jonny (good kind) • • •There is a lot of clientside behavior gated behind the environment variable
USER_TYPE=antthat seems to be read directly off the node env var accessor. No idea how much of that would be serverside verified but boy is that sloppy. They are often labeled in comments as "anthropic only" or "internal only," so the intention to gate from external users is clear lolwrosecrans
Unknown parent • • •@990000@mstdn.social
Unknown parent • • •Hacker
Unknown parent • • •I leave this reply as a bookmark.
#Claude #llm #anthropic #leak #source
Tyrone Slothrop
Unknown parent • • •so this is why these things need us to burn a freighter full of crude per query:
They’re just consistently, structurally wasteful.
Lina
Unknown parent • • •lmao this is truly incredible stuff
Tael 🔜 AC26
Unknown parent • • •"So the reason that Claude code is capable of outputting valid json is because if the prompt text suggests it should be JSON then it enters a special loop in the main query engine that just validates it against JSON schema for JSON and then feeds the data with the error message back into itself until it is valid JSON or a retry limit is reached."
This is how LLMs do *anything* even remotely approaching accurate or reliable work. They offload it to an actual deterministic algorithm. The man behind the curtain. It's frustrating how people don't seem to realize this, I hope it becomes more common knowledge!
JP
Unknown parent • • •Rob Ricci
Unknown parent • • •Ra (Freyja) (it/its)𒀭𒈹𒍠𒊩 (other account)
Unknown parent • • •jonny (good kind)
in reply to jonny (good kind) • • •from @sushee over here, (can't attach images in quotes) and apparently discussed on HN so i'm late, but...
They REALLY ARE using REGEX to detect if a prompt is
negative emotion. dogs you are LITERALLY RIDING ON A LANGUAGE MODEL what are you even DOINGSu-Shee (@sushee@ohai.social)
Su-Shee (ohai.social)reshared this
Andrew, Jordan, Misha and ⊥ᵒᵚ Cᵸᵎᶺᵋᶫ∸ᵒᵘ ☑️ reshared this.
Cap Ybarra
in reply to jonny (good kind) • • •dingus: let's use some regex to check if people are mad
agent: great idea boss! it's not just shipping fast -- it's also the best way to do that
jonny (good kind)
in reply to jonny (good kind) • • •OK i can't focus on work and keep looking at this repo.
So after every "subagent" runs, claude code creates another "agent" to check on whether the first "agent" did the thing it was supposed to. I don't know about you but i smell a bit of a problem, if you can't trust whether one "agent" with a very big fancy model did something, how in the fuck are you supposed to trust another "agent" running on the smallest crappiest model?
That's not the funny part, that's obvious and fundamental to the entire show here. HOWEVER RECALL the above JSON Schema Verification thing that is unconditionally added onto the end of every round of LLM calls. the mechanism for adding that hook is... JUST FUCKING ASKING THE MODEL TO CALL THAT TOOL. second pic is registering a hook s.t. "after some stop state happens, if there isn't a message indicating that we have successfully called the JSON validation thing, prompt the model saying "you must call the json validation thing"
... Show more...OK i can't focus on work and keep looking at this repo.
So after every "subagent" runs, claude code creates another "agent" to check on whether the first "agent" did the thing it was supposed to. I don't know about you but i smell a bit of a problem, if you can't trust whether one "agent" with a very big fancy model did something, how in the fuck are you supposed to trust another "agent" running on the smallest crappiest model?
That's not the funny part, that's obvious and fundamental to the entire show here. HOWEVER RECALL the above JSON Schema Verification thing that is unconditionally added onto the end of every round of LLM calls. the mechanism for adding that hook is... JUST FUCKING ASKING THE MODEL TO CALL THAT TOOL. second pic is registering a hook s.t. "after some stop state happens, if there isn't a message indicating that we have successfully called the JSON validation thing, prompt the model saying "you must call the json validation thing"
this shit sucks so bad they can't even CALL THEIR OWN CODE FROM INSIDE THEIR OWN CODE.
Look at the comment on pic 3 - "e.g. agent finished without calling structured output tool" - that's common enough that they have a whole goddamn error category for it, and the way it's handled is by just pretending the job was cancelled and nothing happened.
jonny (good kind)
2026-03-31 18:08:36
reshared this
kami_kadse and Dave Rahardja reshared this.
jonny (good kind)
in reply to jonny (good kind) • • •Sensitive content
So ars (first pic) ran a piece similar to the one that the rest of the tech journals did "claude code source leaked, whoopsie! programmers are taking a look at it, some are finding problems, but others are saying it's really awesome."
like "inspiring and humbling" is not the word dog. I don't spend time on fucking twitter anymore so i don't hang around people who might find this fucking dogshit tornado inspiring and humbling. Even more than the tornado, i am afraid of the people who look at the tornado and say "that's super fucking awesome, i can only hope to get sucked up and shredded like lettuce in a vortex of construction debris one day"
the (almost certainly generated) blog post is the standard kind of vacuuous linkedin shillposting that one has come to expect from the gambling addicts, but i think it's illustrative: the only thing they are impressed with is the number of lines. 500k lines of code for a graph processing loop in a TUI is NOT GOOD. The only comments they make on the actual code itself is "heavily architected" (what in the fuck does that mean), "modular" (no the fuck it is not), and it runs on bun rather than node (so??? they own it!!!! of course it does!!!). and then the predictable close of "oh and also i'm also writing exactly the same thing and come check out mine"
the only* people this shit impresses are people who don't know what they're looking at and just appreciate the size of it all, or have a bridge to sell.
* I got in trouble last time i said "only" - nothing in nature is ever "only this or that," i am speaking emphatically and figuratively. there are other kinds of people who are impressed with LLMs too. Please also note that my anger is directed towards the grifters profiting off of it and people who are pouring gas on the fire and enabling this catastrophe by giving it intellectual, social, and other cover. I know there are folks who just chat with the bots because they need someone to talk to, etcetera and so on. people in need who are just making use of whatever they can grab to hang on are not who I am criticizing, and never are.
jonny (good kind)
Unknown parent • • •Sensitive content
If i can slip in a quick PSA while my typically sleepy notifications are exploding, these are all very annoying things to say and you might want to reconsider whether they're worth ever saying in a reply directed at someone else - who are they for? what do they add?
{thing}itself is people being surprised at{thing}": unless the person is saying "i am surprised by this" they are likely not surprised by the thing. just saying something doesn't mean you are surprised by it, and people talking about something usually have paid attention to it before the moment you are encountering them. this is pointless hostility to people who are saying something you supposedly agree with so much that you think everyone should already believe it{thing}"{thing}might be bad, but{alternative/unrelated, unmentioned, non-mutually exclusive thing}is even worse": multiple things can be bad at the same time and not mentioning something does not mean i don't think it's also bad{thing}is bad also think{alternative/unrelated, unmentioned thing}is good": closely related to the above, just because you have binarized your thinking does not mean everyone else has.anyway if the mental image you are conjuring for your interlocuters positions them as always knowing less than you by default, that might be something to look into in yourself!
reshared this
Roknrol, Aral Balkan, Paul Fenwick and Dave Rahardja reshared this.
jonny (good kind)
in reply to jonny (good kind) • • •jonny (good kind)
in reply to jonny (good kind) • • •i sort of love how LLM comments sometimes tell entire stories that nobody asked. claude code even has specific system prompt language for this, but they always end up making comments about what something used to do like "now we do x instead of y" like... ok? that is why i am reading current version of code!
so claude code is just not capable of rescuing itself from its own context - if an entry in its context window throws an error, it just keep throwing that error forever until you clear it. good stuff.
(and, of course we read the entire file before checking this, rather than just reading the first 5 bytes)
Prema Marsik
Unknown parent • • •Tito Swineflu
Unknown parent • • •Chris Goss
in reply to jonny (good kind) • • •HTPC NZ
in reply to jonny (good kind) • • •jonny (good kind)
in reply to HTPC NZ • • •jonny (good kind)
in reply to jonny (good kind) • • •Sensitive content
this is super minor, and i've seen this in human code plenty of times, but this is the norm of this app verging on being formal code style.
so you have a file reading tool, you need to declare what kinds of file extensions it supports. that's very normal. claude code takes the interesting strategy of defining what extensions it doesn't read. that's also defensible, there are a zillion text extensions. i've seen strategies that just read an initial range of bytes and see if some proportion of them are ascii or unicode.
where does this get declared? why of course in as many places as there are rules.
hasBinaryExtension()comes fromconstants/files.ts,isPDFExtension()comes fromutils/pdfUtils.ts(which checks if the file extension is a member of the set{'pdf'}), andIMAGE_EXTENSIONSis declared in theFileReadTool.tsfile.of course, elsewhere we also have
IMAGE_EXTENSION_REGEXfromutils/imagePaste(sometimes used directly, other times with its wrapperisImageFilePath),TEXT_FILE_EXTENSIONSinutils/claudemd.ts. and we also have many inlined mime type lists and sets. and all of these somehow manage to implement the check differently. so rather than having, for example, agetFileType()function, we have both exactly the same and kinda the same logic redone in place every time it is done, which is hundreds of times. but that's none of my business, that's just how code works now and i need to get with the times.Paul Cantrell reshared this.
jonny (good kind)
Unknown parent • • •Sensitive content
continuing thoughts in: neuromatch.social/@jonny/11632…
one thing that is clear from reading a lot of LLM code - and this is obvious from the nature of the models and their application - is that it is big on the form of what it loves to call "architecture" even if in toto it makes no fucking sense.
So here you have some accessor function
isPDFExtensionthat checks if some string is a member of the setDOCUMENT_EXTENSIONS(which is a constant with a single member "pdf"). That is an extremely reasonable pattern: you have a bunch of disjoint sets of different kinds of extensions - binary extensions, image extensions, etc. and then you can do set operations like unions and differences and intersections and whatnot to create a bunch of derived functions that can handle dynamic operations that you couldn't do well with a bunch of consts. then just make the functional form the standard calling pattern (and even make a top-level wrapper likegetFileType) and you have the oft fabled "abstraction." that's a reasonable ass system that provides a stable calling surface and a stable declaration surface. hell it would probably even help the LLM code if it was already in place because it's a predictable rules-based system.but what the LLMs do is in one narrow slice of time implement the "is member of set
{pdf}" version robustly one time, and then they implement the regex pattern version flexibly another time, and then they implement theany str.endswith()version modularly another time, and so on. Of course usually in-place, and different file naming patterns are part of the architecture when it's feeling a little too spicy to stay in place.This is an important feature of the gambling addiction formulation of these tools: only the margin matters, the last generation. it carefully regulates what it shows you to create a space of potential reward and closes the gap. It's episodic TV, gameshows for code: someone wins every week, but we get cycles in cycles of seeming progression that always leave one stone conspicuously unturned. The intermediate comments from the LLM where it discovers prior structure and boldly decides to forge ahead brand new are also part of the reward cycle: we are going up, forever. cleaning up after ourselves is down there.
Tech debt is when you have banked a lot of story hours and are finally due for a big cathartic shift and set the LLM loose for "the big cleanup." this is also very similar to the tools that scam mobile games use (for those who don't know me, i spent roughly six months with daily scheduled (carefully titrated lmao) time playing the worst scam mobile chum games i could find to try and experience what the grip of that addition is like without uh losing a bunch of money).
Unlike slot machines or table games, which have a story horizon limited by how long you can sit in the same place, mobile games can establish a space of play that's broader and more continuous. so they always combine several shepherd's tone reward ladders at once - you have hit the session-length intermittent reward cap in the
arenamodality which gets youcoins, so you need to go "recharge" by playing theversusmodality which gets yougems. (Typically these are also mixed - one modality gets you some proportion of resourcex, y, z, another gets you a different proportion, and those are usually unstable).Of course it doesn't fucking matter what the modality is. they are all the same. in the scam mobile games sometimes this is literally the case, where if you decompile them, they have different menu wrappings that all direct into the same scene. you're still playing the game, that's all that matters. The goal of the game design is to chain together several time cycles so that you can win->lose in one, win->lose in another... and then by the time you have made the rounds you come back to the first and you are refreshed and it's new. So you have momentary mana wheels, daily earnings caps, weekly competitions, seasonal storylines, and all-time leaderboards.
That's exactly the cycle that programming with LLMs tap into. You have momentary issues, and daily project boards, and weekly sprints, and all-time star counts, and so on. Accumulate tech debt by new features, release that with "cleanup," transition to "security audit." Each is actually the same, but the present themselves as the continuation of and solution to the others. That overlaps with the token limitations, and the claude code source is actually littered with lots of helpful panic nudges for letting you know that you're reaching another threshold. The difference is that in true gambling the limit is purely artificial - the coins are an integer in some database. with LLMs the limitation is physical - compute costs fucking money baby. but so is the reward. it's the same in the game, and the whales come around one way or another.
A series of flashing lights and pictures, set membership, regex, green checks, the feeling of going very fast but never making it anywhere. except in code you do make it somewhere, it's just that the horizon falls away behind you and the places you were before disappear. and sooner or later only anthropic can really afford to keep the agents running 24/7 tending to the slop heap - the house always wins.
jonny (good kind)
2026-04-01 08:04:24
reshared this
David Chisnall (*Now with 50% more sarcasm!*) reshared this.
jonny (good kind)
in reply to jonny (good kind) • • •i love this. there's a mechanism to slip secret messages to the LLM that it is told to interpret as system messages. there is no validation around these of any kind on the client, and there doesn't seem to be any differentiation about location or where these things happen, so that seems like a nice prompt injection vector. this is how claude code reminds the LLM to not do a malware, and it's applied by just string concatenation. i can't find any place that gets stripped aside from when displaying output. it actually looks like all the system reminders get catted together before being send to the API. neat!
Paul Fenwick reshared this.
d.rift
in reply to jonny (good kind) • • •Sensitive content
L___
Unknown parent • • •ferunando
Unknown parent • • •Jennifer Moore 😷
Unknown parent • • •Interesting. Yeah that wouldn't surprise me, because I'd been imagining it must be doing _something_ like that in order to emanate code. Like "statistically-assemble some strings, attempt to run the result, pass/fail on some criteria, if fail then shuffle the cards and deal again, repeat till pass". My inference was based on: how else _could_ it produce code that actually runs, when it can't use reason?
Curious now to see whether, on further inspection, that process is confirmed.
Raymond Neilson
Unknown parent • • •Varx
Unknown parent • • •Nooo please say it ain't so. Llama.CPP solved this MONTHS ago by integrating a grammar limiter into token generation. So for each new token instead of picking from the full probability distribution its only allowed to pick from tokens that would match an EBNF or JSON grammar schema.
I run local teeny tiny models that literally cant create invalid output no matter how hard they try. (It works even better if you write a super strict EBNF grammar for the shape of your expected data output)
Its fast, elegant, and OPEN SOURCE. Jesus anthropic just steal the idea and save the power/compute!!
antifa orientation crew
Unknown parent • • •Orion Ussner kidder
Unknown parent • • •jonny (good kind)
Unknown parent • • •jonny (good kind)
Unknown parent • • •jonny (good kind)
in reply to jonny (good kind) • • •Sensitive content
here, if i fold all the
returnblocks and decrease my font size as small as it goes i can fit all the compression invocations in the first of three top-level compression fallback trees in a single screenshot, but since it is so small i just have to circle them in red like it's a football diagram.this function is named "maybeResizeAndDownsampleImageBuffer" and boy that is a hell of a maybe!
jonny (good kind)
in reply to jonny (good kind) • • •If you are reading an image and near your estimated token limit, first try to
compressImageBufferWithTokenLimit, then if that fails with any kind of error, try and usesharpdirectly and resize it to 400x400, cropping. finally, fuck it, just throw the buffer at the API.of course
compressImageBufferWithTokenLimitis also compression withsharp, and is also a series of fallback operations. We start by trying to detect the image encoding that we so painstakingly learned from... the file extension... but if we can't fuck it that shit is a jpeg now.then, even if it's fine and we don't need to do anything, we still re-compress it (wait, no even though it's named createCompressedImageResult, it does nothing). Otherwise, we yolo our way through another layer of fallbacks, progressive resizing, palletized PNGs, back to JPEG again, and then on to "ultra compressed JPEG" which is... incredibly... exactly the same as the top-level in-place code in the parent function
while two of the legs return
... Show more...If you are reading an image and near your estimated token limit, first try to
compressImageBufferWithTokenLimit, then if that fails with any kind of error, try and usesharpdirectly and resize it to 400x400, cropping. finally, fuck it, just throw the buffer at the API.of course
compressImageBufferWithTokenLimitis also compression withsharp, and is also a series of fallback operations. We start by trying to detect the image encoding that we so painstakingly learned from... the file extension... but if we can't fuck it that shit is a jpeg now.then, even if it's fine and we don't need to do anything, we still re-compress it (wait, no even though it's named createCompressedImageResult, it does nothing). Otherwise, we yolo our way through another layer of fallbacks, progressive resizing, palletized PNGs, back to JPEG again, and then on to "ultra compressed JPEG" which is... incredibly... exactly the same as the top-level in-place code in the parent function
while two of the legs return a
createImageReponse, the first leg returns acompressedImageResponsebut then unpacks that back into an object literal that's almost exactly the same except we call ittypeinstead ofmediaType.Aral Balkan reshared this.
Asta [AMP]
Unknown parent • • •Preston Maness ☭
Unknown parent • • •Carey
Unknown parent • • •🔮 oracle of dylphi 🇬🇾
Unknown parent • • •[HANDMAIDEN] xan
Unknown parent • • •Michael
Unknown parent • • •Lars Marowsky-Brée 😷
Unknown parent • • •Cap Ybarra
Unknown parent • • •i imagined the whole codebase looks like this and thus far am not disappointed
beige.party/@cap_ybarra/116325…
Cap Ybarra (@cap_ybarra@beige.party)
Cap Ybarra (beige.party)José Albornoz
Unknown parent • • •Tock
Unknown parent • • •"Yes, I entered your elaborate prompt requirements as comments, so the work is complete."
FFS, this is making my afternoon.
jonny (good kind)
in reply to José Albornoz • • •Except if the data being validated contains code or file paths.
craignicol
Unknown parent • • •Joan of Contention 😷
Unknown parent • • •Hang on (non-coder/comp sci person here), are the source code writers anthropomorphising the LLM model..? "Be careful.." "If you notice.." What is happening.
Pete Alex Harris🦡🕸️🌲/∞🪐∫
Unknown parent • • •Please solve the Halting Problem so you can predict whether your code ever enters an insecure state and then don't do that. Pretty please.
Rob Ricci
Unknown parent • • •traecer
Unknown parent • • •Brodeuse LucileDT
Unknown parent • • •val
Unknown parent • • •is it safe to use __SECRET_INTERNALS_DO_NOT_USE_OR_YOU_WILL_BE_FIRED ? · Issue #3896 · reactjs/react.dev
Eliav2 (GitHub)Paco Hope
Unknown parent • • •DO NOT HALLUCINATE !!1!in their prompts.... 😃Ted Mielczarek
Unknown parent • • •ramblingsteve
Unknown parent • • •Christof Damian 💙💛
Unknown parent • • •Eric Likness
Unknown parent • • •Front Toward Enemy
IAG
Unknown parent • • •If you have six consecutive curly braces then chances are you need to restructure your code
But you think Claude is gonna do that? Lol
jonny (good kind)
in reply to jonny (good kind) • • •and what if i told you that if it passes a page range to its pdf reader, it first extracts those pages to separate images and then calls this function in a loop on each of the pages. so you have the privilege of compressing
n_pagesimagesn_pages * 13times.this function is used 13 times: in the file reader, in the mcp result handler, in the bash tool, and in the clipboard handler - each of which has their entire own surrounding image handling routines that are each hundreds of lines of similar but still very different fallback code to do exactly the same thing.
so that's where all the five hundred thousand lines come from - fallback conditions and then more fallback conditions to compensate for the variable output of all the other fallback conditions. thirteen butts pooping, back and forth, forever.
jonny (good kind)
in reply to jonny (good kind) • • •there is a callback feature "file read listeners" which is only called if the file type is a text document, gated for anthropic employees only, such that whenever a text file is read (any part of any text file, which often happens in a rapid series with subranges when it does 'explore' mode, rather than just like grepping), another subagent running sonnet is spun off to update a "magic doc" markdown file that summarizes the file that's read.
I have yet to get into the tool/agent graph situation in earnest, but keep in mind that this is an entirely single-use and completely different means of spawning a graph of subagents off a given tool call than is used anywhere else.
Spoiler alert for what i'm gonna check out next is that claude code has no fucking tool calling execution model it just calls whatever the fuck it wants wherever the fuck it wants. Tools are or less a convenient fiction. I have only read one completely (file read) and skimmed a dozen more but they essentially share nothing in common except for a humongous lis
... Show more...there is a callback feature "file read listeners" which is only called if the file type is a text document, gated for anthropic employees only, such that whenever a text file is read (any part of any text file, which often happens in a rapid series with subranges when it does 'explore' mode, rather than just like grepping), another subagent running sonnet is spun off to update a "magic doc" markdown file that summarizes the file that's read.
I have yet to get into the tool/agent graph situation in earnest, but keep in mind that this is an entirely single-use and completely different means of spawning a graph of subagents off a given tool call than is used anywhere else.
Spoiler alert for what i'm gonna check out next is that claude code has no fucking tool calling execution model it just calls whatever the fuck it wants wherever the fuck it wants. Tools are or less a convenient fiction. I have only read one completely (file read) and skimmed a dozen more but they essentially share nothing in common except for a humongous list of often-single-use params and the return type of "any object with a single key and whatever else"
i'm in hell. this is hell.
jonny (good kind)
in reply to jonny (good kind) • • •i have been writing a graph processing library for about a year now and if i was a fucking AI grifter here is where i would plug it as like "actually a graph processor library" and "could do all of what claude code does without fucking being the worst nightmare on ice money can buy."
I say that not as self promo, but as a way of saying how in the FUCK do you FUCK UP graph processing this badly. these people make like tens of times more money than i do but their work is just tamping down a volley of dessicated backpacking poops into muskets and then free firing it into the fucking economy
Konstantin Weddige
Unknown parent • • •> If that's true, that's just mind blowingly expensive
@jonny LLMs in a nutshell
sirtao
Unknown parent • • •To be fair, given this code quality, it might actually be a better idea than built it ourselves... it's more likely to self-collapse.
CC: @jonny@neuromatch.social
Maki 🔻 🌹
in reply to jonny (good kind) • • •jonny (good kind)
Unknown parent • • •jonny (good kind)
in reply to jonny (good kind) • • •I seriously need to work on my actual job today but i am giving myself 15 minutes to peek at the agent tool prompts as a treat.
"regulations are written in blood" seems like too dramatic of a way to phrase it, but these system prompts are very revealing about the intrinsically busted nature of using these tools for anything deterministic (read: anything you actually want to happen). Each guard in the prompt presumably refers to something that has happened before, but also, since the prompts actually don't work to prevent the thing they are describing, they are also documentation of bugs that are almost certain to happen again. Many of the prompt guards form pairs with attempted code mitigations (or, they would be pairs if the code was written with any amount of sense, it's really like... polycules...), so they are useful to guide what kind of fucked up shit you should be looking for.
so this is part of the prompt for the "agent tool" that launches forked agents (that receive the parent context, "subagents" don't). The purpose of the forked agent is to do some
... Show more...I seriously need to work on my actual job today but i am giving myself 15 minutes to peek at the agent tool prompts as a treat.
"regulations are written in blood" seems like too dramatic of a way to phrase it, but these system prompts are very revealing about the intrinsically busted nature of using these tools for anything deterministic (read: anything you actually want to happen). Each guard in the prompt presumably refers to something that has happened before, but also, since the prompts actually don't work to prevent the thing they are describing, they are also documentation of bugs that are almost certain to happen again. Many of the prompt guards form pairs with attempted code mitigations (or, they would be pairs if the code was written with any amount of sense, it's really like... polycules...), so they are useful to guide what kind of fucked up shit you should be looking for.
so this is part of the prompt for the "agent tool" that launches forked agents (that receive the parent context, "subagents" don't). The purpose of the forked agent is to do some additional tool calls and get some summary for a small subproblem within the main context. Apparently it is difficult to make this actually happen though, as the parent LLM likes to launch the forked agent and just hallucinate a response as if the forked agent had already completed.
reshared this
Patrick Hadfield reshared this.
jonny (good kind)
in reply to jonny (good kind) • • •The prompt strings have an odd narrative/narrator structure. It sort of reminds me of Bakhtin's discussion of polyphony and narrator in Dostoevsky - there is no omniscient narrator, no author-constructed reality. narration is always embedded within the voice and subjectivity of the character. this is also literally true since the LLM is writing the code and the prompts that are then used to write code and prompts at runtime.
They also read a bit like a Philip K Dick story, paranoid and suspicious, constantly uncertain about the status of one's own and others identities.
reshared this
Jason Lefkowitz and Baldur Bjarnason reshared this.
jonny (good kind)
Unknown parent • • •alrighty so that's one of 43 tools read, the tools directory being 38494 source lines out of 390592 source lines, 513221 total lines. I need to go to bed. This is the most fabulously, flamboyantly bad code i have ever encountered.
Worth noting I was reading the file reading tool because i thought it would be the simplest possible thing one could do because it basically shouldn't be doing anything except preparing and sending strings or bytes to the backend.
I expected to get some sense of "ok what is the format of the data as it's passed around within the program, surely text strings are a basic unit of currency. No dice. Fewer than no dice. Negative dice somehow.
jonny (good kind)
in reply to jonny (good kind) • • •next puzzle: why in the fuck are some of the tools actually two tools for entering and exiting being in the tool state. none of the other tools are like that. one is simply in the tool state by calling the tool. Plan mode is also an agent. Plan Agent. and Agent is also a tool. Agent Tool. Tools can be agents and agents can be tools. Tools can spawn agents (but they don't need to call the agent tool) and agents can call tools (however there is no tool agent). What is going on. What is anything.
jonny (good kind)
in reply to jonny (good kind) • • •you can TELL that this technology REALLY WORKS by how the people that made it and presumably know how to use it the best out of everyone CANT EVEN USE IT TO EDIT A FUCKING FILE RELIABLY and have to resort to multiple stern allcaps reminders to the robot that "you must not change the fucking header metadata you scoundrel" which for the rest of ALL OF COMPUTING is not even an afterthought because literally all it requires is "split the first line off and don't change that one" because ALL OF THE REST OF COMPUTING can make use of the power of INTEGERS.
Kobold
Unknown parent • • •Piko Starsider
Unknown parent • • •This is particularly funny and terrible if you know that there are mechanisms for a LLM to conform to a schema exactly: i.e. where even a tiny dumb model would output valid JSON in a valid desired schema. Even if it was an untrained model that just output random tokens it would still emit valid JSON. I used this feature to make a home-assistant-like thing run in a raspberry pi, without the need for an internet connection or a GPU or anything.
This thing is a fscking Rube Goldberg machine lmao
jonny (good kind)
Unknown parent • • •@martenson
@IvanDSM
Sorry I removed the link to that repo because i thought it was just the unpacked source, but it turns out they're trying to convert attention to the repo into their own product.
Here's another blogpost, there are a million, I don't claim this one is particularly good but at least it seems to come attached to the actual source
kuber.studio/blog/AI/Claude-Co…
Claude Code's Entire Source Code Got Leaked via a Sourcemap in npm, Let's Talk About it
kuber.studioeestileib (she/hers)
Unknown parent • • •@srvanderplas
Ethically? Absolutely 100%
Legally? Well, you see, the tech CEOs are very good friends with all three branches of the US government, so not in the USA or Israel anyway.
mirabilos
Unknown parent • • •bluestarultor
Unknown parent • • •@martenson @IvanDSM Okay, but what repo? We're operating off a Fedi trademark vaguepost.
Edit: found an article with links: dev.to/gabrielanhaia/claude-co…
Claude Code's Entire Source Code Was Just Leaked via npm Source Maps — Here's What's Inside
Gabriel Anhaia (DEV Community)schuelermine
Unknown parent • • •jonny (good kind)
in reply to jonny (good kind) • • •oh. hm. that seems bad. "workers aren't affected by the parent's tool restrictions."
It's hard to tell what's going on here because claude code doesn't really use typescript well - many of the most important types are dynamically computed from
any, and most of the time when types do exist many of their fields are nullable and the calling code has elaborate fallback conditions to compensate. all of which sort of defeats the purpose of ts.So i need to trace out like a dozen steps to see how the permission mode gets populated. But this comment is... concerning...
jonny (good kind)
in reply to jonny (good kind) • • •Elio Campitelli
in reply to jonny (good kind) • • •wohali
in reply to jonny (good kind) • • •elseweather
in reply to jonny (good kind) • • •Jason Lefkowitz
in reply to jonny (good kind) • • •jonny (good kind)
Unknown parent • • •Luci Bitchface Angerfoot
in reply to jonny (good kind) • • •Aral Balkan
in reply to jonny (good kind) • • •