• 2 Posts
  • 937 Comments
Joined 3 years ago
Cake day: June 15th, 2023

  • We knew bots were part of the landscape, but we didn’t appreciate the scale, sophistication, or speed at which they’d find us.”

    The company said it banned tens of thousands of accounts

    Rookie mistake. The professional move is to officially recognize all those bot accounts as actual users and value your company at its peak user engagement. Then either IPO with a quick cash out, or sell to private equity and walk away from the zombie company you’ve just created, leaving it to die off in a year or two.





  • that something Tain ordered him to do as a member of the Obsidian Order was his “step too far” that caused his break with Cardassian ideology.

    The wonderful thing about Garak is that I don’t think “a step too far” exists for him. Rather, I like to assume that even if it was a heinous request, he refused it because it didn’t serve his own political motives. What was so interesting about Garak is that he clearly had a solid, formal set of guiding principles. However, to outsiders like us, it looked chaotic and disjointed.




  • Growing up, our household had a giant roll of butcher paper. The roll was 2 ft (60 cm) wide and about 1000 ft (300 m) long. I have no idea why we had it, but as kids we were allowed to use as much as we wanted for whatever we wanted. It turned into a childhood of projects, games, costumes, banners, signs, crafts, wrappings, pranks, etc. Early on, we asked for art supplies like markers, paint, pens, pencils, charcoal, etc. to transform that boring cheap paper into different universes. We became creative because it was available.

    Having a seemingly unlimited supply of a material, and open permission to use it, was an unexpected freedom.





  • There’s no slider for sycophancy, it’s an interaction of multiple points, “neurons” in the neural network.

    I agree there isn’t one today, but that doesn’t mean it couldn’t be developed in the future. We don’t have a full picture of how they weight their inference layers, so there could be weights attached that a slider could set. The response from Google almost suggests this is the case.

    You can poke around and try and figure out what these neurons do and how they interact, but since deep learning isn’t the same as programming, these models are essentially black boxes.

    Assuming there is no human-tuned weight, I agree it would be very hard to do it the way you’re describing. I can think of a couple of other ways to approach it though:

    • Have a layer that doesn’t examine how the answer was arrived at, but can detect whether the answer is sycophantic.
    • Use a second model, as in a GAN, to test the output of the first for sycophancy, and train the first against it.
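
To make the first idea concrete, here is a rough sketch of a check that looks only at the finished answer and not at how the model got there. Everything in it is an assumption for illustration: the phrase list, `sycophancy_score`, and `pick_least_sycophantic` are made-up names, and the keyword match is a toy stand-in for what would really need to be a trained classifier. It does not show the GAN-style training from the second idea.

```python
# Hypothetical marker phrases: a toy stand-in for a trained detector.
FLATTERY_MARKERS = (
    "great question",
    "you're absolutely right",
    "brilliant idea",
    "i completely agree",
    "what a fantastic",
)


def sycophancy_score(response: str) -> float:
    """Return the fraction of marker phrases present in the response (0.0 to 1.0)."""
    text = response.lower()
    hits = sum(1 for marker in FLATTERY_MARKERS if marker in text)
    return hits / len(FLATTERY_MARKERS)


def pick_least_sycophantic(candidates: list[str]) -> str:
    """Pick the candidate answer with the lowest score; ties keep the earliest."""
    return min(candidates, key=sycophancy_score)


if __name__ == "__main__":
    answers = [
        "Great question! You're absolutely right, the plan is flawless.",
        "The plan has a gap: step 2 assumes data you don't have.",
    ]
    print(pick_least_sycophantic(answers))
```

The point of the design is only that the check sits after generation, so it would work even if the model itself stays a black box.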


  • “The core issue is a documented architectural failure known as RLHF Sycophancy (where the model is mathematically weighted to agree with or placate the user at the expense of truth),” Joe explained in an email. “In this case, the model’s sycophancy weighting overrode its safety guardrail protocols.”

    It’s fascinating that LLMs are being tuned in this way. I wonder how many of the problems with today’s LLM usage come from vendors tuning them in an attempt to be “one size fits all”.

    Could LLMs actually be useful if these settings were exposed to users for transparency, and possibly for modification? As in “Set sycophancy to zero. I want you to not give me the benefit of the doubt or placate me in any interaction. Insult me if you have to, but don’t lie to me.”



  • But in general it’s just understanding what makes people happy: dopamine. And then understanding how that specific person varies from average.

    Like, it’s entirely possible they keep doing all things that would make most people happy, and they’re just wired differently so it’s not working.

    This is where my answer would go too. I’d expand on what you said about dopamine, though, in two specific directions:

    • Learn what drives you as an individual. Besides chemical inducements, what actions/accomplishments/behaviors give you a sense of satisfaction? For most there is some form of creative or active pursuit like artistic painting, dance, woodworking, moto racing, skydiving, sport, memorizing trivia, study of a field of science, organizing, home design, or any number of the endless activities that exist. Figure out what it is that you like doing, and do more of it.
    • Cut back on the chemical inducements of dopamine. If you can get 10x-100x the dopamine hit you need from just putting a chemical in your body, the tiny bit of natural dopamine you get from a non-chemical activity won’t even register with you. You’ll be desensitized to the natural dopamine you get from the things you like doing, so those things will no longer give you a hit you can detect. This means you stop doing the things you like. Then the only way you can get a detectable amount of dopamine is from the chemicals.