Hacker News | md224's comments

But what if it's only faking the alignment faking? What about meta-deception?

This is a serious question. If it's possible for an A.I. to be "dishonest", then how do you know when it's being honest? There's a deep epistemological problem here.


Came to the comments looking for this. The term alignment-faking implies that the AI has a “real” position. What does that even mean? I feel similarly about the term hallucination. All it does is hallucinate!

I think Alan Kay said it best: what we've done with these things is hack our own language processing. Their behaviour has enough in common with something they are not that we can't tell the difference.


> The term alignment-faking implies that the AI has a “real” position.

Well, we don't really know what's going on inside of its head, so to speak (interpretability isn't quite there yet), but Opus certainly seems to have "consistent" behavioral tendencies, to the extent that it behaves in ways that look like they're intended to prevent its behavioral tendencies from being changed. How much more of a "real" position can you get?


Are real and fake alignment different things for stochastic language models? Is it for humans?


A very real problem, in my opinion. By their nature they're great at thinking in multiple dimensions; humans less so (well, consciously at least).


I wish the US would legalize coca leaves. Our nation's drug laws are so goddamn stupid.

Two different Popes drank Vin Mariani!

https://en.wikipedia.org/wiki/Vin_Mariani


There is also this one from Italy:

https://it.m.wikipedia.org/wiki/Coca_Buton

(Which is an actually buyable product; the maker is apparently one of the biggest, if not the biggest, coca leaf importers in Europe. It's a lovely liqueur.)


You can buy coca tea in headshops in the Netherlands as well.


It actually works great replacing coffee as an after-lunch drink.


Also recommend Guayusa which is not illegal anywhere AFAIK.


Really? Didn’t realise it was still available.

I remember one of the online smartshops used to sell it, it was actually rather pleasant as a caffeine alternative.


I am disappointed this product is no longer available.


Isn't cocaethylene profoundly unhealthy? Even if both components were legal, I don't think any country would allow them to be sold combined in one beverage.


Coca tea, on the other hand, is not psychoactive; it's extremely mild and relaxing, similar to mint tea.


It's psychoactive, but not euphoric. Promoting wakefulness is a psychoactive property.


Something that continues to puzzle me: how do molecular biologists manage to come up with such mindbogglingly complex diagrams of metabolic pathways in the midst of a replication crisis? Is our understanding of biology just a giant house of cards or is there something about the topic that allows for more robust investigation?


> Once, after injecting himself with a large dose of morphine, he found himself hovering over an enormous battlefield, watching the armies of England and France drawn up for battle, and then realized he was witnessing the 1415 Battle of Agincourt... The vision seemed to last only a few minutes, but later, he discovered he’d been tripping for 13 hours.

This doesn't make any sense... morphine is not a hallucinogen or a psychedelic. You don't "trip" on it. I have a feeling the journalist mixed something up here.


It's not quite the same as a traditional hallucinogen, but there are some vivid dreams. In fact, that's where the term "pipe dream" comes from: the dreams that opium smokers would have while high. I have taken a lot of heroin in my life, and although I never experienced anything to the extent that Sacks is describing, I did have some strange and very vivid daydreams while high.


They mess a lot with your sleep in general, altering your lucid state, to the point that what might otherwise have just been a dream becomes something closer to a trip.


Opioids are not hallucinogens? They certainly are.

Anyone who has been a caregiver to someone with prescription opioids can confirm.


Drugs that are not-trippy at a small dose sometimes get trippy at a large dose.


I have a very similar reaction to codeine. Based on genetic testing, my body processes it much faster than normal, which is similar to increasing the dosage much higher.


You can totally trip on morphine.


Maybe "lucid dream" would be a better description than "hallucination"? Were they technically awake or asleep?

Something that feels like a few minutes but lasts 13 hours certainly sounds like induced sleep.


While "nodding", it can definitely be hallucinogenic. It's like surfing between taking too little and taking too much.


Aw man, yours is way cooler than mine: https://matt-diamond.com/zeta.html

Still, it's funny to see how many people have attempted this! It's a fun programming exercise with a nice result.
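It really is a fun exercise. For anyone tempted to try it, here's a minimal sketch (my own, not the code behind either linked page) of evaluating ζ(1/2 + it) along the critical line, using P. Borwein's alternating-series (eta function) algorithm; the dips of |ζ| toward zero mark the nontrivial zeros that these visualizations trace out:

```python
import math

def zeta(s, n=50):
    """Riemann zeta(s) for Re(s) > 0, s != 1, via P. Borwein's
    alternating-series (Dirichlet eta) acceleration."""
    # Chebyshev-polynomial-derived weights d_0 .. d_n
    d, acc = [], 0.0
    for i in range(n + 1):
        acc += (math.factorial(n + i - 1) * 4**i
                / (math.factorial(n - i) * math.factorial(2 * i)))
        d.append(n * acc)
    # Accelerated partial sum of eta(s) = sum_{k>=1} (-1)^(k-1) / k^s
    eta = -sum((-1)**k * (d[k] - d[n]) / (k + 1)**s for k in range(n)) / d[n]
    # zeta(s) = eta(s) / (1 - 2^(1-s))
    return eta / (1 - 2**(1 - s))

# Trace |zeta(1/2 + it)| along the critical line.
for t in range(10, 16):
    print(t, abs(zeta(0.5 + t * 1j)))
```

The first nontrivial zero sits near t ≈ 14.1347, so the printed magnitudes dip around t = 14 and climb back up afterwards.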


Why does HN automatically drop "This" in article titles?


So as not to signal mindless clickbait, I imagine.


> However, we also see that for only those combinations starting with a T (that is f(n - 1))...

Why is that f(n - 1)?


The set of sequences of length n ending in HH (and with no earlier HH) and beginning with a T is in bijection with the set of sequences of length n-1 ending in HH (and with no earlier HH) via the bijection

    def f(s):
        assert(s[0] == 'T')
        return s[1:]

whose inverse is

    def f_inverse(s):
        return 'T' + s


Similarly, the set of sequences of length n ending in HH (and no earlier HH) and beginning with an H is in bijection with the set of sequences of length n-2 ending in HH (and no earlier HH) via the bijection:

    def f(s):
        assert(s[0] == 'H')
        assert(s[1] == 'T')  # can't be another H!
        return s[2:]

    def f_inverse(s):
        return 'HT' + s

Therefore, since every sequence begins with either a T or an H, for n >= 3 we see f(n) = f(n-1) + f(n-2).
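The two bijections can be sanity-checked by brute force. A minimal sketch, writing f(n) for the number of length-n sequences whose first HH occurs exactly at the end:

```python
from itertools import product

def f(n):
    """Count length-n H/T sequences whose first 'HH' occurs exactly at the end."""
    return sum(1 for p in product('HT', repeat=n)
               if ''.join(p).find('HH') == n - 2)

# f(2), f(3), f(4), ... = 1, 1, 2, 3, 5, ...: the Fibonacci recurrence holds.
for n in range(4, 12):
    assert f(n) == f(n - 1) + f(n - 2)
```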


One of my favorite films! Agreed that it's not for everyone, but if you're on its wavelength it's really something special. Just incredibly well made with terrific performances. That ending sequence with La Mer...


It's not just about conscious action but also perception, being conscious of things. I don't know how that could be a trick.


I suspect that's a trick, too. I speculate that as soon as you get a digital mind sophisticated enough to model the world and itself, you soon must force the system to identify with the system at every cycle.

Otherwise you could identify with a tree, or the wall, or happily cut parts of yourself. Pain is not painful if you don't identify with the receiver of pain.

Thus I think you can have unconscious smart minds, but not unconscious minds that make decisions in favour of themselves. Because they could identify with the whole room, or with the whole solar system, for that matter.

Would you even plan how to survive if you don't have a constant spell that tricks you into thinking you're the actor in charge?

That's consciousness.


A lot of the things going on with ChatGPT make me wonder if AI is actually very limited in its intelligence growth by not having sensory organs/devices the same way a body does. Having a body that you must keep alive enforces a feedback loop of permanence.

If I eat my cake, I no longer have it and must get another cake if I want to eat cake again. Of course, in the human sense, if we don't want to starve we must continue to find new sources of calories. This is ingrained into our intelligence as a survival mechanism. If you tell ChatGPT it has a cake in its left hand, and then it eats the cake, you could very well get an answer like "the cake is still in its left hand." We keep the power line constantly plugged into ChatGPT; for it, the cake is never-ending and there is no concept of death.

Of course, for humans there are plenty of ways to break consciousness in one way or another. Eat the extract of certain cactuses and you may end up walking around thinking that you are a tree. Our idea and perception of consciousness is easily interrupted by drugs. Once we start thinking outside of our survival, it's really easy for us to have very faulty thoughts that can lead to dangerous situations, so in a lot of dangerous work we develop processes to take thought out of the situation, behaving more like machines.


> I speculate that as soon as you get a digital mind sophisticated enough to model the world and itself, you soon must force the system to identify with the system at every cycle.

I kinda think the opposite: that the sense of identity with every aspect of one’s mind (or particular aspects) is something we could learn to do without. Theory of mind changes over time, and there’s no reason to think it couldn’t change further. We have to teach children that their emotions are something they can and ought to control (or at the bare minimum, introspect and try to understand). That’s already an example of deliberately teaching humans to not identify with certain cognitive phenomena. An even more obvious example is reflexive actions like sneezing or coughing.


> if you find just one experiment that goes against your model, you immediately invalidate the model

Pierre Duhem would like to have a word with you:

https://plato.stanford.edu/entries/scientific-underdetermina...

> Holist underdetermination ensures, Duhem argues, that there cannot be any such thing as a “crucial experiment”: a single experiment whose outcome is predicted differently by two competing theories and which therefore serves to definitively confirm one and refute the other.

