
Do you have a reference where this is explained? It's not in the docs, as far as I can tell.


You might try this blog entry (or other blog entries that talk about shrinking):

https://hypothesis.works/articles/how-hypothesis-works/


> Shrinking is expressed as manipulations of that tape.

How do you do that in general? I can't find any documentation on that.


This paper by the Hypothesis authors has some information: https://www.doc.ic.ac.uk/~afd/papers/2020/ECOOP_Hypothesis.p...
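For intuition, here's a toy model of the idea in Python. This is not the real Hypothesis API (draw, gen_list, and shrink are invented names): generation consumes choices from a recorded tape, and the shrinker replays the test on "smaller" tapes, keeping any change that still fails.

    import random

    def draw(tape, i, hi):
        # Read choice i from the tape; choices past the end are 0,
        # which is by construction the "simplest" value.
        return tape[i] % hi if i < len(tape) else 0

    def gen_list(tape):
        # Interpret a tape as a list of ints: choice 0 is the
        # length, the following choices are the elements.
        n = draw(tape, 0, 10)
        return [draw(tape, i + 1, 100) for i in range(n)]

    def shrink(tape, fails):
        # Greedily try smaller tapes (shorter, or with one choice
        # decremented) and keep any change that still fails.
        improved = True
        while improved:
            improved = False
            candidates = [tape[:-1]] + [
                tape[:i] + [tape[i] - 1] + tape[i + 1:]
                for i in range(len(tape)) if tape[i] > 0
            ]
            for cand in candidates:
                if fails(gen_list(cand)):
                    tape, improved = cand, True
                    break
        return tape

    # Example: shrink a failing case for "sum never exceeds 50".
    fails = lambda xs: sum(xs) > 50
    tape = [random.randrange(256) for _ in range(11)]
    while not fails(gen_list(tape)):
        tape = [random.randrange(256) for _ in range(11)]
    print(gen_list(shrink(tape, fails)))

The real implementation is far more sophisticated, but the key point is the same: because shrinking operates on the tape rather than on the generated value, it works for any generator without per-type shrink functions.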


What is complexity-sorted order?


It's when you generate randomized data in order, from the simplest to the most complex.


Isn't that very wasteful or difficult to do in practice? Given that shrinkers generally treat lower numbers as 'simpler' than higher numbers, complexity ordering would require you to generate all the numbers from low to high.


Not really. There are many ways depending on your needs. For example, you can partition your space first, then generate randomly inside each of the subspaces. Let's say I need 200 numbers from -1000 to 999. The first range will be 0 to +99, the second -1 to -100, then +100 to +199, and so on. So, to generate a random number I just need an index and the bounds.
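A minimal sketch of that banding idea in Python (the width-100 bands and the alternation between positive and negative ranges follow the example above; band_bounds and generate are made-up names):

    import random

    def band_bounds(index, width=100):
        # Map a complexity index to the bounds of its band:
        # [0, 99], [-100, -1], [100, 199], [-200, -101], ...
        k = index // 2
        if index % 2 == 0:
            return k * width, (k + 1) * width - 1
        return -(k + 1) * width, -k * width - 1

    def generate(count, width=100):
        # One random number per band, simplest band first.
        for i in range(count):
            lo, hi = band_bounds(i, width)
            yield random.randint(lo, hi)

    print(list(generate(6)))  # e.g. [42, -63, 151, -187, 230, -278]

So the values grow roughly in complexity order without ever enumerating the space; each draw is O(1) given the band index.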


This post has many upvotes, but all the comments question its usefulness, and so far there's no response justifying it. I have the same question, and I wonder what's going on with this post.


No, it's about the distribution being injective, not a single sampled response. So you need many outputs for the same prompt, and you need to know which LLM produced them; then, in theory, you should be able to reconstruct the original prompt.


This is a cool example of how specializing a generic algorithm to a specific subspace can yield much better results. That's quite often the case in my experience, but we often don't bother exploiting properties that are specific to our problem space, and just apply the generic algorithm out of convenience (and because it's often good enough).


I wrote my thesis on this! Application-specific system design can get you orders of magnitude performance improvement, as well as better scalability/fault tolerance properties. I focused on graph analytics, but it's reasonable to think it applies more broadly.

Definitely true that application-specific design is often not worth the investment though. Chasing that 1000x improvement can easily cost you a year or two.


Can you please link to your thesis? This sounds very interesting.



Came here to say this, but with caveats. The particular domain has extra properties that allow their "stupider" algorithm to work better in their case. But a general graph drawing system has to deal with the inherent generality of the domain.

Usually there is a good middle ground: heuristic analysis of the input to check whether it fits one of the special-case "stupid and fast" algorithms, with the sophisticated general algorithm as a fallback that works for everything and comes with guarantees.
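A toy sketch of that dispatch pattern, using graph layout as the example (all names are invented; a real system would plug in actual layout routines):

    from collections import deque
    import random

    def is_tree(adj):
        # adj: dict mapping node -> set of neighbours (undirected).
        # A graph is a tree iff it is connected with n - 1 edges.
        n = len(adj)
        if sum(len(vs) for vs in adj.values()) != 2 * (n - 1):
            return False
        root = next(iter(adj))
        seen, queue = {root}, deque([root])
        while queue:
            for v in adj[queue.popleft()]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        return len(seen) == n

    def layout_tree(adj):
        # Cheap special case: BFS depth gives y, visit order gives x.
        root = next(iter(adj))
        pos, queue, x = {root: (0, 0)}, deque([root]), 0
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in pos:
                    x += 1
                    pos[v] = (x, pos[u][1] + 1)
                    queue.append(v)
        return pos

    def layout_generic(adj):
        # Stand-in for the general fallback (force-directed, etc.).
        return {v: (random.random(), random.random()) for v in adj}

    def layout(adj):
        # Heuristic dispatch: fast special case when the input
        # fits, general algorithm with guarantees otherwise.
        return layout_tree(adj) if is_tree(adj) else layout_generic(adj)

    print(layout({1: {2, 3}, 2: {1}, 3: {1}}))

The check itself has to be cheap relative to the fallback, otherwise the dispatch eats the savings.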


That's how all application-specific specializations work, though: you take advantage of domain properties that let a less generic algorithm suffice.


What is a CLO?


Chief Legal Officer.


What is the business model?


See helix.vim on GitHub


Yes, it feels negligent that the author has not mentioned this prior art, especially since the two are nearly identical.

