TL;DR: sample the top N results from the LLM and use traditional NLP to extract factoids. If the LLM is confabulating, the factoids will have a roughly random distribution; if it isn't, they will be heavily weighted towards one answer.
It's also interesting to see what temperature value they use (1.0, 0.1 in some cases?)... I have a feeling using the actual raw probability estimates (if available) would provide a lot of information without having to rerun the LLM or sample quite as heavily.
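For what it's worth, here's a minimal sketch of the distribution check in C++ (the sampling loop and the NLP factoid extraction are assumed to happen elsewhere, and the 0.7 threshold is made up):

  #include <cmath>
  #include <iostream>
  #include <map>
  #include <string>
  #include <vector>

  // One extracted factoid per sampled answer. Returns 0 when every sample
  // agrees and 1 when the answers are spread uniformly (i.e. look random).
  double normalized_entropy(const std::vector<std::string>& factoids) {
      std::map<std::string, int> counts;
      for (const auto& f : factoids) ++counts[f];
      if (counts.size() < 2) return 0.0;
      double h = 0.0;
      const double n = static_cast<double>(factoids.size());
      for (const auto& kv : counts) {
          const double p = kv.second / n;
          h -= p * std::log(p);
      }
      return h / std::log(static_cast<double>(counts.size()));
  }

  int main() {
      // Hypothetical factoids pulled from 5 samples of the same prompt.
      std::vector<std::string> samples = {"1912", "1912", "1912", "1913", "1912"};
      std::cout << (normalized_entropy(samples) > 0.7 ? "likely confabulated\n"
                                                      : "likely grounded\n");
  }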
Or we could just ask the same question of three different LLMs (ideally a large LLM, a RAG-augmented LLM, and a small one), then use an LLM again to rewrite the final answer. When models contradict each other there is likely hallucination going on, but correct answers tend to converge.
Why use an LLM to check the work of a different LLM?
You could use the same technique that this paper describes to compare the answers each LLM gave. LLMs don't have to be in opposition to traditional NLP techniques.
A reply in the thread from David Major about performance:
"At the moment, performance is a mixed bag. Some tests are up and some are down. In particular I believe Speedometer is down a few percent."
"Note however that clang-cl is punching above its weight. These builds currently have neither LTO nor PGO, while our MSVC builds use both of those. Any regressions that we're seeing ought to be short-lived. Once we enable LTO[1] and PGO[2], I expect clang to be a clear performance win."
LTO can be particularly handy when doing devirtualization. Our experience ([1]) with Chromium demonstrated that there's a fair number of virtual methods which have exactly two implementations in the code: one for production and one for tests. When linking production code, it's trivial for this optimization to spot that, replace the virtual call with a regular invocation, and then often inline it and allow for other in-place optimizations. In many cases in the renderer (Blink), that gives a 3-7% speedup out of nowhere.
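To make the shape of it concrete, a toy sketch (names made up, not actual Chromium code): when the production image contains exactly one implementation of an interface, whole-program LTO can prove the call target.

  // codec.h
  struct Codec {
      virtual ~Codec() = default;
      virtual int decode(int x) = 0;
  };

  // prod_codec.cc -- the only implementation linked into the production binary
  // (the test double lives in a file that never makes it into this build).
  struct ProdCodec final : Codec {
      int decode(int x) override { return x * 2; }
  };

  // caller.cc
  int run(Codec& c, int x) {
      // Normally an indirect call through the vtable. With whole-program LTO
      // the linker sees ProdCodec is the only Codec in the image, turns this
      // into a direct call, and can then inline it.
      return c.decode(x);
  }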
Out of curiosity, why use virtual calls in those cases? Why not just have two different implementation files -- one for tests, and one for production -- and choose one based on which one you're doing? It seems like a win all around.
There are always performance hacks to be found in big apps, in exchange for implementation complexity and man-hours. And there's always a next bottleneck to be found. Performance bugs are subject to an extended form of survivorship bias where they become bigger when the preceding perf bottlenecks get fixed.
I've refreshed and seen three major revisions of your comment after writing a full reply to the first one, and I can't say I've been able to clearly follow where you're coming from and where you're going with any of them. I would suggest that if you feel the need for such drastic revisions, it may be worth reconsidering whether whatever you're trying to argue is really compelling.
His advice is sage, and you should look past whatever deficiencies you find in his form of communication. Simply put, there are always bigger fish to fry, and virtual methods are a quick and easy way to implement test stubs, especially when the compiler devirtualizes.
Well I had a response to that originally but his edits kind of changed how much sense my reply made in response. Here's what I had:
Performance is only one aspect of it. It also reduces code bloat, reducing the program's size footprint. Most tests (yes, I know, not all, but most) should not make it into the final binary users are running. I also don't see what's "hacky" about making a foo.test.cc file when I want an alternate implementation for foo.cc. It seems to be quite a positive and clear way to document the fact that an implementation is only needed for testing, and vice-versa. And not only that, but it reduces compile (& link) times, since you only need to compile one of the two implementations for each use case.
Yeah, all things being equal I'd choose your approach. But in an existing project that did things this way, it would be the last thing I'd touch. This is of course barring extreme circumstances, which profiling and other tools would identify.
Unfortunately a lot of C++ developers don't really understand how the linker works, and because of that the build system is black magic.
Yeah, I wasn't suggesting they immediately dispatch a team for transforming everything into this. But at least from now on they could try to get people to use this pattern, and make the move gradually.
I want to reiterate what I think the broader point is: if this is the most pressing problem, you have a very highly functioning team. There's only so much bandwidth for change, and I frequently see it get spent poorly.
Sorry! That's a risk you have when following comments tightly :)
I'll reconstruct my first take from memory if you have comments on it:
Current CPUs are pretty good at predicting indirect branches, and it's hard to tell beforehand which virtual calls will actually turn out to be perf problems - it's wasted effort and a fallible process to attempt their complete elimination up front.
Depends. Virtual calls can be done purely in code; with different implementation files you have to go through the build system. The former just always works independently of platform, whereas the latter often means more configuration work.
Also, ignoring YAGNI, you could say the virtual one makes it easy to quickly test things out etc, is easily mockable, ...
From my point of view, a benefit of virtual calls is toolability. If you have an abstract interface, it's pretty easy to either directly create mocks for it in the test file or use something like gmock. If you go the link-a-completely-different-implementation route, you might need to create and link a different library which contains the faked implementation for several test cases. It also works well with header-only code.
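For example (a sketch, assuming an abstract Widget interface along the lines discussed elsewhere in the thread, with gmock/gtest as the test framework):

  #include <gmock/gmock.h>
  #include <gtest/gtest.h>

  class Widget {
   public:
    virtual ~Widget() = default;
    virtual void throb() = 0;
  };

  // The mock lives right next to the test; no separate test library to link.
  class MockWidget : public Widget {
   public:
    MOCK_METHOD(void, throb, (), (override));
  };

  TEST(WidgetUser, ThrobsExactlyOnce) {
    MockWidget w;
    EXPECT_CALL(w, throb()).Times(1);
    w.throb();  // in real code the code under test would make this call
  }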
If I understand you correctly aren't you violating the initial assumption that there are only 2 implementations, one for production and one for testing? At which point, you're answering a different question...
>Out of curiosity, why use virtual calls in those cases? Why not just have two different implementation files -- one for tests, and one for production -- and choose one based on which one you're doing? It seems like a win all around.
I don't fully understand the proposal here. Do you have an example of code in the wild that uses the proposed scheme?
I'm just saying do conditional compilation instead of virtual dispatch, since you generally shouldn't really need both the test and production implementations to run inside the same program. So instead of a virtual Widget base with a WidgetImpl for production and a WidgetTest for testing, you'd have something like
// widget.h
class Widget { public: void throb(); };  // note: no virtual
// widget.cc
void Widget::throb() { /* production implementation */ }
// widget.test.cc
void Widget::throb() { /* test implementation */ }
where you only compile widget.cc for the production build, and only compile widget.test.cc for the test build.
Feel free to post a more specific example and I can take a stab at seeing if I can make this transformation to it (or why it might not be possible, if you think it's not).
Thank you for elaborating on your proposal. It might work in a theoretical scenario, but it's unlikely to be practical for existing projects with millions of lines of code.
Essentially, it was easier to write a compiler / linker optimization, than change the source code. On the bright side, everyone wins!
> It might work in a theoretical scenario, but it's unlikely to be practical for existing projects with millions of lines of code.
I mean there's no reason you have to change everything by tomorrow. You could introduce it as a policy moving forward, the same way I'm sure you make any other changes to other patterns that need to be followed. I'm sure you have a process for this?
> Essentially, it was easier to write a compiler / linker optimization, than change the source code.
I agree, but see above and below.
> On the bright side, everyone wins!
Well, you waste space, time, and energy making the compiler and linker do work that doesn't inherently need to be done. You also lose parallelizability of the compilation of the two implementations if they currently reside in the same file. Finally the compiler (er, linker) might not be able to actually apply the devirtualizations you expect, whereas in this case it's just a matter of inlining since the target method is already known.
> Well, you waste space, time, and energy making the compiler and linker do work that doesn't inherently need to be done.
It's repetitive, error-prone work; better to have the compiler and linker do it than rely on the programmer getting their use of the preprocessor right.
And even if the programmer gets it right, doing it via macros means every tool that you want to apply to your codebase - IDEs, profilers, coverage, instrumentation - needs to understand your macro. Are you sure they'll all get it right?
Better to write plain old standard code that every tool will work correctly with, and the worst thing that can ever happen is a slight performance penalty.
I misunderstood, but what you're actually advocating has all the problems I mentioned only even more so. Any tool that you want to apply to your codebase will have to understand not just a macro, but your build configuration.
How would you do it without using the preprocessor? Honest question. You could leverage the build system to do that instead but that seems even worse to me... You are effectively hoisting an implementation detail up into the build system.
Like the other commenter said, I'd rather do simple, virtual calls and then fix it using LTO.
> Well, you waste space, time, and energy making the compiler and linker do work that doesn't inherently need to be done.
You're suggesting that instead of a compiler wasting time, space, and energy, we get actual people to waste their time, space, and energy manually applying the transformation that the compiler is able to do anyway?
> lava flow is a problem in which computer code written under sub-optimal conditions is put into production and added to while still in a developmental state.
I don't know what you're referring to as the "suboptimal code" here. If it's the old pattern using virtual everywhere, then the ship's already sailed... that code is already in production. If you mean the two-file approach I proposed, then that makes no sense; if it's worse than the last approach, why would you even use it?
What I meant is that you will end up permanently with multiple solutions for the same problem if you try to change things gradually. This has several drawbacks: additional complexity from having to support both versions, confusion among new developers about which variant to use, and a higher likelihood that someone will add a third variant. You have to weigh the benefits of the new alternative against having a more complex code base or doing the grunt work of changing everything at once.
So in this example you have to weigh the benefits of the two file tests against having tests that work uniformly.
In performance-insensitive code the approach you defend could be "good enough," but for code on the "hot path," changes like the one proposed by dataflow could result in a consistent improvement and are surely the better long-term "win."
"We do it that way here" is not something that has to be always applied.
My DoSomething function needs to link to widget, but now I have two different libraries that widget could be in. The linker encodes in my binary which library to load to get widget. All attempts to use the other library instead are undefined behavior.
The only way I know of to work around that is to build DoSomething twice, linking the real widget once and the test widget the other time. Note that DoSomething takes two parameters, and those are classes that may themselves also take other classes with virtual functions. All the possible cases quickly get into a combinatorial explosion, and my build times go from an hour (already way too long) to days.
If I'm wrong please tell me how, I'd love to not have so many interfaces just because someone doesn't want their unit test to become a large scale integration test.
What class are you deriving from though? Can’t be Widget, because in the scenario you described Widget and TestWidget are mutually exclusive (TestWidget just replaces Widget altogether). The alternative is extracting the shared behaviour to a BaseWidget that both classes extend, but that’s just ugly as sin.
Yeah? I didn't get rid of throb2()'s virtual, because it actually had a good reason to be virtual. But I very much got rid of throb1()'s virtual because it didn't need to be virtual, and that was the point. It's to be expected that if you add extra constraints in the problem, you are likely to end up with extra constraints in the solution. I was just presenting an approach that let you limit virtuality (word?) to where it's actually needed, meaning that what you gained here was not having to pay costs you clearly didn't need. It's obviously not a magic pill to remove every virtual, which I would have hoped was pretty clear. It's your job to extend the approach to fit the precise constraints you're dealing with, whether it's adding pimpl or keeping a virtual or whatever. And remember you can always avoid this approach if you think it's too ugly in this scenario or have whatever complaints. You can still use it when you don't have extra constraints. Nobody's forcing you to pick a side. Just use the best approach for the problem at hand.
I mean, then you can avoid it in that situation. Or actually use the production version like I explained here: https://news.ycombinator.com/item?id=17505358 Or find another way to use it. It's not all-or-nothing.
Yes that works, but relies on devirtualization in the production code, and isn't at all what you were suggesting before about having separate implementation files.
Yeah, that would work but would also make your code base a mess. Things that you can catch during the build step and move in or out are much better handled at that level rather than leaking into your code base.
That's really cool. I've primarily seen LTO benefits on the more embedded side of things where it ends up enabling far more aggressive dead code elimination between code and libraries which makes it much easier to fit into small chips. I hadn't thought much about virtual methods with C++ code (uncommon to see those in embedded code to begin with).
> Similarly, awards with vesting triggers based on exit events such as an initial public offering or change-in-control would be taxable on grant unless they require the recipient to be employed through the liquidity date
Double-trigger RSUs are popular in late-stage unicorns where the exercise price for ISOs has already hit a high level but the stock isn't liquid enough to sell to cover the tax as they vest.
A lot of people might be affected, I wonder if existing grants are grandfathered in?
Not at all. It's a recognition that the people who made long-term arrangements under the old law shouldn't be punished for that. The safer people feel making commitments, the more we can pursue things that take time to pay off, which has broad societal benefits. And in practice, laws are much harder to change if you piss off a lot of people who didn't do anything wrong but now suddenly stand to lose big.
As an example, take the mortgage interest tax credit. Personally, I strongly believe it should go away; if the government is going to subsidize housing, it shouldn't spend most of that money on the already well off. But I think it would be wrong to screw all the homeowners who bought a house expecting the credit. We'd see a wave of disruption (short sales, foreclosures, people suddenly barely scraping by) that benefits nobody. So I'm fine with grandfathering existing mortgages and gradually phasing out the credit.
It would be totally reasonable to phase out the mortgage interest deduction over ~10 years, giving people time to plan and decide how they wanted to handle it. The cap is currently $1M; let's reduce it by $100k/yr (or $50k/yr over 20 years).
There is an expectation of predictability in our legislation. Grandfather clauses preserve predictability for those who made decisions based on the old tax code. Sometimes binary grandfather status is not financially or politically possible and it's phased out instead. I think this kind of approach to legislation and regulation is fair and isn't commentary on the quality of the change.
Without speaking to this particular grandfather clause, I think in general these kinds of clauses deal with the fact that a new policy, good or bad, might cause people to design their incentive structures differently, and they want to avoid screwing over people who in good faith designed their structures to work best under the old system, and instead give them time to make adjustments.
What irks me is that tax policy is such that complicated incentive programs used to defer compensation and tax obligation are a thing in the first place.
Deferred compensation plans are not solely to manage or modify tax treatment of compensation.
As a long-term shareholder, I want the CEO, board, and senior executive compensation to be tied to long term shareholder value creation. By far the easiest way to do that is to create a deferred compensation plan that ties their financial outcome to that of a long-term shareholder.
If I win big holding their shares, I want them to win big.
If they just match the market, they should get paid something for their time, but not anything exceptional.
ISOs aren't really a tax dodge. They're designed to encourage employees and management to put extra work and thought into the company they're building. I can tell you that if I received the value of my ISOs in cash, I would not come in to work early and leave late in the hope of a payoff sometime in the future. Unless you're working at a company doing something truly interesting, I think most employees would fall into my camp.
Should taxpayers be forced to amend prior years' returns based on a new tax policy? (I believe no.)
If this policy were passed and applied retroactively to vested (but not yet received, due to a time, performance, or other double trigger) compensation, that's what would seem to need to happen.
Other people have made plans for retirement or other meaningful milestones based on the current tax law. There is value in having stability and being able to plan around tax and other long-term financial realities.
Further, due to performance-dependent multipliers, it's often not even possible to know what the amount should be (which I admit applies to both grandfathered and not grandfathered comp).
New laws shouldn't, in general, be retroactive. The fact that a grandfather clause is necessary to specify that a law isn't intended to be retroactive is bothersome.
You need to have clauses to specify retroactivity, otherwise you get into a world where the government has to wait for everyone alive to die before they're allowed to implement a new taxation plan. There have to be limits somewhere, and that's exactly what these clauses need to implement.
Certainly not. It's all based on the purchase/exercise/offering date. Let's take it to the extreme: if homicide had been legal, and society decided we needed to outlaw it, it stands to reason that homicide committed before the law was enacted would not be prosecuted, but only those cases happening on or after that date.
A homicide law isn't ambiguous. Either you killed the dude before it was enacted or after. You know, to the second, whether or not you're a murderer.
Tax law isn't so clear. There's when you were granted the shares, when you acquired them (which may or may not be a taxable event, depending, among other things, on whether there's a spread between strike price and FMV), when you sold them (which is a taxable event), and when the law was enacted. There are probably yet other subtleties beyond those.
The grandfather clause covers the case when the enactment date falls amongst the others. ISOs purchased before the enactment but sold after. ISOs granted before the enactment, but purchased after. Double-triggers. Are you sure you know how the law applies, and what your tax liability is, without explicit statute to that effect, in all of those cases — or others I haven't listed, or even imagined? Is your accountant? Are you willing to bet an audit on that?
No, it's really not. The action that put the person in the hospital where they lingered for a month before dying either happened when it was legal to homicide, or not. Without that specific action, the victim would not be dead — or, more specifically, would not be in a situation that led to their death.
Criminal law absolutely recognizes that causal chains have a "first link", without which the rest of the chain wouldn't even have happened. It also specifically subjects the party causal to that first link to special scrutiny.
A grandfather clause like this doesn't make a law "retroactive". It exempts people who would be subject to the law because of things that happened before its passage from its effects, when the specific behavior in question (say, e.g., filing your taxes after the law's passage, on income earned before that passage) is newly affected. It prevents "punishing" people for behavior that wasn't contrary to the law, when they engaged in it.
It is, therefore, the exact opposite of retroactive.
I think you've misunderstood me. I've reworded to help. I did not intend to imply that I thought such a clause made the law retroactive. I'm saying that by default, laws should not be retroactive and that needing a clause to specifically prevent retroactivity is annoying as it specifies what should already be the default.
criminal law != civil law. And the law in fact doesn't apply retroactively, but it would apply to events in the future, that were determined in the past, based on certain assumptions.
Yes, the only reason the government is hiding time machines from us is their inability to finish the tax laws that would apply :)
Analogy: If congress raises gas taxes (nobody said analogies have to be realistic), everyone would have to pay more at the pump. But that doesn't mean the tax is applied retroactively.
Fast! is a Microsoft Research language for tree transforms; the cool part is the "deforestation," where a sequence of functions can be transformed into a single function.
What makes this cool is that it isn't "runtime" composition; the generated C# code does the transformation in a single pass. For example, a sequence of maps and filters would only loop over the list once.
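Not Fast! itself (or its generated C#), but the same idea sketched in C++: composing a map and a filter naively materializes an intermediate list per stage, while the "deforested" version is one loop with no intermediate structure.

  #include <iostream>
  #include <vector>

  // Unfused: each stage walks the data and allocates an intermediate vector.
  std::vector<int> doubled(const std::vector<int>& xs) {
      std::vector<int> out;
      for (int x : xs) out.push_back(x * 2);
      return out;
  }
  std::vector<int> keep_positive(const std::vector<int>& xs) {
      std::vector<int> out;
      for (int x : xs) if (x > 0) out.push_back(x);
      return out;
  }

  // Fused: the composed map+filter collapsed into a single pass -- the kind of
  // function the deforestation step generates for you.
  std::vector<int> doubled_positive(const std::vector<int>& xs) {
      std::vector<int> out;
      for (int x : xs) {
          const int y = x * 2;
          if (y > 0) out.push_back(y);
      }
      return out;
  }

  int main() {
      const std::vector<int> xs = {-3, 1, 4, -1, 5};
      for (int v : doubled_positive(xs)) std::cout << v << ' ';  // 2 8 10
  }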
It's worth noting here that they demonstrated this effect with language learners, i.e. it's very likely that the students had to laboriously work out the foreign-language questions.
Basically they've demonstrated that if people spend a lot more time on a question they answer less emotionally (which seems like it would be true independent of the language).
If they can demonstrate the effect still holds for second-language speakers with native-level proficiency, then this would be a much more interesting result.
Just an anecdote: I speak French with native proficiency, and while I can see the emotional detachment in languages such as English, in French it is certainly not there. So I think it also depends on language immersion.
There's some guys in Seattle doing this using motion-capture cameras for the positioning:
I answer that I'm terrified of running into something in the real world while my vision is obscured, but they insist I'll be fine. It turns out the game space's virtual walls correspond to the real ones in the admittedly small office I occupy in meatspace.
Presuming this is RFC 3229, this is transport compression, not webserver offload.
The response is generated by the origin webserver as normal. But rather than sending that response using the normal HTTP encoding, instead the proxy first does a binary diff against any versions that the (CloudFlare) client says it has and that the (CloudFlare) proxy also has in its cache. They use e.g. ETags or MD5 to uniquely identify the entire response content.
You can still do cookie stripping etc to try to avoid the request to the webserver altogether, but that's a separate concern.
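A rough sketch of the proxy-side decision (names and the diff routine are hypothetical; this is just the shape of the scheme, not CloudFlare's code):

  #include <map>
  #include <optional>
  #include <string>

  // Stand-in for a real binary-diff routine (something bsdiff-like in practice).
  std::string binary_diff(const std::string& old_body, const std::string& new_body) {
      return "<delta: " + std::to_string(old_body.size()) + " -> " +
             std::to_string(new_body.size()) + " bytes>";
  }

  struct DeltaCache {
      std::map<std::string, std::string> by_hash;  // content hash -> full body
  };

  // Called with a freshly generated origin response. `client_has` is the hash
  // of the version the other end says it already holds (if any).
  std::string encode_response(DeltaCache& cache,
                              const std::optional<std::string>& client_has,
                              const std::string& new_hash,
                              const std::string& new_body) {
      if (client_has) {
          auto it = cache.by_hash.find(*client_has);
          if (it != cache.by_hash.end()) {
              return binary_diff(it->second, new_body);  // both ends have the old version
          }
      }
      cache.by_hash[new_hash] = new_body;  // remember it for next time
      return new_body;                     // fall back to the full response
  }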
There isn't a per-site cache in Railgun because it's part of our large shared in-memory cache in our infrastructure.
Currently, cookies are not part of the hash.
We have customers of all types using Railgun. As an example, there's a British luggage manufacturer who launched a US e-commerce site last month. They are using it to help alleviate the cross-Atlantic latency. At the same time they see high compression levels as the site boilerplate does not change from person to person viewing the site.
What sort of sites do you think it doesn't apply to?
> What sort of sites do you think it doesn't apply to?
Single page webapps. In those cases the html/js is normally static and already CDN'ed and the data is a JSON API which varies on a per user basis.
There would be some gain as the dictionary would learn the JSON keys but I doubt it would be very dramatic vs deflate compared to the content sites referenced in the article.
Yes. That's up to the particular configuration of the site. It varies from site to site, but for optimal results you want it big enough to keep the content of the common pages of your site.
This looks like Twitter's version of Web Components?
You might have heard of it: shadow DOM, etc. Basically the idea is to be able to add GUI components, e.g. <progressbar>, that are as integrated as the native ones are.
Mozilla's X-Tags implementation seems closer to the goal though: http://www.x-tags.org/ IIRC it can do this because it's using Object.observable internally to detect DOM changes.
i.e. with X-Tags you don't need the JavaScript definition part, just: <x-map data-key="39d2b66f0d5d49dbb52a5b7ad87aea9b"></x-map>.
Web Components is the storm that will wipe away all these frameworks: Backbone, Ember, Angular, Knockout, etc.
Unfortunately, we'll probably have to stick with them for the next couple of years, as for now most of the Web Components APIs are available only in Chrome Canary and Firefox dev builds. Monkey-patching (AKA "shimming") to the rescue…
In my opinion, the more closely today's libraries resemble the Web Components specs, the clearer the developers' choice of those libraries should be.
A figure from the paper shows this better than my TL;DR: https://www.nature.com/articles/s41586-024-07421-0/figures/1