Days since last GitHub incident (github-incidents.pages.dev)
208 points by AquiGorka 1 day ago | 127 comments




I found out when Actions started failing again for the Nth time this month.

The internal conversation about moving away from Actions or possibly GitHub has been triggered. I didn't like Zig's post about leaving GitHub because it felt immature, but they weren't wrong. It's decaying.


If you consider that an American maintainer was cheesed off enough to move an entire project off GitHub two days before Thanksgiving then the tone of the original post was completely in line with the energy involved.

Anger is a communication tool. It should absolutely be used when boundaries are being violated. Otherwise you’ll get walked all over.


I mostly agree, but a generalized attack on the remaining GitHub workers by calling them "losers" and then "rookies" is unwarranted and leaves a bad taste IMO.

See the edit history here: https://news.ycombinator.com/item?id=46133179

Edit: 1. just to be clear, it's very good that they have accepted the feedback and removed that part, but there's no apology (as far as I know) and it still makes you wonder about the culture. On the other side, people make mistakes under stress. 2. /s/not warranted/unwarranted/


A structural engineer will not sign off on bad designs no matter how much pressure the company applies to them. They will resign and/or report the incident to their local regulator as a safety issue.

We don't have that for developers. Maybe shame/offense is our next best bet. You are free to work for a terrible company accepting and/or encouraging terrible design decisions, but you need to take into account the potential of being laughed at for said decisions.


The Zig post has since been updated and the objectionable parts have been removed. I think we can put that part to rest.

I have no problem with their opinions but I don’t think it should have been said in a Foundation post.

It may have been updated, but nobody is reading the update.


OK, so we keep banging on about it forever? Move on

I just did and wouldn't have known about it if it hadn't been talked about here.

Idk, if being bad is the reason for leaving Github Actions, I think people would have left it ages ago. It stuck not because it is better than competitors but because it is included in the Github plans. "It's decaying" implies that it has somehow become worse; in fact it was one of the worst implementations to start with.

GitHub seems to have come under the same management as VSCode: everything has to be made AI and that is the only priority. It's like the Google+ of old but stupider.

Hopefully with that much AI they can finally make the Explore page more useful than "most stars" and "most recent updated". There seems to be no way to discover stuff on GitHub except knowing where it is (hence not discovering but knowing).

Combined with security concerns, this made us reconsider even our self-hosted GH Actions last month.

GH Packages is something we're extricating ourselves from after today too. One more outage in the next year and maybe we get the ammunition to move away from GH entirely.

It's still hard to believe that they couldn't even keep the lights on on this thing.


This is why I keep encouraging folks to a) have a mirror & b) make sure their tools automatically pick up the mirrors.

I recently got mirror support upstreamed into Nixpkgs for fetchdarcs & fetchpijul which actually work on my just-alpha-released pinning tool, Nixtamal <https://darcs.toastal.in.th/nixtamal/trunk/README.rst>, for just this sort of thing.


I envy you. Most of us struggle to get the resources to make our actual customer facing applications resilient, let alone our build pipeline.

Building your software usually involves getting dependencies, & those dependencies are, hopefully, in more than one location—which includes a cronjob to a bare repo, or Alice’s fork on another repo that at least has the latest tags. It should be trivial to point to these as mirrors for the cases where any forge/repository, even the ones held by megacorporations, inevitably go down. Even Nixpkgs itself, while not maintaining their own official mirrors, are mirrored by TUNA. Backups are an important strategy, & the source code should also be a part of that.
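A minimal sketch of what "pointing at mirrors" could look like for a plain git dependency, assuming git is installed. The function name `clone_with_mirrors` and the URLs in the usage line are hypothetical; this is not how Nixpkgs or Nixtamal implement it, just the general fallback idea.

```shell
# Try each URL in order and stop at the first one that clones successfully.
# Usage: clone_with_mirrors <dest-dir> <url> [<mirror-url> ...]
# (hypothetical helper, not part of any existing tool)
clone_with_mirrors() {
  dest=$1
  shift
  for url in "$@"; do
    # git removes the partially created directory itself when a clone fails
    if git clone --quiet "$url" "$dest" 2>/dev/null; then
      printf 'cloned from %s\n' "$url"
      return 0
    fi
  done
  echo "all mirrors failed" >&2
  return 1
}
```

Usage would be something like `clone_with_mirrors ./dep https://forge.example/dep.git https://mirror.example/dep.git`, with the primary first and the mirrors after it.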

That's great for the repository, but what about if you're using ghcr, actions, issues, or copilot?

These are different concerns. There are a lot of use cases where folks are just getting dependencies & not interacting with the bug tracker or continuous integration, which are less critical & can be accessed later or run locally.

I've been getting some weird cryptocurrency spam notifications on GitHub and they can't be cleared for some reason. Blue dot is gonna be there forever apparently. Some users made an issue out of it but nobody cared to fix it.

This should have been fixed here: https://github.blog/changelog/2025-12-04-notifications-trigg...

Are you still seeing it, would you mind checking? Our team will get on it if so.


I had the same. The dot is cleared on the mobile site. It is present on the desktop site.

Somewhat similar situation here. Cleared cache and logged back in on mobile, dot was fixed. Haven't tested on my laptop yet.

Thank you all, sharing internally now to get that fixed! Super appreciate the feedback.

Thanks for fixing it!

Had the same issue that the blue dot won’t disappear. I was able to clear the dot with:

gh api notifications -X PUT -F last_read_at=2025-10-06T00:00:00Z

Just change the date to today. I also got that line from a gh issue somewhere - maybe it was the same issue that you’re referring to.
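The "change the date to today" step can be automated. A small sketch, assuming the gh CLI is installed and authenticated; the function names `now_utc` and `mark_notifications_read` are made up for illustration.

```shell
# Print the current time as an ISO 8601 UTC timestamp (same shape as the
# hard-coded 2025-10-06T00:00:00Z in the command above).
now_utc() {
  date -u +%Y-%m-%dT%H:%M:%SZ
}

# Mark all notifications up to "now" as read.
# Assumption: gh is on PATH and logged in to the affected account.
mark_notifications_read() {
  gh api notifications -X PUT -F last_read_at="$(now_utc)"
}
```

Nothing is sent until `mark_notifications_read` is actually called.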


Same happened to me. You can clear it via the CLI, hilariously.

```

gh api notifications\?all=true | jq -r 'map(select(.unread) | .id)[]' | xargs -L1 sh -c 'gh api -X PATCH notifications/threads/$0'

```


HN doesn't support markdown, but you can "code" format it with 2+ spaces.

https://news.ycombinator.com/formatdoc


Here’s how you can clear it:

https://github.com/orgs/community/discussions/174310#discuss...

I had the same issue too, and this was the only thing that fixed it for me.


Once GitHub fully migrates to Azure, it should be known as GitHub 11.

Github 3.11 for Workgroups.

GitHub Vista Cloud Edition

After a couple of service packs will it be OK?

I think you mean "365 Code Copilot".

AgenticHub!

GitHub Actions is a good example of systems thrown together that at face value have something to offer until they get put under stress.

Just now I found:

    * a job that's > 1 month old, still running
    * another job that started 2 hours ago that had 0 output
    * a job that was marked as pending, yet I could rerun it
    * auto-merges that don't happen
    * pull requests show (1), click it, no pull requests visible

Makes me wonder in how many places state is stored, because there is some serious disconnect between them.

That's just post-Windows 8 Microsoft quality for you. Every product has been like that - looks "ok" on the outside (in reality it looks shit, but at least that's intentional), but the second you dig deeper and start using it you get all kinds of paper cuts like that.

Is GitHub deployed using GitHub Actions?

I was talking with some GH sales/marketing engineers last month and they said it deploys with actions, but they have a custom deploy queue

It is a fun bootstrapping problem. How do you firewall enough dedicated resources to stand up your infrastructure if you dogfood your own product? Probably insidiously easy to have a dependency on the production service.

An Azure outage took out Office365 the day before CrowdStrike happened. I would not trust Microsoft to get this balance right.

Missed a chance to put this in meme format e.g. https://imgflip.com/memetemplate/439302803/Days-without-acci...

I've gotten accustomed lately to spending a lot of time in the Github Copilot / agent management page. In particular I've been having a lot of fun using agents to browse some of my decade-old throwaway projects; telling it to "setup playwright, write some tests, record screenshots/videos and commit them to the repo" works every time and it's a great way to browse memory lane without spending my own time getting some of these projects building and running again.

However this means I'm now using the Github website and services 1000x more than I was previously, and they're trending towards having coin-flip uptime stats.

If Github sold a $5000 box I could plug into a corner in my house and just use that entire experience locally I'd seriously consider it. I'm guessing maybe I could get partway there by spending twice that on a Mac Pro but I have no idea what the software stack would look like today.

Is there a fully local equivalent out-of-the-box experience that anyone can vouch for? I've used local agents primarily through VSCode, but AFAIK that's limited to running a single active agent over your repo, and obviously limited by the constraints of running on a single M1 laptop I currently use. I know at least some people are managing local fleets of agents in some manner, but I really like how immensely easy Github has made it.


None of the open weights models you can run locally will perform at the same level as the hosted frontier models. Some of them are becoming better, but the step-down in output quality is very noticeable for me.

> If Github sold a $5000 box I could plug into a corner in my house and just use that entire experience locally I'd seriously consider it. I'm guessing maybe I could get partway there by spending twice that on a Mac Pro but I have no idea what the software stack would look like today.

Right now, the only reasons to host LLMs locally are if you want to do it as a hobby or you are sensitive about data leaving your local network. If you only want a substitute for Copilot when GitHub is down, any of the hosted LLMs will work right away with no up front investment and lower overall cost. Most IDEs and text editors have built-in support for connecting to other hosted models or installing plugins for it.

> I know at least some people are managing local fleets of agents in some manner,

If your goal is to run fleets of agents in parallel, local LLM hosting is going to be a bottleneck. Familiarize yourself with some of the different tool options out there (Claude Code, Cline, even the new Mistral Vibe) and sign up for their cloud API. You can also check OpenRouter for some more options. The cloud hosted LLMs will absorb parallel requests without problem.


Thank you, a bit sad to hear that local inference isn't really at this level of performance yet. I was previously using the VSCode agent chat and playing with both OpenAI and Github hosted models but I switched to using the Github web UI directly a lot since my workflow became a lot more issue/PR-focused. Sounds like I should probably tighten up the more generic IDE-centric workflow and make it a keyboard shortcut to switch around when a given provider is down. I haven't actually used Claude directly yet but I think Github agents often use it under the hood anyway.

They do, it's called GHES.

https://docs.github.com/en/enterprise-server@3.19/admin/over...

"GitHub Enterprise Server is a self-hosted version of the GitHub platform"


you're not getting copilot on the self-hosted version, which is what the parent was focusing on.

That does not include the Copilot related APIs though.

I've tried getting this set up at my University, it was hell dealing with them. We ended up going with Gitlab.

An NVIDIA DGX Spark is $4000, pair that with a relatively cheap second box to run GitLab in the corner and you would have a pretty good local AI inference setup. (you'd probably have to write a nontrivial amount of software to get your setup where you want)

The local models are just right on the edge of being really useful, there's a tipping point to where accuracy is high enough so that getting things done is easy vs models getting continuously stuck. We're in the neighborhood.

Alternatively, just have local GitLab and use one of the many APIs, those are much more stable than github. Honestly just get yourself a Claude subscription.


The DGX Spark is not good for inference though; it's very bandwidth limited, around the same as a lower end MacBook Pro. You're much better off with Apple silicon for performance and memory size at the moment, but I'd recommend holding off until the M5 Max comes out early in the year, as the M5 has vastly superior performance to any other Apple silicon chip thanks to its matmul instruction set.

Oof, I was already considering an upgrade from the M1 but was hoping I couldn't be convinced to go for the top of the line. Is the performance jump from the M# -> M# Max chips that substantial?

The main jump is from anything to M5; not because it's simply the latest but because it has matmul instructions similar to a CUDA GPU which fixes the slow prompt processing on all previous generation Apple Silicon chips.

> Is the performance jump from the M# -> M# Max chips that substantial

From m1? Yes, absolutely. M3 is marginal now but m5 will probably make it definite.


I can't say I'm not tempted looking at the Spark, I could probably save some cash on heating my house with that thing. Though yeah unless there's some good software already built around a similar LLM workflow I could use it'd probably be wasted on me, or spend its time desperately trying to pay for itself with crypto mining.

Adding Claude to my rotation is starting to look like the option with the least amount of building the universe from scratch. I have to imagine it can be used in a similar or identical workflow to the Copilot one where it can create PRs and make adjustments in response to feedback etc.


>Though yeah unless there's some good software already built around a similar LLM workflow I could use it'd probably be wasted on me, or spend its time desperately trying to pay for itself with crypto mining.

A big part of my success using LLMs to build software is building the tools to use LLMs and the LLMs making that tool building easy (and possible).


I tried this for a little while and couldn't really get passionate about it; I have too many other backlogged projects that I was eager to tear into with LLMs and I got impatient. That was a while ago though and the ROI for building my own tools has probably gotten a lot more attractive.

I started building my own tool set because I was doing too many projects with LLMs and getting frustrated by a very real need for organization and tooling to get repetitive meaningless tasks out of the way and to get all of my projects organized so I could see what was going on.

I'm convinced. :) I've got some time to kill in transit later today, maybe time to think about my setup a bit.

Oh nice - I'm literally playing around with a site to detect outages for major providers (AWS/cloudflare/github) based on social media/HN posts

hehe - thank you for helping me Github - https://imgur.com/a/0KqmKpU

It should always be at 0, because GitHub is unreachable over IPv6, which in 2025 should be considered an incident.

mobile adoption is high, desktop (residential and corporate) is still quite low.

I'm a big advocate for github to add ipv6 support, but let's not pretend it's critical for their business.


Aren't there several hosts now where IPv6 access is included but you have to pay for each attached IPv4? E.g. AWS and Hetzner


just a few hours ago we found a pretty nice residential desktop use case for proper v6 (with prefix delegation), due to no need for NAT the old router (2013) became less of a bottleneck!

Double check (I mean by remote port scan) that your firewall is working. I’ve seen routers with no IPv6 firewall. And it actually matters

I haven’t had a residential ISP that provided IPv6 yet.

"yeah but when I turn on ipv6 everything breaks"

The Primeagen video about the bash scripts underpinning github actions runner was crazy. I'm a half-assed programmer at best and I don't even think I would make some of those mistakes.

woah this time i even caught it before the status page reported something - i thought they were rate-limiting me.

If GitHub actions break I now assume it’s them and not me. GitHub needs to work on stability ahead of AI features.

It seems to have started slowly. For me, Github releases have failed to serve requests for hours already.

When I opened the link, I just laughed for 5 mins straight

At this point, is there any downside to switching to GitLab?

What’s gitlab?

(Snarky way of saying: GitHub still has huge mindshare and network effects, dealing with another forge is probably too much friction for a lot of projects)

Not that GitHub doesn’t suck…


When GitHub was bought by Microsoft, Gitlab made moving your repos to them super easy. Apparently not enough people have moved, and even with sustained attacks from all kinds of different vectors, it would seem people continue to stick with them.

I use both Gitlab and Github and have yet to experience any downtime on any of my stuff. I do however, work at a large corporation and the latest NPM bug that hit Github caused enough of a stir where it basically shut down development in all of our lower environments for about two weeks so there's that.

But I do agree, and it seems like their market share increased after the Microsoft acquisition which is contrary to what I heard in all my dev circles because of how uncool MSFT is to many of my friends.


Is it any better?

We had that last year, with the full premium stuff ("pay as much as we can" mindset)

Please see this: a basic feature, much needed by lots of people (those who are stuck on azure ..): https://gitlab.com/gitlab-org/gitlab/-/issues/360592

Please read the entire thread with a particular attention to the timeline


If escaping downtime is your goal, then you should aim for a service with less downtime than Github. (they're roughly the same, with Gitlab having a slightly higher percentage of "major" outages)

Is the uptime any better?

Not really:

GitHub - Historically, GitHub reports uptime around 99.95% or higher, which translates to roughly 20–25 minutes of downtime per month. They have a large infrastructure and redundancy, so outages are rare but can happen during major incidents.

GitLab - GitLab also targets 99.95% uptime for its SaaS offering (GitLab.com). However, GitLab has had slightly more frequent service disruptions compared to GitHub in the past, especially during scaling events or major upgrades. For self-hosted GitLab instances, uptime depends heavily on your own infrastructure.
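The uptime-to-downtime conversion quoted above is easy to check. A small sketch, assuming a 30-day month (43,200 minutes); `downtime_minutes` is a hypothetical helper name.

```shell
# Minutes of downtime per 30-day month allowed by a given uptime percentage.
# Assumption: a month is 30 days = 30 * 24 * 60 = 43200 minutes.
downtime_minutes() {
  awk -v pct="$1" 'BEGIN { printf "%.1f\n", (100 - pct) / 100 * 30 * 24 * 60 }'
}

downtime_minutes 99.95   # prints 21.6
```

So 99.95% works out to about 21.6 minutes a month, which sits inside the "roughly 20–25 minutes" range.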


This is a bit... low-effort, isn't it? I'd at least expect a video of an exasperated Github user walking up to the '# days since the last GitHub incident' board, sliding out the '1' or '2' card, and replacing it with a '0'.

I mean, that joke is as old as the universe (heck, in the brief period that I worked in an office, decades ago, I had a "# days since the last person asked a stupid question" sign to enact the exact same gag)...


How can we add AI to this perfectly functional product?

Tie it to quarterly performance results!

Or an octocat standing in front of the board, holding cards from 0 to 7 in its tentacles (with the rest lying on the ground) and looking at them quizzically?

An octocat with zero tentacles?

Base 8 numbering system?

I was half-expecting the "days since last" meme - the one with a person smiling awkwardly while clapping with a large four-digit counter in the background showing only zeroes.

I don't use Github Pages so I might be wrong but IMO I think at least part of the joke is that its URL betrays that it's a completely static site.

Very low effort. Couldn’t read the text on my iPhone without zooming in. I nearly mistook it for a blank page!

I used to have a magic 8-ball that people could use when they wanted me to debug their code for them. I think it was broken, though; it kept saying "Outlook good". Must've been a Microsoft magic 8-ball.

you found the bad-magic-ball, looks like a standard magic-ball with genius inside but is not, the one that finds it works, not the genius..hmmm...but hey glass balls with LLM inside...christmas...wonderful idea

I've not been able to browse any repo sources without the Unicorn for the past few hours.

The amazing part about this is the page even works when i'm offline.

[flagged]


I don't work at Github but I'd read here recently that they've been undergoing a herculean migration from whichever cloud provider they were on to Azure since their Microsoft acquisition, and that it coincides with an increase in outages. I'm guessing that the solution here was probably just to not do that and it's too late.

They weren't on any cloud provider previously. They famously had their own "metal cloud" of managed servers with everything being containerized and managed by Kubernetes. It seemed like it worked pretty well, especially for their complex git operation tasks which had specific hardware requirements, but the official word is that apparently they're running into scaling limits with finding new datacenter capacity.

Yikes, that's worse, I thought the migration was at least a little politically motivated to reduce a dependency on a competitor like AWS or something. It's not exactly a great advertisement in any case to know that bare metal was more reliable for them than their own infrastructure when they now own it all the way through.

Yes, I would imagine the issues are due to doing a migration, period. Not the fact that it's moving to Azure in and of itself.

I won't blame Azure directly without a direct reason to, but as a developer often in the market for cloud providers it's definitely not the most reassuring that they're seemingly having so many migration pains.

A bit of an aside, I've only personally used Azure on one project at one company but their console UI had some bizarre footguns that caused us problems more than once. They have a habit of hiding any controls and options that your current logged-in user doesn't have permissions to use. In some cases that manifested as important warnings or tools that I wasn't even aware of (and were important to me!), but the owner of the company and other global admins could see. AWS, at least for a lot of the services last time I used it, was comfortable greying most things out with a tooltip telling you your user is missing X permission, which was way more actionable and the Azure version gave me whiplash by comparison.


GitHub was already done years ago. The ideal solution was in hand as of ~2020. Nearly every release since then has brought some kind of regression.

Hard disagree. GitHub Copilot is incredible. I can program pretty much entirely through my phone for large classes of problems now. Leveraged correctly it's amazing.

It's just a metric, not "whining". Besides, if complaining about companies (whether it's Github/Microsoft, Anthropic, Google, etc) without offering a solution is out-of-bounds, that probably knocks out 50% of the posts and comments on HN.

because Microsoft are known for listening to their customers?

this trivial bug fix took more than a year to be merged:

https://github.com/actions/runner/pull/3157

that bug likely ended up costing customers millions


Why is Microsoft supposed to listen? I'm not happy with them either but I understand their shift in business strategy.

So many people here treat github like it's a utility; it's not. If you're not happy with it, move on to alternatives or make your own version.


Largely, because they want money from people. If you are in a business selling a product or service and you don't at least pay some attention to what customers are calling for, then you're likely to eventually fall flat no matter how big you are.

Of course IBM and Oracle still exist, so who knows.


An optimal business only cares about whether the invoices get paid and the shareholders are hyped. Everything else is noise they block out. Not saying Microsoft is optimal, but this is just a business doing business.

The decision makers about what software forge to use are often not the same folks using the software forge.

> Why is Microsoft supposed to listen?

that's the point isn't it?

GitHub was a product that was loved by its userbase, because it was built by developers for developers

but Microsoft only care about one person, and one person alone: the individual that approves the purchase order

the people who have to suffer actually using the software are unimportant

which explains the rapid descent of GitHub into your standard quality Microsoft product (i.e.: terrible)


Genuinely curious, what shift?

Discussed entertainingly here by ThePrimeTimeagen: https://youtu.be/E3_95BZYIVs?si=IY-iT1eyXKnVvpTS

Woah, they actually fixed it? I mean, they are still using a poor man's sleep that uses up a core in the CPU instead of just using sleep, but progress!

$Work pays for GitHub, so the implicit solution offered is "take my money and make your service reliable"

Friendly reminder to use https://radicle.xyz!

Friendly reminder to stop saying friendly reminder when what you're saying isn't a reminder.

For any ESL folks here -

"Friendly reminder" is typically used for reminding people of common knowledge. Especially for beneficial but inconvenient things that some or most people neglect to do, either because they're annoying, inconvenient, or time consuming. Things for which busy people might need a "wink wink, nudge nudge".

Friendly reminder to floss. Friendly reminder to have your cancer screening. Friendly reminder to check your tires. Friendly reminder to file your taxes early. Friendly reminder to drink more water, eat fiber, etc.


You’re assuming that no native speaker is unfamiliar with the idiom and also that no ESL speaker is familiar with it.

Please shut up. No one asked for dragging an already offtopic answer. I couldn't care less for your pedantic and totally irrelevant statements

Keep getting fucked, couldn't care less. The future is decentralized and p2p

This quickly became much less friendly

> Keep getting fucked

By what?

> The future is decentralized and p2p

I wish it was but that isn't how things are going to turn out, especially if it's only people like you pushing it.


This is pretty dishonest because some trivial service no one cares about will reset this counter.

Your definition of "trivial" is not everyone's definition of trivial.

True, but the point remains that defining the whole as "down" when a subset is dilutes the value.

github haters (who still use the platform, for free) are the worst

I guess none of us really needs those 9s, and even two 9s are just good enough. I even doubt whether *SOME* of the banking transactions really really really need those 9s too -- like, I don't really mind if 1 out of 100 credit payments doesn't go through so I have to do it again -- it does happen once in a while and I just swiped it again.

GitHub has a container registry. That going down can cause pod start failure. I agree the source code probably doesn't need infinite nines, but the container registry is different.

Which should not even be that hard, because read-only replicas of artifact repos are trivial to create and easy to loadbalance.

I had an ATM glitch out on me a few months ago, I tried again and it confiscated my card. I called, and they explained that it is the failure mode to prevent people modifying them while they're offline.

Retry is fine, but imagine being unable to pay for something for 10 minutes in a month. And 10 minutes in a month is roughly a 99.98% SLA. So it depends
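That figure can be checked with a quick sketch, assuming a 30-day month (43,200 minutes); `uptime_percent` is a hypothetical helper name.

```shell
# Uptime percentage implied by a number of downtime minutes in a 30-day month.
# Assumption: a month is 30 days = 30 * 24 * 60 = 43200 minutes.
uptime_percent() {
  awk -v min="$1" 'BEGIN { printf "%.3f\n", 100 - min / (30 * 24 * 60) * 100 }'
}

uptime_percent 10   # prints 99.977
```

Ten minutes of downtime in a month comes out at 99.977%, which rounds to the 99.98% mentioned.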


