Game Source Code Collection (archive.org)
405 points by tosh on July 21, 2019 | 84 comments


Nice, this also includes the infamous half-life 2 leak from 2007 as well as some other ones that definitely weren't officially released. Or like re-volt, which is now owned by a company still actively trying to take down mirrors of the code that pop up, hence any unofficial port of it only releases in binary form.

But then again a lot of the older leaked ones, nobody probably cares about too much. There's definitely some old, sometimes less popular games where I'd like to get hold of the code and take a stab at porting to something modern.

So if you were a game dev for some now defunct game studio in the 90s and still have some games' source floating around it would be a great coincidence if that would surface somewhere like archive.org... :-)


Wow, I had no idea about Re-Volt's history. [1]

* Throwback Entertainment acquired 190 titles from Acclaim on July 7, 2006, including Re-Volt. [2]

* Throwback sold the rights to Re-Volt to WeGo Interactive Co. on February 23, 2011. [3]

* WeGo released Re-Volt Classic for iOS in Oct 2012, then April 2013 for Android. PC version was launched in Oct 2013, but was pulled in Jan 2014 due to a community patch misunderstanding. [4]

* Appears there's still a largish rewrite/port out there [5], which is a rewrite of the v1.2 community patch, based on the leaked Xbox version [6].

Anywho, thanks for mentioning that. Brought me down a fun rabbit hole.

[1] https://en.wikipedia.org/wiki/Throwback_Entertainment

[2] https://throwbackentertainment.com/throwback-entertainment-c...

[3] https://throwbackentertainment.com/throwback-sells-re-volt/

[4] https://en.wikipedia.org/wiki/Re-Volt

[5] https://rvgl.re-volt.io/

[6] https://forum.re-volt.io/viewtopic.php?t=139


Game companies are learning old IP can be modernized for fresh new profits so most likely a lot of old game copyrights will become retroactively enforced and sources hunted down and expunged as they're remastered and made exclusively available in whatever online shop or subscription service.

https://www.polygon.com/2018/7/22/17600008/nintendo-roms-law...

https://www.maxim.com/entertainment/best-video-games-remaste...


People have been reworking old IP, neglecting or enforcing various rights, etc for years. I don't think it's some new trend game companies recently learned or that it affects how aggressive (or not) owners are that much. Old sources in particular are not all that valuable in many cases, beside historically.


Right, but at the moment there are dozens of old movies being reworked and dozens of old games being reworked and the volume feels like it's growing with new announcements all the time. Over the last 20 years I have only seen a few games remade to maintain compatibility, usually unofficially patched by fans or as open source homages.

I think this is being fueled by movie and game outliers today being able to achieve their first $billion in their first week. That outsized ROI on gaming, like movies, completely changes the calculus on what old stuff is worth today. FAANG will throw billions into making games as soon as their movie pipelines are sorted.


Low value doesn't make copyright disappear.


IMHO that shouldn't be too much of a concern. As long as it's just the sources leaked and you only distribute your modified versions of it plus binaries, but not all the games assets, players still need a copy of the original game to actually play it. (So as long as people don't pirate it, it should be fine and if it's still on sale make them an additional buck or two, but that question is somewhat orthogonal.)

I'd even argue that you could be smart as the IP holder and release the code with a clause saying any modifications and extensions must be open source too and the IP holder had the right to use those and make money off of it. Then you can decide if you wanna take the risk of potentially working for some company for free when you take on porting that 90s racing game to the PS 4 in your spare time.

Taking down leaked source is obviously pretty simple due to DMCA, but as history has shown that just leads to groups working on it behind closed doors and only doing binary releases. Which might be technically illegal too but probably much more involved for the IP holder to prove, so they don't bother in most cases.


Wow, this comment is far removed from reality. The copyright holders are the only ones allowed by law to modify and/or distribute the game or any of its parts unless they hand out licenses. In other words, touching leaked source code is pretty much illegal. It is mostly a question of how long it takes until they will bother to send lawyers after you if you do.

Game developers often don't own the games they are working on. Most publishing contracts transfer the rights to the publisher and all the studio gets is the money. So, the devs cannot do anything after they are done with a project. It is up to the publishers to decide what to do with it. And these guys are bean counters first, otherwise they wouldn't be able to run the core of their business the way they do (that is, risk distribution between individual game projects). So unless you can give them a solid business case for releasing anything for free, they won't do it. So you need to prove to them with decent numbers that this works. Good luck with that.


Are you sure you replied to the right comment? I don't think I said anywhere it would be legal. My post was about how released sources could actually be beneficial to the IP holder.


Confusingly, the "you" in the parent's post was meant to refer to "you, the copyright holder", not "you, the pirate".


This model is how early shareware/demos for the likes of Doom or Wolfenstein worked. All the final code is there, but you only include the assets for the first level or so.

That said, it's still up to the copyright holder on what to disclose, and pretty sure distributing a binary based upon stolen copyrighted source would still be a violation, but I'm not a lawyer. I certainly wouldn't do it...


The archive entry is mislabeled; the Half-Life 2 leak was in 2003 [1].

[1] https://www.eurogamer.net/articles/2011-02-21-the-boy-who-st...


It's not mislabeled, there was a later leak of the 2007 version, which happened in 2012: http://www.valvetime.net/threads/source-engine-code-leak-ver...


I'm aware of the later leak, but at least the screenshots attached to the Archive file are from the 2003 leak. https://ia800800.us.archive.org/24/items/HalfLife22007Source...


If you check the actual content, you'll see that it includes both the 2003 and the 2007 leaks: https://archive.org/download/HalfLife22007Source/HL2%20Sourc...


I would advocate wiping any version control files, and reverting any custom changes to leave it as stock as possible. Delete any digital indication of who checked out the source.


Wow, Re-Volt used to be one of my favorite games back in those days.


I love the Internet Archive, but I absolutely hate the website redesign they made a few years back. Before, navigation was so simple and sane, but unfortunately they had to update because the site was "too old" and this was during the "mobile trend" that was going on at the time, which resulted in infinite scrolling and other mobile stuff that makes using the website a pain, even after all these years.


I agree. I also think they need to redo the tagging/category system they currently have in place. It's super unintuitive and there are so many duplicate or useless tags.


I've been looking for a collection like this, but for startups/sites/internet companies.

Something that would have Facebook's[0] and Snapchat's[1] source code.

Or even something that would have Staffjoy's[2] (a startup that shut down then open sourced their source code).

[0] -> https://gist.github.com/nikcub/3833406

[1] -> https://github.com/JonnyBanana/Snapchat-Source-Code-Leak

[2] -> https://github.com/Staffjoy/v2


Gotta love Jason Scott and his work to preserve computer history. From the list:

Leisure Suit Larry Source Code Reading Excerpt

https://archive.org/details/Leisure_Suit_Larry_Source_Code_R...


I'm not sure I understand what he's doing here. Is he reading out the whole source code to the game as a performance art piece so he can share it without infringing copyright laws?


I am carefully enunciating the words that will undo the locks on the key to the universe.


Some of my favorites are not listed. Notably, Jedi Knight: Jedi Academy had some of the best gameplay of all time. I played through a couple of months ago and the lightsaber combat is still amazing by modern standards.

https://github.com/grayj/Jedi-Academy


A fun entrypoint to discovery into that repository is https://github.com/grayj/Jedi-Academy/search?q=FIXME&unscope... .


Jedi Academy had its source code officially available shortly after launch. I did some modding of it, back then. It was a ton of fun.


Something I want to ask for quite a while: what is the copyright status of Archive.org? Is it legal?

I'm asking not just for this source code collection: they host lots of DOS and console games on their website, which I highly doubt are in public domain or under a free license. They also have lots of scans of some recent publishings (particularly in my mind, I knew they have plenty of Japanese manga magazines from this decade that are definitely not copyright-free.)


They're officially a library in California, which gives them a ton of carte blanche wrt copyright. As for the status of their particular actions, I think it's a lot of grey area.


I believe they also have a team of lawyers, which helps greatly too.


They also have a remarkable amount of music albums and ebooks that are available for download, for basically brand-new things, too!


This really worries me. Could the internet archive be eventually taken down due to too much copyright-infringing stuff?


Magazines like Shonen Jump or something else? Any examples?



I find the views in the about section to be interesting. Seems this collection has gotten a massive spike of views in the last month of ~9,000 people, which is ~30x that of the last few months. I wonder if this comes down to hacker news, because at ~70 votes this post would have a view / vote ratio of about 125x, which seems in the ballpark but maybe a little high?

As you would guess California is the most common origin for views but surprisingly Alberta comes in second with 1/2 the views of California! Are there lots of fellow Albertan devs here?
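For anyone who wants to check the arithmetic above, here is a minimal Python sketch. It uses only the rough figures quoted in the comment (~9,000 views, ~70 votes, a ~30x spike); the variable names are made up for illustration, and the inputs are approximations, so the outputs are too.

```python
# Rough figures quoted in the comment above; all are approximations.
views_last_month = 9_000   # ~9,000 views in the last month
votes_at_time = 70         # ~70 votes when the comment was written
spike_factor = 30          # last month was ~30x a typical earlier month

# Views per vote, and the implied view count of a typical earlier month.
views_per_vote = views_last_month / votes_at_time
typical_monthly_views = views_last_month / spike_factor

print(f"views per vote: ~{views_per_vote:.0f}")
print(f"typical earlier month: ~{typical_monthly_views:.0f} views")
```

With these inputs the ratio comes out nearer 129 than 125, which is still the same ballpark, and the implied baseline is roughly 300 views in a typical earlier month.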


https://old.reddit.com/r/gamedev/comments/cdih04/game_source...

https://old.reddit.com/r/programming/comments/caphdj/game_so...

It was also reposted on Reddit several times this month, and on Hacker News (before this) as well. Seems to be making the rounds.


Unrelated; but I always find myself drawn to game development, yet I have no desire to work in the industry. Does any other programmer feel the same, and do you know why?


> Does any other programmer feel the same

Of course.

> and do you know why?

Because games are fun, and programming is fun, and programming something that will be fun for other people is fun.

But the games industry is crowded, and highly profit-optimized, and you just instinctively know that getting a tour of the budget sausage factory wouldn't bring a true sausage lover any significant pleasure.


The games industry being crowded is a bit of a myth, though initially breaking into it does present a barrier. Lots of studios seem hungry for anyone with experience. Though, a lot of the actually interesting projects are crowded. Which is a real concern if that's the only type you want to work on (I'm in that camp...)

Like programming, once you have a few years under your belt, things really open up.


Really depends. If you're a solid game dev and willing to work on the latest Barbie licensed mobile title, it's not so cutthroat. Profitable indie studio though? Good luck with that.


> Lots of studios seem hungry for anyone with experience.

That's because once someone has experience, they can 3x their salary outside the industry.


I think I can sum it up in a Yogi-Berra-ism: The game industry is very industrialized. If you work on a commercial product, most of your effort goes towards making a product, and any specific details of the game are often a small part, one in a series of checklist features. And it's easy to dismiss assets when working on a little graphics demo, but the assets often take as much or more design insight than the code (which for most gameplay behaviors tends to boil down to finite state logic, timers, and lookup tables). What you most often get paid for as an industry employee is to churn out assets and simple behavioral scripts in quantity, so even though there are interesting problems at the top end of the field, you are probably not working on them for most production cycles, or only working on them for a short period.

As such, it's easy to tinker on a game and hyperfocus on a small aspect, but a different story to finish one in the way that most commercial games feel finished, both as software and in terms of filling in the blanks.
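To make the "finite state logic, timers, and lookup tables" point concrete, here is a minimal Python sketch of a hypothetical enemy behavior. The states, durations, and the `Enemy` class are all invented for illustration and not taken from any particular game; real gameplay code will differ, but the shape is often about this simple.

```python
from dataclasses import dataclass

# Lookup table: state -> (duration in seconds, next state when the timer expires).
# All states and durations here are hypothetical.
TRANSITIONS = {
    "idle":   (2.0, "patrol"),
    "patrol": (5.0, "idle"),
    "attack": (1.0, "idle"),
}


@dataclass
class Enemy:
    state: str = "idle"
    timer: float = 0.0  # seconds spent in the current state

    def update(self, dt: float, player_visible: bool = False) -> None:
        """Advance the state machine by dt seconds."""
        # Event-driven transition: seeing the player interrupts idle/patrol.
        if player_visible and self.state != "attack":
            self.state, self.timer = "attack", 0.0
            return
        # Timer-driven transition: look up how long this state lasts.
        self.timer += dt
        duration, next_state = TRANSITIONS[self.state]
        if self.timer >= duration:
            self.state, self.timer = next_state, 0.0


enemy = Enemy()
enemy.update(2.0)                       # idle timer expires -> patrol
print(enemy.state)
enemy.update(0.5, player_visible=True)  # player spotted -> attack
print(enemy.state)
enemy.update(1.0)                       # attack timer expires -> idle
print(enemy.state)
```

Keeping the durations and transitions in a table rather than in branching code is one common choice: designers can tune the numbers without touching the update logic.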


Just because you like painting doesn’t mean you’re interested in the art industry, or making a career out of it...

Gamedev pools together a large collection of backgrounds, which makes development appealing to... a lot of hobbies. And it's also one of those subjects (some) users invest a lot of time into, both in playing and discussing, so it's easy to feel you have enough general knowledge of the outcome that it's viable to create a “good” game; and particularly for programming, it crosses through a lot of domains — you can find reason to implement compilers, ML, graphics, a huge array of data structures, dynamic programming, any language paradigm (and any language), a full set of “patterns”, etc.

Tbh it's harder to find reasons why a programmer-hobbyist wouldn't find something interesting in game development.

It's really easy to find reasons you wouldn’t want to be in the industry (it's an absolutely pathetic state of affairs).


I bought and shared the "Space Funky B.O.B." source code (included in that collection) and the "Super Noah's Ark 3D" source code (not in that collection). Both came from 3.5" diskettes on eBay. Both are for the Super Nintendo / Super Famicom system.

Space Funky B.O.B. is interesting due to the amount of swearing in the comments of the source code.

Super Noah's Ark 3D is notable for many reasons. It is an unlicensed Christian game that used a pass through style cartridge. It is based on the Wolfenstein 3D engine, and actually included some emails from John Carmack and Rebecca Heineman from back in the day. A delightful insight into software development decades ago.

https://eludevisibility.org/2018/super-noahs-ark-3d-source-c...



Perhaps a silly question, but is this entirely legal in the US?

That is, can I just download the source code and look through it without any major caveats like already owning the corresponding game or having permission from the publisher?

I see that some of the source code has a license associated with it, e.g. licenses claiming that the content's in the public domain, but I'm unsure if such licenses are necessarily legitimate.


We also need game binary preservation and instructions on how to compile the sources. Just the source code without build instructions is pretty useless for preservation.


Heck Yea! I didn't know this was a thing?!


Right? Lots of classics.. Half Life, Sim City and Beyond Castle to name a few.

If you like spongebob you're in luck too. Probably gonna spend the next few weekends building these.


Good luck building. It's probably going to be a bit more than a few weekends... ;-) if you do get something built, maybe blog it / share it somewhere.


would love a filter based on language they were written in, more recent games are likely C++ or such but older games may not be


Thank you. Terrific list. Bill Gates's Donkey BASIC should be on here too. Actually learned a lot from old BASIC games.


do you have the source code to FF8 because apparently that went missing


It didn't go missing. Square had an institutional policy that had them deleting source once projects were mastered and shipped, through to the early 2000's. FFVII-FFIX, Parasite Eve and numerous other games lost their source.

What's been recovered has been occasionally the code from contracted PC ports.


Why did they have such a policy? That seems crazy!


That was also their golden era for quality products.

I cannot claim to know why they did this, but my educated/experienced guess as a long-time game developer is that _someone_ thought it would be best for the teams not to rely upon existing projects when creating new ones.

Imagine if painters always started from their previous painting rather than a blank canvas; or if a home developer always started with prefabs of their previous home. You'd get American suburbs and Ikea prints.


Imagine if we did this for operating systems. I want to believe.


The BBC used to have a policy of wiping tapes for reuse, seems crazy in retrospect but there you go. Sometimes people just don't realise what they've got.


The BBC did that to save money on new magnetic tape used for archiving. They thought that parts of their archives were just not valuable enough to preserve and chose to overwrite them instead. Unfortunately, some of the overwritten material turned out to be much more interesting and valuable in retrospect, for example a host of Doctor Who episodes which are now lost completely.


Context that's often forgotten here is that the BBC typically didn't even have the contractual right to rebroadcast these programmes (the actors' union would have stipulated a maximum number of broadcasts, for example two within 7 days).

And video releases weren't yet a thing even if the rights could have been secured, so the apparent value of these archives was minimal.

https://en.wikipedia.org/wiki/Wiping#Rights


Your link states

"Talent unions were highly suspicious of the threat to new work if programmes were repeated; indeed, before 1955 Equity insisted that any telerecording made (of a repeat performance) could only "be viewed privately" on BBC premises and not transmitted"

I'm trying to work out if there's a misunderstanding here, as this only applies to repeat performances, not presumably the original performance. So I wonder if recording at all at that time was not standard practise. I tried to work out in what circumstances telerecording was actually used at that time and haven't found much, but I found this interesting white paper on recording the Queen's coronation.

https://www.bbc.co.uk/rd/publications/rdreport_1955_02


I believe what you're implying is correct. The original mental model of television was that it's live by definition (otherwise it's just an inferior and unduly complex substitute for ciné film, which was well established technology).

So telerecording was just a weird hack, and for a repeat performance you would expect to bring the actors into the studio again.


Do you have a source on that by chance? The only hearsay I've heard is that it went missing because 'circumstances', and now you're claiming that they purposefully deleted it.

Which, if true, makes me earnestly sick to my stomach.


This is truly one of the treasures of archive.org, which has so much valuable content on it I can't imagine what life would be like without it.

I'm going to present this amazing collection of games and their source code to my local kids computer club and maybe spend a few hours de-tarball'ing and building some of these wonderful games.

To the people who maintain archive.org, I salute you!


You can donate here: https://archive.org/donate/


I can’t get past the DRM they allow, if they stopped doing that and raised the bar I’d likely support them.


I just helped someone borrow a DRM-encumbered copy of a book from them. It wasn't pretty, but I wouldn't say it was worse than not having the book available at all, which is the only realistic alternative given that it isn't in the public domain.


I believe their stance on DRM is something like:

• We (archive.org) can choose to either take in these DRMed works, archive them, and publish them online for public consumption (with their DRM intact); or we can refuse the DRMed works at the door, and thus have them unavailable for public consumption, also leaving them unpreserved; or we can archive the DRMed works, but just not publish them (again leaving them unavailable to the public.) The one thing we can't do, legally, is to just ignore the wishes of the rightsholders and put up a DRM-less version of the content for public consumption. The rights-holders are still out there to sue us. So we have to pick one of the other three options.

• One day, the rights will expire or the rights holder will disappear, and the work will enter the public domain.

• If we had earlier rejected even archiving the work because of its DRM (from some principled moral stance, as you seem to be suggesting), then at the time the work enters the public domain, we won't have a copy, and would have to then acquire one. It might be impossible to acquire a copy to preserve at that point.

• So, it’s better to acquire a copy now, under license; and then just crack the work out of its DRM later once the work becomes abandonware. (As the Archive.org staff have proven happy to do and/or support, with e.g. 4am’s work on the Apple II software archive.)

• Plus, even if we did wait to acquire the work after its rights lapsed, we would likely have to crack it anyway. Rightsholders that go to the trouble to re-release their own works without DRM are pretty rare. Some rightsholders are so lazy that, in “anniversary” re-releases of their products, they use the community’s cracked copy! So it’s not like we’re making more work for ourselves by choosing to take in DRMed works and then crack them eventually. It’s just how it has to go, to ever archive these works at all. The likelihood of ever just "coming across" a non-DRMed version of the work at some point in the future is practically nil. (It would be like hoping that if you left an aged painting on the market long enough, it would just de-varnish and repair itself.)

• And, of course, we can start on the DRM-cracking process as soon as we get the work, and keep the de-DRMed version as the canonical version to do preservation work against, as long as that’s not the version we make visible to the public (until the work becomes abandonware.) Museums and libraries have many works in “private archives”, and those archives still hold value to the public: academics can usually access them for studies, for example, as this explicitly falls under Fair Use. But more importantly, the private preservation of works that can't be preserved in public, ensures that they're preserved at all.

I hope that makes it clear why either the first or the third options (give the public the DRMed version; or preserve the DRMed version but don't publish it) are better than the "refuse DRMed content at the door" approach. Which of those other two options you favor is up to you. Personally I'd rather the public have some access to this rights-bound content rather than none.


The problem is these work owners are using archive.org to play both sides of the argument. Even worse, consider the Tibetan Buddhist Resource Center: they digitized works they didn’t author and don’t own any rights to, works that are 100s of years old, and then put them up on Archive.org under DRM. How is it in any way ethical to take someone’s work and do that?

https://archive.org/details/buddhist-digital-resource-center...


Are you sure they don't own the rights?

When you create a "derivative work" of a work under copyright, you're creating a new work that "samples" the original, and then asserting your own copyright to it—that's why you need a license from the rightsholder in the first place, to allow you to claim those IP rights on the derivative work.

In the case of a public-domain work, if you create a "derivative work" from it, you own the IP of that derivative work, 100%. The public-domain parts that you sampled aren't still public domain just because they're copied word-for-word into your work. (I mean, the original work itself is still PD, but the sample of it in your derivative work isn't. It's a "color of your bits" thing[1].) "Public Domain" isn't an infectious copyleft license. You can "fork" and "make proprietary" a PD work, and that's 100% allowed.

Now, I don't know enough about the Buddhist texts in question to say whether their presentation of them here qualifies as a "derivative work"—but usually even just translating a work makes it a derivative, so, if the TBRC were the ones that translated these texts to English? They own 'em.

(If you want a public version, do the same thing FOSS communities do when a FOSS project is forked into a proprietary product: walk back to the last open branch-point of the source, and make your own open fork. In this case: translate the texts yourself!)

[1] https://ansuz.sooke.bc.ca/entry/23

---

Also, there might be a more interesting consideration at play in this particular case: if a work is never published, then AFAIK, it never enters copyright; its copyright "clock" only begins when someone publishes it. So e.g. the diary of Anne Frank doesn't have a copyright year of 1945, but rather 1947—the year it was found and published. Until then, the work is a "manuscript", equivalent in IP rights status to a draft laying on a writer's desk destined for a publisher.

You can think of such a manuscript as a secret root node in a "derivative work" tree: the author creates work A [the manuscript], the publisher derives work B [the published book], the author assigns work-A derivative-work rights to the publisher, and then also [usually] relinquishes all rights to work A. Because of this, the copyright clock is relative to work B, not work A. If copyright stayed attached to manuscripts, you could run out the copyright [in a pre-Disney copyright regime] by just spending 30 years writing a book!


> Also, there might be a more interesting consideration at play in this particular case: if a work is never published, then AFAIK, it never enters copyright

This isn't really true in the U.S. AFAICT - unpublished works do enter the public domain, 120 years after creation. It is true elsewhere, e.g. in Europe, but the standard for publication is lower than you might expect; if one can argue that the work wasn't genuinely private to the author (e.g. copies were made, it was used for public performances, etc.) that's enough to consider it "published".


For the Buddhist texts, how could they own the rights to works written hundreds of years ago? They're simply digitizing the work, no translation. Here's an example: https://archive.org/details/bdrc-W1FPL194/page/n7


If the works, before their digitization, only existed in a private collection at a particular Buddhist temple, then their copyright clock wouldn't have kicked off. It would start the moment that the works entered the public sphere in some way. If that happened because of the digitization, then the digitization is under copyright. In this case, the original work is the "manuscript", and they're the "publisher."


Obviously at the time the texts were written, they were in circulation, but it predates modern copyright law. So maybe there's a loophole in the law they're exploiting, but regardless it's unethical, and Archive.org has the power to not accept works under those circumstances.


"the diary of Anne Frank"

A poor example, or interesting example depending on how you want to look at it.

https://www.theguardian.com/books/2016/jan/18/anne-franks-di...


> but we can’t legally archive it without DRM

They definitely can - they're an official archive. They can't have people looking at it online while the item is in copyright without putting some reasonable safeguards in, but that's not the same thing as not being allowed to archive it.


Right, sorry, let me edit that. (I made it clear in a later paragraph but that one does read funny.)


i have looked for this, and can't find any useful intersection of the terms 'archive.org' and 'DRM'...except for bemoaning its general existence.

could you elaborate?


Here’s one example of DRM: https://archive.org/details/inlibrary?sort=-publicdate

They have collections they don’t let you download, they should just refuse those collections to force the platform to be free to download like the Gutenberg Project for example.


you have a simple choice: either have those collections be kept and preserved, OR have them disappear forever. We can all be grateful we at least have one non-profit who does give a fuck at all, no for-profit business will ever do any preservation work like it.

Gutenberg is only preserving public domain works, MANY are lost to humanity way before that.


Because it discourages people from creating works if they can't be free. Why would you want to write something valuable for humanity if in the end someone is going to steal it and put DRM on it? For me personally, I'd not write the work at all knowing that, since it goes against the spirit of making it for humanity to begin with.


If you slave away at a game (or some other piece of code), over time the fruits of your labor will become worthless.


Same goes if you grow bananas for a living. Not everything worthwhile lasts forever.

Of course, nothing ever can, anyway.


It's not worthless if it gets you paid


Not to mention that in game development, the experience from your first shipped title is massively valuable when developing the second



