Midjourney is the most popular discord channel by far with 19.5M+ members, $200M...

jonplackett · on March 23, 2024

Leonardo.ai have basically done exactly this and seem to be doing OK.

It’s a shame because they’re literally just using stable diffusion for all their tech but built a nicer front end and incorporated control net. No-where else has done this.

Controlnet / instantID etc are the really killer things about SD and make it way more powerful than Midjourney, but they aren’t even available via the stability API. They just don’t seem to care.

mikehearn · on March 23, 2024

InstantID uses a non-commercial licensed model (from insightface) as part of its pipeline so I think that makes it a no-go for being part of Stability's commercial service.

biztos · on March 23, 2024

Yes, and I'm grateful to them for sticking to that plan. :-)

But for the individuals involved, it might also be

Step 2: Leverage fame in AI space for massive VC injection on favorable terms.

epivosism · on March 23, 2024

Yes, and MJ has no public API either. Same for Ideogram, I imagine they have at least 10m in the bank, and aren't even bothering making an API despite being SoTA for lots of areas.

JimDabell · on March 23, 2024

> Seriously this seemed to be the plan:

> Step 1: Release SD for free

> Step 2: ???

> Step 3: Profit

That’s not true. He was pretty open about the business plan. The plan was to have open foundational models and provide services to governments and corporations that wanted custom models trained on private data, tailored to their specific jurisdictions and problem domains.

ptero · on March 23, 2024

Was there any traction on this? I cannot imagine government services being early customers. What models would the want?Military -- maybe, for simulation or training, but that requires focus, dedicated effort and a lot of time. My 2c.

lumost · on March 23, 2024

I've heard this pitch from a few AI labs. I suspect that they will fail, customers just want a model that works in the shortest amount of time and effort. The vast majority of companies do not have useful fine tuning data or skills. Consultancy businesses are low margin and hard to scale.

OscarTheGrinch · on March 23, 2024

Heres a Stable Diffusion buisness idea: sign up all the celebrities and artists who are cool with AI, and provide end users / fans with an AI image generation interface, trained on their exclusive likenesses / artwork (loras).

You know, the old tried and true licensed merchandise model. Everybody gets paid.

vasco · on March 23, 2024

> sign up all the celebrities and artists who are cool with AI

"Cool with AI" and "sell my likeness so nobody ever needs to hire me again" are too close for comfort on this one.

Loughla · on March 23, 2024

And also, here's a way to just make so much pornography of me.

I'm betting the list of folks who would sign the AI license are pretty small, and mostly irrelevant.

numpad0 · on March 23, 2024

I think the following isn't said often enough: there must be a reason why there are extremely few celebrities and artists who are cool with AI, and it cannot be something abstract and bureaucratic as copyright concerns although those are problematic.

It's just not there yet. GenAI outputs aren't something audiences wants to hang on a wall. It's something that evoke sense of distress. Otherwise everyone's tracing them at least.

ben_w · on March 23, 2024

Most people mix up all the different kinds of intellectual property basically all the time[0], so while people say it's about copyright, I (currently) think it's more likely to be a mixture of "moral rights" (the right to be named as the creator of a work) and trademarks (registered or otherwise), and in the case of celebrities, "personality rights": https://en.wikipedia.org/wiki/Personality_rights

> It's just not there yet. GenAI outputs aren't something audiences wants to hang on a wall.

People have a wide range of standards. Last summer I attended the We Are Developers event in Berlin, and there were huge posters that I could easily tell were from AI due to the eyes not matching; more recently, I've used (a better version) to convert a photo of a friend's dog into a renaissance oil painting, and it was beyond my skill to find the flaws with it… yet my friend noticed instantly.

Also, even with "real art", Der Kuss (by Klimt) is widely regarded as being good art, beautiful, romantic, etc. — yet to me, the man looks like he has a broken neck, while the woman looks like she's been decapitated at the shoulder then had her head rotated 90° and reattached via her ear.

[0] This is also why people look at a Google street view image with a ©2017 Google[1] tiled over on a blue sky and say "LOL, Google's trying to own the sky", or why people even on this very forum ask how some new company can trademark a descriptive term like "GPT"[2], seemingly surprised by this being possible even though there's already a very convenient example of e.g. Hasbro already having "Transformers".

[1] https://www.google.com/maps/@33.7319434,10.8655264,3a,77.2y,...

[2] https://news.ycombinator.com/item?id=35692476

numpad0 · on March 23, 2024

> Der Kuss (by Klimt) is widely regarded as being good art,

The point is, generative AI images are not widely regarded as good art. They're often seen as passable for some filler use cases and hard to tell apart from human generations, but not "good".

It's not not-there-yet because AI sometimes generates sixth fingers, it's something another level from Gustav Klimt, Damien Hirst, Kusama Yayoi, or the likes[0]. It could be that genAI is leaving something that human artist would filter out, or because images are too disorganized that they appear to us to be encoding malice or other negative emotions, or maybe I'm just wrong and it's all about anatomy.

But whatever the reason is, IMO, it's way too rarely considered good, gaining too few supportive celebrities and artists and audiences, to work.

0: I admit I'm not well versed with contemporary art, or art in general for that matter

ben_w · on March 23, 2024

> The point is, generative AI images are not widely regarded as good art. They're often seen as passable for some filler use cases and hard to tell apart from human generations, but not "good".

> It's not not-there-yet because AI sometimes generates sixth fingers, it's something another level from Gustav Klimt

My point is: yes AI is different — it's better. (Or, less provocatively: better by my specific standards).

Always? No. But I chose Der Kuss specifically because of the high regard in which it is held, and yet to my eye it messes with anatomy as badly as if he had put 6 fingers on one of the hands (indeed, my first impression when I look closely at the hand of the man behind the head of the woman, is that the fingers art too long and thumb looks like a finger).

numpad0 · on March 24, 2024

> and yet to my eye it messes with anatomy

wait what? Isn't that missing the point of expressionism? Klimt's Judith I is basically a photo, surely he can draw sh*t if he wanted to?

But myriad predecessors such as Vermeer, Rembrandt, Van Gogh, da Vinci, et al., have done enough in realism, and also photography was becoming more viable and more prevalent, that artists basically started diversifying? Isn't that what lead to various forms of early 20th century arts like surrealism(super-real -ism), cubism, etc?

I don't mean offense but that's just, surely that level of understanding can't be basis of policy decisions when it comes to moral rights and licensing discussions and "artists should just use AI" and such???

ben_w · on March 24, 2024

I think you're conflating "good" in the sense of "competent" with "good" in the sense of "ethical" or "legal".

I am asserting here that the AI is (at its best) more competent, not any of the other things.

I suspect that the law will follow the economics, just as it often has done for everything else before — you're communicating with me via a device named after the job that the device made redundant ("computer").

But I said "often" not "always", because the business leaders ignoring the workers they were displacing 200 years ago led to riots, and eventually to the Communist Manifesto. I wouldn't discount this repeating.

--

I've just looked up "Judith I" (I recognise the art, just not the name), and I don't even understand why you're holding this up as an example of "basically a photo".

As for the other artists demonstrating realism: photography made realism redundant despite being initially dismissed as "not real art". Artists were forced to diversify, because a small box of chemistry was allowing unskilled people do their old job faster, cheaper, and better. Photography only became an art in its own right when people found ways to make it hard, for example by travelling the world and using it to document their travels, or with increasingly complex motion pictures.

I suspect that art fulfils the same role in humans as tails fulfil in peacocks: an expensive signal to demonstrate power, such that the difficulty is the entire point and anything which makes it easy is seen as worse than not even trying. This is also why forgeries are a big deal, instead of being "that's a nice picture", and why an original painting can retain a high price despite (or perhaps because of) a large number of extremely cheap prints being plastered onto everything from dorm rooms to chocolate wrappers.

ben_w · on March 23, 2024

Why would those celebs pay Stability any significant money for this, given they can get it for a one off payment of at most a few hundred dollars salary/opportunity cost by paying an intern to gather the images and feed it into the existing free tools for training a LoRA?

brandall10 · on March 23, 2024

I think in this case the celebs are getting paid for using their likeness.

ben_w · on March 23, 2024

That sounds like the "lose money on every sale" philosophy of the first dot-com bubble, only without even the "but make it up in volume" second half.

sigmoid10 · on March 23, 2024

You can already do that with reference images and even for inpainting. No training required. Also no need to pay actors outrageous sums to use their likeness in perpetuity as long as you do business. The licensing still tricky anyways, because even if the face is approved and certified, the entire body and surroundings would also have to be. Otherwise you basically re-invented the celebrity deepfake porn movement. I don't see any A-lister signing up for that.

Der_Einzige · on March 23, 2024

What's insane to me is the fact that the best interfaces to utilize any of these models, from open source LLMs to open source diffusion models, are still random gradio webUIs made by the 4chan/discord anime profile picture crowd.

Automatic1111, ComfyUI, Oobabooga. There's more value within these 3 projects than within at least 1 billion dollars worth of money thrown around on yet another podunk VC backed firm with no product.

It appears that no one is even trying to seriously compete with them on the two primary things that they excel at - 1. Developer/prosumer focus and 2. extension ecosystem.

Also, if you're a VC/Angel reading my comments about this, I would very much love to talk to you.

teaearlgraycold · on March 23, 2024

> $200M in revenue in 2023 with 0 external investments and only 40 employees.

The dream

mike_hearn · on March 23, 2024

Only if your costs are a lot lower than $200M, which given the price of GPU compute right now, is not guaranteed.

manquer · on March 23, 2024

For a founder maybe , definitely not for employees .

AI startups need not an insignificant amount of startup capital , you cannot just spend weekends to build like you would a saas app . Model training is expensive so only wealthy individuals can even consider this route

Companies like that have no oversight or control mechanisms when management inevitably goes down crazy paths, also without external valuations option vesting structures are hard to ascertain value.

alfiedotwtf · on March 23, 2024

Sometimes you need to say fuck the money, I’ve already got enough, and I just want to do what I enjoy. It may not be an ideal model for HN but damn not everything in life is about grinding, P/E ratios, and vesting schedules

BhavdeepSethi · on March 23, 2024

Yeah, that's easier to say when you have enough. A lot of employees might not be in that privilege position. The reality for some of the folks might be addressing education loans, families to take care of, tuition for kids, medical bills, etc.

tasuki · on March 23, 2024

Also oversize houses, expensive cars, etc...

teaearlgraycold · on March 24, 2024

I’m grateful to be debt free

teaearlgraycold · on March 23, 2024

As a counter-point, with no VCs there's more equity left for employees.

hsjsbeebue · on March 23, 2024

If valued $2bn then even a 0.1% is $2m. Not bad.

teaearlgraycold · on March 23, 2024

Early employees may have gotten 2-3% and are completely undiluted.

hsjsbeebue · on March 23, 2024

Yes indeed! Probably furiously vesting now!

khazhoux · on March 23, 2024

As a counter-counter-point that gets rarely discussed on HN, VCs aren't taking as much of the pie as people think. In a 2-founder, 4-engineer company, it wouldn't be unusual to have equity be roughly:

20% investors 70% founders 2-3% employees (1% emp1, 1% emp2, 0.5% emp3, 0.25% emp4) 7% for future employees before next funding round

seanhunter · on March 23, 2024

This is not a fair comparison because you are not taking into account liquidation preferences. Those investors don't have the same class of equity as everyone else. That doesn't matter in the case of lights out success but it matters a great deal in many other scenarios.

khazhoux · on March 23, 2024

Sure. My point was that most employees think that VCs take 80+%, and especially the first few employees usually have no idea just how little equity they have compared to the founders.

killerstorm · on March 23, 2024

You can't run a sustainable business if you take VC money. VCs need an exit.

Satam · on March 23, 2024

There's money to be made for sure, and Stability's sloppy execution and strategy definitely didn't help them. But I think there are also industry-wide factors at play that make AI companies quite brittle for now.

blensor · on March 23, 2024

Are we sure that Midjourney is still on that trajectory?

I was a heavy user since the beginning but my usage has dropped to almost 0

jack_riminton · on March 23, 2024

I think you're a sample size of one

blensor · on March 23, 2024

I mean of course I am a sample size of one when I am speaking about my own experience?

That's why I asked that question to see if others notice something similar or if that's just me

llm_trw · on March 23, 2024

>The vast majority of users couldn't be bothered to take the steps necessary to get it running locally so I don't even think the open sourcing philosophy would have been a serious hurdle to wider commercial adoption.

The more I think about the AI space the more I realize that open sourcing large models is pointless now.

Until you can reasonably buy a rig to run the model there is simply no point in doing this. It's no like you will be edified by setting the weights either.

I think an ethical business model for these business is to release whatever model can fit into a $10,000 machine and keeping the rest closed source until above machine is able to run them.

AnthonyMouse · on March 23, 2024

The released image generation models run on consumer GPUs. Even the big LLMs will run on a $3500 Mac with reasonable performance, and the CPU of a dirt cheap machine if you don't care about it being slow, which is sometimes important and sometimes isn't.

Also, things like this are in the works:

https://news.ycombinator.com/item?id=39794864

Which will put the system RAM of the new 24-channel PC servers in range of the Nvidia H100 on memory bandwidth, while using commodity DDR5.

llm_trw · on March 23, 2024

The `big' AI models are trillion parameter models.

The medium sized models like GPT3 and Grok are 185b and 314b respectively.

There is no way for _anyone_ to run these on a sub $50k machine in 2024, and even if you can the token generation speed on CPU is under 0.1 tokens per second.

raincole · on March 23, 2024

It's just semantic gymnastics. I'm sure most people will consider LLaMa 70B a big model. Of course if you define big = trillion then sure big = trillion[1].

[1]: https://en.wikipedia.org/wiki/No_true_Scotsman

llm_trw · on March 24, 2024

Yes, you are engaging in the no true scotsman fallacy, please stop.

AnthonyMouse · on March 23, 2024

You can get registered DDR4 for ~$1/GB. A trillion parameter model in FP16 would need ~2TB. Servers that support that much are actually cheap (~$200), the main cost would be the ~$2000 in memory itself. That is going to be dog slow but you can certainly do it if you want to and it doesn't cost $50,000.

FireBeyond · on March 23, 2024

Even looking on Amazon, DDR4 seems still a decent bit above $2/GB:

2 x 32GB: $142

2 x 64GB: $318

8GB: $16

2 x 16GB: $64

2TB of 128GB DDR4 ECC: $9,600 (https://www.amazon.com/NEMIX-RAM-Registered-Compatible-Mothe...)

> Servers that support that much are actually cheap (~$200)

What does this mean? What motherboards support 2TB of RAM at $200? Most of them are pushing $1,000. With no CPU.

It may not hit $50K, but it's definitely not going to be $2K.

AnthonyMouse · on March 24, 2024

Here's a server that supports 3TB of memory for $130, you get 3TB by filling all 24 memory slots with 128GB LRDIMMs, 2TB with 16:

https://www.ebay.com/itm/176298520843

Here are 128GB LRDIMMs for $98:

https://www.ebay.com/itm/196305803969

For 2TB and the server you're at $1698. You can get a drive bracket for a few bucks and a 2TB SSD for $100 and have almost $200 left over to put faster CPUs in it if you want to.

That's stinking Optane, would work if you're desperate. Normal 128GB LRDIMMs cost more than other DDR4 DIMMs. You can, however, get DDR4 RDIMMs for ~$1/GB:

https://www.ebay.com/itm/186345903230

With 32GB RDIMMs that machine would max out at 768GB, which could still run a 1T model at q4 or grok at FP16. And then it would cost less than $1000.

Or find a quad-socket system with 48 memory slots and then use 64GB LRDIMMs ($1.12/GB):

https://www.ebay.com/itm/176299295509

The quad socket systems aren't $200, but you can find them for $550 or so:

https://www.newegg.com/hp-proliant-rack-mount/p/2NS-0006-3E5...

Maybe less if you shop around (they're not as common).

fauigerzigerk · on March 23, 2024

How slow? Depending on the task I fear it could be too slow to be useful.

I believe there is some research on how to distribute large models across multiple GPUs, which could make the cost less lumpy.

AnthonyMouse · on March 23, 2024

You can get a decent approximation for LLM performance in tokens/second by dividing the model size in GB by the system's memory bandwidth. That's assuming it's well-optimized and memory rather than compute bound, but those are often both true or pretty close.

And "depending on the task" is the point. There are systems that would be uselessly slow for real-time interaction but if your concern is to have it process confidential data you don't want to upload to a third party you can just let it run and come back whenever it finishes. And releasing the model allows people to do the latter even if machines necessary to do the former are still prohibitively expensive.

Also, hardware gets cheaper over time and it's useful to have the model out there so it's well-optimized and stable by the time fast hardware becomes affordable instead of waiting for the hardware and only then getting to work on the code.

bawana · on March 23, 2024

Why would increasing memory bandwidth reduce performance? You said "You can get a decent approximation for LLM performance in tokens/second by dividing the model size in GB by the system's memory bandwidth"

AnthonyMouse · on March 24, 2024

Yeah the sentence is backwards, you divide the system's memory bandwidth by the size of the model.

renewiltord · on March 23, 2024

Mixtral 8x7b is better than both of those and runs on a top spec M3 Max wonderfully.

woadwarrior01 · on March 23, 2024

Indeed! Also, Mixtral 8x7b runs just as well on older M1 Max and M2 Max Macs, since LLM inference is memory bandwidth bound and memory bandwidth hasn't significantly changed between M1 and M3.

karolist · on March 23, 2024

It didn't change at all, rather was reduced in certain configurations.

Takennickname · on March 23, 2024

I will make a 2 trillion parameter model just so your comment becomes outdated and wrong.

razodactyl · on March 25, 2024

I approve this comment.

GaggiX · on March 23, 2024

ChatGPT is 20B according to Microsoft researchers, also the fact that big AI models are trillion parameter models is mostly speculation, about GPT-4 it was spread by geohot.

speedgoose · on March 23, 2024

To be precise, ChatGPT 3.5 turbo being 20B is officially a mistake from a Microsoft Researcher, quoting a wrong source published before the release of chatgpt3.5 turbo. Up to you to believe it or not. But I wouldn’t claim it’s a 20B according to Microsoft Researchers.

The withdrawn paper: https://arxiv.org/abs/2310.17680

The wrong source: https://www.forbes.com/sites/forbestechcouncil/2023/02/17/is...

The discussion: https://www.reddit.com/r/LocalLLaMA/comments/17jrj82/new_mic...

GaggiX · on March 23, 2024

It's interesting how the paper was completely retracted instead of just being corrected.

razodactyl · on March 25, 2024

Yep. It feels like a 20B parameter model.

HarHarVeryFunny · on March 23, 2024

GPT-3 was 175B, so it'd be a bit odd if GPT-4 wasn't at least 5x larger (1T), especially since it's apparently a mixture of experts.

razodactyl · on March 25, 2024

I think it became apparent when mixtral came out. I've noticed too during training that my model overwrites useful information so it makes sense for these types of models to have emerged.

bradley13 · on March 23, 2024

Disagree. A few weeks ago, I followed a step-by-step tutorial to diwnlad ollama, which in turn can download various models. On my not-soecisl laptop with a so-so graphics card, Mixtral runs just fine.

As models advance, they will become - not just larger - but also more efficient. Hardware advances. Large models will run just fine on affordable hardware in just a few years.

Closi · on March 23, 2024

I’ve come to the opposite conclusion personally - AI model inference requires burst compute, which particularly suits cloud deployment (for these sort of applications).

And while AIs may become more compute-efficient in some respects, the tasks we ask AIs to do will grow larger and more complex.

Sure you might get a good image locally but what about when the market moves to video? Sure chat GPT might give good responses locally, but how long will it take when you want it to refactor an entire codebase?

Not saying that local compute won’t have its use-cases though… and this is just a prediction that may turn out to be spectacularly wrong!

aerhardt · on March 23, 2024

Ok but yesterday I was on a plane coding and I wouldn’t have minded having GPT4 as it is today available to me.

FeepingCreature · on March 23, 2024

Thanks to Starlink, planes should have good internet soon.

michaelt · on March 23, 2024

Huge models are the best type to open source.

You get all the benefits of academics and open source folks pushing your model forward, and a vastly improved hiring pool.

But it doesn't stop you launching a commercial offering, because 99.99% of the world's population doesn't have 48GB+ of VRAM.

throwaway2037 · on March 23, 2024

First paragraph: These are wild stats. Thank you to share. How did they fund themselves, if no external funding?

andrewljohnson · on March 23, 2024

I wonder if MidJourney is still ripping. I’m actually curious if it’s superior to ChatGPT’s Dall-E images… I switched and cancelled my subscription when ChatGPT added images, but I think I was mostly focused on convenience.

famouswaffles · on March 23, 2024

If you have a particular style in mind then results may vary but aesthetically Midjourney is generally still the best, however Dalle-3 has every other model beat in terms of prompt adherence.

Legend2440 · on March 23, 2024

Image quality, stylistic variety, and resolution are much better than ChatGPT. Prompt following is a little better with ChatGPT, but MJ v6 has narrowed the gap.