Here's one metric by which the overall idea space has been reduced in your first model: the distribution of proposals has become more concentrated, because 100 of them are tiny variations on one basic idea. The same holds in your second model: with word vectors X, if I pick (say) two ideas to fund at random, without replacement, they will never be the same idea, whereas with (X, X) that will sometimes happen.
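To put a number on the second case, here is a minimal sketch. It assumes the two funded ideas are drawn uniformly without replacement and that the entries of X are all distinct; the five-element pool and the `collision_probability` helper are made up for illustration.

```python
from fractions import Fraction
from itertools import combinations


def collision_probability(pool):
    """Exact chance that two items drawn without replacement are the same idea."""
    pairs = list(combinations(range(len(pool)), 2))
    same = sum(1 for i, j in pairs if pool[i] == pool[j])
    return Fraction(same, len(pairs))


# Hypothetical pool of 5 distinct ideas, standing in for the word vectors X.
X = [f"idea_{k}" for k in range(5)]
doubled = X + X  # the (X, X) case: every idea appears twice

p_single = collision_probability(X)
p_doubled = collision_probability(doubled)

print(p_single)   # 0
print(p_doubled)  # 1/9
```

Under those assumptions, a pool of n distinct ideas goes from a collision probability of 0 to 1/(2n - 1) once it is duplicated, which is one concrete sense in which the doubled distribution is more concentrated.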