Show HN: I just open sourced my document/website extractor for Vision-LLMs (github.com/emcf)
37 points by emmettm on April 2, 2024 | 4 comments
Hi HackerNews,

Lately I have seen an explosion of posts offering paid APIs/services for getting unstructured data into LLMs (e.g. langchain extract, ragflow, unstructured, unstract, just to name a few), and I have been largely disappointed by them: they either fail to implement multimodal support, fail to give good context for "really tricky" PDFs / Word docs / PowerPoints, or are just plain difficult to use. In light of all these posts, I figured I'd share my solution, which has been working smoothly for me and my clients. I put it up on GitHub for free so you can check it out and hopefully offer some feedback / criticism or contribute to the code yourself.

And BTW, I'm not trying to throw shade at any of the services mentioned. I'm just giving my honest experience in case there are others out there who feel the same way and want something that works.

Cheers!



I saw this on GitHub the other day and have been playing around with it for PDFs & the results are pretty good. I'm not sure how it compares to the paid services. I did try langchain's though and it sucked.


Sorry, I looked at the GitHub and I don’t understand. Can you please explain the use case? Thank you.


What's the cost?




