Non-sucking, easy tool to convert any website to LLM-ready data: Mojo (github.com/malvads)
2 points by malvads 25 days ago | 3 comments


After running into only paid tools or overly complicated setups for turning web pages into structured data for LLMs, I got tired of it. I wanted a free, open-source way to convert websites to Markdown (for NotebookLM or any RAG-like solution), so I built Mojo.

Mojo is extremely fast, supports proxy rotation, and is MIT licensed -> https://github.com/malvads/mojo


It should start by looking at robots.txt.
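
For illustration, a robots.txt check before fetching a page could look roughly like the sketch below. This is not Mojo's code: it uses Python's standard urllib.robotparser, and the helper name allowed_by_robots, the "example-crawler" user agent, and the sample rules are all made up for the example.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url.

    Hypothetical helper for illustration; a real crawler would first download
    https://<host>/robots.txt and pass its body in as robots_txt.
    """
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access happens here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample rules (assumed for the example): block /private/ for every agent.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for url in ("https://example.com/docs/", "https://example.com/private/x"):
        verdict = allowed_by_robots(SAMPLE_ROBOTS, "example-crawler", url)
        print(url, "->", "allowed" if verdict else "blocked")
```

Parsing the text separately from fetching it keeps the check easy to test, and a crawler can cache one parser per host instead of re-reading robots.txt for every URL.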


Hi, thanks for your comment (it's on the plan). Mojo is early-stage software, so there are still things that need to be integrated. However, Mojo is not a mass crawler (you have to specify directly what to crawl), so even once I add robots.txt support (which is planned), evil users can still just bypass it. I expect Mojo to be used by technical (non-evil) folks.

But thanks for your suggestion :)



