
Would love to see an open source tool like this for parsing general text / CSV / fixed-width / HTML in a similar way.

Very nifty concept.



I made an open source tool that does something like this using a modified version of the Levenshtein distance algorithm: https://github.com/nathanathan/fuzzyTemplateMatcher

Here's a demo you can play around with: http://nathanathan.com/fuzzyTemplateMatcher/
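
For anyone curious what the general idea could look like, here is a minimal sketch. It is not the code from the repository above, just a plain token-level Levenshtein alignment with one modification: template tokens written as `{name}` act as wildcards that match any single input token at zero cost. The `{name}` placeholder syntax and the function name `match_template` are assumptions made up for illustration.

```python
def match_template(template, text):
    """Align template tokens with text tokens by edit distance.

    Hypothetical sketch: tokens like {name} are wildcards that match
    any single token for free. Returns (distance, captures).
    """
    t = template.split()
    s = text.split()
    n, m = len(t), len(s)

    def is_wild(tok):
        return tok.startswith("{") and tok.endswith("}")

    def sub_cost(i, j):
        # Cost of pairing template token i-1 with text token j-1.
        return 0 if is_wild(t[i - 1]) or t[i - 1] == s[j - 1] else 1

    # dist[i][j] = cost of aligning t[:i] with s[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist[i][j] = min(
                dist[i - 1][j] + 1,                   # template token unmatched
                dist[i][j - 1] + 1,                   # extra token in the text
                dist[i - 1][j - 1] + sub_cost(i, j),  # pair the two tokens
            )

    # Walk back through the table to recover what each wildcard matched.
    captures = {}
    i, j = n, m
    while i > 0 and j > 0:
        if dist[i][j] == dist[i - 1][j - 1] + sub_cost(i, j):
            if is_wild(t[i - 1]):
                captures[t[i - 1][1:-1]] = s[j - 1]
            i, j = i - 1, j - 1
        elif dist[i][j] == dist[i - 1][j] + 1:
            i -= 1
        else:
            j -= 1
    return dist[n][m], captures


distance, fields = match_template(
    "remind me to {task} at {time}",
    "please remind me to shop at noon",
)
print(distance, fields)
```

A low distance means the input roughly fits the template, and the captures are the extracted fields. Each wildcard only captures a single token here, so multi-word values would need a different cost scheme.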


It has some problems, e.g. take out the dog, call my mother (it is entirely possible I missed the point of the code).


I really want to open source the core parsing bit, which is based on diff-match-patch, but I've got some cleanup to do first. It is probably not hard to reverse engineer what we're doing; it is pretty simple!


I've been not quite needing this enough for years. Such is life.



