Hacker News

Show HN: PHP-fts – Full-text search engine in pure PHP, no extensions

75 points by asmodios yesterday at 8:28 PM | 18 comments

Comments

idoubtit yesterday at 10:56 PM

I expected a toy project, but it is a usable library, which required a lot of work. Good job on delivering. A few comments:

After reading "composer.json", I thought that the tests used a custom framework. I'm glad the project does not suffer from NIH syndrome, but the dev dependency on PHPUnit should be declared.

There should be a warning that it's only meant for some Western Latin languages. The normalization of the input is built on a character table that covers only a handful of cases. That's not enough for some Latin-script languages, e.g. Turkish. And any input in Cyrillic, Arabic, CJK and so on will be ignored.

There is no Unicode normalization or cleanup. Real-life input has many corner cases, e.g. diacritics stored as separate combining characters next to their base letters, or invisible characters inside a word to control hyphenation. Unless I'm mistaken, this engine would treat the NFD form of "fête" ("e" followed by the combining circumflex U+0302) as "fe te", instead of the expected "fete", which the NFC form (precomposed "ê", U+00EA) produces. I suggest using ext-intl for Unicode normalization, at least as an option.
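To make the corner case concrete, here is a minimal sketch of the kind of folding step I mean. It is written in Python only because its standard `unicodedata` module makes the idea easy to try; it is not the library's code, and the helper name `fold` is hypothetical. In PHP, the `Normalizer` class from ext-intl would play the role of `unicodedata.normalize`.

```python
import unicodedata


def fold(text: str) -> str:
    """Hypothetical helper: reduce a term to a diacritic-free, case-folded key."""
    # NFKD splits precomposed letters into base letter + combining marks,
    # so composed and decomposed input end up in the same shape.
    decomposed = unicodedata.normalize("NFKD", text)
    kept = [
        ch
        for ch in decomposed
        # Drop combining marks such as U+0302 (combining circumflex).
        if not unicodedata.combining(ch)
        # Drop invisible format characters such as U+00AD (soft hyphen).
        and unicodedata.category(ch) != "Cf"
    ]
    return "".join(kept).casefold()


composed = "f\u00eate"      # NFC: precomposed "ê"
decomposed = "fe\u0302te"   # NFD: "e" followed by a combining circumflex

# The two strings differ byte for byte, yet index to the same key.
assert composed != decomposed
assert fold(composed) == fold(decomposed) == "fete"
# A soft hyphen hidden inside the word no longer splits it.
assert fold("fe\u00adte") == "fete"
```

Running the fold before tokenization means a word is indexed the same way regardless of how the source text happened to encode it. Stripping every combining mark is itself a language-specific choice, so it would probably need to be optional.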

Lastly, I can't think of a use case for this library. I've always had access to some external service (MySQL, PostgreSQL, Manticore Search, Solr, etc.) or to a PHP extension for a local SQLite with FTS. Even for hobby projects, I haven't deployed to shared hosting for more than two decades.

ulrischa today at 7:08 AM

Great tool. Does it work with German umlauts (äöü)? I find it very useful because shared hosting is still big for me. I use ultra-cheap shared hosting for nearly everything. No server maintenance and no funky serverless stuff.

isaisabella today at 5:42 AM

Great start! This bridge between LIKE and a full-blown engine is exactly what's needed for the PHP long-tail.

francislavoie yesterday at 11:57 PM

We've been using https://github.com/loupe-php/loupe, works quite well for small-to-medium single-instance apps.

captn3m0 yesterday at 10:08 PM

Zend used to maintain a PHP port of Lucene 15 years ago that I used, but not sure what happened to it.

BoxedEmpathy today at 3:45 AM

This is super cool! Thank you!

cpollett yesterday at 9:47 PM

Code looks pretty clean. It's small and compact, with decent benchmarks. Might want to consider using an autoloader for classes.

ksamantha today at 5:26 AM

This is amazing. I hope I can do something like this too someday.