At the beginning of 2016, I tried to solve the grammar correction problem in NLP. The goal: given a text or a line, the algorithm should find grammatical errors and correct them by replacing each incorrect word with a correct one. See the blog post Machine Learning Algorithm takes Grammar Test.

That experiment failed for several reasons. First and foremost, the accuracy of the algorithm was not good, and I couldn't find out why it failed some tests but passed others. Another reason was that the underlying database couldn't sustain the workload. I tried various databases, including graph and NoSQL databases, but none were up to the mark. After trying for a few months, I shelved the algorithm and moved my efforts to Search and Deep Learning in NLP, which resulted in Contex2vec and Finding similar lines.

After that, I wrote a disk-persistent database that can sustain 100k queries per second on a single machine. Of course, this is much higher throughput than any database available today, except maybe in-memory databases. So I tried again to solve the grammar correction problem, which resulted in Protein Sequence Search and Generating New Word Sequences.

The current solution not only fixes the errors in the previous version of the algorithm but also aims for 100% accuracy. The current demo is not the full version, owing to time and resource constraints. It is a proof of concept of how grammar correction would work. More data will be added later to increase coverage.

This works by logical reasoning based on a graphical model.
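The post does not describe the graphical model itself, so the sketch below is only a toy illustration of the general idea of correcting a word by reasoning over a graph. It is not the actual algorithm. It assumes a tiny made-up corpus, builds a directed graph of which word was seen following which, and replaces a word only when a different word would restore both the incoming and outgoing edge. The names `CORPUS`, `build_graph` and `correct` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy corpus; the post does not describe its actual data.
CORPUS = [
    "she goes to school every day",
    "he goes to work every day",
    "they go to school every day",
    "she walks to work",
]

def build_graph(sentences):
    """Return a directed graph: word -> set of words seen right after it."""
    graph = defaultdict(set)
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for left, right in zip(words, words[1:]):
            graph[left].add(right)
    return graph

def correct(line, graph):
    """Replace a word that breaks an edge with one that restores both edges."""
    words = ["<s>"] + line.split() + ["</s>"]
    for i in range(1, len(words) - 1):
        prev, word, nxt = words[i - 1], words[i], words[i + 1]
        if word in graph.get(prev, ()) and nxt in graph.get(word, ()):
            continue  # both edges were observed; keep the word
        # Candidates must follow `prev` and be followed by `nxt` in the graph.
        candidates = sorted(w for w in graph.get(prev, ()) if nxt in graph.get(w, ()))
        if candidates:
            words[i] = candidates[0]  # arbitrary tie-break: alphabetical order
    return " ".join(words[1:-1])

GRAPH = build_graph(CORPUS)
print(correct("she goes too school every day", GRAPH))
```

With this corpus, "too" is never seen after "goes", and "to" is the only word that connects "goes" to "school", so it is chosen as the replacement. A real system would need far more data and a way to rank competing candidates, which this sketch does not attempt.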

The demo is yet to be updated. Watch this space.

Check the Protein Search Demo, which uses the same algorithm:

Protein Alignment and Search by Graphical Models

Related Posts:

Generating new sentences in Natural Language as a Graph Clique Problem: Graphical Reasoning based solution

Follow @naturaltext