
Last month, NaturalText introduced Contex2Vec to mine words based on their context. This post expands on that work, this time using the Open Access articles in the PubMed data, to show that this Natural Language Processing method, built on Machine Learning algorithms, works on other kinds of data as well.

Let us look at a sample grouping of words. How this works, including statistics about the PubMed data, is covered separately.

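Legend: > 90% < 100% = the word is similar with an accuracy greater than 90% and less than 100%. Each heading term is followed by a similarity band and then the words that fall in that band.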
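
To make the band notation concrete, here is a minimal sketch of how similarity scores could be bucketed into 10-point bands and labelled the way the legend does. The function names `band_label` and `group_by_band`, the treatment of scores that fall exactly on a boundary, and the example scores are all assumptions made for illustration; they are not taken from Contex2Vec.

```python
from collections import defaultdict


def band_label(score: float, width: int = 10) -> str:
    """Return a label such as '> 90% < 100%' for a similarity score in percent.

    Assumption: a score exactly on a boundary (e.g. 80) is placed in the
    band above it, and a score of 100 is placed in the top band.
    """
    if not 0 <= score <= 100:
        raise ValueError("score must be a percentage between 0 and 100")
    lower = min(int(score // width) * width, 100 - width)
    return f"> {lower}% < {lower + width}%"


def group_by_band(scores: dict[str, float]) -> dict[str, list[str]]:
    """Group words by the band their similarity score falls into."""
    groups: dict[str, list[str]] = defaultdict(list)
    for word, score in scores.items():
        groups[band_label(score)].append(word)
    return dict(groups)


# Illustrative scores only; the post reports bands, not exact values.
example_scores = {"chinensis": 93.0, "vulgaris": 84.5, "coli": 72.0}
example_groups = group_by_band(example_scores)
for band, words in example_groups.items():
    print(band, words)
```

Fixed-width bands keep the output compact: a reader sees roughly how close two words are without being shown a score that suggests more precision than the method may have.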

FDR value
> 90% < 100% 
appear likely priori reason
> 70% < 80%
E value SI value additional indication alpha value arrow pointing average GC content average deviation central feature contamination rate effect appeared emerging consensus estimated cost exciting possibility formal possibility formal proof greatest decrease inherent advantages logical choice median delay overall mortality rate overall prognosis overt phenotype promising avenue single occurrence sound basis specialized role species differ specific importance target sample size ultimate objective urine pH
B alleles
> 90% < 100% 
G alleles
A+T content
> 70% < 80% 
angular resolution average GC content core function exact incidence largest fraction literacy rate main strength overall mortality rate overall prognosis overall survival rate
C-reactive protein level
> 70% < 80% 
serum potassium level
D value
> 70% < 80% 
coverage rate defensive role high viability target sample size
myostatin
> 70% < 80% 
RalA
mutual relationship
> 70% < 80% 
anatomical relationship causative relationship collaborative process complex relationship
January 2000
> 70% < 80% 
April 2011 April 2012 January 2002 January 2004 July 2006 March 2008
systemic disorder
> 70% < 80% 
chronic inflammatory condition common activity computational tool conservative strategy correction procedure safe method scorpion sensory modality simple mechanism smartphone application
T cell clone
> 70% < 80% 
ECM protein agent used broader approach chromosome segment circRNA decrease occurred excellent book gene variant important region innovative method
UBA domain
> 70% < 80% 
UIM biphasic effect combined regimen saccade target
V protein
> 70% < 80% 
patient still smaller protein
albicans cells
> 90% < 100% 
chinensis
> 80% < 90%
vulgaris
> 70% < 80%
coli harveyi
jejuni
> 80% < 90% 
albicans
W chromosome
> 70% < 80% 
approach proposed average life span co-actor crura doses tested extreme C-terminus final aim latter groups latter mechanism main use
Z value
> 70% < 80% 
reliability coefficient
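
For readers who want to work with the listing above programmatically, the following is a rough sketch of a parser for its layout: a heading term, then one or more pairs of a band line and a line of similar words. The function name `parse_groups` is hypothetical, and the rule that every band line is followed by exactly one line of words is an assumption read off the listing. Because the listing does not mark where one multi-word term ends and the next begins, the words are kept as a single string rather than split.

```python
import re

# Matches band lines such as "> 70% < 80%" (trailing spaces are stripped first).
BAND = re.compile(r"^>\s*(\d+)%\s*<\s*(\d+)%$")


def parse_groups(lines):
    """Parse the listing into {heading: {band: words_as_one_string}}."""
    groups = {}
    heading = None
    band = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = BAND.match(line)
        if match:
            # Normalise spacing so equal bands compare equal.
            band = f"> {match.group(1)}% < {match.group(2)}%"
        elif band is not None:
            if heading is None:
                raise ValueError("band line found before any heading term")
            # The line right after a band line holds the similar words.
            groups[heading][band] = line
            band = None
        else:
            # Any other line starts a new group.
            heading = line
            groups[heading] = {}
    return groups


sample = """albicans cells
> 90% < 100% 
chinensis
> 80% < 90%
vulgaris
> 70% < 80%
coli harveyi
jejuni
> 80% < 90% 
albicans
"""

parsed = parse_groups(sample.splitlines())
print(parsed["albicans cells"]["> 90% < 100%"])
```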

Email rajasankar@naturaltext.com to have your text data analysed.
