Articles | Volume 5, issue 2
SOIL, 5, 177–187, 2019
https://doi.org/10.5194/soil-5-177-2019

Original research article | 17 Jul 2019

Word embeddings for application in geosciences: development, evaluation, and examples of soil-related concepts

José Padarian and Ignacio Fuentes

Interactive discussion

Status: closed
AC: Author comment | RC: Referee comment | SC: Short comment | EC: Editor comment

Peer-review completion

AR: Author's response | RR: Referee report | ED: Editor decision
ED: Revision (16 Apr 2019) by John Quinton
AR by José Padarian on behalf of the Authors (29 May 2019)
ED: Publish subject to minor revisions (review by editor) (07 Jun 2019) by John Quinton
AR by José Padarian on behalf of the Authors (08 Jun 2019)
ED: Publish as is (14 Jun 2019) by John Quinton
ED: Publish as is (03 Jul 2019) by Johan Six (Executive Editor)
Short summary
A large amount of descriptive information is available in the geosciences. Advances in natural language processing make it possible to rescue this information and transform it into a numerical form (embeddings). We used 280 764 full-text scientific articles to train a language model capable of generating such embeddings. Our domain-specific embeddings (GeoVec) outperformed general-domain embeddings in tasks such as analogies, relatedness, and categorisation, and can be used in novel applications.
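
To make the analogy task mentioned in the summary concrete, the following is a minimal sketch of how an analogy is commonly solved with word embeddings: vector arithmetic followed by a cosine-similarity search. It is not the authors' evaluation code. The three-dimensional vectors, the vocabulary, and the helper names `cosine` and `solve_analogy` are invented purely for illustration and are not GeoVec values.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# Real embeddings such as GeoVec have many more dimensions and are learned from text.
toy_embeddings = {
    "clay": [0.9, 0.1, 0.0],
    "fine": [0.8, 0.0, 0.2],
    "sand": [0.1, 0.9, 0.0],
    "coarse": [0.0, 0.8, 0.2],
    "river": [0.1, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c, embeddings):
    """Return the word d that best completes 'a is to b as c is to d'."""
    # Target vector: b - a + c, computed component-wise.
    target = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    # The query words themselves are excluded from the candidates.
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(embeddings[w], target))


# "clay" is to "fine" as "sand" is to ...?
answer = solve_analogy("clay", "fine", "sand", toy_embeddings)
print(answer)
```

With these hand-picked toy vectors the closest remaining word is "coarse". On trained embeddings the same procedure is typically scored as the fraction of analogy questions answered correctly.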