Articles | Volume 5, issue 2
SOIL, 5, 177–187, 2019
https://doi.org/10.5194/soil-5-177-2019

Original research article 17 Jul 2019

Word embeddings for application in geosciences: development, evaluation, and examples of soil-related concepts

José Padarian and Ignacio Fuentes

Related authors

Additional soil organic carbon storage potential in global croplands
José Padarian, Budiman Minasny, Alex B. McBratney, and Pete Smith
SOIL Discuss., https://doi.org/10.5194/soil-2021-73, 2021
Manuscript not accepted for further review
Game theory interpretation of digital soil mapping convolutional neural networks
José Padarian, Alex B. McBratney, and Budiman Minasny
SOIL, 6, 389–397, https://doi.org/10.5194/soil-6-389-2020, 2020
A new model for intra- and inter-institutional soil data sharing
José Padarian and Alex B. McBratney
SOIL, 6, 89–94, https://doi.org/10.5194/soil-6-89-2020, 2020
Machine learning and soil sciences: a review aided by machine learning tools
José Padarian, Budiman Minasny, and Alex B. McBratney
SOIL, 6, 35–52, https://doi.org/10.5194/soil-6-35-2020, 2020
Using deep learning for digital soil mapping
José Padarian, Budiman Minasny, and Alex B. McBratney
SOIL, 5, 79–89, https://doi.org/10.5194/soil-5-79-2019, 2019

Related subject area

Soil and methods
Estimation of soil properties with mid-infrared soil spectroscopy across yam production landscapes in West Africa
Philipp Baumann, Juhwan Lee, Emmanuel Frossard, Laurie Paule Schönholzer, Lucien Diby, Valérie Kouamé Hgaza, Delwende Innocent Kiba, Andrew Sila, Keith Sheperd, and Johan Six
SOIL, 7, 717–731, https://doi.org/10.5194/soil-7-717-2021, 2021
The central African soil spectral library: a new soil infrared repository and a geographical prediction analysis
Laura Summerauer, Philipp Baumann, Leonardo Ramirez-Lopez, Matti Barthel, Marijn Bauters, Benjamin Bukombe, Mario Reichenbach, Pascal Boeckx, Elizabeth Kearsley, Kristof Van Oost, Bernard Vanlauwe, Dieudonné Chiragaga, Aimé Bisimwa Heri-Kazi, Pieter Moonen, Andrew Sila, Keith Shepherd, Basile Bazirake Mujinya, Eric Van Ranst, Geert Baert, Sebastian Doetterl, and Johan Six
SOIL, 7, 693–715, https://doi.org/10.5194/soil-7-693-2021, 2021
Developing the Swiss mid-infrared soil spectral library for local estimation and monitoring
Philipp Baumann, Anatol Helfenstein, Andreas Gubler, Armin Keller, Reto Giulio Meuli, Daniel Wächter, Juhwan Lee, Raphael Viscarra Rossel, and Johan Six
SOIL, 7, 525–546, https://doi.org/10.5194/soil-7-525-2021, 2021
Predicting the spatial distribution of soil organic carbon stock in Swedish forests using a group of covariates and site-specific data
Kpade O. L. Hounkpatin, Johan Stendahl, Mattias Lundblad, and Erik Karltun
SOIL, 7, 377–398, https://doi.org/10.5194/soil-7-377-2021, 2021
Improved calibration of the Green–Ampt infiltration module in the EROSION-2D/3D model using a rainfall-runoff experiment database
Hana Beitlerová, Jonas Lenz, Jan Devátý, Martin Mistr, Jiří Kapička, Arno Buchholz, Ilona Gerndtová, and Anne Routschek
SOIL, 7, 241–253, https://doi.org/10.5194/soil-7-241-2021, 2021

Short summary
A large amount of descriptive information is available in geosciences. Given the advances in natural language processing, it is possible to rescue this information and transform it into a numerical form (embeddings). We used 280 764 full-text scientific articles to train a language model capable of generating such embeddings. Our domain-specific embeddings (GeoVec) outperformed general-domain embeddings in tasks such as analogies, relatedness, and categorisation, and can be used in novel applications.
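
To make the relatedness and analogy tasks mentioned in the summary more concrete, the following is a minimal sketch of how they are commonly scored with word vectors: relatedness as cosine similarity, and analogies as nearest-neighbour search around a vector offset. The three-dimensional vectors, the vocabulary, and the function names (`cosine_similarity`, `solve_analogy`) are invented for illustration; they are not taken from GeoVec, and this is not necessarily the exact evaluation procedure used in the article.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# They are NOT GeoVec embeddings and come from no trained model.
toy_embeddings = {
    "sand": [0.9, 0.1, 0.0],
    "clay": [0.1, 0.9, 0.0],
    "coarse": [0.8, 0.0, 0.6],
    "fine": [0.0, 0.8, 0.6],
    "water": [0.3, 0.3, 0.1],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (a common relatedness score)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(a, b, c, embeddings):
    """Answer 'a is to b as c is to ?' by vector offset.

    Builds the target vector b - a + c and returns the vocabulary word
    (excluding a, b, and c) whose vector is most similar to it.
    """
    target = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    candidates = {w: v for w, v in embeddings.items() if w not in (a, b, c)}
    return max(candidates, key=lambda w: cosine_similarity(target, candidates[w]))


# Relatedness: with these toy vectors, "sand" is closer to "coarse" than to "fine".
related = cosine_similarity(toy_embeddings["sand"], toy_embeddings["coarse"])
unrelated = cosine_similarity(toy_embeddings["sand"], toy_embeddings["fine"])

# Analogy: "sand is to coarse as clay is to ?"
answer = solve_analogy("sand", "coarse", "clay", toy_embeddings)
print(related > unrelated, answer)
```

With real embeddings the same two functions would be applied to vectors of a few hundred dimensions and a vocabulary of many thousands of terms; only the lookup table changes.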