Artificial intelligence (AI) is being used all around us in our personal and professional lives. With constant improvements to technologies and the data science driving these developments, it is now common to encounter AI in some form in everyday life. One area of AI that is being utilised globally is machine learning.
Machine learning is the study and development of computer algorithms that carry out a specified function and analyse large datasets to continuously improve their decision making, accuracy and predictions. Simply put, the better the algorithm, the more accurate its predictions and decisions will become over time as more data is processed.
Examples of machine learning can be found in almost every major technological tool, from Google and social media platforms to email inbox filters and automated vacuum cleaners. As big data continues to grow, along with improvements to computing and data science capabilities, machine learning will continue to be integrated into all aspects of the human world.
Within the international aid and agricultural development sectors, we are starting to see a shift of focus to big data and digital development. This is also the case at CABI, as we work globally on various agricultural development projects and have a multitude of digital tools designed to improve smallholder farmer livelihoods and crop yields. For larger projects and programmes such as the global-reaching Plantwise programme, we are exploring the benefits of machine learning and how we can integrate such technologies into our suite of digital tools and platforms.
There are four steps for building a machine learning application:
- Select and prepare a data set – training data representative of the real data that the machine learning algorithm will be analysing. Often the data is tagged so that specific features can be classified by the machine learning model.
- Choose an algorithm to run on the data – algorithms analyse data using statistical processing; the type of algorithm used depends on the problem being solved and the type of data available.
- Train the algorithm to create the model – this involves running the algorithm on the training data and comparing its outputs with the ideal results. Data scientists adjust the algorithm's parameters (such as weights and biases) to produce more accurate or desirable results. The final product is an accurate, fully trained algorithm: this is the model.
- Use and improve the model – this is the final step in creating a machine learning application. Using the trained model, run the algorithm against new or ‘real’ data. Ideally, the model will improve in accuracy over time as it processes more data.
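To make the four steps more concrete, here is a minimal sketch in Python. It is only an illustration: the feature values, the labels and the choice of a nearest-centroid classifier are all assumptions made for this example, not a description of any CABI system.

```python
from statistics import mean

# Step 1: select and prepare a tagged data set.
# Hypothetical rows: (leaf_spot_count, leaf_yellowing_percent) plus a label.
training_data = [
    ((1.0, 5.0), "healthy"),
    ((2.0, 10.0), "healthy"),
    ((0.0, 8.0), "healthy"),
    ((9.0, 60.0), "diseased"),
    ((12.0, 75.0), "diseased"),
    ((10.0, 55.0), "diseased"),
]


# Step 2: choose an algorithm. Here we assume a nearest-centroid classifier,
# picked only because it is short enough to read in full.
def train(rows):
    """Step 3: train the algorithm by computing one centroid per label."""
    grouped = {}
    for features, label in rows:
        grouped.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*points))
        for label, points in grouped.items()
    }


def predict(model, features):
    """Step 4: use the model to label new data with the closest centroid."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


model = train(training_data)
prediction = predict(model, (8.0, 50.0))
print(prediction)
```

In this sketch, "improving the model" would simply mean appending newly confirmed records to `training_data` and calling `train` again, so that the centroids reflect more data.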
Machine learning in Plantwise:
Within the Plantwise e-plant clinic system, extension officers called Plant Doctors record data collected from plant clinics (i.e. smallholder farmer crop health issues). The data collected during these clinic sessions are stored in the Plantwise Online Management System (POMS) to be analysed for trends in crop health issues and pest outbreaks. An important aspect of any database is the need for clean data to unify records and therefore allow for effective analysis. Due to the international nature of the data being collected and stored in POMS, there are often various syntactic errors, such as spelling mistakes or local language differences. Within POMS there is the Harmonization Tool, which allows Data Managers to store multiple versions of a term. During harmonization, with the use of machine learning technology, when a record is identified as invalid the user will get a list of alternative suggestions based on the historical data inputs.
Following the harmonization of clinic data, the records go through the process of validation. This involves CABI scientists following a series of pre-defined guidelines to review individual text fields. The biggest challenge faced during validation is the sheer size of in-country clinic data and the complexity of the fields, as many are free text and therefore open to interpretation. If we know the flow of the validation steps and have a large previously validated dataset, it is possible to train a machine learning model using a questionnaire-style format to predict the outcome of the validation process.
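As a rough illustration of how suggestions could be drawn from historical inputs, the sketch below uses plain string similarity from Python's standard library (`difflib.get_close_matches`). This is a stand-in technique, not the machine learning approach used in POMS, and the term list and the `suggest_terms` function are hypothetical.

```python
from difflib import get_close_matches

# Hypothetical history: variants entered in the past, mapped to the
# harmonized term that a Data Manager stored for them.
HISTORICAL_VARIANTS = {
    "maize": "maize",
    "maze": "maize",
    "mahindi": "maize",  # local-language variant (Swahili)
    "tomato": "tomato",
    "tomatoe": "tomato",
    "cassava": "cassava",
    "casava": "cassava",
}


def suggest_terms(entry, variants=HISTORICAL_VARIANTS, limit=3):
    """Return harmonized terms whose known variants resemble the entry."""
    cleaned = entry.strip().lower()
    if cleaned in variants:
        # The entry has been seen before, so its stored term is the answer.
        return [variants[cleaned]]
    close = get_close_matches(cleaned, list(variants), n=limit, cutoff=0.6)
    suggestions = []
    for variant in close:
        term = variants[variant]
        if term not in suggestions:  # avoid listing the same term twice
            suggestions.append(term)
    return suggestions


print(suggest_terms("Tomatto"))
print(suggest_terms("mahindi"))
```

Note that pure spelling similarity cannot link a local-language name to its harmonized term unless that name already appears in the history, which is one reason storing multiple versions of a term matters.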
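One way to picture the questionnaire-style idea is sketched below, under several assumptions: each previously validated record is reduced to yes/no answers, the question names are invented for this example, and a simple k-nearest-neighbours vote stands in for whatever model would actually be trained.

```python
from collections import Counter

# Hypothetical questionnaire; each answer is True (yes) or False (no).
QUESTIONS = (
    "crop_named",
    "symptoms_described",
    "diagnosis_matches_symptoms",
    "recommendation_safe",
)

# Hypothetical previously validated records: (answers, outcome).
validated_records = [
    ((True, True, True, True), "valid"),
    ((True, True, True, False), "rejected"),
    ((True, False, True, True), "rejected"),
    ((True, True, False, True), "rejected"),
    ((True, True, True, True), "valid"),
    ((False, True, True, True), "rejected"),
    ((True, True, True, True), "valid"),
]


def hamming(a, b):
    """Count the questions on which two answer sets differ."""
    return sum(x != y for x, y in zip(a, b))


def predict_outcome(answers, records=validated_records, k=3):
    """Predict the outcome by majority vote of the k most similar records."""
    nearest = sorted(records, key=lambda r: hamming(answers, r[0]))[:k]
    votes = Counter(outcome for _, outcome in nearest)
    return votes.most_common(1)[0][0]


print(predict_outcome((True, True, True, True)))
print(predict_outcome((True, False, False, True)))
```

With a large validated dataset, the same structure would let a trained model flag the records most likely to fail validation, so that reviewers could prioritise them.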
As we continue to explore machine learning and how it can be integrated into digital systems, the hope is that AI technologies will support digital tools as well as enhance the sustainability of many CABI projects.
_______________________
If you would like to read more about how CABI are helping to improve the agricultural sector, visit our Plantwise site.