Project Everest

Work Update

Automating plant + pest classification // FarmEd's entry to machine learning

by Matt Allan | Jan 6, 2018 | in Knowledge Base

Context: (from what I've gathered) the future state of FarmEd is looking to offer services in a hands-off mobile/web app, which uses no (or minimal) human intervention. Hence, a lot of work is being done to understand and identify problems/solution patterns (eg. image databases of pests + plants). 

From a technical POV, I'd like to call out that the backend of what is envisioned will be orders of magnitude more difficult than the front end. Hence, I'm suggesting that the project begin developing its automation capabilities, in order to a) learn what works and what doesn't, b) learn the bounds of predictive modelling / machine learning (it can't do everything - not yet!), and c) begin improving the accuracy of the algorithm (which is a lengthy process).

Idea: the project leverages databases already built (or its own) to begin training a predictive model that will classify plants and pests. The model would receive an image and predict which plant or pest it shows.
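To make the "image in, label out" idea concrete, here is a toy sketch. It is not anything the project has built: the images are tiny hand-written lists of (R, G, B) pixels, the labels are made up, and the "model" is a simple nearest-centroid rule on average colour. A real system would use a properly trained image model, but the train/predict shape would look similar.

```python
from math import dist

def mean_colour(pixels):
    """Average the (R, G, B) pixels of an image into one feature vector."""
    n = len(pixels)
    return tuple(sum(p[i] for p in pixels) / n for i in range(3))

def train(labelled_images):
    """Build one centroid (average feature vector) per label."""
    features_by_label = {}
    for label, pixels in labelled_images:
        features_by_label.setdefault(label, []).append(mean_colour(pixels))
    return {
        label: tuple(sum(f[i] for f in feats) / len(feats) for i in range(3))
        for label, feats in features_by_label.items()
    }

def predict(centroids, pixels):
    """Return the label whose centroid is closest to the image's feature."""
    feature = mean_colour(pixels)
    return min(centroids, key=lambda label: dist(centroids[label], feature))

# Hypothetical training data: each "image" is just two pixels.
training_set = [
    ("cabbage", [(40, 160, 60), (50, 170, 70)]),
    ("cabbage", [(45, 150, 65), (55, 165, 60)]),
    ("aphid", [(30, 30, 20), (40, 35, 25)]),
    ("aphid", [(35, 40, 30), (25, 30, 20)]),
]

model = train(training_set)
print(predict(model, [(48, 158, 62), (52, 162, 66)]))
```

The point of the sketch is the workflow rather than the method: labelled images go in once to train, and afterwards any new image can be classified without a human expert in the loop.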

Tactically, this solves the problem of having to train each new group of trekkers into agricultural experts - the computer retains its intelligence. 

How: Similar capabilities have been built by research teams for plants and animals (below). The project would begin by investigating how to apply these programs to suit the specific use case of plant/pest classification.

See:

http://ijarcet.org/wp-content/uploads/IJARCET-VOL-5-ISSUE-11-2664-2669.pdf

http://tamaraberg.com/papers/berg_animals.pdf

With code: 

https://github.com/danforthcenter/plantcv

https://github.com/JavierLopatin/Grassland-Species-Classification/blob/master/README.md

Down the track, it's reasonable to consider how drone imaging can feed input to the model, and how the project can expand to other automated services (such as natural language processing, chatbots, and other suggestive models). The vibe about AI in the data analysis industry (I work for Westpac) is to start small, easy, and simple because machine learning has a way of getting big very quickly.

Floated the idea with Jess, and we wanted to record it here. 

edited on 6th January 2018, 11:01 by Matt Allan

Kaushik Bilimoria Jan 8, 2018

This is awesome. It's great to start thinking about this stuff given the direction we intend to head in and the potential roadblocks that will occur.


Zoe Paisley Jan 8, 2018

Hey Matt!
Sounds awesome. The reason we are collecting so many images of pests and diseases is to ease the training of whichever AI system we decide to use for the application. In July this year, the Drones Investigation team here in Fiji started training a chatbot, and back at home we began using AI and image recognition to learn the identified pests and diseases captured in the images. From this, we realised just how extensive our image database must be in order to train the machine effectively (as you've said).
We are currently investigating how we can leverage existing image databases of pests and diseases, so we reduce the labour involved in collecting and identifying all the images. I've contacted people/companies/databases to identify how we can work together and what their databases are like, so definitely looking into this.
In a simple MVP form of this feature, we wouldn't require the system to identify the plant, as it would work in conjunction with the chatbot. Meaning, the farmer would take an image of the pest and/or disease and specify what crop it is affecting by typing the crop name, e.g. cabbage or cassava. In doing so, we are reducing the number of variables the machine must learn (i.e. it doesn't have to identify the plant, as farmers know what they are growing), at least to begin with.
Definitely look into these and other sources that are open-sourced. Also, look into their APIs to see if we can access their code and/or database easily.
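As a rough sketch of the MVP described above, where the farmer types the crop name, the crop can be used to narrow down which pests the system is allowed to answer with. Everything here is an assumption for illustration: the `PESTS_BY_CROP` lookup, the `predict_for_crop` helper, and the score values (which stand in for whatever an image model would output).

```python
# Hypothetical crop-to-pest lookup; the real mapping would come from the
# project's own image database.
PESTS_BY_CROP = {
    "cabbage": {"diamondback moth", "aphid"},
    "cassava": {"mealybug", "whitefly"},
}

def predict_for_crop(scores, crop):
    """Pick the highest-scoring label among pests known to affect the crop."""
    allowed = PESTS_BY_CROP.get(crop.strip().lower())
    if not allowed:
        raise ValueError(f"unknown crop: {crop!r}")
    candidates = {label: s for label, s in scores.items() if label in allowed}
    if not candidates:
        raise ValueError("no scores for any pest known on this crop")
    return max(candidates, key=candidates.get)

# Made-up scores standing in for an image model's output.
scores = {
    "mealybug": 0.40,
    "aphid": 0.35,
    "whitefly": 0.15,
    "diamondback moth": 0.10,
}

print(predict_for_crop(scores, "Cabbage"))
```

Without the crop name the top score here would be a cassava pest; with it, the answer is limited to pests that make sense for cabbage.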


Matt Allan Jan 8, 2018

Hey Zoe, thanks!

Can you please link the discussion/writeup of your chatbot? Very interested.

How have you gone chasing down leads to these pre-existing databases of pests/pest-infected plants? If they exist, where can I find them?

The two links above are to source code - databases included. Someone (me if no one else offers) will need to investigate how applicable they are to our use-case of pest classification. I'll do some of that when I tour in Feb :)


Lisa Paisley Jan 8, 2018

100%. We need to trial what in the database does/doesn't work and what we should/shouldn't be collecting. We want to start collecting necessary data today, so we have enough to teach the AI system once it's up and running, hence why we have such a huge focus on data collection with the teams.
Totally agree with making the database now. It's something the PoC team in Fiji is working on, so if you want to give them a hand I've tagged the team below.

On the limitations of AI: yes, there are limits, but I was chatting with an app developer today and he said just 'cause it hasn't been done yet doesn't mean it can't be done. Some things are easier to train AI systems on, others are harder. So it's not going to happen instantly but I reckon it can still be done, and yeah, as you said, start small and go from there.


Matt Allan Jan 8, 2018

Hey Lisa, thanks for your reply.

@All, how can I get up to speed with what you're building? Keen to understand before I arrive in Cambodia in Feb.


Gabriel Raubenheimer Jan 9, 2018

Excellent idea, and it's great to see thought going into this prior to arriving in country - it'll be good to have direction and ideas from day zero once you arrive.

To preface this response, I'm a trekker in-country now on FarmEd (Cambodia), and as a Software Engineering major this is a primary area of interest for me.

I agree wholeheartedly that putting infrastructure and frameworks in place in the near future is very important, and that the machine learning and pattern recognition implementation will take time to build and tweak. Here are my expanded thoughts.

Firstly, from the perspective of someone on the ground, we have to maintain the ability to pivot and adapt rapidly at this stage, and I'm concerned that IT infrastructure would not be able to do that, and may even inhibit our ability to do so by directing/limiting our thinking too much. Well-designed and somewhat flexible infrastructure may be able to, and that is definitely worth exploring, so I'm very glad you've raised that. I think it would have to remain in a testing stage, segregated to some degree from the work we trekkers are doing on the ground, but this sort of ideation around how it's done is both brilliant and important. If you have any thoughts on how we might implement this whilst maintaining our maneuverability, they will always be valuable.

Finally, going back to pivoting, I'd like to build on your idea by saying it is very likely that the requirements of this software will be different when it is implemented than they are now, meaning that information that is fairly meaningless now may be highly pertinent in the future. Therefore, I think in order to support your ideas, database systems that will support our future app development need to be put in place. This will organise the information we do gather, inform the data we collect, and provide or supplement a basis for AI development.

Brilliant work, pumped to see your ideas - this was excellent. Also always happy to have a conversation if you want an update on what's going on on the ground.


Matt Allan Jan 9, 2018

Hey Gabriel - thanks for your discussion, I enjoyed your reply.

Re: databases, I could not agree more. With the focus FarmEd has on data at this stage, some basic forethought into structuring your relational databases could not be more important. Looks like you're on top of this already.
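For what it's worth, here is the kind of minimal relational structure I have in mind, sketched with Python's built-in `sqlite3` so it runs anywhere. The table and column names (`crop`, `label`, `image`, `file_path`, and so on) are my own assumptions, not the project's actual schema, and the rows are made-up examples.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE crop  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE label (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE image (
    id          INTEGER PRIMARY KEY,
    file_path   TEXT NOT NULL,
    captured_on TEXT,                          -- ISO date string
    crop_id     INTEGER REFERENCES crop(id),
    label_id    INTEGER REFERENCES label(id)   -- NULL until identified
);
""")

# Made-up example rows.
conn.execute("INSERT INTO crop (name) VALUES ('cabbage')")
conn.execute("INSERT INTO label (name) VALUES ('aphid')")
conn.execute(
    "INSERT INTO image (file_path, captured_on, crop_id, label_id) "
    "VALUES ('img/0001.jpg', '2018-01-09', 1, 1)"
)
conn.execute(
    "INSERT INTO image (file_path, captured_on, crop_id) "
    "VALUES ('img/0002.jpg', '2018-01-09', 1)"
)

def labelled_counts(connection):
    """Count identified images per (crop, label) pair."""
    return connection.execute("""
        SELECT crop.name, label.name, COUNT(*)
        FROM image
        JOIN crop  ON crop.id  = image.crop_id
        JOIN label ON label.id = image.label_id
        GROUP BY crop.name, label.name
    """).fetchall()

unlabelled = conn.execute(
    "SELECT COUNT(*) FROM image WHERE label_id IS NULL"
).fetchone()[0]

print(labelled_counts(conn), unlabelled)
```

Keeping labels in their own table, and allowing an image to sit unlabelled, means photos can be collected now and identified later without restructuring anything.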

For the basic level of machine learning that we'd start off with, there's very minimal infrastructure involved (AWS is cheap as chips for stuff like this and software is free) - unless I'm missing something(?). Choosing a tech stack (if that counts as infrastructure) shouldn't be a limiting factor at this point for basic tools since, as you mentioned, the context will likely change for production code in the future, which is something we can't control.


Gabriel Raubenheimer Jan 9, 2018

Thanks for the response Matt. I agree regarding physical infrastructure - we have access to Azure as well, so that's easy, and obviously Azure plugs in well with SQL databases so we're covered in that regard.

Regarding infrastructure, I specifically mean the code we write (unless there are appropriate preexisting programs, in which case fantastic). In regards to the tech stack, I agree that we can't control the way that changes and therefore we shouldn't be too worried about it, but it is a factor, because it means that most of what we write is only going to be used in the short term.

A question I'm interested in exploring is when the optimal time is to implement this prototype, more in terms of human resources (time) than anything else. That is, do we have enough conceptual stability now to be worth the investment, or should we develop the project further first? I know it doesn't necessarily need to be a huge time investment to write a basic prototype, but in my experience, and as you'll no doubt have experienced too, any vaguely experimental software prototype tends to take vastly more time than one would think. We also need to make sure we understand the relationship between our prototype and the potential of the final product, in that the former has to approximate the latter to an appropriate degree in order to be worth doing. This matters especially because one of the most important things for us to test is the limitations of the AI.

I don't mean to sound vaguely negative. Quite the opposite is true - I'm extremely excited about the idea, and am very keen to explore it more and to get it off the ground as soon as we can. These questions are nonetheless interesting and (at least in my mind) worthwhile to explore, and I very much encourage and would enjoy any further discourse around them. I also think there is huge value in developing a prototype, as it will allow us to work out things such as how we take the photographs used for training and analysis, and to test different AI models to see what is most effective.

I look forward to hearing what you think, very much enjoying the discussion.


Matt Allan Jan 10, 2018

Timing-wise, good thought. I'll need to get up to speed with the project on the ground before I can make an assessment. What do you think?



Darcy Connaghan Jul 1, 2018

Status label added: Work Update
