Updated March 5, 2026
0:00 Welcome to Colaberry AI podcast brought to you by Colaberry AI Research Labs and the Carl Foundation. Today on the deep dive, we're gonna be looking at, some really cool stuff from Colaberry Inc. Okay. They partnered with Elena Lynn, and they're working to bring AI and data science to smallholder farmers. Oh, wow. 0:21 That's super interesting. Fascinating. Yeah. And and we're gonna look at how they do digital onboarding and risk assessment using AI to really change the game for these farmers. I'm very interested to hear how AI is being used for something so practical. 0:35 Yeah. Me too. And and we're talking about regions all over the globe, like India, Kenya, Ivory Coast, and more. Okay. So this is gonna be really impactful then Yeah. 0:44 If it can be applied in all these different places Right. For so many farmers. Because the goal here is to, like, to to, like, really improve their productivity Mhmm. And to give them better access to things like financial services. Makes sense. 0:58 I mean, smallholder farmers often face a lot of challenges when it comes to getting loans and stuff. Right? Exactly. It's like traditional banking systems just aren't set up for them. One of the big things they're trying to solve here. 1:09 I see. Yeah. So from what I've seen in in the material, it looks like they're building a digital platform and then leveraging AI to make these assessments, not just bringing the farmers on board digitally, but, like, actually evaluating the risk dynamically. Okay. So it's not just about digitizing existing processes, but about actually using the AI to do something new. 1:34 Exactly. Okay. I like that. So let's dive into, like, how they're actually doing this. I mean, it starts with machine learning algorithms to do the risk assessment. 1:43 Yeah. But algorithms are only as good as the data they use. Right? Right. So what kind of data are they feeding into this thing? 1:48 Well, that's what's really cool here. 
They are pulling data from all sorts of places, like, way more than you'd normally think of. Okay. So they're using satellite imagery so they can, like, look at the land, see how it's being used. That's cool. 2:00 And then they've got, like, sensors in the field, like, right in the soil. Mhmm. So they get data on soil health. So they're getting real-time data from the field. Mhmm. 2:10 That's impressive. And they're even factoring in weather data, real time from weather stations. Okay. So they have the environmental factors covered. Yes. 2:18 What about the farmers themselves? Oh, they get that too. Yeah. So they obviously build detailed farmer profiles. They look at land ownership records, really trying to get a clear picture of the farmer's situation. 2:31 Okay. And then they look at the crop rotation cycles the farmer uses. Right. Because different crops have different risks and needs. Exactly. 2:38 And then one of the biggest things is they look at the transaction history. So their financial history. Yes. So everything related to agriculture: what they sell, any previous loans or financial interactions. So they're really painting a complete picture then? 2:56 Yeah. It's pretty comprehensive. That's a lot of data, though. How do they even begin to make sense of all of that? Okay. 3:02 So this is where it gets really technical. They have a pretty intense data processing pipeline. Okay. Lay it on me. It relies heavily on Azure Data Factory. 3:11 Okay. And what does that do? So think of it as, like, the brains of the operation for the data. It handles what's called ETL. ETL. 3:18 Yeah. It stands for extract, transform, and load. Okay. So you've got all this raw data coming in from all these different sources we talked about. Right. 3:26 Azure Data Factory first cleans it all up. Because real-world data is always messy. Yeah. Gotta get rid of inconsistencies and errors, make sure it's all accurate. Yeah. 
3:37 Then it normalizes the data so everything is on a comparable scale. Makes sense. You can't compare apples and oranges as they say. Exactly. And then finally, it structures everything into, like, a nice relational format. 3:48 So it's all organized and ready for the AI to actually work with it. Yeah. Like, taking a giant messy pile of puzzle pieces and sorting them all out. I like that analogy. Yeah. 3:57 But isn't there another step? Something about feature engineering. Oh, yeah. That's a really crucial part. It's where they create new variables from all that cleaned-up data. 4:09 So they're not just using the raw data as is? No. They're making it even more useful, more informative. Okay. Like, for example, instead of just using raw rainfall numbers, they might create a variable that looks at the length of dry spells during, like, critical crop growth stages. 4:25 I see. Because that's way more relevant to the plant's health than just the total rainfall. Right. Right. That's a much better indicator of potential stress on the crops. 4:33 Exactly. Okay. So they've got all this data prepped and ready to go. Now what? How do they actually put it to use? 4:40 So this is where the AI magic really happens. They have this dynamic risk scoring engine. Okay. And what's so cool about it is that it's not just like a one-time assessment. It's constantly updating as new data comes in. 4:52 So it's dynamic. It adapts to the changing conditions of the farmer and the environment? Yes. It uses all that processed data, all those engineered features to generate these really individualized risk scores for each farmer. And how are those scores actually used? 5:08 So financial institutions can use these scores to make better decisions about lending money. Oh, I see. Because now they have a much more nuanced understanding of the farmer's risk profile. That makes sense. It helps them assess the creditworthiness of farmers who might not have traditional collateral. 5:24 Yeah. 
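
To make the dry-spell example from the feature engineering discussion concrete, here is a minimal sketch of how such a variable could be computed. It is an illustration only, not the platform's actual code: the function name, the 1.0 mm dry-day threshold, the sample rainfall values, and the day range used for the critical growth stage are all assumptions.

```python
from typing import Sequence


def longest_dry_spell(
    daily_rain_mm: Sequence[float],
    start: int,
    end: int,
    dry_threshold_mm: float = 1.0,  # assumed cutoff for a "dry" day
) -> int:
    """Return the longest run of consecutive dry days in days [start, end)."""
    longest = 0
    current = 0
    for rain in daily_rain_mm[start:end]:
        if rain < dry_threshold_mm:
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a wet day ends the current dry run
    return longest


# Hypothetical daily rainfall in millimetres for ten days.
rain = [0.0, 0.0, 12.5, 0.2, 0.0, 0.0, 0.0, 8.0, 0.0, 3.1]

# Assume the critical growth stage covers days 2 through 8.
dry_spell_days = longest_dry_spell(rain, start=2, end=9)
total_rain_mm = sum(rain[2:9])

print(dry_spell_days)  # longest dry run inside the window
print(total_rain_mm)   # total rainfall inside the same window
```

The window total looks healthy because of two wet days, while the dry-spell feature exposes the four-day gap between them, which is the kind of crop stress signal the hosts describe.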
It's directly addressing that problem of financial exclusion that we were talking about earlier. That's really important, especially in regions where smallholder farmers are such a vital part of the economy. Right. And the farmers don't just benefit from getting loans easier. 5:38 They also get valuable insights from the AI themselves. Oh, how so? So the platform delivers what they call real time crop intelligence, often through just simple SMS messages. Oh, that's clever. Using technology that's already widely accessible. 5:53 Yeah. So they might get advice on when to irrigate, how much to irrigate based on the latest weather forecasts and soil moisture readings. Okay. So it's providing actionable information that can help them make better decisions. Exactly. 6:06 Or they might get recommendations for pest control or even tips on the best time to harvest. Wow. So they're getting personalized advice tailored to their specific crops and conditions. Yeah. Pretty cool. 6:18 Right. That's amazing. And this isn't a static system. Right? The AI models are constantly learning and evolving? 6:24 You got it. It's all about adaptive learning. Okay. So as new data comes in, whether it's updated weather information, new yield data from the harvest, loan repayment records, the AI models get retrained and refined. So the system's constantly improving its accuracy over time. 6:41 Exactly. The more data it gets, the smarter it gets. That's fantastic. The sources also mentioned that they generate different types of scores. Can you elaborate on those a bit? 6:50 Yeah. So they've got the distress score, the risk score, and the alternate credit score. Okay. Break those down for me. So the distress score, that's all about figuring out how financially vulnerable a farmer is at any given time. 7:02 I see. It helps lenders and investors identify farmers who might be facing, like, immediate hardship or who might be more likely to default on a loan. 
So it's a way to flag potential problems early on and maybe intervene to provide support. Exactly. Okay. 7:17 What about the risk score? So the risk score is a more general assessment of the overall risk associated with a farmer. Okay. From the perspective of lenders, insurance providers, that kind of thing. Yeah. 7:28 Anyone who has a stake in the farmer's success. Right. And then there's the alternate credit score. What's that all about? So that one's really interesting. 7:36 It recognizes that traditional credit scores don't always work well for smallholder farmers. Because they often lack a formal credit history. Yeah. Exactly. So this alternate credit score tries to take a more holistic view, incorporating things like their social standing in their community, their involvement in co-ops. It's to get a better sense of their creditworthiness. 7:59 So it's trying to level the playing field and give them a fair chance to access financial services. Yeah. That's pretty innovative. Yeah. Now I'm curious to see how all of this actually plays out in real world scenarios. 8:11 The sources talk about something called farmer subscription churn prediction. Oh, yeah. So that's all about identifying farmers who might be thinking about leaving the platform. And why is that important? Well, if they lose subscribers, that impacts the sustainability of the whole thing. 8:25 Right? Right. Of course. So how do they predict churn? They use historical data on how farmers interact with the platform. 8:32 Okay. So they look at things like how often they use different features, how responsive they are to SMS messages. So engagement metrics. Yeah. And even their reported yield data. 8:44 And they feed all that into classification models. And what do these models actually do? So these models basically learn the patterns in the data that are associated with farmers who have churned in the past. I see. 
And then they apply those patterns to the current subscribers to figure out who's most at risk of leaving. 9:01 So they can be proactive and try to retain those farmers. Exactly. And which algorithms are they using for this churn prediction? The sources specifically mentioned logistic regression and random forest. Classic classification algorithms. 9:14 Yeah. And they're well suited for this kind of problem where you're trying to predict a yes or no outcome, like, will they churn or not? Makes sense. And who are the main users of these churn predictions? So it's mainly the subscription managers and the marketing teams. 9:29 So they can see who's at risk and maybe reach out to them with some targeted offers or support? Exactly. And the ultimate goal is to develop really effective retention strategies. Keep those farmers engaged and happy with the platform. Yeah. 9:43 Okay. That's a really practical example of how they're using predictive analytics. The other example mentioned is contract fulfillment prediction. Tell me more about that. So that one's all about minimizing the risk that farmers might not fulfill their contracts. 9:58 Like not delivering the agreed upon quantity of crops. Yeah. Or maybe not meeting the quality standards. Right. That can be a big problem for buyers. 10:06 Absolutely. So to predict this, they analyze all the contract related data, like the terms of the agreement, the farmer's history of fulfilling contracts, along with their harvest data, like predicted yields, historical yield variability, that kind of thing. So they're looking at both the contractual side and the actual agricultural production side. Mhmm. And they use classification models again, but this time, they're trying to predict the likelihood of a farmer fulfilling their commitments. 10:33 And for this contract fulfillment prediction, are they using the same algorithms as for churn? Actually, they're using more advanced stuff here. Okay. Like what? 
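
The churn prediction idea from this segment can be sketched in a few lines. The sources name logistic regression and random forest; the sketch below hand-rolls a small logistic regression with plain gradient descent so it runs without any libraries, which is a simplification and not how the platform is known to implement it. The two engagement features (feature usage rate and SMS response rate, both scaled to 0..1), the toy history, and the hyperparameters are all invented for illustration.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def churn_probability(x, weights, bias) -> float:
    """Probability that a subscriber with features x churns (label 1)."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical history: (feature_usage_rate, sms_response_rate), churned?
history = [(0.9, 0.8), (0.8, 0.9), (0.7, 0.6), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2)]
churned = [0, 0, 0, 1, 1, 1]

weights, bias = train_logistic(history, churned)

# Score two current subscribers: one disengaged, one highly engaged.
at_risk = churn_probability((0.15, 0.1), weights, bias)
engaged = churn_probability((0.85, 0.9), weights, bias)
print(at_risk > engaged)
```

In practice a subscription manager would sort current subscribers by this probability and reach out to the highest-scoring ones first, which matches the retention workflow the hosts describe.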
The sources mentioned XGBoost and neural networks. 10:47 So more sophisticated models capable of handling more complex relationships in the data. Yeah. Because contract fulfillment can be influenced by a lot of different factors, so they need more powerful algorithms. Makes sense. And who are these predictions intended for? 11:02 Primarily, the contract managers and the buyers. So they can better assess the risks involved and maybe adjust their strategies accordingly. Yeah. They can improve their contract design, put in place better risk management strategies. Okay. 11:14 So they're really trying to cover all their bases Mhmm. Predicting both subscriber churn and contract fulfillment. Yeah. It's pretty comprehensive. Now I'm curious about the technical side of things. 11:24 What kind of infrastructure are they using to support all of this? So they built the whole platform on Microsoft Azure. Okay. Cloud based infrastructure. Yeah. 11:32 It gives them the scalability and reliability they need to handle all that data and the demanding AI processing. Right. Because we're talking about massive amounts of data here. Yeah. So they're using Azure Blob Storage as their main data lake. 11:45 To store all those satellite images, sensor readings, historical records. Mhmm. And then they've got Azure Functions, which are these serverless compute services that run code on demand. Okay. For handling specific tasks. 11:59 And they're also using Azure Logic Apps to automate the workflows. So it's all very automated and interconnected. Yeah. And to make the application itself scalable and manageable, they're using containerization. Right? 12:10 Yeah. They use Docker to package the application Okay. And all its dependencies into these neat little containers. Makes it portable and ensures everything runs smoothly in different environments. Exactly. 12:21 And then they use Azure Kubernetes Service or AKS to manage and orchestrate all those containers. Ah, Kubernetes. Yeah. 
Essential for managing large scale containerized applications. Yeah. 12:33 And as we discussed earlier, Azure Data Factory is the glue that holds everything together. Right. It's the maestro of the data pipeline handling the ingestion, transformation, movement of data between all the different parts. Okay. That's a pretty impressive technical setup. 12:49 What about the software and development tools they're using? So they use Python for developing the machine learning models. Standard choice for data science. And then they've got all the usual suspects, TensorFlow, PyTorch, scikit-learn. A robust machine learning toolkit. 13:05 Yeah. And Power BI is key for creating those visualizations and dashboards so everyone can understand the insights. Makes the data accessible to a wider audience. Exactly. And then they use C# for the back end stuff. 13:16 Robust and reliable for building those core services. And, of course, they rely heavily on Microsoft Fabric and SQL Server for managing all that structured data. Solid choices for relational database management. And for building the APIs, they're using ASP.NET Core. Okay. 13:31 So a well rounded set of tools for a complex application. Now how do they actually integrate the AI models into the platform itself? So they deploy the models as RESTful APIs using Azure App Service. A standard way to make machine learning models accessible to other applications. Yeah. 13:49 So any part of the LNM platform can send data to these APIs and get predictions back in real time. Seamless integration. And they're using Azure Synapse Analytics to bring together all the different data sources. Right. So they can get those comprehensive insights from the combined data. 14:05 Okay. So they've built a technically sophisticated system. But the real question is, what are the actual outcomes? Is it making a difference? From what we can tell, it's having a pretty big impact. 
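
As a rough picture of the "send data to the API, get a prediction back" pattern mentioned in this segment, here is a small client-side sketch. The sources do not give an endpoint, request schema, or response format, so the URL, the field names (`features`, `risk_score`, `longest_dry_spell_days`, `loans_repaid`), and the sample reply are all hypothetical. The sketch is in Python for consistency with the other examples, even though the back end is said to use C# and ASP.NET Core.

```python
import json
import urllib.request

# Hypothetical endpoint: the sources do not name a real URL or schema.
RISK_API_URL = "https://example.invalid/api/risk-score"


def build_score_request(
    farmer_features: dict, url: str = RISK_API_URL
) -> urllib.request.Request:
    """Package farmer features as a JSON POST request for a scoring API."""
    body = json.dumps({"features": farmer_features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_score_response(raw_body: bytes) -> float:
    """Extract the score from an assumed {"risk_score": <number>} reply."""
    payload = json.loads(raw_body.decode("utf-8"))
    return float(payload["risk_score"])


request = build_score_request({"longest_dry_spell_days": 4, "loans_repaid": 2})

# Against a real service the call would look like:
#   raw = urllib.request.urlopen(request, timeout=10).read()
# Here a canned reply stands in for the network round trip.
sample_reply = b'{"risk_score": 0.27}'
score = parse_score_response(sample_reply)
print(score)
```

Keeping request building and response parsing in separate functions means any part of a platform can reuse them, and both can be checked without a live service.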
14:13 They've seen significant improvements in yield prediction accuracy. Which means farmers can make better decisions about planting and resource allocation. Exactly. And that translates into higher yields and potentially more income. That's fantastic. 14:26 What about financial access? Is the AI driven risk assessment actually helping farmers get loans? Yeah. They've definitely seen an increase in financial access for farmers who would have traditionally struggled to get loans. Because lenders now have more confidence in their ability to repay. 14:42 Exactly. And that extra capital allows them to invest in better inputs like seeds, fertilizers, even new technologies. So it's creating a positive feedback loop. Better access to finance leads to better yields, which further improves their financial stability. Yeah. 14:59 It's pretty powerful. Are there any other notable benefits maybe in terms of efficiency or strategic advantages? Definitely. The automation of the data analysis has made decision making much faster and more efficient. So they can respond more quickly to changing conditions, like weather fluctuations or pest outbreaks. 15:16 Yeah. It's all about staying ahead of the curve. And that efficiency also leads to cost savings because they're not wasting resources. Makes sense. And strategically, farmers using this platform have a clear advantage over those who aren't. 15:28 Right. They have access to these cutting edge insights that can help them optimize their operations and be more competitive. And the platform itself is built to be scalable so it could be expanded to different crops and regions. Exactly. It has the potential to transform smallholder farming on a global scale. 15:44 It's really impressive how they've brought together such a wide range of technologies and expertise to tackle this challenge. Yeah. It's a great example of how AI and data science can be used to make a real difference in the world. Absolutely. 
It's not just about fancy algorithms. 15:58 It's about using those algorithms to solve real world problems and improve people's lives. Couldn't have said it better myself. Well, that brings us to the end of our deep dive into digital onboarding and AI powered risk assessment for smallholder farmers. It's been fascinating to explore how these technologies are being applied to boost productivity and promote financial inclusion in the agricultural sector. It's truly inspiring to see the impact that data science and AI are having on such a vital industry. 16:27 Yeah. I agree. Thanks for listening in. Don't forget to subscribe and follow Colaberry on social media. You can find all the links in the description. 16:34 And be sure to check out our website, www.calabrio.ai, for more insights like this.