Domain Knowledge in Machine Learning

Let’s say the domain is a restaurant kitchen. You have a dataset with three variables: two predictors and one target. The predictors are flour in kilograms and water in liters. The target is the number of roti (bread). You expect the model to look something like this.

Number of roti (y) = b1 * flour in kg (x1) + b2 * water in liters (x2)

Both features are on a comparable scale in terms of the unit of measurement: they use the same system of units, the MKS (metric) system.

During training, your model learns b1 and b2 from training data.

Now, during prediction, if you enter 1 kg of flour and 10 liters of water, how many roti should the model predict? And if you enter 10 kg of flour and 0.1 liter of water, what is the expected output?
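To make the question concrete, here is a minimal sketch of the model above fitted with ordinary least squares in NumPy. The training numbers are made up for illustration (every batch uses roughly 0.6 liters of water per kg of flour), and the `predict` helper is a hypothetical name, not part of any library.

```python
import numpy as np

# Hypothetical training data, invented for illustration only.
# In every batch the cook uses roughly 0.6 liters of water per kg of flour.
flour_kg = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
water_l = np.array([0.6, 1.3, 1.8, 2.3, 3.0])
roti = np.array([25.0, 51.0, 75.0, 99.0, 125.0])

# Fit y = b1 * x1 + b2 * x2 (no intercept) by least squares.
X = np.column_stack([flour_kg, water_l])
coef, *_ = np.linalg.lstsq(X, roti, rcond=None)
b1, b2 = coef


def predict(flour, water):
    """Apply the learned linear formula, with no sanity check on the inputs."""
    return b1 * flour + b2 * water


# The two inputs from the text; both lie far outside the training data.
print("1 kg flour, 10 l water  ->", round(predict(1.0, 10.0)), "roti")
print("10 kg flour, 0.1 l water ->", round(predict(10.0, 0.1)), "roti")
```

The model returns a confident, positive roti count for both inputs, even though neither mix of flour and water resembles anything it saw during training.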

Those who have not worked in a kitchen may not understand what I am saying. That is why they say domain knowledge is essential. No problem: try some experiments in the kitchen when no one is looking at you. 😊

Now the questions are:
Is there any problem with this model-building approach?
Are we missing some steps, from data collection to data cleaning to model building, whose absence may lead to unexpected output?
If yes, then where is the problem?
What should be the expected output for the above inputs?
What needs to be done in this whole approach to build a good model?
Can we solve this problem without ML?
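As one possible way to think about the last question, here is a rule-based sketch that uses no ML at all. It assumes dough needs a fixed ratio of water to flour, so the scarcer ingredient limits the output. The constants `WATER_PER_KG_FLOUR` and `ROTI_PER_KG_FLOUR` and the function `estimate_roti` are hypothetical, illustrative values and names, not measured kitchen facts.

```python
# Assumed, illustrative constants; a real kitchen would measure these.
WATER_PER_KG_FLOUR = 0.6   # liters of water needed per kg of flour (assumption)
ROTI_PER_KG_FLOUR = 25     # roti obtained from 1 kg of flour (assumption)


def estimate_roti(flour_kg, water_l):
    """Rule-based estimate: the scarcer ingredient limits the usable dough."""
    if flour_kg < 0 or water_l < 0:
        raise ValueError("flour and water must be non-negative")
    # Flour that can actually be turned into dough with the water available.
    usable_flour = min(flour_kg, water_l / WATER_PER_KG_FLOUR)
    # The small epsilon guards against float rounding just below a whole number.
    return int(usable_flour * ROTI_PER_KG_FLOUR + 1e-9)


print(estimate_roti(1.0, 10.0))   # plenty of water, flour is the limit
print(estimate_roti(10.0, 0.1))   # plenty of flour, water is the limit
```

Under these assumptions the extra water or extra flour is simply left over, which is the kind of constraint a plain linear model cannot express on its own.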

Don’t blame the user by saying, “You are giving the wrong input, and that is why my model is failing.”

Any thoughts?

Dr. Hari Thapliyaal

Writes on data science & AI, project management, and Advaita Vedanta—and builds training and consulting work around those threads.

Education: Doctorate in AI/NLP (SSBM, Geneva); masters study across computer science, business, data science, and economics.
Career: 30+ years in management and technology leadership; 16+ years across the software product lifecycle; a decade in PM training, coaching, and consulting; hands-on Data Science/AI product solution delivery, course design, and mentoring in GenAI, ML, Deep Learning, NLP and Analytics.
Verticals: Solutions and delivery across logistics, BFSI, investment banking, NGOs, staffing, and industrial engineering.
Strengths: Clarifying messy stakeholder problems and turning them into practical outcomes.

Away from work: long meditation and quiet time in nature.
