AI-Powered Calorie Tracker: How to Use ChatGPT and Python to Analyze Your Meals
A few weeks ago, I decided to track my daily calories, protein, carbs, fat, and fiber to make sure I was eating right (getting enough protein and fiber without taking in more fat and calories than I needed). To do this, I created a Google Spreadsheet and painstakingly entered the calories and macronutrients after each meal and snack to keep a running total.
Things got complicated at lunch, when I was served a plate of food whose ingredients I obviously couldn’t weigh individually. Luckily, I had the idea to take a photo of my lunch and send it to ChatGPT, asking how many calories and nutrients each ingredient contained. To my surprise, ChatGPT 4o was quite accurate in describing what was on my plate, correctly identifying the ingredients and estimating the total calorie count. Yes, the accuracy was only about plus or minus 50 grams, but that is good enough to judge whether my diet is on track.
At that moment, it struck me that it is through small things like this that AI enters our lives faster than we expect.
Back at the office after lunch, I realized I could ask ChatGPT to return the result not as text but as structured JSON, which could then be loaded into a dataframe to calculate calories, proteins, carbohydrates, fats, and fiber for each product and for the meal as a whole. The result would be a simple application for personal use, and maybe even a small prototype for a potential startup. Let’s try to build it right now.
Table of Contents
- Plan for building the app
- Creating and testing the prompt
- Connecting to the OpenAI API and making a request
- Processing the received data
- Tests on Several Photos
- Building a Flask app
- Calculating the Cost of an OpenAI Request
- Ideas for Future Development
Plan for building the app
The idea is simple:
- Send a meal photo to ChatGPT, requesting a structured JSON response (a list of ingredients and their nutritional details).
- Process the JSON into a dataframe, displaying the results for each ingredient and the meal as a whole.
- Wrap this functionality in a simple web app that can be deployed on a server.
Creating and Testing the Prompt
Since large language models (LLMs) are known to be unreliable at precise arithmetic, I’ll ask only for each ingredient’s estimated weight in grams and its reference values per 100 grams (calories, proteins, carbs, fats, and fiber). I’ll then calculate the per-portion values myself in the script.
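The per-portion arithmetic described above can be sketched in a couple of lines. The function name `nutrient_for_portion` and the sample numbers are my own illustration, not part of the final script.

```python
def nutrient_for_portion(weight_g: float, value_per_100g: float) -> int:
    """Scale a per-100 g reference value to the actual portion weight."""
    return round(weight_g / 100 * value_per_100g)


# Hypothetical example: 150 g of a food with 20 g of protein per 100 g.
protein_g = nutrient_for_portion(150, 20)
print(protein_g)  # 30
```

The model only has to estimate two things per ingredient (the weight and the reference values), and the multiplication stays in Python where it is exact.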
Here’s the prompt I came up with:
You are a very useful assistant. Help me with determining the caloric content of my meal
The photo shows food products for a meal. Determine which products are shown in the photo and return them ONLY as a JSON list,
where each list element should contain:
* "title" - the name of the product,
* "weight" - weight in grams,
* "kilocalories_per100g" - how many calories are contained in this product in 100 grams,
* "proteins_per100g" - the amount of proteins of this product per 100 grams,
* "fats_per100g" - the amount of fat per 100 grams of this product,
* "carbohydrates_per100g" - the amount of carbohydrates per 100 grams of this product,
* "fiber_per100g" - the amount of fiber per 100 grams of this product
Let’s test this prompt with a photo of my delicious Portuguese lunch:
Let’s try it in the web version of ChatGPT…
Not bad! It looks like it’s working: ChatGPT returned JSON in exactly the format I requested. It did add some unnecessary text, but that’s not an issue.
Connecting to the OpenAI API and Making a Request
Now let’s do the same thing using the OpenAI API. To access the API, you need to register on the OpenAI Platform, add a small balance (I added $5), and generate an API key.
Now let’s write a Python script to send the request to ChatGPT via the API.
Installing the required libraries:
pip install openai Pillow pandas
We then import the necessary libraries and define the API key and the model we’ll use:
import os, base64, json
from openai import OpenAI
from PIL import Image, ImageOps
from io import BytesIO
import pandas as pd
# Define the OpenAI API key and model
OPENAI_API_KEY = "[INSERT YOUR OPEN AI API KEY HERE]"
MODEL="gpt-4o" #or "gpt-4o-mini"
I used gpt-4o, but I also tested gpt-4o-mini, and while it worked decently, its accuracy was slightly lower.
Next, we’ll declare two functions to process our lunch photo. The first corrects the image orientation if needed and resizes the image to reduce token usage (since ChatGPT doesn’t need a 12-megapixel photo). The second converts the image to a base64 string so that we can send it.
# Function to fix the orientation, resize the image, and return the resized PIL image
def resize_image(image_path, target_width):
    # Open image
    with Image.open(image_path) as img:
        # Correct the image orientation if there is EXIF data
        img = ImageOps.exif_transpose(img)
        original_width, original_height = img.size  # get original size
        target_height = int((target_width / original_width) * original_height)  # keep the aspect ratio
        resized_img = img.resize((target_width, target_height), Image.Resampling.LANCZOS)  # resize image
    # Return the image object
    return resized_img

# Function to convert a PIL image to a base64 string
def pil_image_to_base64(img):
    buffered = BytesIO()  # create buffer
    # JPEG has no alpha channel, so convert RGBA/palette images (e.g. PNG) to RGB before saving
    img.convert("RGB").save(buffered, format="JPEG")
    base64_str = base64.b64encode(buffered.getvalue()).decode('utf-8')  # convert buffer to base64 format
    return base64_str
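One thing to keep in mind: resize_image always scales to target_width, so a photo narrower than 1000 pixels would be enlarged, which adds tokens without adding detail. Here is a minimal sketch of the size calculation with a guard against upscaling. The helper name `target_size` is hypothetical and is not used elsewhere in this article.

```python
def target_size(original_size, target_width):
    """Return (width, height) that fits target_width, keeping the aspect ratio.

    Images already narrower than target_width keep their original size.
    """
    original_width, original_height = original_size
    if original_width <= target_width:
        return original_width, original_height
    scale = target_width / original_width
    return target_width, max(1, round(original_height * scale))


print(target_size((4000, 3000), 1000))  # (1000, 750)
print(target_size((800, 600), 1000))    # (800, 600)
```

The returned tuple could be passed straight to img.resize in place of (target_width, target_height).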
Next, we define the image path and process the image:
# Path to your image
image_path = "./data/meal1.jpg"
#Read and resize image, and convert to base64
resized_image = resize_image(image_path, 1000)
base64_image = pil_image_to_base64(resized_image)
We’ll set up the system and user prompts:
system_prompt = """
You are a very useful assistant. Help me with determining the caloric content of products
"""
user_prompt = """
The photo shows food products for a meal. Determine approximately which products are shown in the photo and return them ONLY as a json list,
where each list element should contain:
* "title" - the name of the product,
* "weight" - weight in grams,
* "kilocalories_per100g" - how many calories are contained in this product in 100 grams,
* "proteins_per100g" - the amount of proteins of this product per 100 grams,
* "fats_per100g" - the amount of fat per 100 grams of this product,
* "carbohydrates_per100g" - the amount of carbohydrates per 100 grams of this product,
* "fiber_per100g" - the amount of fiber per 100 grams of this product,
"""
Let’s initialize the OpenAI client and send the data:
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", OPENAI_API_KEY))
completion = client.chat.completions.create(
    model=MODEL,
    messages=[
        {
            "role": "system",
            "content": [
                {"type": "text", "text": system_prompt},
            ]
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": user_prompt},
                {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
            ]
        }
    ]
)
What did we do above? We created a chat completion using client.chat.completions.create, specified our model with the MODEL variable, defined the system prompt with role=system, and passed the user prompt as two separate parts: the text and the photo in base64 format.
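Since the same message structure is needed again later in the Flask app, it could be built by a small helper. This is only a sketch: `build_messages` is a name I made up, and the short strings in the example stand in for the real prompts and image.

```python
def build_messages(system_prompt: str, user_prompt: str, base64_image: str) -> list:
    """Build the messages list: a system text part, then user text plus image."""
    image_url = f"data:image/jpeg;base64,{base64_image}"
    return [
        {"role": "system", "content": [{"type": "text", "text": system_prompt}]},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": user_prompt},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        },
    ]


# Placeholder values, just to show the resulting structure
messages = build_messages("system text", "user text", "QUJD")
print(messages[1]["content"][1]["image_url"]["url"])  # data:image/jpeg;base64,QUJD
```

The result can be passed as the messages argument of client.chat.completions.create.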
Finally, we’ll get and print the response:
response_content = completion.choices[0].message.content
print("Response: " + response_content)
Result: We received both text and JSON. It works!
Response: Based on a description of typical meals, I can provide approximate nutritional information. Here's a possible breakdown for what you've described:
```json
[
{
"title": "Beef Stew",
"weight": 150,
"kilocalories_per100g": 150,
"proteins_per100g": 20,
"fats_per100g": 7,
"carbohydrates_per100g": 5,
"fiber_per100g": 0
},
{
"title": "Roasted Potatoes",
"weight": 100,
"kilocalories_per100g": 150,
"proteins_per100g": 2,
"fats_per100g": 6,
"carbohydrates_per100g": 28,
"fiber_per100g": 2
},
{
"title": "Green Peas",
"weight": 100,
"kilocalories_per100g": 81,
"proteins_per100g": 5,
"fats_per100g": 0,
"carbohydrates_per100g": 14,
"fiber_per100g": 5
}
]
```
Adjust the weights and content as needed based on actual portion sizes and ingredients.
Processing the Received Data
Now we can extract the JSON part of the response:
# Try to extract and clean the JSON part from the response
try:
    start_index = response_content.find('[')  # JSON starts with [
    end_index = response_content.rfind(']') + 1  # JSON ends with ]
    # Extract the JSON string
    json_str = response_content[start_index:end_index]
    # Parse the JSON string
    json_data = json.loads(json_str)
    # Test print of the result
    #print(json.dumps(json_data, indent=4, ensure_ascii=False))
except json.JSONDecodeError as e:
    print("Failed to parse JSON:", e)
except Exception as ex:
    print("An error occurred:", ex)
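The same bracket-based extraction can be wrapped in a function that fails loudly when the response contains no list at all, instead of leaving json_data undefined. A minimal sketch, where `extract_json_list` is a hypothetical name and the sample response is made up:

```python
import json


def extract_json_list(response_text: str) -> list:
    """Return the JSON list embedded in free-form model output.

    Raises ValueError if no bracketed list is present. json.JSONDecodeError
    is a subclass of ValueError, so callers can catch ValueError for both cases.
    """
    start = response_text.find("[")
    end = response_text.rfind("]")
    if start == -1 or end <= start:
        raise ValueError("No JSON list found in the response")
    return json.loads(response_text[start:end + 1])


sample = 'Sure! [{"title": "Green Peas", "weight": 100}] Adjust as needed.'
items = extract_json_list(sample)
print(items[0]["title"])  # Green Peas
```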
Then, we’ll iterate over each product in the JSON to calculate the exact value of calories, proteins, fats, carbohydrates, and fiber based on the weight, gather this into a new frame_data list, and then create a dataframe from it:
frame_data = []
for row in json_data:
    frame_data.append(dict(
        title = row['title'],
        weight = row['weight'],
        kilocalories = round(row['weight']/100 * row['kilocalories_per100g']),
        proteins = round(row['weight']/100 * row['proteins_per100g']),
        fat = round(row['weight']/100 * row['fats_per100g']),
        carbohydrates = round(row['weight']/100 * row['carbohydrates_per100g']),
        fiber = round(row['weight']/100 * row['fiber_per100g']),
    ))
frame = pd.DataFrame(frame_data)
print(frame)
The result:
              title  weight  kilocalories  proteins  fat  carbohydrates  fiber
0         Beef Stew     150           225        30   10              8      0
1  Roasted Potatoes     100           150         2    6             28      2
2        Green Peas     100            81         5    0             14      5
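The calculation loop assumes that every row has all seven keys with numeric values. The model usually complies, but nothing guarantees it, so a filter like the following could run before the loop. This is a sketch under my own assumptions: `REQUIRED_KEYS` and `valid_rows` are hypothetical names, and silently dropping bad rows is just one possible policy.

```python
REQUIRED_KEYS = (
    "title", "weight", "kilocalories_per100g", "proteins_per100g",
    "fats_per100g", "carbohydrates_per100g", "fiber_per100g",
)


def valid_rows(rows):
    """Keep only rows that have every expected key and non-negative numbers."""
    cleaned = []
    for row in rows:
        if not isinstance(row, dict) or any(key not in row for key in REQUIRED_KEYS):
            continue
        numbers = [row[key] for key in REQUIRED_KEYS if key != "title"]
        if all(isinstance(n, (int, float)) and not isinstance(n, bool) and n >= 0
               for n in numbers):
            cleaned.append(row)
    return cleaned


good = {"title": "Green Peas", "weight": 100, "kilocalories_per100g": 81,
        "proteins_per100g": 5, "fats_per100g": 0,
        "carbohydrates_per100g": 14, "fiber_per100g": 5}
missing_keys = {"title": "Mystery", "weight": 50}   # made-up bad row
text_weight = dict(good, weight="about 200")        # made-up bad row
print(len(valid_rows([good, missing_keys, text_weight])))  # 1
```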
Now let’s calculate the total values for the meal:
# Calculate overall
kilocalories = frame['kilocalories'].sum()
proteins = frame['proteins'].sum()
fat = frame['fat'].sum()
carbohydrates = frame['carbohydrates'].sum()
fiber = frame['fiber'].sum()
print()
print('== OVERALL ==')
print(f'Kilocalories: {kilocalories} kcal')
print(f'Proteins: {proteins} g')
print(f'Fat: {fat} g')
print(f'Carbohydrates: {carbohydrates} g')
print(f'Fiber: {fiber} g')
The result
== OVERALL ==
Kilocalories: 456 kcal
Proteins: 37 g
Fat: 16 g
Carbohydrates: 50 g
Fiber: 7 g
I think it looks pretty good.
Tests on Several Photos
Let’s send some more photos of my meals to ensure the script works properly.
Meal #2 — Salmon + vegetables
Result #2
             title  weight  kilocalories  proteins  fat  carbohydrates  fiber
0           Salmon     200           412        44   26              0      0
1  Boiled Potatoes     150           130         3    0             30      3
2          Carrots     100            41         1    0             10      3
3      Green Beans     100            31         2    0              7      3
== OVERALL ==
Kilocalories: 614 kcal
Proteins: 50 g
Fat: 26 g
Carbohydrates: 47 g
Fiber: 9 g
Meal #3 — Lasagna
Result #3
It did, however, count my son’s pizza next to it, but it is in the picture, right? Ideally, you could ask in the prompt to recognize the main dish in the center or just take a more careful photo.
         title  weight  kilocalories  proteins  fat  carbohydrates  fiber
0      Lasagna     300           450        24   30             36      6
1  Pizza slice     150           375        18   15             50      4
== OVERALL ==
Kilocalories: 825 kcal
Proteins: 42 g
Fat: 45 g
Carbohydrates: 86 g
Fiber: 10 g
Meal #4 — Sushi
Result #4
You could nitpick that it didn’t recognize all the nigiri, but it did a fairly decent job counting the rolls.
             title  weight  kilocalories  proteins  fat  carbohydrates  fiber
0    Salmon Nigiri      40           100         5    3             14      0
1      Tuna Nigiri      40            52         8    2              4      0
2  California Roll     150           360         9   12             52      3
3     Avocado Roll     150           210         4   10             27      4
== OVERALL ==
Kilocalories: 722 kcal
Proteins: 26 g
Fat: 27 g
Carbohydrates: 97 g
Fiber: 7 g
Meal #5 — Complicated Breakfast
Result #5
Here, I purposely created a complicated breakfast where I piled a lot of different things onto the plate, and to ChatGPT’s credit, it recognized almost everything.
               title  weight  kilocalories  proteins  fat  carbohydrates  fiber
0          fried egg      50            75         6    6              0      0
1    cherry tomatoes     100            18         1    0              4      1
2             salmon      50           103        11    6              0      0
3        feta cheese      30            79         4    6              1      0
4        bread slice      40           100         4    2             19      1
5  pickled mushrooms      50            11         1    0              2      1
== OVERALL ==
Kilocalories: 386 kcal
Proteins: 27 g
Fat: 20 g
Carbohydrates: 26 g
Fiber: 3 g
Wow! It even recognized the mushrooms. Excellent work! Yes, the script occasionally makes mistakes, but the result is much better than I expected, especially for a prototype put together on the fly.
Building a Flask app
Now we can build a separate Flask app that can be deployed on a web server or serve as the backend for an Android/iPhone app.
First, install Flask:
pip install flask
Let’s create the controller:
import os, base64, json
from flask import Flask, request, render_template
from PIL import Image, ImageOps, UnidentifiedImageError
import pandas as pd
from io import BytesIO
from openai import OpenAI

app = Flask(__name__)

# Define the OpenAI API key and model
OPENAI_API_KEY = "[INSERT YOUR OPEN AI API KEY HERE]"
MODEL="gpt-4o"
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", OPENAI_API_KEY))

system_prompt = "You are a very useful assistant. Help me with determining the caloric content of my meal"
user_prompt = """
The photo shows food products for a meal. Determine which products are shown in the photo and return them ONLY as a JSON list,
where each list element should contain:
* "title" - the name of the product,
* "weight" - weight in grams,
* "kilocalories_per100g" - how many calories are contained in this product in 100 grams,
* "proteins_per100g" - the amount of proteins of this product per 100 grams,
* "fats_per100g" - the amount of fat per 100 grams of this product,
* "carbohydrates_per100g" - the amount of carbohydrates per 100 grams of this product,
* "fiber_per100g" - the amount of fiber per 100 grams of this product
"""

# Function to fix the orientation, resize the image, and return the resized PIL image
def resize_image(img, target_width):
    img = ImageOps.exif_transpose(img)
    original_width, original_height = img.size  # get original size
    target_height = int((target_width / original_width) * original_height)  # keep the aspect ratio
    resized_img = img.resize((target_width, target_height), Image.Resampling.LANCZOS)  # resize image
    return resized_img

# Function to convert a PIL image to a base64 string
def pil_image_to_base64(img):
    buffered = BytesIO()  # create buffer
    # JPEG has no alpha channel, so convert RGBA/palette uploads (e.g. PNG) to RGB before saving
    img.convert("RGB").save(buffered, format="JPEG")
    base64_str = base64.b64encode(buffered.getvalue()).decode('utf-8')  # convert buffer to base64 format
    return base64_str

@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "POST":
        # Get uploaded file
        file = request.files["file"]
        if file:
            try:
                image = Image.open(file)
            except UnidentifiedImageError:
                return render_template("index.html", error="Incorrect Image!")
            resized_image = resize_image(image, 1000)
            base64_image = pil_image_to_base64(resized_image)
            completion = client.chat.completions.create(
                model=MODEL,
                messages=[
                    {
                        "role": "system",
                        "content": [
                            {"type": "text", "text": system_prompt},
                        ]
                    },
                    {
                        "role": "user",
                        "content": [
                            {"type": "text", "text": user_prompt},
                            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
                        ]
                    }
                ]
            )
            # Get and parse result
            response_content = completion.choices[0].message.content
            start_index = response_content.find('[')
            end_index = response_content.rfind(']') + 1
            json_str = response_content[start_index:end_index]
            try:
                data = json.loads(json_str)
                frame_data = []
                for row in data:
                    frame_data.append(dict(
                        title = row['title'],
                        weight = row['weight'],
                        kilocalories = round(row['weight'] / 100 * row['kilocalories_per100g']),
                        proteins = round(row['weight'] / 100 * row['proteins_per100g']),
                        fat = round(row['weight'] / 100 * row['fats_per100g']),
                        carbohydrates = round(row['weight'] / 100 * row['carbohydrates_per100g']),
                        fiber = round(row['weight'] / 100 * row['fiber_per100g']),
                    ))
                frame = pd.DataFrame(frame_data)
                totals = {
                    'kilocalories': frame['kilocalories'].sum(),
                    'proteins': frame['proteins'].sum(),
                    'fat': frame['fat'].sum(),
                    'carbohydrates': frame['carbohydrates'].sum(),
                    'fiber': frame['fiber'].sum()
                }
                return render_template("index.html",
                                       frame=frame.to_dict(orient="records"),
                                       totals=totals,
                                       base64_image=f"data:image/jpeg;base64,{base64_image}")
            except (json.JSONDecodeError, KeyError, TypeError):
                # Unparseable JSON, or rows with missing or non-numeric fields
                return render_template("index.html", error="No Meal found!")
    return render_template("index.html")

if __name__ == "__main__":
    app.run(debug=True)
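Let’s create a template for displaying the form and result data. Save it as templates/index.html, since Flask looks for templates in the templates folder by default.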
<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
<title>Calorie Detector</title>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-QWTKZyjpPEjISv5WaRU9OFeRpok6YctnYmDr5pNlyT2bRjXh0JMhjY6hW+ALEwIH" crossorigin="anonymous">
</head>
<body>
<div class="container mt-3">
<div class="row">
<div class="col">
<h2>Upload photo of your Meal!</h2>
<form method="POST" enctype="multipart/form-data">
<div class="row">
<div class="col">
<input type="file" name="file" class="form-control" accept="image/*" required>
</div>
<div class="col">
<button type="submit" class="btn btn-primary">Upload a Photo!</button>
</div>
</div>
</form>
<hr>
{% if error %}
<div class="alert alert-danger" role="alert">
{{ error }}
</div>
{% endif %}
{% if frame %}
<h2>Meal Ingredients</h2>
<div class="row">
<div class="col-md-4">
<img src="{{ base64_image }}" class="img-thumbnail mb-3" alt="Uploaded Meal" />
</div>
<div class="col-md-8">
<h2>Meal Overall Result:</h2>
<h5>Kilocalories: {{ totals.kilocalories }} kcal</h5>
<h5>Proteins: {{ totals.proteins }} g</h5>
<h5>Fat: {{ totals.fat }} g</h5>
<h5>Carbohydrates: {{ totals.carbohydrates }} g</h5>
<h5>Fiber: {{ totals.fiber }} g</h5>
</div>
</div>
<table class="table table-striped">
<thead>
<tr>
<th>Product</th>
<th>Weight (g)</th>
<th>Kilocalories (kcal)</th>
<th>Proteins (g)</th>
<th>Fat (g)</th>
<th>Carbohydrates (g)</th>
<th>Fiber (g)</th>
</tr>
</thead>
<tbody>
{% for item in frame %}
<tr>
<td>{{ item.title }}</td>
<td>{{ item.weight }}</td>
<td>{{ item.kilocalories }}</td>
<td>{{ item.proteins }}</td>
<td>{{ item.fat }}</td>
<td>{{ item.carbohydrates }}</td>
<td>{{ item.fiber }}</td>
</tr>
{% endfor %}
</tbody>
</table>
{% endif %}
</div>
</div>
</div>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/js/bootstrap.bundle.min.js" integrity="sha384-YvpcrYf0tY3lHB60NNkmXc5s9fDVZLESaAA55NDzOxhy9GkcIdslK1eN7N6jIeHz" crossorigin="anonymous"></script>
</body>
</html>
Result of the Flask Application with Data Output
Calculating the Cost of an OpenAI Request
It’s interesting to calculate the cost of an OpenAI request. Pricing information is available here: OpenAI Pricing. The cost of Vision will depend on the size of the images. For example, for an image of 1000x1000 pixels, it will be approximately $0.002 just for Vision. Therefore, for production, it’s essential to choose an image size that’s not too large but still has enough detail for recognition.
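To get a feel for how image size drives the Vision cost, here is a rough estimator. It assumes the tile-based rule that OpenAI documented for GPT-4o in high-detail mode when I wrote this: the image is shrunk to fit 2048x2048, then shrunk so the shortest side is at most 768 pixels, and each 512-pixel tile costs 170 tokens on top of a base of 85. The rule differs between models and may change, so treat the function (a name of my own, `estimate_image_tokens`) as an approximation and check the current docs.

```python
import math


def estimate_image_tokens(width: int, height: int) -> int:
    """Approximate input tokens for one high-detail image (assumed GPT-4o rule)."""
    # Step 1: shrink (never enlarge) to fit inside 2048 x 2048
    scale = min(1.0, 2048 / max(width, height))
    width, height = round(width * scale), round(height * scale)
    # Step 2: shrink (never enlarge) so the shortest side is at most 768 px
    scale = min(1.0, 768 / min(width, height))
    width, height = round(width * scale), round(height * scale)
    # Step 3: count 512 px tiles; 170 tokens per tile plus a base of 85
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles


tokens = estimate_image_tokens(1000, 1000)
print(tokens)                                # 765
print(round(tokens * 2.50 / 1_000_000, 4))   # 0.0019 (USD at $2.50 per 1M input tokens)
```

For a 1000x1000 image this gives about $0.002, which agrees with the figure above.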
Here’s how to get the input/output tokens from the completion structure and calculate the cost:
#Get tokens from usage
prompt_tokens = completion.usage.prompt_tokens
completion_tokens = completion.usage.completion_tokens
total_tokens = completion.usage.total_tokens
#Show tokens
print(f"Prompt Tokens: {prompt_tokens}")
print(f"Completion Tokens: {completion_tokens}")
print(f"Total Tokens: {total_tokens}")
# https://openai.com/api/pricing/
cost_per_1m_input_tokens = 2.50 # price for input tokens
cost_per_1m_output_tokens = 10.00 # price for output tokens
#Calc cost
cost_input = (prompt_tokens / 1_000_000) * cost_per_1m_input_tokens
cost_output = (completion_tokens / 1_000_000) * cost_per_1m_output_tokens
total_cost = round((cost_input + cost_output),6)
print(f"Total cost of the request: ${total_cost}")
The result: the overall cost is approximately $0.004 per query.
Prompt Tokens: 974
Completion Tokens: 128
Total Tokens: 1102
Total cost of the request: $0.003715
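The same calculation can be wrapped in a function to project costs for regular use. A small sketch: the function name `request_cost` is mine, the default prices are the ones used above, and the usage pattern of four photos a day is purely hypothetical.

```python
def request_cost(prompt_tokens, completion_tokens,
                 input_price_per_1m=2.50, output_price_per_1m=10.00):
    """Return the cost of one request in USD, rounded to 6 decimal places."""
    cost_input = prompt_tokens / 1_000_000 * input_price_per_1m
    cost_output = completion_tokens / 1_000_000 * output_price_per_1m
    return round(cost_input + cost_output, 6)


per_request = request_cost(974, 128)
print(per_request)                      # 0.003715
# Hypothetical usage: 4 meal photos a day for 30 days
print(round(per_request * 4 * 30, 2))   # 0.45
```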
Ideas for Future Development
Ideas for Technical Development
I naively ask the model for JSON in plain text. This works because GPT-4o is smart enough, but it would be better to define the return format explicitly, for example with the Structured Outputs feature of the OpenAI API (see Links).
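As a sketch of what a more explicit format could look like, here is a JSON Schema for the same fields, wrapped in the response_format structure that the Structured Outputs guide describes. I have not run this against the API in this article, so treat the schema name "meal_items", the "items" wrapper key, and the commented usage as assumptions. In strict mode the top level has to be an object, which is why the list sits under "items".

```python
NUMERIC_FIELDS = [
    "weight", "kilocalories_per100g", "proteins_per100g",
    "fats_per100g", "carbohydrates_per100g", "fiber_per100g",
]

# Schema for one product: a title plus the numeric fields from the prompt
item_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        **{name: {"type": "number"} for name in NUMERIC_FIELDS},
    },
    "required": ["title", *NUMERIC_FIELDS],
    "additionalProperties": False,
}

RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {
        "name": "meal_items",  # hypothetical schema name
        "strict": True,
        "schema": {
            "type": "object",
            "properties": {"items": {"type": "array", "items": item_schema}},
            "required": ["items"],
            "additionalProperties": False,
        },
    },
}

# Assumed usage (not executed here: it needs an API key and network access):
# completion = client.chat.completions.create(
#     model=MODEL, messages=messages, response_format=RESPONSE_FORMAT)
# rows = json.loads(completion.choices[0].message.content)["items"]
```

With a schema like this, the bracket search used earlier to pull the JSON out of the text should no longer be needed.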
It’s also worth experimenting with the prompt, image size, and parameters to optimize the detection of meal components and reduce token consumption. If I were building this app more “seriously,” I’d try fine-tuning the model with data from global cuisines so that the model could identify portion sizes and ingredients more accurately.
Ideas for Developing the App into a Product
First, I’d add the ability to detect sugar and saturated fats. These parameters are also important for healthy eating. It’s not difficult to add these parameters to the prompt and then save them into the dataframe.
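One way to make such additions cheap is to drive the calculation from a single mapping, so a new nutrient means one new line plus one new bullet in the prompt. A minimal sketch: the field names "sugar_per100g" and "saturated_fat_per100g" are hypothetical, and the sample row contains made-up values.

```python
# Output column -> key expected in the model's JSON
NUTRIENT_KEYS = {
    "kilocalories": "kilocalories_per100g",
    "proteins": "proteins_per100g",
    "fat": "fats_per100g",
    "carbohydrates": "carbohydrates_per100g",
    "fiber": "fiber_per100g",
    "sugar": "sugar_per100g",                  # hypothetical new field
    "saturated_fat": "saturated_fat_per100g",  # hypothetical new field
}


def portion_values(row):
    """Scale every per-100 g value in a row to the portion weight."""
    result = {"title": row["title"], "weight": row["weight"]}
    for name, key in NUTRIENT_KEYS.items():
        # A missing key is treated as 0 here; that is a simplification
        result[name] = round(row["weight"] / 100 * row.get(key, 0))
    return result


# Made-up sample row, for illustration only
row = {"title": "Apple", "weight": 200, "kilocalories_per100g": 50,
       "carbohydrates_per100g": 14, "sugar_per100g": 10}
print(portion_values(row)["sugar"])  # 20
```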
Next, I’d implement meal tracking to calculate the total intake for a day. For example, after photographing your breakfast, lunch, and dinner, you could see how much you’ve eaten in total for the day. Data could be displayed in a calendar format.
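The daily total itself is a simple aggregation over the per-meal totals. Here is a sketch that assumes each meal is stored as a (date, totals) pair. The storage format, the function name `daily_totals`, and the date are my assumptions, while the numbers are the totals of Meal #1 and Meal #2 above.

```python
from collections import defaultdict
from datetime import date


def daily_totals(meals):
    """Sum per-meal totals by day. meals: iterable of (date, totals dict)."""
    totals = defaultdict(lambda: defaultdict(int))
    for day, meal in meals:
        for nutrient, value in meal.items():
            totals[day][nutrient] += value
    return {day: dict(values) for day, values in totals.items()}


day = date(2024, 1, 1)  # hypothetical date
meals = [
    (day, {"kilocalories": 456, "proteins": 37}),  # Meal #1 totals
    (day, {"kilocalories": 614, "proteins": 50}),  # Meal #2 totals
]
print(daily_totals(meals)[day]["kilocalories"])  # 1070
```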
Additionally, recommendations and goals could be added! If you’ve eaten too much fat, for example, the app could suggest cutting down on fats. Or, if you’re into sports, it could recommend increasing protein intake.
If we add user parameters (weight, height, basic physical activity), we could calculate recommended calorie intake and provide personalized suggestions based on that!
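As an illustration of what such a calculation could look like, here is a sketch based on the Mifflin-St Jeor equation, a widely used estimate of resting energy needs, multiplied by a commonly quoted activity factor. Note that it needs age and sex in addition to the parameters listed above. The function name, the activity labels, and the sample person are all my own assumptions.

```python
# Commonly quoted activity multipliers (assumed labels)
ACTIVITY_FACTORS = {"sedentary": 1.2, "light": 1.375, "moderate": 1.55, "active": 1.725}


def recommended_kcal(weight_kg, height_cm, age_years, sex, activity="sedentary"):
    """Estimate daily calorie needs: Mifflin-St Jeor BMR times an activity factor."""
    bmr = 10 * weight_kg + 6.25 * height_cm - 5 * age_years
    bmr += 5 if sex == "male" else -161
    return round(bmr * ACTIVITY_FACTORS[activity])


# Made-up example person: 80 kg, 180 cm, 40 years old, male, sedentary
print(recommended_kcal(80, 180, 40, "male"))  # 2076
```

The app could then compare this number with the daily total of the tracked meals.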
Maybe one of my readers will use this idea and implementation to launch a startup? Don’t forget to give me a share! :)
About the Author
The author of this article is an engineer with 20 years of experience as a developer and data engineer, with expertise in e-commerce, machine learning (including AI), and blockchain. Always eager to experiment and test new things, and open to project-based collaborations.
Links
- Quickstart with OpenAI: https://platform.openai.com/docs/quickstart
- OpenAI Vision (working with images): https://platform.openai.com/docs/guides/vision
- OpenAI Structured JSON Output: https://platform.openai.com/docs/guides/structured-outputs/json-mode#expander-0