AI-powered Calorie Tracker. How to use ChatGPT and Python to Analyze Your Meals

14 min read · Oct 15, 2024

A few weeks ago, I decided to track my daily calories, protein, carbs, fat, and fiber to make sure I was eating right: getting enough protein and fiber without taking in more fat and calories than I needed. To do this, I created a Google Spreadsheet and meticulously entered the calories and macronutrients after each meal and snack, keeping a running total.

AI-powered Calorie Tracker. How to use ChatGPT and Python to Analyze Your Meals. Image by the Author.

The situation got complicated during lunch, when I was served a plate of food whose ingredients I couldn’t weigh individually. Luckily, I had the idea to take a photo of my lunch and send it to ChatGPT, asking how many calories and nutrients each ingredient contained. To my surprise, ChatGPT 4o was quite accurate in describing what was on my plate, correctly identifying the ingredients and estimating the total calorie count. Yes, the weight estimates were only accurate to within about 50 grams, but that is good enough to judge whether my diet is on track.

At that moment, I realized that it is through small things like this that AI enters our lives faster than we expect.

When I returned to the office after lunch, it occurred to me that I could ask ChatGPT to return the result not as text but as structured JSON, which could then be loaded into a dataframe to calculate calories, proteins, carbohydrates, fats, and fiber for each product and for the meal as a whole. The result would be a simple application for personal use, and maybe even a small prototype for a potential startup. Let’s try building it right now.

Plan for building the app
Creating and testing the prompt
Connecting to the OpenAI API and making a request
Processing the received data
Testing with multiple photos
Creating a Flask app
Cost of an OpenAI Request
Ideas for Future Development

Plan for building the app

The idea is simple:

Send a meal photo to ChatGPT, requesting a structured JSON response (a list of ingredients and their nutritional details).
Process the JSON into a dataframe, displaying the results for each ingredient and the meal as a whole.
Wrap this functionality in a simple web app that can be deployed on a server.

Creating and Testing the Prompt

Since it’s known that large language models (LLMs) aren’t great with precise math, I’ll separately ask for the ingredient weight in grams and use known data — calories, proteins, carbs, fats, and fiber per 100 grams. I’ll then calculate the exact values myself in the script.

Here’s the prompt I came up with:

You are a very useful assistant. Help me with determining the caloric content of my meal

The photo shows food products for a meal. Determine which products are shown in the photo and return them ONLY as a JSON list, 
where each list element should contain:
  * "title" - the name of the product, 
  * "weight" - weight in grams, 
  * "kilocalories_per100g" - how many calories are contained in this product in 100 grams, 
  * "proteins_per100g" - the amount of proteins of this product per 100 grams, 
  * "fats_per100g" - the amount of fat per 100 grams of this product, 
  * "carbohydrates_per100g" - the amount of carbohydrates per 100 grams of this product, 
  * "fiber_per100g" - the amount of fiber per 100 grams of this product

Let’s test this prompt with a photo of my delicious Portuguese lunch:

*Photo of the Meal with Beef Stew, Roasted Potatoes, and Green Peas by the Author of the Article.*

Let’s try it in the web version of ChatGPT first…

Not bad! It looks like it’s working — ChatGPT returned a JSON in the exact format I requested. It did add some unnecessary text, but that’s not an issue.

Connecting to the OpenAI API and Making a Request

Now let’s do the same thing using the OpenAI API. To access the API, you need to register at OpenAI Platform, add a small balance (I added $5), and generate an API key.

Now let’s write a Python script to send the request to ChatGPT via the API.

Installing the required libraries:

pip install openai Pillow pandas

We then import the necessary libraries, API key, and the model we’ll use:

import os, base64, json
from openai import OpenAI
from PIL import Image, ImageOps
from io import BytesIO
import pandas as pd

# Define the OpenAI API key and model
OPENAI_API_KEY = "[INSERT YOUR OPEN AI API KEY HERE]"
MODEL="gpt-4o" #or "gpt-4o-mini"

I used gpt-4o, but I also tested gpt-4o-mini; it worked decently, although its accuracy was slightly lower.

Next, we’ll declare two functions to process our lunch photo. The first corrects the image orientation if needed and resizes the image to reduce token usage (ChatGPT doesn’t need a 12-megapixel photo). The second converts the image to a base64 string so that we can send it.


# Function to resize the image and return the resized PIL image
def resize_image(image_path, target_width):

    #Open image
    with Image.open(image_path) as img:

        #Correcting image orientation if there is EXIF data
        img = ImageOps.exif_transpose(img)
        original_width, original_height = img.size #get original size

        target_height = int((target_width / original_width) * original_height) #calc proportion
        resized_img = img.resize((target_width, target_height), Image.Resampling.LANCZOS) #resize image

        # return image object
        return resized_img

# Function to convert a PIL image to a base64 string
def pil_image_to_base64(img):

    buffered = BytesIO() #create buffer
    #JPEG has no alpha channel, so convert to RGB first (otherwise e.g. RGBA PNG files fail to save)
    img.convert("RGB").save(buffered, format="JPEG") #Save the image to the buffer in JPEG format
    base64_str = base64.b64encode(buffered.getvalue()).decode('utf-8') #convert buffer to base64 format

    return base64_str
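
One detail the resize function above does not handle: a photo narrower than the target width gets enlarged, which adds tokens without adding detail. Here is a minimal sketch of a size calculation that only shrinks and never enlarges. `fit_width` is a hypothetical helper name, not part of Pillow.

```python
def fit_width(original_width, original_height, target_width):
    """Return (width, height) scaled down to target_width, keeping proportions.

    Hypothetical helper: images already narrower than target_width keep
    their original size instead of being upscaled.
    """
    if original_width <= target_width:
        return original_width, original_height

    scale = target_width / original_width
    # max(1, ...) guards against a zero height for extremely wide images
    return target_width, max(1, round(original_height * scale))


print(fit_width(4000, 3000, 1000))  # a 12-megapixel photo is shrunk
print(fit_width(800, 600, 1000))    # a small photo is left untouched
```

The returned tuple could be passed straight to `img.resize(...)` inside `resize_image`.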

Next, we define the image path and process the image:

# Path to your image
image_path = "./data/meal1.jpg"

#Read and resize image, and convert to base64
resized_image = resize_image(image_path, 1000)
base64_image = pil_image_to_base64(resized_image)

We’ll set up the system and user prompts:

system_prompt = """
  You are a very useful assistant. Help me with determining the caloric content of products
"""

user_prompt = """
  The photo shows food products for a meal. Determine approximately which products are shown in the photo and return them ONLY as a json list, 
  where each list element should contain:
    * "title" - the name of the product, 
    * "weight" - weight in grams, 
    * "kilocalories_per100g" - how many calories are contained in this product in 100 grams, 
    * "proteins_per100g" - the amount of proteins of this product per 100 grams, 
    * "fats_per100g" - the amount of fat per 100 grams of this product, 
    * "carbohydrates_per100g" - the amount of carbohydrates per 100 grams of this product, 
    * "fiber_per100g" - the amount of fiber per 100 grams of this product, 
"""

Let’s initialize the OpenAI client and send the data:

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", OPENAI_API_KEY))

completion = client.chat.completions.create(
  model=MODEL,
  messages=[
    {
      "role": "system",
      "content": [
        {"type": "text", "text": system_prompt},
      ]
    },
    {
      "role": "user",
      "content": [
        {"type": "text", "text": user_prompt},
        {"type": "image_url","image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
      ]
    }
  ]
)

What did we do above? We “created” a chat using client.chat.completions, specified our model with the MODEL variable, defined the system prompt with role=system, and the user prompt consisting of text and the photo in base64 format separately.
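
Since the same message structure is needed again in the Flask app, it can be wrapped in a small helper. The sketch below is one possible way to do it. `build_messages` is a hypothetical name, and the optional `detail` field (`"low"`, `"high"`, or `"auto"`) is the image-detail setting from the OpenAI vision documentation, which the script above leaves at its default.

```python
def build_messages(system_prompt, user_prompt, base64_image, detail="auto"):
    """Build the messages list for a chat completion with one JPEG image."""
    return [
        {
            "role": "system",
            "content": [{"type": "text", "text": system_prompt}],
        },
        {
            "role": "user",
            "content": [
                {"type": "text", "text": user_prompt},
                {
                    "type": "image_url",
                    "image_url": {
                        # The image travels inline as a base64 data URL
                        "url": f"data:image/jpeg;base64,{base64_image}",
                        "detail": detail,
                    },
                },
            ],
        },
    ]


messages = build_messages("system text", "user text", "QUJD")
print(messages[1]["content"][1]["image_url"]["url"])
```

It would then be used as `messages=build_messages(system_prompt, user_prompt, base64_image)` in the `client.chat.completions.create` call.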

Finally, we’ll get and print the response:

response_content = completion.choices[0].message.content
print("Response: " + response_content)

Result: We received both text and JSON. It works!

Response: Based on a description of typical meals, I can provide approximate nutritional information. Here's a possible breakdown for what you've described:

```json
[
    {
        "title": "Beef Stew",
        "weight": 150,
        "kilocalories_per100g": 150,
        "proteins_per100g": 20,
        "fats_per100g": 7,
        "carbohydrates_per100g": 5,
        "fiber_per100g": 0
    },
    {
        "title": "Roasted Potatoes",
        "weight": 100,
        "kilocalories_per100g": 150,
        "proteins_per100g": 2,
        "fats_per100g": 6,
        "carbohydrates_per100g": 28,
        "fiber_per100g": 2
    },
    {
        "title": "Green Peas",
        "weight": 100,
        "kilocalories_per100g": 81,
        "proteins_per100g": 5,
        "fats_per100g": 0,
        "carbohydrates_per100g": 14,
        "fiber_per100g": 5
    }
]
```

Adjust the weights and content as needed based on actual portion sizes and ingredients.

Processing the Received Data

Now we can extract the JSON part of the response:

# Try to extract and clean the JSON part from the response
try:
    start_index = response_content.find('[')  # JSON starts with [
    end_index = response_content.rfind(']') + 1  # JSON ends with ]

    # Extract the JSON string
    json_str = response_content[start_index:end_index]

    # Parse the JSON string
    json_data = json.loads(json_str)
    
    # Test print of the result
    #print(json.dumps(json_data, indent=4, ensure_ascii=False)) 

except json.JSONDecodeError as e:
    print("Failed to parse JSON:", e)
except Exception as ex:
    print("An error occurred:", ex)
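
The bracket search above works for the response shown, but it can pick up the wrong span if the surrounding text happens to contain square brackets. A slightly more defensive sketch is shown below, under the assumption that the model usually wraps its JSON in a fenced block. `extract_json_list` is a hypothetical helper name.

```python
import json
import re

FENCE = "`" * 3  # the three-backtick marker that usually wraps the JSON


def extract_json_list(text):
    """Return the first JSON list found in a model response, or None."""
    candidates = []

    # Prefer the contents of a fenced block (optionally labelled "json")
    pattern = re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE)
    fenced = re.search(pattern, text, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1))

    # Fall back to everything between the first "[" and the last "]"
    start, end = text.find("["), text.rfind("]")
    if start != -1 and end > start:
        candidates.append(text[start:end + 1])

    for candidate in candidates:
        try:
            data = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(data, list):
            return data
    return None


sample = (
    "Here is a possible breakdown:\n"
    + FENCE + "json\n"
    + '[{"title": "Green Peas", "weight": 100}]\n'
    + FENCE + "\nAdjust the weights as needed."
)
print(extract_json_list(sample))
print(extract_json_list("Sorry, I cannot see any food in this photo."))
```

Returning `None` instead of raising lets the calling code decide how to report a photo with no recognizable meal.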

Then, we’ll iterate over each product in the JSON to calculate the exact value of calories, proteins, fats, carbohydrates, and fiber based on the weight, gather this into a new frame_data list, and then create a dataframe from it:

frame_data = []
for row in json_data:

    frame_data.append(dict(
        title =             row['title'],
        weight =            row['weight'],
        kilocalories =      round(row['weight']/100 * row['kilocalories_per100g']),
        proteins =          round(row['weight']/100 * row['proteins_per100g']),
        fat =               round(row['weight']/100 * row['fats_per100g']),
        carbohydrates =     round(row['weight']/100 * row['carbohydrates_per100g']),
        fiber =             round(row['weight']/100 * row['fiber_per100g']),
    ))


frame = pd.DataFrame(frame_data)
print(frame)

The result:

              title  weight  kilocalories  proteins  fat  carbohydrates  fiber
0         Beef Stew     150           225        30   10              8      0
1  Roasted Potatoes     100           150         2    6             28      2
2        Green Peas     100            81         5    0             14      5

Now let’s calculate the total values for the meal:

# Calculate overall totals
kilocalories = frame['kilocalories'].sum()
proteins = frame['proteins'].sum()
fat = frame['fat'].sum()
carbohydrates = frame['carbohydrates'].sum()
fiber = frame['fiber'].sum()

print()
print('== OVERALL ==')
print(f'Kilocalories: {kilocalories} kcal')
print(f'Proteins: {proteins} g')
print(f'Fat: {fat} g')
print(f'Carbohydrates: {carbohydrates} g')
print(f'Fiber: {fiber} g')

The result:

== OVERALL ==
Kilocalories: 456 kcal
Proteins: 37 g
Fat: 16 g
Carbohydrates: 50 g
Fiber: 7 g
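
One caveat with the calculation above: each ingredient is rounded before the totals are summed, so small rounding errors can add up. Python's built-in `round` also rounds exact halves to the nearest even number, which is why 10.5 g of fat appears as 10 in the table. A minimal alternative sketch, using the values from the JSON response above, sums the unrounded amounts and rounds only once at the end. `meal_totals` is a hypothetical helper name.

```python
NUTRIENTS = ["kilocalories", "proteins", "fats", "carbohydrates", "fiber"]


def meal_totals(rows, digits=1):
    """Sum unrounded per-ingredient amounts, then round each total once."""
    totals = {}
    for nutrient in NUTRIENTS:
        amount = sum(row["weight"] / 100 * row[f"{nutrient}_per100g"] for row in rows)
        totals[nutrient] = round(amount, digits)
    return totals


# The three products from the JSON response shown earlier
rows = [
    {"title": "Beef Stew", "weight": 150, "kilocalories_per100g": 150,
     "proteins_per100g": 20, "fats_per100g": 7,
     "carbohydrates_per100g": 5, "fiber_per100g": 0},
    {"title": "Roasted Potatoes", "weight": 100, "kilocalories_per100g": 150,
     "proteins_per100g": 2, "fats_per100g": 6,
     "carbohydrates_per100g": 28, "fiber_per100g": 2},
    {"title": "Green Peas", "weight": 100, "kilocalories_per100g": 81,
     "proteins_per100g": 5, "fats_per100g": 0,
     "carbohydrates_per100g": 14, "fiber_per100g": 5},
]

print(meal_totals(rows))
```

Compared with the totals above, fat comes out as 16.5 g rather than 16 g and carbohydrates as 49.5 g rather than 50 g. The difference is small next to the uncertainty of the weight estimates, but it is free to avoid.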

I think it looks pretty good.

Tests on Several Photos

Let’s send some more photos of my meals to ensure the script works properly.

Meal #2 — Salmon + vegetables

Photo of the Meal with Salmon by the Author of the Article

Result #2

              title  weight  kilocalories  proteins  fat  carbohydrates  fiber
0           Salmon     200           412        44   26              0      0
1  Boiled Potatoes     150           130         3    0             30      3
2          Carrots     100            41         1    0             10      3
3      Green Beans     100            31         2    0              7      3

== OVERALL ==
Kilocalories: 614 kcal
Proteins: 50 g
Fat: 26 g
Carbohydrates: 47 g
Fiber: 9 g

Meal #3 — Lasagna

Photo of the Lasagna by the Author of the Article

Result #3

It also counted my son’s pizza next to it, but then the pizza is in the picture, right? To avoid this, you could ask in the prompt to recognize only the main dish in the center, or simply take a more careful photo.

         title  weight  kilocalories  proteins  fat  carbohydrates  fiber
0      Lasagna     300           450        24   30             36      6
1  Pizza slice     150           375        18   15             50      4

== OVERALL ==
Kilocalories: 825 kcal
Proteins: 42 g
Fat: 45 g
Carbohydrates: 86 g
Fiber: 10 g

Meal #4 — Sushi

Photo of the Sushi by the Author of the Article

Result #4

You could nitpick that it didn’t recognize all the nigiri, but it did a fairly decent job counting the rolls.

             title  weight  kilocalories  proteins  fat  carbohydrates  fiber
0    Salmon Nigiri      40           100         5    3             14      0
1      Tuna Nigiri      40            52         8    2              4      0
2  California Roll     150           360         9   12             52      3
3     Avocado Roll     150           210         4   10             27      4

== OVERALL ==
Kilocalories: 722 kcal
Proteins: 26 g
Fat: 27 g
Carbohydrates: 97 g
Fiber: 7 g

Meal #5 — Complicated Breakfast

Photo of the Breakfast by the Author of the Article

Result #5

Here, I purposely created a complicated breakfast where I piled a lot of different things onto the plate, and to ChatGPT’s credit, it recognized almost everything.

                title  weight  kilocalories  proteins  fat  carbohydrates  fiber
0          fried egg      50            75         6    6              0      0
1    cherry tomatoes     100            18         1    0              4      1
2             salmon      50           103        11    6              0      0
3        feta cheese      30            79         4    6              1      0
4        bread slice      40           100         4    2             19      1
5  pickled mushrooms      50            11         1    0              2      1

== OVERALL ==
Kilocalories: 386 kcal
Proteins: 27 g
Fat: 20 g
Carbohydrates: 26 g
Fiber: 3 g

Wow! It even recognized the mushrooms. Excellent work! Yes, the script occasionally makes mistakes, but the result is much better than I expected, especially for a prototype put together on the fly.

Building a Flask app

Now we can build a separate Flask app that can be deployed on a web server or packaged into an Android/iPhone app.

First, install Flask:

pip install flask

Let’s create the controller:

import os, base64, json
from flask import Flask, request, render_template
from PIL import Image, ImageOps, UnidentifiedImageError
import pandas as pd
from io import BytesIO
from openai import OpenAI

app = Flask(__name__)

# Define the OpenAI API key and model
OPENAI_API_KEY = "[INSERT YOUR OPEN AI API KEY HERE]"
MODEL="gpt-4o"

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", OPENAI_API_KEY))

system_prompt = "You are a very useful assistant. Help me with determining the caloric content of my meal"

user_prompt = """
The photo shows food products for a meal. Determine which products are shown in the photo and return them ONLY as a JSON list, 
where each list element should contain:
  * "title" - the name of the product, 
  * "weight" - weight in grams, 
  * "kilocalories_per100g" - how many calories are contained in this product in 100 grams, 
  * "proteins_per100g" - the amount of proteins of this product per 100 grams, 
  * "fats_per100g" - the amount of fat per 100 grams of this product, 
  * "carbohydrates_per100g" - the amount of carbohydrates per 100 grams of this product, 
  * "fiber_per100g" - the amount of fiber per 100 grams of this product
"""

# Function to resize the image and return the resized PIL image
def resize_image(img, target_width):

    img = ImageOps.exif_transpose(img)
    original_width, original_height = img.size #get original size
    target_height = int((target_width / original_width) * original_height) #calc proportion
    resized_img = img.resize((target_width, target_height), Image.Resampling.LANCZOS) #resize image
    return resized_img

# Function to convert a PIL image to a base64 string
def pil_image_to_base64(img):

    buffered = BytesIO() #create buffer
    #JPEG has no alpha channel, so convert to RGB first (otherwise e.g. RGBA PNG uploads fail to save)
    img.convert("RGB").save(buffered, format="JPEG") #Save the image to the buffer in JPEG format
    base64_str = base64.b64encode(buffered.getvalue()).decode('utf-8') #convert buffer to base64 format
    return base64_str


@app.route("/", methods=["GET", "POST"])
def index():

    if request.method == "POST":

        #Get uploaded file
        file = request.files["file"]
        if file:

            try:
                image = Image.open(file)
            except UnidentifiedImageError:
                return render_template("index.html", error="Incorrect Image!")

            resized_image = resize_image(image, 1000)
            base64_image = pil_image_to_base64(resized_image)

            completion = client.chat.completions.create(
                model=MODEL,
                messages=[
                    {
                        "role": "system",
                        "content": [
                            {"type": "text", "text": system_prompt},
                        ]
                    },
                    {
                        "role": "user",
                        "content": [
                            {"type": "text", "text": user_prompt},
                            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}}
                        ]
                    }
                ]
            )

            #Get and parse result
            response_content = completion.choices[0].message.content

            start_index = response_content.find('[')
            end_index = response_content.rfind(']') + 1
            json_str = response_content[start_index:end_index]

            try:

                data = json.loads(json_str)

                frame_data = []
                for row in data:
                    frame_data.append(dict(
                        title = row['title'],
                        weight = row['weight'],
                        kilocalories = round(row['weight'] / 100 * row['kilocalories_per100g']),
                        proteins = round(row['weight'] / 100 * row['proteins_per100g']),
                        fat = round(row['weight'] / 100 * row['fats_per100g']),
                        carbohydrates = round(row['weight'] / 100 * row['carbohydrates_per100g']),
                        fiber = round(row['weight'] / 100 * row['fiber_per100g']),
                    ))

                frame = pd.DataFrame(frame_data)
                totals = {
                    'kilocalories': frame['kilocalories'].sum(),
                    'proteins': frame['proteins'].sum(),
                    'fat': frame['fat'].sum(),
                    'carbohydrates': frame['carbohydrates'].sum(),
                    'fiber': frame['fiber'].sum()
                }

                return render_template("index.html",
                                       frame=frame.to_dict(orient="records"),
                                       totals=totals,
                                       base64_image=f"data:image/jpeg;base64,{base64_image}")

            except json.JSONDecodeError as e:
                return render_template("index.html", error="No Meal found!")

    return render_template("index.html")


if __name__ == "__main__":
    app.run(debug=True)
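
The route above only catches `json.JSONDecodeError`, so a response that parses as JSON but lacks a key or contains a non-numeric value would still raise an exception inside the loop. Here is a minimal sketch of a validation step that could run right after `json.loads`. `validate_rows` and `REQUIRED_NUMERIC_KEYS` are hypothetical names.

```python
REQUIRED_NUMERIC_KEYS = [
    "weight", "kilocalories_per100g", "proteins_per100g",
    "fats_per100g", "carbohydrates_per100g", "fiber_per100g",
]


def is_number(value):
    """True for int/float values; bool is excluded although it subclasses int."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_rows(data):
    """Keep only list elements that have a title and numeric nutrition fields."""
    if not isinstance(data, list):
        return []

    valid = []
    for row in data:
        if not isinstance(row, dict) or not isinstance(row.get("title"), str):
            continue
        if all(is_number(row.get(key)) for key in REQUIRED_NUMERIC_KEYS):
            valid.append(row)
    return valid


good = {"title": "Green Peas", "weight": 100, "kilocalories_per100g": 81,
        "proteins_per100g": 5, "fats_per100g": 0,
        "carbohydrates_per100g": 14, "fiber_per100g": 5}
bad = {"title": "Mystery sauce", "weight": "about 30 g"}

print(validate_rows([good, bad]))
```

If the returned list is empty, the route could render the same "No Meal found!" error it already uses for unparseable responses.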

Let’s create a template for displaying the form and result data.

<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
  <title>Calorie Detector</title>
  <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-QWTKZyjpPEjISv5WaRU9OFeRpok6YctnYmDr5pNlyT2bRjXh0JMhjY6hW+ALEwIH" crossorigin="anonymous">
</head>
<body>

  <div class="container mt-3">
    <div class="row">
      <div class="col">

        <h2>Upload photo of your Meal!</h2>
        <form method="POST" enctype="multipart/form-data">
          <div class="row">
            <div class="col">
              <input type="file" name="file" class="form-control" accept="image/*" required>
            </div>
            <div class="col">
              <button type="submit" class="btn btn-primary">Upload a Photo!</button>
            </div>
          </div>
        </form>

        <hr>

        {% if error %}
          <div class="alert alert-danger" role="alert">
            {{ error }}
          </div>
        {% endif %}

        {% if frame %}

        <h2>Meal Ingredients</h2>

          <div class="row">
            <div class="col-md-4">
              <img src="{{ base64_image }}" class="img-thumbnail mb-3" alt="Uploaded Meal" />
            </div>

            <div class="col-md-8">

              <h2>Meal Overall Result:</h2>

              <h5>Kilocalories: {{ totals.kilocalories }} kcal</h5>
              <h5>Proteins: {{ totals.proteins }} g</h5>
              <h5>Fat: {{ totals.fat }} g</h5>
              <h5>Carbohydrates: {{ totals.carbohydrates }} g</h5>
              <h5>Fiber: {{ totals.fiber }} g</h5>

            </div>
          </div>

          <table class="table table-striped">
            <thead>
              <tr>
                <th>Product</th>
                <th>Weight (g)</th>
                <th>Kilocalories (kcal)</th>
                <th>Proteins (g)</th>
                <th>Fat (g)</th>
                <th>Carbohydrates (g)</th>
                <th>Fiber (g)</th>
              </tr>
            </thead>
            <tbody>
              {% for item in frame %}
              <tr>
                <td>{{ item.title }}</td>
                <td>{{ item.weight }}</td>
                <td>{{ item.kilocalories }}</td>
                <td>{{ item.proteins }}</td>
                <td>{{ item.fat }}</td>
                <td>{{ item.carbohydrates }}</td>
                <td>{{ item.fiber }}</td>
              </tr>
              {% endfor %}
            </tbody>
          </table>

        {% endif %}

      </div>
    </div>
  </div>

    <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.3.3/dist/js/bootstrap.bundle.min.js" integrity="sha384-YvpcrYf0tY3lHB60NNkmXc5s9fDVZLESaAA55NDzOxhy9GkcIdslK1eN7N6jIeHz" crossorigin="anonymous"></script>
</body>
</html>

Result of the Flask Application with Data Output

Calculating the Cost of an OpenAI Request

It’s interesting to calculate the cost of an OpenAI request. Pricing information is available here: OpenAI Pricing. The cost of Vision will depend on the size of the images. For example, for an image of 1000x1000 pixels, it will be approximately $0.002 just for Vision. Therefore, for production, it’s essential to choose an image size that’s not too large but still has enough detail for recognition.
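
The $0.002 figure can be reproduced with a rough estimate. The sketch below assumes the tile-based rule that OpenAI's vision guide described for gpt-4o in high-detail mode at the time of writing: fit the image within 2048x2048, shrink the shortest side to 768 px, then count 170 tokens per 512 px tile plus 85 base tokens. It also assumes a price of $2.50 per 1M input tokens. Both may have changed since, and `estimate_image_tokens` is a hypothetical helper.

```python
import math


def estimate_image_tokens(width, height):
    """Rough token estimate for one high-detail image (assumed gpt-4o rule)."""
    # 1. Scale down to fit inside a 2048 x 2048 square
    scale = min(1.0, 2048 / max(width, height))
    width, height = width * scale, height * scale

    # 2. Scale down so that the shortest side is at most 768 px
    scale = min(1.0, 768 / min(width, height))
    width, height = round(width * scale), round(height * scale)

    # 3. Count 512 px tiles: 170 tokens each, plus 85 base tokens
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles


tokens = estimate_image_tokens(1000, 1000)
cost = tokens / 1_000_000 * 2.50  # assumed price per 1M input tokens, in USD
print(f"{tokens} tokens, about ${cost:.4f}")
```

For a 1000x1000 image this gives 765 tokens, or roughly $0.002, in line with the estimate above.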

Here’s how to get the input/output tokens from the completion structure and calculate the cost:

#Get tokens from usage
prompt_tokens = completion.usage.prompt_tokens
completion_tokens = completion.usage.completion_tokens
total_tokens = completion.usage.total_tokens

#Show tokens
print(f"Prompt Tokens: {prompt_tokens}")
print(f"Completion Tokens: {completion_tokens}")
print(f"Total Tokens: {total_tokens}")

# https://openai.com/api/pricing/
cost_per_1m_input_tokens = 2.50  # price for input tokens
cost_per_1m_output_tokens = 10.00  # price for output tokens

#Calc cost
cost_input = (prompt_tokens / 1_000_000) * cost_per_1m_input_tokens
cost_output = (completion_tokens / 1_000_000) * cost_per_1m_output_tokens
total_cost = round((cost_input + cost_output),6)

print(f"Total cost of the request: ${total_cost}")

The result: the overall cost is approximately $0.004 per query.

Prompt Tokens: 974
Completion Tokens: 128
Total Tokens: 1102
Total cost of the request: $0.003715

Ideas for Future Development

Ideas for Technical Development

I naively ask the model for JSON in text format. This works because GPT-4o is smart enough, but it would be more reliable to define the response format explicitly rather than rely on the prompt alone.
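
One way to make the format explicit is the `response_format` parameter of the Chat Completions API, which accepts a JSON Schema in recent gpt-4o versions. The sketch below only builds that parameter. It assumes the documented Structured Outputs requirements: an object at the top level, every property listed as required, and `additionalProperties` set to false. The names `meal_analysis` and `products` are hypothetical.

```python
NUMERIC_FIELDS = [
    "weight", "kilocalories_per100g", "proteins_per100g",
    "fats_per100g", "carbohydrates_per100g", "fiber_per100g",
]


def build_response_format():
    """Build a response_format value describing the expected meal JSON."""
    properties = {"title": {"type": "string"}}
    for field in NUMERIC_FIELDS:
        properties[field] = {"type": "number"}

    product_schema = {
        "type": "object",
        "properties": properties,
        "required": list(properties),
        "additionalProperties": False,
    }

    return {
        "type": "json_schema",
        "json_schema": {
            "name": "meal_analysis",  # hypothetical schema name
            "strict": True,
            "schema": {
                # The list is wrapped in an object because the top level must be an object
                "type": "object",
                "properties": {
                    "products": {"type": "array", "items": product_schema},
                },
                "required": ["products"],
                "additionalProperties": False,
            },
        },
    }


response_format = build_response_format()
print(response_format["json_schema"]["schema"]["required"])
```

Passing `response_format=build_response_format()` to `client.chat.completions.create(...)` should then yield a JSON object whose `products` key holds the list, so the bracket search would no longer be needed. Treat this as a starting point to verify against the current API documentation.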

It’s also worth experimenting with the prompt, image size, and parameters to optimize the detection of meal components and reduce token consumption. If I were building this app more “seriously,” I’d try fine-tuning the model with data from global cuisines so that the model could identify portion sizes and ingredients more accurately.

Ideas for Developing the App into a Product

First, I’d add the ability to detect sugar and saturated fats. These parameters are also important for healthy eating. It’s not difficult to add these parameters to the prompt and then save them into the dataframe.
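
As a rough sketch of how that could look, the per-100 g fields can be kept in one list that drives both the prompt text and the calculation, so that adding a nutrient means adding one entry. The field names `sugar_per100g` and `saturated_fats_per100g` and the sample values are assumptions for illustration, not output from the model.

```python
# (JSON field name, description used in the prompt)
FIELDS = [
    ("kilocalories_per100g", "how many calories are contained in this product in 100 grams"),
    ("proteins_per100g", "the amount of proteins of this product per 100 grams"),
    ("fats_per100g", "the amount of fat per 100 grams of this product"),
    ("carbohydrates_per100g", "the amount of carbohydrates per 100 grams of this product"),
    ("fiber_per100g", "the amount of fiber per 100 grams of this product"),
    # Hypothetical additions, not part of the original prompt:
    ("sugar_per100g", "the amount of sugar per 100 grams of this product"),
    ("saturated_fats_per100g", "the amount of saturated fat per 100 grams of this product"),
]


def prompt_field_lines():
    """Render the bullet lines that describe each field in the user prompt."""
    return "\n".join(f'  * "{name}" - {description},' for name, description in FIELDS)


def amounts_for(row):
    """Scale every per-100 g value by the ingredient weight."""
    result = {"title": row["title"], "weight": row["weight"]}
    for name, _ in FIELDS:
        column = name.replace("_per100g", "")
        result[column] = round(row["weight"] / 100 * row[name])
    return result


# Made-up sample values, only to show the calculation
example = {"title": "Example product", "weight": 200, "kilocalories_per100g": 50,
           "proteins_per100g": 1, "fats_per100g": 0, "carbohydrates_per100g": 12,
           "fiber_per100g": 2, "sugar_per100g": 10, "saturated_fats_per100g": 0}

print(prompt_field_lines())
print(amounts_for(example))
```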

Next, I’d implement meal tracking to calculate the total intake for a day. For example, after photographing your breakfast, lunch, and dinner, you could see how much you’ve eaten in total for the day. Data could be displayed in a calendar format.
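
A minimal sketch of the aggregation step, using the totals of Meals #1 to #3 from this article. The dates are made up for illustration, and `daily_totals` is a hypothetical helper. A real app would read the meals from a database rather than a list.

```python
from collections import defaultdict
from datetime import date


def daily_totals(meals):
    """Sum meal totals per calendar day.

    `meals` is a list of (date, totals) pairs, where totals maps a
    nutrient name to its amount for that meal.
    """
    days = defaultdict(lambda: defaultdict(int))
    for day, totals in meals:
        for nutrient, amount in totals.items():
            days[day][nutrient] += amount
    return {day: dict(values) for day, values in days.items()}


meals = [
    # Meal #1 and Meal #2 on one (made-up) day, Meal #3 on the next
    (date(2024, 10, 1), {"kilocalories": 456, "proteins": 37, "fat": 16,
                         "carbohydrates": 50, "fiber": 7}),
    (date(2024, 10, 1), {"kilocalories": 614, "proteins": 50, "fat": 26,
                         "carbohydrates": 47, "fiber": 9}),
    (date(2024, 10, 2), {"kilocalories": 825, "proteins": 42, "fat": 45,
                         "carbohydrates": 86, "fiber": 10}),
]

for day, totals in daily_totals(meals).items():
    print(day.isoformat(), totals)
```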

Additionally, recommendations and goals could be added! If you’ve eaten too much fat, for example, the app could suggest cutting down on fats. Or, if you’re into sports, it could recommend increasing protein intake.

If we add user parameters (weight, height, basic physical activity), we could calculate recommended calorie intake and provide personalized suggestions based on that!
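
One commonly used starting point for such a calculation is the Mifflin-St Jeor equation for basal metabolic rate, multiplied by an activity factor. The sketch below is an illustration under that assumption. Note that the equation also needs age and sex, which are not in the parameter list above, and that the activity factors are the commonly quoted ones rather than values from this article.

```python
# Commonly quoted activity multipliers (assumed values)
ACTIVITY_FACTORS = {
    "sedentary": 1.2,
    "light": 1.375,
    "moderate": 1.55,
    "active": 1.725,
}


def recommended_kcal(weight_kg, height_cm, age_years, sex, activity="sedentary"):
    """Estimate daily energy needs with the Mifflin-St Jeor equation."""
    bmr = 10 * weight_kg + 6.25 * height_cm - 5 * age_years
    bmr += 5 if sex == "male" else -161
    return round(bmr * ACTIVITY_FACTORS[activity])


print(recommended_kcal(80, 180, 35, "male", "moderate"))
```

The result could be compared with the daily calorie total to decide which suggestion to show.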

Maybe one of my readers will use this idea and implementation to launch a startup? Don’t forget to give me a share! :)

About the Author

The author of this article is an engineer with 20 years of experience as a developer and data engineer, with expertise in e-commerce, machine learning (including AI), and blockchain. Always eager to experiment and test new things, and open to project-based collaborations.