Summarizing Emails with LLMs

Zack had already done something powerful: he connected his Gmail inbox to Python. He could fetch the latest ten emails, strip out the messy signatures, and print out the subject with a clean body.

But there was still a problem. Every morning, Zack opened his terminal and saw lines and lines of text. Subjects and bodies printed neatly, yes, but he still had to read them all to know what mattered.

“This feels like progress,” Zack sighed, “but I’m still drowning in words.”

That’s when he realized: the next step was to make the assistant summarize his inbox for him. Short, clear notes. A daily digest he could scan in one minute instead of thirty.

That’s what we’ll build in this lesson.

Step 1: Designing prompts for summarization

Before Zack wrote any code, he learned something important: LLMs are only as good as the prompts you give them.

Think of a prompt as your instruction manual to the model. If you say, “Summarize this email,” you might get one vague line. But if you say, “Summarize this email in one clear sentence highlighting the request and any deadlines,” the answer will be sharper.

Zack experimented with three styles of prompts:

  1. Naïve prompt:
“Summarize this email.”
  • Output: “The team is asked for a report.” (too short, missing details).
  2. Focused prompt:
“Summarize this email in one sentence, including the main request and deadline.”
  • Output: “The team is asked to send the report by Friday.” (better).
  3. Digest-friendly prompt:
“You are helping create a daily email digest. Summarize this message in 1–2 sentences. Focus on who is requesting what, and include any deadlines or key dates.”
  • Output: “Ali requests the team to send the project report by Friday. No other details were mentioned.” (perfect).

That third one became Zack’s standard. It wasn’t just a summary; it was a digest entry.
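
To compare the three styles side by side, it can help to keep them as named templates. The following is a minimal sketch; `PROMPTS` and `build_prompt` are hypothetical names chosen for illustration and are not part of the lesson's script. It only builds the prompt text and does not call the model.

```python
# Hypothetical helper names (PROMPTS, build_prompt); the prompt wording is from Step 1.
PROMPTS = {
    "naive": "Summarize this email.",
    "focused": "Summarize this email in one sentence, including the main request and deadline.",
    "digest": (
        "You are helping create a daily email digest. "
        "Summarize this message in 1–2 sentences. "
        "Focus on who is requesting what, and include any deadlines or key dates."
    ),
}


def build_prompt(email_text, style="digest"):
    """Return the chosen instruction followed by a blank line and the email body."""
    return f"{PROMPTS[style]}\n\n{email_text}"


if __name__ == "__main__":
    sample = "Hi team, please send the project report by Friday. Thanks, Ali"
    for style in PROMPTS:
        print(f"--- {style} ---")
        print(build_prompt(sample, style))
```

Keeping the instruction separate from the email text makes it easy to swap styles and see how the model's answer changes.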

Step 2: Generating a daily digest

Now Zack wanted more than one summary. He wanted the assistant to process his latest unread emails, summarize each one, and print them together as a clean daily digest.

The structure he imagined was simple:

Daily Digest – February 6, 2025

1. Ali requests the team to send the project report by Friday.  
2. HR reminds staff to update their timesheets before Monday.  
3. Client asked to reschedule tomorrow’s call to Thursday.  

With that format, Zack could glance at one screen and know the essentials.

How would the script do this?

  • Fetch emails: Use the Gmail API code from Lesson 3.
  • Summarize: Send each email body to GPT-4o with the digest prompt.
  • Collect summaries: Store them in a list.
  • Format the digest: Print them with numbering and today’s date.
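
The last step, formatting, can be sketched on its own without touching Gmail or the model. This is an illustrative helper, assuming the summaries already exist as a list of strings; `format_digest` is a hypothetical name, not a function from the lesson.

```python
from datetime import date


def format_digest(summaries, day=None):
    """Build the digest text: a dated title, a blank line, then numbered entries."""
    day = day or date.today()
    # Assemble the date by hand so the day has no leading zero ("February 6", not "February 06").
    title = f"Daily Digest – {day:%B} {day.day}, {day.year}"
    entries = [f"{i}. {s}" for i, s in enumerate(summaries, 1)]
    return "\n".join([title, "", *entries])


if __name__ == "__main__":
    print(format_digest(
        [
            "Ali requests the team to send the project report by Friday.",
            "HR reminds staff to update their timesheets before Monday.",
        ],
        date(2025, 2, 6),
    ))
```

Returning a string instead of printing directly keeps the formatting easy to test and to reuse, for example if the digest is later sent by email.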

This is where the assistant started to feel real.

Step 3: Handling long email threads

One morning, Zack’s inbox included a massive client thread. Twenty replies deep. Each reply quoted the previous one.

When he sent the whole thing to the model, it choked:

  • The output ignored the latest details.
  • The summary repeated old information.

That’s when Zack learned about chunking. Here’s the trick:

  • Split the thread into pieces, usually by reply markers like > On Jan 5, John wrote: or by chunk size (say, 500–1000 characters).
  • Summarize each chunk separately.
  • Summarize the summaries, a final pass that condenses the chunks into one clear note.

Example flow:

  • Chunk 1: “Initial project kickoff…” → summary: Kickoff scheduled for Feb 1.
  • Chunk 2: “Update about delays…” → summary: Vendor delayed materials, pushing timeline.
  • Final summary: Kickoff was Feb 1; timeline delayed due to vendor.

This way, Zack’s assistant stayed focused on the newest parts of long emails without getting lost in history.
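
The chunking flow above can be sketched as follows. This is a rough illustration under stated assumptions: the reply-marker pattern is a guess at common "On ... wrote:" lines and real mail clients vary, and `split_thread` and `summarize_thread` are hypothetical names. The `summarize` argument is any function that takes text and returns a summary, for example a function that sends the text to the model.

```python
import re

# Assumed pattern for attribution lines such as "> On Jan 5, John wrote:".
REPLY_MARKER = re.compile(r"^>*\s*On .+ wrote:\s*$", re.MULTILINE)


def split_thread(thread, max_chars=1000):
    """Split on reply markers, then cap each piece at max_chars characters."""
    pieces = [p.strip() for p in REPLY_MARKER.split(thread) if p.strip()]
    chunks = []
    for piece in pieces:
        for start in range(0, len(piece), max_chars):
            chunks.append(piece[start:start + max_chars])
    return chunks


def summarize_thread(thread, summarize, max_chars=1000):
    """Summarize each chunk, then summarize the joined chunk summaries."""
    chunks = split_thread(thread, max_chars)
    if not chunks:
        return ""
    partials = [summarize(chunk) for chunk in chunks]
    if len(partials) == 1:
        return partials[0]  # nothing to condense
    return summarize("\n".join(partials))


if __name__ == "__main__":
    thread = "Vendor delayed materials.\nOn Jan 5, John wrote:\n> Kickoff is Feb 1.\n"
    # Stand-in summarizer for the demo: keeps the first 40 characters.
    print(summarize_thread(thread, lambda text: text[:40]))
```

Cutting by character count can split a sentence in half, so splitting on reply markers first and using the size cap only as a fallback tends to give cleaner chunks.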

Step 4: Coding the inbox summarizer

Zack opened a new file called summarize_inbox.py. He combined the Gmail fetcher from Lesson 3 with GPT-4o.

Here’s the script:

import os
import base64
from datetime import datetime
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials
from google_auth_oauthlib.flow import InstalledAppFlow
from googleapiclient.discovery import build
from openai import OpenAI
from bs4 import BeautifulSoup

SCOPES = ['https://www.googleapis.com/auth/gmail.readonly']
client = OpenAI()

def get_service():
    creds = None
    if os.path.exists('token.json'):
        creds = Credentials.from_authorized_user_file('token.json', SCOPES)
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file('credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        with open('token.json', 'w') as token:
            token.write(creds.to_json())
    return build('gmail', 'v1', credentials=creds)

def clean_body(msg):
    payload = msg['payload']
    body = ""
    if 'parts' in payload:
        for part in payload['parts']:
            data = part.get('body', {}).get('data')
            if not data:  # attachments and nested containers carry no inline data
                continue
            text = base64.urlsafe_b64decode(data).decode('utf-8', errors='ignore')
            if part['mimeType'] == 'text/plain':
                body = text
                break
            elif part['mimeType'] == 'text/html':
                body = BeautifulSoup(text, 'html.parser').get_text()
                break
    else:
        data = payload['body'].get('data')
        if data:
            body = base64.urlsafe_b64decode(data).decode('utf-8', errors='ignore')
            if payload.get('mimeType') == 'text/html':  # single-part HTML email
                body = BeautifulSoup(body, 'html.parser').get_text()

    # Basic cleaning
    new_lines = []
    for line in body.splitlines():
        stripped = line.strip()
        # Stop at a forwarded-message banner, e.g. "---------- Forwarded message ---------"
        if stripped.lower().lstrip('- ').startswith('forwarded message'):
            break
        if stripped == '--':  # signature delimiter: drop it and everything after it
            break
        if stripped.startswith('>'):  # quoted reply
            continue
        new_lines.append(line)
    return '\n'.join(new_lines).strip()

def summarize_email(text):
    # Digest prompt from Step 1, followed by the cleaned email body
    prompt = (
        "You are helping create a daily email digest. "
        "Summarize this message in 1–2 sentences. "
        "Focus on who is requesting what, and include any deadlines or key dates."
        f"\n\n{text}"
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content.strip()

def main():
    service = get_service()
    results = service.users().messages().list(userId='me', labelIds=['INBOX'], maxResults=10).execute()
    messages = results.get('messages', [])
    summaries = []

    for m in messages:
        msg = service.users().messages().get(userId='me', id=m['id'], format='full').execute()
        subject = next((h['value'] for h in msg['payload']['headers'] if h['name'] == 'Subject'), '(No Subject)')
        body = clean_body(msg)
        if body:
            summary = summarize_email(body)
            summaries.append(f"{subject}: {summary}")

    today = datetime.now()
    print("\n=== DAILY DIGEST ===")
    # Build the date by hand so the day has no leading zero (6, not 06)
    print(f"Date: {today:%B} {today.day}, {today.year}\n")
    for i, s in enumerate(summaries, 1):
        print(f"{i}. {s}")

if __name__ == '__main__':
    main()
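
One limitation worth knowing: clean_body only looks at the top-level parts of a message. Gmail can nest parts, for example a multipart/alternative section inside a multipart/mixed message that also carries an attachment, and in that case the text sits one level deeper. The sketch below shows one possible way to search nested parts for the plain-text body; `find_part`, `decode_part`, and `extract_plain_text` are hypothetical helpers, not part of the lesson's script.

```python
import base64


def find_part(payload, mime_type):
    """Depth-first search for the first part of this MIME type that has inline data."""
    if payload.get("mimeType") == mime_type and payload.get("body", {}).get("data"):
        return payload
    for part in payload.get("parts", []):
        found = find_part(part, mime_type)
        if found:
            return found
    return None


def decode_part(part):
    """Decode a part's base64url data into text, ignoring undecodable bytes."""
    data = part["body"]["data"]
    return base64.urlsafe_b64decode(data).decode("utf-8", errors="ignore")


def extract_plain_text(payload):
    """Return the first text/plain body found anywhere in the payload, else ''."""
    part = find_part(payload, "text/plain")
    return decode_part(part) if part else ""
```

With this in place, clean_body could start from extract_plain_text(msg['payload']) and fall back to its HTML handling only when the result is empty.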

Step 5: Testing the digest

When Zack ran the script:

python summarize_inbox.py

He got output like:

=== DAILY DIGEST ===
Date: February 6, 2025

1. Project Report: Ali asks the team to send the report by Friday.  
2. HR Reminder: HR asks staff to update timesheets before Monday.  
3. Client Call: Client requests to move the call to Thursday.  

Zack grinned. In less than a minute, he could see the essence of his inbox. No scrolling, no distractions.

Step 6: Practice exercise

Your task is to extend this script:

  • Fetch the latest unread emails only.
  • Summarize them using the digest prompt.
  • Print them with subjects and numbering.

Hint: In the Gmail list() call, add the UNREAD label to labelIds:

results = service.users().messages().list(
    userId='me',
    labelIds=['INBOX', 'UNREAD'],
    maxResults=10
).execute()

Run it, and check how many unread emails you can summarize at once.

Zack’s feedback

After using the daily digest for a week, Zack noticed big changes:

  • His mornings were calmer. Instead of slogging through 50+ emails, he scanned five summaries.
  • He could spot urgent items instantly. If a client wanted a reply, it stood out.
  • His stress dropped, because he no longer feared “missing something important.”

For the first time, Zack felt like his inbox was working for him, not against him.

Conclusion

In this lesson, you saw Zack move from raw email text to a usable digest:

  • He learned to craft better prompts for summarization
  • He generated a clean daily digest with GPT-4o
  • He handled long email threads using chunking
  • He built a script that summarized his inbox automatically

This is where the assistant really starts paying off. It’s no longer just code—it’s a daily tool that saves time and energy.

Frequently Asked Questions

LLMs follow instructions closely. Clear prompts (“summarize in 1–2 sentences, include deadlines”) produce better results than vague ones.

A daily digest is a short list of summaries of unread emails, showing the main request and deadlines so you can scan your inbox quickly.

Split the thread into chunks, summarize each chunk, then combine those summaries into one final note. This ensures newer details aren’t lost.

You’ll use Gmail API (for fetching emails), OpenAI’s API (for summarizing), and libraries like googleapiclient and openai.

Build a Python script that fetches your latest emails, summarizes each one with GPT-4o, and prints them as a numbered daily digest.
