(#2) Building NotebookLM from scratch: making consumption easier
From reading lists to playlists: Delivering AI-generated podcasts to your podcast apps
Quick Update
Built a working pipeline that transforms articles into audio content
Restructured the codebase for better maintainability
Implemented configurable workflows using YAML
Added support for S3-compatible storage
Here is a podcast version of this article, generated via gyandex.
New to this series? Check out the first post, where I explored why I'm building an open-source alternative to NotebookLM.
The Journey So Far
Remember how we talked about information overload in our last post? Well, I've made significant progress in tackling that challenge. Instead of just dreaming about an ideal content consumption tool, we now have a working prototype that does something pretty cool: it turns your reading list into a podcast feed!
What's Working Now
The system can now:
Take an article link as input
Process and transform the content
Generate audio output
Create a podcast feed that updates automatically with new content
This is particularly exciting because it takes us one step closer to flexible content consumption, a key NotebookLM feature I wanted to expand on.
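To make the flow concrete, here is a minimal sketch of the first two stages. The helper functions and the LangChain-based LLM call are my own illustration of the approach, not gyandex's actual internals:

```python
import requests  # pip install requests beautifulsoup4 langchain-google-genai
from bs4 import BeautifulSoup
from langchain_google_genai import ChatGoogleGenerativeAI


def fetch_article(url: str) -> str:
    """Stage 1: download the article and reduce it to plain text."""
    html = requests.get(url, timeout=30).text
    return BeautifulSoup(html, "html.parser").get_text(separator="\n")


def generate_script(article_text: str) -> str:
    """Stage 2: ask an LLM to rewrite the article as a podcast dialogue."""
    # Reads GOOGLE_API_KEY from the environment, matching the config below.
    llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro", temperature=1.0)
    prompt = ("Rewrite this article as a conversation between two podcast "
              "hosts:\n\n" + article_text)
    return llm.invoke(prompt).content


script = generate_script(fetch_article("https://www.rubick.com/skip-level-1-on-1s/"))
```

Stages 3 and 4 (speech synthesis and publishing) are sketched later in this post.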
Technical Progress
From Prototype to Production-Ready
The initial prototype lived in a single Jupyter notebook. Now, it's evolved into a structured project with dedicated modules.
Before:
main.ipynb
After:
gyandex
├── __init__.py
├── cli
│  ├── __init__.py
│  └── genpod.py
├── llms
│  ├── __init__.py
│  ├── factory.py
│  └── factory_test.py
├── loaders
│  ├── __init__.py
│  ├── factory.py
│  └── factory_test.py
└── podgen
    ├── __init__.py
    ├── config
    │  ├── __init__.py
    │  ├── loader.py
    │  ├── loader_test.py
    │  └── schema.py
    ├── engine
    │  ├── __init__.py
    │  ├── generator.py
    │  ├── publisher.py
    │  ├── publisher_test.py
    │  ├── synthesizer.py
    │  ├── synthesizer_test.py
    │  └── workflows.py
    ├── feed
    │  ├── __init__.py
    │  ├── generator.py
    │  ├── generator_test.py
    │  ├── models.py
    │  └── models_test.py
    ├── processors
    │  ├── __init__.py
    │  └── tts.py
    └── storage
        ├── __init__.py
        ├── factory.py
        ├── factory_test.py
        ├── s3.py
        └── s3_test.py
The new layout may look a bit daunting, but that's exactly why I had to do this now: keeping everything in a single notebook was no longer sustainable.
Configurable Workflows
Configuration is now managed through YAML files, making it easier to control the entire process. The format supports environment variable references, so credentials stay out of the file itself.
Here's a simplified example:
version: "1.0"
content:
  source: "https://www.rubick.com/skip-level-1-on-1s/"
  format: "html"
llm:
  provider: "google-generative-ai"
  model: "gemini-1.5-pro"
  temperature: 1.0
  google_api_key: "${GOOGLE_API_KEY}"
tts: # This section isn't hooked up properly yet
  provider: "aws"
  default_voice: "default_host"
  voices:
    default_host:
      voice_id: "Matthew"
      speaking_rate: 1.0
      pitch: 0
    guest:
      voice_id: "Joanna"
      speaking_rate: 1.1
      pitch: 1
storage:
  provider: "s3"
  access_key: "${ACCESS_KEY_ID}" # This is valid because the configuration format accepts environment variables
  secret_key: "${SECRET_ACCESS_KEY}"
  bucket: "gyandex"
  region: "us-east-1"
  endpoint: "https://xxx.r2.cloudflarestorage.com"
  custom_domain: "pub-xxx.r2.dev"
feed:
  title: "Gyandex: Tech Reading"
  slug: "reading-list"
  description: "Technical reading list curated by Dhruv Baldawa"
  author: "Dhruv Baldawa"
  email: "test@example.com"
  language: "en"
  categories: ["Technology", "Software Development", "Programming"]
  image: "https://images.pexels.com/photos/26730962/pexels-photo-26730962.jpeg?cs=srgb&dl=pexels-helloaesthe-26730962.jpg&fm=jpg&w=640&h=960"
  website: "https://github.com/dhruvbaldawa/gyandex"
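Those ${GOOGLE_API_KEY}-style references deserve a quick note. I won't reproduce gyandex's actual loader here, but the essence is a substitution pass over the raw file before parsing. A minimal sketch, assuming PyYAML:

```python
import os
import re
import yaml  # pip install pyyaml

_ENV_VAR = re.compile(r"\$\{(\w+)\}")


def load_config(path: str) -> dict:
    """Read a YAML config, expanding ${VAR} references from the environment."""
    with open(path) as f:
        raw = f.read()

    def substitute(match: re.Match) -> str:
        value = os.environ.get(match.group(1))
        if value is None:
            # Fail loudly rather than letting a missing credential
            # silently become an empty string.
            raise KeyError(f"Environment variable {match.group(1)} is not set")
        return value

    return yaml.safe_load(_ENV_VAR.sub(substitute, raw))


config = load_config("reading-list.yaml")
```

This keeps secrets out of version control while leaving the config file fully declarative.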
I built a companion CLI (podgen) which can run the entire workflow using this configuration. I plan to create separate configuration files for different use cases, each with its own customizations; for example, I'd like a different podcast structure for technical articles vs. philosophical ones. Running the workflow is a single command:
podgen reading-list.yaml
For now, the feeds are powered by a local SQLite database. I will be iterating on my choice of data store throughout this project.
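To show what the publish-and-feed step boils down to, here's a hedged sketch: upload the audio to S3-compatible storage (boto3 happily talks to Cloudflare R2 if you override endpoint_url), then regenerate the RSS feed with the feedgen library. The actual gyandex publisher and feed generator may differ:

```python
import boto3  # pip install boto3 feedgen
from feedgen.feed import FeedGenerator

# boto3 works against any S3-compatible store (here, Cloudflare R2)
# as long as endpoint_url points at the provider instead of AWS.
# Credentials come from the environment or explicit kwargs.
s3 = boto3.client("s3", endpoint_url="https://xxx.r2.cloudflarestorage.com")
s3.upload_file("episode.mp3", "gyandex", "reading-list/episode.mp3")
audio_url = "https://pub-xxx.r2.dev/reading-list/episode.mp3"

# Rebuild the feed from scratch; in gyandex the episode list would
# come out of the SQLite database mentioned above.
fg = FeedGenerator()
fg.load_extension("podcast")  # iTunes-style tags that podcast apps expect
fg.title("Gyandex: Tech Reading")
fg.link(href="https://github.com/dhruvbaldawa/gyandex", rel="alternate")
fg.description("Technical reading list curated by Dhruv Baldawa")
fg.language("en")
fg.podcast.itunes_category("Technology")

fe = fg.add_entry()
fe.id(audio_url)
fe.title("Skip-level 1:1s")
fe.description("Audio version of the article")
fe.enclosure(audio_url, "0", "audio/mpeg")  # "0" is a placeholder byte length

fg.rss_file("feed.xml", pretty=True)
s3.upload_file("feed.xml", "gyandex", "reading-list/feed.xml")
```

Subscribing in a podcast app is then just a matter of pointing it at the public feed.xml URL.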
Learning Journey
While this update focused more on infrastructure than innovation, I discovered Meta's NotebookLlama, a toy example with similar goals. I'm looking forward to exploring their approach in future updates.
Next Steps
Content Processing Improvements
Current focus: Enhancing the quality of generated content
Exploring better approaches for handling longer articles and producing higher-quality podcasts
Configuration Enhancements
Expanding YAML configuration options
Implementing more control over the text-to-speech process, which isn't wired up yet (the tts section in the config above is currently a placeholder); a rough sketch of the direction follows below
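For the TTS piece, the idea is to map each voice's speaking_rate and pitch from the YAML onto SSML prosody attributes for AWS Polly. The rate/pitch mapping here is my assumption of how it could work, not current gyandex behavior:

```python
import boto3  # pip install boto3

polly = boto3.client("polly", region_name="us-east-1")

# Hypothetical mapping from the YAML voice settings to SSML prosody:
# speaking_rate 1.1 -> rate="110%", pitch 1 -> pitch="+1%".
ssml = (
    '<speak><prosody rate="110%" pitch="+1%">'
    "Thanks for having me on the show!"
    "</prosody></speak>"
)

response = polly.synthesize_speech(
    Text=ssml,
    TextType="ssml",
    VoiceId="Joanna",  # the "guest" voice from the config above
    OutputFormat="mp3",
)
with open("guest-sample.mp3", "wb") as f:
    f.write(response["AudioStream"].read())
```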
Building in Public
I'm sharing this journey to document my learning about Generative AI and hopefully help others who are interested in similar projects. Each update brings new insights about both the technical challenges and the practical applications of current AI capabilities.
Source code: https://github.com/dhruvbaldawa/gyandex
Pull request for this update: https://github.com/dhruvbaldawa/gyandex/pull/1