Building NotebookLM from Scratch
Building AI tools in public: From newsletters to podcasts, one experiment at a time
Introduction
As a developer passionate about AI and content consumption, I've embarked on a journey to build in public: creating gyandex, an open-source alternative to NotebookLM. This project isn't just about coding; it's about exploring the fascinating world of Generative AI and sharing my learnings along the way.
The Challenge of Modern Content Consumption
We live in an age of information overflow. Like many of you, I subscribe to numerous newsletters and constantly battle with my inbox full of unread articles. Despite my genuine interest in these topics, finding the time to read everything has become increasingly challenging.
This is where the magic of Generative AI comes in. Through AI-powered content transformation, I've discovered a way to consume 80% of my content in just 5% of the time. It's not just about speed; it's about adapting content to fit how I'd like to consume it.
Why Another NotebookLM?
NotebookLM has been my go-to tool for various use cases - from researching baby strollers to reading technical papers and even debugging production incidents. It's an impressive tool, but as a developer, I've often found myself wanting more flexibility and customization options.
Instead of waiting for these features, I decided to build them myself. Why? Because sometimes the best way to understand and improve upon a tool is to rebuild it from scratch. Plus, it's an excellent opportunity to dive deep into the latest developments in Generative AI.
The MVP: From Text to Talk
For the initial version, I'm focusing on a specific yet powerful feature: transforming web articles into podcasts. The goal is to create a system that can:
Clean and extract content from web articles
Generate engaging podcast scripts
Convert these scripts into high-quality audio
Package everything into a podcast feed format
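The first three steps above can be sketched as a small pipeline in which each stage is a plain function. This is a minimal illustration under my own assumptions, not gyandex's actual code: `Episode`, `build_episode`, and the stage names are hypothetical, and the stand-in stages only exist so the sketch runs without network access or API keys. Packaging into a feed is left out here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Episode:
    """Everything the pipeline produces for one article (hypothetical shape)."""
    source_url: str
    article_text: str
    script: str
    audio: bytes


def build_episode(
    url: str,
    extract: Callable[[str], str],
    write_script: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Episode:
    """Run extraction, script writing, and synthesis in order.

    Each stage is passed in as a function, so a real API client can be
    swapped for a fake one in tests or experiments.
    """
    article_text = extract(url)
    script = write_script(article_text)
    audio = synthesize(script)
    return Episode(url, article_text, script, audio)


# Stand-in stages: no network calls, no API keys.
demo = build_episode(
    "https://example.com/post",
    extract=lambda url: f"Article text fetched from {url}",
    write_script=lambda text: f"HOST: Today we are discussing: {text}",
    synthesize=lambda script: script.encode("utf-8"),
)
print(demo.script)
```

Keeping the stages as injected functions means the extraction, script, and audio providers can each be replaced without touching the pipeline itself.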
Current Progress
The initial prototype is already functional:
Content Extraction: Using jina.ai APIs for clean, accurate web content extraction
Script Generation: Using the Gemini Pro API to transform articles into natural-sounding podcast scripts (currently well within the free tier limits)
Audio Synthesis: Utilizing Google Cloud Text-to-Speech APIs for high-quality audio generation
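To give a feel for the content extraction step, here is a rough sketch that uses Jina's Reader endpoint, which returns a cleaned version of a page when the page URL is appended to `https://r.jina.ai/`. The helper names `reader_url` and `fetch_clean_text` are my own for illustration, and a real setup may also need request headers such as an API key, which this sketch leaves out.

```python
import urllib.request

# Jina's Reader endpoint: the target URL is appended to this prefix.
JINA_READER_PREFIX = "https://r.jina.ai/"


def reader_url(article_url: str) -> str:
    """Build the Reader URL for an article (hypothetical helper)."""
    return JINA_READER_PREFIX + article_url.strip()


def fetch_clean_text(article_url: str, timeout: float = 30.0) -> str:
    """Fetch the cleaned article text. Needs network access; not called below."""
    with urllib.request.urlopen(reader_url(article_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


print(reader_url("https://example.com/article"))
# prints: https://r.jina.ai/https://example.com/article
```

The returned text can then be handed to the script generation step as-is.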
Demo
Generated podcast for The Right Kind of Stubborn by Paul Graham
Generated podcast for Manage your priorities and energy by @lethai
What's Next on the Roadmap
The project is evolving rapidly, with several key developments planned:
Code Architecture Refinement
Moving from Jupyter notebooks to Python modules
Usability Improvements
Developing a flexible CLI interface
Creating YAML-based configuration profiles for different kinds of podcasts
Supporting different content transformation strategies
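As a rough idea of how a CLI and configuration profiles could fit together, here is a sketch using only the standard library. Everything in it is an assumption: the `PodcastProfile` fields, the `--profile` flag, and the profile names are hypothetical, and the profiles are hard-coded as dictionaries of the kind a YAML parser would return, so the example runs without extra dependencies.

```python
import argparse
from dataclasses import dataclass


@dataclass
class PodcastProfile:
    """Hypothetical profile fields; the real schema may look different."""
    name: str
    tone: str = "conversational"
    speakers: int = 2
    max_minutes: int = 10


def profile_from_dict(data: dict) -> PodcastProfile:
    """Build a profile from a mapping, e.g. the result of parsing a YAML file."""
    return PodcastProfile(**data)


def build_parser() -> argparse.ArgumentParser:
    """Define a minimal command line: one article URL plus a profile name."""
    parser = argparse.ArgumentParser(prog="gyandex")
    parser.add_argument("url", help="article to turn into an episode")
    parser.add_argument("--profile", default="default", help="profile name to use")
    return parser


# Profiles as they might look after loading YAML (hard-coded here).
PROFILES = {
    "default": {"name": "default"},
    "deep-dive": {"name": "deep-dive", "tone": "analytical", "max_minutes": 25},
}

# Parse an explicit argument list instead of sys.argv so the sketch is repeatable.
args = build_parser().parse_args(["https://example.com/post", "--profile", "deep-dive"])
profile = profile_from_dict(PROFILES[args.profile])
print(profile)
```

Unset fields fall back to the dataclass defaults, so a profile only needs to list what it changes.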
Podcast Integration
Implementing podcast feed generation
Adding metadata management
Ensuring compatibility with popular podcast apps
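Feed generation could start from something like the sketch below, which builds a bare RSS 2.0 document where each episode's audio file is attached through an `enclosure` element. The function name `build_feed`, the episode dictionary keys, and all URLs are made up for illustration; podcast apps typically expect additional tags (for example from the iTunes namespace) that are not covered here.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(title: str, link: str, description: str, episodes: list) -> str:
    """Return a minimal RSS 2.0 feed as a string (hypothetical helper)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description

    for ep in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ep["title"]
        ET.SubElement(item, "guid").text = ep["audio_url"]
        # RSS dates use the RFC 822 style that format_datetime produces.
        ET.SubElement(item, "pubDate").text = format_datetime(ep["published"])
        # The enclosure element points podcast clients at the audio file.
        ET.SubElement(
            item,
            "enclosure",
            url=ep["audio_url"],
            length=str(ep["size_bytes"]),
            type="audio/mpeg",
        )
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_feed(
    title="gyandex demo feed",
    link="https://example.com/podcast",
    description="Articles turned into audio",
    episodes=[
        {
            "title": "Episode 1",
            "audio_url": "https://example.com/audio/ep1.mp3",
            "size_bytes": 1234567,
            "published": datetime(2024, 10, 1, tzinfo=timezone.utc),
        }
    ],
)
print(feed_xml)
```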
Follow the Journey
Check out the next article in this series
This is just the beginning of what I hope will become a valuable tool. I'm building this in public because I believe in the power of community feedback and collaborative learning.
Whether you're interested in Generative AI, content transformation, or just looking for better ways to consume information, I invite you to follow along with this project.
Stay tuned for more updates as we continue to develop and refine gyandex!