How to Create Your Own Ghibli-Style Podcast!

10 min read · Mar 31, 2025

Drawing inspiration from NotebookLM and OpenAI’s Ghibli-style art, this guide explores how to create a captivating podcast on any topic, infused with the charm and aesthetics of the Ghibli style.

Let us dive into creating engaging and interactive content for a solo podcast using Generative AI. Once we have picked a topic, we can use several AI tools to generate the script, the voiceover, the artwork, and the final video.

The following steps will demonstrate how to do this:

  1. Choose a Podcast Topic
  2. Set Up the Notebook
  3. Gather Content
  4. Data Cleaning and Chunking for LLMs
  5. Generate the Podcast Script
  6. Add Voice to the Podcast Script
  7. Generate an Image for the podcast
  8. Bring the Podcast Together

1. Choose a Podcast Topic

The first step in creating a podcast is selecting a topic and gathering content to serve as its foundation. In this case, we’ll use Wikipedia to build the text corpus, although content can be sourced in many other ways.

For this tutorial, as a sports enthusiast, I have chosen the IPL (Indian Premier League) as a topic for my podcast, as it is currently ongoing and a major event in my country. However, you can choose any topic that interests you.

2. Set Up the Notebook

In this tutorial:

  • Code:
    To code, we’ll be using a Google Colab notebook, but you can use any Jupyter notebook environment.
  • Content:
    To generate the podcast content, we will be using the OpenAI GPT-4o mini model. To use this model, we will need an API key. If you’ve already signed up with OpenAI, you can generate a new key from the profile section or use any existing key.
[Image: view API-Keys]

If you haven’t signed up yet, you can use the URL below to create an OpenAI account and then generate a key from the profile section.

https://platform.openai.com/docs/overview

  • Voiceover:
    To generate a voiceover for the podcast script, we’ll use ElevenLabs with the Multilingual V2 model. Like OpenAI, ElevenLabs requires an API key to access its models. If you’re already signed up, you can generate a new key from the profile section or use an existing one.
[Image: elevenlabs api-keys]

To set up the notebook, perform the following steps:

Step 2.1: Open a new notebook and write a brief description of the activity or the topic of your podcast.

[Image: notebook-setup]

Step 2.2: Once done, connect to the runtime.
Note: In Google Colab, a “runtime” is an environment where your code runs.

Step 2.3: Run the following code block in the notebook to store the OpenAI and ElevenLabs API keys, which we will use later.

from getpass import getpass

api_key = getpass('Enter your OPENAI_API_KEY')

eleven_labs_api_key = getpass('Enter your ELEVEN_LABS_API_KEY')

[Image: store api keys]
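If you rerun the notebook often, retyping the keys gets tedious. Here is a minimal sketch of one alternative, assuming the key may already be present as an environment variable. The helper name `load_key` and the `DEMO_API_KEY` variable are hypothetical and only used for illustration.

```python
import os
from getpass import getpass


def load_key(name: str) -> str:
    """Return the key from the environment, or prompt for it once."""
    value = os.environ.get(name)
    if not value:
        value = getpass(f"Enter your {name}: ")
        os.environ[name] = value  # cache for later cells in this session
    return value


# Demo with a placeholder variable so that no prompt appears.
os.environ["DEMO_API_KEY"] = "demo-value"
demo_key = load_key("DEMO_API_KEY")
print(len(demo_key))
```

In the notebook, the same helper could be called with the real variable names, for example `load_key("OPENAI_API_KEY")`.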

3. Gather Content

Wikipedia is a good starting point for our text corpus as it is a free online encyclopedia and covers an extensive range of subjects such as history, science, current events, celebrities, and much more. Using the Wikipedia library, we can get all the details of a particular Wiki page based on the title or the Wiki ID.

Note: Wikipedia is a useful tool for general information, but it should not be relied upon as the sole or primary source for in-depth, high-stakes research.

To proceed with this tutorial, we need to install several libraries in the Colab notebook.

Step 3.1: Install LangChain and required libraries

Install all the required libraries using the following command.

!pip install wikipedia tiktoken langchain_community langchain_experimental langchain_openai langgraph elevenlabs pydub --quiet

Step 3.2: Load Wikipedia Document Using LangChain

With the libraries installed, we can gather content about the IPL (Indian Premier League) from Wikipedia using LangChain’s WikipediaLoader. LangChain provides a streamlined way to load, process, and query data from various sources, including Wikipedia.

The loader takes a search query and a maximum number of documents to load, as shown below.

from langchain_community.document_loaders import WikipediaLoader

docs = WikipediaLoader(query="Indian_Premier_League", load_max_docs=3).load()

To read more about the LangChain WikipediaLoader, check the following link.

[Image: Download data from Wikipedia]

We can’t provide all the downloaded data to the LLM at once because of context length limitations. Therefore, we need to split the data into smaller, logical chunks that fit within these constraints. Each LLM has its own context length, which determines how much text it can handle at a time. Let’s see how to do this in the next step.

4. Data Cleaning and Chunking for LLMs

Before training any machine learning model, data cleaning is essential. Similarly, before providing data as input to an LLM, we must perform both data cleaning and chunking.

What is Data Chunking?

Data chunking is a strategy to work around the context window limitation of Large Language Models (LLMs), ensuring that longer texts can be effectively processed. It involves:

  • Splitting text into sections that fit within the model’s context window.
  • Maintaining inter-chunk dependencies to preserve narrative flow.
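The two ideas above can be illustrated with a minimal sketch, assuming a plain character-based splitter. The function name `chunk_text` and the `chunk_size` and `overlap` parameters are hypothetical and not part of LangChain. The overlap is just one simple way to carry context across chunk boundaries.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of at most chunk_size characters.

    Consecutive windows share `overlap` characters so that context
    from the end of one chunk carries into the start of the next.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "The IPL is a professional Twenty20 cricket league in India. " * 10
chunks = chunk_text(sample, chunk_size=120, overlap=30)
print(len(chunks), [len(c) for c in chunks])
```

Real splitters usually count tokens rather than characters, since the context window is measured in tokens.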

Using LangChain’s SemanticChunker

To efficiently split the Wikipedia-derived text corpus, we will use LangChain’s SemanticChunker. It splits the text at points where the meaning shifts, so each chunk stays coherent. SemanticChunker needs an embeddings model to compare neighbouring sentences. In this case, we will use OpenAI embeddings, but you can use other embeddings as well.

Below is the reference code used to split the documents using SemanticChunker.

from langchain_experimental.text_splitter import SemanticChunker
from langchain_openai.embeddings import OpenAIEmbeddings

text_splitter = SemanticChunker(
    OpenAIEmbeddings(api_key=api_key), breakpoint_threshold_type="gradient"
)
split_docs = text_splitter.split_documents(docs)
len(split_docs)

[Image: Data Chunking]
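To build some intuition for what semantic splitting does, here is a simplified sketch. It is not the LangChain implementation. It assumes a toy bag-of-words vector in place of real embeddings and a fixed distance threshold in place of the "gradient" breakpoint type used above. The names `embed`, `cosine_distance`, and `semantic_chunks` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(sentence: str) -> Counter:
    # Toy stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine_distance(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return 1.0 - (dot / norm if norm else 0.0)


def semantic_chunks(text: str, threshold: float = 0.8) -> list[str]:
    # Split into sentences, then start a new chunk wherever two
    # neighbouring sentences are further apart than `threshold`.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    if not sentences:
        return []
    vectors = [embed(s) for s in sentences]
    chunks, current = [], [sentences[0]]
    for prev, vec, sentence in zip(vectors, vectors[1:], sentences[1:]):
        if cosine_distance(prev, vec) > threshold:
            chunks.append(" ".join(current))
            current = []
        current.append(sentence)
    chunks.append(" ".join(current))
    return chunks


demo_text = (
    "Cricket teams play matches in a league. "
    "The league ranks teams by matches won. "
    "Podcasts need a clear audio script. "
    "A good script keeps the audio engaging."
)
demo_chunks = semantic_chunks(demo_text)
for chunk in demo_chunks:
    print(chunk)
```

The two cricket sentences share vocabulary and stay together, while the switch to podcasting crosses the threshold and starts a new chunk.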

5. Generate the Podcast Script

To create a compelling podcast script, we need to identify key points from the content that would be engaging for the audience. Once we’ve gathered these interesting discussion points, we should structure them in a format suitable for voiceover narration.

Here, we’ll use LangGraph’s MapReduce pattern to summarize the content in a structured format that can be effectively used for our podcast.

The following code creates the map and reduce chains separately using ChatPromptTemplate, which will later be used in LangGraph.

from langchain.chat_models import init_chat_model
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
import os
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = api_key

llm = init_chat_model("gpt-4o-mini", model_provider="openai", temperature=0.7)

# Map
map_template = """
You are a sports enthusiast doing research for a podcast. Your task is to extract relevant information from the Result below.
Please identify 3 interesting highlights which can be used for a podcast discussion.
The identified highlights should be returned in the following format.
- Highlight 1 from the text
- Highlight 2 from the text
- Highlight 3 from the text
Result: {docs}"""

# Reduce
reduce_template = """
You are a writer creating the script for another episode of the podcast "Sport 101" hosted by \"Tom\".
Make the podcast casual, engaging and informative.
Extract relevant information for the solo podcast from the Result below.
Use \"Tom\" as the person speaking and sharing insights about the topic with the audience.
Use the below format for the solo podcast.
1. Introduce the topic and welcome everyone to another episode of the podcast "Sport 101".
2. Introduce the speaker in brief.
3. Then start the podcast.
4. Start the podcast with some casual discussion like what he is doing right now at this moment.
5. End the podcast with a thank-you speech to everyone.
6. Do not use the word \"conversation\" in the response.
7. Do not use the word \"Introduction\" in the response.
8. Do not use the word \"Podcast Script\" in the response.
Result: "{doc_summaries}"
"""

map_prompt = ChatPromptTemplate([("human", map_template)])
reduce_prompt = ChatPromptTemplate([("human", reduce_template)])

map_chain = map_prompt | llm | StrOutputParser()
reduce_chain = reduce_prompt | llm | StrOutputParser()

Here, the map chain is used to extract interesting facts from the content, while the reduce chain helps generate the podcast script in the desired format for our IPL podcast.

Below is a sample LangGraph implementation using the map and reduce chains created above. The graph includes a node for generating summaries, which is applied to a list of input documents. This output then flows into a second node that generates the final summary.

import operator
from typing import Annotated, List, TypedDict
from langgraph.constants import Send
from langgraph.graph import END, START, StateGraph

# This will be the overall state of the main graph.
# It will contain the input document contents, corresponding
# summaries, and a final summary.
class OverallState(TypedDict):
    # Notice here we use the operator.add
    # This is because we want to combine all the summaries we generate
    # from individual nodes back into one list - this is essentially
    # the "reduce" part
    contents: List[str]
    summaries: Annotated[list, operator.add]
    final_summary: str


# This will be the state of the node that we will "map" all
# documents to in order to generate summaries
class SummaryState(TypedDict):
    content: str


# Here we generate a summary, given a document
async def generate_summary(state: SummaryState):
    response = await map_chain.ainvoke(state["content"])
    return {"summaries": [response]}


# Here we define the logic to map out over the documents
# We will use this as an edge in the graph
def map_summaries(state: OverallState):
    # We will return a list of `Send` objects
    # Each `Send` object consists of the name of a node in the graph
    # as well as the state to send to that node
    return [
        Send("generate_summary", {"content": content}) for content in state["contents"]
    ]

# Here we will generate the final summary
async def generate_final_summary(state: OverallState):
    response = await reduce_chain.ainvoke(state["summaries"])
    return {"final_summary": response}


# Construct the graph: here we put everything together to construct our graph
graph = StateGraph(OverallState)
graph.add_node("generate_summary", generate_summary)
graph.add_node("generate_final_summary", generate_final_summary)
graph.add_conditional_edges(START, map_summaries, ["generate_summary"])
graph.add_edge("generate_summary", "generate_final_summary")
graph.add_edge("generate_final_summary", END)
app = graph.compile()

[Image: LangGraph MapReduce Document chain]
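The map-reduce flow can also be sketched without LangGraph or an LLM, which makes the data flow easier to see. This is a minimal illustration. `fake_map`, `fake_reduce`, and `run_map_reduce` are hypothetical stand-ins for the chains and the graph, and `asyncio.gather` plays the role of the fan-out that the `Send` objects perform.

```python
import asyncio
import operator
from functools import reduce


async def fake_map(chunk: str) -> list[str]:
    # Stand-in for map_chain: return a one-item list, like {"summaries": [response]}.
    await asyncio.sleep(0)  # simulate an awaited LLM call
    return [f"highlight from: {chunk}"]


def fake_reduce(summaries: list[str]) -> str:
    # Stand-in for reduce_chain: merge all highlights into one script.
    return "\n".join(summaries)


async def run_map_reduce(chunks: list[str]) -> str:
    # Map: process every chunk concurrently.
    partial_lists = await asyncio.gather(*(fake_map(c) for c in chunks))
    # Combine: operator.add concatenates the per-chunk lists, which mirrors
    # how Annotated[list, operator.add] merges node outputs in the state.
    summaries = reduce(operator.add, partial_lists, [])
    # Reduce: a single call over the combined list.
    return fake_reduce(summaries)


script = asyncio.run(run_map_reduce(["chunk A", "chunk B", "chunk C"]))
print(script)
```

Inside a notebook, where an event loop is already running, use `await run_map_reduce(...)` instead of `asyncio.run(...)`.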

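Let’s now invoke the graph to get the desired output.

# Call the graph and keep the final script in podcastScript for the voiceover step
async for step in app.astream({"contents": [doc.page_content for doc in split_docs]}):
    print(step)
    if "generate_final_summary" in step:
        podcastScript = step["generate_final_summary"]["final_summary"]

[Image: Generate summary using LangGraph]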

Our podcast script is now ready! Next, we need to add a voiceover. Let’s see how we can do that using ElevenLabs.

6. Add Voice to the Podcast Script

First, let’s select a voice for our podcast. We can find a list of available voices in the voice section on the ElevenLabs website. Here, we’ll use the ‘Sports Guy’ voice, but feel free to choose any voice that suits your preference.

[Image: voice selection]

Note: While ElevenLabs offers some free credits, they are only sufficient for short scripts. For longer scripts, additional credits must be purchased to generate the voiceover.

We will use the helper function below to convert the text to voice using ElevenLabs.

from elevenlabs.client import ElevenLabs

client = ElevenLabs(
    api_key=eleven_labs_api_key,
)

def createPodcast(podcastScript, speakerChoice):
    genPodcast = []
    # Treat each blank-line-separated paragraph as one voiceover segment
    podcastLines = podcastScript.split('\n\n')
    for line in podcastLines:
        if not line.strip():
            continue  # skip empty paragraphs
        print(line)
        # If client.generate is unavailable in your installed elevenlabs version,
        # check the SDK's client.text_to_speech.convert method instead
        genVoice = client.generate(text=line, voice=speakerChoice, model="eleven_multilingual_v2")
        # Iterate through the generator and collect the audio bytes
        audio_bytes = bytearray()
        for chunk in genVoice:
            audio_bytes.extend(chunk)
        genPodcast.append(audio_bytes)  # one bytearray per paragraph
    return genPodcast

speakerName1 = "Tom"
speakerChoice1 = "Sports Guy - Excited and fast, giving you that play by play."
genPodcast = createPodcast(podcastScript, speakerChoice1)

[Image: Convert text-to-voice]

Once we execute the script above, the audio for each paragraph of the script is generated and held in memory in genPodcast. We can then save it to a file and download it for later use.
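Note that createPodcast returns a list of in-memory audio segments, so they need to be written to disk before downloading. Here is a minimal sketch of that step. It assumes the segments are MP3 bytes and that joining them back to back is good enough for playback. The helper name `save_podcast` and the file names are hypothetical. pydub, installed earlier, could be used instead for cleaner joins.

```python
from pathlib import Path


def save_podcast(segments, output_path="podcast.mp3"):
    """Join audio segments (bytes or bytearray) and write them to one file."""
    audio = b"".join(bytes(segment) for segment in segments)
    path = Path(output_path)
    path.write_bytes(audio)
    return path


# Demo with placeholder bytes; in the notebook, pass genPodcast instead.
demo_segments = [bytearray(b"part-one"), bytearray(b"part-two")]
saved = save_podcast(demo_segments, "demo_podcast.bin")
print(saved, saved.stat().st_size)
```

In Colab, the saved file then appears in the file browser, from where it can be downloaded.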

Now our podcast is ready with a voiceover. Next, we need to generate an image in Ghibli style for our podcast.

7. Generate an Image for the podcast

This step is simple — choose any podcast image as a reference and use ChatGPT to transform it into a Ghibli-style illustration.

Here is my Ghibli-style podcast image generated using ChatGPT.

[Image: Ghibli-style image]

Here is the prompt we used for the image transformation above.

Convert the image into and anime style using immersive realism similar at 99.99% to Studio Ghiblis style but not the same but similar so it’s not infrigment

8. Bring the Podcast Together

This is the most exciting part of our podcast creation! We have our podcast audio ready, along with a Ghibli-style podcast image. Now, let’s use Hedra to generate the podcast video.

Hedra Studio is a content creation platform that allows users to generate high-quality videos, images, and audio. It enables video creation using an image and an audio script.

Note: Although Hedra provides some free credits, they might not be sufficient to generate a full podcast video. If we need to create the entire video, we may have to purchase additional credits from Hedra.

[Image: Hedra Studio]

The prompt that we used in Hedra: A sports enthusiast speaking on a solo podcast.

Hedra takes some time to generate the video. Once done, we can download and share our podcast video.

Below is the video generated using the prompt above.

If you’d like to access the complete notebook, please refer to the repository provided below.

That’s all, folks! Looking forward to seeing the creative tasks you accomplish with these LLM advancements.

Written by Suman Das

Tech Enthusiast, Software Engineer