Creating a Friends Script Generator with Deep Learning
Friends Sequential Network Generator
Although I have worked with convolutional neural networks in the past, I wanted to expand my experience in deep learning. My first project idea was to create a script generator using a simple sequential neural network. While there is a lot of calculus and linear algebra involved in deep learning, I won't include much of it in this article; I'll likely cover it in a later post. To help me along the way, I used this TensorFlow tutorial article as a guide for the basic model structure. The repo for this post can be found here.
I was eager to start creating my network, but before I could, I had to collect my data. In deep learning, while the model is important, both the quality and the amount of data available can greatly help or hurt your results. I found a website with the full text of every script, but the scripts were nested: each one lives on its own page, reached through a hyperlink on the main page. I could simply copy each script from each page by hand, but that would take forever. Since we are programmers, I wanted to write a script that would automate it for me.
The obvious language of choice was Python, so I could use the BeautifulSoup library, which is great for web scraping.
First we import all the necessary packages and get the current project directory.
# Import Packages
import httplib2
from bs4 import BeautifulSoup, SoupStrainer
import urllib.request
import os
# root directory of the project
root_path = os.path.dirname(os.path.realpath(__file__))
Next, we declare a new HTTP object and issue a request, storing both the status and the response.
# Declare an http object
http = httplib2.Http()
# Get the status and response from the webpage
status, response = http.request('https://fangj.github.io/friends/')
To store our output, we create a list that will hold all the hyperlinks on the main page pointing to the individual scripts, and we open the output file.
# Create a list
links = []
# Open the output file in the dataset directory in append mode
file1 = open(root_path + "/dataset/dataset.txt","a", encoding="utf-8")
Now that we’ve handled the boilerplate, we can actually start parsing the HTML and grabbing the text we need. First we declare the type of parser we want to use, which will be the built-in HTML parser. Next, we open the base website that lists the hyperlinks to the Friends scripts. Finally, we pass everything to a BeautifulSoup object to begin parsing.
# Open the webpage, declaring the right decoder and parser
parser = 'html.parser' # or 'lxml' (preferred) or 'html5lib', if installed
resp = urllib.request.urlopen("https://fangj.github.io/friends/")
soup = BeautifulSoup(resp, parser, from_encoding=resp.info().get_param('charset'))
We then loop over all the hyperlinks on the main page and append them to the links list we declared earlier.
# Add hyperlinks to a list
for link in soup.find_all('a', href=True):
    links.append("https://fangj.github.io/friends/" + link['href'])
At last we parse each page's HTML and extract the text.
# Open each webpage, read all text, then write the text to a file
for i in range(0, len(links)):
    resp = urllib.request.urlopen(links[i])
    print("Reading " + links[i])
    soup = BeautifulSoup(resp, features="lxml")
    txt = soup.get_text()
    file1.write(txt)
# Close the file
file1.close()
Below is the full code for building the Friends script dataset.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
This python script is used as a web scraper to gather all Friends scripts through various links
found at this website https://fangj.github.io/friends/
"""
# Import Packages
import httplib2
from bs4 import BeautifulSoup, SoupStrainer
import urllib.request
import os
# root directory of the project
root_path = os.path.dirname(os.path.realpath(__file__))
# Declare an http object
http = httplib2.Http()
# Get the status and response from the webpage
status, response = http.request('https://fangj.github.io/friends/')
# Create a list
links = []
# Open the output file in the dataset directory in append mode
file1 = open(root_path + "/dataset/dataset.txt","a", encoding="utf-8")
# Open the webpage, declaring the right decoder and parser
parser = 'html.parser' # or 'lxml' (preferred) or 'html5lib', if installed
resp = urllib.request.urlopen("https://fangj.github.io/friends/")
soup = BeautifulSoup(resp, parser, from_encoding=resp.info().get_param('charset'))
# Add hyperlinks to a list
for link in soup.find_all('a', href=True):
    links.append("https://fangj.github.io/friends/" + link['href'])
# Open each webpage, read all text, then write the text to a file
for i in range(0, len(links)):
    resp = urllib.request.urlopen(links[i])
    print("Reading " + links[i])
    soup = BeautifulSoup(resp, features="lxml")
    txt = soup.get_text()
    file1.write(txt)
# Close the file
file1.close()
Now we have one very long text file containing all the Friends scripts. With that done, we can begin creating our model to generate a random script.
To start off, we import all the necessary libraries and get the path of the current directory.
# Import necessary libraries
import tensorflow as tf
import numpy as np
import os
import time
# These are for generating a random logs folder
import string
import random
# Root directory of the project
root_path = os.path.dirname(os.path.realpath(__file__))
Although this feature isn't strictly necessary, depending on your GPU and the amount of VRAM it has, it's best to enable memory growth, which lets TensorFlow increase its GPU memory consumption only when needed rather than allocating a huge chunk of memory up front.
# Allow GPU Growth
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        print(e)
Next we declare some parameters. We declare them here so that if we want to adjust something in our model, we won't have to go hunting through the code to find it.
############################################################
# Settings
############################################################
# The maximum length sentence we want for a single input in characters
seq_length = 100
# Number of RNN units
rnn_units = 1024
# The embedding dimension
embedding_dim = 256
# Batch size
BATCH_SIZE = 65
# Number of epochs to run through
EPOCHS = 10
# Buffer size used when shuffling the dataset
BUFFER_SIZE = 100
# Steps per epoch
STEPS = 50
# Low temperatures result in more predictable text.
# Higher temperatures result in more surprising text.
# Experiment to find the best setting.
temperature = 1.0
# Number of characters to generate
num_generate = 200
# Initialization of loss value as a global variable
# to be used in multiple functions
loss = 0
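Since the temperature setting will matter later when we generate text, here is a quick standalone illustration (separate from the generator itself, with made-up logit values) of how dividing the logits by the temperature changes the distribution we sample from:
# Standalone illustration of temperature scaling with made-up logits
import tensorflow as tf
logits = tf.constant([[2.0, 1.0, 0.1]])
for t in (0.5, 1.0, 2.0):
    # Lower temperature -> sharper, more predictable distribution;
    # higher temperature -> flatter, more surprising distribution
    print(t, tf.nn.softmax(logits / t).numpy())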
To read in all of our data, we create a function that handles it for us. First we get the path to the big text file we saved earlier.
# Path to dataset
dataset_path = os.path.join(root_path, r"dataset/dataset.txt")
Next, we decode all of the text and store it in one long string.
# Read, then decode for py2 compat.
text = open(dataset_path, 'rb').read().decode(encoding='unicode_escape')
Next, we collect the unique characters in the text and store them, sorted, as our vocabulary.
# The unique characters in the file
vocab = sorted(set(text))
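For example, running the same idea on a small made-up string shows what this produces, a sorted list of the distinct characters:
# Quick illustration on a made-up string
print(sorted(set("coffee house")))  # [' ', 'c', 'e', 'f', 'h', 'o', 's', 'u']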
To vectorize the text, we create a dictionary mapping each character to an index and convert the whole text to those indices. We then slice the data into fixed-length sequences for our dataset. Finally, we split each sequence into an input and a target so the function can return the fully processed dataset.
# Creating a mapping from unique characters to indices
char2idx = {u:i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)
text_as_int = np.array([char2idx[c] for c in text])
# Create training examples / targets
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)
# Create the sequence from the dataset
sequences = char_dataset.batch(seq_length+1, drop_remainder=True)
def split_input_target(chunk):
    input_text = chunk[:-1]
    target_text = chunk[1:]
    return input_text, target_text
# Split the dataset up for better readability and passing into epochs and batch sizes
dataset = sequences.map(split_input_target)
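To make the input/target split concrete, here is a tiny illustration of the same slicing on a plain Python list (not part of the actual pipeline):
# Illustration: a 6-character chunk yields a 5-character input and a
# 5-character target shifted one position to the right
chunk = list("Friend")
print(chunk[:-1])  # ['F', 'r', 'i', 'e', 'n']  -> input
print(chunk[1:])   # ['r', 'i', 'e', 'n', 'd']  -> target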
Below is the entire create_dataset method.
############################################################
# Create Dataset
############################################################
def create_dataset():
    # Path to dataset
    dataset_path = os.path.join(root_path, r"dataset/dataset.txt")
    # Read, then decode for py2 compat.
    text = open(dataset_path, 'rb').read().decode(encoding='unicode_escape')
    # The unique characters in the file
    vocab = sorted(set(text))
    # Creating a mapping from unique characters to indices
    char2idx = {u:i for i, u in enumerate(vocab)}
    idx2char = np.array(vocab)
    text_as_int = np.array([char2idx[c] for c in text])
    # Create training examples / targets
    char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)
    # Create the sequence from the dataset
    sequences = char_dataset.batch(seq_length+1, drop_remainder=True)
    def split_input_target(chunk):
        input_text = chunk[:-1]
        target_text = chunk[1:]
        return input_text, target_text
    # Split the dataset up for better readability and passing into epochs and batch sizes
    dataset = sequences.map(split_input_target)
    return dataset, vocab, idx2char, char2idx
Although there are many model architectures available, each with its own uses, for simplicity and to create a baseline to build from, we use a sequential model.
First we create an embedding layer, which turns the individual characters from our dataset into dense vectors that the rest of the model can work with more effectively. Next comes a GRU layer, the most important layer of our model. Since we are generating text, the model needs to track the history of input characters so it can judge whether a combination of characters forms a valid word or matches the dataset. Similar to an LSTM, which is also used heavily in text generation, a GRU has gates that learn which past information to forget and which to keep as data flows through the model. Finally, we add a dense layer, the most commonly used layer in deep learning: a fully connected layer whose weights map the GRU output to one value (a logit) per character in the vocabulary.
############################################################
# Model
############################################################
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim,
                                  batch_input_shape=[batch_size, None]),
        tf.keras.layers.GRU(rnn_units,
                            return_sequences=True,
                            stateful=True,
                            recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model
Now we get to the meat of the model script, where we actually begin to train the model. First we shuffle and batch our dataset and get the length of the vocabulary we are passing into the model.
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
# Length of the vocabulary in chars
vocab_size = len(vocab)
Next, we build our model by passing our parameters declared earlier.
model = build_model(
vocab_size = vocab_size,
embedding_dim=embedding_dim,
rnn_units=rnn_units,
batch_size=BATCH_SIZE)
Next, we run one example batch through the untrained model to check the output shape, print the model summary, then sample from the resulting categorical distribution and squeeze away the extra dimension.
for input_example_batch, target_example_batch in dataset.take(1):
    example_batch_predictions = model(input_example_batch)
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")
# Print out the different layers of the model
model.summary()
# Sample from the untrained model's output distribution for the first example
sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()
One of the main quantities we need to keep track of is the loss of our model, which tells us how well the model is learning. To compute our loss, we use sparse categorical crossentropy.
def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)
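As a quick sanity check, this loss can also be called by hand on a single made-up prediction (the numbers below are arbitrary and only show the shape of the call):
# Arbitrary example: true character index 2 against raw logits for a
# 4-character vocabulary; from_logits=True applies softmax internally
import tensorflow as tf
labels = tf.constant([2])
logits = tf.constant([[0.1, 0.2, 3.0, 0.3]])
print(tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True).numpy())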
The optimizer we use in this example is called Adaptive Moment Estimation, or Adam. An extension of stochastic gradient descent (itself a variant of gradient descent), Adam adapts the step size for each parameter and helps drive the loss toward a minimum over the epochs.
model.compile(optimizer='adam', loss=loss)
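The 'adam' string uses the Keras defaults; if you want to tune the learning rate yourself, an equivalent compile call with an explicit optimizer instance looks like this (0.001 is simply the Keras default learning rate):
# Equivalent compile call with an explicit Adam optimizer instance
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001), loss=loss)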
To save our checkpoint and weight files, we must choose a path and enable checkpointing in our code. First we generate a random string so that no matter how many times we train the model, the chance of overwriting previous log files is very small. Next we set the checkpoint prefix and create a TensorFlow checkpoint callback.
# generate a random string 15 characters long
random_string = ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(15))
# Directory where the checkpoints will be saved
checkpoint_dir = './logs/' + random_string + "/"
# Name of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, "Model_{epoch}")
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_prefix,
    save_weights_only=True,
    save_freq='epoch')
We then save the model's initial weights and fit it. Once we call fit, training begins and weight files are written to the checkpoint directory built from the random string generated earlier. After training, we also save the full model to that same directory.
model.save_weights(checkpoint_prefix.format(epoch=0))
history = model.fit(dataset, steps_per_epoch=STEPS, epochs=EPOCHS, callbacks=[checkpoint_callback])
# Create a path for the saving location of the model
model_dir = checkpoint_dir + "model.h5"
# Save the model
# TODO: Known issue with saving the model and loading it back
# later in the script causes issues. Working to fix this issue
model.save(model_dir)
We then find the latest checkpoint, rebuild the model with a batch size of 1, load the checkpointed weights, and print the model summary to the console.
# Find the latest checkpoint in the checkpoint directory
latest = tf.train.latest_checkpoint(checkpoint_dir)
# Build the model with the dataset generated earlier
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
# Load the weights
model.load_weights(latest)
# Build the model
model.build(tf.TensorShape([1, None]))
# Print out the model summary
model.summary()
Below is all the code for the training of our model.
############################################################
# Train Model
############################################################
def train_model(dataset, vocab):
    # Buffer size to shuffle the dataset
    # (TF data is designed to work with possibly infinite sequences,
    # so it doesn't attempt to shuffle the entire sequence in memory. Instead,
    # it maintains a buffer in which it shuffles elements).
    dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
    # Length of the vocabulary in chars
    vocab_size = len(vocab)
    model = build_model(
        vocab_size=vocab_size,
        embedding_dim=embedding_dim,
        rnn_units=rnn_units,
        batch_size=BATCH_SIZE)
    for input_example_batch, target_example_batch in dataset.take(1):
        example_batch_predictions = model(input_example_batch)
        print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")
    # Print out the different layers of the model
    model.summary()
    # Sample from the untrained model's output distribution for the first example
    sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
    sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()
    def loss(labels, logits):
        return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)
    example_batch_loss = loss(target_example_batch, example_batch_predictions)
    print("Prediction shape: ", example_batch_predictions.shape, " # (batch_size, sequence_length, vocab_size)")
    model.compile(optimizer='adam', loss=loss)
    # generate a random string 15 characters long
    random_string = ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(15))
    # Directory where the checkpoints will be saved
    checkpoint_dir = './logs/' + random_string + "/"
    # Name of the checkpoint files
    checkpoint_prefix = os.path.join(checkpoint_dir, "Model_{epoch}")
    checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
        filepath=checkpoint_prefix,
        save_weights_only=True,
        save_freq='epoch')
    model.save_weights(checkpoint_prefix.format(epoch=0))
    history = model.fit(dataset, steps_per_epoch=STEPS, epochs=EPOCHS, callbacks=[checkpoint_callback])
    # Create a path for the saving location of the model
    model_dir = checkpoint_dir + "model.h5"
    # Save the model
    # TODO: Known issue with saving the model and loading it back
    # later in the script causes issues. Working to fix this issue
    model.save(model_dir)
    # Find the latest checkpoint in the checkpoint directory
    latest = tf.train.latest_checkpoint(checkpoint_dir)
    # Build the model with the dataset generated earlier
    model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
    # Load the weights
    model.load_weights(latest)
    # Build the model
    model.build(tf.TensorShape([1, None]))
    # Print out the model summary
    model.summary()
    # Return the model
    return model
The next and final method we will create generates our script from the trained model. Starting from a seed string, we repeatedly ask the model to predict which character should come after the last one, based on what it learned during training.
############################################################
# Generate Text
############################################################
def generate_text(model, idx2char, char2idx, start_string):
    # Evaluation step (generating text using the learned model)
    # Converting our start string to numbers (vectorizing)
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)
    # Empty array to store our results
    text_generated = []
    # Here batch size == 1
    model.reset_states()
    for i in range(num_generate):
        predictions = model(input_eval)
        # remove the batch dimension
        predictions = tf.squeeze(predictions, 0)
        # using a categorical distribution to predict the character returned by the model
        predictions = predictions / temperature
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()
        # We pass the predicted character as the next input to the model
        # along with the previous hidden state
        input_eval = tf.expand_dims([predicted_id], 0)
        text_generated.append(idx2char[predicted_id])
    return (start_string + ''.join(text_generated))
Now that all of our methods have been created, we can add some command-line arguments to choose which methods to call at a given time.
We support two commands, train and generate. For train, we create the dataset, train the model, and generate text from it. For generate, we again create the dataset, then load in the path to a previously saved weights file, build the model, and generate the text. The optional --start argument sets the seed string for generation.
############################################################
# Configure
############################################################
if __name__ == '__main__':
    import argparse
    # Parse command line arguments
    parser = argparse.ArgumentParser(
        description='Train or Generate.')
    parser.add_argument("command",
                        metavar="<command>",
                        help="'train' or 'generate'")
    parser.add_argument('--weights', required=False,
                        metavar="/path/to/weights",
                        help="Path to weights file")
    parser.add_argument('--start', required=False,
                        metavar="start of string",
                        help="The word that will begin the output string")
    args = parser.parse_args()
    # Use the --start argument as the seed string if it was supplied
    start_string = args.start if args.start else u"The "
    # Configurations
    if args.command == "train":
        # Load in the dataset and other functions to use
        dataset, vocab, idx2char, char2idx = create_dataset()
        # Train the model
        model = train_model(dataset, vocab)
        print(generate_text(model, idx2char, char2idx, start_string=start_string))
    if args.command == 'generate':
        # Load in the dataset and other functions to use
        dataset, vocab, idx2char, char2idx = create_dataset()
        # Get the model path
        model_path = os.path.join(root_path, args.weights)
        # Length of the vocabulary in chars
        vocab_size = len(vocab)
        # Build the model with the dataset generated earlier
        model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
        # Load the weights
        model.load_weights(model_path)
        # Build the model
        model.build(tf.TensorShape([1, None]))
        # Generate text
        print(generate_text(model, idx2char, char2idx, start_string=start_string))
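Assuming the script is saved as something like model.py (the filename is yours to choose), training and generation would then be launched roughly like this, where the logs folder name is the random string created during training and Model_10 is the checkpoint written after the final epoch:
python model.py train
python model.py generate --weights logs/<random_string>/Model_10 --start "Joey"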
Below is the entire code for the model python script.
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
The purpose of this python script is to both experiment with new deep learning techniques,
and to further my knowledge on the subject of deep learning.
I used this as a reference guide:
https://www.tensorflow.org/tutorials/text/text_generation
"""
# Import necessary libraries
import tensorflow as tf
import numpy as np
import os
import time
# These are for generating a random logs folder
import string
import random
# Root directory of the project
root_path = os.path.dirname(os.path.realpath(__file__))
# Allow GPU Growth
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
    except RuntimeError as e:
        print(e)
############################################################
# Settings
############################################################
# The maximum length sentence we want for a single input in characters
seq_length = 100
# Number of RNN units
rnn_units = 1024
# The embedding dimension
embedding_dim = 256
# Batch size
BATCH_SIZE = 65
# Number of epochs to run through
EPOCHS = 10
# Buffer size used when shuffling the dataset
BUFFER_SIZE = 100
# Steps per epoch
STEPS = 50
# Low temperatures result in more predictable text.
# Higher temperatures result in more surprising text.
# Experiment to find the best setting.
temperature = 1.0
# Number of characters to generate
num_generate = 200
# Initialization of loss value as a global variable
# to be used in multiple functions
loss = 0
############################################################
# Create Dataset
############################################################
def create_dataset():
    # Path to dataset
    dataset_path = os.path.join(root_path, r"dataset/dataset.txt")
    # Read, then decode for py2 compat.
    text = open(dataset_path, 'rb').read().decode(encoding='unicode_escape')
    # The unique characters in the file
    vocab = sorted(set(text))
    # Creating a mapping from unique characters to indices
    char2idx = {u:i for i, u in enumerate(vocab)}
    idx2char = np.array(vocab)
    text_as_int = np.array([char2idx[c] for c in text])
    # Create training examples / targets
    char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)
    # Create the sequence from the dataset
    sequences = char_dataset.batch(seq_length+1, drop_remainder=True)
    def split_input_target(chunk):
        input_text = chunk[:-1]
        target_text = chunk[1:]
        return input_text, target_text
    # Split the dataset up for better readability and passing into epochs and batch sizes
    dataset = sequences.map(split_input_target)
    return dataset, vocab, idx2char, char2idx
############################################################
# Model
############################################################
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim,
                                  batch_input_shape=[batch_size, None]),
        tf.keras.layers.GRU(rnn_units,
                            return_sequences=True,
                            stateful=True,
                            recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])
    return model
############################################################
# Train Model
############################################################
def train_model(dataset, vocab):
    # Buffer size to shuffle the dataset
    # (TF data is designed to work with possibly infinite sequences,
    # so it doesn't attempt to shuffle the entire sequence in memory. Instead,
    # it maintains a buffer in which it shuffles elements).
    dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
    # Length of the vocabulary in chars
    vocab_size = len(vocab)
    model = build_model(
        vocab_size=vocab_size,
        embedding_dim=embedding_dim,
        rnn_units=rnn_units,
        batch_size=BATCH_SIZE)
    for input_example_batch, target_example_batch in dataset.take(1):
        example_batch_predictions = model(input_example_batch)
        print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")
    # Print out the different layers of the model
    model.summary()
    # Sample from the untrained model's output distribution for the first example
    sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
    sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()
    def loss(labels, logits):
        return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)
    example_batch_loss = loss(target_example_batch, example_batch_predictions)
    print("Prediction shape: ", example_batch_predictions.shape, " # (batch_size, sequence_length, vocab_size)")
    model.compile(optimizer='adam', loss=loss)
    # generate a random string 15 characters long
    random_string = ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(15))
    # Directory where the checkpoints will be saved
    checkpoint_dir = './logs/' + random_string + "/"
    # Name of the checkpoint files
    checkpoint_prefix = os.path.join(checkpoint_dir, "Model_{epoch}")
    checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
        filepath=checkpoint_prefix,
        save_weights_only=True,
        save_freq='epoch')
    model.save_weights(checkpoint_prefix.format(epoch=0))
    history = model.fit(dataset, steps_per_epoch=STEPS, epochs=EPOCHS, callbacks=[checkpoint_callback])
    # Create a path for the saving location of the model
    model_dir = checkpoint_dir + "model.h5"
    # Save the model
    # TODO: Known issue with saving the model and loading it back
    # later in the script causes issues. Working to fix this issue
    model.save(model_dir)
    # Find the latest checkpoint in the checkpoint directory
    latest = tf.train.latest_checkpoint(checkpoint_dir)
    # Build the model with the dataset generated earlier
    model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
    # Load the weights
    model.load_weights(latest)
    # Build the model
    model.build(tf.TensorShape([1, None]))
    # Print out the model summary
    model.summary()
    # Return the model
    return model
############################################################
# Generate Text
############################################################
def generate_text(model, idx2char, char2idx, start_string):
    # Evaluation step (generating text using the learned model)
    # Converting our start string to numbers (vectorizing)
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)
    # Empty array to store our results
    text_generated = []
    # Here batch size == 1
    model.reset_states()
    for i in range(num_generate):
        predictions = model(input_eval)
        # remove the batch dimension
        predictions = tf.squeeze(predictions, 0)
        # using a categorical distribution to predict the character returned by the model
        predictions = predictions / temperature
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()
        # We pass the predicted character as the next input to the model
        # along with the previous hidden state
        input_eval = tf.expand_dims([predicted_id], 0)
        text_generated.append(idx2char[predicted_id])
    return (start_string + ''.join(text_generated))
############################################################
# Configure
############################################################
if __name__ == '__main__':
    import argparse
    # Parse command line arguments
    parser = argparse.ArgumentParser(
        description='Train or Generate.')
    parser.add_argument("command",
                        metavar="<command>",
                        help="'train' or 'generate'")
    parser.add_argument('--weights', required=False,
                        metavar="/path/to/weights",
                        help="Path to weights file")
    parser.add_argument('--start', required=False,
                        metavar="start of string",
                        help="The word that will begin the output string")
    args = parser.parse_args()
    # Use the --start argument as the seed string if it was supplied
    start_string = args.start if args.start else u"The "
    # Configurations
    if args.command == "train":
        # Load in the dataset and other functions to use
        dataset, vocab, idx2char, char2idx = create_dataset()
        # Train the model
        model = train_model(dataset, vocab)
        print(generate_text(model, idx2char, char2idx, start_string=start_string))
    if args.command == 'generate':
        # Load in the dataset and other functions to use
        dataset, vocab, idx2char, char2idx = create_dataset()
        # Get the model path
        model_path = os.path.join(root_path, args.weights)
        # Length of the vocabulary in chars
        vocab_size = len(vocab)
        # Build the model with the dataset generated earlier
        model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
        # Load the weights
        model.load_weights(model_path)
        # Build the model
        model.build(tf.TensorShape([1, None]))
        # Generate text
        print(generate_text(model, idx2char, char2idx, start_string=start_string))
Although deep learning can be intimidating, breaking it down into smaller pieces can help you understand it better. I didn't cover the math in this post, but it is worth reading up on, as it will give you a deeper understanding of how deep learning works.