Creating a Friends Script Generator with Deep Learning

Friends Sequential Network Generator

Although I have worked with convolutional neural networks in the past, I wanted to expand my experience in deep learning. My first project idea was to create a script generator using a simple sequential neural network. While there is a lot of calculus and linear algebra involved in deep learning, I won't cover much of it in this article; I'll likely create another post about the math later on. To help me along the way, I used this TensorFlow tutorial article as a guide for the basic model. The repo for this post can be found here.

I was eager to start creating my network, but before I could, I had to collect all my data. In deep learning, while the model is important, both the quality and the amount of available data can greatly help or hurt your results. I found a website with the full text of all the scripts, but they were nested: each script lives on its own page, reached by a hyperlink from the main page. I could simply copy each script from each page by hand, but that would take forever. Since we are programmers, I wanted to write a script that would automate it for me.

The obvious language of choice was Python, which let me use a library called BeautifulSoup that is great for web scraping.

First we import all necessary packages and get the current project directory.

# Import Packages
import httplib2
from bs4 import BeautifulSoup, SoupStrainer
import urllib.request
import os

# root directory of the project
root_path = os.path.dirname(os.path.realpath(__file__))

Next, we declare a new HTTP object and issue a request, storing both the status and the response.

# Declare an http object
http = httplib2.Http()

# Get the status and response from the webpage
status, response = http.request('https://fangj.github.io/friends/')

To store our output, we create a list that will hold all the hyperlinks on the main page pointing to the individual scripts, and we open the output file.

# Create a list
links = []

# Open the output file (in the dataset directory) in append mode
file1 = open(root_path + "/dataset/dataset.txt","a", encoding="utf-8") 

Now that we've done all the boilerplate, we can actually start parsing the HTML and grabbing the text we need. First we declare the type of parser we want to use, which will be the built-in html parser. Next, we open the base website containing all the hyperlinks to the Friends scripts. Finally, we pass all relevant objects to the BeautifulSoup constructor to begin parsing.

# Open the webpage, declaring the right decoder and parser
parser = 'html.parser'  # or 'lxml' (preferred) or 'html5lib', if installed
resp = urllib.request.urlopen("https://fangj.github.io/friends/")
soup = BeautifulSoup(resp, parser, from_encoding=resp.info().get_param('charset'))

We then loop over all hyperlinks on the main page and append them to the links list we declared earlier.

# Add hyperlinks to a list
for link in soup.find_all('a', href=True):
    links.append("https://fangj.github.io/friends/" + link['href'])

At last we parse each page's HTML and extract its text.

# Open each webpage, read all text, then write the text to a file
for link in links:
    resp = urllib.request.urlopen(link)
    print("Reading " + link)
    soup = BeautifulSoup(resp, features="lxml")
    txt = soup.get_text()
    file1.write(txt)

# Close the file
file1.close()

Below is the full code for how I get the Friends script dataset.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
    This python script is used as a web scraper to gather all Friends scripts through various links
    found at this website https://fangj.github.io/friends/
"""

# Import Packages
import httplib2
from bs4 import BeautifulSoup, SoupStrainer
import urllib.request
import os

# root directory of the project
root_path = os.path.dirname(os.path.realpath(__file__))

# Declare an http object
http = httplib2.Http()

# Get the status and response from the webpage
status, response = http.request('https://fangj.github.io/friends/')

# Create a list
links = []

# Open the output file (in the dataset directory) in append mode
file1 = open(root_path + "/dataset/dataset.txt","a", encoding="utf-8") 

# Open the webpage, declaring the right decoder and parser
parser = 'html.parser'  # or 'lxml' (preferred) or 'html5lib', if installed
resp = urllib.request.urlopen("https://fangj.github.io/friends/")
soup = BeautifulSoup(resp, parser, from_encoding=resp.info().get_param('charset'))

# Add hyperlinks to a list
for link in soup.find_all('a', href=True):
    links.append("https://fangj.github.io/friends/" + link['href'])

# Open each webpage, read all text, then write the text to a file
for link in links:
    resp = urllib.request.urlopen(link)
    print("Reading " + link)
    soup = BeautifulSoup(resp, features="lxml")
    txt = soup.get_text()
    file1.write(txt)

# Close the file
file1.close()

Now we have one very long text file with all the Friends scripts in it. Once this is done, we can begin creating our model to generate a random script.
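
As a quick sanity check (a minimal sketch, assuming the scraper above wrote its output to dataset/dataset.txt), we can print how many characters we collected and preview the beginning of the file:

# Hypothetical sanity check, not part of the original repo
import os

root_path = os.path.dirname(os.path.realpath(__file__))
text = open(os.path.join(root_path, "dataset/dataset.txt"), 'rb').read().decode(encoding='unicode_escape')
print("Length of text: {} characters".format(len(text)))
print(text[:250])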

To start off, we import all necessary libraries and get the path of our current directory.

# Import necessary libraries
import tensorflow as tf
import numpy as np
import os
import time

# These are for generating a random logs folder
import string
import random

# Root directory of the project
root_path = os.path.dirname(os.path.realpath(__file__))

Although it's not strictly necessary to include this feature, depending on your GPU and the amount of VRAM it has, it's best to enable memory growth, which allows the GPU to increase its memory consumption only when needed, rather than allocating a huge chunk of memory automatically.

#Allow GPU Growth
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  try:
    for gpu in gpus:
      tf.config.experimental.set_memory_growth(gpu, True)
  except RuntimeError as e:
    print(e)

Next we declare some parameters. We declare them all here so that if we want to adjust something in our model, we won't have to go hunting through the code to find it.

############################################################
#  Settings
############################################################
# The maximum length sentence we want for a single input in characters
seq_length = 100

# Number of RNN units
rnn_units = 1024

# The embedding dimension
embedding_dim = 256

# Batch size
BATCH_SIZE = 65

# Number of epochs to run through
EPOCHS = 10

# Buffer size used when shuffling the dataset
BUFFER_SIZE = 100

# Steps per epoch
STEPS = 50

# Low temperatures result in more predictable text.
# Higher temperatures result in more surprising text.
# Experiment to find the best setting.
temperature = 1.0

# Number of characters to generate
num_generate = 200

# Initialization of loss value as a global variable
# to be used in multiple functions
loss = 0
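
For intuition about what these settings imply (a quick sketch with a hypothetical corpus size, since the real length depends on the scrape), we can estimate how many training examples and batches they produce:

# Hypothetical corpus of 5 million characters
corpus_chars = 5_000_000
# Each training example consumes seq_length + 1 characters
examples = corpus_chars // (seq_length + 1)
batches = examples // BATCH_SIZE
print(examples, batches)  # roughly 49504 examples, 761 batches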

To read in all of our data, we create a function that handles these steps for us. First we get the path to the big text file we saved earlier.

  # Path to dataset
  dataset_path = os.path.join(root_path, r"dataset/dataset.txt")

Next, we read the file, decode it, and store the text in one long string.

  # Read, then decode for py2 compat.
  text = open(dataset_path, 'rb').read().decode(encoding='unicode_escape')

Next we collect the unique characters in the text into a sorted list; this is the vocabulary our model will work with.

  # The unique characters in the file
  vocab = sorted(set(text))

To vectorize the text, we map each character to an index in a dictionary (and keep the reverse mapping), then convert the whole text to an array of integers. We then slice the data into sequences of seq_length + 1 characters and split each sequence into an input and a target, which is the same sequence shifted forward by one character. Finally, we return the processed dataset along with the vocabulary and both lookup tables.

  # Creating a mapping from unique characters to indices
  char2idx = {u:i for i, u in enumerate(vocab)}
  idx2char = np.array(vocab)

  text_as_int = np.array([char2idx[c] for c in text])

  # Create training examples / targets
  char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)

  # Create the sequence from the dataset
  sequences = char_dataset.batch(seq_length+1, drop_remainder=True)

  def split_input_target(chunk):
    input_text = chunk[:-1]
    target_text = chunk[1:]
    return input_text, target_text
   
  # Map each sequence to an (input, target) pair
  dataset = sequences.map(split_input_target)
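
To make the split concrete, here is a minimal sketch (using a hypothetical toy string rather than the real dataset) of what split_input_target does to one sequence:

# Toy example: a chunk of 5 characters becomes a 4-character input and target
chunk = list("Hello")
input_text, target_text = chunk[:-1], chunk[1:]
print(input_text)   # ['H', 'e', 'l', 'l']
print(target_text)  # ['e', 'l', 'l', 'o']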

Below is the entire create_dataset function.

############################################################
#  Create Dataset
############################################################

def create_dataset():
  # Path to dataset
  dataset_path = os.path.join(root_path, r"dataset/dataset.txt")

  # Read, then decode for py2 compat.
  text = open(dataset_path, 'rb').read().decode(encoding='unicode_escape')

  # The unique characters in the file
  vocab = sorted(set(text))

  # Creating a mapping from unique characters to indices
  char2idx = {u:i for i, u in enumerate(vocab)}
  idx2char = np.array(vocab)

  text_as_int = np.array([char2idx[c] for c in text])

  # Create training examples / targets
  char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)

  # Create the sequence from the dataset
  sequences = char_dataset.batch(seq_length+1, drop_remainder=True)

  def split_input_target(chunk):
    input_text = chunk[:-1]
    target_text = chunk[1:]
    return input_text, target_text
   
  # Map each sequence to an (input, target) pair
  dataset = sequences.map(split_input_target)

  return dataset, vocab, idx2char, char2idx

Although there are a ton of models available with a ton of uses, for simplicity, and to create a baseline to build from, we use a sequential model.

First we create an embedding layer, which turns the individual characters from our dataset into dense vectors that can be passed more effectively through the rest of our model. Next, we use a GRU layer, which is the critical and most important layer of our model. Since we are creating a model that generates text, we need to keep track of the history of input characters so we can learn whether a combination of characters makes a valid word or matches the dataset. Similar to an LSTM, which is also used heavily in text generation models, a GRU has gates that learn which data to ignore and which to keep based on the history fed through the model. Finally, we add a dense layer, the most commonly used layer in deep learning, which connects every input node to every output node with learned weights and maps the GRU output to one logit per character in the vocabulary.

############################################################
#  Model
############################################################

def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.GRU(rnn_units,
                        return_sequences=True,
                        stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(vocab_size)
  ])
  return model
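
As a quick shape check (a minimal sketch with hypothetical toy sizes, not the settings above), we can build a tiny model and confirm that it produces one logit per vocabulary entry for each input character:

# Hypothetical toy sizes: vocabulary of 10 characters, batch of 2
toy_model = build_model(vocab_size=10, embedding_dim=4, rnn_units=8, batch_size=2)
toy_output = toy_model(tf.zeros([2, 5], dtype=tf.int32))  # 2 sequences of 5 characters
print(toy_output.shape)  # (2, 5, 10) -> (batch_size, sequence_length, vocab_size)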

Now we get to the meat of our model script, where we actually begin to train the model. First we shuffle and batch our dataset and get the length of the vocabulary we will pass into the model.

  dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)

  # Length of the vocabulary in chars
  vocab_size = len(vocab)

Next, we build our model by passing our parameters declared earlier.

  model = build_model(
    vocab_size = vocab_size,
    embedding_dim=embedding_dim,
    rnn_units=rnn_units,
    batch_size=BATCH_SIZE)

Next, as a sanity check, we run one example batch through the untrained model, print the shape of its output, then sample character indices from the resulting categorical distribution and squeeze away the extra dimension.

  for input_example_batch, target_example_batch in dataset.take(1):
    example_batch_predictions = model(input_example_batch)
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")

  # Print out the different layers of the model
  model.summary()

  sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
  sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()

One of the main quantities we need to keep track of is the loss of our model, which tells us how well the model is learning. To compute our loss, we use sparse categorical crossentropy.

  def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)
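
To see what this loss function expects (a minimal sketch with hypothetical values), here it is applied to a single true label and one row of raw logits:

# Hypothetical example: true class is index 2, logits are unnormalized scores for 3 classes
labels = tf.constant([2])
logits = tf.constant([[0.1, 0.2, 3.0]])
print(tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True).numpy())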

The optimizer we use in this example is called Adaptive Moment Estimation, or Adam. An extension of stochastic gradient descent (itself a variant of gradient descent), Adam adapts the step size for each parameter to help descend toward a minimum and reduce the loss each epoch.

  model.compile(optimizer='adam', loss=loss)

To save our checkpoint and weight files, we pick an output path and set up a checkpoint callback. First we generate a random string so that no matter how many times we train the model, the chances of overlapping log folders are very small. Next we set the prefix of the checkpoint filenames, then create the TensorFlow checkpoint callback.

  # generate a random string 15 characters long
  random_string = ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(15))

  # Directory where the checkpoints will be saved
  checkpoint_dir = './logs/' + random_string + "/"

  # Name of the checkpoint files
  checkpoint_prefix = os.path.join(checkpoint_dir, "Model_{epoch}")

  checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
      filepath=checkpoint_prefix,
      save_weights_only=True,
      save_freq='epoch')

We then save our initial weights and fit the model. Once we fit the model, it begins training, and the callback writes weight files into the folder named with the random string we generated earlier.

  model.save_weights(checkpoint_prefix.format(epoch=0))

  history = model.fit(dataset, steps_per_epoch=STEPS, epochs=EPOCHS, callbacks=[checkpoint_callback])

  # Create a path for the saving location of the model
  model_dir = checkpoint_dir + "model.h5"

  # Save the model
  # TODO: Known issue with saving the model and loading it back
  # later in the script causes issues. Working to fix this issue
  model.save(model_dir)

We then locate the last checkpoint, rebuild our model with a batch size of 1, load the saved weights, and print the model summary to the console.

  # Find the latest checkpoint in the logs directory
  tf.train.latest_checkpoint(checkpoint_dir)

  # Build the model with the dataset generated earlier
  model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)

  # Load the weights
  model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))

  # Build the model
  model.build(tf.TensorShape([1, None]))

  # Print out the model summary
  model.summary()

Below is all the code for the training of our model.

############################################################
#  Train Model
############################################################

def train_model(dataset, vocab):
  # Buffer size to shuffle the dataset
  # (TF data is designed to work with possibly infinite sequences,
  # so it doesn't attempt to shuffle the entire sequence in memory. Instead,
  # it maintains a buffer in which it shuffles elements).
  dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)

  # Length of the vocabulary in chars
  vocab_size = len(vocab)

  model = build_model(
    vocab_size = vocab_size,
    embedding_dim=embedding_dim,
    rnn_units=rnn_units,
    batch_size=BATCH_SIZE)

  for input_example_batch, target_example_batch in dataset.take(1):
    example_batch_predictions = model(input_example_batch)
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")

  # Print out the different layers of the model
  model.summary()

  sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
  sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()

  def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)

  example_batch_loss = loss(target_example_batch, example_batch_predictions)
  print("Prediction shape: ", example_batch_predictions.shape, " # (batch_size, sequence_length, vocab_size)")
  print("scalar_loss:      ", example_batch_loss.numpy().mean())

  model.compile(optimizer='adam', loss=loss)

  # generate a random string 15 characters long
  random_string = ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(15))

  # Directory where the checkpoints will be saved
  checkpoint_dir = './logs/' + random_string + "/"

  # Name of the checkpoint files
  checkpoint_prefix = os.path.join(checkpoint_dir, "Model_{epoch}")

  checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
      filepath=checkpoint_prefix,
      save_weights_only=True,
      save_freq='epoch')

  model.save_weights(checkpoint_prefix.format(epoch=0))

  history = model.fit(dataset, steps_per_epoch=STEPS, epochs=EPOCHS, callbacks=[checkpoint_callback])

  # Create a path for the saving location of the model
  model_dir = checkpoint_dir + "model.h5"

  # Save the model
  # TODO: Known issue with saving the model and loading it back
  # later in the script causes issues. Working to fix this issue
  model.save(model_dir)

  # Find the latest checkpoint in the logs directory
  tf.train.latest_checkpoint(checkpoint_dir)

  # Build the model with the dataset generated earlier
  model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)

  # Load the weights
  model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))

  # Build the model
  model.build(tf.TensorShape([1, None]))

  # Print out the model summary
  model.summary()

  # Return the model
  return model

The next and final function generates a script from our trained model. Starting from a seed string, we repeatedly run the model, predict which character should come after the last one based on our training, and feed that prediction back in as the next input.

############################################################
#  Generate Text
############################################################

def generate_text(model, idx2char, char2idx, start_string):
  # Evaluation step (generating text using the learned model)

  # Converting our start string to numbers (vectorizing)
  input_eval = [char2idx[s] for s in start_string]
  input_eval = tf.expand_dims(input_eval, 0)

  # Empty array to store our results
  text_generated = []

  # Here batch size == 1
  model.reset_states()
  for i in range(num_generate):
      predictions = model(input_eval)
      # remove the batch dimension
      predictions = tf.squeeze(predictions, 0)

      # using a categorical distribution to predict the character returned by the model
      predictions = predictions / temperature
      predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()

      # We pass the predicted character as the next input to the model
      # along with the previous hidden state
      input_eval = tf.expand_dims([predicted_id], 0)

      text_generated.append(idx2char[predicted_id])

  return (start_string + ''.join(text_generated))
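
To build intuition for the temperature setting used above (a minimal sketch with hypothetical logits), note how dividing the logits before the softmax flattens or sharpens the distribution we sample from:

# Hypothetical logits over a 3-character vocabulary
logits = tf.constant([[2.0, 1.0, 0.1]])
for t in (0.5, 1.0, 2.0):
    # Lower temperature -> sharper (more predictable); higher -> flatter (more surprising)
    print(t, tf.nn.softmax(logits / t).numpy())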

Now that all of our functions have been created, we can use command-line arguments to choose which ones to call at a given time.

For our arguments, we have two commands, train and generate. For train, we build the dataset, train the model, and generate text from it. For generate, we again build the dataset, then load in the path to the weights file that was saved earlier. We then build the model and generate the text.

############################################################
#  Configure
############################################################

if __name__ == '__main__':
    import argparse

    # Parse command line arguments
    parser = argparse.ArgumentParser(
        description='Train or generate.')
    parser.add_argument("command",
                        metavar="<command>",
                        help="'train' or 'generate'")
    parser.add_argument('--weights', required=False,
                        metavar="/path/to/weights",
                        help="Path to weights file")
    parser.add_argument('--start', required=False,
                        metavar="start of string",
                        help="The word that will begin the output string")

    args = parser.parse_args()

    # Configurations
    if args.command == "train":
      # Load in the dataset and other functions to use
      dataset, vocab, idx2char, char2idx = create_dataset()
      # Train the model
      model = train_model(dataset, vocab)
      # Generate text, starting from --start if given
      print(generate_text(model, idx2char, char2idx, start_string=args.start if args.start else u"The "))

    if args.command == 'generate':
      # Load in the dataset and other functions to use
      dataset, vocab, idx2char, char2idx = create_dataset()
      # Get the model path
      model_path = os.path.join(root_path, args.weights)
      # Length of the vocabulary in chars
      vocab_size = len(vocab)
      # Build the model with the dataset generated earlier
      model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
      # Load the weights
      model.load_weights(model_path)
      # Build the model
      model.build(tf.TensorShape([1, None]))
      # Generate text, starting from --start if given
      print(generate_text(model, idx2char, char2idx, start_string=args.start if args.start else u"The "))
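
Putting it together, usage looks something like this (assuming the script is saved as model.py; the weights path points into the randomly named logs folder created during training):

python model.py train
python model.py generate --weights logs/<random_string>/Model_10 --start "Joey"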

Below is the entire code for the model Python script.

#!/usr/bin/env python
# -*- coding: utf-8 -*-

"""
   The purpose of this python script is to both experiment with new deep learning techniques,
   and to further my knowledge on the subject of deep learning.

   I used this as a reference guide:
   https://www.tensorflow.org/tutorials/text/text_generation
"""

# Import necessary libraries
import tensorflow as tf
import numpy as np
import os
import time

# These are for generating a random logs folder
import string
import random

# Root directory of the project
root_path = os.path.dirname(os.path.realpath(__file__))

#Allow GPU Growth
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  try:
    for gpu in gpus:
      tf.config.experimental.set_memory_growth(gpu, True)
  except RuntimeError as e:
    print(e)

############################################################
#  Settings
############################################################
# The maximum length sentence we want for a single input in characters
seq_length = 100

# Number of RNN units
rnn_units = 1024

# The embedding dimension
embedding_dim = 256

# Batch size
BATCH_SIZE = 65

# Number of epochs to run through
EPOCHS = 10

# Buffer size used when shuffling the dataset
BUFFER_SIZE = 100

# Steps per epoch
STEPS = 50

# Low temperatures result in more predictable text.
# Higher temperatures result in more surprising text.
# Experiment to find the best setting.
temperature = 1.0

# Number of characters to generate
num_generate = 200

# Initialization of loss value as a global variable
# to be used in multiple functions
loss = 0

############################################################
#  Create Dataset
############################################################

def create_dataset():
  # Path to dataset
  dataset_path = os.path.join(root_path, r"dataset/dataset.txt")

  # Read, then decode for py2 compat.
  text = open(dataset_path, 'rb').read().decode(encoding='unicode_escape')

  # The unique characters in the file
  vocab = sorted(set(text))

  # Creating a mapping from unique characters to indices
  char2idx = {u:i for i, u in enumerate(vocab)}
  idx2char = np.array(vocab)

  text_as_int = np.array([char2idx[c] for c in text])

  # Create training examples / targets
  char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)

  # Create the sequence from the dataset
  sequences = char_dataset.batch(seq_length+1, drop_remainder=True)

  def split_input_target(chunk):
    input_text = chunk[:-1]
    target_text = chunk[1:]
    return input_text, target_text
   
  # Map each sequence to an (input, target) pair
  dataset = sequences.map(split_input_target)

  return dataset, vocab, idx2char, char2idx

############################################################
#  Model
############################################################

def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
  model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.GRU(rnn_units,
                        return_sequences=True,
                        stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(vocab_size)
  ])
  return model

############################################################
#  Train Model
############################################################

def train_model(dataset, vocab):
  # Buffer size to shuffle the dataset
  # (TF data is designed to work with possibly infinite sequences,
  # so it doesn't attempt to shuffle the entire sequence in memory. Instead,
  # it maintains a buffer in which it shuffles elements).
  dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)

  # Length of the vocabulary in chars
  vocab_size = len(vocab)

  model = build_model(
    vocab_size = vocab_size,
    embedding_dim=embedding_dim,
    rnn_units=rnn_units,
    batch_size=BATCH_SIZE)

  for input_example_batch, target_example_batch in dataset.take(1):
    example_batch_predictions = model(input_example_batch)
    print(example_batch_predictions.shape, "# (batch_size, sequence_length, vocab_size)")

  # Print out the different layers of the model
  model.summary()

  sampled_indices = tf.random.categorical(example_batch_predictions[0], num_samples=1)
  sampled_indices = tf.squeeze(sampled_indices, axis=-1).numpy()

  def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels, logits, from_logits=True)

  example_batch_loss = loss(target_example_batch, example_batch_predictions)
  print("Prediction shape: ", example_batch_predictions.shape, " # (batch_size, sequence_length, vocab_size)")
  print("scalar_loss:      ", example_batch_loss.numpy().mean())

  model.compile(optimizer='adam', loss=loss)

  # generate a random string 15 characters long
  random_string = ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(15))

  # Directory where the checkpoints will be saved
  checkpoint_dir = './logs/' + random_string + "/"

  # Name of the checkpoint files
  checkpoint_prefix = os.path.join(checkpoint_dir, "Model_{epoch}")

  checkpoint_callback=tf.keras.callbacks.ModelCheckpoint(
      filepath=checkpoint_prefix,
      save_weights_only=True,
      save_freq='epoch')

  model.save_weights(checkpoint_prefix.format(epoch=0))

  history = model.fit(dataset, steps_per_epoch=STEPS, epochs=EPOCHS, callbacks=[checkpoint_callback])

  # Create a path for the saving location of the model
  model_dir = checkpoint_dir + "model.h5"

  # Save the model
  # TODO: Known issue with saving the model and loading it back
  # later in the script causes issues. Working to fix this issue
  model.save(model_dir)

  # Find the latest checkpoint in the logs directory
  tf.train.latest_checkpoint(checkpoint_dir)

  # Build the model with the dataset generated earlier
  model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)

  # Load the weights
  model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))

  # Build the model
  model.build(tf.TensorShape([1, None]))

  # Print out the model summary
  model.summary()

  # Return the model
  return model

############################################################
#  Generate Text
############################################################

def generate_text(model, idx2char, char2idx, start_string):
  # Evaluation step (generating text using the learned model)

  # Converting our start string to numbers (vectorizing)
  input_eval = [char2idx[s] for s in start_string]
  input_eval = tf.expand_dims(input_eval, 0)

  # Empty array to store our results
  text_generated = []

  # Here batch size == 1
  model.reset_states()
  for i in range(num_generate):
      predictions = model(input_eval)
      # remove the batch dimension
      predictions = tf.squeeze(predictions, 0)

      # using a categorical distribution to predict the character returned by the model
      predictions = predictions / temperature
      predicted_id = tf.random.categorical(predictions, num_samples=1)[-1,0].numpy()

      # We pass the predicted character as the next input to the model
      # along with the previous hidden state
      input_eval = tf.expand_dims([predicted_id], 0)

      text_generated.append(idx2char[predicted_id])

  return (start_string + ''.join(text_generated))

############################################################
#  Configure
############################################################

if __name__ == '__main__':
    import argparse

    # Parse command line arguments
    parser = argparse.ArgumentParser(
        description='Train or generate.')
    parser.add_argument("command",
                        metavar="<command>",
                        help="'train' or 'generate'")
    parser.add_argument('--weights', required=False,
                        metavar="/path/to/weights",
                        help="Path to weights file")
    parser.add_argument('--start', required=False,
                        metavar="start of string",
                        help="The word that will begin the output string")

    args = parser.parse_args()

    # Configurations
    if args.command == "train":
      # Load in the dataset and other functions to use
      dataset, vocab, idx2char, char2idx = create_dataset()
      # Train the model
      model = train_model(dataset, vocab)
      # Generate text, starting from --start if given
      print(generate_text(model, idx2char, char2idx, start_string=args.start if args.start else u"The "))

    if args.command == 'generate':
      # Load in the dataset and other functions to use
      dataset, vocab, idx2char, char2idx = create_dataset()
      # Get the model path
      model_path = os.path.join(root_path, args.weights)
      # Length of the vocabulary in chars
      vocab_size = len(vocab)
      # Build the model with the dataset generated earlier
      model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
      # Load the weights
      model.load_weights(model_path)
      # Build the model
      model.build(tf.TensorShape([1, None]))
      # Generate text, starting from --start if given
      print(generate_text(model, idx2char, char2idx, start_string=args.start if args.start else u"The "))

Although deep learning can be intimidating, breaking it down into smaller pieces can help you understand it better. And although I didn't cover the math in this post, it is worth reading up on, as it will give you a deeper understanding of how deep learning works.
