This repository has been archived by the owner on Aug 18, 2021. It is now read-only.
seq2seq: Replace the embeddings with pre-trained word embeddings such as word2vec #146
Open
Description
Hi,
Thank you for your tutorial! I tried to replace the embeddings with pre-trained word embeddings such as word2vec; here is my code:
from gensim.models import KeyedVectors

class Lang:
    def __init__(self, name):
        self.name = name
        self.word2index = {}
        self.word2count = {}
        self.index2word = {0: "SOS", 1: "EOS"}
        self.n_words = 2  # Count SOS and EOS
        # Load the pre-trained vectors once here, not on every addWord call
        self.word2vec = KeyedVectors.load_word2vec_format('Models/Word2Vec/wiki.he.vec')

    def addSentence(self, sentence):
        for word in sentence.split(' '):
            self.addWord(word)

    def addWord(self, word):
        if word not in self.word2index:
            # Store the pre-trained vector for this word
            # (instead of an integer index as in the original tutorial)
            self.word2index[word] = self.word2vec[word]
            self.word2count[word] = 1
            self.index2word[self.n_words] = word
            self.n_words += 1
        else:
            self.word2count[word] += 1
The vectors in this word2vec model are 300-dimensional. Do I need to change anything else in my encoder?
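For reference, here is a rough sketch of what I imagine the encoder change would look like. This is my assumption, not code from the tutorial: I build a weight matrix from the word2vec vectors (indexed through the Lang vocabulary) and pass it to nn.Embedding.from_pretrained, so the GRU input size becomes 300 (the word2vec dimension) instead of hidden_size. The helper build_embedding_weights is a hypothetical name I made up.

import torch
import torch.nn as nn

def build_embedding_weights(lang, word2vec, dim=300):
    # One row per vocabulary index; words missing from the pre-trained
    # model keep a random vector (my assumption for handling OOV words).
    weights = torch.randn(lang.n_words, dim)
    for index, word in lang.index2word.items():
        if word in word2vec:
            weights[index] = torch.tensor(word2vec[word])
    return weights

class EncoderRNN(nn.Module):
    def __init__(self, hidden_size, embedding_weights):
        super(EncoderRNN, self).__init__()
        # Initialize from the pre-trained matrix; freeze=False lets the
        # embeddings keep training along with the rest of the model.
        self.embedding = nn.Embedding.from_pretrained(embedding_weights, freeze=False)
        # The GRU input size is now the embedding dimension (300),
        # not hidden_size as in the tutorial's original encoder.
        self.gru = nn.GRU(embedding_weights.size(1), hidden_size)

    def forward(self, input, hidden):
        embedded = self.embedding(input).view(1, 1, -1)
        output, hidden = self.gru(embedded, hidden)
        return output, hidden

(Passing freeze=True instead would keep the pre-trained vectors fixed during training.)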
Thank you!