Predicting Diagnostic and Corrective Actions of Faulty PLC Devices Using LSTM

Kevin Kibe
Jun 1, 2023

Problem Description

A programmable logic controller (PLC) is a specialized industrial computer used to control manufacturing processes, including assembly, machines, and robotic devices.

A challenge arises when these devices encounter errors, requiring operators to manually search for the error, diagnose the issue, and determine the appropriate corrective action.

To address this challenge, machine learning and natural language processing (NLP) can be leveraged to simplify the process of error correction for operators.

Data Collection and Preprocessing

The dataset used for this project is an Excel spreadsheet containing information about PLC models, faults, diagnostics, and corrective actions. You can access the dataset here.

I have already performed preprocessing, which involved removing stopwords, lemmatization, removing unwanted characters, and converting the text to lowercase.
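
The exact preprocessing script isn't shown in this post; as a rough sketch, those steps (assuming NLTK for the stopwords and lemmatization, with an illustrative clean_text helper) look something like this:

import re
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

stop_words = set(stopwords.words('english'))      # requires nltk.download('stopwords')
lemmatizer = WordNetLemmatizer()                  # requires nltk.download('wordnet')

def clean_text(text):
    text = text.lower()                                          # lowercase
    text = re.sub(r'[^a-z0-9\s.]', ' ', text)                    # drop unwanted characters, keep periods
    tokens = [t for t in text.split() if t not in stop_words]    # remove stopwords
    tokens = [lemmatizer.lemmatize(t) for t in tokens]           # lemmatize
    return ' '.join(tokens)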

A sample of the dataset

I chose to join the columns into a single column containing the full information for each fault.

# Join the columns into a single text field per row
columns_to_join = ['PLC', 'Model', 'Fault', 'Diagnostic', 'Corrective Action']
df['data'] = df[columns_to_join].apply(lambda x: ' '.join(x.astype(str)), axis=1)

The next step is to append an <end> token after each sentence in every row, marking where a fault's diagnostic and corrective action ends.

import re

def add_end_token(text):
    # Append an <end> marker after every sentence (i.e. after each period)
    text = re.sub(r'([^.]*\.)', r'\1 <end>', text)
    text = text.strip().replace(' <end>', '<end>')
    if not text.endswith('<end>'):
        text += ' <end>'
    return text

df['data'] = df['data'].apply(add_end_token)
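
As a quick check, the function appends an <end> marker after every sentence (the sample string below is illustrative, not a row from the dataset):

sample = 'motor overload detected. reset the overload relay.'
print(add_end_token(sample))
# motor overload detected.<end> reset the overload relay.<end>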

Tokenization and Sequence Generation

I used the Tokenizer from the Keras preprocessing module to perform tokenization. First, I instantiated a tokenizer object and fitted it on the data to build the vocabulary. I then converted the texts into sequences of integer indices.

from tensorflow.keras.preprocessing.text import Tokenizer

data = df['data']
tokenizer = Tokenizer()                            # default filters strip '<' and '>', so '<end>' is stored as 'end'
tokenizer.fit_on_texts(data)                       # build the vocabulary
sequences = tokenizer.texts_to_sequences(data)     # map each text to a list of word indices
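
A quick sanity check on the fitted tokenizer is useful here (this snippet is not in the original code), since the vocabulary size plus one for the padding index is reused later for the embedding matrix and the output layer:

vocab_size = len(tokenizer.word_index) + 1     # index 0 is reserved for padding
print('Vocabulary size:', vocab_size)
print('First tokenized sequence:', sequences[0][:10])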

Padding and Categorization

Padding ensures that all sequences have the same length by adding zeros (padding tokens) to sequences that are shorter than the maximum length. The pad_sequences function from the Keras library is used to perform padding on the input sequences.

I calculated the maximum sequence length by examining the tokenized sequences. Employing a sliding window approach, I created input-output pairs for each sequence. To align the inputs, I padded them to match the maximum length, while I converted the corresponding outputs into one-hot encoded vectors. I then stored the data in lists and converted them to NumPy arrays.

import numpy as np
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical

max_sequence_length = max(len(seq) for seq in sequences)
input_data = []
output_data = []
for sequence in sequences:
    for i in range(1, len(sequence)):
        # sliding window: the first i tokens predict the (i+1)-th token
        input_seq = sequence[:i]
        input_seq = pad_sequences([input_seq], maxlen=max_sequence_length)[0]
        output_seq = to_categorical(sequence[i], num_classes=len(tokenizer.word_index) + 1)
        input_data.append(input_seq)
        output_data.append(output_seq)
input_data = np.array(input_data)
output_data = np.array(output_data)
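
As a quick shape check (not part of the original code), there should be one row per input-output pair, with inputs of length max_sequence_length and one-hot outputs over the vocabulary:

print(input_data.shape)     # (number_of_pairs, max_sequence_length)
print(output_data.shape)    # (number_of_pairs, len(tokenizer.word_index) + 1)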

Word Embedding

I used a pre-trained FastText word embedding model, which you can download here.

I then created an embedding matrix with dimensions based on the tokenizer’s vocabulary size and the chosen embedding dimension. For each word in the tokenizer’s vocabulary, if the word exists in the FastText model, the corresponding row in the embedding matrix is populated with the pre-trained word embedding.
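
The loading of the FastText vectors into the fasttext variable used below isn't shown in the post; one common way to do it, assuming gensim and the 300-dimensional English cc.en.300.vec file from the FastText website, is:

from gensim.models import KeyedVectors

# Assumption: load the 300-d English FastText vectors (word2vec text format)
# so that `word in fasttext` and `fasttext[word]` work as used below.
fasttext = KeyedVectors.load_word2vec_format('cc.en.300.vec')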

embedding_dim = 300
embedding_matrix = np.zeros((len(tokenizer.word_index) + 1, embedding_dim))
for word, i in tokenizer.word_index.items():
    if word in fasttext:
        embedding_matrix[i] = fasttext[word]   # copy the pre-trained vector for this word

Model Architecture

I created a sequential LSTM model with one embedding layer, 2 LSTM layers, one dropout layer, and one dense layer with softmax activation.

The model is compiled with categorical cross-entropy as the loss, Adam optimizer, and accuracy as the evaluation metric.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dropout, Dense

model = Sequential()
model.add(Embedding(len(tokenizer.word_index) + 1, embedding_dim,
                    weights=[embedding_matrix], trainable=True))
model.add(LSTM(256, return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(128))
model.add(Dense(len(tokenizer.word_index) + 1, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model.summary()

model.fit(input_data, output_data, batch_size=256, epochs=20)
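
The original code stops at training, but to reuse the model elsewhere (for example in the Streamlit app linked at the end), a common extra step is to save both the model and the tokenizer; the file names here are placeholders:

import pickle

model.save('plc_lstm_model.h5')      # Keras HDF5 format; file name is illustrative
with open('tokenizer.pkl', 'wb') as f:
    pickle.dump(tokenizer, f)        # the tokenizer is needed again at inference time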

Output Function

To see the model in action, I wrote a function that takes the model, the tokenizer, an input string, and the maximum output length, which defaults to 40.


def generate_text(model, tokenizer, input_text, max_length=40):
    generated_text = input_text

    stop_condition = False
    while not stop_condition:
        input_sequence = tokenizer.texts_to_sequences([generated_text])[0]
        # pad to the same length used during training
        input_sequence = pad_sequences([input_sequence], maxlen=max_sequence_length, padding='pre')
        # predict the next word
        prediction = model.predict(input_sequence)[0]
        predicted_index = np.argmax(prediction)
        predicted_word = tokenizer.index_word.get(predicted_index, '')
        # stop once we hit the maximum length or the end token
        # (the tokenizer's default filters strip '<' and '>', so '<end>' is stored as 'end')
        if len(generated_text.split()) == max_length or predicted_word == 'end':
            stop_condition = True
        else:
            # append the predicted word to the generated text
            generated_text += ' ' + predicted_word
    return generated_text[len(input_text):]

input_text = ' '
generated_text = generate_text(model, tokenizer, input_text)
print(generated_text)
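
In practice the seed would be the start of a fault description rather than a blank string, for example (the query below is illustrative, not taken from the dataset):

query = 'siemens s7-1200 fault module not responding'
print(generate_text(model, tokenizer, query))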

Challenges

A limitation of this implementation shows up when a fault has more than one diagnostic and corrective action. Another way to approach the problem is with LangChain and open-source language models.

The Streamlit app

The link to the repository and the Streamlit app.
