Abstract:
Reading and comparing similar requirements manually is a laborious and time-consuming
task, and the complexity of natural language makes it difficult to identify semantic
similarities accurately, which adds to the difficulty. Most similarity-detection algorithms
rely on matching words to words, paragraphs to paragraphs, or entire pages to one another.
In this project, a novel approach is proposed that uses large language models (LLMs) such
as Sentence Transformers to detect semantically similar requirements. The Sentence
Transformer models used in this project are all-MiniLM-L6-v2 and paraphrase-MiniLM-L6-v2.
The dataset consists of
10,500 software and system requirements vital to the operations of SCANIA, a Swedish company.
The aim of the project is to find semantically similar system and software requirements,
categorize them by similarity score, and compare the performance of Sentence Transformers
with traditional techniques such as Word2vec and the TF-IDF vectorizer on this task. In
addition, a graphical user interface (GUI) is built so that users can easily interact with
the system and find similar requirements. The project employs
several measures to evaluate the performance of the models, including accuracy, precision,
recall, F1 score, Euclidean distance, and cosine similarity. Of the two Sentence Transformer
models, ‘all-MiniLM-L6-v2’ performed better than ‘paraphrase-MiniLM-L6-v2’, with an accuracy
of 92%, a precision of 95%, a recall of 82%, and an F1 score of 88%. Ultimately, the project
aims to make requirement analysis more efficient and productive in software and system
development processes by combining these models with an intuitive interface.
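
The similarity and evaluation measures named above can be illustrated with a minimal
sketch. It is not the project's implementation: the three short vectors are toy stand-ins
for sentence embeddings, the threshold of 0.8 is a hypothetical value chosen for the
example, and the function names are assumptions introduced here.

```python
import math

# Hypothetical cut-off: pairs scoring at or above it are treated as "similar".
SIMILARITY_THRESHOLD = 0.8


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """Straight-line distance between two vectors (0.0 means identical)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def classification_metrics(predicted, actual):
    """Accuracy, precision, recall and F1 for boolean 'similar' labels."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy vectors standing in for the embeddings of three requirements.
req_a = [0.9, 0.1, 0.3]
req_b = [0.8, 0.2, 0.4]
req_c = [0.1, 0.9, 0.0]

for name, other in (("req_b", req_b), ("req_c", req_c)):
    score = cosine_similarity(req_a, other)
    label = "similar" if score >= SIMILARITY_THRESHOLD else "not similar"
    print(f"req_a vs {name}: cosine={score:.3f}, "
          f"euclidean={euclidean_distance(req_a, other):.3f} -> {label}")
```

In a setup like the one described, the toy vectors would be replaced by the embeddings
that a Sentence Transformer model produces for each requirement, and the predicted labels
would be compared against manually assigned ones with `classification_metrics`.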