ANALYSIS OF THE PLAGIARISM LEVEL OF THESIS DOCUMENTS USING THE COSINE SIMILARITY METHOD AND TF-IDF WEIGHTING

Abstract

Plagiarism is the act of duplicating or imitating another person's work and presenting it as one's own without the author's permission or without citing the source. Plagiarism is not difficult to commit: applying a copy-paste-modify technique to part or all of a document is enough to make that document a product of plagiarism or duplication.

The practice occurs because students are accustomed to taking the writing of others without citing its source, sometimes copying it in its entirety, word for word. Plagiarism is mostly committed by students, especially when completing a final project or thesis.

One way to curb plagiarism is through prevention and detection. Plagiarism detection based on the concept of document similarity is one way to detect both copy-and-paste plagiarism and disguised plagiarism. One suitable method is to analyze the level of document plagiarism using the Cosine Similarity method with TF-IDF weighting. This research produces an application that is able to compute the similarity value of the documents under test. Test results show that the manual calculations agree with the algorithm as implemented in the application. Use of the Literature Library is quite effective in the stemming process. Calculations that use stemming yield a higher similarity value than calculations without stemming.
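The similarity computation named in the abstract can be illustrated with a minimal Python sketch. It is not the application produced by this research. The function names, the whitespace tokenizer, and the smoothed IDF variant log(N/df) + 1 are all assumptions made for illustration, and no stemming is applied.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace only.
    # No stemming or stop-word removal is performed here.
    return text.lower().split()


def tf_idf_vectors(docs):
    """Return (vocabulary, list of TF-IDF vectors) for a list of strings."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vocab = sorted(df)
    # Assumption: IDF = log(N / df) + 1, so that terms shared by every
    # document keep a non-zero weight. Other IDF variants exist.
    idf = {term: math.log(n_docs / df[term]) + 1.0 for term in vocab}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts
        vectors.append([tf[term] * idf[term] for term in vocab])
    return vocab, vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical example documents, not data from the study.
docs = [
    "plagiarism detection uses document similarity",
    "document similarity helps plagiarism detection",
    "the weather is sunny today",
]
vocab, vectors = tf_idf_vectors(docs)
print(cosine_similarity(vectors[0], vectors[1]))
print(cosine_similarity(vectors[0], vectors[2]))
```

In this sketch, documents with more shared terms receive a value closer to 1, and documents with no shared terms receive 0. A stemmer applied inside `tokenize` would map inflected forms of a word to one term, which increases term overlap and is consistent with the higher similarity values reported for calculations that use stemming.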