Language identification using machine learning and artificial intelligence in Python

Problem

Many studies have been conducted on language identification: some propose a language identification model, while others carry out systematic reviews of existing work. However, little research has evaluated and compared the performance of various machine learning models using the following metrics: sensitivity, specificity, accuracy, and F1-score, together with the underlying counts of true positives, false negatives, true negatives, and false positives.
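The metrics listed above can all be derived from the four confusion counts. The following is a minimal sketch in plain Python, assuming a one-vs-rest view in which one language is treated as the positive class. The names `ConfusionCounts`, `one_vs_rest_counts`, and `compute_metrics`, as well as the example labels, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Confusion counts for one target language (one-vs-rest)."""
    tp: int  # true positives
    fn: int  # false negatives
    tn: int  # true negatives
    fp: int  # false positives


def one_vs_rest_counts(y_true, y_pred, target):
    """Count TP, FN, TN, FP, treating `target` as the positive class."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == target and pred == target:
            tp += 1
        elif truth == target:
            fn += 1
        elif pred == target:
            fp += 1
        else:
            tn += 1
    return ConfusionCounts(tp, fn, tn, fp)


def _safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else 0.0


def compute_metrics(c):
    """Derive sensitivity, specificity, accuracy, and F1 from the counts."""
    sensitivity = _safe_div(c.tp, c.tp + c.fn)  # also called recall
    specificity = _safe_div(c.tn, c.tn + c.fp)
    precision = _safe_div(c.tp, c.tp + c.fp)
    accuracy = _safe_div(c.tp + c.tn, c.tp + c.fn + c.tn + c.fp)
    f1 = _safe_div(2 * precision * sensitivity, precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical labels for eight documents (illustrative data only).
y_true = ["en", "en", "en", "fr", "fr", "de", "de", "de"]
y_pred = ["en", "en", "fr", "fr", "fr", "de", "en", "de"]

for language in sorted(set(y_true)):
    counts = one_vs_rest_counts(y_true, y_pred, language)
    print(language, counts, compute_metrics(counts))
```

Reporting the metrics per language, rather than as a single overall accuracy, makes it easier to see which languages a model confuses with one another.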

Objectives

· To evaluate and compare the performance of various machine learning models.
· To tune the model hyperparameters in order to determine the best settings for training, validating, and testing language identification models.
· To recommend the best machine learning model for identifying language.

The study should also include a theoretical framework.


Language identification is carried out by reading written text or listening to spoken content. In computational linguistics, it is one of the most important tasks, if not the most challenging. According to Jauhiainen et al. (2019), language identification is one of the many potential applications of machine learning, and of text classification in particular. Language identification models can determine the language of the text presented to them, regardless of the source (news articles, voice-to-text output, emails, etc.). Lopez-Moreno et al. (2014) and Matejka et al. (2014) argue that, by adding more layers, this strategy makes efficient use of language-specific techniques to classify and organize content. For example, before spell checking can be used in a Word document, the dictionary language for the page has to be chosen, otherwise the spell checker produces inaccurate results. Other possible uses include applying language-specific text classification to the text being analyzed, assigning emails to the correct customer service agent based on the recipient's location, and adding closed captions or subtitles to movies.

Many different kinds of research have been done: some studies have proposed a framework for language identification, while others have carried out systematic reviews of the results. However, the effectiveness of various machine learning models is not well researched in terms of the following metrics: sensitivity, specificity, accuracy, F1-score, true positives, false negatives, true negatives, and false positives. By analyzing earlier studies on the topic, this literature review aims to build an understanding of language identification using machine learning and artificial intelligence in Python.
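To make the idea of language identification as text classification concrete, the sketch below trains a small multinomial naive Bayes classifier over character trigrams using only the Python standard library. It is an illustrative assumption, not the method of any study cited above: the class `NgramLanguageIdentifier`, the helper `char_ngrams`, the smoothing parameter `alpha`, and the tiny training set are all hypothetical, and a realistic model would need far more data per language.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return overlapping character n-grams of lower-cased, space-padded text."""
    padded = f" {text.lower()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramLanguageIdentifier:
    """Multinomial naive Bayes over character n-grams with add-alpha smoothing."""

    def __init__(self, n=3, alpha=1.0):
        self.n = n
        self.alpha = alpha
        self.ngram_counts = defaultdict(Counter)  # label -> n-gram counts
        self.doc_counts = Counter()               # label -> number of documents
        self.vocabulary = set()                   # all n-grams seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.ngram_counts[label].update(grams)
            self.doc_counts[label] += 1
            self.vocabulary.update(grams)
        return self

    def predict(self, text):
        """Return the label with the highest log posterior, or None if unfitted."""
        grams = char_ngrams(text, self.n)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label in sorted(self.doc_counts):
            counts = self.ngram_counts[label]
            total = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            for gram in grams:
                score += math.log(
                    (counts[gram] + self.alpha)
                    / (total + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Tiny hypothetical training set (two sentences per language).
TRAIN_TEXTS = [
    "the quick brown fox jumps over the lazy dog",
    "language identification is a text classification task",
    "le renard brun saute par-dessus le chien paresseux",
    "l'identification de la langue est une tâche de classification",
    "der schnelle braune fuchs springt über den faulen hund",
    "die spracherkennung ist eine aufgabe der textklassifikation",
    "el zorro marrón salta sobre el perro perezoso",
    "la identificación del idioma es una tarea de clasificación",
]
TRAIN_LABELS = ["en", "en", "fr", "fr", "de", "de", "es", "es"]

model = NgramLanguageIdentifier(n=3).fit(TRAIN_TEXTS, TRAIN_LABELS)
print(model.predict("the lazy dog jumps over the fox"))
```

Character n-grams are used here because they need no tokenizer or dictionary. The n-gram length `n` and the smoothing value `alpha` are the kind of hyperparameters that could be varied when comparing models.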

Matejka et al. (2014) describe language as the essential communication tool that people need to share information with one another. People can take part in this communication by using sounds or gestures, depending on the situation. Most people learn languages over time in order to improve their ability to interact with those around them. Beyond the simple transmission of information, language promotes professional growth, alliance building, the strengthening of business-to-business economic ties, and a heightened understanding of the world's different cultures (Fairclough, 2013; Mazari and Derraz, 2015). Thousands of languages are currently spoken every day throughout the world, which creates a need for language identification. According to Lui and Baldwin (2014), the basic goal of the language identification process is to determine the language in which a written text, spoken statement, or document was produced. Language identification plays various roles, including (Van Otten, 2023):
