We are very excited to invite you to Linguistics Circle this Friday, April 15th at 3pm in Greene Hall 528 to see Professors Claudia and Jerid Francom (Romance Languages / Interpreting and Translation Studies) present their current research.
Tell me a story and I’ll tell you where you’re from: dialect recognition using machine learning algorithms
Jerid Francom, Romance Languages
Friday 3/20 at 3pm
Greene Hall 528
Linguistic variation is a pervasive characteristic of languages. It occurs at all linguistic levels and is predicted by socio-demographic variables, time period, and geographical and political boundaries. Understanding how languages and language varieties differ has attracted much attention from public and academic communities, and for good reason: it reflects the unique ways in which humans interface with the world and holds the key to understanding language’s place in cognition.
In this talk I explore language variation through a lesser-traveled path: machine learning. Focusing on Spanish-language variation, I provide results from a series of text classification tasks that suggest that variation among the Argentine, Mexican, and Spanish dialects present in the ACTIV-ES Spanish-language corpus can be modeled and used to accurately predict where a speaker is from based on word choices alone.
If you are a native English speaker with a very high level of proficiency in Spanish, please consider completing the following linguistic task (~45 min). In addition to helping out a fellow linguist (David Miller, M.A. student at the University of Florida), you will gain insight into the types of linguistic studies and tasks that are out there. Some day you might need participants!