Hi, I am Shisen Yue. I'm studying for a B.A. in Linguistics in the Department of Foreign Language at Shanghai Jiao Tong University (SJTU), on track to graduate in 2024. I'm interested in Computational Psycholinguistics and Natural Language Processing. During my undergraduate studies, I have been advised by Professor Hai Hu at SJTU on research into large language models.
Since Spring 2023, I have been lucky to work as a research assistant advised by Professor William Schuler, evaluating sentence processing theories with large language models and a computational cognitive model.
In my graduate study, I want to dive deeper into the intersection of linguistics, cognitive science, and computer science, exploring whether the theoretical frameworks proposed by linguistic theories truly manifest in language processing and how this cognitive process can be formalized at the algorithmic level. I want to employ machine learning, experimental methods and cognitive modeling to approach these problems. I'm excited about investigating state-of-the-art LLMs and developing psychologically plausible models to better understand human language processing.
Shanghai Jiao Tong University
2020.09 - 2024.06 (expected)
B.A. in Linguistics (GPA: 88.4/100), with a concentration in computational linguistics. See the selected grades for more details.
Linguistics
Introduction to Syntax: 97/100
Phonetics and Phonology: 90/100
Introduction to Semantics: 91/100
Language and Cognition: 93/100
Language Data and Python Techniques: 92/100
Introduction to English Linguistics: 92/100
Research Methods in Linguistics: 90/100
Language Acquisition: A
Computer Science
Design and Analysis of Algorithms: 97/100
Thinking and Approach of Programming: 92/100
Reinforcement Learning: 95/100
Data Structure: A
Machine Learning: A
C++ Programming (UCB Extension): A
Machine Learning (UCLA Extension): A+
Mathematics
Probability and Statistics: 97/100
Calculus: 96/100
Research Assistant, Ohio State University
2023.04 - Present
Advised by Professor William Schuler, I participated in two research projects about surprisal. Check out our findings!
Using the psycholinguistic toolkit modelblocks, I processed naturalistic data from the PROVO and GECO corpora and ran regression analyses on it. The analysis was then used to show that as language models grow in size and in amount of training data, their surprisal estimates fit the naturalistic reading data worse, owing to their superhuman accuracy in predicting rare words. The work has now been published at EACL!
I refactored a left-corner parser with distributed associative memory used in Rasmussen and Schuler (2018). The model formalizes memory as vectors, reflects the parallel propagation of multiple analyses during parsing, and supports the surprisal account of human sentence processing. The paper has been submitted to Cognitive Science.
I manually built and annotated a dataset of multiple-choice questions about the conversational implicatures of character lines in a Chinese sitcom, and used it to evaluate the pragmatic understanding of LLMs.
Python Gricean maxims Implicatures
Paper
I trained SVM classifiers on a series of linguistic features extracted from text to identify exam essays generated by LLMs.
Python Linux Shell
Paper
I designed questionnaires and an event segmentation task to investigate the correlation between awareness of the aspect markers 了 and 着 (pronounced le and zhe) and people's perception of event boundaries.
SPSS Chinese aspect markers Boundedness
Paper
07 Jan 2024
Practicing conversational skills with chatbots has become an innovative method in language learning. These AI-powered communication partners have demonstrated multiple advantages over human partners, such as being less time-consuming and less anxiety-inducing. Moreover, students’ interest in the innovative learning...
Chatbots Language Acquisition
07 Jan 2024
How are written words encoded and decoded by the human brain? To pursue an answer to this question, we review research on sentence processing and left-corner parsing theories. We detail the models and results of five studies that...
Sentence Processing Left Corner Parsing
15 Jan 2023
Ultimate attainment in an L2 has broad significance for language pedagogy, human cognition, and neuroscience, and is fiercely debated in academic circles. The Critical Period Hypothesis (CPH) argues that such a key period for learning L1 is also...
Critical Period Hypothesis Language Acquisition