
Data Science Professor William Claster Works at the Leading Edge of AI
by Daniel Cohen
Dr. William Claster, associate teaching professor in the College of Engineering on Northeastern’s Arlington campus, first encountered the field of data science after arriving at Ritsumeikan Asia Pacific University on the southern Japanese island of Kyushu. Until that point in his career he had focused on mathematics, teaching the subject at universities in the United States and Japan for 20 years.
Claster first visited Japan while he was an undergraduate and quickly discovered he wanted to return. He got the chance after earning his master’s at Temple University and teaching at Rutgers and Fordham universities for several years.
In 2005, Ritsumeikan recruited Claster to establish a new program in data science. One of his first research projects attempted to measure user sentiment in near real time based on the tone of posts about a particular topic on the social networking site Twitter, since renamed X. He developed a model to classify the tone of tweets, found that the results appeared to track real-world events, and concluded that social media, such as microblogs, can be used as a reliable indicator of consumer sentiment within the tourism and hospitality industry. A later project applied mathematical algorithms to social media and other information sources to forecast movements in the stock market.
Claster’s pivot to data science soon led him to embrace the emerging field of artificial intelligence and focus on machine learning and large language models (LLMs). He earned a Ph.D. at Ritsumeikan, taught math and data science at the school for 19 years, and directed the school’s data labs. Upon his departure in 2023 to return to the United States and join Northeastern’s faculty, the school named him a professor emeritus.
The combination of Northeastern’s course offerings in AI and data science, its innovative co-op program for graduate students, and the presence of an array of tech companies and the federal government in the Washington, D.C., metro area attracted Claster to the campus.
“D.C. is an exceptional place for AI,” he said. “You have the federal government actively investing in AI initiatives, major tech companies establishing AI research centers here, and a constant stream of conferences, hackathons and competitions. The proximity to policymakers, defense agencies, and federal contractors creates unique opportunities for AI applications that you don’t find elsewhere — even in Silicon Valley.”
At Northeastern, Claster teaches introductory and advanced classes in data science and programming for students pursuing a Master of Science in Information Systems, a Master of Science in Information Systems – Bridge (for students with nontechnical bachelor’s degrees), or a Master of Science in Software Engineering Systems. He also teaches a class in agentic AI, which builds on generative AI capabilities to complete complex tasks autonomously. Employers are using agentic AI to personalize customer service, streamline software development and improve patient interactions, among other applications.
“It’s very, very new. The fact that we have that here is noteworthy,” said Claster, who also serves as associate program director of Multidisciplinary Graduate Engineering Programs for the Arlington campus.
Beyond his teaching, Claster leads a cross-campus agentic AI research lab that brings together graduate students from Northeastern’s Arlington and Miami campuses. The lab, which launched in October following the DevFest DC Hackathon, meets weekly to explore practical agent-based projects. Students work on initiatives ranging from machine learning research automation pipelines to multi-agent security research and LLM cost optimization. “We’re building a collaborative research community in what is still a very new field,” Claster said. “The goal is to move from learning to build AI agents to ultimately producing publishable research.”
His individual research focuses on improving the performance of LLMs used in generative AI, particularly through novel approaches such as reinforcement learning from human feedback. A better understanding of the models’ architecture will help scientists overcome their shortcomings, including the high cost to build and operate them, and their propensity to hallucinate responses.
Claster’s 430-page textbook on machine learning, Mathematics and Programming for Machine Learning with R: From the Ground Up, covers the coding skills beginning programmers need to implement machine learning algorithms. The textbook, published in 2020, addresses both probability-based machine learning algorithms and machine learning based on artificial neural networks.
Students interested in programming need to learn about AI because this rapidly evolving technology will have a tremendous impact on the future, he said. “There’s a lot to capture our imagination,” Claster said.
With the tech industry fully committed to advancing AI’s capabilities, acquiring the skills to become a software engineer through a master’s degree in information systems or software engineering systems should “open many doors,” he said.