Year
2024
Season
Fall
Paper Type
Master's Thesis
College
College of Computing, Engineering & Construction
Degree Name
Master of Science in Computer and Information Sciences (MS)
Department
Computing
NACO controlled Corporate Body
University of North Florida. School of Computing
First Advisor
Dr. Kevin Pfeil
Second Advisor
Dr. Karthikeyan Umapathy
Third Advisor
Dr. Corey Pittman
Department Chair
Dr. Zornitza Prodanoff
College Dean
Dr. William F. Klostermeyer
Abstract
Virtual Reality (VR) is increasingly popular, but many barriers exist for individuals with little experience in coding, 3D modeling, or creating their own virtual experiences. Current content-creation tools are often viewed as complex or frustrating, and they exhibit a steep learning curve. This problem presents an opportunity to develop tools incorporating Natural User Interfaces that better support end users. One such tool is the Large Language Model (LLM), which can extract a user's intention from speech or text. We posit that LLMs can better support novice and expert developers alike, and that a human-in-the-loop approach can foster a Human-AI co-creative process. Toward that goal, we created a multimodal virtual reality tool that incorporates a large language model alongside direct manipulation, menus, eye gaze, and speech to facilitate a more natural VR authoring experience. We created a template in Unity3D that can be customized for various tasks, including the construction of a 3D environment and the creation of commands.
In this thesis, we describe a summative research study with 22 participants conducted to assess the usability and future direction of our tool. Participants were tasked with authoring a predefined environment, and the tool received a System Usability Scale (SUS) score of 57, which falls between "OK" and "Good"; this was expected given the flexibility and high degree of freedom within the system. All users indicated some degree of ease of use, but most also found the interaction mechanisms difficult, underscoring the tool's learning curve.
Our results indicate that our multimodal approach, combining a large language model with other 3D user interface modalities, can provide a more intuitive and accessible interface for users. Future research and development will focus on fine-tuning these interactions and expanding the system's capabilities to better support users.
Suggested Citation
Sayed, Ahmed A., "Large language models for multimodal user interaction in a virtual environment" (2024). UNF Graduate Theses and Dissertations. 1300.
https://digitalcommons.unf.edu/etd/1300