RESEARCH

LLM-Driven Adaptive Robotic Manipulation for Complex Task Planning and Execution
This project showcases the integration of Large Language Models (LLMs) and Vision-Language Models (VLMs) for robotic task planning and execution in dynamic, simulated kitchen environments. Custom scenarios, including stacking, sorting, cleaning, and table setup, were designed to evaluate the robot's ability to interpret natural language instructions and perform complex tasks.
Key contributions include a robust perception-action pipeline with affordance-based strategies and modular task execution methods. While the system excelled in simpler tasks, challenges such as perception inaccuracies and compounded errors limited success in more complex scenarios.
Overall, this work highlights the potential of LLMs to enhance robot adaptability and generalization across diverse tasks, paving the way for future improvements in real-world applications.
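The planning layer described above can be mocked in a few lines: the LLM maps a natural-language instruction to a sequence of primitive robot actions. The primitive names and the rule-based stand-in for the LLM below are purely illustrative, not the project's actual vocabulary.

```python
def plan(instruction: str) -> list[str]:
    # Rule-based stand-in for the LLM planner: in the real system, the
    # model produces the action sequence from the instruction plus
    # VLM-derived scene context. Primitives here are hypothetical.
    instruction = instruction.lower()
    if "stack" in instruction:
        return ["pick(block_a)", "place_on(block_b)"]
    if "sort" in instruction:
        return ["pick(item)", "place_in(bin)"]
    if "clean" in instruction:
        return ["pick(sponge)", "wipe(surface)"]
    return []
```

Chaining such primitives is also where compounded errors arise: a single misgrasp early in the plan can invalidate every later step.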

A Multi-Objective Optimization Framework for Robot Design
In this project, a multi-objective optimization framework was developed to enhance robot design for specific tasks within complex operational environments.
The framework utilized genetic algorithms for broad solution exploration and gradient descent for fine-tuning parameters, focusing on optimizing critical attributes such as efficiency, safety, and dexterity.
Advanced computational methods were applied to balance competing design criteria effectively. The framework was rigorously tested through case studies involving dental and surgical robots, demonstrating its ability to produce designs that significantly improve manipulability, minimize errors, and ensure operational safety in constrained spaces.
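The two-stage search can be sketched on a toy problem: a genetic algorithm explores broadly, then finite-difference gradient descent fine-tunes the best candidate. The two quadratic objectives, the weighted-sum scalarization, and all hyperparameters below are illustrative stand-ins for the real design criteria (efficiency, safety, dexterity).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical competing objectives over a 2-parameter design space.
def f1(x): return float(np.sum((x - np.array([1.0, 2.0]))**2))
def f2(x): return float(np.sum((x - np.array([2.0, 1.0]))**2))

def scalarized(x, w=0.5):
    # Weighted-sum scalarization of the competing criteria.
    return w * f1(x) + (1 - w) * f2(x)

def genetic_search(pop_size=30, gens=40, sigma=0.5):
    # Broad exploration: keep the best half, mutate it to refill the pool.
    pop = rng.normal(0, 3, size=(pop_size, 2))
    for _ in range(gens):
        scores = np.array([scalarized(x) for x in pop])
        elite = pop[np.argsort(scores)[:pop_size // 2]]       # selection
        children = elite + rng.normal(0, sigma, elite.shape)  # mutation
        pop = np.vstack([elite, children])
    scores = np.array([scalarized(x) for x in pop])
    return pop[np.argmin(scores)]

def gradient_refine(x, lr=0.1, steps=100, eps=1e-5):
    # Local fine-tuning via central-difference gradient descent.
    for _ in range(steps):
        g = np.array([(scalarized(x + eps * e) - scalarized(x - eps * e))
                      / (2 * eps) for e in np.eye(2)])
        x = x - lr * g
    return x

coarse = genetic_search()
fine = gradient_refine(coarse)
```

For this toy scalarization the optimum is the midpoint of the two objective centers, so the refinement stage should land near (1.5, 1.5) regardless of where the genetic stage leaves off.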
Sketch-to-CAD: Generating Computer Aided Design Models from Sketches
In this project, we introduce Sketch-to-CAD, a deep learning model that generates 3D CAD construction sequences from 2D sketches, expediting the conversion of hand-drawn sketches into editable CAD designs.
Four model architectures were evaluated. The CNN encoder + Transformer decoder model performed the best, achieving 91.3% accuracy on CAD commands and 85.8% on parameters. It generated reasonable CAD sequences for complex shapes.
Suggestions for future work include using larger datasets, alternate step representations, and reinforcement learning to improve generalization.
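The two reported metrics score commands and parameters separately, which can be sketched as below. Each step is a (command, parameters) pair; the command names and parameter tuples are illustrative, not the project's actual token vocabulary.

```python
def sequence_accuracies(pred, target):
    # Score command tokens and parameter values independently, mirroring
    # the split metrics (91.3% on commands vs 85.8% on parameters).
    cmd_hits = sum(p[0] == t[0] for p, t in zip(pred, target))
    par_hits = sum(p[1] == t[1] for p, t in zip(pred, target))
    n = len(target)
    return cmd_hits / n, par_hits / n

# Hypothetical predicted vs ground-truth CAD steps.
pred   = [("Line", (0, 1)), ("Arc", (1, 1, 90)), ("Extrude", (0.5,))]
target = [("Line", (0, 1)), ("Arc", (1, 1, 45)), ("Extrude", (0.5,))]
cmd_acc, par_acc = sequence_accuracies(pred, target)
```

Splitting the metrics this way separates "picked the right operation" from "got its numeric arguments right", which fail in different ways.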
Enhancing Robotic Manipulation: Harnessing the Power of Multi-Task Reinforcement Learning and Single Life Reinforcement Learning in Meta-World
In this project, we aim to enable a robotic arm to successfully execute seven distinct manipulation tasks within the Meta-World simulation environment. To accomplish this, we first train a multi-task soft actor-critic (MT-SAC) agent on the seven tasks to generate useful prior experience data. We find that sine encoding is the most effective method for representing the task IDs. The trained MT-SAC model does not perform well when we test it on novel scenarios with different object positions.
To address this limitation, we propose a multi-task Q-weighted adversarial learning algorithm (MT-QWALE) that leverages the prior experience from MT-SAC to complete the tasks in novel situations within a single trial. In our experiments, MT-QWALE successfully completes most tasks under position novelty, outperforming MT-SAC. An ablation study finds MT-QWALE can still accomplish tasks without the end goal input, relying on reward feedback.
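The sine encoding of task IDs can be sketched as follows. The exact formulation used in the project isn't specified here, so this shows one plausible variant: map each discrete task ID to a smooth sinusoidal vector and concatenate it onto the Meta-World observation before feeding the MT-SAC policy. The encoding dimension is an assumption.

```python
import numpy as np

def sine_encode(task_id, num_tasks=7, dim=8):
    # One assumed variant of sine task encoding: a vector of sinusoids
    # at increasing frequencies of the task's phase on the unit circle.
    phase = 2 * np.pi * task_id / num_tasks
    freqs = np.arange(1, dim + 1)
    return np.sin(freqs * phase)

def augment_observation(obs, task_id, num_tasks=7, dim=8):
    # MT-SAC conditions the shared policy on the task by concatenating
    # the task encoding onto the raw observation vector.
    return np.concatenate([obs, sine_encode(task_id, num_tasks, dim)])
```

Compared to one-hot IDs, a smooth encoding places related task codes closer together in input space, which can help a shared policy network reuse features across tasks.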
Safe Navigation: Training Autonomous Vehicles using Deep Reinforcement Learning
In this project, we developed a deep reinforcement learning system to train autonomous vehicles to safely navigate uncertain environments using the CARLA driving simulator. We used deep Q-networks (DQNs) to predict driving actions from sensor data like camera images and distance sensors. First, we pre-processed the sensor data to extract key information like the vehicle's position, orientation, and distance to obstacles. This reduced the complexity of the state space for more efficient training.
We trained separate DQN models for braking and driving actions. We combined these models hierarchically, with the braking model acting as a safety net. Our approach achieved a 94% success rate in navigating four test trajectories with traffic and pedestrians, demonstrating good generalization. The next step is training models that can estimate position and orientation directly from camera data, removing reliance on simulator data.
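The hierarchical combination of the two DQNs amounts to a simple arbitration rule: the braking network gets the first look, and only if it declines to brake does the driving network's preferred action execute. The action names and Q-value layout below are illustrative, not the project's exact discretization.

```python
import numpy as np

# Hypothetical discrete action sets for the two networks.
DRIVE_ACTIONS = ["accelerate", "steer_left", "steer_right", "coast"]
BRAKE_ACTIONS = ["no_brake", "brake"]

def select_action(drive_q, brake_q):
    # Safety-net arbitration: if the braking DQN's greedy action is to
    # brake, it overrides whatever the driving DQN prefers.
    if int(np.argmax(brake_q)) == BRAKE_ACTIONS.index("brake"):
        return "brake"
    return DRIVE_ACTIONS[int(np.argmax(drive_q))]
```

Keeping braking in a separate network means the safety behavior can be trained on its own reward signal and is never diluted by the driving objective.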
Predicting Mean Ribosome Load of 5’UTRs
In this project, we explored various machine learning techniques including CNNs, LSTMs, and Transformers to predict the mean ribosome load (MRL) of 5' UTR sequences, an important measure of gene expression. We started by reproducing a CNN model which achieved an R-squared of 0.90 on fixed length sequences. By training LSTM models and taking advantage of longer range dependencies, we improved performance to an R-squared of 0.94 on fixed lengths and 0.81 on varying lengths after combining datasets.
Our best model was an ensemble of Transformers trained on sequences with and without ATGs along with oversampling, which achieved an R-squared of 0.95. Through systematically evaluating different deep learning architectures, we were able to improve on the state-of-the-art in predicting MRL from mRNA sequences. This provides a strong baseline for future mRNA sequence optimization and gene expression prediction using ML.
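The input side of all these models is the same: one-hot encode the 5' UTR sequence, then let the first convolutional layer scan it for short motifs (such as upstream ATGs, which is why the ensemble splits sequences with and without them). A minimal numpy sketch of that first-layer motif scan, with a hand-built filter rather than a learned one:

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    # One-hot encode a 5' UTR sequence into a (length, 4) matrix,
    # the standard input representation for sequence CNNs.
    out = np.zeros((len(seq), 4))
    for i, b in enumerate(seq):
        out[i, BASES.index(b)] = 1.0
    return out

def conv1d(x, motif):
    # Valid-mode 1D convolution over the sequence axis: each output is
    # the dot product of a k-mer window with a (k, 4) motif filter,
    # mimicking what the CNN's first layer learns.
    k = motif.shape[0]
    return np.array([float(np.sum(x[i:i + k] * motif))
                     for i in range(x.shape[0] - k + 1)])

# A hard-coded ATG filter peaks exactly where the motif occurs.
scores = conv1d(one_hot("CATGC"), one_hot("ATG"))
```

In a trained model these filters are learned weights rather than exact one-hot motifs, but the scanning mechanics are the same.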
Graduate Research Assistant at the School of Medicine, Stanford University
In this project, I decoded monkey neural spikes to predict decision-making during a game-theory experiment. The main goals were to train a decoder to predict the monkey's own decision, the opponent's decision, coordination between the two, and the final outcome.
A linear regression model was chosen as the decoder after testing other models. It was trained on spike data preprocessed with a half-Gaussian filter. The decoder performed best at predicting the monkey's own decision, achieving 75% accuracy. However, it struggled with predicting coordination, the opponent's decision, and the final outcome.
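The half-Gaussian preprocessing step turns binary spike trains into smooth firing-rate estimates while staying causal: only past spikes influence the estimate at each time bin. A minimal sketch, with the kernel width and sigma (in time bins) as assumptions:

```python
import numpy as np

def half_gaussian_kernel(sigma=5.0, width=20):
    # One-sided (causal) Gaussian, normalized to unit area: unlike a
    # symmetric Gaussian, it never lets future spikes leak backward.
    t = np.arange(width)
    k = np.exp(-t**2 / (2 * sigma**2))
    return k / k.sum()

def smooth_spikes(spike_train, sigma=5.0, width=20):
    # Convolve the binary spike train with the half-Gaussian; truncating
    # the full convolution keeps output[t] dependent only on spikes <= t.
    k = half_gaussian_kernel(sigma, width)
    return np.convolve(spike_train, k)[:len(spike_train)]
```

The resulting rate signals, not the raw spikes, are what the linear regression decoder is fit on.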

Research Internship: Mechanical Actuator for an Organ-on-a-chip Platform
I developed a mechanical actuator for an organ-on-a-chip platform, built from parts designed and 3D-printed in the lab, with an integrated electronic control system based on an Arduino and servo motors.
GitHub: https://github.com/LaboratoryOpticsBiosciences/oscillator

Bachelor Thesis: Time-dependent characterization and constitutive modeling of 3D-printed polymers, Tango Black
During my Bachelor Thesis, I experimentally characterized a 3D-printed polymer (Tango Black) under uniaxial tension, cyclic loading, and relaxation tests. I designed the test setup in SolidWorks, and learned experimental techniques for material characterization, digital image correlation, and post-processing analysis in Python. Finally, I fitted the experimental data with a model for hyperelastic elastomers.
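The fitting step can be sketched with the simplest hyperelastic model. The thesis does not specify which model was used, so an incompressible Neo-Hookean solid is assumed here for illustration, and the stress-stretch data below is synthetic, standing in for the measured Tango Black curves.

```python
import numpy as np

def neo_hookean_uniaxial(stretch, mu):
    # Nominal (engineering) stress of an incompressible Neo-Hookean
    # solid under uniaxial tension: P = mu * (lambda - lambda**-2).
    return mu * (stretch - stretch**-2)

# Synthetic noisy data standing in for the experimental curves.
rng = np.random.default_rng(0)
stretch = np.linspace(1.05, 1.5, 20)
stress = neo_hookean_uniaxial(stretch, 0.6) + 0.01 * rng.normal(size=20)

# The model is linear in the shear modulus mu, so the least-squares
# fit has a closed form: mu = <P, g> / <g, g> with g = lambda - lambda**-2.
g = stretch - stretch**-2
mu_fit = float(np.dot(stress, g) / np.dot(g, g))
```

Time-dependent behavior (the cyclic and relaxation tests) requires a viscoelastic extension on top of such a hyperelastic backbone; the sketch covers only the monotonic tension fit.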
Summer Research Internship: Analysis of internal gravity waves
I contributed to a case study of a large-amplitude orographic gravity wave over the Antarctic Peninsula. I analyzed and identified internal atmospheric gravity waves in data collected during the Concordiasi balloon campaign. (https://www.lmd.polytechnique.fr/VORCORE/McMurdoE.htm)
Using Python, I compared the characteristics of the gravity waves derived from the real balloon data with those from virtual balloons launched in a simulation, combining mathematics and programming to understand a physical phenomenon.