Research

Inclusive robotic foundation model (JST CRONOS)
We develop a world model, connected to a foundation model, that can optimize actions for various robots in response to language instructions.
- Mapping between a latent action space shared across robots and robot-specific action spaces
- Lightweight hypernetworks that switch internal state representations according to language instructions
- World model learning interpreted as multi-objective optimization
Skill-transfer AI model (JST K Program)
We develop three AI models that extract the skill differences between novice and expert learners and efficiently transfer the necessary skills to novice learners.
- Extraction of skill differences by a skillful AI model
- Interface optimization for skill transfer through an instructional AI model
- Interface selection/balance based on biometrics/preference using a personal AI model

Reinforcement learning as probabilistic inference
We derive and analyze various values that emerge from interpreting reinforcement learning (RL) as a kind of probabilistic inference problem.
- General representation of optimality and divergence
- Optimization of discount factor according to events
- Weber-Fechner law in TD learning
- Distributional model with regular optimism and pessimism
- Theoretically grounded optimistic learning
- Integration with feedback error learning
Model-based reinforcement learning
We study the learning theory of world models and their application to real-time model predictive control.
- Generation and validity-based optimal use of pseudo-experiential data
- Adversarial learning with moderate robustness and less conservativeness
- Model predictive control with efficient convergence to a suboptimal solution
- Extraction of sparse low-dimensional latent space
Stabilization of deep reinforcement learning
We develop stabilization techniques that enable deep reinforcement learning to optimize policies stably.
- Robustness to uncertainty in reward and value estimation
- Analysis of the conditions under which experience replay is applicable
- Stable and fast target network updates
- Regularization of local Lipschitz continuity of policy and value functions

Utilization of imperfect demonstration
We develop imitation learning methodologies for cases where demonstration data are insufficient in quality, modality, or quantity.
- Spatiotemporal partial imitation through self-paced learning
- Behavioral cloning robust to outliers, based on Tsallis statistics
- Safe and efficient framework to compensate for missing action data
Interactive imitation learning
We study a framework in which the demonstrator and agent collaborate to collect data to be imitated.
- Active exploration without sacrificing the sense of agency
- Optimal consensus decision-making among multiple actions with confidences

Time-series data processing
We design models to process time-series data, which is important for analyzing and representing robot and human motions.
- Model structure for both long-term memory and stability
- Learning theory of recurrent neural networks based on variational inference
- Design of reservoir computing in the complex domain that embeds a representation of periodicity
- Approximate implementation of neuron dynamics following a power law
Stochastic gradient descent
We improve the performance of stochastic gradient descent, which is a core technique in deep learning.
- Novel interpretation of dual structured algorithm
- Robustness to noise and outliers based on Student's t-distribution
- Improvement of AMSGrad to make it applicable to non-stationary problems
Lifelong (continual) learning
We develop fundamental technologies for autonomous robots to keep learning continuously.
- Crossmodal learning to complement missing modalities
- Balancing memory stability and plasticity
- Fractal model for continual learning of multiple tasks

Humanoid robots
We study motion control of humanoid robots, which are expected to play an active role in human society as general-purpose robots.
- Sim2real reinforcement learning for redundant robots
- Integration of whole-body model predictive control and machine learning
- Dancing with a partner using whole-body skin sensors
- Unified controller for bipedal walking and running, with continuous transitions between them
Human-assistive robots
We develop AI technologies to assist various people.
- Semi-automation towards safe teleoperation systems
- Extracting and understanding the tacit knowledge hidden in subjective evaluation
- Surgical assistance with learning-based model predictive control
- Learning assistance in embryo manipulation with extracted expert skills
- Locomotion assistance based on recognition of human movement behaviors