Postdoctoral fellow • Carnegie Mellon University
Email • GitHub • CV • Google Scholar
I am a postdoctoral researcher working with Fernando De la Torre at Carnegie Mellon University. Before this, I completed my PhD at UW–Madison, working with Yong Jae Lee. My past work has explored generative models for computer vision tasks. In general, my philosophy is to view this class of algorithms as a tool that helps us go outside the training distribution and generate something we don't already have. Below is a list of the projects I've worked on so far (* denotes equal contribution):
Thao Nguyen, Utkarsh Ojha, Yuheng Li, Haotian Liu, Yong Jae Lee
Most work on image editing has focused on editing a single image at a time. We present a method for interactive batch image editing: given an edit specified by the user on an example image, our method automatically transfers that edit to other test images, so that irrespective of their initial state, they all arrive at the same final state.
Thao Nguyen, Yuheng Li, Utkarsh Ojha, Yong Jae Lee
Text-conditioned models have emerged as a powerful tool for image editing. However, in many situations, language can be ambiguous and ineffective at describing specific image edits. We present a method for image editing via visual prompting: given "before" and "after" images of an edit, our goal is to learn a text-based editing direction that can be applied to unseen images.
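As a rough illustration of the idea (a toy sketch, not the diffusion-based editor used in the paper), one can optimize a learnable conditioning vector so that a differentiable editor maps the "before" image to the "after" image; the ToyEditor class and all tensor shapes below are made up for the example:

```python
import torch
import torch.nn.functional as F

# A toy differentiable "editor" standing in for a real text-conditioned
# editing model; it applies a learned, instruction-conditioned color shift.
class ToyEditor(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(64, 3)

    def forward(self, image, c):
        return image + self.proj(c).view(1, 3, 1, 1)

editor = ToyEditor()
before = torch.rand(1, 3, 32, 32)
after = before + torch.tensor([0.2, -0.1, 0.0]).view(1, 3, 1, 1)  # the "edit"

c = torch.zeros(1, 64, requires_grad=True)   # learnable edit direction
opt = torch.optim.Adam([c], lr=1e-2)
for _ in range(300):
    loss = F.mse_loss(editor(before, c), after)
    opt.zero_grad(); loss.backward(); opt.step()

# The learned direction c now transfers the same edit to unseen images:
new_image = torch.rand(1, 3, 32, 32)
edited = editor(new_image, c)
```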
Utkarsh Ojha*, Yuheng Li*, Anirudh Sundara Rajan*, Yingyu Liang, Yong Jae Lee
When a student network tries to mimic a teacher while classifying an image, we see an improvement in its performance. But what happens in the background? Does the student really inherit teacher-specific properties that it would otherwise not have obtained? In what ways can we study those properties? In this paper, we attempt to shed some light on this dark knowledge that the student inherits during the distillation process.
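For context on the setup above: in standard knowledge distillation (Hinton et al.), the student matches the teacher's temperature-softened outputs alongside the ground-truth labels. A minimal PyTorch sketch, with T and alpha as illustrative hyperparameter choices:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.9):
    # Soft targets: KL divergence between temperature-scaled distributions,
    # rescaled by T^2 so gradients stay comparable across temperatures
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy with the ground-truth labels
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```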
Utkarsh Ojha*, Yuheng Li*, Yong Jae Lee
The past few years have seen the birth of a plethora of generative models. This work attempts to build systems that can detect fake images across different breeds of generative models. We show why explicitly training a neural network for real/fake classification is not a good idea, and consequently show the surprising effectiveness of a feature space not explicitly trained for this task.
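A minimal sketch of this flavor of detector, assuming a frozen CLIP image encoder as the feature space and a simple nearest-neighbor rule (the model choice and file lists here are placeholders, not the paper's exact protocol):

```python
import torch
import clip  # https://github.com/openai/CLIP
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-L/14", device=device)  # frozen; never trained on real/fake

@torch.no_grad()
def embed(paths):
    # Encode images and L2-normalize, so cosine similarity is a dot product
    imgs = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths]).to(device)
    feats = model.encode_image(imgs).float()
    return feats / feats.norm(dim=-1, keepdim=True)

# real_paths / fake_paths: small labeled reference sets (hypothetical file lists)
real_bank, fake_bank = embed(real_paths), embed(fake_paths)

@torch.no_grad()
def is_fake(path):
    # Nearest-neighbor decision in the frozen feature space
    q = embed([path])
    return (q @ fake_bank.T).max() > (q @ real_bank.T).max()
```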
Utkarsh Ojha, Yijun Li, Jingwan Lu, Alexei A. Efros, Yong Jae Lee, Eli Shechtman, Richard Zhang
If you have thousands of images from a domain (e.g., human faces), you can typically train a big neural network to generate images resembling that domain. What if you don't have that luxury? What if you only have, say, 10 paintings from an artist, and want to generate more? That is the goal of this work: model the bigger distribution of a domain using extremely few training images from it.
Utkarsh Ojha, Krishna Kumar Singh, Yong Jae Lee
Let's say you have data that contains images from not one but multiple object categories (e.g., dogs and cars). Can you learn a generative model that still disentangles an object's shape from its appearance? We proposed a method for this task: upon learning such a model, we can take the appearance of a furry dog and transfer it onto a car to create a new species of furry cars.
Utkarsh Ojha, Krishna Kumar Singh, Yong Jae Lee
When your data has discrete object categories, a typical assumption for the discrete latent factors is a uniform multinomial distribution. What happens when the data has a class imbalance? We highlight the shortcomings of existing work in such scenarios, and propose a method that disentangles the discrete factors much more accurately, without access to the ground-truth distribution.
Yuheng Li, Krishna Kumar Singh, Utkarsh Ojha, Yong Jae Lee
Let's say you captured two pictures: one of a red sparrow, and another of a white swan. You're feeling creative, and want to imagine how that white swan would look with the red sparrow's appearance over it. MixNMatch does precisely that: it takes in real images, extracts each object's shape and appearance independently, and combines them to create a hybrid bird: a red swan.
Krishna Kumar Singh*, Utkarsh Ojha*, Yong Jae Lee
Imagine a collection of natural bird images. The goal of this project was a model that not only generates realistic images, but also learns to control their different properties. For example, the proposed method learns to control object shape, appearance, pose, and background, all without any supervision. We can now borrow the appearance of a colorful hummingbird and put it over the body of a seagull.
Konda Reddy Mopuri*, Utkarsh Ojha*, Utsav Garg, R. Venkatesh Babu
A universal adversarial perturbation is an image-agnostic noise pattern which, when added to any natural image, will fool a neural-network-based classifier. We proposed a method to generate not one, but a whole distribution of such noise patterns for a given network. These were much stronger in terms of fooling not just the targeted classifier, but also many unseen ones.
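To make the setup concrete, here is a minimal sketch of applying and evaluating one such perturbation (assumptions: L-infinity-bounded noise on images in [0, 1]; the eps value and fooling-rate metric follow common convention rather than the paper's exact settings):

```python
import torch

def apply_uap(images, delta, eps=10 / 255):
    # Constrain the universal perturbation to an L-infinity ball of radius eps
    delta = delta.clamp(-eps, eps)
    # The same noise pattern is added to every image in the batch
    return (images + delta).clamp(0, 1)

def fooling_rate(model, images, delta):
    with torch.no_grad():
        clean = model(images).argmax(dim=1)
        adv = model(apply_uap(images, delta)).argmax(dim=1)
    # Fraction of predictions flipped by the image-agnostic noise
    return (clean != adv).float().mean().item()
```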