iCVL Meetings

The iCVL maintains its own weekly seminar series and reading group, where we either host guests to discuss their research or have one of our own members lead the study of a topic or a specific paper of interest from the literature. Our meetings run as two-hour round-table discussions (currently on Wednesdays, 10:00-12:00) and enjoy a significantly higher level of interaction than most seminars. We would be delighted to host you and learn about your work too. Please contact the lab’s director or our seminar coordinator.

While group meetings started in 2007, organized listings for the web site began in February 2012, shortly before the site's launch in July 2012. The forthcoming meeting is automatically highlighted and centered below. Please scroll up and down for future and past meetings.

  • 14.06.2017
  • Seminar slot
  • TBA
  • 07.06.2017
  • Shay Zweig - TAU
  • InterpoNet, A brain inspired neural network for optical flow dense interpolation
Abstract: Artificial neural networks are historically related to biological neural networks not only by name but also by some of their key concepts, such as convolutional neural networks. However, this analogy does not hold beyond the general concepts. Works that try to tie the two fields closer together usually remain mostly theoretical, while the leading benchmarks are dominated by more computationally driven approaches. An open question is how we can imitate concepts drawn from the cortex in ANNs without losing the simplicity and efficiency of feed-forward inference and gradient-based training. In this talk I will present our work, in which we draw inspiration from concepts found in the monkey's visual cortex to solve a classic computer vision problem: sparse-to-dense interpolation for optical flow. We took an innovative approach, using training-time supervision in a CNN rather than changing the "anatomy" of the network to enforce the brain-inspired concepts, leading to state-of-the-art results on the challenging benchmarks in the field.
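For readers unfamiliar with the sparse-to-dense interpolation problem the talk addresses, here is a minimal NumPy sketch of the simplest classical baseline, nearest-neighbor fill-in, which learned interpolators aim to improve upon. The function name, toy sizes, and data are illustrative, not from the paper.

```python
import numpy as np

def sparse_to_dense_nn(points, flows, h, w):
    """Fill a dense flow field from sparse matches by assigning each pixel
    the flow of its nearest match (simplest non-learned baseline)."""
    ys, xs = np.mgrid[0:h, 0:w]
    grid = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)
    # squared distance from every pixel to every sparse match
    d2 = ((grid[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    return flows[nearest].reshape(h, w, 2)

# toy example: two matches with opposite horizontal motion
pts = np.array([[0.0, 0.0], [3.0, 3.0]])
flo = np.array([[1.0, 0.0], [-1.0, 0.0]])
dense = sparse_to_dense_nn(pts, flo, 4, 4)
```

Edge-aware methods, and learned ones such as the network described in the talk, replace this purely geometric assignment with interpolation that respects image structure.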
  • 17.05.2017
  • Eran Goldman - BIU/Trax
  • Large-Scale Classification of Structured Objects, by Nonlinear CRF with Deep Class Embedding
Abstract: This work presents a novel deep learning architecture for classifying structured objects in datasets with a large number of visually similar categories. Our model extends the CRF objective function to a nonlinear form by factorizing the pairwise potential matrix, thereby learning a neighboring-class embedding. The embedding and the classifier are jointly trained to optimize this highly nonlinear CRF objective. The nonlinear model is trained on object-level samples, which is much faster and more accurate than the standard sequence-level training of the linear model. This model overcomes the difficulty existing CRF methods have in learning contextual relationships thoroughly when the number of classes is large and the data is sparse. The performance of the proposed method is illustrated on a huge dataset containing images of retail-store product displays, taken in varying settings and viewpoints, and shows significantly improved results compared to linear CRF modeling and sequence-level training.
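The key storage argument behind factorizing the pairwise potential can be sketched in a few lines of NumPy: instead of a full K-by-K matrix of neighboring-class potentials, one keeps two low-dimensional embedding tables whose inner product gives the potential. The sizes and variable names below are illustrative assumptions, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
K, d = 10000, 32   # many classes, low-dimensional class embedding (toy sizes)

# A full pairwise potential matrix would require K*K parameters;
# the factorized form stores only two K*d embedding tables.
U = rng.standard_normal((K, d))
V = rng.standard_normal((K, d))

def pairwise_potential(i, j):
    """Low-rank pairwise potential between neighboring classes i and j."""
    return U[i] @ V[j]

full_entries = K * K          # 10^8 for K = 10000
factored_entries = 2 * K * d  # 6.4 * 10^5
```

Beyond the parameter savings, the low-rank form lets classes share statistical strength through the embedding space, which matters when most class pairs are rarely observed together.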
  • 10.05.2017
  • Yaniv Oiknine - BGU
  • Compressive hyperspectral imaging
Abstract: Spectroscopic imaging has proved to be an effective tool for many applications in a variety of fields, such as biology, medicine, agriculture, remote sensing, and industrial process inspection. However, the demand for high spectral and spatial resolution makes it extremely challenging to design and implement such systems in a miniaturized and cost-effective manner. Using a compressive sensing setup based on a device that modulates the spectral domain and a sensor array, we demonstrate the reconstruction of hyperspectral image cubes from an order of magnitude fewer spectral scanning shots than conventional systems require. By examining the cubes we measured, we found that the performance of target detection algorithms on our images is similar to that on conventional hyperspectral images. The same principle was also used to build a compressive 4D spectro-volumetric imager and was implemented in an along-track scanning task.
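As background on the compressive sensing principle the talk relies on, the sketch below recovers a sparse 1-D "spectrum" from far fewer coded measurements than unknowns, using a minimal orthogonal matching pursuit solver. This is a generic toy model, assuming a random Gaussian measurement operator; it does not reproduce the authors' optical setup or reconstruction algorithm.

```python
import numpy as np

def omp(Phi, y, k):
    """Minimal orthogonal matching pursuit: recover a k-sparse x from y = Phi @ x."""
    residual, support = y.copy(), []
    for _ in range(k):
        # greedily pick the column most correlated with the residual
        support.append(int(np.argmax(np.abs(Phi.T @ residual))))
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x = np.zeros(Phi.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(1)
n, m, k = 64, 24, 3                      # 64 spectral bands, 24 coded shots
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[[5, 20, 40]] = [1.0, -0.7, 0.5]   # sparse spectrum
x_hat = omp(Phi, Phi @ x_true, k)
```

The point of the toy: with sparsity as a prior, m = 24 measurements can suffice for n = 64 unknowns, which is the same order-of-magnitude reduction in scanning shots the abstract reports.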
  • 03.05.2017
  • Amit Bermano - Princeton
  • Geometric Methods for Realistic Animation of Faces
Abstract: In this talk, I will briefly introduce myself and mainly focus on my doctoral dissertation, which addresses realistic facial animation. Realistic facial synthesis is one of the most fundamental problems in computer graphics and is desired in a wide variety of fields, such as film and advertising, computer games, teleconferencing, user-interface agents and avatars, and facial surgery planning. In the dissertation, we present the most commonly practiced facial content creation process and contribute to the quality of each of its three steps. The proposed algorithms significantly increase realism and therefore substantially reduce the amount of manual labor required for production-quality facial content. Bio: Amit H. Bermano is a postdoctoral researcher in the Graphics Group at Princeton University. He obtained his M.Sc. at the Technion, Israel, and his doctoral degree at ETH Zurich in 2015. Before joining Princeton University, he was a postdoctoral researcher at Disney Research Zurich. His research interests lie mainly at the seam between computer graphics and computer vision, applying geometry processing techniques to other fields, potentially benefiting both. His past research includes work in geometry processing, reconstruction, computational fabrication, and animation.
  • 19.04.2017
  • Gil Levi - TAU
  • Temporal Tessellation: A Unified Approach for Video Analysis
Abstract: We present a general approach to video understanding, inspired by semantic transfer techniques that have been successfully used for 2D image analysis. Our method considers a video to be a 1D sequence of clips, each associated with its own semantics. The nature of these semantics -- natural language captions or other labels -- depends on the task at hand. A test video is processed by forming correspondences between its clips and the clips of reference videos with known semantics, following which the reference semantics can be transferred to the test video. We describe two matching methods, both designed to ensure that (a) reference clips appear similar to test clips and (b), taken together, the semantics of the selected reference clips are consistent and maintain temporal coherence. We apply our method to video captioning on the LSMDC'16 benchmark, video summarization on the SumMe and TVSum benchmarks, temporal action detection on the Thumos2014 benchmark, and sound prediction on the Greatest Hits benchmark. Our method not only surpasses the state of the art in four out of five benchmarks but, importantly, is the only single method we know of that has been successfully applied to such a diverse range of tasks.
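The transfer step described in the abstract can be illustrated with a stripped-down sketch: each test clip inherits the label of its most similar reference clip under cosine similarity. This is only the per-clip matching half of the story; the features, labels, and names below are made up for illustration, and the paper's full method additionally enforces temporal coherence across consecutive clips.

```python
import numpy as np

def transfer_semantics(test_feats, ref_feats, ref_labels):
    """Per-clip semantic transfer: each test clip takes the label of its
    nearest reference clip under cosine similarity."""
    t = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    r = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    nearest = (t @ r.T).argmax(axis=1)       # best reference clip per test clip
    return [ref_labels[i] for i in nearest]

# toy reference clips with known captions, and two test clips
ref = np.array([[1.0, 0.0], [0.0, 1.0]])
labels = ["opens door", "pours coffee"]
test = np.array([[0.9, 0.1], [0.2, 0.8]])
captions = transfer_semantics(test, ref, labels)
```

Replacing the independent argmax with a sequence-level matching that scores consecutive choices jointly is what turns this baseline into the tessellation of the title.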
  • 05.04.2017
  • Yair Adato - Trax
  • TBA
  • 11.01.2017
  • Nadav Cohen - HUJI
  • Inductive Bias of Deep Convolutional Networks through Pooling Geometry
Abstract: Our formal understanding of the inductive bias that drives the success of convolutional networks on computer vision tasks is limited. In particular, it is unclear what makes hypothesis spaces born from convolution and pooling operations so suitable for natural images. In this paper we study the ability of convolutional networks to model correlations among regions of their input. We theoretically analyze convolutional arithmetic circuits, and empirically validate our findings on other types of convolutional networks as well. Correlations are formalized through the notion of separation rank, which, for a given partition of the input, measures how far a function is from being separable. We show that a polynomially sized deep network supports exponentially high separation ranks for certain input partitions, while being limited to polynomial separation ranks for others. The network's pooling geometry effectively determines which input partitions are favored, and thus serves as a means for controlling the inductive bias. Contiguous pooling windows as commonly employed in practice favor interleaved partitions over coarse ones, orienting the inductive bias towards the statistics of natural images. Other pooling schemes lead to different preferences, and this allows tailoring the network to data that departs from the usual domain of natural imagery. In addition to analyzing deep networks, we show that shallow ones support only linear separation ranks, and thereby gain insight into the benefit of functions brought forth by depth: they are able to efficiently model strong correlation under favored partitions of the input. Joint work with Amnon Shashua.
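For context, the separation rank mentioned in the abstract has a short formal definition: given a partition of the input variables into two groups $A$ and $B$, it is the minimal number of separable (product) terms needed to express the function.

```latex
\mathrm{sep}(f; A, B) \;=\; \min\Bigl\{ R \in \mathbb{N} \;:\;
  f(\mathbf{x}_A, \mathbf{x}_B) = \sum_{r=1}^{R} g_r(\mathbf{x}_A)\, h_r(\mathbf{x}_B) \Bigr\}
```

A separation rank of $1$ means $f$ treats the two groups independently; a high separation rank means the function models strong correlation between them, which is the quantity the talk's analysis bounds for deep versus shallow networks.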
  • 04.01.2017
  • Etai Littwin - TAU
  • The multiverse loss for robust transfer learning
Abstract: Deep learning techniques are renowned for supporting effective transfer learning. However, as we demonstrate, the transferred representations support only a few modes of separation, and much of their dimensionality is unutilized. In this work, we suggest learning, in the source domain, multiple orthogonal classifiers. We prove that this leads to a reduced-rank representation, which, however, supports more discriminative directions. Interestingly, the softmax probabilities produced by the multiple classifiers are likely to be identical. Experimental results on CIFAR-100 and LFW further demonstrate the effectiveness of our method.
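To make the "multiple orthogonal classifiers" idea concrete, the sketch below shows one simple way to encourage mutual orthogonality among parallel classifier weight vectors: a soft penalty on the off-diagonal entries of their Gram matrix. This is an illustrative mechanism under our own assumptions, not necessarily the exact constraint used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m = 16, 4                 # feature dimension, number of parallel classifiers
W = rng.standard_normal((m, d))   # one weight vector per classifier (toy init)

def orthogonality_penalty(W):
    """Sum of squared off-diagonal Gram entries: zero iff the m classifier
    weight vectors are mutually orthogonal."""
    G = W @ W.T
    off = G - np.diag(np.diag(G))
    return float((off ** 2).sum())

# an exactly orthogonal set of weight vectors incurs zero penalty
Q, _ = np.linalg.qr(rng.standard_normal((d, m)))   # orthonormal columns
W_orth = Q.T
```

In training, such a term would be added to the sum of the classifiers' softmax losses, pushing the learned representation to expose several independent discriminative directions rather than one.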