- 14.12.2021

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 30.11.2021

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 16.11.2021

- iCVL Group: Rotem Mairon

- Towards Revealing Visual Relations Between Fixations: Modeling the Center Bias During Free-Viewing
Abstract: Understanding eye movements is essential for comprehending visual selection and has numerous applications in computational vision and human-computer interfaces. Eye-movement research has revealed a variety of behavioral biases that guide the eye regardless of visual content.
However, computational models of eye-movement behavior only partially account for these biases. The best known of these is the center bias, which refers to the tendency of observers to fixate on the stimulus's center. In this study, we investigate two aspects of the center bias to help distinguish it from content-driven behavior. The first concerns standard procedures for collecting eye-movement data during free viewing: specifically, the requirement to fixate on the center of the display before the stimulus appears, and the presentation of the stimulus at the display's center. Our findings support a fundamental shift in data collection and analysis procedures, all in the name of obtaining data that reflects more content-driven and less bias-related eye-movement behavior. The second aspect is how the center bias manifests during viewing, as opposed to how it is reflected in the spatial distribution of aggregated fixations, which is used in almost all computational models of eye-movement behavior. To that end, this work proposes an eye-movement data representation based on saccades rather than fixations. This representation not only demonstrates the dynamic nature of the center bias throughout viewing, but also holistically captures other eye-movement phenomena that were previously investigated only in isolation. Finally, we demonstrate that the proposed representation allows for more accurate modeling of eye-movement biases, which is critical for establishing a baseline for computational models. Such a baseline paves the way for researchers to learn how, if at all, visual content guides the human eye from one fixation to the next while freely viewing natural images.
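The saccade-based representation mentioned above can be illustrated with a minimal Python sketch: convert an ordered fixation sequence into saccade vectors (amplitude and direction). The function name and the exact (amplitude, direction) encoding are illustrative assumptions, not the thesis's actual formulation.

```python
import numpy as np

def fixations_to_saccades(fixations):
    """Turn an ordered array of fixation points (N, 2) into an
    (N-1, 2) array of saccade (amplitude, direction) pairs."""
    fixations = np.asarray(fixations, dtype=float)
    deltas = np.diff(fixations, axis=0)                   # (dx, dy) per saccade
    amplitudes = np.linalg.norm(deltas, axis=1)           # saccade length (pixels)
    directions = np.arctan2(deltas[:, 1], deltas[:, 0])   # angle (radians)
    return np.stack([amplitudes, directions], axis=1)

# Example: a scanpath starting at the center of a 1024x768 display.
scanpath = [(512, 384), (600, 390), (650, 300), (500, 350)]
print(fixations_to_saccades(scanpath))
```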
- 09.11.2021

- iCVL Group: Image super-resolution

- Reading Group - Methods for Image super-resolution
Abstract: In this session we will discuss how to improve the resolution of low-quality images, covering methods and solutions from recent years. Please refer to the 2015 article "Image super-resolution using deep convolutional networks" as a basis for conversation. The meeting will be held round-table style, so everyone can participate.
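For orientation before the discussion, here is a minimal PyTorch sketch of the three-layer SRCNN architecture from the referenced paper (9-1-5 kernels with 64 and 32 filters); the padding choices and single-channel setting are my assumptions for a self-contained example.

```python
import torch
import torch.nn as nn

class SRCNN(nn.Module):
    """Minimal SRCNN: takes a bicubic-upscaled low-resolution image
    (one channel) and predicts the restored high-resolution image."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=9, padding=4),  # patch extraction
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1),            # non-linear mapping
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, kernel_size=5, padding=2),  # reconstruction
        )

    def forward(self, x):
        return self.body(x)

# Usage: restore a 33x33 luminance patch.
y = SRCNN()(torch.rand(1, 1, 33, 33))
print(y.shape)  # torch.Size([1, 1, 33, 33])
```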
- 25.10.2021

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 11.10.2021

- Aviel Hadad - BGU

- BCVA and eye movement
Abstract: We are hosting Dr. Aviel Hadad, who will give us an introductory lecture on eye movements and BCVA (best-corrected visual acuity). Aviel will introduce the field and present recent developments on the topic.
- 13.09.2021

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 23.08.2021

- Chen Keasar - BGU

- Protein structure prediction and protein–protein docking
Abstract: We are hosting Professor Chen Keasar for a meeting on protein folding and docking. The session will take the form of a "Questions and Answers" seminar, based on the questions from the previous session.
- 09.08.2021

- Elad Amar - BGU

- Computational visual astrophysics
Abstract: Elad will introduce us to his thesis research field, while reviewing the literature relevant to the field.
- 02.08.2021

- iCVL Group: Protein structure & docking

- Reading Group - Methods for Protein structure & docking
Abstract: We hold our first meeting on the docking, folding and modeling of proteins.
For the next meeting, we will discuss two articles as a basis for conversation:
1. "Highly accurate protein structure prediction with AlphaFold" (Nature, 2021).
The article presents a groundbreaking solution for modeling protein structures with the help of deep learning networks.
Anyone interested in a summary of the subject is welcome to visit the DeepMind page.
2. "Protein–protein docking dealing with the unknown" (2010). The article gives an accessible overview of protein–protein docking.
- 19.07.2021

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 11.07.2021

- iCVL Group: Reading Group

- Reading Group - Methods for the treatment of visual impairment
Abstract: In this meeting we will be discussing the following papers:
1. van Rheede, Joram J., Christopher Kennard, and Stephen L. Hicks. "Simulating prosthetic vision: Optimizing the information content of a limited visual display." Journal of Vision 10.14 (2010): 32-32.
2. Sanchez-Garcia, Melani, Ruben Martinez-Cantin, and Jose J. Guerrero. "Semantic and structural image segmentation for prosthetic vision." PLoS ONE 15.1 (2020): e0227677.
- 01.07.2021

- iCVL Group: Open problems in computer vision

- Open problems in computer vision and problems that exist in the human vision - brainstorming
Abstract: We will meet for a series of sessions dealing with theoretical and applied solutions to open problems in the world of computational vision. The meeting will be held round-table style, so everyone can participate.
In the first meeting, the lab members will survey and discuss the open problems that exist in the field.
In the first hour everyone will look for interesting problems in the field; in the second hour we will discuss which issues we would like to address in the next sessions.
Please note that the goal is not necessarily to produce a joint lab project, but to get to know new tools and develop our thinking about the subject (and perhaps encourage us to participate in group competitions).
- 24.06.2021

- iCVL Group: Open problems in computer vision

- Open problems in computer vision and medicine - brainstorming
Abstract: We will meet for a series of sessions dealing with theoretical and applied solutions to open problems in the world of computational vision. The meeting will be held round-table style, so everyone can participate.
In the first meeting, the lab members will survey and discuss the open problems that exist in the field.
In the first hour everyone will look for interesting problems in the field; in the second hour we will discuss which issues we would like to address in the next sessions.
Please note that the goal is not necessarily to produce a joint lab project, but to get to know new tools and develop our thinking about the subject (and perhaps encourage us to participate in group competitions).
- 03.06.2021

- iCVL Group: Open problems in computer vision

- Open problems in computer vision - brainstorming
Abstract: We will meet for a series of sessions dealing with theoretical and applied solutions to open problems in the world of computational vision. The meeting will be held round-table style, so everyone can participate.
In the first meeting, the lab members will survey and discuss the open problems that exist in the field.
In the first hour everyone will look for interesting problems in the field; in the second hour we will discuss which issues we would like to address in the next sessions.
Please note that the goal is not necessarily to produce a joint lab project, but to get to know new tools and develop our thinking about the subject (and perhaps encourage us to participate in group competitions).
- 29.04.2021

- iCVL Group: Modeling the Human Brain

- Modeling the Human Brain – A Long-Term Perspective
Abstract: We meet to watch a lecture on brain modeling, following a film about the "Human Brain Project". The video lasts about 45 minutes, after which we will continue the discussion on the subject.
Lecture abstract:
The best understood cortex is the primary visual cortex of the mouse. This is witnessed by state-of-the-art simulations of ~1 mm³ of V1 from the adult mouse, using two different levels of granularity (point neurons, spatially extended HH models). These models, based on the massive Allen Institute databanks of functional connectivity and in vivo recordings, quantitatively replicate in vivo spiking data that the model was not trained on (Billeh et al., Neuron 2020). The model is being extended to include detailed connectivity from electron-microscopically reconstructed data of mouse V1. It is likely that within a few decades such models could be extended to faithfully simulate the brain and the behavior of mice, predicting genuinely new phenomena and system-level properties. The human brain is three orders of magnitude bigger than the mouse brain, and there is currently little evidence to suggest that it is, per unit volume, significantly more complex. The field is now in the first stages of assembling a dataset of individual human pyramidal neurons and interneurons, based on in vitro data from neurosurgical samples. This provides a first, but limited, view onto human brain circuits at the cellular level. For the foreseeable future, we will not have access to in vivo cellular data or synaptic learning rules. This will impose unique limits on our ability to faithfully simulate the human brain at the micro-functional level over the next thirty or more years.
- 22.04.2021

- iCVL Group: "In Silico"

- In Silico - The human brain project
Abstract: We will provide some background about the Human Brain Project (HBP) and watch the live screening of the film "In Silico". Following the screening, we will have an open discussion on Zoom.
The film's website: https://insilicofilm.com/
- 21.03.2021

- iCVL Group: Reading Group

- Reading Group - Methods for the treatment of visual impairment
Abstract: In this meeting we will be discussing the following papers:
1. Niketeghad, Soroush, and Nader Pouratian. "Brain machine interfaces for vision restoration: the current state of cortical visual prosthetics." Neurotherapeutics 16.1 (2019): 134-143.
2. Chen, Xing, et al. "Shape perception via a high-channel-count neuroprosthesis in monkey visual cortex." Science 370.6521 (2020): 1191-1196.
- 07.03.2021

- iCVL Group: Reading Group

- Reading Group - Methods for the treatment of visual impairment
Abstract: In this meeting we will be discussing the following paper:
Ong, Jong Min, and Lyndon da Cruz. "The bionic eye: a review." Clinical & Experimental Ophthalmology 40.1 (2012): 6-17.
- 17.02.2021

- iCVL Group: Peleg Harel - BGU

- Presentation of Peleg's thesis
Abstract:
Title: Lazy caterer jigsaw puzzles: Models, properties, and a mechanical system-based solver
Jigsaw puzzle solving, the problem of constructing a coherent whole from a set of non-overlapping unordered fragments, is fundamental to numerous applications, and yet most of the literature has focused thus far on less realistic puzzles whose pieces are identical squares. Here we formalize a new type of jigsaw puzzle where the pieces are general convex polygons generated by cutting through a global polygonal shape with an arbitrary number of straight cuts, a generation model inspired by the celebrated Lazy caterer's sequence. We analyze the theoretical properties of such puzzles, including the inherent challenges in solving them once pieces are contaminated with geometrical noise. To cope with such difficulties and obtain tractable solutions, we abstract the problem as a multi-body spring-mass dynamical system endowed with hierarchical loop constraints and a layered reconstruction process.
- 10.02.2021

- iCVL Group: Peleg Harel - BGU

- Presentation of Peleg's thesis
Abstract:
Title: Lazy caterer jigsaw puzzles: Models, properties, and a mechanical system-based solver
Jigsaw puzzle solving, the problem of constructing a coherent whole from a set of non-overlapping unordered fragments, is fundamental to numerous applications, and yet most of the literature has focused thus far on less realistic puzzles whose pieces are identical squares. Here we formalize a new type of jigsaw puzzle where the pieces are general convex polygons generated by cutting through a global polygonal shape with an arbitrary number of straight cuts, a generation model inspired by the celebrated Lazy caterer's sequence. We analyze the theoretical properties of such puzzles, including the inherent challenges in solving them once pieces are contaminated with geometrical noise. To cope with such difficulties and obtain tractable solutions, we abstract the problem as a multi-body spring-mass dynamical system endowed with hierarchical loop constraints and a layered reconstruction process.
- 03.02.2021

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 27.01.2021

- iCVL Group: Keren Berger - BGU

- Monthly Reading Group
Abstract: In this meeting we will be discussing the following paper:
Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 3431-3440).
- 10.01.2021

- iCVL Group: Keren Berger - BGU

- TBA
Abstract: TBA
- 30.12.2020

- iCVL Group: Keren Berger - BGU

- Presentation of Keren's thesis
Abstract: Advances in information technology have led to an increase in the need for user authentication methods for cybersecurity that, on the one hand, provide maximum security and, on the other hand, offer high usability. In current methods, however, these two requirements are in conflict, with an improvement in one leading to a deterioration in the other. In this work, we present a new approach to building biometric identification systems, which uses the existing connection between the characteristics of a person's sensory organs and his perceptual abilities to authenticate users in a usable way that allows for maximum security. As a case study, we describe a possible application of a biometric identification system according to our method, which uses interpersonal differences in color perception based on differences in the retinal cells' properties that are suppressed in the retina. Through experiments on artificial data we created, we investigate the system's sensitivity to changes in its components and parameter values. The results we demonstrate form the basis for future research in the proposed direction.
- 23.12.2020

- iCVL Group: Keren Berger - BGU

- Presentation of Keren's thesis
Abstract: Advances in information technology have led to an increase in the need for user authentication methods for cybersecurity that, on the one hand, provide maximum security and, on the other hand, offer high usability. In current methods, however, these two requirements are in conflict, with an improvement in one leading to a deterioration in the other. In this work, we present a new approach to building biometric identification systems, which uses the existing connection between the characteristics of a person's sensory organs and his perceptual abilities to authenticate users in a usable way that allows for maximum security. As a case study, we describe a possible application of a biometric identification system according to our method, which uses interpersonal differences in color perception based on differences in the retinal cells' properties that are suppressed in the retina. Through experiments on artificial data we created, we investigate the system's sensitivity to changes in its components and parameter values. The results we demonstrate form the basis for future research in the proposed direction.
- 09.12.2020

- Peleg Harel - Snapchat

- TBA
Abstract: Peleg Harel (a graduate of the laboratory) will be our guest and will share his work in the field of computer vision at Snapchat.
- 02.12.2020

- iCVL Group: Ben Vardi - BGU

- Monthly Reading Group
Abstract: In this meeting we will be discussing the following paper:
Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., ... & Bengio, Y. (2014). Generative adversarial nets. In Advances in neural information processing systems (pp. 2672-2680).
- 18.11.2020

- iCVL Group: Keren Berger - BGU

- Presentation of Keren's thesis
Abstract: Advances in information technology have led to an increase in the need for user authentication methods for cybersecurity that, on the one hand, provide maximum security and, on the other hand, offer high usability. In current methods, however, these two requirements are in conflict, with an improvement in one leading to a deterioration in the other. In this work, we present a new approach to building biometric identification systems, which uses the existing connection between the characteristics of a person's sensory organs and his perceptual abilities to authenticate users in a usable way that allows for maximum security. As a case study, we describe a possible application of a biometric identification system according to our method, which uses interpersonal differences in color perception based on differences in the retinal cells' properties that are suppressed in the retina. Through experiments on artificial data we created, we investigate the system's sensitivity to changes in its components and parameter values. The results we demonstrate form the basis for future research in the proposed direction.
- 11.11.2020

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 04.11.2020

- Roy Toren - BGU

- Differential Geometry Overview - 2
Abstract: In this meeting, we will go over fundamental topics in differential geometry such as 2D curves, 3D curves, surfaces, and surface curvatures, in both the continuous and the discrete case.
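As a pointer to the kind of material covered, the signed curvature of a planar parametric curve (x(t), y(t)), one of the basic quantities in such an overview, is:

```latex
\kappa(t) = \frac{x'(t)\,y''(t) - y'(t)\,x''(t)}{\left(x'(t)^2 + y'(t)^2\right)^{3/2}}
```

In the discrete case, a common analogue estimates the turning angle at each polyline vertex divided by the local arc length.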
- 21.10.2020

- Roy Toren - BGU

- Differential Geometry Overview - 1
Abstract: In this meeting, we will go over fundamental topics in differential geometry such as 2D curves, 3D curves, surfaces, and surface curvatures, in both the continuous and the discrete case.
- 26.08.2020

- Seminar Slot

- TBA
Abstract: TBA
- 19.08.2020

- iCVL Group: Ilan Git - BGU

- Monthly Reading Group
Abstract: In this meeting we will be discussing the following paper:
He, K., Zhang, X., Ren, S., & Sun, J. (2016). Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 770-778).
- 12.08.2020

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 29.07.2020

- Moshe Eliasof - BGU

- Graph Convolutional Networks via Differential Operators
Abstract: Graph Convolutional Networks (GCNs) have been shown to be effective in handling unordered data like point clouds and meshes. In this work we propose novel approaches for graph convolution, pooling and unpooling, taking inspiration from finite-element and algebraic multigrid frameworks. We form a parameterized convolution kernel based on discretized differential operators, leveraging the graph mass, gradient and Laplacian. This way, the parameterization does not depend on the graph structure, only on the meaning of the network convolutions as differential operators. To allow hierarchical representations of the input, we propose pooling and unpooling operations that are based on algebraic multigrid methods. To motivate and explain our method, we compare it to standard Convolutional Neural Networks, and show their similarities and relations in the case of a regular grid. Our proposed method is demonstrated in various experiments such as classification and segmentation, achieving results on par with or better than the state of the art. We also analyze the computational cost of our method compared to other GCNs.
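To make the "convolution from differential operators" idea concrete, here is a schematic Python sketch under my own reading of the abstract: build discrete mass/gradient/Laplacian operators from the graph, then learn a convolution as a weighted mix of their responses. The operator construction and mixing below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def graph_operators(edges, n):
    """Discrete operators for a graph with n nodes.
    G: edge-wise gradient (|E| x n); L = G^T G: graph Laplacian;
    M: a (lumped) mass matrix, here simply the degree matrix."""
    G = np.zeros((len(edges), n))
    for k, (i, j) in enumerate(edges):
        G[k, i], G[k, j] = 1.0, -1.0
    L = G.T @ G
    M = np.diag(L.diagonal())          # node degrees on the diagonal
    return M, G, L

def diff_conv(X, M, L, W_m, W_l):
    """One 'differential' graph convolution: mix the mass and Laplacian
    responses of the node features with learned channel weights."""
    return (M @ X) @ W_m + (L @ X) @ W_l

edges = [(0, 1), (1, 2), (2, 0), (2, 3)]
M, G, L = graph_operators(edges, 4)
X = np.random.randn(4, 8)                          # 4 nodes, 8 input channels
W_m, W_l = np.random.randn(8, 16), np.random.randn(8, 16)
print(diff_conv(X, M, L, W_m, W_l).shape)          # (4, 16)
```

Because the learned weights act on operator responses rather than on a fixed neighborhood stencil, the same parameters apply to graphs of any size or connectivity, which is the structure-independence the abstract highlights.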
- 22.07.2020

- iCVL Group: Roy Toren - BGU

- Monthly Reading Group
Abstract: In this meeting we will be discussing the following paper:
Zeiler, M. D., & Fergus, R. (2014, September). Visualizing and understanding convolutional networks. In European conference on computer vision (pp. 818-833). Springer, Cham.
- 15.07.2020

- Keren Berger - BGU

- TBA
Abstract: TBA
- 08.07.2020

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 24.06.2020

- Ben Vardi - BGU

- Puzzle Solving With Relaxation Labeling
Abstract: The topic of the meeting is our ongoing work on jigsaw puzzle solving, done in collaboration with colleagues from Ca' Foscari University of Venice.
We will talk about approaches to assembling jigsaw puzzles, focusing specifically on our approach, which formulates the problem as a relaxation labeling problem.
Moreover, we will discuss general challenges in the puzzle solving problem, and specific challenges that apply to our method.
- 17.06.2020

- Visual Computing Seminar: Irit Chelly - BGU

- JA-POLS: a Moving-camera Background Model via Joint Alignment and Partially-overlapping Local Subspaces
Abstract: Background models are widely used in computer vision. While successful Static-camera Background (SCB) models exist, Moving-camera Background (MCB) models are limited. Seemingly, there is a straightforward solution: 1) align the video frames; 2) learn an SCB model; 3) warp either original or previously-unseen frames toward the model. This approach, however, has drawbacks, especially when the accumulative camera motion is large and/or the video is long. Here we propose a purely-2D unsupervised modular method that systematically eliminates those issues. First, to estimate warps in the original video, we solve a joint-alignment problem while leveraging a certifiably-correct initialization. Next, we learn both multiple partially-overlapping local subspaces and how to predict alignments. Lastly, in test time, we warp a previously-unseen frame, based on the prediction, and project it on a subset of those subspaces to obtain a background/foreground separation. We show the method handles even large scenes with a relatively-free camera motion (provided the camera-to-scene distance does not change much) and that it not only yields State-of-the-Art results on the original video but also generalizes gracefully to previously-unseen videos of the same scene.
The talk is based on [Chelly et al., CVPR '20].
Speaker's short bio:
Irit Chelly is a Computer Science PhD student at Ben-Gurion University under the supervision of Dr. Oren Freifeld at the Vision, Inference, and Learning group. Her current research focuses on unsupervised learning and video analysis. She is interested in probabilistic graphical models, spatial transformations, dimensionality reduction, and deep learning. Irit won the national-level Aloni PhD scholarship from Israel's Ministry of Technology and Science as well as the BGU Hi-tech scholarship for excellent PhD students.
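One ingredient of the pipeline described above, projecting a warped frame onto a learned subspace and reading the residual as foreground, can be illustrated with a plain PCA subspace in Python. This is a toy version of that single step under my own simplifications (one global subspace, frames already aligned), not the JA-POLS method itself.

```python
import numpy as np

def fit_background_subspace(frames, k=10):
    """Fit a k-dimensional linear subspace to aligned, vectorized frames (N x D)."""
    mean = frames.mean(axis=0)
    _, _, Vt = np.linalg.svd(frames - mean, full_matrices=False)
    return mean, Vt[:k]                        # orthonormal basis of the subspace

def separate(frame, mean, basis, thresh=0.1):
    """Project a frame onto the subspace; the residual flags foreground."""
    coeffs = basis @ (frame - mean)
    background = mean + basis.T @ coeffs
    foreground_mask = np.abs(frame - background) > thresh
    return background, foreground_mask

frames = np.random.rand(50, 64 * 64)           # 50 aligned 64x64 frames
mean, basis = fit_background_subspace(frames)
background, foreground = separate(frames[0], mean, basis)
```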
- 10.06.2020

- iCVL Group: Rotem Mairon - BGU

- Monthly Reading Group
Abstract: In this meeting we will be discussing the following paper:
Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
- 03.06.2020

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 27.05.2020

- iCVL Group: Keren Berger - BGU

- Monthly Reading Group
Abstract: AlexNet - Part 2/2: In this meeting we will continue last week's discussion - we will give an introduction to the field of neural networks, with a focus on CNNs in particular. We will also be discussing AlexNet, as presented in the following paper:
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097-1105).
- 20.05.2020

- iCVL Group: Keren Berger - BGU

- Monthly Reading Group
Abstract: AlexNet - Part 1/2: In this meeting an introduction will be given to the field of neural networks, with a focus on CNNs in particular. We will also be discussing AlexNet, as presented in the following paper:
Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097-1105).
- 13.05.2020

- Ilan Git - BGU

- Underwater Object Localization
Abstract: Ilan will present an overview of his thesis research on underwater object localization. He will explain the motivation behind the project, describe the current limitations in this research field, and propose some directions for solutions. Ilan will also give an introduction to the Simultaneous Localization and Mapping (SLAM) algorithm, and present its connection to his research plan.
- 22.04.2020

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 01.04.2020

- iCVL Group: Ben Vardi - BGU

- Monthly Reading Group
Abstract: In this meeting we will be discussing the following paper:
Dalal, N., & Triggs, B. (2005, June). Histograms of oriented gradients for human detection. In 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR'05) (Vol. 1, pp. 886-893). IEEE.
- 25.03.2020

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 18.03.2020

- Roy Toren - BGU

- 3D Puzzle Solving with Aspects of Archaeology
Abstract: Roy will present an overview of his thesis on 3D puzzle solving with aspects of archaeology. He will explain the motivation behind the problem, current existing solutions, aspects which could be improved and a direction which he will continue to explore in his thesis work.
- 04.03.2020

- Guy Amit - BGU

- Neural Network Representation Control: Gaussian Isolation Machines and CVC Regularization
Abstract: In many cases, neural network classifiers are likely to be exposed to input data that is outside of their training distribution. Samples from outside the distribution may be classified as an existing class with high probability by softmax-based classifiers; such incorrect classifications affect the performance of the classifiers and the applications/systems that depend on them. Previous research aimed at distinguishing training distribution data from out-of-distribution (OOD) data has proposed detectors that are external to the classification method. We present the Gaussian isolation machine (GIM), a novel hybrid (generative-discriminative) classifier aimed at solving the problem arising when OOD data is encountered. The GIM is based on a neural network and utilizes a new loss function that imposes a distribution on each of the trained classes in the neural network's output space, which can be approximated by a Gaussian. The proposed GIM's novelty lies in its discriminative performance and generative capabilities, a combination of characteristics not usually seen in a single classifier. The GIM achieves state-of-the-art classification results on image recognition and sentiment analysis benchmark datasets, and can also deal with OOD inputs. We also demonstrate the benefits of incorporating part of the GIM's loss function into standard neural networks as a regularization method.
The paper can be found on arXiv:
https://arxiv.org/pdf/2002.02176.pdf
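As a toy rendering of the idea of imposing a per-class Gaussian on the output space, the Python sketch below penalizes the distance of each sample's output from a learnable mean of its class, an isotropic-Gaussian negative log-likelihood up to constants. The actual GIM loss in the paper differs in its details; treat this only as the gist.

```python
import torch

def gaussian_isolation_loss(z, y, class_means, sigma=1.0):
    """z: (B, D) network outputs; y: (B,) labels; class_means: (C, D).
    Isotropic-Gaussian NLL (up to constants): pull each output toward
    its class mean. A simplification, not the paper's exact loss."""
    mu = class_means[y]                                   # (B, D)
    return ((z - mu) ** 2).sum(dim=1).mean() / (2 * sigma ** 2)

# Usage with learnable class means:
B, D, C = 32, 16, 10
class_means = torch.nn.Parameter(torch.randn(C, D))
z = torch.randn(B, D)                                     # network outputs
y = torch.randint(0, C, (B,))
gaussian_isolation_loss(z, y, class_means).backward()
```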
- 26.02.2020

- Assaf Arbelle - BGU

- QANet - A Quality Assurance Network for Image Segmentation
Abstract: We introduce a novel Deep Learning framework, which quantitatively estimates image segmentation quality without the need for human inspection or labeling. We refer to this method as the Quality Assurance Network (QANet). Specifically, given an image, and a proposed corresponding segmentation obtained by any method including manual annotation, QANet solves a regression problem in order to estimate a predefined quality measure with respect to the unknown ground-truth. QANet is by no means yet another segmentation method. Instead, it performs a multi-level, multi-feature comparison of an image-segmentation pair based on a unique network architecture, called RibCage.
To demonstrate the strength of QANet, we addressed the evaluation of instance segmentation using two different datasets from different domains, namely, high-throughput live-cell microscopy images from the Cell Segmentation Benchmark and natural images of plants from the Leaf Segmentation Challenge. While synthesized segmentations were used to train the QANet, it was tested on segmentations obtained by publicly available methods that participated in the different challenges. We show that the QANet accurately estimates the scores of the evaluated segmentations with respect to the hidden ground-truth, as published by the challenges’ organizers.
- 19.02.2020

- iCVL Group: Roy Toren - BGU

- Monthly Reading Group
Abstract: In this meeting we will be discussing the following paper:
Viola, P., & Jones, M. (2001). Rapid object detection using a boosted cascade of simple features. CVPR (1), 1(511-518), 3.
- 12.02.2020

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 22.01.2020

- Visual Computing Seminar: Oran Shayer - BMW

- Enhancing Generic Segmentation with Learned Region Representations
Abstract:
Current successful approaches for generic (non-semantic) segmentation rely mostly on edge detection and have leveraged the strengths of deep learning mainly by improving the edge detection stage in the algorithmic pipeline. This is in contrast to semantic and instance segmentation, where deep learning has made a dramatic impact and DNNs are applied directly to generate pixel-wise segment representations. We propose a new method for learning a pixel-wise region representation that reflects segment relatedness. This representation is combined with an edge map to yield a new segmentation algorithm. We show that the representations themselves achieve state-of-the-art segment similarity scores. Moreover, the proposed, combined segmentation algorithm provides results that either match or improve on the state of the art, for most quality measures.
Bio:
Oran holds a BSc and MSc in EE from the Technion, majoring in machine learning, computer vision and deep learning. In the last 10 years, Oran worked at various companies such as Apple, Intel and GM, and also gained experience in the startup scene, working for Clair Labs. He currently holds a position as a machine learning researcher at BMW.
- 15.01.2020

- Roy Uziel & Meitar Ronen - BGU

- Bayesian Adaptive Superpixel Segmentation
Abstract: Roy and Meitar will present their ICCV-2019 paper,
Bayesian Adaptive Superpixel Segmentation,
https://www.cs.bgu.ac.il/~orenfr/BASS/Uziel_ICCV_2019.pdf
Superpixels provide a useful intermediate image representation. Existing superpixel methods, however, suffer from at least some of the following drawbacks: 1) topology is handled heuristically; 2) the number of superpixels is either predefined or estimated at a prohibitive cost; 3) lack of adaptiveness. As a remedy, we propose a novel probabilistic model, self-coined Bayesian Adaptive Superpixel Segmentation (BASS), together with an efficient inference. BASS is a Bayesian nonparametric mixture model that also respects topology and favors spatial coherence. The optimization-based and topology-aware inference is parallelizable and implemented on GPU. Quantitatively, BASS achieves results that are either better than the state of the art or close to it, depending on the performance index and/or dataset. Qualitatively, we argue it achieves the best results; we demonstrate this not only by subjective visual inspection but also by objective quantitative performance evaluation of the downstream application of face detection.
- 08.01.2020

- iCVL Group: Ben Vardi - BGU

- Monthly Reading Group
Abstract: In this meeting we will be discussing the following paper:
Lowe, D. G. (1999, September). Object recognition from local scale-invariant features. In ICCV (Vol. 99, No. 2, pp. 1150-1157).
- 01.01.2020

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 25.12.2019

- Visual Computing Seminar: Ran M. Bittmann & Darya Afonina - SAP

- Measuring Advertisement Impact in Live Events
Abstract: In this talk we will present a Computer Vision (CV) project out of SAP's Innovation Lab in Israel. The project addresses the problem of quantifying the impact of advertisements in live events such as sports events. While the talk discusses the CV technologies used in the project, emphasis will be given to the technical challenges of implementing these technologies in a commercial environment.
Bio:
- Ran M. Bittmann is a Data Scientist at SAP Concur Israel focusing on machine learning, predictive analytics and fraud detection. Ran has 40 years of experience in the software industry. Prior to SAP, Ran held executive positions in several successful start-up companies in the areas of business intelligence, mobile applications and data communication. Ran holds a PhD in Information Systems from the Graduate School of Business Administration at Bar-Ilan University.
- Darya Afonina is a Scrum Master at SAP Concur Israel. In her daily work Darya focuses on supporting two teams in Agile operations, as well as on project management tasks. Darya is finishing her MBA at Bar-Ilan University.
- 18.12.2019

- Visual Computing Seminar: Dr. Amit Haim Bermano - TAU

- Multi-modal Registration, Rapid Model Placement and Disentanglement: Deep-learning in the Service of Computer Graphics
Abstract:
In this talk I will go over some of the recent and current work done in the computer science lab for graphics and vision at TAU. I will mainly present recently accepted works regarding non-rigid image registration between different sensors, estimating 3D bounding boxes from single images of buildings, and image content transfer between domains using disentanglement. Time permitting, I will also briefly mention using vision for head-scan calibration for children, hand tracking, and vascular detection in MRI.
Bio: Dr. Amit H. Bermano has been a Senior Lecturer at the Blavatnik School of Computer Science in Tel-Aviv University since 2018. Beforehand, he was a postdoctoral researcher at the Princeton Graphics Group, and a postdoctoral researcher at Disney Research Zurich for a short period. He conducted his Ph.D. studies at ETH Zurich under the supervision of Prof. Dr. Markus Gross, in collaboration with Disney Research Zurich. His master's and bachelor's degrees were obtained at the Technion under the supervision of Prof. Dr. Craig Gotsman. His research focuses on Computer Graphics and Machine Learning, with applications in computational fabrication, geometry processing and medical imaging.
- 11.12.2019

- Keren Berger - BGU

- Introduction to Biometrics
Abstract: TBA
- 04.12.2019

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 27.11.2019

- Ehud Barnea - Trax

- TBA
Abstract: TBA
- 20.11.2019

- iCVL Group: Peleg Harel - BGU

- Monthly Reading Group
Abstract: In this meeting we will be discussing the following paper:
Shi, J. (1994, June). Good features to track. In 1994 Proceedings of IEEE conference on computer vision and pattern recognition (pp. 593-600). IEEE.
- 06.11.2019

- Matan Shaked - BGU

- Natural Image Statistics and Reconstruction
Abstract: TBA
- 23.10.2019

- iCVL Group

- Monthly Research Status Meeting
Abstract: Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 21.08.2019

- Keren Berger - BGU

- Introduction to Color Spaces
Abstract: TBA
- 14.08.2019

- iCVL Group

- Monthly Research Status Meeting
Abstract:
Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 07.08.2019

- Ben Vardi - BGU

- Puzzle Solving with Relaxation Labeling
Abstract: We will review fundamental principles of the Relaxation Labeling problem and algorithm, and see how the puzzle problem may be formulated as a Relaxation Labeling problem.
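For background ahead of the session, the classic relaxation labeling update (Rosenfeld-Hummel-Zucker style) can be sketched in a few lines of Python: each object holds a probability vector over labels, iteratively reweighted by the support of compatible neighbor assignments. The puzzle-specific formulation is the talk's subject and is not shown; the compatibilities below are placeholders.

```python
import numpy as np

def relaxation_labeling(P, R, iters=100):
    """P: (n, m) initial label probabilities for n objects and m labels.
    R: (n, m, n, m) compatibilities R[i, a, j, b] between the assignments
    'object i has label a' and 'object j has label b' (non-negative here)."""
    for _ in range(iters):
        # Support for each (object, label) pair from all other assignments.
        Q = np.einsum('iajb,jb->ia', R, P)
        P = P * Q
        P /= P.sum(axis=1, keepdims=True)      # renormalize per object
    return P

n, m = 4, 4                                     # e.g., 4 pieces, 4 locations
P = np.full((n, m), 1.0 / m)                    # start from uniform labelings
R = np.random.rand(n, m, n, m)                  # placeholder compatibilities
print(relaxation_labeling(P, R).round(2))
```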
- 10.07.2019

- iCVL Group

- Monthly Research Status Meeting
Abstract:
Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 03.07.2019

- Rotem Mairon - BGU

- TBA
Abstract: TBA
- 19.06.2019

- Visual Computing Seminar: Rana Hanocka - TAU

- MeshCNN: A Network with an Edge
Abstract: Polygonal meshes provide an efficient representation for 3D shapes. They explicitly capture both shape surface and topology, and leverage non-uniformity to represent large flat regions as well as sharp, intricate features. This non-uniformity and irregularity, however, inhibits mesh analysis efforts using neural networks that combine convolution and pooling operations. In this talk, I discuss how we utilize the unique properties of the mesh for a direct analysis of 3D shapes using MeshCNN, a convolutional neural network designed specifically for triangular meshes. Analogous to classic CNNs, MeshCNN combines specialized convolution and pooling layers that operate on the mesh edges, by leveraging their intrinsic geodesic connections. Convolutions are applied on edges and the four edges of their incident triangles, and pooling is applied via an edge collapse operation that retains surface topology, thereby, generating new mesh connectivity for the subsequent convolutions. MeshCNN learns which edges to collapse, thus forming a task-driven process where the network exposes and expands the important features while discarding the redundant ones. We demonstrate the effectiveness of MeshCNN on various learning tasks applied to 3D meshes.
Project page: https://ranahanocka.github.io/MeshCNN/
The speaker's short bio: Rana Hanocka is a Ph.D. candidate under the supervision of Daniel Cohen-Or and Raja Giryes at Tel Aviv University. Her research is focused on using deep learning for understanding 3D shapes.
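The edge convolution described in the abstract can be sketched in Python: each edge aggregates its own feature with symmetric combinations of its four incident-triangle neighbors, so the result is invariant to the ordering ambiguity of the two triangles. The exact symmetric functions and layout follow my reading of the MeshCNN paper; treat the details as approximate.

```python
import numpy as np

def mesh_edge_conv(E, nbrs, W):
    """E: (n_edges, c) edge features. nbrs: (n_edges, 4) indices of the four
    edges (a, b, c, d) sharing the two triangles incident to each edge.
    W: (5*c, c_out) learned weights. The |.| and + pairings make the op
    invariant to swapping the two triangles (a<->c, b<->d)."""
    a, b, c, d = (E[nbrs[:, k]] for k in range(4))
    feats = np.concatenate(
        [E, np.abs(a - c), a + c, np.abs(b - d), b + d], axis=1)
    return feats @ W

E = np.random.randn(10, 8)                      # 10 edges, 8 channels
nbrs = np.random.randint(0, 10, (10, 4))        # placeholder connectivity
W = np.random.randn(5 * 8, 16)
print(mesh_edge_conv(E, nbrs, W).shape)         # (10, 16)
```

Pooling then proceeds by collapsing low-priority edges and merging their features, which rebuilds the connectivity for the next convolution, as the abstract describes.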
- 12.06.2019

- Rotem Mairon - BGU

- TBA
Abstract: TBA
- 05.06.2019

- Visual Computing Seminar: Prof. Ayellet Tal - Technion

- Past Forward: When Computer Graphics and Archaeology Meet
Abstract: Technology is the symbol of our age. Nevertheless, some fields have been left out of the revolution. One of these is archaeology, where many tasks are still performed manually - from the initial excavations, through documentation, to restoration.
It turns out that some of these activities are classical computer graphics (and/or computer vision) tasks, such as puzzle solving, shape completion, matching, and edge detection. The objects, however, are much harder to deal with than usual, since they are broken and eroded after lying underground for thousands of years. Therefore, being able to handle these objects benefits not only archaeology, but also computer graphics.
In this talk I will describe some of the algorithms we have developed to replace manual restoration and show some results.
- 22.05.2019

- Beba Cibralic - Georgetown University

- Autonomous Weapons - Ethical Challenges and Opportunities
Abstract: Advancements in sensor recognition, processing speeds, and machine learning have helped create machines that are increasingly capable of performing complex tasks without human involvement. As machines develop the capacity to operate without human control, new societal questions arise. One salient concern is that this technology will be used to develop fully autonomous weapons, "killer robots", which have the ability to select targets without human engagement. In this discussion, we will examine (i) whether existing paradigms for regulating warfare can accommodate the introduction of fully autonomous weapons systems. We will also explore (ii) the relationship between autonomy and responsibility, as well as (iii) the ethical benefits of using semi-autonomous weapons. We will conclude by considering (iv) whether a preemptive ban on developing this technology is needed.
- 15.05.2019

- Peleg Harel - BGU

- Solving Archaeological Puzzles
Abstract: The paper presented by Peleg in this meeting proposes a fully automatic and general algorithm for solving archaeological puzzles.
- 01.05.2019

- iCVL Group

- Monthly Research Status Meeting
Abstract:
Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 10.04.2019

- Visual Computing Seminar: Dr. Derya Akkaynak - Princeton University

- Sea-thru: Towards A Robust Method to Remove Water From Underwater Images
Abstract: Very large underwater image datasets are generated every day that capture important information regarding the state of our oceans (e.g., coral reef coverage, fish abundance, condition of vulnerable seafloor habitats, etc.). While large image datasets taken on land can be efficiently analyzed with a plethora of computer vision and machine learning algorithms, underwater datasets do not benefit from the full power of these methods because water degrades images too severely for automated analysis. In contrast to images taken in air, path radiance (backscatter) in underwater images cannot be neglected, and object radiance diminishes quickly even across short distances from the camera. Researchers aiming to restore lost colors and contrast in underwater images are frequently faced with unstable results: available methods are either not robust, are too sensitive, or only work for short object ranges. Consequently, the analysis of most underwater imagery requires costly manual effort; on average, a human expert spends over 2 hours identifying and counting fish in a video that is one hour long.
In this talk, bridging optical oceanography and underwater computer vision, I will show that a fundamental reason for the lack of a robust color reconstruction method is a fairly simple one: the underwater image formation equation used by the computer vision community for the past 30 years is actually a simplification of the radiative transfer equation for horizontal imaging in the atmosphere. Then, based on the physically accurate equation I recently proposed and validated, I will introduce the Sea-thru algorithm that successfully removes water from underwater images, revealing the underwater world in a way we have never seen before. Finally, I will discuss the potential of leveraging high-resolution (and free) ocean color data from Sentinel 3A/B satellites to boost underwater computer vision algorithms.
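The distinction the talk draws can be written out explicitly. In the widely used atmospheric-style model, a single attenuation coefficient governs both terms, whereas the revised underwater formation model (following Akkaynak and Treibitz's related CVPR work; the notation here is my paraphrase) lets the direct signal and the backscatter attenuate with different coefficients:

```latex
I_c \;=\; \underbrace{J_c\, e^{-\beta_c^{D} z}}_{\text{direct signal}}
\;+\; \underbrace{B_c^{\infty}\left(1 - e^{-\beta_c^{B} z}\right)}_{\text{backscatter}}
```

where $I_c$ is the captured intensity in color channel $c$, $J_c$ the unattenuated scene radiance, $z$ the range to the object, and, in general, $\beta_c^{D} \neq \beta_c^{B}$.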
The speaker's short bio: Dr. Derya Akkaynak is a mechanical engineer and oceanographer (PhD MIT & Woods Hole Oceanographic Institution ‘14) whose research focuses on problems in underwater imaging and computer vision. In addition to using off-the-shelf RGB cameras for scientific data acquisition underwater, she also uses hyperspectral sensors to investigate how the world appears to non-human animals. Derya has professional, technical, and scientific diving certifications and has conducted fieldwork in the Bering Sea, Red Sea, Antarctica, Caribbean, Northern and Southern Pacific and Atlantic, and her native Aegean.
- 03.04.2019

- iCVL Group

- Monthly Research Status Meeting
Abstract:
Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 27.03.2019

- Visual Computing Seminar: Majeed Kassis - BGU

- Alignment and Line Detection of Manuscripts: One Learning-Based and One Learning-Free Algorithm
Abstract: Handling manuscripts, in contrast to modern text, is a much more challenging task. Due to the nature and history of these documents, they contain several unique characteristics, such as multi-skewed text lines, different inter-line distances, touching text lines, and multi-size text. All these characteristics come in addition to the deteriorating condition of the manuscript due to ageing, handling, and storage conditions over the centuries.
In this talk I wish to present two of my latest works. The first work, based on a deep learning model, tackles the unique text alignment problem for historical documents.
This is a widely known problem in manuscript analysis for historians, and the attempt to find the differences between manuscript editions is, to this day, done by hand. Most of the computational tools that assist historians are based on word recognition systems. In this work, I will present a Siamese neural network based system which automatically identifies whether a pair of images contains the same text, without the need to recognize the text itself. The user is required to annotate several pages of the two manuscripts they wish to align, and with the assistance of the model we are able to align the two manuscripts. The algorithm is robust to writing style differences between the manuscripts, the text's condition, and its quality.
The second work, which has been submitted recently, tackles the line detection problem in a learning-free manner. The vast majority of manuscript line detection algorithms are learning-based, forcing the user to annotate training data before applying the line detection algorithm.
In this work, I will present a learning-free system for line detection in manuscripts. The system is based on a Document Graph automatically generated for the document. We begin by applying a distance transform to the image, extract the image skeleton, and generate a graph by detecting the vertices and edges of the skeleton. After several iterative and automatic steps we are able to merge the graph edges together to form the document lines.
We applied the system to the recently released DIVA-HisDB dataset, and achieved a line detection IU accuracy of 85.92%.
- 13.03.2019

- iCVL Group

- Monthly Research Status Meeting
Abstract:
Each iCVL team member will give an update on the status of his last month's actions, raise issues for discussion and brief consultation, and present his action items for the coming month.
- 06.03.2019

- Keren Berger - BGU

- Retinal Cone Mosaic and Individual Authentication - 3
Abstract: TBA
- 27.02.2019

- Visual Computing Seminar: Dr. Jonathan Laserson - Zebra Medical Vision

- Embrace The Noise: Mining Clinical Reports to Gain a Broad Understanding of Chest X-rays
Abstract: The chest X-ray scan is by far the most commonly performed radiological examination for screening and diagnosis of many cardiac and pulmonary diseases. It is also one of the hardest to interpret, with a disagreement rate of around 30% even for experienced radiologists. At Zebra, we have access to millions of X-ray scans, as well as their accompanying anonymized textual reports written by hospital radiologists. Can this data be used to teach an algorithm to identify significant clinical findings from these scans? By manually tagging a relatively small set of sentences, we were able to construct a training set of almost 1M studies over the 40 most prevalent chest X-ray pathologies. A deep learning model was trained to predict the findings given the patient's frontal and lateral scans. We compared the model's predictions to those made by a team of radiologists. Would the average radiologist agree more with his/her colleagues or with the model?
The speaker's short bio: Dr. Jonathan Laserson is the lead AI researcher at Zebra Medical Vision. He did his master's and undergraduate studies at the Technion and has a PhD from the Computer Science AI lab at Stanford University. After a few years doing machine learning at Google and IBM, today he is focused on deep learning algorithms and their application to medical image understanding.
- 30.01.2019

- Peleg Harel - BGU

- The Jigsaw Puzzle Problem
Abstract: In this meeting Peleg will present an overview of his thesis topic, "The Jigsaw Puzzle Problem". He will present the motivation behind the jigsaw puzzle problem, some past solutions, and the direction that will be explored in his thesis work for solving the problem.
- 23.01.2019

- Rotem Mairon - BGU

- Quantifying the Center Bias in Eye-Movements during Scene Viewing
Abstract: TBA
- 16.01.2019

- Keren Berger - BGU

- Retinal Cone Mosaic and Individual Authentication - 2
Abstract: In this meeting we will continue the discussion on the retinal cone mosaic and its possible use as a biometric method.
- 10.01.2019

- iCVL Group

- Hands-On Experience in Writing Paper Reviews
Abstract: This meeting will be dedicated to discussing papers received for peer review, and to jointly writing the reviews. The participation of all lab members in the process allows them to gain experience in this professional skill.
- 02.01.2019

- Keren Berger - BGU

- Retinal Cone Mosaic and Individual Authentication - 1
Abstract: The meeting will begin with Keren reviewing literature on the retinal cone mosaic and presenting its possible use as a biometric method, which we will then open for group discussion.
- 14.06.2017

- Seminar slot

- TBA
Abstract: TBA
- 07.06.2017

- Shay Zweig - TAU

- InterpoNet, A brain inspired neural network for optical flow dense interpolation
Abstract: Artificial neural networks are historically related to biological neural networks not only by name but also by some of their key concepts, such as convolutional neural networks. However, this analogy does not hold beyond the general concepts. Works that try to tie the fields closer together usually remain mostly theoretical, while the leading benchmarks are dominated by more computationally driven approaches. An open question is how we can imitate concepts drawn from the cortex in ANNs without losing the simplicity and efficiency of feed-forward inference and gradient-based training. In this talk I will present our work, in which we draw inspiration from concepts that we found in the monkey's visual cortex to solve a classic computer vision problem: sparse-to-dense interpolation for optical flow. We took an innovative approach, using training-time supervision in a CNN rather than changing the "anatomy" of the network, to enforce the brain-inspired concepts, leading to state-of-the-art results on the challenging benchmarks in the field.
- 17.05.2017

- Eran Goldman - BIU/Trax

- Large-Scale Classification of Structured Objects, by Nonlinear CRF with Deep Class Embedding
Abstract: This work presents a novel deep learning architecture to classify structured objects in datasets with a large number of visually similar categories. Our model extends the CRF objective function to a nonlinear form, by factorizing the pairwise potential matrix, to learn neighboring-class embeddings. The embedding and the classifier are jointly trained to optimize this highly nonlinear CRF objective function. The nonlinear model is trained on object-level samples, which is much faster and more accurate than the standard sequence-level training of the linear model. This model overcomes the difficulty existing CRF methods have in thoroughly learning the contextual relationships when the number of classes is large and the data is sparse. The performance of the proposed method is illustrated on a huge dataset that contains images of retail-store product displays, taken in varying settings and viewpoints, and shows significantly improved results compared to linear CRF modeling and sequence-level training.
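The factorization mentioned above can be made concrete with a toy Python sketch: instead of a full C x C pairwise potential table (prohibitive when C is large), neighboring classes are scored through low-dimensional class embeddings. This illustrates the general trick, not the paper's architecture; all names and sizes are made up.

```python
import numpy as np

C, d = 10000, 32                      # many classes, small embedding dimension
U = 0.01 * np.random.randn(C, d)      # "left" class embeddings
V = 0.01 * np.random.randn(C, d)      # "right" class embeddings

def pairwise_potential(ci, cj):
    """Factorized potential <U[ci], V[cj]>: a full table would need C*C
    parameters; the factorization needs only 2*C*d."""
    return U[ci] @ V[cj]

def sequence_score(unaries, labels):
    """CRF score of a labeling: unary terms plus factorized pairwise
    terms between neighboring objects in the sequence."""
    score = sum(unaries[t, y] for t, y in enumerate(labels))
    score += sum(pairwise_potential(labels[t], labels[t + 1])
                 for t in range(len(labels) - 1))
    return score

unaries = np.random.randn(5, C)       # 5 neighboring objects on a shelf
print(sequence_score(unaries, [3, 3, 7, 7, 42]))
```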
- 10.05.2017

- Yaniv Oiknine - BGU

- Compressive hyperspectral imaging
Abstract: Spectroscopic imaging has proved to be an effective tool for many applications in a variety of fields, such as biology, medicine, agriculture, remote sensing and industrial process inspection. However, due to the demand for high spectral and spatial resolution, it has become extremely challenging to design and implement such systems in a miniaturized and cost-effective manner. Using a compressive sensing setup based on a device that modulates the spectral domain and a sensor array, we demonstrate the reconstruction of hyperspectral image cubes from a number of spectral scanning shots an order of magnitude smaller than would be required using conventional systems. By examining the cubes we measured, we found that the performance of a target detection algorithm on our images is similar to its performance on conventional hyperspectral images. This principle was also used to build a compressive 4D spectro-volumetric imager and was implemented in an along-track scanning task.
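The reconstruction step behind such a system can be illustrated with a generic sparse-recovery solver in Python: measure y = Φx with far fewer measurements than unknowns and recover x by l1-regularized least squares (here plain ISTA). The actual spectral modulation and reconstruction pipeline in the talk is of course more involved; this only shows the compressive-sensing principle.

```python
import numpy as np

def ista(Phi, y, lam=0.01, iters=500):
    """ISTA for min_x 0.5 * ||Phi @ x - y||^2 + lam * ||x||_1."""
    L = np.linalg.norm(Phi, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(Phi.shape[1])
    for _ in range(iters):
        g = x - Phi.T @ (Phi @ x - y) / L    # gradient step
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft-threshold
    return x

n, m, k = 200, 50, 5                          # 200 unknowns, only 50 measurements
x_true = np.zeros(n)
x_true[np.random.choice(n, k, replace=False)] = np.random.randn(k)  # k-sparse signal
Phi = np.random.randn(m, n) / np.sqrt(m)      # random sensing matrix
x_hat = ista(Phi, Phi @ x_true)
print(np.linalg.norm(x_hat - x_true))         # small when k << m
```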
- 03.05.2017

- Amit Bermano - Princeton

- Geometric Methods for Realistic Animation of Faces
Abstract: In this talk, I will briefly introduce myself and mainly focus on my doctoral dissertation, which addresses realistic facial animation. Realistic facial synthesis is one of the most fundamental problems in computer graphics, and is desired in a wide variety of fields, such as film and advertising, computer games, teleconferencing, user-interface agents and avatars, and facial surgery planning. In the dissertation, we present the most commonly practiced facial content creation process, and contribute to the quality of each of its three steps. The proposed algorithms significantly increase realism and therefore substantially reduce the amount of manual labor required for production-quality facial content.
Bio: Amit H. Bermano is a postdoctoral researcher in the Graphics Group at Princeton University. He obtained his M.Sc at the Technion, Israel and his doctoral degree at ETH Zurich, in 2015. Before Joining Princeton University, he was a postdoctoral researcher at Disney Research, Zurich. His research interests are applying geometry processing techniques to other fields, potentially benefiting both of them, mainly in the seam between computer graphics and computer vision. His past research includes work in geometry processing, reconstruction, computational fabrication, and animation.
- 19.04.2017

- Gil Levi - TAU

- Temporal Tessellation: A Unified Approach for Video Analysis
Abstract: We present a general approach to video understanding, inspired by semantic transfer techniques that have been successfully used for 2D image analysis. Our method considers a video to be a 1D sequence of clips, each one associated with its own semantics. The nature of these semantics -- natural language captions or other labels -- depends on the task at hand. A test video is processed by forming correspondences between its clips and the clips of reference videos with known semantics, following which, reference semantics can be transferred to the test video. We describe two matching methods, both designed to ensure that (a) reference clips appear similar to test clips and (b), taken together, the semantics of the selected reference clips is consistent and maintains temporal coherence. We use our method for video captioning on the LSMDC'16 benchmark, video summarization on the SumMe and TVSum benchmarks, Temporal Action Detection on the Thumos2014 benchmark, and sound prediction on the Greatest Hits benchmark. Our method not only surpasses the state of the art, in four out of five benchmarks, but importantly, it is the only single method we know of that was successfully applied to such a diverse range of tasks.
- 05.04.2017

- Yair Adato - Trax

- TBA
Abstract:TBA
- 11.01.2017

- Nadav Cohen - HUJI

- Inductive Bias of Deep Convolutional Networks through Pooling Geometry
Abstract:Our formal understanding of the inductive bias that drives the success of convolutional networks on computer vision tasks is limited. In particular, it is unclear what makes hypothesis spaces born from convolution and pooling operations so suitable for natural images. In this paper we study the ability of convolutional networks to model correlations among regions of their input. We theoretically analyze convolutional arithmetic circuits, and empirically validate our findings on other types of convolutional networks as well. Correlations are formalized through the notion of separation rank, which, for a given partition of the input, measures how far a function is from being separable. We show that a polynomially sized deep network supports exponentially high separation ranks for certain input partitions, while being limited to polynomial separation ranks for others. The network's pooling geometry effectively determines which input partitions are favored, and thus serves as a means for controlling the inductive bias. Contiguous pooling windows, as commonly employed in practice, favor interleaved partitions over coarse ones, orienting the inductive bias towards the statistics of natural images. Other pooling schemes lead to different preferences, and this allows tailoring the network to data that departs from the usual domain of natural imagery. In addition to analyzing deep networks, we show that shallow ones support only linear separation ranks, and by this gain insight into the benefit of functions brought forth by depth - they are able to efficiently model strong correlation under favored partitions of the input.
Joint work with Amnon Shashua
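For a sampled function, the separation rank with respect to a partition (x, y) of its inputs is simply the rank of the matrix F[i, j] = f(x_i, y_j). The toy sketch below (an illustration assumed for this listing, not code from the paper) contrasts a fully separable function with one whose numerical rank is much higher.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 50)
y = np.linspace(0.0, 1.0, 50)

# f(x, y) = sin(x) * cos(y): separable, so separation rank 1
F_separable = np.outer(np.sin(x), np.cos(y))

# f(x, y) = exp(-100 (x - y)^2): strongly entangles x and y
F_entangled = np.exp(-100.0 * np.subtract.outer(x, y) ** 2)

print(np.linalg.matrix_rank(F_separable))   # prints 1
print(np.linalg.matrix_rank(F_entangled))   # prints a much larger number
```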
- 04.01.2017

- Etai Littwin - TAU

- The multiverse loss for robust transfer learning
Abstract:Deep learning techniques are renowned for supporting effective transfer learning. However, as we demonstrate, the transferred representations support only a few modes of separation, and much of their dimensionality is left unutilized. In this work, we suggest learning, in the source domain, multiple orthogonal classifiers. We prove that this leads to a reduced-rank representation which nevertheless supports more discriminative directions. Interestingly, the softmax probabilities produced by the multiple classifiers are likely to be identical. Experimental results on CIFAR-100 and LFW further demonstrate the effectiveness of our method.
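A minimal sketch of the core idea, assuming k linear classifier heads trained on a shared representation with a penalty that pushes their weight matrices toward pairwise orthogonality; the head count, penalty weight, and data here are placeholders, and the paper's exact loss may differ.

```python
import torch
import torch.nn as nn

feat_dim, n_classes, k = 128, 10, 3
heads = [nn.Linear(feat_dim, n_classes) for _ in range(k)]

def orthogonality_penalty(heads):
    """Sum of squared inner products between weights of different heads."""
    pen = torch.tensor(0.0)
    for i in range(len(heads)):
        for j in range(i + 1, len(heads)):
            Wi, Wj = heads[i].weight, heads[j].weight   # (classes, features)
            pen = pen + (Wi @ Wj.t()).pow(2).sum()
    return pen

x = torch.randn(32, feat_dim)                 # a batch of source-domain features
y = torch.randint(0, n_classes, (32,))
ce = sum(nn.functional.cross_entropy(h(x), y) for h in heads)
loss = ce + 1e-3 * orthogonality_penalty(heads)   # 1e-3 is an arbitrary weight
loss.backward()
```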
- 30.11.2016

- Chen Sagiv - SagivTech

- Computer Vision, Deep Learning & Parallel Computing – from Theory to Practice
Abstract:In this talk I will describe the work done by SagivTech on computer vision, deep learning and parallel computing in two EU-funded projects: SceneNet, which dealt with crowd-sourcing of audio-visual information to create 3D reconstructions of scenes from 2D videos coming from multiple users, and 3D MASSOMICS, which dealt with the analysis of mass spectrometry images.
- 23.11.2016

- Ohad Fried - Princeton

- Arranging & Improving Photos
Abstract:There are *many* photos in the world, with personal photo collections easily exceeding thousands of photos. We have reached a point where photo acquisition is trivial, and the next challenge lies in arranging and easily editing such large photo collections. I will start the talk by briefly surveying a few of our works that aim to arrange large collections and to provide fast (yet sophisticated) image manipulation techniques. Next, I will describe a new type of photo element, "distractors", explain how they are related to yet different from saliency, and show how we can automatically detect them. Lastly, I will present our latest work, which can fix perspective distortions in portrait photos.
Bio: Ohad Fried is a PhD student in the Computer Science Department at Princeton University. His work lies in the intersection of computer graphics, computer vision, and HCI. Previously, he received an M.Sc. in Computer Science and a B.Sc. in Computational Biology from The Hebrew University. Ohad’s research focuses on tools, algorithms, and new paradigms for photo editing. He published research papers in premier venues, including SIGGRAPH, CVPR, Eurographics, and NIME. Ohad is the recipient of several awards, including a Siebel Scholarship, a Google PhD Fellowship and a Princeton Gordon Y.S. Wu Fellowship in Engineering. If you own a cable modem, there’s a non-negligible chance that Ohad’s code runs within it, so feel free to blame him for your slow internet connection.
www.ohadf.com
- 16.11.2016

- Amir Rosenfeld - Weizmann

- Visual Classification & Localization: from Strong Supervision to Weakly Supervised Attention Model
Abstract:Localization of discriminative image regions is critical for various visual classification tasks, ranging from action classification to fine-grained categorization. In this talk I will present two contrasting approaches from my recent work, the first using a strongly supervised model to make the case for the importance of accurate localization in action recognition in still images. In the second part of the talk I will show how to achieve good localization of semantically meaningful sub-parts of images, as well as improved classification, while dropping the need for strong supervision.
- 22.06.2016

- Ariel Benou - BGU

- De-noising of Contrast-Enhanced MRI Sequences by an Ensemble of Expert Deep Neural Networks
Abstract:Dynamic contrast-enhanced MRI (DCE-MRI) is an imaging protocol where MRI scans are acquired repetitively throughout the injection of gadolinium-based contrast agent. The analysis of dynamic scans is widely used for the detection and quantification of blood brain barrier (BBB) permeability. Extraction of the pharmacokinetic (PK) parameters from the DCE-MRI washout curves allows quantitative assessment of BBB functionality. Nevertheless, the curve fitting required for the analysis of DCE-MRI data is error-prone, as the dynamic scans are subject to non-white, spatially-dependent and anisotropic noise that does not fit standard noise models. Curve smoothing algorithms yield inherent inaccuracies, since the resulting curves do not fit the PK model, while image denoising methods such as the Beltrami framework and patch-based non-local means (NLM) cannot accommodate the high variability of noise statistics in time and space.
We present a novel framework based on Deep Neural Networks (DNNs), pre-trained as stacked restricted Boltzmann machines, to address the DCE-MRI denoising challenges. The key idea is an ensemble of expert DNNs, where each is trained for different noise characteristics to solve an inverse problem on a specific subset of the input space, and the most likely reconstruction is chosen. As ground-truth (clean) signals for training are not available, a model for generating realistic training sets with complex nonlinear dynamics was developed. The proposed approach has been applied to DCE-MRI scans of stroke and brain tumor patients and is shown to compare favorably to state-of-the-art denoising methods, without degrading the contrast of the original images. High reconstruction performance on temporally down-sampled DCE-MRI data suggests that our method can be used to improve spatial resolution by increasing the time interval between consecutive scans without any information loss in the temporal domain.
- 08.06.2016

- Prof. Adrian Stern

- Compressive spectral imaging
Abstract:Spectroscopic imaging methods, such as hyperspectral or ultraspectral imaging, are used in a wide range of real-life applications, such as remote sensing, astronomy, biology and bio-medicine, food safety inspection, water quality inspection, archeology, and art conservation and study. In many of these applications there is a continuous endeavor to increase the amount of data that spectral imagers are able to capture, in order to improve the detectability of the spatio-spectral features of the objects. Given that, on the one hand, the number of pixels accommodated by 2D imagers continues to grow, and on the other hand, spectral sensing capabilities now extend to hundreds and up to a thousand bands, scientists and engineers endeavor to combine these technologies to design spectral imagers able to capture megapixel spatial samples with many hundreds, or even a thousand, spectral bands. One of the main hurdles on the way to developing such spectral imagers is the huge dimension of the spectral datacube. Since spectral imagers rely on 2D sensor arrays, some scanning process is required to capture the spectral datacube, and for such huge datacubes the acquisition effort is enormous: the scanning time is long and the memory storage requirements are extensive. This motivates the employment of compressive sensing (CS) techniques for this purpose. CS is a relatively new sampling scheme that allows reducing the amount of acquired data far below traditional sampling requirements. In this talk I'll overview compressive spectral imaging techniques, with a focus on techniques developed in our group.
- 18.05.2016

- Danny Harari

- How to develop visual understanding in an infant’s world?
Abstract:Human visual understanding develops rapidly during the first months of life, yet little is known about the underlying learning process. Some early perceptual capacities are rather surprising, including the ability to categorize spatial relations between objects, such as 'in front' and 'behind', and in particular to recognize containment relations, where one object is inserted into another. This lies in sharp contrast to the computational difficulty of the task, which is highly challenging for state-of-the-art computer vision models (including neural nets) even when labeled examples are provided. Infants, however, learn such visual concepts from their visual environment without external supervision. This study describes computational models that learn to recognize containment and other spatial relations using only elementary perceptual capacities. We demonstrate the applicability of the models by successfully learning to recognize spatial relations from unlabeled videos.
Joint work with Nimrod Dorfman and Shimon Ullman
- 15.05.2016

- Guy Ben-Yosef - Weizmann

- TBA
Abstract:TBA
- 04.05.2016

- Nir Ben-Zrihem - Technion

- Approximate Nearest Neighbor search for video
Abstract:Is it possible to perform BM3D in real-time? Probably not, but it can be approximated.
In this talk we will present an algorithm for video patch-matching that enables real-time video processing for a variety of applications, such as colorization, denoising, or artistic effects.
We name our algorithm RIANN (Ring Intersection Approximate Nearest Neighbor), since it finds potential matches by intersecting rings around key points in appearance space. RIANN's real-time performance is attributed to two key properties: (i) it leverages statistical properties of videos and adapts its search complexity to the amount of temporal difference between frames; (ii) it employs a relatively long pre-processing stage, which pays off because the stage is shared across all videos and all frames. We show via experiments that RIANN is up to two orders of magnitude faster than previous patch-matching methods and is the only solution that operates in real time.
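The "spend effort only where frames changed" idea can be sketched in a few lines; the block size, threshold, and matcher interface below are hypothetical, and RIANN's actual ring-intersection data structure is not reproduced here.

```python
import numpy as np

def changed_blocks(prev, curr, b=8, tau=5.0):
    """Coordinates of blocks whose content changed between consecutive frames."""
    prev, curr = prev.astype(float), curr.astype(float)
    coords = []
    for i in range(0, curr.shape[0] - b + 1, b):
        for j in range(0, curr.shape[1] - b + 1, b):
            if np.abs(curr[i:i+b, j:j+b] - prev[i:i+b, j:j+b]).mean() > tau:
                coords.append((i, j))
    return coords   # only these blocks are handed to the (expensive) matcher
```

On mostly static frames the list is short, so per-frame matching cost tracks the temporal difference rather than the frame size.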
- 06.04.2016

- Derya Akkaynak - IUI

- Using Consumer Cameras For Scientific Imaging Underwater
Abstract:While the title might suggest my talk will only be about consumer cameras, I will share my past and ongoing work in the field of animal coloration and camouflage from the perspective of animal visual systems (e.g., a predator, a conspecific, or a human) and of optical imagers, including consumer cameras but also spectrometers and hyperspectral imagers. I will demonstrate a camouflage-breaking algorithm I have developed, based on studying the (unrivaled) camouflage abilities of cuttlefish, that can be run in real time on n-dimensional images or video. Then, I will present some of my joint projects with biologists, ecologists and biomedical engineers, hoping to inspire you for future collaborations and novel computer vision and machine learning solutions. Finally, I will talk about my ongoing project on determining the space of physically meaningful light attenuation coefficients for underwater imaging, aimed at improved visibility and color recovery.
- 30.03.2016

- Hadar Elor - TAU

- Distilled Collections and Applications
Abstract:We present a distillation algorithm which operates on a large, unstructured, and noisy collection of internet images returned from an online object query. We introduce the notion of a distilled set, which is a clean, coherent, and structured subset of inlier images. In addition, the object of interest is properly segmented out throughout the distilled set. Our approach is unsupervised, built on a novel clustering scheme, and solves the distillation and object segmentation problems simultaneously. We demonstrate the utility of our distillation results with a number of interesting graphics applications, including a novel data-driven morphing technique.
- 20.01.2016

- Peri Muttath

- Application gaps of computer vision
Abstract:
- 13.01.2016

- Maria Kushnir - UoHaifa

- A General Preprocessing Method for Improved Performance of Epipolar Geometry Estimation Algorithms.
Abstract:A deterministic preprocessing algorithm, especially designed to deal with repeated structures and wide-baseline image pairs, is presented. It generates putative matches and their probabilities, which are then given as input to state-of-the-art epipolar geometry estimation algorithms, improving their results considerably and succeeding on hard cases on which they failed before. The algorithm consists of three steps, whose scope changes from local to global. In the local step, it extracts local features (e.g. SIFT) from a pair of images, clustering similar features within each image. The clusters are matched, yielding a large number of matches. Then pairs of spatially close features (2keypoints) are matched and ranked by a classifier, and the highest-ranked 2keypoint-matches are selected. In the global step, fundamental matrices are computed from each pair of 2keypoint-matches. A match's score is the number of fundamental matrices it supports. This number, combined with scores generated by standard methods, is given to a classifier to estimate the match's probability. The ranked matches are given as input to state-of-the-art algorithms such as BEEM, BLOGS and USAC, yielding much better results than the original algorithms. Extensive testing was performed on almost 900 image pairs from six publicly available datasets.
- 30.12.2015

- Prof. Yaffa Yeshurun - UHaifa

- The effects of attention on segregation and integration of spatial and temporal information
Abstract:Transient spatial attention refers to a fast, automatic selection of location that is driven by the physical stimulus rather than a mere voluntary decision. The aim of the studies I will discuss was to further our understanding of both the spatial and temporal components of visual perception, and their interrelations with transient spatial attention. Specifically, I will present a number of studies that explored the effects of transient attention on various spatial and temporal aspects of visual perception. These studies added peripheral precueing (a direct manipulation of transient attention) to the typical manipulations of these various aspects of perception, and revealed that the effects of transient attention are governed by perceptual tradeoffs between integration and segregation processes, and tradeoffs between the temporal and spatial domains of perception. Specifically, these studies suggest that transient attention facilitates spatial segregation and temporal integration but impairs spatial integration and temporal segregation.
- 02.12.2015

- Or Litany - TAU

- A picture is worth a billion bits: Real-time image reconstruction from dense binary pixels
Abstract:The pursuit of smaller pixel sizes at ever-increasing resolution in digital image sensors is mainly driven by the stringent price and form-factor requirements of sensors and optics in the cellular phone market. Recently, Eric Fossum proposed a novel concept of an image sensor with dense sub-diffraction-limit one-bit pixels (jots), which can be considered a digital emulation of silver halide photographic film. This idea has recently been embodied in the EPFL Gigavision camera.
A major bottleneck in the design of such sensors is the image reconstruction process, producing a continuous high dynamic range image from oversampled binary measurements. The extreme quantization of the Poisson statistics is incompatible with the assumptions of most standard image processing and enhancement frameworks. The recently proposed maximum-likelihood (ML) approach addresses this difficulty, but suffers from image artefacts and has impractically high computational complexity.
In this work, we study a variant of a sensor with binary threshold pixels and propose a reconstruction algorithm combining an ML data fitting term with a sparse synthesis prior. We also show an efficient hardware-friendly real-time approximation of this inverse operator.
Promising results are shown on synthetic data as well as on HDR data emulated using multiple exposures of a regular CMOS sensor.
- 18.11.2015

- Anastasia Dubrovina - Technion

- Geometric Algorithms For Image And Surface Analysis
Abstract:The complexity and the accuracy of image and shape analysis algorithms depend on the problem formulation and data representation. In this talk, I will describe how geometric problem formulation, together with novel data representations, can be utilized for efficient and accurate object segmentation and three dimensional shape matching.
For image segmentation, we consider the active contours model, combined with the level set framework. We extend this classical solution to obtain an efficient and accurate algorithm for multi-region image and volume segmentation, while exploiting a single level-set function. For user-assisted image segmentation, we consider a method based on generalized Voronoi tessellation, related to morphological watersheds. We represent the image as a graph of its level sets, rather than using the standard Cartesian grid, which leads to a consistent solver of the continuous problem and improves the segmentation results. For the non-rigid shape matching problem, we exploit the mapping continuity and the smoothness of the pointwise and pairwise shape properties. We formulate the matching problem in the natural spectral domain, thus, facilitating the matching and obtaining state of the art results.
- 04.11.2015

- Rafi Cohen - BGU

- Aligning Transcriptions of Historical Documents
Abstract:In recent decades, much effort has been devoted to digitizing document collections and archiving them in digital libraries. Without computer assistance, extracting the information embedded in these documents would be practically impossible. Among the millions of documents scattered around the world, historical documents are of increasing interest, as historical manuscripts are a basic resource for researching our cultural heritage and history. Unfortunately, historical documents are often severely degraded due to poor storage conditions over time, moisture, and many other factors, so off-the-shelf image processing algorithms do not perform well on them.
In this talk, we present two aspects of historical document image analysis. The first part of the talk is devoted to several challenging preprocessing steps, including the separation of text and drawings and the line segmentation of degraded handwritten documents. The second part of the talk deals with transcript alignment, that is, the automatic mapping between ASCII transcripts written by historians and the word images in the digitized document.
- 28.10.2015

- Amir Kolaman - BGU

- Amplitude Modulated Video Camera - Light Separation for Dynamic Scenes
Abstract:Controlled light conditions, as used in the lab, improve the performance of most image and video processing algorithms. We propose to gain control over light conditions outside the lab by using the analogy that light carries scene information to the camera as radio waves carry voice to a receiver. The suggested method, named the AM video camera, separates the influence of a modulated light source from other light sources in the scene by using AM communication principles. The proposed method is well suited to real-time video due to its low complexity, high parallelizability, simple hardware, and time independence between light and camera. A prototype AM video camera system was built to demonstrate color constancy, shadow removal and contrast enhancement performed in real time. The proposed system produces the same noise levels as a standard camera, and its results are almost unaffected by changes in background light intensity.
- 05.08.2015

- Assaf Arbelle - BGU

- Analysis of High Throughput Microscopy Videos: Catching Up With Cell Dynamics
Abstract:We present a novel framework for high-throughput cell lineage analysis in time-lapse microscopy images. Our algorithm ties together two fundamental aspects of cell lineage construction, namely cell segmentation and tracking, via Bayesian inference of dynamic models. The proposed contribution extends the Kalman inference problem by estimating the time-wise cell shape uncertainty in addition to the cell trajectory. These inferred cell properties are combined with the observed image measurements within a fast marching (FM) algorithm, to achieve posterior probabilities for cell segmentation and association. Highly accurate results on two different cell-tracking datasets are presented.
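For readers unfamiliar with the Kalman machinery referenced above, here is a textbook predict/update step for a 2D constant-velocity track; the state layout and noise covariances are illustrative defaults, and the shape-uncertainty and fast-marching parts of the framework are not shown.

```python
import numpy as np

F = np.eye(4)                                # state: (x, y, vx, vy)
F[0, 2] = F[1, 3] = 1.0                      # constant-velocity dynamics
H = np.eye(2, 4)                             # we observe position only
Q = np.eye(4) * 1e-2                         # process noise (assumed)
R = np.eye(2) * 1.0                          # measurement noise (assumed)

def kalman_step(x, P, z):
    """One predict/update cycle given a new detection z = (x, y)."""
    x, P = F @ x, F @ P @ F.T + Q                       # predict
    S = H @ P @ H.T + R                                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)                      # Kalman gain
    x = x + K @ (z - H @ x)                             # update with detection
    P = (np.eye(4) - K @ H) @ P
    return x, P
```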
- 15.07.2015

- Boaz Arad - BGU

- Introduction to dictionary learning
Abstract:TBA
- 10.06.2015

- Rotem Mairon - BGU

- Presenting the paper "Interesting objects are visually salient"
Abstract:Rotem will present the paper "Interesting objects are visually salient" by Elazary and Itti.
Paper abstract: How do we decide which objects in a visual scene are more interesting? While intuition may point toward high-level object recognition and cognitive processes, here we investigate the contributions of a much simpler process, low-level visual saliency. We used the LabelMe database (24,863 photographs with 74,454 manually outlined objects) to evaluate how often interesting objects were among the few most salient locations predicted by a computational model of bottom-up attention. In 43% of all images the model's predicted most salient location falls within a labeled region (chance 21%). Furthermore, in 76% of the images (chance 43%), one or more of the top three salient locations fell on an outlined object, with performance leveling off after six predicted locations. The bottom-up attention model has neither notion of object nor notion of semantic relevance. Hence, our results indicate that selecting interesting objects in a scene is largely constrained by low-level visual properties rather than solely determined by higher cognitive processes.
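The paper's evaluation protocol is easy to state in code. Below is a minimal sketch, assuming a saliency map and a binary mask of labeled objects per image; the suppression radius and k are made-up values, and peak selection in the actual model involves inhibition-of-return dynamics.

```python
import numpy as np

def top_k_hit(saliency, object_mask, k=3, r=10):
    """True if any of the k most salient locations falls on a labeled object."""
    sal = saliency.astype(float).copy()
    h, w = sal.shape
    for _ in range(k):
        i, j = np.unravel_index(np.argmax(sal), sal.shape)
        if object_mask[i, j]:
            return True
        # suppress a neighborhood so the next peak is a genuinely new location
        sal[max(i-r, 0):min(i+r, h), max(j-r, 0):min(j+r, w)] = -np.inf
    return False

# hit rate over a dataset: np.mean([top_k_hit(s, m) for s, m in pairs])
```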
- 03.06.2015

- Ehud Barnea - BGU

- Intro to probabilistic graphical model
Abstract:Probabilistic graphical models serve as an important tool in computer vision as well as in many other research areas. In this talk we will introduce the topic, its uses, and the main ideas behind it. The talk is intended for people with no prior knowledge and aims to give a broad overview (delving in only when time permits).
- 22.03.2015

- Naphtali Abudarham - TAU

- Reverse-engineering the Face-Space: discovering the crucial features for face-identification
Abstract:The Face-Space theory suggests an analogy between face recognition by humans and machine classification. According to this theory, faces are represented in memory as feature vectors, or points in a multidimensional feature space, and each identity is a class, represented by a sub-space containing the different appearances of each person's face. The purpose of this study was to bring this theory down to earth and to discover which features humans use to identify faces. To this end we selected a set of features that were used to describe faces, and tested whether distances in this space corresponded to perceptual similarity between faces. Next we used a reverse-engineering approach, in which we changed faces systematically and tested the effects of these changes on perceptual judgments. We then tested whether we can embed this space in a lower-dimensional space using only a subset of the original features, thus discovering the features that are critical for determining the identity of a face. We found that the critical features are the ones for which we have high perceptual sensitivity for detecting differences between faces. Finally, we found that the same features are also invariant under different face appearances (e.g. pose, illumination or expression).
- 18.03.2015

- Simon Korman - TAU

- Inverting RANSAC: Global Model Detection via Inlier Rate Estimation
Abstract:This work presents a novel approach for detecting inliers in a given set of correspondences (matches). It does so without explicitly identifying any consensus set, based on a method for inlier rate estimation (IRE). Given such an estimator for the inlier rate, we also present an algorithm that detects a globally optimal transformation. We provide a theoretical analysis of the IRE method using a stochastic generative model on the continuous spaces of matches and transformations. This model allows rigorous investigation of the limits of our IRE method for the case of 2D translation, further giving bounds and insights for the more general case. Our theoretical analysis is validated empirically and is shown to hold in practice for the more general case of 2D affinities. In addition, we show that the combined framework works on challenging cases of 2D homography estimation, with very few and possibly noisy inliers, where RANSAC generally fails.
Joint work with Roee Litman, Alex Bronstein and Shai Avidan
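To make the notion of inlier rate estimation concrete, here is a toy sketch for the pure 2D-translation case mentioned above, where each match directly implies a candidate translation; the grid step and inlier threshold are arbitrary choices, and the paper's estimator and guarantees are far more general.

```python
import numpy as np

def inlier_rate_translation(pts_a, pts_b, grid_step=2.0, eps=3.0):
    """Scan candidate 2D translations; return the best inlier rate found."""
    diffs = pts_b - pts_a                     # per-match implied translation
    lo, hi = diffs.min(axis=0), diffs.max(axis=0)
    best = 0.0
    for tx in np.arange(lo[0], hi[0] + grid_step, grid_step):
        for ty in np.arange(lo[1], hi[1] + grid_step, grid_step):
            resid = np.linalg.norm(diffs - np.array([tx, ty]), axis=1)
            best = max(best, float(np.mean(resid < eps)))
    return best
```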
- 14.01.2015

- Shaul Oron - TAU

- Extended Lucas-Kanade Tracking
Abstract:The Lucas-Kanade (LK) method is a classic tracking algorithm exploiting target structural constraints through template matching. Extended Lucas-Kanade (ELK) casts the original LK algorithm as a maximum-likelihood optimization and then extends it by considering pixel object/background likelihoods in the optimization. Template matching and pixel-based object/background segregation are tied together by a unified Bayesian framework, in which two log-likelihood terms related to pixel object/background affiliation are introduced in addition to the standard LK template-matching term. Tracking is performed using an EM algorithm, in which the E-step corresponds to pixel object/background inference and the M-step to parameter optimization. The final algorithm, implemented using a classifier for object/background modeling and equipped with simple template-update and occlusion-handling logic, is evaluated on two challenging datasets containing 50 sequences each. The first is a recently published benchmark where ELK ranks 3rd among 30 evaluated tracking methods. On the second dataset, of vehicles undergoing severe viewpoint changes, ELK ranks in 1st place, outperforming state-of-the-art methods.
Shaul Oron, Aharon Bar-Hillel (MSR), Shai Avidan
- 07.01.2015

- Dr. Véronique Prinet - GM

- Illuminant Chromaticity from Image Sequences
Abstract:We estimate illuminant chromaticity from temporal sequences, for scenes illuminated by either one or two dominant illuminants. While there are many methods for illuminant estimation from a single image, few works so far have focused on videos, and even fewer on multiple light sources. Our aim is to leverage the information provided by temporal acquisition, where the objects, the camera, or the light source is in motion, in order to estimate illuminant color without the need for user interaction or strong assumptions and heuristics. We introduce a simple physically-based formulation based on the assumption that the incident light chromaticity is constant over a short space-time domain. We show that a deterministic approach is not sufficient for accurate and robust estimation; however, a probabilistic formulation makes it possible to implicitly integrate away hidden factors that have been ignored by the physical model. Experimental results are reported on a dataset of natural video sequences and on the GrayBall benchmark, indicating that we compare favorably with the state of the art.
(This is a joint work with Dani Lischinski and Michael Werman)
- 31.12.2014

- Prof. Shai Avidan

- Visual Nearest Neighbor Search
Abstract:Template Matching finds the best match in an image to a given template and is used in a variety of computer vision applications. I will discuss several extensions to Template Matching. First, dealing with the case where we have millions of templates that we must match at once, second dealing with the case of RGBD images, where depth information is available, and finally, presenting a fast algorithm for template matching under 2D affine transformations with global approximation guarantees.
Joint work with Simon Korman, Yaron Eshet, Eyal Ofek, Gilad Tsur and Daniel Reichman.
- 24.12.2014

- Prof. Michael Elad - Technion

- Wavelet for Graphs and its Deployment to Image Processing
Abstract:What if we take all the overlapping patches from a given image and organize them to create the shortest path by using their mutual Euclidean distances? This suggests a reordering of the image pixels in a way that creates maximal 1D regularity. What could we do with such a construction? In this talk we consider a wider perspective of the above, and introduce a wavelet transform for graph-structured data. The proposed transform is based on a 1D wavelet decomposition coupled with a pre-reordering of the input so as to best sparsify the given data. We adapt this transform to image processing tasks by considering the image as a graph, where every patch is a node, and edges are weighted by the Euclidean distances between corresponding patches. We show several ways to use the above ideas in practice, leading to state-of-the-art image denoising, deblurring, inpainting, and face-image compression results.
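The reorder-then-transform idea can be sketched with a greedy nearest-neighbor approximation of the shortest path (the exact shortest-path ordering is NP-hard, and the talk's construction is more elaborate); the patch size, counts, and single Haar level below are illustrative assumptions.

```python
import numpy as np

def greedy_order(patches):
    """Order patch vectors so consecutive ones are close in Euclidean distance."""
    unvisited = set(range(1, len(patches)))
    order = [0]
    while unvisited:
        last = patches[order[-1]]
        nxt = min(unvisited, key=lambda i: np.sum((patches[i] - last) ** 2))
        order.append(nxt)
        unvisited.remove(nxt)
    return order

rng = np.random.default_rng(0)
patches = rng.random((256, 25))               # e.g. 5x5 patches as vectors
signal = patches[greedy_order(patches)].mean(axis=1)  # smoother after reordering

# one level of a 1D Haar transform: the detail band is now sparser
avg = (signal[0::2] + signal[1::2]) / np.sqrt(2)
det = (signal[0::2] - signal[1::2]) / np.sqrt(2)
```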
- 03.12.2014

- Dr. Tammy Riklin-Raviv

- Statistical Shape Analysis of Neuroanatomical Structures via Level-set based Shape Morphing
Abstract:Group-wise statistical analysis of the morphometry of brain structures plays an important role in neuroimaging studies. Nevertheless, most morphometric measurements are often limited to volume and surface area, as further morphological characterization of anatomical structures poses a significant challenge. In this paper, we present a method that allows the detection, localization and quantification of statistically significant morphological differences in complex brain structures between populations. This is accomplished by a novel level-set framework for shape morphing and a multi-shape dissimilarity measure derived from a modified version of the Hausdorff distance. The proposed method does not require explicit one-to-one point correspondences and is fast, robust and easy to implement regardless of the topological complexity of the anatomical surface under study. The proposed model has been applied to well-defined regions of interest using both synthetic and real data sets. These include the corpus callosum, striatum, caudate, amygdala-hippocampal complex and superior temporal gyrus, structures selected for their importance with respect to brain regions implicated in a variety of neurological disorders. The synthetic databases allowed quantitative evaluation of the method. Results obtained with real clinical data of Williams syndrome and schizophrenia patients agree with published findings in the psychiatry literature.
- 25.06.2014

- Dr. Tali Treibitz - Haifa University

- Our Eyes Beneath the Sea: Advanced Optical Methods For Ocean Imaging
Abstract:The ocean covers 70% of the Earth's surface and influences almost every aspect of our lives, such as climate, fuel, security, and food. All over the world, depleting resources on land are encouraging increased human activity in the ocean, for example: gas drilling, desalination plants, port construction, aquaculture, bio-fuel, and more. The ocean is a complex, vast, foreign environment that is hard to explore, and therefore much about it is still unknown. Interestingly, only 5% of the ocean floor has been seen so far. As human access to most of the ocean is very limited, optical imaging systems can serve as our eyes in those remote areas. However, optical imaging underwater is challenging due to intense pressure at depth, strong color- and distance-dependent attenuation, refraction at the air/water interface, and the ever-changing and rugged conditions of the natural ocean. Thus, imaging underwater pushes optical imaging to its limits. This is where advanced computer vision methods may overcome some of these obstacles post-acquisition and enable large-scale operations using machine learning.
As a result, imaging systems for the ocean require a dedicated effort throughout all the development steps: design, optical, electrical and mechanical engineering and computer vision algorithms. In this talk I describe several in situ underwater imaging systems I developed and show how they can be used to solve acute scientific problems. These include an underwater in situ high-resolution microscope, 3D reconstruction, and systems for large-scale multispectral and fluorescence imaging.
- 24.06.2014

- Prof. Ilan Shimshoni - Haifa U

- Robust Epipolar Geometry Estimation Using Noisy Pose Priors
Abstract:Ilan Shimshoni (M.Sc. thesis of Yehonatan Goldman)
Epipolar geometry estimation is fundamental to many computer vision algorithms. It has therefore attracted a lot of interest in recent years, yielding high-quality estimation algorithms for wide-baseline image pairs. Currently many types of cameras (e.g., in smartphones and robot navigation systems) produce geo-tagged images containing pose and internal calibration data. Exploiting this information as part of an epipolar geometry estimation algorithm may be useful but is not trivial, since the pose measurement may be quite noisy. We introduce SOREPP, a novel estimation algorithm designed to exploit pose priors naturally. It sparsely samples the pose space around the measured pose and, for a few promising candidates, applies a robust optimization procedure. It uses all the putative correspondences simultaneously, even though many of them are outliers, yielding a very efficient algorithm whose runtime is independent of the inlier fraction. SOREPP was extensively tested on synthetic data and on hundreds of real image pairs taken by a smartphone. Its ability to handle challenging scenarios with extremely low inlier fractions of less than 10% was demonstrated, as was its ability to handle images taken by close cameras. It outperforms current state-of-the-art algorithms that do not use pose priors, as well as other algorithms that do.
- 11.06.2014

- Hadar Elor - TAU

- RingIt: Ring-ordering Casual Photos of a Dynamic Event
Abstract:In this talk I will present RingIt, a novel technique to sort an unorganized set of casual photographs taken along a general ring, where the cameras capture a dynamic event at the center of the ring. The multitude of cameras constantly present nowadays has redefined not only the meaning of capturing an event, but also the meaning of sharing this event with others. The images are frequently uploaded to some common platform, like Facebook or Picasa, and the image-navigation challenge naturally arises. Our technique recovers the spatial order of a set of still images capturing an event, taken by a group of people situated around the event, allowing for a sequential display of the captured object.
- 01.06.2014

- Prof. Michael Lindenbaum - Technion

- Probabilistic Local Variation
Abstract:Michael Baltaxe and Michael Lindenbaum
The goal of image oversegmentation is to divide an image into several pieces or "segments", such that each segment is part of an object present in the scene. In this talk we focus on the local variation (LV) algorithm, which is one of the most common algorithms for image oversegmentation. We show that all the components in LV are essential to achieve high performance and then show that algorithms similar to LV can be devised by applying different statistical decisions. This leads us to introduce probabilistic local variation (pLV), a new algorithm based on statistics of natural images and on a hypothesis testing decision. pLV presents state-of-the-art results (for fine oversegmentation) while keeping the same computational complexity of the LV algorithm, and is in practice one of the fastest oversegmentation methods in the literature.
- 28.05.2014

- Prof. Galia Avidan - BGU

- Structural and functional impairment of the face processing network in congenital prosopagnosia
Abstract:There is growing consensus that accurate and efficient face recognition is mediated by a neural circuit comprised of a posterior ‘core’ and an anterior ‘extended’ set of regions. In a series of functional and structural imaging studies, we characterize the distributed face network in individuals with congenital prosopagnosia (CP) – a lifelong impairment in face processing – relative to that of matched controls. Interestingly, our results uncover largely normal activation patterns in the posterior core face patches in CP. More recently, we also documented normal activity of the amygdala (emotion processing), as well as normal, or even enhanced, functional connectivity between the amygdala and the core regions. Critically, in the same individuals, activation of the anterior temporal cortex, which is thought to mediate identity processing, was reduced, and connectivity between this region and the posterior core regions was disrupted. The dissociation between the neural profiles of the anterior temporal lobe and amygdala was evident both during a task-related face scan and during a resting-state scan, in the absence of visual stimulation. Taken together, these findings elucidate selective disruptions in neural circuitry in CP, and are also consistent with the impaired white-matter connectivity to anterior temporal and prefrontal cortex documented in these individuals. These results implicate CP as a disconnection syndrome, rather than an alteration localized to a particular brain region. Furthermore, they offer an account of the known differential behavioral difficulty in identity versus emotional-expression recognition in many individuals with CP.
- 02.04.2014

- Tomer Michaeli - Weizmann

- Blind Super-Resolution
Abstract:Super-resolution (SR) algorithms typically assume that the blur kernel is known (either the point spread function (PSF) of the camera, or some default low-pass filter such as a Gaussian). However, the performance of SR methods significantly deteriorates when the assumed blur kernel deviates from the true one. We propose a general framework for "blind" super-resolution. In particular, we show that:
(i) Contrary to common belief, the PSF of the camera is the wrong blur kernel to use in SR algorithms.
(ii) The correct SR blur kernel can be recovered directly from the low-resolution image. This is done by exploiting the inherent recurrence property of small natural image patches (either internally within the same image, or externally in a collection of other natural images). In particular, we show that the recurrence of small patches across scales of the low-res image (which forms the basis for single-image SR) can also be used for estimating the optimal blur kernel. This leads to significant improvement in SR results.
- 26.03.2014

- Boaz Arad - BGU

- How expressive is hyperspectral data and what can it teach us?
Abstract:Hyperspectral imaging is an important visual modality with growing interest and a growing range of applications. The latter, however, is severely limited by the fact that existing devices are limited in spatial, spectral, and/or temporal resolution, while being both complicated and expensive. We present a low-cost and fast method to recover high-quality hyperspectral images of natural scenes directly from RGB.
Furthermore, we examine a rich dataset of acquired hyperspectral images in order to analyze the phenomenon of "metamerism" (dissimilar spectra producing similar RGB responses), explore the evolution of the human visual system, and suggest future research directions.
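A schematic sketch of the RGB-to-spectrum recovery idea, assuming a sparse-coding formulation with a hyperspectral dictionary D and a camera response matrix R; the dictionary, responses, OMP solver, and sparsity level are all placeholders rather than the talk's actual pipeline.

```python
import numpy as np

def omp(A, y, k=3):
    """Orthogonal matching pursuit: y ~= A @ x with at most k nonzero entries."""
    resid, support = y.copy(), []
    x = np.zeros(A.shape[1])
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ resid))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        resid = y - A[:, support] @ coef
    x[support] = coef
    return x

rng = np.random.default_rng(0)
D = np.abs(rng.standard_normal((31, 50)))    # stand-in 31-band spectral dictionary
R = np.abs(rng.standard_normal((3, 31)))     # stand-in RGB camera response
rgb = R @ D[:, 7]                            # observed RGB of a known spectrum
code = omp(R @ D, rgb, k=3)                  # sparse code in the projected dictionary
spectrum_hat = D @ code                      # reconstructed 31-band spectrum
```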
- 05.03.2014

- Lior Gorodisky - Technion

- Quantification of Mitral Regurgitation using Magnetic Resonance Imaging (MRI)
Abstract:Mitral Regurgitation (MR) is a cardiac disorder in which dysfunction of the mitral valve results in backwards flow of blood from the left ventricle into the left atrium. The gold standard for assessing MR is echocardiography using ultrasound Doppler, which is more qualitative than quantitative. The PISA (Proximal Isovelocity Surface Area) method can be used for quantitative evaluation; however, it is based on simplistic assumptions, including a hemispheric geometry of the blood stream.
Magnetic Resonance Imaging (MRI) enables detailed 3D evaluation of flow vectors, making it theoretically suitable for accurate quantification of MR without such assumptions. Furthermore, with ultrasound Doppler the velocity of the blood can be measured only for the component that is parallel to the direction of the ultrasound beam, whereas using MRI the PISA can take the form of any 3D shape.
The aim of this study is to test the feasibility of a quantitative estimation of the regurgitation volume (RVol) using MRI 3D velocity vectors. Additionally, the hemispheric geometry assumption used for ultrasound Doppler was tested by examining the 3D shapes.
Using MRI velocity measurements and a threshold velocity vmax, all the pixels within the image from a specific time stamp and slice location with velocity smaller than vmax were marked. Then, an ellipse was fitted to each separated area. Applying geometrical considerations, the volume of blood flowing through the 3D shape was calculated by a Riemann sum of the flows during a heart cycle. Finally, the volumes were sorted and the maximal volume should be selected as the 3D-PISA RVol.
MRI imaging results and calculation of MR volumes will be presented, along with theoretical and practical aspects of the research. Comparison between echocardiography results and MRI 3D results will be shown and discussed.
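The Riemann-sum computation described above can be written compactly; the notation here is assumed for illustration, with $S(t)$ the set of marked pixels at time $t$, $v_p$ the through-plane velocity at pixel $p$, $a_p$ the pixel area, and $\Delta t$ the frame interval:

$$\mathrm{RVol} \approx \sum_{t} Q(t)\,\Delta t, \qquad Q(t) = \sum_{p \in S(t)} v_p\, a_p .$$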
- 12.02.2014

- Julius Orlowski - RWTH Aachen

- Visual search in barn owls: From feature to conjunction search
Abstract:Mechanisms of visual attention have a long history of research in humans and primates, but whether these mechanisms are universal across vertebrates remains unclear. We study these processes in behaving barn owls; namely, whether search strategies in simple (parallel) and complex (serial) search tasks are as described in primates. Since they have no eye movements, barn owls move their head to fixate visual targets, so their gaze can be tracked with a head-mounted microcamera, the Owlcam. We analyze the gaze path and fixations of barn owls in an open-field setting while the owls investigate patterns containing an odd target placed among identical distractors on the ground. Initial findings suggested that barn owls look more often and longer at target items in different single-feature searches (shape, orientation, and luminance contrast), even when they have no explicit search task. We then chose two feature searches and trained the owls to search for the target items using different set sizes; the features were either differently oriented bars or grey discs of differing intensity. Finally, these were merged into a conjunction task, in which the odd target was a unique conjunction of orientation and intensity. Our results suggest that barn owls search as humans do.
- 05.02.2014

- Prof. Chen Keasar

- Discussion - Applied machine learning
Abstract:This is not a regular talk, but a discussion regarding the practical issues faced when applying machine learning algorithms to practical problems and the gaps between theoretical and practical machine learning.
- 29.01.2014

- Prof. Daniel Keren - Haifa University

- Image Classification Using a Background Prior
Abstract:A canonical problem in computer vision is category classification (e.g. find all instances of human faces, cars etc. in an image). Typically, the input for training a classifier is a relatively small sample of positive examples and a much larger sample of negative examples, which in current applications can consist of images from thousands of categories. The difficulty of the problem sharply increases with the dimension and size of the negative example set. In this talk I will describe an efficient and easy-to-apply classification algorithm which replaces the negative samples by a prior and then constructs a "hybrid" classifier that separates the positive samples from this prior. The resulting classifier achieves an identical or better classification rate than SVM, while requiring far smaller memory and lower computational complexity to train and apply. While here it is applied to image classes, the idea is general and can be applied to other domains.
Joint work with Margarita Osadchy and Bella Fadida-Specktor.
An early version of this work was presented at ECCV 2012.
- 22.01.2014

- Dan Rosenbaum - HUJI

- Learning the Local Statistics of Optical Flow
Abstract:Motivated by recent progress in natural image statistics, we use newly available datasets with ground truth optical flow to learn the local statistics of optical flow and rigorously compare the learned model to prior models assumed by computer vision optical flow algorithms. We find that a Gaussian mixture model with 64 components provides a significantly better model for local flow statistics when compared to commonly used models. We investigate the source of the GMM's success and show it is related to an explicit representation of flow boundaries. We also learn a model that jointly models the local intensity pattern and the local optical flow. In accordance with the assumptions often made in computer vision, the model learns that flow boundaries are more likely at intensity boundaries. However, when evaluated on a large dataset, this dependency is very weak and the benefit of conditioning flow estimation on the local intensity pattern is marginal.
Joint work with Daniel Zoran and Yair Weiss.
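A minimal sketch of the modeling step, assuming local flow patches flattened into vectors and scikit-learn's EM-based GMM; the data here is a random stand-in, the patch size is arbitrary, and 64 components simply follows the number quoted above.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# stand-in for real (u, v) flow patches, e.g. 4x4 patches -> 32-dim vectors
flow_patches = rng.standard_normal((10000, 2 * 4 * 4))

gmm = GaussianMixture(n_components=64, covariance_type='full', max_iter=50)
gmm.fit(flow_patches)
print(gmm.score(flow_patches))   # mean log-likelihood, the paper's yardstick
```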
- 08.01.2014

- Maria Zontak - Weizmann

- On the Internal vs. External Statistics of Image Patches, and its Implications on Image Denoising
Abstract:Surprisingly, "Internal-Denoising" (using internal noisy patches) usually outperforms "External-Denoising" (using external clean patches), especially at high noise levels. We analyze and explain this phenomenon. We further show how the "fractal" property of natural images (cross-scale patch recurrence) gives rise to a new powerful internal search space. Since noise drops dramatically at coarser scales of the noisy image, for almost any noisy patch its unknown clean version naturally emerges at a coarser scale, at the same relative image coordinates.
Joint work with Inbar Mosseri and Michal Irani
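A toy sketch of the cross-scale search space, assuming each noisy patch is replaced by its nearest neighbor taken from a coarser (hence cleaner) scale of the same image; the patch size, scale factor, and brute-force search are illustrative choices only.

```python
import numpy as np
from scipy.ndimage import zoom

def denoise_cross_scale(img, p=5, scale=0.5):
    """Replace each patch with its nearest neighbor from a coarser scale."""
    coarse = zoom(img, scale)                 # noise drops at the coarser scale
    db = np.array([coarse[i:i+p, j:j+p].ravel()
                   for i in range(coarse.shape[0] - p)
                   for j in range(coarse.shape[1] - p)])
    out = img.copy()
    for i in range(0, img.shape[0] - p, p):
        for j in range(0, img.shape[1] - p, p):
            q = img[i:i+p, j:j+p].ravel()
            best = db[np.argmin(((db - q) ** 2).sum(axis=1))]
            out[i:i+p, j:j+p] = best.reshape(p, p)
    return out
```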
- 04.12.2013

- Erez Freud - BGU

- Object recognition in cases of spatial uncertainty
Abstract:One of the greatest computational and representational challenges facing the visual system is the need to perform fine, detailed discrimination on the one hand, and on the other, to be able to generalize across items belonging to the same category. My work examines how such processes are implicated in the representation of impossible objects: 2D line drawings that depict objects violating fundamental rules of spatial organization, and therefore could not be created in 3D space. Importantly, the physical differences between impossible and possible objects are minimal, so the distinction between these two object categories requires fine, detailed representation. On the other hand, these object categories share common properties which may support generalization. I will describe findings from imaging and behavioral experiments which suggest that the visual system is highly susceptible to objects' spatial layout and can successfully discriminate between these object categories (i.e. detailed perception), but at the same time can overcome the spatial violation and successfully represent impossible objects based on the same principles applicable to possible objects, by utilizing valid Gestalt shape attributes (i.e. generalization). Hence, this work may shed light on some of the neural and behavioral mechanisms supporting the two ends of visual perception: fine, detailed representation enabling unique identification on the one hand, and generalization across similar items forming a coherent category on the other.
- 27.11.2013

- Asaf Ben Shalom - BGU

- See abstract
Abstract:Are we conscious of everything we see, or just part of what we see? Vision is prevalently viewed as a serial process whose end result is an experience we call seeing. The debate about the location from which conscious phenomenal experience emerges in the visual process is shrouded in theoretical controversy. Although the end result of the process is clearly conscious, we have no way of knowing what it is that makes it conscious. To make the situation even worse, with our current scientific methodologies we have no way of refuting the possibility that our cognitive access to our own experiences is constitutive of what makes them phenomenally conscious. The current path of investigation is to find which features or processes of "seeing" present in clearly conscious visual experience could account for the emergence of our phenomenal experience. Once such features are found, we can search for the temporal window of their realization in the visual hierarchy. Finding the neural architecture and processes that are present in our clearly conscious visual experiences would help in deciding where a representation becomes phenomenally conscious. We use classical psychophysical methods to probe representations at several points along the visual hierarchy, to try (once again) to investigate the problem of perception without awareness.
- 20.11.2013

- Alon Faktor - Weizmann

- Co-recognition by composition -- unsupervised discovery and segmentation of image categories
Abstract:Given an image collection with few or no human annotations, we would like to automatically add to it high-level information, such as which images are semantically similar to which, and which image parts are responsible for that similarity. More specifically, we want to solve two problems:
1) Given an unlabeled image collection, cluster the images into their underlying image categories.
2) Given a set of images which share an object from the same semantic category, co-segment the shared object.
We define a "good image cluster" as one in which images can be easily composed (like a puzzle) using pieces from each other, while being difficult to compose from images outside the cluster. The larger and more statistically significant the pieces are, the stronger the affinity between the images. Similarly, we define "good co-segments" as ones which can be easily composed from large pieces of other co-segments, yet are difficult to compose from the remaining image parts. This gives rise to unsupervised discovery of very challenging image categories, as well as co-segmentation of objects in very challenging scenarios with large variations in appearance and shape and large amounts of clutter. Our approach can be applied both to large image collections and to very few images (where there is too little data for unsupervised learning). At the extreme, it can be applied even to a single image, to extract its co-occurring objects.
- 06.11.2013

- Yonathan Aflalo - Technion

- Spectral Multidimensional Scaling
Abstract:An important tool in information analysis is dimensionality reduction. There are various approaches for large-data simplification by scaling its dimensions down that play a significant role in recognition and classification tasks. The efficiency of dimension reduction tools is measured in terms of memory and computational complexity, which are usually a function of the number of the given data points. Sparse local operators that involve substantially less than quadratic complexity at one end, and faithful multiscale models with quadratic cost at the other end, make the design of a dimension reduction procedure a delicate balance between modeling accuracy and efficiency. Here, we combine the benefits of both and propose a low-dimensional multiscale modeling of the data, at a modest computational cost. The idea is to project the classical multidimensional scaling problem into the data's spectral domain, extracted from its Laplace–Beltrami operator. There, embedding into a small-dimensional Euclidean space is accomplished while optimizing for a small number of coefficients. We provide theoretical support and demonstrate that, working in the natural eigenspace of the data, one can reduce the process complexity while maintaining the model fidelity. As examples, we efficiently canonize nonrigid shapes by embedding their intrinsic metric into a low-dimensional Euclidean space, a method often used for matching and classifying almost isometric articulated objects. Finally, we demonstrate the method by exposing the style in which handwritten digits appear in a large collection of images. We also visualize clustering of digits by treating images as feature points that we map to a plane.
- 07.08.2013

- Summer Break

- Summer Break
Abstract:
- 31.07.2013

- Gilad Tsur - Weizmann

- Fast Match: Fast Affine Template Matching
Abstract:
- 17.07.2013

- Nimrod Dorfman - Weizmann

- Learning to See: Developing visual concepts from unlabeled video streams
Abstract:We consider the tasks of learning to recognize hands and direction of gaze from unlabeled natural video streams. These are known to be highly challenging tasks for current computational methods. However, infants learn to solve these visual problems early in development - during the first year of life. This gap between computational difficulty and infant learning is particularly striking. We present a model which is shown a stream of natural videos, and learns without any supervision to detect human hands by appearance and by context, as well as direction of gaze, in complex natural scenes. The algorithm is guided by an empirically motivated innate mechanism – the detection of ‘mover’ events in dynamic images, which are the events of a moving image region causing a stationary region to move or change after contact. Mover events provide an internal teaching signal, which is shown to be more effective than alternative cues and sufficient for the efficient acquisition of hand and gaze representations. We will discuss how the implications of our approach can go beyond the specific tasks, by showing how domain-specific ‘proto concepts’ can guide the system to acquire meaningful concepts, which are significant to the observer, but are statistically inconspicuous in the sensory input.
* Joint work with Danny Harari and Shimon Ullman
- 10.07.2013

- Elhanan Elboher - HUJI

- Efficient and Robust Visual Matching based on Low Rank Decompositions
Abstract:
- 03.07.2013

- Daniel Zoran - HUJI

- Natural Images, Gaussian Mixtures and Dead Leaves
Abstract:
- 19.06.2013

- Ehud Barnea - BGU

- RGB-D Object Detection from Partial Pose Estimation of Symmetric Objects
Abstract:
- 12.06.2013

- Ofer Levi - BGU

- Application of computational methods and optimization to imaging and remote sensing problems
Abstract:
- 29.05.2013

- Schechner Yoav - Technion

- Geometry from Refracted Radiance
Abstract:Imaging effects may sometimes originate outside the field of view.
Peripheral light strays into the image by refraction, scattering and
reflection (lens flare). These effects reveal scene geometry. In particular,
we will show how refraction can help recover 3D underwater scene
structure when viewing down from space. On the other hand, being submerged, a
look upwards creates a 'virtual periscope' for observing objects in air.
Despite undetermined distortions created by a wavy dynamic water surface,
triangulating airborne objects is possible in such settings. This may hint at the abilities of some animals that prey through the water surface.
In addition, the wavy water surface refracts natural illumination, creating random caustic illumination patterns. These patterns lead to accurate (and surprisingly simple) recovery of submerged structures.
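As a concrete anchor for the geometry involved, below is a minimal vector form of Snell's law for a flat air-water interface (the numbers and the `refract` helper are illustrative, not from the talk); the settings above concern exactly this mapping when the surface is wavy and dynamic.

```python
import numpy as np

def refract(d, n, eta):
    """Refract unit direction d at a surface with unit normal n.
    eta = n1/n2 (ratio of refractive indices); returns None on
    total internal reflection."""
    cos_i = -np.dot(n, d)
    sin2_t = eta ** 2 * (1.0 - cos_i ** 2)
    if sin2_t > 1.0:
        return None  # total internal reflection
    return eta * d + (eta * cos_i - np.sqrt(1.0 - sin2_t)) * n

# A ray looking down from air (n=1.0) into water (n=1.33):
d = np.array([0.3, 0.0, -1.0]); d /= np.linalg.norm(d)
n = np.array([0.0, 0.0, 1.0])      # flat-surface normal, pointing up
print(refract(d, n, 1.0 / 1.33))   # bends toward the normal underwater
```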
- 08.05.2013

- Lihi Zelnik - Technion

- Saliency detection in videos and images
Abstract:Finding the salient pixels in an image or a video is useful for a broad range of tasks. Applications such as cropping, compression, and editing all aim at preserving the important regions intact. Hence, the goal of this talk is to present algorithms for
finding the “important” pixels in images and videos. This will be done while outlining the mutual and the disjoint factors that affect saliency in images and videos. I will show that the inherent differences suggest video saliency should be approached differently from image saliency and the corresponding algorithms should not be intuitive extensions of image saliency.
- 01.05.2013

- Efrat Dubi - BGU

- 3D constrained surface reconstruction from 2D image
Abstract:
- 17.04.2013

- Mor Ben Tov - Neural Computation @ BGU

- Visual search in the archer fish
Abstract:Vision begins when the image falls on the retina and continues downstream when higher visual areas in the brain process the image to enable visual behavior. One of the most important tasks the visual system needs to deal with is visual search; i.e., the detection of a specific, important object, such as a food item or a predator, in the visual environment. Thus, the successful implementation of visual search is critical for the visual behavior of almost every organism, and it is no wonder that much scientific effort has been devoted to understanding this subject. Earlier studies were primarily conducted with mammals and explored the relationship between the information processing properties of cells in the visual cortex, such as orientation and direction selectivity, and the animal's ability to perform a visual search. Visual search is widespread across the animal kingdom, and a better understanding of it and of the computation it requires can benefit from studying organisms other than mammals. In this study we test the ability of the archer fish to perform visual search, and explore whether the functional properties of cells in the optic tectum, the main visual area in the fish brain, are the basis for information processing during this task. We chose the archer fish since it hunts by shooting down insects situated up to two meters above the water level by squirting water from its mouth. We combine a behavioral study with electrophysiology experiments in the archer fish to understand visual search in this interesting animal. The results from this study may shed light on how information processing during visual search is performed and may lead to a better understanding of this fundamental issue in vision.
- 03.04.2013

- Assaf Glazer - Technion

- One-Class Background Model
Abstract:Background models are often used in video surveillance systems to find moving objects in an image sequence from a static camera. These models are often built under the assumption that the foreground objects are not known in advance. This assumption has led us to model background using one-class classifiers. Our model belongs to a family of block-based nonparametric models that can be used effectively for highly complex scenes of various background distributions with almost the same configuration parameters for all examined videos.
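The talk's model is block-based and nonparametric; purely as a sketch of the one-class idea, here is a per-block background model built with scikit-learn's OneClassSVM, trained on a hypothetical stack of background-only grayscale frames (all shapes and parameters are illustrative).

```python
import numpy as np
from sklearn.svm import OneClassSVM

B = 16  # block size

def block_features(frame):
    """Flatten a grayscale frame into per-block feature vectors."""
    H, W = frame.shape
    blocks = frame[:H // B * B, :W // B * B].reshape(H // B, B, W // B, B)
    return blocks.transpose(0, 2, 1, 3).reshape(-1, B * B)

# hypothetical background-only training clip (T frames, static camera)
rng = np.random.default_rng(0)
frames = 0.5 + 0.05 * rng.standard_normal((30, 64, 64))

feats = np.stack([block_features(f) for f in frames])   # (T, N, B*B)
models = [OneClassSVM(nu=0.05, gamma='scale').fit(feats[:, i, :])
          for i in range(feats.shape[1])]

def foreground_blocks(frame):
    """True for blocks whose appearance the background model rejects."""
    return np.array([m.predict(f[None])[0] == -1
                     for m, f in zip(models, block_features(frame))])
```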
- 20.02.2013

- Shai Avidan - Tel Aviv university

- Locally Orderless Tracking
Abstract:Locally Orderless Tracking (LOT) is a visual tracking algorithm that automatically estimates the amount of local (dis)order in the object. This lets the tracker specialize in both rigid and deformable objects on-line and with no prior assumptions. We provide a probabilistic model of the object variations over time. The model is implemented using the Earth Mover's Distance (EMD) with two parameters that control the cost of moving pixels and changing their color. We adjust these costs on-line during tracking to account for the amount of local (dis)order in the object. We show LOT's tracking capabilities on challenging video sequences, both commonly used and new, demonstrating performance comparable to state-of-the-art methods.
Joint work with Shaul Oron, Aharon Bar-Hillel and Dan Levi
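A rough sketch of the EMD at the heart of the tracker, using the POT optimal-transport library: a patch is treated as a signature of pixel positions and colors, and the ground cost exposes the two knobs (cost of moving pixels, cost of changing color) that LOT adapts online. Function and parameter names here are mine, not the paper's.

```python
import numpy as np
import ot  # POT: Python Optimal Transport

def lot_distance(pix_a, col_a, pix_b, col_b, c_xy=0.1, c_rgb=1.0):
    """EMD between two patch signatures; c_xy charges for moving a
    pixel, c_rgb for changing its color (the adapted parameters)."""
    M = (c_xy * ot.dist(pix_a, pix_b, metric='euclidean')
         + c_rgb * ot.dist(col_a, col_b, metric='euclidean'))
    a = np.full(len(pix_a), 1.0 / len(pix_a))   # uniform weights
    b = np.full(len(pix_b), 1.0 / len(pix_b))
    return ot.emd2(a, b, M)                     # exact EMD cost

# toy usage: two 8x8 patches as (position, color) point sets
rng = np.random.default_rng(1)
xy = np.stack(np.meshgrid(np.arange(8), np.arange(8)), -1).reshape(-1, 2) / 7.0
ca, cb = rng.random((64, 3)), rng.random((64, 3))
print(lot_distance(xy, ca, xy, cb))
```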
- 13.02.2013

- Tammy R. Raviv - EE@BGU

- Joint segmentation of MRI ensembles via latent atlases
Abstract:Probabilistic atlases play an important role in MRI segmentation. However, the availability of comprehensive, reliable and suitable manual segmentations for atlas construction is limited. Moreover, most existing atlases, while adequate for modeling healthy adult brains, are unable to capture the versatility and dynamics of brain anatomy due to growth, aging or pathology.
In the talk I will present a framework for the joint segmentation of corresponding regions of interest in a collection of aligned MR images that does not require labeled training data.
Instead, a latent atlas initialized by a few (and even a single) manual segmentation(s), is inferred from the evolving segmentations of the ensemble.
The proposed methodology is based on probabilistic principles but is solved using partial differential equations (PDEs) and energy minimization criteria.
I will also introduce the recently developed spatio-temporal latent atlas model for the analysis of longitudinal data. The method, applied to fetal MRI, enables capturing developmental characteristics or trajectories of cerebral structures during the early growth stages of the human brain. A continuous representation of the temporal domain, which allows the interpolation of time points not represented in the available dataset, is accomplished via kernel regression.
The proposed technique is successfully demonstrated for the segmentation of fetal brains at 20-30 gestational weeks. The resulting spatio-temporal atlas captures morphological variability due to both cross-sectional differences and a spread in developmental rate across the population.
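The continuous temporal representation mentioned above boils down to kernel regression over gestational age; a minimal Nadaraya-Watson version, with hypothetical inputs and bandwidth, looks like this:

```python
import numpy as np

def kernel_regress(t_query, t_obs, maps_obs, h=1.0):
    """Nadaraya-Watson estimate of a spatial map (e.g. label
    probabilities) at age t_query from maps observed at ages t_obs.
    maps_obs: (N, H, W) array; h: kernel bandwidth in weeks."""
    w = np.exp(-0.5 * ((t_obs - t_query) / h) ** 2)   # Gaussian kernel
    w /= w.sum()
    return np.tensordot(w, maps_obs, axes=1)          # weighted average

# toy usage: interpolate an atlas at week 24.5 from scans at integer weeks
t_obs = np.arange(20, 31, dtype=float)
maps = np.random.default_rng(0).random((11, 64, 64))
atlas_24_5 = kernel_regress(24.5, t_obs, maps)
```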
- 06.02.2013

- Boaz Arad - BGU

- Hyperspectral recovery from RGB source
Abstract:
- 23.01.2013

- Yuval Netzer - Google

- Reading text in Google Goggles and Streetview images
Abstract:Detecting and reading text from natural images is a hard computer
vision task that is central to a variety of emerging applications.
Related problems like document character recognition have been widely
studied by computer vision and machine learning researchers and are
virtually solved for practical applications such as reading
handwritten digits or reading scanned books/documents. Nevertheless,
reliably recognizing characters in more complex scenes like
photographs is far more difficult: the best existing methods lag well
behind human performance on the same tasks. In this talk we discuss
the problem of reading text in images, such as in Street View and
Google Goggles images, and present our approach for tackling this hard
problem.
- 16.01.2013

- Yael Moses - IDC

- Photo-Sequencing
Abstract:Dynamic events such as family gatherings, concerts or sports events are often captured by a group of people. The set of still images obtained this way is rich in dynamic content but lacks accurate temporal information. We propose a method for photo-sequencing - temporally ordering a set of still images taken asynchronously by a set of uncalibrated cameras. Photo-sequencing is an essential tool in analyzing (or visualizing) a dynamic scene captured by still images. We demonstrate successful photo-sequencing on several challenging collections of images taken using a number of mobile phones.
This is a joint work with Tali Basha and Shai Avidan from TAU.
- 02.01.2013

- Boaz Nadler - Weizmann

- Vectorial phase retrieval
Abstract:Phase retrieval - namely, the recovery of a signal from its absolute Fourier
transform coefficients - is a problem of fundamental importance in many fields.
While in two dimensions phase retrieval typically has a unique solution,
in 1-D the phase retrieval problem is often not even well posed,
admitting multiple solutions.
In this talk I'll present a novel framework for reconstruction of
pairs of signals, from measurements of both their spectral intensities,
and of their mutual interferences. We show that for noise-free measurements of
compactly supported signals, this new setup, denoted vectorial phase retrieval,
admits a unique solution. We then derive a computationally efficient
and statistically robust spectral algorithm to solve the vectorial phase
retrieval problem, as well as a model selection criterion to estimate the
unknown compact support. We illustrate the reconstruction performance of our
algorithm on several simulated signals.
Joint work with Oren Raz and Nirit Dudovich (Weizmann Institute of Science).
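The core observation, that mutual interference makes the relative phase measurable, can be checked in a few lines. The toy below assumes, for illustration, that four spectral intensities can be recorded (including an interference with a quarter-wave shifted copy); the actual algorithm adds noise robustness and support estimation.

```python
import numpy as np

rng = np.random.default_rng(0)
f, g = rng.standard_normal(64), rng.standard_normal(64)
F, G = np.fft.fft(f), np.fft.fft(g)

# Four intensity-only (phaseless) measurements:
I_f, I_g = np.abs(F) ** 2, np.abs(G) ** 2
I_sum = np.abs(np.fft.fft(f + g)) ** 2         # interference of f and g
I_quad = np.abs(np.fft.fft(f + 1j * g)) ** 2   # quarter-wave shifted copy

# Together they pin down the complex cross-spectrum F * conj(G):
cross = 0.5 * (I_sum - I_f - I_g) + 0.5j * (I_quad - I_f - I_g)
assert np.allclose(cross, F * np.conj(G))

# So the *relative* phase of the pair is measurable without phase data:
relative_phase = np.angle(cross)   # angle(F) - angle(G), mod 2*pi
```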
- 19.12.2012

- Tal Hassner - Open University

- Subspaces, SIFTs, and Scale Invariance
Abstract:Scale invariant feature detectors often find stable scales in only a few image pixels. Consequently, methods for feature matching typically choose one of two extreme options: matching a sparse set of scale invariant features, or dense matching using arbitrary scales. In this talk we turn our attention to the overwhelming majority of pixels, those where stable scales are not found by standard techniques. We ask, is scale-selection necessary for these pixels, when dense, scale-invariant matching is required and if so, how can it be achieved? We will show the following: (i) Features computed over different scales, even in low-contrast areas, can be different; selecting a single scale, arbitrarily or otherwise, may lead to poor matches when the images have different scales. (ii) Representing each pixel as a set of SIFTs, extracted at multiple scales, allows for far better matches than single-scale descriptors, but at a computational price. Finally, (iii) each such set may be accurately represented by a low-dimensional, linear subspace. A subspace-to-point mapping may further be used to produce a novel descriptor representation, the Scale-Less SIFT (SLS), as an alternative to single-scale descriptors.
* This is joint work with Viki Mayzels, and Lihi Zelnik-Manor
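A rough numpy/OpenCV sketch of the pipeline for a single pixel: compute SIFTs at several scales, fit a low-dimensional subspace to the set, and flatten the subspace's projection matrix into one descriptor vector. This follows the general subspace-to-point idea described above; the actual SLS descriptor involves further design choices, and the image, location and scale values here are placeholders.

```python
import cv2
import numpy as np

# synthetic image stand-in; replace with a real grayscale image
rng = np.random.default_rng(0)
img = (rng.random((200, 200)) * 255).astype(np.uint8)

sift = cv2.SIFT_create()
x, y, scales = 120, 80, [2, 4, 8, 16, 24]
kps = [cv2.KeyPoint(float(x), float(y), float(s)) for s in scales]
kps, descs = sift.compute(img, kps)        # one 128-D SIFT per scale

# Low-dimensional linear subspace spanned by the multi-scale SIFT set:
A = descs - descs.mean(0)
U, _, _ = np.linalg.svd(A.T, full_matrices=False)
basis = U[:, :3]                            # 128 x 3 orthonormal basis

# Subspace-to-point mapping: flatten the projection matrix so that
# standard (Euclidean) descriptor matching can compare subspaces.
sls = (basis @ basis.T)[np.triu_indices(128)]
```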
- 05.12.2012

- Michal Shemesh - BGU

- TBD
Abstract:TBD
- 07.11.2012

- Orit Kliper - WIS

- Action Recognition in challenging Real-World Videos
Abstract:Action recognition, the problem of recognizing the action performed by humans
from digital videos, is a central research theme in computer vision. In recent years,
computer vision research shows increasing interest in challenging, real-world scenarios.
However, while real-world object recognition, face identification, image
similarity and visual navigation systems demonstrate improved capabilities and
become useful in many practical scenarios, existing action recognition systems
still fall short of the real-world applications' needs. Much of the advancement in
other computer vision domains has been attributed to the emergence of better
image descriptors, the better use of learning methods, and the advent of large and
realistic benchmarks that help push performance boundaries. In our work
we focus on advancing the capabilities of computer vision systems for the task
of action recognition in realistic scenarios by making the following contributions:
(1) We have collected a unique, comprehensive data set of real-world videos of human
actions, the "Action Similarity LAbeliNg" (ASLAN) database, and we have
established a well defined testing protocol on this set, focusing on action similarity,
rather than action classification. (2) We report the performance of a variety
of leading action systems on this set, and we have conducted an extensive human
survey, which demonstrates the significant gap between machine and human performance.
(3) We have developed a metric learning technique that is geared towards
improved similarity performance, and finally (4), we have developed two new video
descriptors: the Motion Interchange Patterns (MIP), for general Action Recognition,
and the VIolent-Flows (ViF) descriptor, designed for the particular task of
violent crowd-behavior detection. Both these descriptors focus on encoding local
changes within videos.
- 31.10.2012

- Tali Treibitz - UCSD

- (TENTATIVE)
Abstract:TBD
- 24.10.2012

- Yair Adato - BGU

- MeanTexture
Abstract:
- 12.09.2012

- Chetan Arora - IIT

- MRF-MAP Labeling
Abstract:Many tasks in low level computer vision can be formulated as labeling problems. With the labeling satisfying the properties of a Markov Random Field (MRF) and the most suitable labeling defined as the one which maximizes the a posteriori (MAP) probability, a labeling problem can be modeled as a constrained optimization problem. We develop algorithms based on primal-dual techniques to solve various classes of MRF-MAP labeling problems. We deal with 2-label problems where the clique size is more than two. Such modeling is known to be useful, but is hardly used, as the most popular algorithmic approaches for solving these problems have involved transforming the higher order potentials to an equivalent quadratic form and then using approximate algorithms developed for optimizing non-submodular quadratic forms. This technique is inefficient and suboptimal.
We give an algorithm for higher order labeling problems with submodular costs which runs in low order polynomial time for fixed clique sizes. In the two-clique version of the problem, optimizing the primal can be viewed as finding a min cut and optimizing the dual as solving a max flow problem; we show that these concepts can be generalized to higher order clique problems. The dual framework has resulted in a novel max-flow min-cut theorem in which both the capacity of an edge and the cost of a cut have new but natural generalizations.
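For orientation, here is the classical two-clique (pairwise) special case that the talk generalizes: a binary MRF-MAP problem solved exactly as a min cut, using the PyMaxflow library on a toy denoising energy (all weights arbitrary, chosen only for illustration).

```python
import numpy as np
import maxflow  # PyMaxflow

# Pairwise, 2-label MRF-MAP as a min cut on a grid graph.
rng = np.random.default_rng(0)
noisy = (rng.random((64, 64)) < 0.5).astype(float)  # toy binary image

g = maxflow.Graph[float]()
nodes = g.add_grid_nodes(noisy.shape)
g.add_grid_edges(nodes, 1.5)               # smoothness (pairwise cliques)
# terminal edges encode the unary data costs for the two labels
g.add_grid_tedges(nodes, noisy, 1.0 - noisy)
g.maxflow()
labels = g.get_grid_segments(nodes)        # exact MAP labeling (boolean)
```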
- 02.09.2012

- Oren Freifeld - Brown

- Lie Bodies: A Manifold Representation of 3D Human Shape
Abstract:Three-dimensional object shape is commonly represented in terms of deformations of a triangular mesh from an exemplar shape. In particular, statistical generative models of human shape deformation are widely used in computer vision, graphics, ergonomics, and anthropometry. Existing models, however, are based on a Euclidean representation of shape deformations. In contrast, we argue that shape has a manifold structure: For example, averaging the shape deformations for two people does not necessarily yield a deformation corresponding to a valid human shape, nor does the Euclidean difference of these two deformations provide a meaningful measure of shape dissimilarity. Consequently, we define a novel manifold for shape representation, with emphasis on body shapes, using a new Lie group of deformations. This has several advantages. First, we define triangle deformations exactly, removing non-physical deformations and redundant degrees of freedom common to previous methods. Second, the Riemannian structure of Lie Bodies enables a more meaningful definition of body shape similarity by measuring distance between bodies on the manifold of body shape deformations. Third, the group structure allows the valid composition of deformations. This is important for models that factor body shape deformations into multiple causes or represent shape as a linear combination of basis shapes. Similarly, interpolation between two mesh deformations results in a meaningful third deformation. Finally, body shape variation is modeled using statistics on manifolds. Instead of modeling Euclidean shape variation with Principal Component Analysis we capture shape variation on the manifold using Principal Geodesic Analysis. Our experiments show consistent visual and quantitative advantages of Lie Bodies over traditional Euclidean models of shape deformation and our representation can be easily incorporated into existing methods.
Remark: Time permitting, I may also briefly discuss additional recent works from our lab
related to statistical generative models of 2D and 3D detailed human shape and pose.
Joint work with Michael Black.
- 18.07.2012

- Prof. Ohad Ben-Shahar, BGU

- On "cortical vision" without visual cortex
Abstract:Our visual attention is attracted by salient stimuli in our
environment and affected by primitive features such as orientation,
color, and motion. Perceptual saliency due to orientation contrast has
been extensively demonstrated in behavioral experiments with humans
and other primates and is commonly explained by the very particular
functional organization of the primary visual cortex. We challenge
this prevailing view by studying orientation-based visual saliency in
two non-mammalian species with enormous evolutionary distance to
humans. The surprising results not only imply the need to reestablish
our understanding of how these processes work at the neural level, but
they also suggest that orientation-based saliency has computational
optimality in a wide variety of ecological contexts, and thus
constitutes a universal building block for efficient visual
information processing in general.
- 11.07.2012

- Guy Rosman - Technion

- Novel Parameterizations in Motion Analysis
Abstract:Choosing the right parameterization is crucial in motion estimation
and analysis. In this talk we demonstrate two important cases in
motion analysis where the right selection of local parameterization
leads to simple and yet generic formulations.
Specifically, in 2D stereovision we suggest using the plane equation
and planar homographies as a basis for an over-parameterized optical
flow estimation. The algorithm obtains state-of-the-art results in
optical flow computation and its regularization term has a physically
meaningful interpretation bridging the gap between optical flow
computation and scene understanding.
In 3D motion understanding we incorporate a Lie-group
representation into an Ambrosio-Tortorelli scheme for analysis and
segmentation of articulated body motion. The resulting algorithm
obtains results comparable to those of domain specific, tailored,
tools, on 3D range data, with fast variational schemes suggested for
the regularization.
- 04.07.2012

- Simon Korman - Tel Aviv

- Coherency Sensitive Hashing
Abstract:Coherency Sensitive Hashing (CSH) extends Locality Sensitive Hashing (LSH) and PatchMatch to quickly find matching patches between two images. LSH relies on hashing, which maps similar patches to the same bin, in order to find matching patches. PatchMatch, on the other hand, relies on the observation that images are coherent to propagate good matches to their neighbors in the image plane. It uses random patch assignment to seed the initial matching. CSH relies on hashing to seed the initial patch matching and on image coherence to propagate good matches. In addition, hashing lets it propagate information between patches with similar appearance (i.e., those that map to the same bin). This way, information is propagated much faster because it can use similarity in appearance space or neighborhood in the image plane. As a result, CSH is at least three to four times faster than PatchMatch and more accurate, especially in textured regions, where reconstruction artefacts are most noticeable to the human eye. We verified CSH on a new, large scale data set of 133 image pairs. In further recent work, we show preliminary results of applying this technique to RGB+depth images (as provided by Kinect-like sensors).
Joint work with Shai Avidan
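A toy sketch of the hashing-seeding half of the idea: patches of both images are hashed with random hyperplanes (LSH), and each patch in A is seeded with a same-bucket patch in B instead of PatchMatch's purely random initialization. The coherence-propagation pass that CSH combines with this is omitted, and all names are mine.

```python
import numpy as np

def patch_matrix(img, p=8):
    """All overlapping p x p patches of a grayscale image, flattened."""
    H, W = img.shape
    return np.stack([img[i:i + p, j:j + p].ravel()
                     for i in range(H - p + 1)
                     for j in range(W - p + 1)])

def lsh_seed_matches(A, B, p=8, bits=12, seed=0):
    rng = np.random.default_rng(seed)
    PA, PB = patch_matrix(A, p), patch_matrix(B, p)
    proj = rng.standard_normal((p * p, bits))      # random hyperplanes
    codeA = ((PA @ proj) > 0).astype(int) @ (1 << np.arange(bits))
    codeB = ((PB @ proj) > 0).astype(int) @ (1 << np.arange(bits))
    buckets = {}
    for k, c in enumerate(codeB):
        buckets.setdefault(c, []).append(k)
    # seed each A-patch with a same-bucket B-patch (random fallback)
    return [buckets.get(c, [int(rng.integers(len(PB)))])[0] for c in codeA]
```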
- 13.06.2012

- Ilan Kadar - BGU

- Small Sample Scene Categorization from Perceptual Relations
Abstract:This paper addresses the problem of scene categorization while arguing that
better and more accurate results can be obtained by endowing the computational process
with perceptual relations between scene categories.
We first describe a psychophysical paradigm that probes human scene categorization, extracts
perceptual relations between scene categories, and suggests that these perceptual relations
do not always conform to the semantic structure between categories.
We then incorporate the obtained perceptual findings into a computational classification scheme,
which takes inter-class relationships into account to obtain better scene categorization
regardless of the particular descriptors with which scenes are represented.
We present such improved classification results using several popular descriptors, we discuss why the contribution
of inter-class perceptual relations is particularly pronounced for under-sampled training sets, and we argue
that this mechanism may explain the ability of the human visual system to perform well under similar conditions.
Finally, we introduce an online experimental system for obtaining perceptual relations
for large collections of scene categories.
- 23.05.2012

- Todd Zickler - Harvard

- Toward computer vision on a tight budget
Abstract:At least in the near term, micro-scale platforms like micro air vehicles and micro sensor nodes are unlikely to have power, volume, or mass budgets to support conventional imaging and post-capture processing for visual tasks like detection and tracking. These budgets are severe enough that even common computations, such as large matrix manipulations and convolutions, are difficult or impossible. To help overcome this, we are considering sensor designs that allow some components of scene analysis to happen optically, before light strikes the sensor. I will present and analyze one class of designs in this talk. These sensors reduce power requirements through template-based optical convolution, and they enable a wide field-of-view within a small form. I will describe the trade-offs between field-of-view, volume, and mass in these sensors, and I will describe our initial efforts toward choosing effective templates. I will also show examples of milli-scale prototypes for simple computer vision tasks such as locating edges, tracking targets, and detecting faces.
Related publications:
- Koppal, et al., "Wide-angle micro sensors for vision on a tight budget." CVPR 2011.
- Gkioulekas and Zickler, "Dimensionality reduction using the sparse linear model." NIPS 2011.
- 02.05.2012

- Uri Shalit - Hebrew University

- Online Learning in The Manifold of Low-Rank Matrices
Abstract:In this talk I will focus on the problem of learning models that are represented in matrix forms, such as models used for multiclass learning or metric and similarity learning.
For such models, enforcing a low-rank constraint can dramatically improve the memory and run time complexity, while providing a natural regularization of the model.
However, naive approaches to minimizing functions over the set of low-rank matrices are either prohibitively time consuming (repeated singular value decomposition of the matrix) or numerically unstable (optimizing a factored representation of the low-rank matrix). We built on recent advances in optimization over manifolds and developed an iterative online learning procedure that can be computed efficiently. It has run time and memory complexity of O((n+m)k) for a rank-k matrix of dimensions m x n.
We used this algorithm, LORETA, for two tasks: the learning of a matrix-form similarity measure over pairs of documents represented as high dimensional vectors, and the ranking of 1600 possible labels for a set of one million images (taken from ImageNet).
Finally, if time allows, I will discuss our newest work which focuses on learning interpretable similarity models.
This is joint work with Prof. Daphna Weinshall and Dr. Gal Chechik
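For contrast with LORETA's cheap manifold retraction, here is the naive projected-gradient baseline the abstract calls prohibitively time consuming: a gradient step followed by a truncated SVD back onto the rank-k matrices, on a toy similarity-learning loss (all names and numbers hypothetical).

```python
import numpy as np

def projected_gradient_step(W, grad, k, lr=0.1):
    """Naive low-rank step: gradient update, then rank-k truncation via
    SVD. LORETA instead uses a retraction costing O((n+m)k)."""
    W = W - lr * grad
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :k] * S[:k]) @ Vt[:k]

# toy usage: learn W so that x' W y_pos exceeds x' W y_neg (hinge-like)
rng = np.random.default_rng(0)
n, k = 50, 5
W = np.zeros((n, n))
x, y_pos, y_neg = rng.standard_normal((3, n))
for _ in range(100):
    if x @ W @ y_pos - x @ W @ y_neg < 1.0:
        grad = -np.outer(x, y_pos - y_neg)
        W = projected_gradient_step(W, grad, k)
```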
- 28.03.2012

- Daniel Glasner - Weizmann Institute

- Viewpoint-Aware Object Detection and Pose Estimation
Abstract:We describe an approach to category-level detection and viewpoint estimation for rigid 3D objects from single 2D images. In contrast to many existing methods, we directly integrate 3D reasoning with an appearance-based voting architecture. Our method relies on a nonparametric representation of a joint distribution of shape and appearance of the object class. Our voting method employs a novel parametrization of joint detection and viewpoint hypothesis space, allowing efficient accumulation of evidence. We combine this with a re-scoring and refinement mechanism, using an ensemble of view-specific Support Vector Machines. We evaluate the performance of our approach in detection and pose estimation of cars on a number of benchmark datasets.
This is joint work with Meirav Galun, Sharon Alpert, Ronen Basri and Gregory Shakhnarovich.
Project webpage: http://www.wisdom.weizmann.ac.il/~vision/viewpoint-aware/index.html
- 22.02.2012

- Dan Raviv - Technion

- Equi-affine invariant intrinsic geometries for bendable shape analysis
Abstract:Traditional models of bendable surfaces are based on the exact or approximate invariance to deformations that do not tear or stretch the shape,
leaving intact an intrinsic geometry associated with it.
Intrinsic geometries are typically defined using either the shortest path length (geodesic distance), or properties of heat diffusion (diffusion distance) on the surface.
Both ways are implicitly derived from the metric induced by the ambient Euclidean space.
In this paper, we depart from this restrictive assumption by observing that a different choice of the metric results in a richer set of geometric invariants.
We extend the classic equi-affine arclength, defined on convex surfaces, to arbitrary shapes
with non-vanishing Gaussian curvature. As a result, a family of affine-invariant intrinsic geometries is obtained.
The potential of this novel framework is explored in a wide range of applications such as shape matching and retrieval,
symmetry detection, and computation of Voronoi tessellations.
We show that in some shape analysis tasks, our affine-invariant intrinsic geometries often outperform their Euclidean-based counterparts.
This work was done in collaboration with Dr. Alex Bronstein (TAU), Dr. Michael Bronstein (Lugano, Switzerland), Prof. Ron Kimmel (Technion) and Prof. Nir Sochen (TAU).
- 01.02.2012

- Daniel Zoran - The Hebrew University

- From Learning Models of Natural Image Patches to Whole Image Restoration
Abstract:Learning good image priors is of utmost importance for the study of vision, computer vision and image processing applications. Learning priors and optimizing over whole images can lead to tremendous computational challenges. In contrast, when we work with small image patches, it is possible to learn priors and perform patch restoration very efficiently. This raises three questions - do priors that give high likelihood to the data also lead to good performance in restoration? Can we use such patch based priors to restore a full image? Can we learn better patch priors? In this work we answer these questions.
We compare the likelihood of several patch models and show that priors that give high likelihood to data perform better in patch restoration. Motivated by this result, we propose a generic framework which allows for whole image restoration using any patch based prior for which a MAP (or approximate MAP) estimate can be calculated. We show how to derive an appropriate cost function, how to optimize it and how to use it to restore whole images. Finally, we present a generic, surprisingly simple Gaussian Mixture prior, learned
from a set of natural images. When used with the proposed framework, this Gaussian Mixture Model outperforms all other generic prior methods for image denoising, deblurring and inpainting.
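A condensed sketch of patch-MAP restoration under a Gaussian mixture prior, assuming a GMM fitted (here with scikit-learn, on stand-in data) over clean patches: each noisy patch is Wiener-filtered with its most responsible mixture component, and overlapping estimates are averaged. The talk's framework optimizes a whole-image cost rather than plain averaging, so treat this only as the patch-level core.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

p, sigma = 8, 0.1

def denoise(noisy, gmm):
    """Patch-MAP under a GMM prior; overlapping estimates averaged."""
    H, W = noisy.shape
    out, cnt = np.zeros_like(noisy), np.zeros_like(noisy)
    for i in range(H - p + 1):
        for j in range(W - p + 1):
            y = noisy[i:i + p, j:j + p].ravel()
            k = gmm.predict(y[None])[0]       # most responsible component
            C, mu = gmm.covariances_[k], gmm.means_[k]
            # Wiener / MAP estimate for Gaussian prior + Gaussian noise
            xhat = mu + C @ np.linalg.solve(C + sigma ** 2 * np.eye(p * p),
                                            y - mu)
            out[i:i + p, j:j + p] += xhat.reshape(p, p)
            cnt[i:i + p, j:j + p] += 1
    return out / cnt

# toy demo: 'clean_patches' would come from natural images in practice
rng = np.random.default_rng(0)
clean_patches = rng.random((2000, p * p))
gmm = GaussianMixture(n_components=5, covariance_type='full').fit(clean_patches)
img = rng.random((32, 32))
restored = denoise(img + sigma * rng.standard_normal((32, 32)), gmm)
```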
- 16.11.2011

- Michael Chertok - Bar Ilan University

- Spectral Symmetry Analysis
Abstract:Spectral relaxation was shown to provide an efficient approach for solving a gamut of computational problems, ranging from data mining to image registration. We show that in the context of graph matching, spectral relaxation can be applied to the detection and analysis of symmetries in n dimensions. First, we cast symmetry detection of a set of points in R^n as the self-alignment of the set to itself. Thus, by representing an object by a set of points S ⊂ R^n, symmetry is manifested by multiple self-alignments. Second, we formulate the alignment problem as a quadratic binary optimization problem, solved efficiently via spectral relaxation. Thus, each eigenvalue corresponds to a potential self-alignment, and eigenvalues with multiplicity greater than one correspond to symmetric self-alignments. The corresponding eigenvectors reveal the point alignment and pave the way for further analysis of the recovered symmetry. We apply our approach to image analysis by using local features to represent each image as a set of points. Finally, we improve the scheme's robustness by inducing geometrical constraints on the spectral analysis results. Our approach is verified by extensive experiments and was applied to two- and three-dimensional synthetic and real-life images.
- 03.08.2011

- Tal Hassner - The Open University

- Patch-Based LBPs, One-Shot Similarities, and Real-world Face Recognition
Abstract:Computer Vision systems have demonstrated considerable improvement in recognizing and verifying faces in digital images. Still, recognizing faces appearing in unconstrained, natural conditions remains a challenging task. In this talk I will present a face image, pair-matching approach developed and tested on data-sets that reflect the challenges of face recognition from unconstrained images and videos. I will discuss the following contributions. (a) We introduce a family of novel face-image descriptors designed to capture statistics of local patch similarities. (b) We demonstrate how unlabeled background samples may be used to better evaluate image and video similarities. To this end we describe a number of novel, effective similarity measures. (c) We present the YouTube Faces data-set and associated benchmark for pair-matching of face videos obtained in the wild. Our system achieves state-of-the-art results on both the LFW benchmarks as well as the new YouTube Faces set. In addition, it is well suited for multi-label face classification (recognition) problems, on both LFW images and on images from the laboratory controlled multiPIE database.
* This is joint work with Prof. Lior Wolf from TAU, Yaniv Taigman from face.com, Itay Maoz from TAU, and Orit Kliper-Gross from the Weizmann Institute
- 27.07.2011

- Maria Zontak - Weizmann institute

- Internal Statistics of a Single Natural Image
Abstract:Statistics of "natural images" provides useful priors for solving under-constrained problems in Computer Vision. Such statistics is usually obtained from large collections of natural images. We claim that the substantial internal data redundancy within a single natural image (e.g., recurrence of small image patches) gives rise to powerful internal statistics, obtained directly from the image itself. While internal patch recurrence has been used in various applications, we provide a parametric quantification of this property. We show that the likelihood of an image patch to recur at another image location can be expressed parametrically as a function of the spatial distance from the patch and its gradient content. This "internal parametric prior" is used to improve existing algorithms that rely on patch recurrence.
Moreover, we show that internal image-specific statistics is often more powerful than general external statistics, giving rise to more powerful image-specific priors. In particular:
(i) Patches tend to recur much more frequently (densely) inside the same image than in any random external collection of natural images.
(ii) Finding an equally good external representative patch for all the patches of an image requires an external database of hundreds of natural images.
(iii) Internal statistics often has stronger predictive power than external statistics, indicating that it may potentially give rise to more powerful image-specific priors.
*Joint work with Michal Irani.
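A small empirical probe of the property described above, with hypothetical parameter values: for random patches of a grayscale image, find the best internal match within a given spatial radius; comparing the resulting SSDs across radii (or against patches from other images) traces the distance-dependent recurrence the talk quantifies parametrically.

```python
import numpy as np

def internal_nn_ssd(img, p=7, trials=100, radius=20, seed=0):
    """For random patches, SSD to the best internal match within a
    spatial radius of the source location (excluding the patch itself)."""
    rng = np.random.default_rng(seed)
    H, W = img.shape
    out = []
    for _ in range(trials):
        i = int(rng.integers(radius, H - p - radius))
        j = int(rng.integers(radius, W - p - radius))
        src = img[i:i + p, j:j + p]
        best = np.inf
        for di in range(-radius, radius + 1):
            for dj in range(-radius, radius + 1):
                if di == dj == 0:
                    continue
                cand = img[i + di:i + di + p, j + dj:j + dj + p]
                best = min(best, float(((src - cand) ** 2).sum()))
        out.append(best)
    return np.array(out)

# toy call on noise; a natural image would show far denser recurrence
print(internal_nn_ssd(np.random.default_rng(1).random((80, 80))).mean())
```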
- 27.04.2011

- Amir Egozi - BGU

- Markov Random Fields and HMMs
Abstract:
- 23.03.2011

- Michal Shemesh - BGU

- Multidimensional scaling with applications for computational vision
Abstract:
- 16.03.2011

- Dolev Pomeranz - BGU

- The patch transform
Abstract:
- 17.01.2011

- Tali Basha - TAU

- Multi-View Scene Flow Estimation: A View Centered Variational Approach
Abstract:Tali Basha from TAU will present her work:
Multi-View Scene Flow Estimation: A View Centered Variational Approach.