Kalanit Grill-Spector
Title: Population Receptive Fields in the Human Ventral Stream and their Role in Face Perception
|
Abstract: The cortical system for processing faces is a model system for studying the functional architecture of the human brain and its role in visual recognition for two reasons. First, the functional architecture of the cortical face network is well understood; second, activations in these face-selective regions are causally related to the perception of faces. Here, I will describe the similarities and differences between the functional architecture of the human face network and recently developed, neurally inspired deep neural networks for face recognition. In particular, I will discuss computations performed by population receptive fields (pRFs) in the human face network, describing three main findings. First, there is hierarchical organization within the face network, whereby pRFs become larger and more foveal across this hierarchy. Second, attention to faces both increases pRF size and shifts pRFs to the location of the attended faces. As a consequence, the representation of faces in the peripheral visual field, where visual acuity is the lowest, is enhanced. Third, properties of pRFs of the face network affect face recognition abilities: participants with larger pRFs, which allow better integration across facial features, perform better in face recognition than participants with smaller pRFs. These findings suggest a rethinking of classical computational models of face recognition, as they show not only that computational units of the face network are not size and position invariant, but also that the spatial tuning of these units has consequences for face recognition ability. I will end by discussing implications of these findings for building efficient deep neural networks for face recognition.
This work has been done in collaboration with Dr. Kevin Weiner, Dr. Kendrick Kay, Dr. Nathan Witthoft and Jesse Gomez.
Biography: Dr. Grill-Spector obtained a BSc in Electrical and Computer Engineering from Ben Gurion University, Beer-Sheva, Israel, a Masters in Computer Science from the Weizmann Institute of Science, Rehovot, Israel, and a PhD in Neurobiology from the Weizmann Institute of Science, Rehovot, Israel. After her studies she did postdoctoral research with Prof. Nancy Kanwisher at the Department of Brain and Cognitive Sciences at MIT. Since 2001, Dr. Grill-Spector has been a professor of Psychology and a member of the Stanford Neurosciences Institute at Stanford University. Her research uses functional and quantitative magnetic resonance imaging, diffusion imaging, electrocorticography, behavioral measurements, and computational models to study neural mechanisms of visual recognition in humans.
|
Mark Frank
Title: Recognizing Deception
|
Abstract: When encountering a person, humans first judge whether they recognize the person, and then they try to recognize that person's intentions. This presentation will discuss the science behind recognizing intentions, particularly when someone tries to conceal or mislead others about his or her intentions. It will conclude by discussing the implications this has for the lie catcher in the real world, in contexts such as business and security.
Biography: Mark G. Frank is a professor and chair of the Communication Department, as well as the director of the Communication Science Center, at the University at Buffalo, State University of New York. Dr. Frank received his Ph.D. in Social Psychology from Cornell University in 1989. He received a National Research Service Award from the National Institute of Mental Health to do postdoctoral research with Dr. Paul Ekman in the Psychiatry Department at the University of California at San Francisco Medical School. He has since worked at the School of Psychology at the University of New South Wales in Sydney, Australia, and the Communication Department at Rutgers University. In 2005 he accepted a position in his hometown at the School of Informatics at the University at Buffalo, where he created and directs the Communication Science Center. He has published numerous research papers on facial expressions, emotion, interpersonal deception, and violence in extremist groups, and he recently won the SUNY Chancellor’s Award for Excellence in Scholarship and Creative Activities. He has had research funding from the National Science Foundation, the US Department of Homeland Security, and the US Department of Defense to examine deception, aggression, and hidden emotion behaviors in checkpoint, law enforcement, and counter-terrorism situations. He is also the co-developer of a patented automated computer system to read facial expressions, for which he won a Visionary Innovator Award from the University at Buffalo. He has used these findings to lecture, consult with, and train virtually all US Federal Law Enforcement/Intelligence Agencies, as well as local/state and select foreign agencies such as CSIS (Canada), the Australian Federal Police, and Scotland Yard (UK). He is also one of the original members and a Senior Fellow of the FBI Behavioral Science Unit’s Terrorism Research and Analysis Project.
He has presented briefings on deception and counter-terrorism to the US Congress as well as the US National Academies of Sciences. He has also given workshops to the US Federal Judiciary, various state Courts, and foreign judges and magistrates. He has presented to numerous business groups as well, including Fortune 500 companies. Finally, he has made over 100 print, radio, and television appearances to talk about his work, including The New Yorker Magazine, Time Magazine, New York Times, Wall Street Journal, CBS Evening News, CNN, Fox News Channel, CNBC, National Public Radio, The Learning Channel, the Discovery Channel, the Oprah Show, the CBC, BBC, London Weekend Television, the Australian Today Show, the Sydney Morning Herald, and ABC Radio National, among others.
|
Lior Wolf
Title: Recent Advancements in Deep Learning for Face Identification and Face Detection
|
Abstract: Transfer learning plays a major role in the recent success of deep face recognition methods. Deep networks are trained to solve the multiclass classification problem using a cross entropy loss and the learned representations are then transferred to a different domain and are often employed for verification. In the talk, I will point to a few research questions, including: What is the ideal source dataset? How to train in the target domain? How to estimate the certainty of the identification?
One of the observations we make is that the transferred representations support only a few modes of separation and much of their dimensionality is underutilized. To alleviate this, we suggest learning, in the source domain, multiple orthogonal classifiers. We prove that this leads to a reduced-rank representation, which, however, supports more discriminative directions. For example, we obtain the second-best result on LFW for a single network using a training set that is a few orders of magnitude smaller than that of the leading literature network, while obtaining a very compact representation of only 51 dimensions.
Finally, I will present a state-of-the-art face detector, which has an astonishing runtime of 3 milliseconds.
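As a rough illustration of the transfer pipeline mentioned at the start of this abstract (train with a cross-entropy loss on a multiclass problem, then reuse the learned representation for verification), the following minimal Python sketch may help. It is not the speaker's implementation: the function names, the use of cosine similarity, and the 0.5 threshold are all assumptions made for the example.

```python
import math


def softmax_cross_entropy(logits, label):
    """Cross-entropy loss for one sample, given raw class scores (logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def same_identity(emb_a, emb_b, threshold=0.5):
    """Hypothetical verification rule: the threshold value is illustrative only."""
    return cosine_similarity(emb_a, emb_b) >= threshold
```

In this sketch, `softmax_cross_entropy` stands in for the source-domain training objective, while `same_identity` stands in for the target-domain verification step, applied to embeddings taken from the trained network.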
Biography: Lior Wolf is a full professor at the School of Computer Science at Tel-Aviv University. Previously, he was a post-doctoral associate in Prof. Poggio's lab at MIT. He graduated from the Hebrew University, Jerusalem, where he worked under the supervision of Prof. Shashua. Lior Wolf was awarded the 2008 Sackler Career Development Chair, the Colton Excellence Fellowship for new faculty (2006-2008), the Max Shlumiuk Award for 2004, and the Rothschild Fellowship for 2004. His joint work with Prof. Shashua in ECCV 2000 received the best paper award, and their work in ICCV 2001 received the Marr Prize honorable mention. He was also awarded the best paper award at the post-ICCV 2009 workshop on eHeritage and at the pre-CVPR 2013 workshop on action recognition. Prof. Wolf's research focuses on computer vision and deep learning and includes topics such as face identification, document analysis, NLP, digital paleography, and video action recognition.
|