Second Workshop on Analysis and Understanding of Document Images in Network Media (AUDINM)
La Rochelle, April 7-8, 2016
Chairs: Jean-Marc Ogier, Cheng-Lin Liu
This workshop
Document images are prevalent in network communication and social networks (such as Facebook, Twitter, and Weibo). Ordinary Web search and information extraction tools cannot recognize text in these heterogeneous collections of images. The technology for extracting text information from images, as developed in this project, is valuable for personal communication and for sensitive information extraction. Because sensitive security-related information is likely to be embedded in images, text extraction from images is important for anti-terrorism. Text information extraction from images is also valuable for the economy and industry, as many images in network media are related to commerce; for example, advertisements on the Web are mostly in image form. The automatic recognition of text in images is thus an important part of deep Web search and data mining, and it creates value for the data mining industry, for instance in the development of next-generation search engines.
To advance research in this field and strengthen collaboration between Chinese and French researchers, the National Natural Science Foundation of China (NSFC) and the French National Research Agency (ANR) co-funded the project Analysis and Understanding of Document Images in Network Media (AUDINM), which is carried out by the Institute of Automation of the Chinese Academy of Sciences (PI: Cheng-Lin Liu) and the University of La Rochelle (PI: Jean-Marc Ogier). As part of the project's commitments, this workshop invites researchers from China and France to share their progress in the field of document image analysis.
Workshop Program
Place: Pascal Building – University of La Rochelle, 23 Avenue Albert Einstein, 17000 La Rochelle
| Date | Time Slot | Activity | Speaker | Room |
|---|---|---|---|---|
| Thursday 7th April 2016 | 9:00 – 9:20 | Opening | Jean-Marc Ogier | 101 |
| | 9:20 – 9:40 | AUDINM Project Introduction | Wafa Khlif | 101 |
| | 9:40 – 10:20 | Quality Assessment and Enhancement in Mobile-Captured Documents – MDIAE project | Nibal Nayef | 101 |
| | 10:20 – 10:40 | Coffee Break | | 101 |
| | 10:40 – 11:10 | Cultural Heritage | Axel Jean-Caurant & Cyrille Suire | 101 |
| | 11:10 – 11:40 | Historical Coin Image Retrieval and Classification | Joseph Chazalon | 101 |
| | 11:40 – 12:00 | Recent Updates on the SmartDoc Framework – New Datasets | Muhammad Muzzamil Luqman | 101 |
| | 12:00 – 14:00 | Lunch Break | | |
| | 14:00 – 15:00 | Seminar: Danish Technological Institute – Robot Technology Center | François Picard | 018 |
| | 15:00 – 16:00 | Handwritten Character Recognition and Text Line Recognition: Some Advances | Cheng-Lin Liu | 017 |
| | 16:00 – 17:15 | “Mille Sabords” Event – La Rochelle | | |
| | 17:15 – 18:00 (if possible) | L3i Lab Tour (for our guests from NLPR Lab) | Muzzamil Luqman, Nibal Nayef, Wafa Khlif | L3i Lab |
| Friday 8th April 2016 | 9:00 – 10:00 | Some Advances in Robust Reading at NLPR Lab | Fei Yin | 026 |
| | 10:00 – 10:40 | Comics Image Analysis | Clément Guérin | 026 |
| | 10:40 – 10:50 | Short Coffee Break | | 026 |
| | 10:50 – 11:20 | Fraud Detection | Nicolas Sidère | 026 |
| | 11:20 – 12:00 | Steganography: Application to document analysis | Vinh-Loc Cu | 026 |
| | 12:00 – 14:00 | Lunch Break | | |
| | 14:00 – 14:45 | SHADES: Semantic Hash for Advanced Document Electronic Signature | Petra Gomez-Krämer | 026 |
| | 14:45 – 15:30 | Color Segmentation of Administrative Document Images | Elodi Carel | 026 |
| | 15:30 – 15:50 | Coffee Break | | 026 |
| | 15:50 – 16:30 | Continue L3i Lab Tour (for our guests from NLPR Lab) | | L3i Lab |
| | 16:30 – 18:00 | Progress, next steps & research visits (for AUDINM project team) | AUDINM project team | 101 |
Biography of Speakers & Abstracts of their talks
- Jean-Marc Ogier (University of La Rochelle, France)
- Title: Workshop Opening
- Abstract: Welcome to the workshop participants, an introduction to the L3i laboratory at the University of La Rochelle, and an introduction to this workshop in the context of the AUDINM project.
- Biography: Jean-Marc Ogier received his PhD degree in computer science from the University of Rouen, France, in 1994. During this period (1991-1994), he worked on graphics recognition for Matra Ms&I Company. From 1994 to 2000, he was an associate professor, first at the University of Rennes 1 (1994-1998) and then at the University of Rouen (1998 to 2001). Now a full professor at the University of La Rochelle, Professor Ogier is the head of the L3i laboratory, which gathers more than 80 members and works mainly on Document Analysis and Content Management. Author of more than 160 publications and communications, he has managed several French and European projects dealing with document analysis, with both public institutions and private companies. Professor Ogier was a Deputy Director of the GDR I3 of the French National Research Center (CNRS) from 2005 to 2013. He is one of the two French representatives on the governing board of the International Association for Pattern Recognition (IAPR) and is also Chair of its Technical Committee 10 (Graphics Recognition). He is also vice president of the University of La Rochelle.
- Nibal Nayef
- Title: Quality Assessment and Enhancement in Mobile-Captured Documents – MDIAE project
- Abstract: The increasing availability of high-performance, low-priced, portable digital imaging devices has created a tremendous opportunity for supplementing traditional scanning in all types of document image acquisition. Camera-captured images can suffer from low resolution, blur, and perspective distortion, as well as complex layout and interaction of the content and background.
Robust solutions should be developed for the analysis of documents captured with such devices.
The MDIAE project deals with mobile-captured administrative documents, with two main goals:
- Quality Assessment: estimating the quality of captured images so that the user can be asked for a new capture, if necessary, while the source document is still at hand.
- Quality Enhancement: improving the OCR accuracy and the visual appearance of the captured document images.
To achieve these two goals, we first need to identify the possible distortions that degrade the quality of document images, whether they are caused by the imaging device, the environment, or the user. We then focus on methods that correct these distortions.
- Biography: Nibal Nayef currently works as a post-doctoral researcher at Valconum and the L3i Laboratory at the University of La Rochelle, France. She works on quality assessment and enhancement of mobile-captured documents, information spotting, and text/image segmentation. Nayef has a Ph.D. in computer science (2012) from the Technical University of Kaiserslautern in Germany. She was a member of the IUPR laboratory (Image Understanding and Pattern Recognition) there, where she completed her PhD thesis entitled “Geometric-based symbol spotting and retrieval in technical line drawings”. Her research interests are analysis and retrieval of line drawings and their associated evaluation protocols, statistical feature grouping, machine learning for vision, geometric matching, and document image quality assessment and enhancement. She is a regular reviewer for the IJDAR journal and the DAS, ICDAR, ICFHR, and ICPR conferences.
- Wafa Khlif
- Title: AUDINM Project Introduction
- Abstract: Introduction to the AUDINM project.
- Biography: Wafa Khlif is currently a first-year PhD student at L3i. She is co-supervised by Professor Jean-Christophe Burie at the L3i Laboratory, University of La Rochelle (France), and Professor Adel Alimi at the Regim Lab, National School of Engineers of Sfax (Tunisia). Wafa received her engineering diploma in computer science from the Tunisian engineering university ENIS-SFAX, within the Erasmus Mundus exchange program with Centrale Nantes, in December 2014. In addition, she received the M.Sc. degree in 2015 from Polytech Nice-Sophia, University of Nice Sophia Antipolis.
- Vinh-Loc Cu
- Title: Steganography: Application to document analysis
- Abstract: In the fast-growing era of digitization, a variety of legal documents are being converted or scanned into image format for better storage and retrieval, as well as for online transactions. The main concern is how to ensure that a document image is not forged by malicious attacks during transmission while still retaining the form of the original version. To address these issues, steganography is used in conjunction with other techniques, such as pattern recognition and document signatures, to achieve a higher level of authentication and integrity for document images. This presentation gives an overview of related techniques and typical proposed works, and outlines initial ideas for new directions in our upcoming work.
- Biography: Vinh-Loc Cu received the B.Eng. degree from Cantho University, Vietnam, and the M.Sc. degree from the Asian Institute of Technology (AIT), Thailand. He is currently pursuing the PhD degree at La Rochelle University. From October 2001 to December 2015, he was a lecturer at the Software Engineering and Multimedia Department, Cantho University Software Center, Vietnam. His research interests include programming languages, RDBMS, pattern recognition, document signature, and information hiding.
- Clément Guérin
- Title: Comics Image Analysis
- Abstract: The eBDtheque project started in 2011 in the L3i laboratory. Its aim is to provide methods to understand the content of digitized comic books, from the extraction of the layout to the analysis of panel and balloon content.
This talk will present the work on visual and semantic content extraction achieved during the past few years.
It will also introduce the ongoing work on text recognition and twin-pages spotting.
- Biography: Clément Guérin is currently a research engineer (postdoc) at the L3i Laboratory of the University of La Rochelle (France). He received his PhD degree in 2014 for his thesis titled “A framework for the automated analysis, interpretation and interactive retrieval of comic books’ images”. This work was supervised by Professors Karell Bertet and Arnaud Revel from the L3i Laboratory of the University of La Rochelle (France). He works at the intersection of several research domains, from knowledge engineering to image processing, with interests in formal concept analysis, machine learning, and recommendation systems.
Web: http://l3i.univ-larochelle.fr/Guerin-Clement
- Elodi Carel
- Title: Color Segmentation of Administrative Document Images
- Abstract: Industrial companies receive huge volumes of documents every day. Through automation, traceability, feeding of information systems, and reduced costs and processing times, dematerialization has a clear economic impact. In order to respect industrial constraints, the traditional digitization process simplifies the images by performing a background/foreground separation. However, this binarization can lead to segmentation and recognition errors. With improvements in technology, the document analysis community has shown a growing interest in integrating color information into the process to enhance its performance. In order to work within the scope provided by our industrial partner in the digitization flow, an unsupervised segmentation approach was chosen. Our goal is to be able to cope with document images, even when they are encountered for the first time, regardless of their content, their structure, and their color properties. To this end, the first issue of this project was to identify a reasonable number of main colors that are observable in an image. Then, we aim to group pixels that have close color properties and belong to the same logical or semantic unit into consistent color layers. Provided as a set of binary images, these layers may be reinjected into the digitization chain as an alternative to conventional binarization. Moreover, they also provide extra information about colors, which could be exploited for segmentation purposes, for element spotting, or as a descriptor. We have therefore proposed a spatio-colorimetric approach that gives a set of perceptually meaningful local regions, known as superpixels. Their size is adapted to the content of the document images. These regions are then merged into global color layers by means of a multiresolution analysis.
-
Biography: ******************
- Axel Jean-Caurant & Cyrille Suire
- Title: User activity characterization in a Cultural Heritage Digital Library System
- Abstract: Digital access to large amounts of heterogeneous data can create methodological bias in the discovery and exploitation of resources, particularly in the Social Sciences, where institutions give access to numerous resources through digital libraries. In order to provide relevant adaptivity and useful content recommendation for social scientists, it is important to fully consider the diversity of their research practices. To do so, we adopt an activity-based approach to researchers’ information search behavior. We examined several features to discover the type of task users are engaged in.
- Biography:
- Axel Jean-Caurant has been a PhD student at L3i since October 2014. He studied in La Rochelle, where he obtained a Master's degree related to the management of digital content. His research focuses on how to handle heterogeneous resources in a digital library system.
- Cyrille Suire has been a PhD student at the L3i laboratory of the University of La Rochelle since 2014. His research topic lies at the confluence of two major areas of research, computer science and the humanities. He mainly focuses on the behavior of social science researchers in cultural heritage information systems, combining the approaches of humanists and computer scientists.
- Joseph Chazalon
- Title: Historical Coin Image Retrieval and Classification
- Abstract: The proposed demonstration will illustrate a case of coin image classification and retrieval (coin images can be considered a special kind of document image). On a dataset of approximately 60,000 coins, with two images each (one for each side), the task consists of categorizing a query image into predefined classes and/or retrieving similar coins from the database. This project was conducted with an industrial partner and a numismatic expert. Varying acquisition sources and the lack of ground truth were tough challenges to cope with.
- Biography: Joseph Chazalon received M.Sc. and engineering degrees in Computer Science from the Institut National des Sciences Appliquées (INSA) in Rennes (France) in 2008, then a PhD degree in 2013. His doctoral research focused on visual languages for document recognition, historical document processing, and interactive document interpretation. He joined the L3i laboratory at the University of La Rochelle, France, where he is now working as a post-doctoral researcher on document image acquisition and processing problems, under the supervision of J.-M. Ogier.
His research interests include visual languages, historical document processing, (mobile) document image acquisition, image indexing, classification and retrieval, and performance evaluation.
- Muhammad Muzzamil LUQMAN
- Title: Recent updates on the SmartDoc Framework – New Datasets
- Abstract: In this talk I will present an overview of the evolution of the SmartDoc framework, along with details of the work carried out at the L3i laboratory of the University of La Rochelle to generate datasets for evaluating smartphone-based acquisition of document images.
- Biography: Muhammad Muzzamil LUQMAN is currently a Research Engineer (Permanent) at the L3i Laboratory, University of La Rochelle (France). Luqman has worked as a Research Engineer at the Bordeaux Bioinformatics Center (Centre de Bioinformatique de Bordeaux), France, and as a Postdoctoral researcher with Professor Jean-Marc Ogier at the L3i Laboratory, University of La Rochelle (France). Luqman has a PhD in Computer Science from François Rabelais University of Tours (France) and Autonoma University of Barcelona (Spain). His PhD thesis was co-supervised by Professor Jean-Yves Ramel and Professor Josep Llados. He successfully defended his PhD thesis, titled “Fuzzy Multilevel Graph Embedding for Recognition, Indexing and Retrieval of Graphic Document Images”, with distinction “très honorable (magna cum laude)”, on Friday 2nd of March 2012 at François Rabelais University of Tours (France). His research interests include Structural Pattern Recognition, Document Image Analysis, Camera-Based Document Analysis and Recognition, Graphics Recognition, Machine Learning, Computer Vision, Augmented Reality and Biomimicry. Luqman has authored 19 scientific publications, including a book, a journal paper and conference papers. He is a regular reviewer for journals (PR, IJDAR, IJPRAI, IJCSAI, TALLIP), regularly serves on the program committees of many international scientific events (ICDAR, DAS, CIFED, ICET), and has actively participated in organizing several international conferences, workshops and scientific competitions.
- François PICARD
- Title: Seminar: Danish Technological Institute – Robot Technology Center
- Abstract: Founded in 1906, the Danish Technological Institute is a Research Technology Organization whose role is to assist the transfer of technological knowledge to industry. The institute includes several different centers in Europe; François Picard is a consultant at the Robot Technology Center in Odense, Denmark. He will present the activity of the institute, which involves the development of robotic systems for industry and for the service sectors. He will emphasize the work done in his team, which is dedicated to implementing vision systems that drive robot behavior. The presentation will end with a discussion of possible collaborations with the laboratory.
-
Biography:…….
- Prof. Cheng-Lin Liu
- Title: Handwritten Character Recognition and Text Line Recognition: Some Advances
- Abstract: Handwritten character recognition and text line recognition are at the core of document image analysis and recognition (DIAR). Numerous methods have been proposed for them, and recently the use of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) has been setting new records in recognition performance. Nevertheless, traditional character recognition methods based on hand-crafted features and text line recognition methods based on character over-segmentation remain competitive in some respects. In this talk, I first give an overview of the major techniques of character recognition and text line recognition. Then, I present some of our recent work on Chinese character recognition using CNNs and on text line recognition using over-segmentation and language models. Finally, the prospects of technology in this field will be discussed.
- Biography: Professor Cheng-Lin LIU received the B.S. degree in electronic engineering from Wuhan University, Wuhan, China, the M.E. degree in electronic engineering from Beijing Polytechnic University, Beijing, China, and the Ph.D. degree in pattern recognition and intelligent control from the Chinese Academy of Sciences, Beijing, China, in 1989, 1992 and 1995, respectively. He was a postdoctoral fellow at the Korea Advanced Institute of Science and Technology (KAIST) and later at Tokyo University of Agriculture and Technology from March 1996 to March 1999. From 1999 to 2004, he was a research staff member and later a senior researcher at the Central Research Laboratory, Hitachi, Ltd., Tokyo, Japan. His research interests include pattern recognition, image processing, neural networks, machine learning, and especially their applications to character recognition and document analysis. He has published over 190 technical papers in prestigious international journals and conferences. He is a Professor at the NLPR and is now the director of the laboratory. He is on the editorial boards of Pattern Recognition Journal, Image and Vision Computing, and International Journal on Document Analysis and Recognition. He is a fellow of the IAPR and a senior member of the IEEE.
- Fei Yin
- Title: Some Advances in Robust Reading at NLPR Lab
- Abstract: With the development of the Internet and smart cameras, there are massive numbers of images in the world, and many of them contain text. If the text in these images can be detected and recognized by computers, it can play a significant role in various applications, such as spam detection, product search, recommendation, intelligent transportation, robot navigation and geo-localization. However, the detection and recognition of text in both natural scene and born-digital images, so-called robust reading, remains a challenge. In this talk, I first give an overview of the research status of robust reading. Then, I introduce some recent advances in this field within our group at the National Laboratory of Pattern Recognition (NLPR). These include fast Web image categorization, text detection from born-digital images, and camera-based mathematical expression recognition with a Fully Convolutional Network. This work is still ongoing, and I will discuss prospective directions in this field.
- Biography: Fei Yin received the Ph.D. degree in pattern recognition and intelligent systems from the Institute of Automation, Chinese Academy of Sciences, in 2010. He received his BS from Xi’an University of Posts and Telecommunications in 1999 and his MS from Huazhong University of Science and Technology in 2001. His research interests include character recognition and document processing. His current projects include “Theory and Key Techniques for Perturbation based Character Recognition” and “Video/Image text detection and recognition”. He has published more than thirty papers in international journals and conferences.
- Petra Gomez-Krämer
- Title: SHADES project
- Abstract: The objective of the SHADES project (Semantic Hash for Advanced Document Electronic Signature) is to provide a new tool for authenticating the entirety of the content of a document through an advanced compact signature, in order to fight against fraud and falsification. This signature will be based on the document’s content and structure, which we call a semantic signature. Because the document’s information is hashed during the signature computation, no information from the original document can be deduced from its signature alone. The signature can then be inserted in the document or used in company content management software in order to check the authenticity of the document without compromising its confidentiality.
Nowadays a document is often used in electronic or paper form according to need; this is the so-called hybrid document. The hybrid document therefore undergoes a life cycle of printing and scanning, and different degraded versions of it exist, since the printing and scanning process introduces specific degradations such as print and scan noise. Our idea is to extract the layout, the text and the images from the document to compute a stable signature that will be the same for all the authentic copies of the document. Consequently, this requires document analysis algorithms that are stable with respect to print and scan noise, so that two copies of the same document can have the same signature. This talk will present our recent research results on the stability of document analysis algorithms (optical character recognition, document segmentation, and layout description).
- Biography: Petra Gomez-Krämer is an Associate Professor in Computer Science at the L3i laboratory of the La Rochelle University in France. In 2007, she obtained the PhD degree in Computer Science from the University of Bordeaux 1, and she was then a researcher at the Vicomtech research center in Spain until 2009. Her main research interests are image, video, and document processing.