Khairuddin Omar
28/5/2015
Bicara Malim FTSM 2015
Introduction
Optical character recognition (OCR; Malay: Pengecaman Aksara Optik, PAO) is the process of converting scanned images of printed or handwritten text (digits, letters, and symbols) into a machine-readable character stream, either plain (e.g. a text file) or formatted (e.g. an HTML file).
OCR is the most successful branch of pattern recognition (PR).
Indeed, to recognize a character from a given image, one would match (via some known metric) this character’s feature pattern against some very limited reference set of known feature patterns in the given alphabet. This clearly is a classical case of a pattern recognition problem.
Source: Eugene Borovikov, 2014. A survey of modern optical character recognition techniques. Computer Vision and Pattern Recognition.
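The matching idea in the quotation can be illustrated with a minimal sketch. It assumes tiny 3x5 binary glyphs as the "feature pattern" and Hamming distance as the "known metric"; the glyph data and the names `REFERENCE`, `features`, `hamming`, and `recognize` are hypothetical and are not taken from the talk or from the cited survey.

```python
# Hypothetical reference set: 3x5 binary glyphs for a three-letter "alphabet".
REFERENCE = {
    "I": ("111", "010", "010", "010", "111"),
    "L": ("100", "100", "100", "100", "111"),
    "T": ("111", "010", "010", "010", "010"),
}


def features(glyph):
    """Flatten a glyph (tuple of row strings) into a list of 0/1 pixel features."""
    return [int(ch) for row in glyph for ch in row]


def hamming(a, b):
    """Metric: number of positions at which two feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def recognize(glyph, reference=REFERENCE):
    """Return the label whose reference pattern is closest to the glyph."""
    target = features(glyph)
    return min(reference, key=lambda label: hamming(target, features(reference[label])))


noisy_t = ("111", "010", "010", "011", "010")  # a 'T' with one flipped pixel
print(recognize(noisy_t))  # prints "T"
```

Real systems use richer features and classifiers, but the structure is the same: extract features, then compare against a small reference set.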
A typical OCR System
Example OCR Architecture (Khairuddin 2000)
Handwriting recognition Jawi
Khairuddin Omar, Jawi Handwritten Text Recognition Using Multi-level Classifier (in Malay), PhD Thesis, Universiti Putra Malaysia, 2000.
Mazani Manaf, Jawi Handwritten Text Recognition Using Recurrent Bama Neural Networks (in Malay), PhD Thesis, 2002.
Roslim Mohammad, Modification of Combined Segmentation Technique for Jawi Manuscript (in Malay). MIT Thesis, 2002.
Mohammad Faidzul Nasrudin, Pengecaman Aksara Jawi Menggunakan Jelmaan Surih [Jawi Character Recognition Using the Trace Transform], 2011.
Handwriting recognition Jawi (continued)
Che Norhaslida Deraman, Extension of Combined Segmentation Technique for Jawi Manuscripts (in Malay). MIT Thesis, 2005.
Viska Mutiawani, Segmentation of Jawi Text Using Voronoi Diagram (in Malay) MIT Thesis, 2007.
Remon Redika, Features Extraction Of Jawi Character Base On Hidden Markov Method, 2009.
Anton Heryanto, Segmentation technique for jawi character recognition using Dynamic Programming, 2009.
Handwriting recognition Arabic
Ahmad M. Z. Mohammed, Segmentation of Arabic Characters Using Voronoi Diagrams, PhD Thesis. Fakulti Teknologi dan Sains Maklumat, Universiti Kebangsaan Malaysia, Bangi, 2007.
Atallah Mahmoud Awad Al-Shatnawi. A Non-Iterative Thinning Method Based on Exploited Vertices of Voronoi Diagrams, 2010.
Ali Mohammed Massud Mady. A Comparative Study in The Algorithms of Voronoi Diagrams Construction on Thinning Process, 2011.
Jabril Ramdan Abdslam Salem. Comparative Study of Algorithms for Voronoi Diagram Construction on Segmentation Of Arabic Handwriting, 2011
Intelligent post-processing
Azniah Ismail, ASCII Code and UNICODE for Arabic and Jawi Word Processing (in Malay). MIT Thesis, 2003.
Suliana Sulaiman, Digital Jawi Manuscript in UNICODE Character Code (in Malay), MIT Thesis, 2007.
Juhaidah Abu Bakar, Transliteration System of Old Jawi to New Jawi Using Grafem (in Malay), MIT Thesis, 2007.
Suliana Sulaiman. Pencantas Perkataan Melayu untuk Aksara Jawi Berasaskan Petua [Rule-Based Malay Word Stemmer for Jawi Script], 2013.
Juhaidah Abu Bakar. Minimizing Part of Speech Tagging Gap: Identifying Proper Names in Jawi corpus.
OCR in multi-media
Che Wan Shamsul Bahari Che Wan Ahmad, Old Jawi to New Jawi Translator (in Malay), MIT Thesis, Fakulti Teknologi dan Sains Maklumat, Universiti Kebangsaan Malaysia, Bangi, 2006.
Yonhendri. Enjin Transliterasi Rumi-Jawi [Rumi-Jawi Transliteration Engine], 2009.
Che Wan Shamsul Bahari Che Wan Ahmad. Transliterasi Mesin untuk Ejaan Melayu Lama [Machine Transliteration for Old Malay Spelling].
Adaptive OCR
wider range of printed document imagery
Majdi Abdel Rahim Saleh Salameh. Pengecaman Harakat Arab dan Hukum Tajwid Quran Menggunakan Rangkaian Neural dan Teknik Logik Kabur [Recognition of Arabic Harakat and Quranic Tajwid Rules Using Neural Networks and Fuzzy Logic Techniques]. Main, 2009.
omni-font texts
Mohd Sanusi bin Azmi. Fitur Baharu dari Kombinasi Geometri Segitiga dan Pengezonan untuk Paleografi Jawi Digital [New Features from a Combination of Triangle Geometry and Zoning for Digital Jawi Paleography], 2013.
multi-script and multi-language recognition
Waleed Abdel Karim Helal Abu-Ain. Automatic Off-line International Handwriting Script Identification Based on Skeleton Primitive Direction Features.
Document Image Enhancement
Mohd Sanusi Azmi, Reengineering of Slant and Slope Orientation Skew Histogram for Merong Mahawangsa Manuscript (in Malay), MIT Thesis, 2003.
Bilal Mohammad Ahmad Bataineh. Adaptive Binarization and Statistical Texture Analysis for Document Images Analysis and Recognition, 2011.
Sitti Rachmawati Yahya. Pembentukan Semula Imej Manuskrip Lama Secara Kaedah Adaptif Perduaan Automatik Dan Penjejakan Tetingkap Piksel [Reconstruction of Old Manuscript Images Using an Automatic Adaptive Binarization Method and Pixel Window Tracking].
Tarik Abdel Kareem Helal Abu Ain. Joint-Landmarks Baseline and Advanced Direction Features for Arabic Character Segmentation and Classification.
Major Trends in Modern OCR
Adaptive OCR aims at robust handling of a wider range of printed document imagery by addressing:
multi-script and multi-language recognition
omni-font texts
automatic document segmentation
mathematical notation recognition
Major Trends in Modern OCR
Handwriting recognition is a maturing OCR technology that has to be extremely robust and adaptive. In general, it remains an actively researched open problem that has been solved to a certain extent for some special applications, such as:
recognition of hand-printed text in forms
handwriting recognition in personal checks
postal envelope and parcel address readers
OCR in portable and handheld devices
Major Trends in Modern OCR
Document image enhancement involves (automatically) choosing and applying appropriate image filters to the source document image to help the given OCR engine better recognize characters and words.
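As one illustration of such a filter, the sketch below binarizes a grey-level image with a global threshold chosen by Otsu's method. This is a common textbook technique picked here as an assumption; it is not necessarily the enhancement method used in the works listed in this talk, and the names `otsu_threshold` and `binarize` and the sample image are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximizes between-class variance (Otsu's method)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels with value <= t (dark class)
    sum_bg = 0  # sum of the values of those pixels
    for t in range(levels):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        w_fg = total - w_bg
        if w_bg == 0:
            continue  # no dark pixels yet
        if w_fg == 0:
            break     # every pixel is already in the dark class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(image, threshold):
    """Map dark pixels (<= threshold) to 1 (ink) and the rest to 0 (paper)."""
    return [[1 if p <= threshold else 0 for p in row] for row in image]


# Hypothetical 3x3 grey-level patch: dark ink strokes on light paper.
image = [
    [200, 210, 205],
    [30, 220, 40],
    [215, 35, 208],
]
t = otsu_threshold([p for row in image for p in row])
print(t, binarize(image, t))
```

A single global threshold tends to fail on unevenly lit or stained pages, which is one reason adaptive binarization appears among the thesis topics on the Document Image Enhancement slide.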
Major Trends in Modern OCR
Intelligent post-processing is of great importance for improving the OCR recognition accuracy and for creating robust information retrieval (IR) systems that utilize smart indexing and approximate string matching techniques for storage and retrieval of noisy OCR output texts.
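Approximate string matching can be sketched with Levenshtein edit distance: a noisy OCR token is replaced by the nearest lexicon entry when it is close enough. The lexicon, the `max_distance` cut-off, and the function names below are illustrative assumptions rather than part of any system described in the talk.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct(word, lexicon, max_distance=2):
    """Return the closest lexicon entry, or the word itself if none is near enough."""
    best = min(lexicon, key=lambda entry: edit_distance(word, entry))
    return best if edit_distance(word, best) <= max_distance else word


# Hypothetical lexicon and noisy OCR tokens.
LEXICON = ["character", "recognition", "segmentation"]
print(correct("rec0gnition", LEXICON))  # prints "recognition"
print(correct("segmentaton", LEXICON))  # prints "segmentation"
print(correct("voronoi", LEXICON))      # prints "voronoi" (no close match)
```

The same distance can also drive retrieval: a query term matches an indexed OCR token when their edit distance falls under the chosen cut-off.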
Major Trends in Modern OCR
OCR in multi-media is an interesting development that adapts techniques of optical character recognition to media other than printed documents, e.g. photo, video, and the internet.
Why is OCR difficult?
The difficulty comes from two main sources:
Low image quality:
poor original document quality
noisy, low resolution, multi-generation image scanning
incorrect or insufficient image pre-processing
poor segmentation into recognition items
Discriminative power of the classifier:
a 99% recognition rate is hard to achieve
Why is OCR difficult?
script and language
document image types and image defects
document segmentation
character types
OCR flexibility, accuracy and productivity
hand-writing and hand-printing
OCR pre- and post-processing
Complex character scripts
Insufficient image preprocessing
Document segmentation ambiguity
Character shape variability
Baseline detection
Skew and slanting
Poor original document quality
Poor segmentation into recognition items
Complex features
Stemming, tagging, homograph
The most promising directions
adaptive OCR aiming at robust handling of a wider range of printed document imagery (deep learning)
document image enhancement as part of OCR pre-processing
intelligent use of context providing a bigger picture to the OCR engine and making the recognition task more focused and robust
handwriting recognition in all forms, static and dynamic, general-purpose and task-specific, etc.
multi-lingual OCR, including multiple embedded scripts
multi-media OCR aiming to recognize any text captured by any visual sensor in any environment
The end