wanghaisheng/ocr-arxiv-daily

OCR-paper-arxiv-daily latest papers

Automated deployment @ 2023-06-07 08:05:21 Asia/Shanghai

Contributions are welcome! Add your topics and keywords to topic.yml. You can also browse historical data in the storage.
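As a rough illustration of how topics and keywords could drive the daily listing, the sketch below turns a keyword list into a query URL for the public arXiv API. The `TOPICS` mapping is a hypothetical stand-in for the contents of topic.yml (the real file layout is not shown here), and `build_query_url` is an illustrative helper, not necessarily what this repository does.

```python
from urllib.parse import urlencode

# Public arXiv API endpoint.
ARXIV_API = "http://export.arxiv.org/api/query"

# Hypothetical in-memory equivalent of topic.yml: topic name -> keywords.
# The actual structure of topic.yml is an assumption.
TOPICS = {
    "OCR": ["OCR", "optical character recognition"],
    "scene text": ["scene text"],
}


def build_query_url(keywords, max_results=30):
    """Build an arXiv API URL returning the newest papers for the keywords."""
    # Wrap multi-word keywords in quotes so they are searched as phrases.
    terms = [f'all:"{kw}"' if " " in kw else f"all:{kw}" for kw in keywords]
    params = {
        "search_query": " OR ".join(terms),
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


if __name__ == "__main__":
    for topic, keywords in TOPICS.items():
        print(topic, "->", build_query_url(keywords))
```

Quoting multi-word keywords is a design choice in this sketch; without the quotes, each word would be treated as a separate search term.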

OCR

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2023-06-05 | Transformer-Based UNet with Multi-Headed Cross-Attention Skip Connections to Eliminate Artifacts in Scanned Documents | David Kreuzer et al. | 2306.02815v1 | null |
| 2023-06-03 | TransDocAnalyser: A Framework for Offline Semi-structured Handwritten Document Analysis in the Legal Domain | Sagar Chakraborty et al. | 2306.02142v1 | link |
| 2023-06-02 | DocFormerv2: Local Features for Document Understanding | Srikar Appalaraju et al. | 2306.01733v1 | null |
| 2023-06-01 | Layout and Task Aware Instruction Prompt for Zero-shot Document Image Question Answering | Wenjin Wang et al. | 2306.00526v1 | link |
| 2023-05-31 | Improving Handwritten OCR with Training Samples Generated by Glyph Conditional Denoising Diffusion Probabilistic Model | Haisong Ding et al. | 2305.19543v1 | null |
| 2023-05-30 | DuoSearch: A Novel Search Engine for Bulgarian Historical Documents | Angel Beshirov et al. | 2305.19392v1 | link |
| 2023-05-29 | GlyphControl: Glyph Conditional Control for Visual Text Generation | Yukang Yang et al. | 2305.18259v1 | link |
| 2023-05-28 | FuseCap: Leveraging Large Language Models to Fuse Visual Data into Enriched Image Captions | Noam Rotstein et al. | 2305.17718v1 | link |
| 2023-05-27 | Exploring Better Text Image Translation with Multimodal Codebook | Zhibin Lan et al. | 2305.17415v2 | link |
| 2023-05-27 | Super-Resolution of License Plate Images Using Attention Modules and Sub-Pixel Convolution Layers | Valfride Nascimento et al. | 2305.17313v1 | link |
| 2023-05-26 | People and Places of Historical Europe: Bootstrapping Annotation Pipeline and a New Corpus of Named Entities in Late Medieval Texts | Vít Novotný et al. | 2305.16718v1 | null |
| 2023-05-24 | Quantifying Character Similarity with Vision Transformers | Xinmei Yang et al. | 2305.14672v1 | link |
| 2023-05-21 | Measuring Intersectional Biases in Historical Documents | Nadav Borenstein et al. | 2305.12376v1 | link |
| 2023-05-19 | XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages | Sebastian Ruder et al. | 2305.11938v2 | link |
| 2023-05-18 | TextDiffuser: Diffusion Models as Text Painters | Jingye Chen et al. | 2305.10855v2 | link |
| 2023-05-16 | Sequence-to-Sequence Pre-training with Unified Modality Masking for Visual Document Understanding | Shuwei Feng et al. | 2305.10448v1 | null |
| 2023-05-16 | Mobile User Interface Element Detection Via Adaptively Prompt Tuning | Zhangxuan Gu et al. | 2305.09699v1 | link |
| 2023-05-13 | On the Hidden Mystery of OCR in Large Multimodal Models | Yuliang Liu et al. | 2305.07895v2 | link |
| 2023-05-12 | Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution | Jianfeng Kuang et al. | 2305.07498v1 | link |
| 2023-05-11 | Combining OCR Models for Reading Early Modern Printed Books | Mathias Seuret et al. | 2305.07131v1 | link |
| 2023-05-09 | E2TIMT: Efficient and Effective Modal Adapter for Text Image Machine Translation | Cong Ma et al. | 2305.05166v2 | link |
| 2023-05-04 | Text Reading Order in Uncontrolled Conditions by Sparse Graph Segmentation | Renshen Wang et al. | 2305.02577v1 | null |
| 2023-05-03 | Evaluating BERT-based Scientific Relation Classifiers for Scholarly Knowledge Graph Construction on Digital Library Collections | Ming Jiang et al. | 2305.02291v1 | null |
| 2023-04-28 | LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model | Peng Gao et al. | 2304.15010v1 | link |
| 2023-04-24 | DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents | Mohamed Dhouib et al. | 2304.12484v2 | null |
| 2023-04-24 | ICDAR 2023 Competition on Reading the Seal Title | Wenwen Yu et al. | 2304.11966v2 | null |
| 2023-04-17 | Multimodal Short Video Rumor Detection System Based on Contrastive Learning | Yuxing Yang et al. | 2304.08401v3 | null |
| 2023-04-15 | TransDocs: Optical Character Recognition with word to word translation | Abhishek Bamotra et al. | 2304.07637v1 | link |
| 2023-04-07 | Linking Representations with Multimodal Contrastive Learning | Abhishek Arora et al. | 2304.03464v2 | null |
| 2023-04-07 | Cleansing Jewel: A Neural Spelling Correction Model Built On Google OCR-ed Tibetan Manuscripts | Queenie Luo et al. | 2304.03427v1 | null |
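
The PDF column lists versioned arXiv identifiers rather than full URLs. As a minimal sketch, an identifier such as `2306.02815v1` can be expanded into its abstract and PDF addresses using arXiv's standard URL scheme. The helper name `arxiv_links` is illustrative and not part of this repository.

```python
import re


def arxiv_links(arxiv_id):
    """Expand a versioned arXiv identifier into its abstract and PDF URLs."""
    # Drop a trailing version suffix such as "v1" to get the base identifier.
    base_id = re.sub(r"v\d+$", "", arxiv_id)
    return {
        "base_id": base_id,
        "abs": f"https://arxiv.org/abs/{arxiv_id}",
        "pdf": f"https://arxiv.org/pdf/{arxiv_id}",
    }


if __name__ == "__main__":
    print(arxiv_links("2306.02815v1"))
```

Using the base identifier instead of the versioned one in these URLs resolves to the latest version of the paper.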

scene text

| Publish Date | Title | Authors | PDF | Code |
|---|---|---|---|---|
| 2023-06-05 | Neuralangelo: High-Fidelity Neural Surface Reconstruction | Zhaoshuo Li et al. | 2306.03092v1 | null |
| 2023-06-05 | Brain Diffusion for Visual Exploration: Cortical Discovery using Large Scale Generative Models | Andrew F. Luo et al. | 2306.03089v1 | null |
| 2023-06-05 | Machine Learning and Statistical Approaches to Measuring Similarity of Political Parties | Daria Boratyn et al. | 2306.03079v1 | null |
| 2023-06-05 | Interactive Editing for Text Summarization | Yujia Xie et al. | 2306.03067v1 | link |
| 2023-06-05 | Of Mice and Mates: Automated Classification and Modelling of Mouse Behaviour in Groups using a Single Model across Cages | Michael P. J. Camilleri et al. | 2306.03066v1 | null |
| 2023-06-05 | Structured Voronoi Sampling | Afra Amini et al. | 2306.03061v1 | null |
| 2023-06-05 | ELEV-VISION: Automated Lowest Floor Elevation Estimation from Segmenting Street View Images | Yu-Hsuan Ho et al. | 2306.03050v1 | null |
| 2023-06-05 | Designing Equilibria in Concurrent Games with Social Welfare and Temporal Logic Constraints | Julian Gutierrez et al. | 2306.03045v1 | null |
| 2023-06-05 | HeadSculpt: Crafting 3D Head Avatars with Text | Xiao Han et al. | 2306.03038v1 | null |
| 2023-06-05 | Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination | Yang Li et al. | 2306.03034v1 | null |
| 2023-06-05 | Interpretable Alzheimer's Disease Classification Via a Contrastive Diffusion Autoencoder | Ayodeji Ijishakin et al. | 2306.03022v1 | null |
| 2023-06-05 | Automating Style Analysis and Visualization With Explainable AI -- Case Studies on Brand Recognition | Yu-hsuan Chen et al. | 2306.03021v1 | link |
| 2023-06-05 | Using Sequences of Life-events to Predict Human Lives | Germans Savcisens et al. | 2306.03009v1 | null |
| 2023-06-05 | Nonparametric Iterative Machine Teaching | Chen Zhang et al. | 2306.03007v1 | null |
| 2023-06-05 | Unveiling the Two-Faced Truth: Disentangling Morphed Identities for Face Morphing Detection | Eduarda Caldeira et al. | 2306.03002v1 | link |
| 2023-06-05 | BeyondPixels: A Comprehensive Review of the Evolution of Neural Radiance Fields | AKM Shahariar Azad Rabby et al. | 2306.03000v1 | null |
| 2023-06-05 | Long-range UAV Thermal Geo-localization with Satellite Imagery | Jiuhong Xiao et al. | 2306.02994v1 | link |
| 2023-06-05 | Second-scale rotational coherence and dipolar interactions in a gas of ultracold polar molecules | Philip D. Gregory et al. | 2306.02991v1 | null |
| 2023-06-05 | Integrated Sensing, Computation, and Communication for UAV-assisted Federated Edge Learning | Yao Tang et al. | 2306.02990v1 | null |
| 2023-06-05 | Brain tumor segmentation using synthetic MR images -- A comparison of GANs and diffusion models | Muhammad Usman Akbar et al. | 2306.02986v1 | null |
| 2023-06-05 | A Term-based Approach for Generating Finite Automata from Interaction Diagrams | Erwan Mahe et al. | 2306.02983v1 | null |
| 2023-06-05 | Which Argumentative Aspects of Hate Speech in Social Media can be reliably identified? | Damián Furman et al. | 2306.02978v1 | link |
| 2023-06-05 | Best of Both Worlds: Hybrid SNN-ANN Architecture for Event-based Optical Flow Estimation | Shubham Negi et al. | 2306.02960v1 | null |
| 2023-06-05 | Complex Preferences for Different Convergent Priors in Discrete Graph Diffusion | Alex M. Tseng et al. | 2306.02957v1 | null |
| 2023-06-05 | Explicit Neural Surfaces: Learning Continuous Geometry With Deformation Fields | Thomas Walker et al. | 2306.02956v1 | null |
| 2023-06-05 | A Simple and Flexible Modeling for Mental Disorder Detection by Learning from Clinical Questionnaires | Hoyun Song et al. | 2306.02955v1 | null |
| 2023-06-05 | Color-aware Deep Temporal Backdrop Duplex Matting System | Hendrik Hachmann et al. | 2306.02954v1 | null |
| 2023-06-05 | INDigo: An INN-Guided Probabilistic Diffusion Algorithm for Inverse Problems | Di You et al. | 2306.02949v1 | null |
| 2023-06-05 | Continual Learning with Pretrained Backbones by Tuning in the Input Space | Simone Marullo et al. | 2306.02947v1 | null |
| 2023-06-05 | Human Spine Motion Capture using Perforated Kinesiology Tape | Hendrik Hachmann et al. | 2306.02930v1 | link |