The demand for accessible and reusable PDFs, optimized for screen readers and data extraction, is on the rise. Joseph explains a multi-year effort to automatically create accessible documents with LaTeX.
#Accessibility #LaTeX #PDFInnovation #InclusiveTech
https://t.co/sphlBzTQfA
The digitization of historical documents faces significant challenges due to OCR inaccuracies. James Zhang explores OpenAI's GPT models to improve post-OCR correction for texts from the Princeton Prosody Archive (1559-1928). #AI #OCR #Digitization
https://t.co/uvVIVH4enW
Hrishikesh Kulkarni presents LexBoost, a document retrieval method that combines sparse and dense retrieval by leveraging a pre-built dense corpus graph. The result? Improved ranking performance with minimal computational overhead!
Full paper: https://t.co/FuVRZ3h6zH
In an age where every problem is solved through machine learning, Jean-Luc presents ZigZag, a machine-learning-free algorithm that binarizes images captured under difficult lighting conditions.
Read more: https://t.co/XJikfSbox1
Try the app: https://t.co/p5HVPIlbbe
Sinan presents a new dataset bridging the gap between text descriptions and other document modalities used in the context of engineering design catalogs.
Ready to dive deeper? Check out the paper here: https://t.co/OPCMqX9H8C
#document #engineering #design #dataset #NLP
Watch Didier Verna's short presentation explaining the key ideas behind an extension to the Knuth-Plass algorithm that improves TeX's paragraph justification, enhancing typographic quality.
Check out the paper here: https://t.co/3CssshZSrU
#TeX #Typography
And that's a wrap for #DocEng'24! Many thanks to our excellent hosts Matt Hardy, Curtis Wigington and Monica Delgado at #Adobe, and the program chairs Steve Simske and Steve Bagley. Thanks to all the participants for the vibrant conversations! See you in 2025!
Another great conference day today with talks related to Security and PDF Applications: efficient malware detection, and producing accessible & reusable PDFs; and Algorithms: lexical document retrieval and justification of paragraphs.
Yesterday concluded with two excellent sessions. The first covered all things AI: automatic annotation, OCR correction and detection of AI-generated texts. The second covered summarisation: assessing summarisation methods, and assessing the metrics used for summarisation!
Frank Mittelbach presented the second keynote talk today. An exciting overview of the history of LaTeX with new insights on new tricks to be taught to old docs :)
#LaTeX #DocEng2024
We concluded the second presentation session of the day, on Data Representation and Markup, featuring mathematical markup encoding, graph exploration and visualization queries, and a catalog dataset with semi-automatic annotation. Now exploring topics for the #Birdsofafeather session!
This was followed by a session on Scanning, Document Input, and Binarisation, featuring a hand-held video document scanner, innovative binarisation algorithms, and a deep dive into optimal OCR scanner resolution. #DocEng2024 #AI #OCR
DocEng 2024 kicked off today with a fantastic keynote by LN Renganarayana on generative AI and its impact on document interaction. Exciting insights into current trends and future tools!
Good news everyone! DocEng'23 Full Papers Deadline Extended!
Full paper abstracts due: today, 17th of April 2023!
The deadline for submission of the final full paper manuscripts has been extended to Monday, 1st of May 2023.
All deadlines are end of day AoE (Anywhere on Earth).
DocEng'23 full papers deadline: Monday, 17th of April! DocEng seeks original research papers that focus on the design, implementation, development, management, use and evaluation of advanced systems where documents and document collections play a key role! https://t.co/D51fGIFxME
It is official! DocEng'23, the 23rd ACM Symposium on Document Engineering, is coming to the University of Limerick @UL. We look forward to welcoming the best minds in computer science and the document engineering field to @Limerick_ie. @ACM_SIGWEB @ACMQueue https://t.co/KJcrLYPS3W
Day 4 of #DocEng2021 started with two really great sessions on Systems for Visual Document Analysis and on Collections, Systems and Management! Many thanks to all who made this a (virtual) reality in @Limerick_ie
Daniela Costa from @CInUFPE, presenting their paper titled "A Comparative Study on Methods and Tools for Handwritten Mathematical Expression Recognition".
#DocEng2021
paper: https://t.co/YXWoZX5UeY
Lucas Kirsten from @HP Research & Development, Brazil, presenting their paper titled "Evaluating Deep Neural Networks for Image Document Enhancement".
#DocEng2021
paper: https://t.co/In8cO58dNn