Convert PDF to Excel: Tabula Table Extraction Tool

April 15, 2014 Timothy Uncategorized

Adobe’s Portable Document Format, or PDF, is one of the most popular means of publishing complex documents online. The format is so common that business software, such as Microsoft Word, can export its documents directly to PDF.

The problems arise when one wants to reverse the process and bring the contents of a PDF back into a business document. One such case is importing data tables, such as census data, to analyze them in a common Excel spreadsheet. Anyone searching for a solution soon discovers that it's cheap and easy to make a PDF, not so much the reverse. It seems none of the aforementioned business software imports from PDF, and everyone looking to make a fast buck is selling a PDF conversion tool. What to do?

Enter Tabula, a free tool that allows you to extract that data in CSV format through a simple interface. Extracting data is simple: upload a PDF file, draw a box around the data, and click to download a CSV file for use in a spreadsheet or database. Note that Tabula only works with text and not images. If the data is embedded in an image, it's invisible to Tabula and pretty much any other software.
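Once the CSV is downloaded, Excel can open it directly, but numbers pulled out of a PDF often arrive as text with thousands separators. For anyone who would rather tidy the file with a script first, here is a minimal sketch in Python using only the standard library. The sample data, column names, and the helper names load_table and to_number are made up for illustration; they are not part of Tabula, and a real export may need different cleanup.

```python
import csv
import io


def to_number(cell):
    """Return an int or float when the cell looks numeric, else the stripped text."""
    text = cell.strip().replace(",", "")  # drop thousands separators such as 1,200
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return cell.strip()


def load_table(csv_text):
    """Split CSV text into a header row and data rows with numeric cells converted."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    return header, [[to_number(cell) for cell in row] for row in data]


# Hypothetical sample standing in for a CSV exported from a PDF table.
SAMPLE = 'County,Population\nAlpha,"1,200"\nBeta,"3,450"\n'

header, rows = load_table(SAMPLE)
total = sum(row[1] for row in rows)  # the numeric column can now be summed
print(header, rows, total)
```

To use it on a real file, read the file's contents (for example with open("export.csv", newline="").read(), where export.csv is a placeholder name) and pass the text to load_table.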

Tabula is available in versions for the Windows, Mac OS and Linux operating systems. One caveat: Tabula requires Java to run.



About Timothy Lee

Tim, the Arkansas Small Business and Technology Development Center's webmaster and technical training specialist, has been with ASBTDC since 1995. He retired from the U.S. Air Force with the rank of master sergeant. He's a bit gung-ho, turns cat food cans into cook stoves, and keeps packing ASBTDC equipment for rapid worldwide deployment, but he's your "go to" guy for technical solutions and full-scale disasters.
