Next Generation Forms Processing Leverages Two Kinds Of PDF

By Ralph Gammon, Editor, Document Imaging Report


February 6, 2004 - A Spanish start-up may have come up with a method for taking OCR out of forms processing applications. Combining PDF forms and 2D PDF417 bar code technology, Dataintro Software, Inc. has introduced a patent-pending system that could represent the future of automated data entry. Dataintro has already sold an Ultraforms® system to the Missouri Department of Revenue (DOR) and is just starting to introduce its technology to the market.

Ultraforms® is a system for creating PDF417 bar codes containing information from electronic PDF forms. The Missouri DOR is utilizing Ultraforms® on the tax forms it distributes over the Web. Taxpayers can enter information into the forms using Acrobat Reader 4.0 or higher, and the data is automatically encoded into a PDF417 bar code that is printed on the document. The data is captured by scanning the bar code.
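
The round trip described above, from form fields to a bar code payload and back, can be illustrated with a minimal Python sketch. It is not Dataintro's format: the field names, the fixed field order, and the `|` separator are assumptions made for illustration, and the step that actually renders or scans the PDF417 symbol is left out.

```python
# Hypothetical payload layout: values joined in a fixed order with "|".
# Neither the separator nor the field names come from Ultraforms.
FIELD_SEP = "|"


def pack_fields(fields: dict[str, str], order: list[str]) -> str:
    """Join the field values, in a fixed order, into one payload string."""
    values = [fields.get(name, "") for name in order]
    for name, value in zip(order, values):
        if FIELD_SEP in value:
            raise ValueError(f"field {name!r} contains the separator")
    return FIELD_SEP.join(values)


def unpack_fields(payload: str, order: list[str]) -> dict[str, str]:
    """Recover the field values from a scanned payload string."""
    values = payload.split(FIELD_SEP)
    if len(values) != len(order):
        raise ValueError("payload does not match the expected field list")
    return dict(zip(order, values))


if __name__ == "__main__":
    order = ["name", "ssn", "income"]
    form = {"name": "Pat Doe", "ssn": "000-00-0000", "income": "41250.00"}
    payload = pack_fields(form, order)   # the text a bar code would carry
    print(unpack_fields(payload, order) == form)
```

Because both sides agree on the field order, the payload carries only values, which keeps it short enough to fit in a printed symbol.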

"We'd been looking for something like this for quite awhile," said Mitzi Crump, forms analyst at the Missouri DOR. "For a few years, we've been reading PDF417 bar codes on the forms created by several tax software programs. I thought there must be some way to produce bar codes on the PDF forms that we distribute."

The key to Dataintro's technology is that it can be embedded directly into a PDF form. This means when the user downloads the form, it automatically downloads the technology used to create the PDF417 symbol. "Dataintro has the only product I know of that can enable the creation of PDF417 bar codes without making our taxpayers download a plug-in," said Crump.

A form utilizing Ultraforms® also differs from an online form: no data is transmitted electronically. The data resides in the PDF417 bar code on the printed page, so an Ultraforms® form represents a transition between electronic and paper submission. "We process approximately 2.6 million tax returns per year," Crump told DIR. "Last year, 1 million of those were filed electronically. Ideally, we'd like everyone to file electronically, but that is not going to happen in my lifetime. There are a number of reasons people will continue to send paper."

Crump looked at both OCR and ICR solutions to automate data entry from paper forms, but was not impressed. "ICR has limitations and would have required a lot of forms redesign," she said. "With OCR, there were some font issues we would have run into. Also, we already had 30 handheld PDF417 bar code readers that we were using to process the forms created by software programs."

Eliminating Capture Inaccuracies

Dataintro's President, CEO, and Co-Founder Carlos Gonzalez has a long history in the data capture industry. "The problem with OCR-based software is that it is not 100% accurate," he told DIR. "There are people whose full-time jobs are correcting mistakes made in data capture applications. These mistakes can be made either by the OCR software or the person filling out the form."

Ultraforms® creates a PDF417 bar code using data input from electronic forms, not scanned characters from paper forms, so recognition errors are eliminated. Also, it's possible to leverage e-forms technology to verify data as it is entered onto a PDF form, which further reduces errors. "Eliminating the need for error correction represents a huge cost benefit over traditional automated data capture applications," said Gonzalez.
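
The kind of entry-time verification described above might look like the following Python sketch. It is only an illustration: the field names (`ssn`, `income`, `deductions`) and the rules are hypothetical, and checks inside a real PDF form would be written in the form's own scripting facilities rather than in Python.

```python
import re
from decimal import Decimal, InvalidOperation

# Hypothetical rules for a hypothetical form; not taken from any real tax form.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


def validate_entry(fields: dict[str, str]) -> list[str]:
    """Return a list of error messages; an empty list means the entry passes."""
    errors = []
    if not SSN_PATTERN.fullmatch(fields.get("ssn", "")):
        errors.append("ssn: expected NNN-NN-NNNN")
    for name in ("income", "deductions"):
        try:
            if Decimal(fields.get(name, "")) < 0:
                errors.append(f"{name}: must not be negative")
        except InvalidOperation:
            # Raised for empty or non-numeric text.
            errors.append(f"{name}: not a number")
    return errors


if __name__ == "__main__":
    print(validate_entry({"ssn": "123", "income": "abc", "deductions": "-5"}))
```

Running checks like these before the bar code is generated means the symbol only ever encodes data that has already passed them.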

The Missouri DOR, in fact, has embedded an automatic calculation feature into its PDF forms. "Since we already had set up a back-end system to receive bar code information, I just had to do some code modification to incorporate Ultraforms," said Crump. "It took us about a month to set up the application for all our forms. We purchased Ultraforms at the end of October and had PDF forms ready to burn to CD by the end of November."

When we spoke with Crump in January, it was still early in the tax season. "However, we've received a number of returns utilizing Ultraforms, and the system is working extremely well," she said. "It is set up so I can monitor the data being captured. The system also throws up a flag if it detects an error. Originally, we were having some mapping issues for users with Acrobat 6, but that was not because of Ultraforms, and we solved the problem quickly."

Crump, who labels herself an optimist, hopes that between 150,000 and 200,000 tax forms will be filed using Ultraforms® this year. Although PDF417 bar code reading software is available on document scanners and document capture applications, the Missouri DOR is not currently imaging its returns. Rather, it is saving the paper copies.

Broad-Ranging Potential

Dataintro originally developed Ultraforms® for the Spanish tax collection agency. "They have approximately 400 different forms that they wanted to make available as downloadable electronic forms," Gonzalez told DIR. "They wanted to be able to print PDF417 symbols on every form. They were going about it one form at a time and were finding out they needed to develop different versions of the forms for each operating system. There are only 3,000 Unix users in Spain, but they had to have access to the tax forms.

"Going about it that way, the agency was able to develop about 20 different forms. To eliminate the problem of creating different forms for multiple OSs, we started working with them to create a PDF package for the remaining 380 forms."

In addition to the Spanish and Missouri tax services, Dataintro also has an installation in the education market. "We've been working on this technology for two years," Gonzalez told DIR. "We wanted to make it more widely available a while ago, but our lawyers told us to wait because of our patent situation. We did, however, contact the Federation of Tax Administrators and presented demos for more than 25 state revenue agencies. That's how we were introduced to the Missouri DOR."

In addition to tax processing, Gonzalez said Ultraforms® could be used on almost any type of form. "You could use it on envelopes to do mail control, on loan documents, proxy statements, legal documents, etc.," he said. "We've been cultivating relationships in the human resources market, at medical institutions, and in testing laboratories. Many large companies already have infrastructures to handle data collection from bar codes. Ultraforms would be a natural extension."

According to its specification, a single PDF417 symbol can hold roughly 1,800 text characters. "The specification also allows for the chaining of multiple symbols together that can be read as a single file," said Gonzalez. "In addition to PDF417, we can build traditional 1D linear bar codes into files and are working on incorporating other 2D codes, such as the Data Matrix code."
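
The idea of chaining symbols can be sketched in Python as splitting a long payload into numbered pieces and reassembling them after scanning. This is a simplified illustration and not the chaining mechanism defined by the PDF417 specification: the `NN/NN:` header is invented here, and the 1,800-character limit is simply the figure quoted above.

```python
MAX_CHARS = 1800   # per-symbol capacity figure quoted in the article
HEADER_LEN = 6     # length of the invented "NN/NN:" sequence header


def split_payload(payload: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split a payload into numbered chunks that each fit in one symbol."""
    body = max_chars - HEADER_LEN
    chunks = [payload[i:i + body] for i in range(0, len(payload), body)] or [""]
    if len(chunks) > 99:
        raise ValueError("payload needs more than 99 symbols")
    total = len(chunks)
    return [f"{i:02d}/{total:02d}:{chunk}" for i, chunk in enumerate(chunks, 1)]


def join_payload(symbols: list[str]) -> str:
    """Reassemble the payload from chunks scanned in any order."""
    parts: dict[int, str] = {}
    total = None
    for symbol in symbols:
        header, _, chunk = symbol.partition(":")
        index, count = (int(x) for x in header.split("/"))
        if total is None:
            total = count
        elif count != total:
            raise ValueError("symbols come from different sets")
        parts[index] = chunk
    if total is None or set(parts) != set(range(1, total + 1)):
        raise ValueError("one or more symbols are missing")
    return "".join(parts[i] for i in range(1, total + 1))


if __name__ == "__main__":
    symbols = split_payload("x" * 4000)
    print(len(symbols), join_payload(list(reversed(symbols))) == "x" * 4000)
```

Numbering each piece lets the reader detect a missing symbol and rebuild the data regardless of the order in which the symbols were scanned.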

To date, Dataintro has sold Ultraforms® directly to end users. "Pricing has been based on a number of factors, including the number of forms, the expected savings, and the expected number of users," Gonzalez said. "However, we may develop an indirect model as our sales increase, which I expect to happen this year."

About Dataintro Software

DATAINTRO SOFTWARE is a privately owned software company with offices in Sacramento, California. We are the leading provider of 2D barcode generation technologies for PDF Forms and a highly specialized company in the Paper Process Automation area. Our clients are large private companies and government agencies worldwide. The solutions we provide translate into cost savings, higher productivity, and better efficiency, and in turn into better service quality for customers and citizens.
