The October 2005 issue of ACM Queue featured a cover article titled “Managing semi-structured data”. The article concluded:

“Semi-structured data exists all around us, yet often we are unable to process and use it. The high cost of processing information with existing techniques - because of the current requirement of tight (inflexible) coupling of data, schemas, and code - creates a natural barrier for using this information. … Enabling semi-structured information processing that is flexible, cheap, simple and effective is an important goal. We are still at day one.”

Semi-structured data does exist all around us. Let us look at it again from the perspective of the three main applications that are part of the Pro-Active suite.

You can think of your own application and relate it to the picture above. Pro</DOC> responds to this fundamental reality by implementing a technology framework that comprises the following elements:

1. The Hybrid Data Model

The Data Model is at the core of the technology and is the foundation for supporting the vision of universal information. A basic Pro-Active document can handle most HTML presentation elements and overlays a hierarchical data structure behind the document.

- A data representation based on XML and relational database technologies, which can model data in all its forms, ranging from unstructured to fully structured.
- A web application and tools to define the document structure and to create and manage documents.
- A document query language and an API.
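
To make the idea of a hierarchical structure behind a document more concrete, here is a minimal sketch. It is not the Pro-Active data model or API; the `Node` class, the `to_rows` helper and the sample claim document are all hypothetical names chosen for illustration. It only shows how one tree can carry free text next to structured values, and how the structured part can be flattened into relational-style rows.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class Node:
    """One node in a hierarchical structure behind a document (hypothetical)."""
    name: str
    text: str = ""                  # free, unstructured content
    value: Optional[str] = None     # structured value, if this node has one
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self, prefix: str = "") -> Iterator[Tuple[str, "Node"]]:
        """Yield (path, node) pairs depth-first, e.g. 'claim/policy_no'."""
        path = f"{prefix}/{self.name}" if prefix else self.name
        yield path, self
        for child in self.children:
            yield from child.walk(path)


def to_rows(root: Node) -> List[Dict[str, str]]:
    """Flatten every node that carries a structured value into a row."""
    return [
        {"path": path, "value": node.value}
        for path, node in root.walk()
        if node.value is not None
    ]


# A document that mixes unstructured text with structured fields.
doc = Node("claim", text="Customer called about water damage in the kitchen.")
doc.add(Node("policy_no", value="P-1001"))
details = doc.add(Node("details", text="Adjuster notes pending."))
details.add(Node("amount", value="2500"))

rows = to_rows(doc)
```

Nodes without a `value` stay in the tree as plain text, so the same document can range from fully unstructured to fully structured depending on how many fields have been filled in.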

2. Transformation Tools

The transformation tools allow content and structure to be added to or removed from the Pro-Active document incrementally, using a range of automated and semi-automated tools.

- Integrated ICR/OCR, in both batch mode and live online mode
- Two-way legacy host interface using screen scraping (IBM 3270 character streams and character-mode UNIX hosts)
- Two-way ODBC database interface
- Web application interface using screen scraping
- Seamless integration with MS-Word, MS-Excel, HTML and PDF document formats
- Pattern-matching-based content transformation
- Scripting capability, local as well as remote (using inter-process communication)
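
As a rough illustration of pattern-matching-based content transformation, the sketch below pulls structured fields out of unstructured text, such as the output of an OCR pass. The rule names, the regular expressions and the sample invoice text are assumptions made for this example; they do not describe the product's actual rule format.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical rules: (field name, regular expression with one capture group).
RULES: List[Tuple[str, str]] = [
    ("invoice_no", r"Invoice\s+No\.?\s*[:#]?\s*(\w[\w-]*)"),
    ("total", r"Total\s*[:=]?\s*\$?([\d,]+\.\d{2})"),
]


def extract_fields(text: str, rules: List[Tuple[str, str]] = RULES) -> Dict[str, str]:
    """Return the fields whose pattern matches somewhere in the text."""
    fields: Dict[str, str] = {}
    for name, pattern in rules:
        match = re.search(pattern, text, flags=re.IGNORECASE)
        if match:
            fields[name] = match.group(1)
    return fields


# Unstructured input, for example text recognized from a scanned page.
ocr_text = "ACME Corp\nInvoice No: INV-2041\nTotal: $1,250.00\nThank you."
fields = extract_fields(ocr_text)
```

Rules that do not match simply contribute nothing, so structure can be added one field at a time while the original text remains untouched.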

3. Document Tools

The documents are stored and managed on the server in the document repository. Storage, retrieval, search and analysis are part of this main function group of Pro</DOC>.

- Document management
- Workflow management
- Version management
- Indexing
- Encryption
- Search
- OLAP-based analysis for BI and workflow performance analysis
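
The sketch below shows, under simplifying assumptions, how version management, indexing and search can fit together in a document repository. The `Repository` class is a hypothetical in-memory toy, not the product's server: it keeps every saved version, indexes the words of the latest version only, and answers queries that require all words to be present.

```python
import re
from collections import defaultdict
from typing import Dict, List, Optional, Set


class Repository:
    """Toy in-memory repository with versions and a word index (hypothetical)."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}
        self._index: Dict[str, Set[str]] = defaultdict(set)

    def save(self, doc_id: str, text: str) -> int:
        """Store a new version and return its 1-based version number."""
        self._versions.setdefault(doc_id, []).append(text)
        self._reindex(doc_id)
        return len(self._versions[doc_id])

    def get(self, doc_id: str, version: Optional[int] = None) -> str:
        """Return a given version, or the latest one if none is specified."""
        versions = self._versions[doc_id]
        return versions[-1] if version is None else versions[version - 1]

    def search(self, query: str) -> List[str]:
        """Return ids of documents whose latest version has every query word."""
        words = re.findall(r"\w+", query.lower())
        if not words:
            return []
        hits = set.intersection(*(self._index.get(w, set()) for w in words))
        return sorted(hits)

    def _reindex(self, doc_id: str) -> None:
        # Drop stale postings, then index the words of the latest version.
        for postings in self._index.values():
            postings.discard(doc_id)
        for word in re.findall(r"\w+", self._versions[doc_id][-1].lower()):
            self._index[word].add(doc_id)


repo = Repository()
repo.save("doc-1", "Claim form for water damage")
repo.save("doc-1", "Claim form for fire damage")   # second version of doc-1
repo.save("doc-2", "Water bill")
```

Indexing only the latest version is a design choice for this sketch; a real repository might also index older versions or encrypt stored content, which is left out here.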

Pro</DOC> and Document Management function space