Abstract
Current World-Wide Web technologies concentrate on presenting documents to human readers. Although HTML identifies structures within a document, it does not allow the semantic content of document sections to be specified explicitly. We investigate a small extension to HTML which allows parts of a document to be mapped onto an underlying database schema. This mapping allows key information to be identified and extracted from a web automatically using standard database techniques. Such “lightweight” databases may span servers, with searches performed on either the client or the server side. We have applied this approach to generating “flattened” versions of hypertext documents suitable for printing.
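
As a rough illustration of the idea in the abstract, the sketch below pulls annotated spans out of an HTML fragment and loads them into a relational table. It is not the markup or the implementation from the paper: the `entity` and `field` attributes, the `FieldExtractor` class, the `extract_records` and `load_people` helpers, and the sample data are all hypothetical names chosen for this example. It uses only Python's standard `html.parser` and `sqlite3` modules and does not handle nested entities.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical markup: an element with entity="..." delimits one record,
# and descendants with field="..." supply its column values.
SAMPLE = """
<ul>
  <li entity="person"><span field="name">Ada</span>
      (<span field="email">ada@example.org</span>)</li>
  <li entity="person"><span field="name">Bob</span>
      (<span field="email">bob@example.org</span>)</li>
</ul>
"""


class FieldExtractor(HTMLParser):
    """Collects the text of field-annotated elements into one dict per entity."""

    def __init__(self):
        super().__init__()
        self.records = []     # completed records, in document order
        self._stack = []      # open elements as (tag, kind, field_name)
        self._current = None  # record being built, if inside an entity

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "entity" in attrs and self._current is None:
            self._current = {"_entity": attrs["entity"]}
            self._stack.append((tag, "entity", None))
        elif "field" in attrs and self._current is not None:
            self._current[attrs["field"]] = ""
            self._stack.append((tag, "field", attrs["field"]))
        else:
            self._stack.append((tag, None, None))

    def handle_endtag(self, tag):
        # Ignore stray end tags that match nothing currently open.
        if not any(open_tag == tag for open_tag, _, _ in self._stack):
            return
        # Pop until the matching element, so unclosed tags do not leak.
        while self._stack:
            open_tag, kind, _ = self._stack.pop()
            if kind == "entity":
                self.records.append(self._current)
                self._current = None
            if open_tag == tag:
                break

    def handle_data(self, data):
        # Text belongs to the innermost open field, if there is one.
        for _, kind, name in reversed(self._stack):
            if kind == "field":
                self._current[name] += data
                return
            if kind == "entity":
                return


def extract_records(html):
    """Parse an HTML string and return the annotated records as dicts."""
    parser = FieldExtractor()
    parser.feed(html)
    parser.close()
    return parser.records


def load_people(records):
    """Load 'person' records into an in-memory SQLite table (assumed schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE person (name TEXT, email TEXT)")
    db.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [(r.get("name"), r.get("email")) for r in records
         if r["_entity"] == "person"],
    )
    return db


if __name__ == "__main__":
    db = load_people(extract_records(SAMPLE))
    for row in db.execute("SELECT name, email FROM person ORDER BY name"):
        print(row)
```

Once the records sit in a table, ordinary SQL queries can select or join them, which is one way to read "standard database techniques"; the same extraction step could in principle run on either the client or the server.
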
Original language | English |
---|---|
Pages (from-to) | 1009-1015 |
Journal | Computer Networks and ISDN Systems |
Volume | 27 |
Issue number | 6 |
Publication status | Published - Apr 1995 |
Keywords
- WWW
- mark-up
- information retrieval
- web technology
- programming languages
- databases
- resource discovery
- semantic mark-up
- HTML extensions
- printing hypertext