Libera Libro

Free the books: 94053494

Module 3: Introducing Metadata

Posted by arlekeno on July 24, 2012

You would have already come across the term ‘metadata’ in the readings. What is it? (Or more correctly, What are they?) Hider (2008, p.332) defines metadata as ‘a set of elements that describes an information resource’.

Makes pretty good sense to me: information about the information, e.g. the number of pages or the year of production. I think it is general enough to meet all the possible needs. BUT the task is to look up 3 other definitions.

First, Wikipedia, the first search result.


From Wikipedia, the free encyclopedia

The term metadata is an ambiguous term which is used for two fundamentally different concepts (types). Although the expression “data about data” is often used, it does not apply to both in the same way. Structural metadata, the design and specification of data structures, cannot be about data, because at design time the application contains no data. In this case the correct description would be “data about the containers of data”. Descriptive metadata, on the other hand, is about individual instances of application data, the data content. In this case, a useful description (resulting in a disambiguating neologism) would be “data about data content” or “content about content” thus metacontent. Descriptive, Guide and the National Information Standards Organization concept of administrative metadata are all subtypes of metacontent.[citation needed]

OK, I would have gone with "data about data", but the term "metacontent" (though uncited) appeals to me. Interesting that libraries had it first.

Anyway, now to a more reliable and relevant resource: the National Library of Australia.


What is metadata? My impression, from a number of recent meetings which I have attended, is that the concept is proving difficult to define with clarity. The Macquarie Dictionary defines the prefix “meta-” as meaning “among”, “together with”, “after” or “behind”. That suggests the idea of a “fellow traveller”: that metadata is not fully fledged data, but it is a kind of fellow-traveller with data, supporting it from the sidelines. My definition is that “an element of metadata describes an information resource, or helps provide access to an information resource”. A collection of such metadata elements may describe one or many information resources.

It is inherent in the concept of metadata that there is an association of some kind between the metadata and the information resource which it describes. For example, a library catalogue record is a collection of metadata elements, linked to the book or other item in the library collection through the call number. Information stored in the “META” field of an HTML Web page is metadata, associated with the information resource by being embedded within it. The indexing data held by Web crawlers is also metadata (though not very good metadata) – linked to the information resource through the URL.

Metadata can be an information resource in its own right. For example, a review of a film – which on one level is a piece of metadata related to the film – is, on another level, a literary work with its own author and perhaps its own intellectual property constraints.

Now this is the professional view… 1) it's not clear! 2) it's not data but hangs out with data (like its posse?) 3) it can be useful information in its own right. You will notice I extended this example to talk of the embedded data in Web pages, just to keep us up to date.
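To make the embedded-metadata idea a bit more concrete, here is a minimal sketch in Python of pulling the "META" fields out of a Web page. The sample page, the class name MetaTagCollector and the helper extract_meta are all made up for illustration; only the standard library html.parser module is real.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from <meta> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.metadata = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        # Only keep tags that carry both a name and a content value.
        if name and content is not None:
            self.metadata[name.lower()] = content


# A made-up page: the metadata travels inside the resource it describes.
SAMPLE_PAGE = """
<html>
  <head>
    <title>Holiday photos</title>
    <meta name="author" content="A. Example">
    <meta name="keywords" content="holiday, photographs, beach">
    <meta charset="utf-8">
  </head>
  <body><p>The page content itself.</p></body>
</html>
"""


def extract_meta(html_text):
    """Return a dict of the name/content metadata embedded in html_text."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    return collector.metadata


page_metadata = extract_meta(SAMPLE_PAGE)
print(page_metadata)
```

The charset tag is skipped because it has no name/content pair, which is a simplification I chose for the sketch rather than a rule from any standard.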

Finally, from the Australian National Data Service.

Who needs to know this?

This is a general introduction which is likely to be of interest to researchers, their support staff, data centre and repository staff and research administrators.


The term metadata refers to information used to describe items and groups of items. It is data about data. It can be used to describe physical items as well as digital items (files, documents, images, datasets, etc.). A library catalogue, for example, is made up of metadata describing the books, journals and other items held by the library. The File Properties for a word processing document is a rudimentary (and imperfect) metadata record.

Item level metadata is used to describe a single object such as a photograph: who took the photograph, who is in it, the date it was taken, the place it was taken, the type of camera used to take the photograph, and so on.

Collection level metadata is used to describe an aggregation of objects such as the photo album (or CD-ROM or file folder) that contains a group of photographs: the size of the collection, who took the photographs (there may be more than one person), the time period over which the photographs were taken, and so on. Some of these attributes, such as ‘Title’ may be the same as those used to describe an individual photograph.

Metadata adds value to documents or images. For scientific data, metadata is even more important because it provides the context needed to make sense of what would otherwise be a collection of random numbers.

Types of metadata

The metadata elements used to describe either an item or a collection can serve different purposes. Some examples include:

  • Descriptive metadata, such as the name of the photographer, the subject of the photograph, the date and time that the photograph was taken;
  • Technical metadata, such as the type of camera used, the file format in which the photograph is stored, the exposure time and dimensions of the photograph, and so on;
  • Access or rights metadata, defining who is allowed to view this photograph and under what conditions; and
  • Preservation metadata, which allows a digital preservation expert to keep track of actions taken to preserve or sustain the photograph for later access and use.

This is a good one because of the detail. I like that it says metadata adds value to data, so we can make sense of it.
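The photograph example above can be sketched in Python as one possible way to model item level and collection level metadata. The class names, field names and sample values are my own assumptions for illustration, not part of the ANDS guide.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ItemMetadata:
    """Metadata for a single photograph, grouped by purpose (hypothetical)."""
    descriptive: Dict[str, str] = field(default_factory=dict)
    technical: Dict[str, str] = field(default_factory=dict)
    rights: Dict[str, str] = field(default_factory=dict)
    preservation: List[str] = field(default_factory=list)  # log of actions


@dataclass
class CollectionMetadata:
    """Metadata for an album, partly derived from its items (hypothetical)."""
    title: str
    items: List[ItemMetadata] = field(default_factory=list)

    def size(self):
        return len(self.items)

    def photographers(self):
        # There may be more than one photographer in a collection.
        names = {
            item.descriptive["photographer"]
            for item in self.items
            if "photographer" in item.descriptive
        }
        return sorted(names)

    def date_range(self):
        # ISO 8601 date strings sort correctly as plain text.
        dates = [
            item.descriptive["date"]
            for item in self.items
            if "date" in item.descriptive
        ]
        return (min(dates), max(dates)) if dates else None


# Made-up sample data.
photo1 = ItemMetadata(
    descriptive={"photographer": "Alex", "subject": "Beach", "date": "2012-01-05"},
    technical={"camera": "Example DSLR", "format": "JPEG"},
    rights={"access": "family only"},
)
photo2 = ItemMetadata(
    descriptive={"photographer": "Sam", "subject": "Picnic", "date": "2012-03-17"},
    technical={"camera": "Example phone", "format": "JPEG"},
    preservation=["2012-07-01: copied to backup drive"],
)
album = CollectionMetadata(title="Summer album", items=[photo1, photo2])

print(album.size(), album.photographers(), album.date_range())
```

Deriving the collection's size, photographers and date range from the items is just one design choice; a real catalogue might record them separately.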

What do these 3 definitions have in common? They all say it's data about data. I think the last one says it best by saying it adds value.

Although the current version of the Anglo-American Cataloguing Rules (the second edition, 2002 revision) includes guidance for cataloguing digital material, many librarians think that it does not do so very effectively.

Think back, for a moment, to the reasons for wanting to organise information. We do it so that we can provide access. Here’s a relevant quote about why we need metadata:

Metadata is crucial to searching. If searching is, today, largely a matter of matching query words with words in the text of articles, then anything that makes the matching process easier or more standardized is bound to improve the process. Metadata is expected to improve matching by standardizing the structure and content of indexing or cataloging information. (Jessica Milstead & Susan Feldman, ‘Metadata: Cataloging by Any Other Name …’, Online, January 1999.)

Onwards to Dublin Core

Levels of the standard

The Dublin Core standard includes two levels: Simple and Qualified. Simple Dublin Core comprises 15 elements; Qualified Dublin Core includes three additional elements (Audience, Provenance and RightsHolder) as well as a group of element refinements, also called qualifiers, that refine the semantics of the elements in ways that may be useful in resource discovery.

Simple Dublin Core

The Simple Dublin Core Metadata Element Set (DCMES) consists of 15 metadata elements (from Wikipedia):

  1. Title
  2. Creator
  3. Subject
  4. Description
  5. Publisher
  6. Contributor
  7. Date
  8. Type
  9. Format
  10. Identifier
  11. Source
  12. Language
  13. Relation
  14. Coverage
  15. Rights

Each Dublin Core element is optional and may be repeated. The DCMI has established standard ways to refine elements and encourage the use of encoding and vocabulary schemes. There is no prescribed order in Dublin Core for presenting or using the elements.
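As a rough sketch of what "optional and may be repeated" means in practice, here is a small Python example that builds a Simple Dublin Core record for this very post and writes it out as XML. The function name build_dc_record and the wrapper element "metadata" are assumptions of mine for illustration; the 15 element names and the namespace http://purl.org/dc/elements/1.1/ come from the Dublin Core element set, and the two subject values are simply ones I picked.

```python
import xml.etree.ElementTree as ET

DC_NAMESPACE = "http://purl.org/dc/elements/1.1/"

# The 15 Simple Dublin Core elements, in lower case as used in XML.
SIMPLE_DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def build_dc_record(pairs):
    """Build an XML record from (element, value) pairs (hypothetical helper).

    A list of pairs is used, not a dict, because any element may be
    left out entirely or repeated, and no order is prescribed.
    """
    ET.register_namespace("dc", DC_NAMESPACE)
    root = ET.Element("metadata")  # assumed wrapper element, not part of DC
    for element, value in pairs:
        if element not in SIMPLE_DC_ELEMENTS:
            raise ValueError("not a Simple Dublin Core element: " + element)
        child = ET.SubElement(root, "{%s}%s" % (DC_NAMESPACE, element))
        child.text = value
    return root


# Only four of the 15 elements are used, and subject appears twice.
record = build_dc_record([
    ("title", "Module 3: Introducing Metadata"),
    ("creator", "arlekeno"),
    ("date", "2012-07-24"),
    ("subject", "metadata"),
    ("subject", "Dublin Core"),
])
record_xml = ET.tostring(record, encoding="unicode")
print(record_xml)
```

Passing a Qualified element such as "audience" raises a ValueError here, since this sketch only covers the Simple level.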

From the notes:

It is important to note that Dublin Core metadata is based on four principles:

  • Simplicity – DC was designed to be applied by the people who create the information resources, rather than by information professionals.
  • Semantic interoperability – DC must be useable in different disciplines, and not be limited to any one subject area or group of subjects.
  • International consensus – because the internet operates across national boundaries, DC is developed by an international, interdisciplinary group.
  • Extensibility – DC is designed to be flexible so that it can be built on if required by specialist applications.

And some examples of what it all looks like in action.


