1. Heirloom vegetables and heirloom information access
Vegetables
Years ago there were many more varieties of vegetables available than there are today – spotted beans, yellow and black tomatoes, orange jelly turnips, easy picking ‘lazy wife beans’ – people had variety and choice. Now we suffer from a one size fits all mentality. We have, for example, tomatoes with wonderful storage qualities, but with little flavour. Selection of the ‘optimal’ variety – according to priorities of the day such as transportability – has meant the absence of other varieties in the shops.
An heirloom vegetable is one that is at least fifty years old. Often their histories have been documented, and can be traced back to early settlers. Heirloom vegetables are open-pollinated, meaning that offspring of a plant are true to type, as opposed to hybrids where there is no guarantee of the characteristics of the new generation. Heirloom vegetables also often have unique characteristics – such as unusual colouration – rather than the somewhat bland visual appeal of the new generation of uniform products.
Growers of ‘heirloom’ vegetables – also known as antique or old-time plant varieties – choose to grow these forgotten varieties to ensure that we don’t end up with a monoculture with only one option in the future. There is a great risk with a horticultural monoculture that if that vegetable suffers a major pest or disease attack the whole crop is obliterated, whereas if a number of varieties are grown, one or more is likely to be resistant to attack.
Heirloom vegetable growers are often members of ‘seed saver’ networks, cooperatively ensuring that these special seeds are propagated widely so that they remain available.
Information access
Just as there are valuable and exciting heirloom vegetables, there are also ‘heirloom’ information access methods. These methods are old – relative to the web – and have been nurtured over the years. Library classification schemes such as the Dewey Decimal Classification and Library of Congress Classification have been used along with subject heading schemes (from which thesauri and keyword metadata developed) for over a century. Book-style indexing has developed gradually since the development of subject indexes in France in the thirteenth century, and the simple indexing of commonplace books in England in the mid-sixteenth century.
Just as open-pollinated vegetables are true to type, so these heirloom information access tools give results that are ‘true to type’. The Dewey notation 100 means the same here as in Lancaster, Kuala Lumpur or Suva; bibliographic databases are used world-wide; and indexes are easily recognised and used everywhere. These access methods are reliable and consistent. They are also more individual than a global search engine. Information access methods can be chosen for the need at hand, and will reflect the priorities and choices made by the information professionals working on them.
Unfortunately, information access is tending towards a monoculture. Search engines such as Google provide excellent retrieval, and have become the dominant information access method for the web. But while they are good at what they do, there are limits to the quality of information retrieval using a search engine, and there are times when another information access method would be better.
Luckily, there are a few enthusiasts working on maintaining heirloom information access methods. Websites of societies of indexers have back-of-book-style indexes, library sites use classification schemes, and subject gateways and image libraries depend on thesauri for optimal retrieval. Even though these sites may be few and far between, it is important to keep the ideas alive, and at least present some ‘example’ sites across the web to show what options exist. Just as seed saving works best with many people cooperating, similarly, these traditional access methods – which can be labour intensive – depend on a number of volunteers working together for a common aim.
Unfortunately, they suffer from the same scalability problems as heirloom vegetables, so must be complemented by other approaches. They also demand organisational or personal commitment for their maintenance.
2. Value of heirlooms
While modern vegetables, and modern access methods, have many advantages, they do not suit everyone, all the time. If we want optimal nutrition, visual appeal and taste appeal, we need to grow a range of vegetables with features apart from the solely practical. Similarly, modern access methods which rely heavily on computer processing are wonderful tools, but without maintaining our traditional access methods to complement them, we will be information poorer.
The values I believe heirloom vegetables and heirloom access methods offer are:
- User choice
- Local suitability
- Saving money
- Links to the past
- Genetic diversity/insurance against monoculture
User choice
Growing heirloom vegetables, and maintaining heirloom access methods, means that people have more choice, and are not stuck with one or two varieties. When choice involves unique, valuable qualities, it is worth preserving.
As Kathy Mendelson says: ‘Heirlooms invite passion. There is just something about all their wonderful shapes, sizes, colors, and flavors that sparks a sense of wonder. Take heirloom tomatoes, for example. They can be big, small, fluted, smooth, red, orange, pink, purple, yellow, green, white, striped, round, pear-shaped, determinate, indeterminate, potato-leaved, and more. They also vary in traits you can’t see – taste, hardiness, adaptability, and the like’. Expensive restaurants are now using heirloom varieties to spark interest in food as well as providing tasty alternatives. Examples are Blushing Lady apple pie and Madras Sweet Pod radish salad.
And just as heirloom vegetables often have some special characteristic which makes them unique and desirable, so do heirloom access methods. The approaches taken in classification, book-style indexing and thesaurus construction and keyword selection are all different, and each method has different uses. Some people will use classifications for their clear structure and grouping of concepts, while others prefer book-style indexes, with their natural language and quick access to specific concepts. Yet others, dealing perhaps with large numbers of structured documents, use metadata and a search engine. Unless we preserve and develop these alternatives, we’ll be left with no option but a search engine that retrieves thousands of hits – some of which are sure to be great if one had the time to sort the wheat from the chaff!
Local suitability
When a national horticultural company selects a product to sell, for maximum profitability they need one that will grow in a wide range of environments. But when a local producer chooses a seed to grow, they can select one that is optimally suited to the environment they are in. Open-pollinated seed that has been grown for years in a region will be adapted to that area’s soil, climate and pests. In addition, when gardeners save the seed of the plants that worked best for them, they gradually select their own cultivars which will be suited to their own tastes.
Similarly, heirloom access methods can be tailored to the needs of the community that will be using them. For example, a website index for nurses may contain more technical language than a website index for school students. The more knowledge a website creator has of their potential audience, the better they can target their content and access methods. Focus groups and card sorting exercises provide information about the needs and expectations of specific users which can be incorporated into heirloom access methods to give the optimal solution for that specific website.
Once people start using a site, feedback from search logs indicates the terms people are searching for and the topics they are interested in. Again, this can influence the content that is provided, and the metadata terms and synonyms added to enhance the search engine.
Saving money
Saving heirloom seeds is cheaper than buying new hybrid seeds each year. Using heirloom access methods can be cheaper than IT-based solutions when the costs of not finding information are taken into account along with labour and hardware costs. Manual indexing is often considered to be expensive, but when looked at as a proportion of total project costs for a website it is not very expensive at all. It makes no sense to spend millions of dollars putting content into a website if users cannot effectively find that content.
Links to the past
The more sophisticated and technological our culture becomes, the more we look to the past for a sense of continuity and stability of values. Family history can now involve growing and tasting food that our ancestors grew, as well as finding out general details of their lives. Regional history exhibits include gardens with old-time varieties that show the options people had in the past.
Similarly, user studies show that people using new computer systems like to relate them to prior knowledge. Carroll and Rosson discuss problems of ‘assimilation bias’, which means that people apply what they already know to interpret new situations. This can be helpful when there are true similarities, but it can blind users to features of the new situation they are facing. One suggestion of the authors to mitigate assimilation bias is to ‘Make or describe the system as similar to something familiar’. Since heirloom access methods are familiar to people used to traditional information access, this suggests that they could be of value in providing an old and familiar tool that works well in the new environment.
One key example of a classification scheme used for the organisation of web resources is BUBL LINK, which links to resources on a range of topics. One of its strengths is that it offers traditional access methods such as the Dewey Classification and alphabetic subject indexes as complements to a search engine.
Genetic diversity/insurance against monoculture
When one variety of vegetable is considered highly desirable for some specific feature, it tends to dominate horticulture at the expense of all other varieties, and we have a monoculture. The problem with a monoculture is that if something happens to make the key variety vulnerable, the total growth of that vegetable is at stake. For example, if a pest or disease strikes the current ‘queen tomato’ or ‘king potato’ the whole crop of tomatoes or potatoes may be lost. If, on the other hand, a range of varieties is grown, there is a good chance of one or more of these being resistant to that specific pest or disease, thus keeping up production.
Maintaining genetic diversity is also important to keep the gene pool of all varieties available for future use. Pharmaceutical research to find new drugs often involves screening hundreds of plants and microorganisms on the chance of finding a useful biochemical. The more varieties we maintain, the more hope we have of finding the drugs we need.
In information access we are tending towards a monoculture, with a heavy reliance of many users on Google search to find whatever they need on the web. While Google is great – and is continually adding features – it is not always the best tool for each need. If it becomes so powerful that it reduces the viability of other options we are losing our gene pool, and we will be poorer for it.
And just as horticultural monocultures are vulnerable to wipeout by a specific, uncontrollable pest or disease, so Google itself is vulnerable to manipulation. If it fails to provide effective retrieval and ranking, we will be left with a big hole in our access options. In recent months we have seen how people are vulnerable to changes in Google policies, and how Google is vulnerable to spamming by a small number of individuals.
Google Dance
When Google recently changed its ranking policies, many sites which had ranked highly (usually after targeted search engine optimisation procedures) found themselves ranked much lower. This was dubbed the Google Dance, as sites wiggled around the rankings. Any decision made by Google has the power to affect the livelihoods and the communication chances of every website owner. If the ranking changes are appropriate and well thought out they will benefit some and hinder others. If the ranking changes are introduced to manipulate the system in any way (for example, to encourage more people towards paid advertising options) then Google’s dominance gives them enormous power to sway markets, and to influence the information we can find.
Google Bombing/Metadata spamming
Just as Google has power to affect site owners, site owners have the power to affect Google rankings in a process known as Google Bombing. Because Google rankings depend on the words used in links to a site, as well as words within a site, a small number of websites can influence ranking by using the same words in links to a specific site. In one case the search terms were ‘miserable failure’ and the target was the site for George Bush’s biography. (With a distinct lack of originality, the same phrase now also leads to listings for Jimmy Carter, Hillary Rodham Clinton and Michael Moore).
The ease of this bombing – which required less than a hundred links – depends to some extent on the uncommon nature of ‘miserable failure’ as a link term. It also seems a fairly harmless piece of fun, although it does reduce valid information access queries. Most importantly, however, it shows how vulnerable a search engine is to a small number of people who wish to manipulate it.
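The mechanism can be illustrated with a deliberately simplified sketch. It is not Google's algorithm: it is a toy ranker, assumed for illustration only, that scores a page purely by how many inbound links carry the query phrase in their link text. The domain names and link data are invented.

```python
from collections import Counter

# Hypothetical link data: (source page, link text, target page).
LINKS = [
    ("blog-a.example", "miserable failure", "biography.example"),
    ("blog-b.example", "miserable failure", "biography.example"),
    ("blog-c.example", "Miserable Failure", "biography.example"),
    ("news.example", "a miserable failure of policy", "analysis.example"),
    ("news.example", "official biography", "biography.example"),
]


def rank_by_link_text(links, query):
    """Order target pages by how many inbound links contain the query phrase."""
    phrase = query.lower()
    scores = Counter(
        target for _, text, target in links if phrase in text.lower()
    )
    # most_common() returns (page, score) pairs, highest score first.
    return scores.most_common()


ranking = rank_by_link_text(LINKS, "miserable failure")
print(ranking)
```

In this toy model three coordinated links are enough to put the biography page first for a phrase that never appears on the page itself. The rarer the phrase, the fewer links are needed.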
I find it particularly frustrating because concerns about spamming using keyword metadata have led to its almost total removal as a significant feature in search engine rankings. In ‘An end to metatags’ Andrew Goodman writes that web-wide search engines (except Inktomi, to a small extent) no longer take keyword metadata into account when ranking sites. This is unfortunate, because metadata offers as much potential as link terminology to give useful site rankings. A more balanced approach with some attention to metadata and some to link terminology would limit the potential of both for manipulation.
However, while keyword metadata may currently be of little importance to web-wide search engines, it is still important in site search where search engines can be set to take it into account, and spamming is of little or no significance. In this case metadata, based on heirloom database indexing concepts, remains of recognised value.
Search engine bias
Also worrying is the risk of deliberate or accidental bias in search engine results. Susan Gerhart studied the retrievability of controversial websites using three search engines and two metasearch engines and found that for some searches controversial sites were less likely to be found by search engines than organisational sites presenting the ‘sunny side’ of a topic. While Gerhart does not attribute this to deliberate bias, she discusses concerns that certain types of material are less likely to be found by search engines, and discusses strategies to remedy the situation.
3. Keeping heirlooms going – cooperative seed saving
Heirloom fruits and vegetables are being revived and maintained using seed saved by enthusiasts and exchanged by members of a seed savers exchange. Growers choose the healthiest, most productive and tastiest plants to regenerate.
Just as heirloom plants often depend on volunteers and enthusiasts to keep them going, so many heirloom access methods are being developed by enthusiasts with a passion for information and a passion for the access method they choose. Most of the websites for indexing societies, for example, have website indexes, and a number of librarians have set up classification schemes or have used thesauri and metadata to improve access within a site or portal.
Interestingly, a number of the long lasting sites were created by information technology (IT) professionals. Indexers and librarians often see IT as being the competition, yet this suggests that many IT experts are also aware of the value of manual information retrieval approaches. The W3C (World Wide Web Consortium) website has had an alphabetical index for many years, as has the Unix Manual online. Cooperation with IT staff may be one key to the successful implementation – and, more importantly, maintenance – of heirloom access methods.
4. Developing the future
Many heirloom vegetables that nearly disappeared are now valued by seed sellers for their special qualities, and some will come back into favour in the future as their match with current needs is recognised. Similarly, many heirloom access methods are being incorporated into the most sophisticated of the new approaches to information access. These include topic maps, faceted classifications and information visualisation.
Topic maps
Topic maps are acknowledged to be based on principles used in traditional indexes and thesauri, with inspiration from semantic networks (not all newer methods acknowledge their traditional sources so clearly). They provide information about topics, but go further by clarifying the relationships of these topics to other topics. Thus a person can be precisely linked to their parents and children; a city to its state and country; and an item to its constituent parts.
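The idea of topics joined by named relationships can be sketched in a few lines. This is a minimal illustration, not an implementation of the topic map standard: the `TopicMap` class, the relation names and the sample topics are all invented for the example.

```python
from collections import defaultdict


class TopicMap:
    """Toy store of topics joined by typed, directed associations."""

    def __init__(self):
        # Maps (subject topic, relation type) to the set of related topics.
        self._associations = defaultdict(set)

    def associate(self, subject, relation, obj):
        """Record that `subject` stands in `relation` to `obj`."""
        self._associations[(subject, relation)].add(obj)

    def related(self, subject, relation):
        """Return the topics linked to `subject` by `relation`, sorted by name."""
        return sorted(self._associations.get((subject, relation), set()))


# Hypothetical sample data.
tm = TopicMap()
tm.associate("Brisbane", "located-in", "Queensland")
tm.associate("Queensland", "located-in", "Australia")
tm.associate("bicycle", "has-part", "wheel")
tm.associate("bicycle", "has-part", "frame")

print(tm.related("bicycle", "has-part"))
print(tm.related("Brisbane", "located-in"))
```

Because each relationship carries a type, a query can ask specifically for the parts of an item or the state a city belongs to, rather than for everything loosely associated with a term.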
Topic maps gain their power from the use of controlled vocabularies derived from traditional library tools. In addition, they use computer processing to efficiently integrate legacy data without manual rekeying and reindexing. By combining the effectiveness of heirloom methods, and the efficiency of computer conversion, they offer a way of making heirloom access methods cost-effective and scalable for current needs.
Faceted metadata classifications
Faceted metadata classifications are another development based on traditional controlled vocabulary concepts. Faceted classifications rely on the addition of metadata relating to different facets, or attributes, of a resource. For example, books have the facets author, publication date and publisher, while toys have the facets recommended age, cost, size, and construction material.
Faceted classifications allow users to select their own search path according to the attributes of the item that are of interest to them. If they want to buy a toy and only have $20 to spend, they can first limit the search by cost. On the other hand, if the child must have a Nutcracker Barbie, then searching by brand name should be the first step. A good faceted classification displays the number of hits for each option, thus showing the user whether they need to limit their search further.
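The toy-shop scenario can be sketched as follows. This is a minimal illustration under stated assumptions: the catalogue, prices, brand names and function names are all invented, and a real faceted system would sit on top of a database or search engine rather than a list in memory.

```python
from collections import Counter

# Hypothetical catalogue; every value here is invented for illustration.
TOYS = [
    {"name": "Nutcracker Barbie", "brand": "Barbie", "cost": 35, "age": "3+"},
    {"name": "Wooden blocks", "brand": "Acme", "cost": 18, "age": "1+"},
    {"name": "Jigsaw puzzle", "brand": "Acme", "cost": 12, "age": "3+"},
    {"name": "Fashion Barbie", "brand": "Barbie", "cost": 19, "age": "3+"},
]


def filter_items(items, **selected):
    """Keep items whose facet values match every selected facet."""
    return [
        item for item in items
        if all(item[facet] == value for facet, value in selected.items())
    ]


def within_budget(items, limit):
    """Keep items costing no more than `limit`."""
    return [item for item in items if item["cost"] <= limit]


def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    return Counter(item[facet] for item in items)


# Path 1: limit by cost first, then see how many hits each brand offers.
affordable = within_budget(TOYS, 20)
brand_counts = facet_counts(affordable, "brand")
print(len(affordable), dict(brand_counts))

# Path 2: start from the brand instead.
barbies = filter_items(TOYS, brand="Barbie")
print([toy["name"] for toy in barbies])
```

The counts are what tell the user whether another facet is worth applying: a facet value with two hits needs no further narrowing, while one with two thousand does.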
One example of a faceted classification on the web is the online proceedings of the DC-2002 Dublin Core conference. There is also a good example on the facetmap.com site.
Information visualisation
Information visualisation is a technique for visually presenting large quantities of information to users. It has been applied to search engine results, library catalogues, business information and scientific research data. The methods for displaying the data are often compared with traditional maps – they both summarise an enormous amount of information and allow users to identify patterns and trends. Information visualisation techniques work with both structured and unstructured data.
Visual Net software from the Canadian company Antarctica is used to map various types of data to create large-scale browsable maps. Visual Net has been used effectively to display the library catalogue of Belmont Abbey College, North Carolina in an attempt to replicate the concept of shelf browsing. The structure of the data – which forms the basis for the visualisation – comes from the Library of Congress Classification scheme.
5. Heirloom problems
Heirloom vegetables and heirloom access methods have a few drawbacks. Not all heirloom varieties are good for all uses, and they can be expensive to produce and hard to maintain.
The right environment and use
Some heirlooms might work well for one environment – or one use – but not for another. There used to be carrots grown to provide colour for butter – to grow these and complain that they make bad coleslaw would be unfair. Heirlooms can also be quirky – some seeds may germinate slowly, or lead to unexpected traits. Similarly, since indexing is as much an art as a skill, the index one person creates may seem quirky to another.
Thesauri are a valuable tool for metadata creation and searching, but their use on the web has been limited for three reasons: users prefer searching ‘blind’, linking a thesaurus to content can be difficult, and thesaurus search can be very slow. For these reasons even sites that do offer thesaurus search tend to do so as an alternative, rather than as the primary search approach. For example, for PICMAN image search you have to select ‘Subject’ to do a thesaurus search (as opposed to the default, which is ‘Basic’) and on HealthInsite you have to go to the ‘Power Search’ page and then select ‘Thesaurus Navigator and Thesaurus Search’.
Cost of production/Centralisation
Classification schemes and indexes on the web can be time-consuming and costly to implement. They also require centralised control for effective implementation, meaning that the work of indexing can’t be distributed to the people who create the content. At James Cook University the centrally-maintained index has been replaced by an index which is automatically generated from metadata added by content creators. This puts responsibility for inclusion in the index on the shoulders of the people creating the content, but also means that the index is dependent on the quirks of term selection from a range of independent indexers. It still needs a central editor, as can be seen by the duplicated and incomplete entries it contains.
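The trade-off can be shown with a small sketch of an index generated from author-supplied metadata. It does not describe the James Cook University system: the page records, keywords and function names are invented, and the duplicate check is only a crude aid for a central editor.

```python
from collections import defaultdict

# Hypothetical page records with keywords typed by different content creators.
PAGES = [
    {"url": "/library/hours", "keywords": ["Library", "Opening hours"]},
    {"url": "/enrol", "keywords": ["Enrolment"]},
    {"url": "/enrol/fees", "keywords": ["enrollment", "Fees"]},
    {"url": "/library/loans", "keywords": ["library"]},
]


def build_index(pages):
    """Group page URLs under each keyword exactly as the author typed it."""
    index = defaultdict(list)
    for page in pages:
        for keyword in page["keywords"]:
            index[keyword].append(page["url"])
    # Alphabetical order, ignoring case, as in a back-of-book index.
    return dict(sorted(index.items(), key=lambda entry: entry[0].lower()))


def suspect_duplicates(index):
    """Flag headings that differ only by capitalisation."""
    groups = defaultdict(list)
    for heading in index:
        groups[heading.lower()].append(heading)
    return [group for group in groups.values() if len(group) > 1]


index = build_index(PAGES)
for heading, urls in index.items():
    print(heading, urls)
print(suspect_duplicates(index))
```

The automatic check catches ‘Library’ and ‘library’ as a split heading, but it misses the spelling variants ‘Enrolment’ and ‘enrollment’, which still scatter related pages across two entries. Reconciling such variants is the kind of judgment a central editor supplies.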
Maintenance
Even sadder than seeing heirloom access methods ignored is seeing them implemented and then dumped. Over the past five years I have taught a website indexing course and written a book on website indexing, and every time I update my notes I have to remove examples of excellent web indexes because they no longer exist. This may be because the site, or part of a site, has been removed, or because indexing has been replaced by search. For example, the Case-in-point index (which has been a placegetter in the AusSI Web Indexing Awards, and has been described in print) no longer exists because the journal it refers to is no longer online. The online index to the University of Texas Policies and Procedures Manual has been replaced by metadata and a search engine. A strong organisational commitment is obviously needed to ensure that adequate time and money are made available for these information access methods to be maintained.
When searching through a list on the OCLC website of sites that use classification schemes I found that most of those on the list no longer exist. The situation is the same with specialised subject classifications. The EELS site, for example, used to organise engineering information according to a classification scheme, but now has a note that it is no longer being updated, and that a new project using harvested records automatically selected for relevance in accordance with the Engineering Index Thesaurus (EI) is being implemented to replace it.
6. Conclusion
Modern hybrid vegetables are well suited for survival, offer a consistent yield, and can be grown in a wide range of climates. They’re tough and reliable. Similarly, search engines are great survivors, they offer a continually updated product and they’re available world-wide. So why go backwards and fiddle around with manual methods best suited to musty volumes?
There are many reasons to maintain heritage vegetables and heritage access methods, and most relate to choice. The choice to be different; to reject the ‘one size fits all’ mentality and to use whatever method works best at the time for a specific need. Other reasons relate to the desire to preserve the past, and to make sure that we leave a diversity of options for generations to come. Finally, there are issues of quality and scale. The sledge-hammer approach is great for breaking stone, but is overkill for cracking a nut. We need to preserve the small against the smothering qualities of the big. If we end up with a monoculture then we risk having no culture at all.









