Update 97 & Database '97


Update ('97) Tokyo is the annual Japanese user meeting of Knight-Ridder Information. It was hosted by their Japanese agents KMK DigiTex. It was at the invitation, and expense, of KMK DigiTex that I had the pleasure of attending both Update and Database Tokyo. I want to begin this report by expressing my deep gratitude to KMK DigiTex for their generosity and hospitality, and for the efficiency and grace with which they made all the arrangements both before and during the meetings.

Welcome and Introduction

Kunizo Hirai, CEO of KMK DigiTex

This opening speech was in Japanese.

Keynote Speech

Martin Buerger, Knight-Ridder

This year marks 20 years of DIALOG in Japan. DIALOG's mission is to remain the world's electronic storehouse of knowledge. When Buerger heard that news of the MAID-Knight-Ridder merger had leaked and that an article had appeared in the Guardian newspaper, he found the information he needed in a newswire on DIALOG Web before he could find the news anywhere else.

There has been an explosion in the number of Internet hosts in Japan. The Internet is growing twice as fast in Japan as in the rest of the world. Knight-Ridder (KR) sees potential for helping those frustrated by the information explosion of the World Wide Web. Companies understand the importance of text and knowledge management in their subsidiaries around the globe. KR is the world's greatest storehouse of this knowledge. Buerger addressed two topics, while admitting that some of the details might change as a result of the planned merger. His first topic concerned the global information specialist market and his second, reaching the end user on the desktop.

KR has introduced Web services for the information specialist. DataStar Web was introduced in December 1996 and DIALOG Web was launched this year. KR offers 25 times more searchable text than the entire World Wide Web and has a continued commitment to delivering Web solutions. If there is a demand for it, a Japanese language interface could be introduced in future. KR will remain the world's greatest information source: 120 new files are to be introduced, 80 of them on DIALOG alone. Each year, 1 Terabyte, or 1 billion pages, of data will be added. There will be twice as much information in 5 years' time as there is today. KR intends to be the premier source of information in 5 areas: news, business, intellectual property, biomedical, and scientific and technical information.

KR offers the full text of more than 150 newspapers around the world. Following a deal between KR, Dow Jones and FT information companies, World Reporter will become a global news source later in September 1997. The data collection centre is in London and many of the staff are linguists. Another service is KR Investment Research, with images.

Quality of service is important for both DIALOG and DataStar: there is a highly skilled network of customer service and training staff. The Web is a new vehicle for expert training: the "Crossroads" web site allows the customer to share in real-time discussions, browse for industry news and attend training courses.

KR serves the end user in 120 countries from 50 offices. The service will get larger after the merger. It ties into a company's intranet at a sensible price. The end user wants to go online and find reliable answers quickly and easily. Two services have been introduced: DIALOG Select for business users, released last spring, and DIALOG@Carl for academics, with access to 300 databases on DIALOG. DIALOG Select will be released in Japan in September 1997. Another new service due later this year is KR@Site, the new interface to DIALOG OnDisc. KR's commitment to intranets goes beyond the Web. The Lotus Notes service to Ernst and Young has 30,000 users.

Customers need pricing programmes beyond the pay-as-you-go model. They want predictable price plans such as KR's "search builder" plans in which a monthly budget is set. Information professionals can set up joint plans with KR.

There are lots of changes going on and we are sailing in choppy seas. KR will focus on its core market, the information specialist. Jeff Galt was meeting with MAID in London as the Tokyo meeting took place and there is a shared KR-MAID commitment to serve the information professional market. The commitment is

  • to continue to sustain the company's leadership and content
  • to add value with links across databases (including addition of MAID's InfoSort)
  • to continue to enhance products and technology
  • to provide quality training and support, and
  • to further links to information specialists with Crossroads on the Web.

Buerger concluded by saying how much he valued the support of KR customers over 20 years and he looked forward to producing solutions for the next 20 years.

Other Keynote Speeches

After the KR talk, the conference divided into parallel sessions. There were keynote speeches in Patents and Intellectual Property; Medical, Science and Technology; and Database Technology (two concurrent talks). Presentations given in English were simultaneously translated into Japanese but papers in Japanese were not translated. This was a reasonable approach since amongst the 400 attendees there were very few of English mother tongue. I did the "medical" keynote speech, entitled "An Overview of Combinatorial Chemistry and Molecular Diversity". Randall Marcinko, President of Information Canada, Ltd., gave a database technology keynote entitled "The Quest for Information - Search Engines, Indexes and Databases - Where is it Going?" I was not able to hear this talk, since it clashed with my own, but I have summarised his audio visuals below.

An Overview of Combinatorial Chemistry and Molecular Diversity

Dr. Wendy Warr, Wendy Warr & Associates

This was an abbreviated and simplified version of a chapter I have written for the John Wiley Encyclopedia of Computational Chemistry. I can supply the text to any interested reader of this report. Please contact me by email.

The Quest for Information - Search Engines, Indexes and Databases - Where is it Going?

Mr. Randall Marcinko, Information Canada Ltd.

This paper was subtitled "Why has No-one Indexed the World Wide Web (WWW)?" Mr. Marcinko asserts that because the WWW has not been organised by information professionals, the ability to navigate the WWW and find information with pinpoint accuracy is very limited. Information creation and information management have been, and continue to be, influenced by the World Wide Web (the Internet). The WWW has crept into all links of the information chain. It provides a new medium and a new format for displaying and storing information and is a new means for conducting "for-fee & for-free" transactions (E-commerce). The Web provides new tools for the end user to become intimately involved with information retrieval, a new venue for advertising, and a plethora of new threats.

The Web was conceived and deployed by computer techies. There was no preparation for information retrieval when the Web was created. Devices such as YAHOO were created as tools to assist the techies. The need to seek out and retrieve data (with precision, recall and specificity) was an afterthought and was not part of the original design; indeed the World Wide Web was designed with no concern for indexing, archiving or standardisation.

How is information located and navigated on the Web? Search Engines are used to locate Web sites containing information of interest. These Search Engines possess the ability to find a word or groups of words extremely quickly, using very powerful computers; however, even advanced linguistic and artificial intelligence techniques are inadequate for information retrieval. Some common Search Engines are: InfoSeek, YAHOO, HotBot, Magellan, AltaVista, Excite, Lycos, DejaNews, Web Crawler, Autonomy, and CyberHound.

Mr. Marcinko then considered Search Engine quality. Are Search Engines "great"? Although they access 50-100 million WWW pages and are very fast, they ignore long-established principles of indexing and abstracting, they have poor specificity, they are inadequate for many types of research, personal, academic and scientific, and they have unacceptable recall.

With a Search Engine can you locate all companies with earnings greater than $50,000,000? Can you locate all drugs that contain diaminobenzene? Can you find all full-text articles that discuss the use of drugs for curing breast cancer? The answer to these questions is emphatically "no!" Mr. Marcinko gave several examples to prove his point. There is a very real need for improvement.

Several years ago, end users asked information professionals to help make information available on the net. They asked loudly and repeatedly. When nothing was done, the computer "techies" created products themselves. Despite being inferior from the information science perspective, Search Engines and other tools have become very pervasive.

Internet indexing must

  • improve WWW navigation
  • define standardised units of information
  • improve specificity, precision and recall
  • address document archiving
  • address document authorship
  • address document dating
  • use interlinguas for cross-language searching
  • address cross-culture searching and
  • support natural language searching

Medical Information and Drug Databases

After the four keynote speeches, the conference split into five tracks and I attended the sessions on medical information, plus related papers by Bonnie Snow and Sharon Bellard which were actually in two of the other tracks. The presentations by Ms. Myuki Matsumura for Adis International Ltd. and by Mr. Yoshihiro Kondo for J. R. Prous S. A., were in Japanese. Regrettably this means that I can only report accurately on the PJB Publications presentation which was in English. My apologies to the other vendors for any apparent bias. Prous Science had added structures for more than 85,000 products to Drug Data Reports, and Drugs of the Future is now available with full monographs, structures, schemes of synthesis, tables and figures, and context tables.

MediConf - a Database for the Pharmaceutical Industry

Dr. Rudolf Steck, Fairbase Database Ltd.

MediConf is a database on coming events in medicine, healthcare, pharmacology, and biotechnology, including all sorts of other topics such as medical computing, high throughput screening, and combinatorial chemistry. MediConf serves the pharmaceutical industry primarily. Hoffmann La Roche, Amgen and other companies have the database in-house. Online it is exclusively on DataStar.

Over 7500 coming events for 1997-2007 are listed and over 35,000 past events since 1993. Scientific events and annual meetings of professional societies are listed. For medical and pharmaceutical trade exhibitions, the potential number of visitors and details of exhibit space are recorded. The database covers more than 100 countries.

Users can do a "quick and dirty" search, or a more exact search, on subject, location, and date. Thus, you can search for "neurosurgery USA 1998" or you can do a more precise search by looking for USA in the country field etc. Subjects can be searched by medical speciality such as urology or cardiology, or by free text for items such as combinatorial chemistry or BSE. Location searches may be by region (e.g., North America), by country (e.g., USA), by state (e.g., Massachusetts), by city (e.g., Boston) or by place (e.g., Hynes Convention Center). Users can search on year, month and day, or they can use the DataStar commands for a limited time range.

The database is used by the pharmaceutical industry in long range planning, international marketing, sponsoring, science, clinical research, and drug approval and regulations. On DataStar Web, users can download in HTML format and use links to the Web pages of conference organisers, conference cities and international hotels.

The Dictionary of Substances and Their Effects

Dr. Sharon Bellard, The Royal Society of Chemistry

The Royal Society of Chemistry (RSC) is an international publisher and it runs library, document delivery and enquiry services. All RSC primary journals are now available electronically through the CatchWord system and through the OCLC First Search Electronic Collection Online (ECO). Other distribution methods are being explored. Electronic-only journals are planned for 1998 and the review journals will be published electronically in 1998. RSC produces specialist databases in various media. Analytical Abstracts, Chemical Business NewsBase, Chemical Engineering and Biotechnology Abstracts, and Chemical Safety NewsBase (CSNB) are available on both DIALOG and DataStar.

Volume 1 of the Dictionary of Substances and their Effects (DOSE) was first published in 1992. By 1994 there were 7 volumes. DOSE was launched on DIALOG in May 1997. DOSE brings together essential data from the world's chemical literature on over 4000 chemicals which have been studied for environmental impact or toxicity, including those on regulatory lists such as Europe's Black List and Grey List of Dangerous Substances, the UK's Red List, priority pollutants from the USA and Canada, and the pollution list from Germany.

Data include mammalian and avian toxicity, ecotoxicity and environmental fate, occupational exposure, physical properties and a full list of literature references. Identifiers include chemical name, synonyms, CAS Registry Number, uses and occurrence. Regulatory information includes UN Number, NIOSH (RTECS) Number, EINECS/ELINCS Number, threshold limit values, risk and safety phrases, HAZCHEM code and other UK legislation.

Dr. Bellard compared DOSE to its competitors. CHEMTOX has health and safety and toxicity data for 10,000 chemicals but no ecotoxicity or environmental information. HSDB gives toxicity data and environmental effects for 4,500 chemicals but the amount of information included might overwhelm the occasional user. RTECS includes toxicity data but no ecotoxicity or environmental information. It has US regulatory information and exposure limits for various countries but it is not updated frequently. Its data are not evaluated, whereas DOSE contains evaluated information. MSDS has health and safety data but no toxicity or environmental information. Toxline and other bibliographic databases serve a very different purpose from DOSE. DOSE is a unique, single source of toxicological and environmental data on chemicals that have adverse effects on life forms and the environment.

Enhancements to the Pharmaprojects Online Service

Mr. Alan Wilkinson, PJB Publications

Pharmaprojects Online on DIALOG and DataStar consists of drugs in development, launched products and discontinued products. Three key advantages are instant access, weekly updating, which offers accuracy and currency, and the fact that complicated searches can be constructed quickly and easily. Pharmaprojects contains more than 21,000 drug entries, about 6400 in active development (i.e., preclinical to the market place), about 13,400 discontinued (showing what has been tried and failed) and about 1200 widely launched in 28 major pharmaceutical markets, or with no further launch plan. Pharmaprojects tracks all compounds with human therapeutic use from preclinical studies through phases I, II and III clinical research to registration and product launch or discontinuation. Certain investigational drugs are also included.

Each entry in Pharmaprojects is presented as a concise monograph providing an overview of a drug's current status. Major information sources are direct communication with companies, conference attendance, scanning abstracts and journals for conferences not attended, information from the Scrip editorial team (which is verified by the pharmaceutical company itself before the data is put into the database), company and analyst reports, and patents (with which a patent expert helps). Much new information comes from the pharmaceutical companies themselves, either directly into the Pharmaprojects office or via the Scrip editorial team. It is always verified and then checked by a senior editor before entry into the database.

The Pharmaprojects Database is searchable using 25 search fields on DIALOG and 22 search fields on DataStar. The traditional search fields comprise drug name, synonym, and chemical name; CAS Number; molecular formula; company (originator and licensee); country; pharmacology; therapy; development status; patent information (priority dates, countries where patented, update history field); and information updates. Linking strategies, introduced in 1996, ensure the specificity of searches, avoiding false positive or false negative results. Link fields are the combination of originator, country and development status and the combination of therapy, pharmacology and development status.

The 16-line limit to the text field has been removed. Pharmaprojects now contains around 50% more data than one year ago. To improve readability, the text is now split into sections: marketing (filings, launches, agreements etc.); clinical (human testing); preclinical (animal studies, toxicology, pharmacology); and licensing (availability).

The CD-ROM version and hard copy have more than 900 company profiles. These give a brief description of the company; operating locations and subsidiaries; joint ventures; agreements; a financial summary and licensing contacts. There are 195 therapy profiles comprising therapeutic definition, current therapy, incidences and market values, and research trends. PJB Publications is talking to KR Information about appropriate ways of putting the company and therapy profiles online.

The new drug rating system, which was introduced in March this year, is an objective system based on data contained within the Pharmaprojects database. It is designed to provide an "at a glance" guide to the potential importance of a particular drug. It rates product novelty (on a scale of 1-4), potential market size (1-5), and speed of development (0-3). A total score is also stored. There are five new fields: novelty, market size, speed of development, total score (out of 12) and a field for all 4 ratings. The product novelty rating uses Pharmaprojects' pharmacology and therapy classifications. It identifies which product in a particular therapy/pharmacology combination is the most developmentally advanced. The most advanced product scores 4 points; the second, third and fourth products score 3 points and others are "Me-Too" products scoring 2 points. Novel formulations score 1. Combinations in which more than three compounds have been launched are classified as "E" for established. The score "N" is applied where novelty is not applicable, e.g., pharmacology not established. "U" means no progress beyond preclinical.
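The numeric part of the novelty rating, and the total out of 12, can be sketched in a few lines of Python. This is only an illustration of the rules as reported here, not PJB Publications' implementation: the function names, the `rank` argument (1 = most developmentally advanced product in its therapy/pharmacology combination) and the choice to let a novel formulation override the rank-based score are all assumptions. The letter codes (E, N, U) are left out.

```python
def novelty_score(rank: int, novel_formulation: bool = False) -> int:
    """Numeric novelty rating (1-4) following the rules in the report.

    rank: hypothetical position of the product within its
          therapy/pharmacology combination, 1 = most advanced.
    novel_formulation: assumed to take precedence over the rank.
    """
    if rank < 1:
        raise ValueError("rank must be 1 or greater")
    if novel_formulation:
        return 1  # novel formulations score 1
    if rank == 1:
        return 4  # the most advanced product
    if rank <= 4:
        return 3  # second, third and fourth products
    return 2      # "Me-Too" products


def total_score(novelty: int, market_size: int, speed: int) -> int:
    """Sum of the three ratings; the maximum is 4 + 5 + 3 = 12."""
    return novelty + market_size + speed


# Example: the leading product in a top-selling category,
# developing faster than average.
print(total_score(novelty_score(1), market_size=5, speed=3))
```

The maximum of 12 quoted for the total score field is consistent with the three scales adding up to 4 + 5 + 3.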

For scoring market size, all 195 therapeutic categories are rated from 1-5 based on worldwide sales. Sales of more than $10,000 million give a score of 5; $5,001-$10,000 million = 4; $2,001-$5,000 million = 3; $501-$2,000 million = 2; less than $500 million = 1. The figures are checked by independent analysts and revised annually.
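The sales bands translate directly into a small lookup function. The Python sketch below is illustrative only; the function name is hypothetical, and because the bands as reported do not say how sales of exactly $500 million are scored, the sketch assumes that such a category scores 1.

```python
def market_size_score(sales_million_usd: float) -> int:
    """Map worldwide sales of a therapeutic category (in $ million)
    to a 1-5 market size score, using the bands in the report.

    Assumption: sales of exactly $500 million score 1 (the report
    leaves this boundary undefined).
    """
    if sales_million_usd > 10_000:
        return 5
    if sales_million_usd > 5_000:
        return 4
    if sales_million_usd > 2_000:
        return 3
    if sales_million_usd > 500:
        return 2
    return 1


# Example: a category with worldwide sales of $3,200 million.
print(market_size_score(3_200))  # falls in the $2,001-$5,000 million band
```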

To calculate speed of development, for each therapeutic category the average number of months is calculated for compounds to pass from one clinical trial phase to the next. Individual drugs are rated as follows: 3 = faster than average; 2 = average for this therapeutic category; 1 = slower than average. Averages are recalculated weekly and new speed ratings are given for those drugs changing status. "N" means that no data is available for the average while all preclinical drugs are rated zero until they enter Phase I trials. Launched drugs are rated "E" for established.
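As a rough illustration, the speed rating can be written as a comparison against the category average. The Python sketch below is not the Pharmaprojects algorithm: the function name, the stage labels and the `tolerance` parameter (how close to the average still counts as "average") are assumptions, since the report does not say how wide the "average" band is.

```python
from typing import Optional, Union


def speed_rating(stage: str,
                 months: Optional[float],
                 category_average: Optional[float],
                 tolerance: float = 0.0) -> Union[int, str]:
    """Rate speed of development for one drug.

    stage: hypothetical label such as "preclinical", "phase III", "launched".
    months: time the drug took to pass between clinical phases.
    category_average: average months for the drug's therapeutic category.
    tolerance: assumed width of the "average" band, in months.
    """
    if stage == "preclinical":
        return 0    # rated zero until the drug enters Phase I
    if stage == "launched":
        return "E"  # established
    if months is None or category_average is None:
        return "N"  # no data available for the average
    if months < category_average - tolerance:
        return 3    # faster than average
    if months > category_average + tolerance:
        return 1    # slower than average
    return 2        # average for this therapeutic category


# Example: a Phase III drug that took 12 months where the category averages 20.
print(speed_rating("phase III", 12, 20))
```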

Pharmaprojects Online can now answer questions such as the following ones. Which are the most developmentally advanced aldose reductase inhibitors for the treatment of symptomatic diabetes? Which cardiovascular drugs in Phase III trials are showing a faster than average speed of development? Mr. Wilkinson tabulated the answers to these questions (in Japan) with the appropriate data, while admitting that the tabulation facility is not part of Pharmaprojects Online.

In the case of symptomatic antidiabetics (the therapy) which inhibit aldose reductase (the pharmacology) he tabulated drug name, manufacturer, phase and novelty rating, and Ono's Epalrestat came top with a novelty rating of 4. In the case of cardiovascular drugs in Phase III, he tabulated drug name, manufacturer, phase and speed of development, and found that Levcromakalim, Dofetilide and Otenzepad were worth inspecting.

The Pharmaprojects "Non Subscriber" rate has been removed. Now there is only one rate for all users. In the future, there will be new search fields (e.g. compound source, indication, route of administration); additional cross-linking of fields; enhanced detail, e.g. in CAS numbers; more analysis of data (e.g., drug rating) to add value; and a Pharmaprojects "Members Only" site. [http://www.pjbpubs.co.uk/pharma/top.html]

Tracking the Competition in Drug Pipeline Directories

Ms. Bonnie Snow, KR Information (co-authored by Anne Morisseau)

Ms. Snow's paper covered Adis R&D Insight, Prous' Drug Data Report, IMSWorld R&D Focus, NDA Pipeline (from F-D-C Reports), Prous' NME Express, and Pharmaprojects. The handout included masses of information including search outputs.

Users might search product names to assess future competition or to prepare for more effective marketing. Product "names" include chemical names, lab codes, generic names, brand names, CAS Registry Numbers and molecular formulae. A drug pipeline database should offer all these and most of them do, although they differ in the way that they transcribe names. Ms. Snow picked clodronate disodium as an example. R&D Insight gives the drug name clodronic acid, with clodronate disodium as a synonym; Drug Data Report calls it Cl2MDP with clodronate disodium as the generic name; R&D Focus has clodronic acid and clodronate as drug names and does not list clodronate disodium at all; NME Express does not have a record for this drug; and NDA Pipeline and Pharmaprojects give clodronate disodium as the drug name. In contrast, CHEMSEARCH has no brand names and has the lab code missing for clodronate disodium.

Users might search company names to gauge earning potential or to audit research efforts in the context of the competition by periodic benchmarking. Originators and licensees need to be distinguished: NME Express and NDA Pipeline do not do this (although the latter does distinguish occasionally, but not for all countries). Patent assignees should be listed and the user must beware differences in parent-subsidiary indexing. IMSWorld R&D Focus is particularly good on company names. Pharmaprojects usually prefers parent company names. In Adis R&D Insight parent and subsidiary are given for the originating company but not necessarily for the licensee.

Users might search drug categories to identify innovators, to focus drug discovery efforts for a more rapid return on investment or to extend the commercial life of a patent portfolio. Drug categories include broad therapeutic classes, mechanism of action (pharmacology) and indications (e.g., "heart failure"). Codes are more efficient than key words for broad categories such as "cardiovascular". Ms. Snow compared this with the EXPLODE feature in Medline or EMBASE. In Pharmaprojects, codes are "pre-exploded" so "C" without truncation will retrieve C1B, C2B2 etc.
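The effect of "pre-exploded" codes can be pictured as prefix matching on a hierarchical code. The Python sketch below is a conceptual illustration only, not how DIALOG or Pharmaprojects implements the index; the function name and the sample codes other than C1B and C2B2 are made up.

```python
from typing import List


def search_pre_exploded(code: str, indexed_codes: List[str]) -> List[str]:
    """Return every indexed code that falls under the given code.

    Because the hierarchy is encoded in the code itself, searching a
    broad code behaves like a prefix match: no truncation symbol or
    EXPLODE step is needed.
    """
    return [c for c in indexed_codes if c.startswith(code)]


# C1B and C2B2 come from the report; the other codes are invented.
codes = ["A10B", "C1B", "C2B2", "D4"]
print(search_pre_exploded("C", codes))   # ['C1B', 'C2B2']
print(search_pre_exploded("C2", codes))  # ['C2B2']
```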

There are various standards for therapeutic category indexing. EPhMRA is used in Adis R&D Insight, IMSWorld R&D Focus and Pharmaprojects but the implementations vary. Adis R&D Insight alone uses WHO ATC (the Anatomical Therapeutic Chemical classification). WHO ATC has some categories that EPhMRA does not have. This means that Adis R&D Insight is a good source to use for vinca alkaloids, Chinese medicines and so on. Prous has developed its own pharmacological/therapeutic numeric classification scheme. Pharmaprojects consistently and thoroughly indexes therapeutic categories and pharmacology but the user must take care because Pharmaprojects has an extended version of the EPhMRA codes.

There is only one mechanism of action for a drug but the terms used to describe it vary from file to file. Mechanism of action is sometimes incorporated into therapeutic code definitions. NDA Pipeline includes action terms in its descriptive text as well. Pharmaprojects is the only file with a hierarchical scheme for mechanism of action.

Pharmaprojects does not have controlled vocabulary for indications. R&D Focus has a separate indications field but not all the indications will necessarily be there. IMSWorld has now begun co-ordinating indications with country and status. Prous mixes pharmacological and therapeutic categories and indications are hidden in the record text. There is only one code per record in NME Express. NDA Pipeline has a separate record for each indication: a free text strategy is recommended here. Ms. Snow recommends Adis R&D Insight as first choice for indications, then IMSWorld R&D Focus and Pharmaprojects.

Users might search development stage and regulatory status indicators to plan for successful and strategic product launches and to verify time to market projections. Except for NME Express, all the pipeline databases have a way of searching for the highest stage of the development cycle that a product has reached. Drug Data Report is the only directory which identifies drugs as early as the biological testing phase. Pharmaprojects co-ordinates pharmacology with status, stage with country, and company name with highest development stage. Adis co-ordinates stage with target market and indications. Country and stage are consistently linked in R&D Focus records but indications are not always present in the same field. NDA Pipeline concentrates on the US market and gives FDA milestones. In all files, users should beware variations in terminology: "launched" versus "marketed" and "registration" or "registered". To achieve an accurate answer on development stage, the secret is to search more than one file. R&D Insight, R&D Focus and Pharmaprojects are best in terms of handling drugs all the way from preclinical to launch. Prous is less good on commercialisation.

Users might search target markets to anticipate competitor challenges or to identify windows of opportunity. Three files are useful here. Pharmaprojects provides development status updates for 28 countries. R&D Focus includes status information for 107 countries, of which 59 or fewer seem to have a significant number of records. R&D Insight lists 115 countries.

Ms. Snow continued by discussing some value-added features offered by a few of the directories. Three out of the six assess commercial value although it is sometimes buried in the text and currencies vary. Adis uses Lehman Brothers' figures; IMSWorld has a broader range of sources; Pharmaprojects sometimes gives figures that have previously appeared in Scrip. Pharmaprojects now has a rating field. Licensing information is most easily located in R&D Insight, R&D Focus and Pharmaprojects.

Patent information is available in four of the six directories, and in Drugs of the Future (not in Ms. Snow's original list of six). R&D Focus often gives a good patent summary but the numbers are buried in the text so the MAP feature for cross-file searching cannot be used. Pharmaprojects isolates one patent number which can be MAPped. Drug Data Report has the most patent information. Drugs of the Future has patent information in the bibliographies.

Citations from primary sources other than patents are embedded in the text of records found in Pharmaprojects, R&D Focus, Drug Data Report and NME Express. R&D Insight, Drug Data Report and Drugs of the Future have lengthier bibliographies. Abbreviated references in the other directories may be verified in fuller format by recourse to R&D Insight and Drug Data Report. Abstracts may then be found in MEDLINE, EMBASE, BIOSIS Previews, Derwent Drug File, SciSearch or International Pharmaceutical Abstracts. IMSWorld now includes EMBASE numbers. R&D Insight is especially notable for its bibliographies and links (MAPping) to LMS Drug Alerts.

Ms. Snow gave some examples of the advantages of MAPping from all six drug pipeline directories. She showed how many literature references can be missed if MAPping is not used. Her lengthy section on the RANK command had to be omitted because of lack of time.

Drug Patents - Comprehensive Searching

Ms. Bonnie Snow, KR Information (co-authored by Anne Morisseau)

This paper will be of most interest to those familiar with online searching, its commands and its jargon. It was based on the following problem. A pharmaceutical company hears that a new AIDS drug has been approved by the FDA within days of application, although no details were mentioned about the chemical, the chemical structures, or the company involved. The searcher's company has its own AIDS drug in Phase III clinical trials, and numerous U.S. and foreign patent applications filed. The task is to find information about this newly-approved drug sufficient for pharmacological structure and clinical comparative analysis, in order to determine the level of threat to the searcher's company.

The San Francisco Chronicle online is a starting point. A possible hit "New AIDS Drug Gets FDA OK In Record Time" is found. The searcher uses KEY WORD IN CONTEXT to reveal a "window of words" around selected concepts and finds a news story about Merck & Co.'s Crixivan. Following this up in CHEMSEARCH locates the alternative nomenclature needed for a comprehensive strategy. It is noted that a salt is involved (C36H47N5O4.H2O4S) with CAS RN 157810-81-6 and a long chemical name. MAP RN "extracts" and saves the CAS RN. MAP SYRN "extracts" and saves synonyms as well. CA Search provides Registry Number access to a selection of chemical patents in addition to journal articles. RANK compiles a list of patent assignees. (RANK is cheap/free). However this list does not include Merck, so where are the Merck patents?
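For readers unfamiliar with RANK, its effect on a set of retrieved records can be pictured as a frequency count over one field, sorted from most to least frequent. The Python sketch below is only an analogy, not DIALOG's implementation or command syntax; the record layout, the field name "assignee" and the company names are invented for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple


def rank_terms(records: List[Dict[str, str]], field: str) -> List[Tuple[str, int]]:
    """Count how often each value of `field` occurs in the records and
    return (value, count) pairs, most frequent first."""
    counts = Counter(r[field] for r in records if field in r)
    return counts.most_common()


# Invented records standing in for a retrieved set of patent citations.
records = [
    {"assignee": "Company A"},
    {"assignee": "Company B"},
    {"assignee": "Company A"},
    {"title": "record with no assignee field"},
]
for term, count in rank_terms(records, "assignee"):
    print(count, term)
```

A ranked list like this makes it easy to see at a glance whether an expected assignee is absent from the retrieved set, which is what prompts the next step in the search.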

Derwent WPI is now used but direct access to the patent literature armed only with drug names is usually futile so a combination of name and number terms from the CA SEARCH output is used. The results are RANKed again but Merck is not one of the listed terms.

Pharmaceutical subject speciality files can also provide patent information. The index is checked and a search is done for the name Crixivan in IMSWorld Patents International. RANKing to preview patent assignees is carried out. A user-defined title can be added to RANK displays. Rank number 1 has 35 items for Merck & Co. The search is narrowed by Patent Country (US or Japan). One record shows CAS RN 150378-17-9, a different CAS RN from the earlier one. This is used as a replaced Registry Number in CHEMSEARCH, and a search of CA SEARCH followed by RANKing has 29 terms for Merck and Co., Inc. and 1 for Marck and Co., Inc. These patents are now listed.

Pharmaceutical subject speciality files make relevant patent identification easier. For example, in addition to IMSWorld Patents International, IMSWorld R&D Focus often supplies patent numbers. A free-text strategy is not recommended. The search for Crixivan is limited to the drug name field and a Profile (rather than News) record is specified. Ms. Snow showed a record for Crixivan, with two CAS RNs, development history (including the date of marketing in Japan) and a patent summary. Note that MAP PN capability is not available in IMSWorld R&D Focus.

Pharmaprojects also provides patent information in many (but not all) records. Ms. Snow showed a record for Crixivan with launch dates for Japan. MAP PN capability is available in Pharmaprojects. Since 1984, new drug approval records in DIOGENES quarterly NDL list have cited patent information. A MAP PN capability was also implemented in DIOGENES in 1997. A bonus is that patent expiration and US market exclusivity dates are included.

Drug Data Report is another excellent source of patent information for drugs in the pipeline. It also has compounds at the biological testing stage and now has structure images for more than 98% of records. However, this file is not a good source for regulatory status. Ms. Snow displayed a lot of information about Crixivan and said that 100 additional references to the primary literature, including peer-reviewed journal articles, conference papers, news releases, company reports etc., are included in the Drug Data Report record for this drug.

The searcher can take advantage of the multifile search capability of DIALOG OneSearch. Ms. Snow listed a number of files. The patent number is MAPped from Pharmaprojects, DIOGENES and Drug Data Report. Patent duplicate identification (added to DIALOG in 1996) is carried out. Records with images from Derwent WPI are isolated. Analysis of the images accompanying WPI records uncovers various Markush structures.
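The combination of MAPping patent numbers out of several files and then removing duplicates can be pictured roughly as follows. This is a loose Python analogy only: the record layout, the function map_patent_numbers and the placeholder patent numbers are all hypothetical, and the simple string normalisation used here is an assumption, not a description of how DIALOG identifies duplicate patents.

```python
def map_patent_numbers(records):
    """Collect patent numbers from records, dropping repeats.

    Numbers are normalised (spaces removed, upper-cased) so that the same
    number written two ways is kept once. First-seen order is preserved.
    """
    seen = set()
    numbers = []
    for rec in records:
        for pn in rec.get("patent_numbers", []):
            key = pn.replace(" ", "").upper()
            if key not in seen:
                seen.add(key)
                numbers.append(key)
    return numbers


# Placeholder records (invented); the numbers are not real patents.
records = [
    {"file": "Pharmaprojects", "patent_numbers": ["XX 0000001", "YY 0000002"]},
    {"file": "DIOGENES", "patent_numbers": ["xx0000001"]},
    {"file": "Drug Data Report", "patent_numbers": ["ZZ 0000003", "YY 0000002"]},
]

saved = map_patent_numbers(records)
print(saved)
```

The saved list plays the role of the search terms that would then be run against a patent file such as Derwent WPI.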

Drugs of the Future is a new source of structure and synthesis data for drugs in the pipeline. Ms. Snow showed search output for the name Crixivan, including a number for cross-reference to Scheme of Synthesis, numbers for exploring structure activity relationships via context tables and a cross-reference to the primary literature (in this case a patent). Ms. Snow showed pages and pages of output text but did not show the 169 additional references to the primary literature, including peer-reviewed journal articles, conference papers, news releases, company reports etc., which are included in this record. She showed how to search the Scheme(s) of Synthesis and displayed some reaction schemes. She also showed some of the structures displayed after a context table search for the number that references HIV protease inhibitors launched and in clinical trial.

Instead of the San Francisco Chronicle, pharmaceutical subject speciality files would have been a good place to start a search for Crixivan. Ms. Snow gave an example of broader context patent strategies. The Pink Sheet had the story earlier than Scrip. The IAC Newsletter database is also useful, as is DIOGENES as a news source. Free patent services on the Internet offer limited search facilities.

Why should someone use KR Information for drug patent information? KR has a significant patent collection. There are 15 databases that are either devoted to patents or provide significant levels of patent coverage. An additional 50 databases (according to their blue sheets) provide some level of patent coverage. Several of the drug pipeline directories, IMSWorld Patents International, and even DIOGENES, have records for new drug approvals which include patent information. KR offers multidatabase search capabilities. The major patent collections may be searched either in DIALINDEX or in ONESEARCH. This feature is often desirable, e.g. for searching on Japanese patents, since the databases may differ in years of coverage or in content. System features (MAP, RANK, SORT, IDPAT, etc.) allow the searcher to navigate with great ease and speed through the database library. Document delivery (SourceOne/UnCover) is available and the current awareness search capability, ALERTS, will assist users in keeping up-to-date.

Knight-Ridder New Product Information

The day's technical proceedings ended with product reviews of DIALOG Web, DataStar Web, KR@Site, KR SourceOne and DIALOG Select. I chose to hear Catherine Fitzgerald talk about KR@Site. This product gives intranet access to KR OnDisc databases.

Benefits for the information professional are

  • a fast, powerful engine
  • support for multiple platforms
  • fixed price searching
  • support of complex Boolean searching
  • an essential component of an enterprise wide solution
  • no charge for the change from CD to intranet format.

Benefits for the end user are

  • desktop access through a web browser
  • an easy, point-and-click interface
  • easy integration into existing Web applications
  • access to previously restricted resources (no knowledge of command language needed).

The advantages of an intranet application are as follows. It allows access, via a Web browser, from multiple platforms. Access speed is managed by the customer. The application can be integrated with other intranet applications. Security risk is minimised because the information resource is maintained behind the firewall. The user has access to centrally controlled resources.

KR@Site makes the existing menus on KR OnDisc products searchable via a Web browser. The interface has Boolean searching and browsable indexes. Search history is stored, sets can be recombined, searches can be saved, and hit terms are highlighted. The product uses the OnDisc LAN pricing model, based on the number of simultaneous users.

Menu boxes are specific to the database being searched. Ms. Fitzgerald used MEDLINE and the Federal Register as examples. Only three of the OnDisc databases will not be available in KR@Site. They are OHS MSDS, Physical Science Encyclopedia and one other she could not remember. Alpha testing began in February 1997. Beta testing began in September. It is hoped that the product will be released on November 1.

Reception Party

The conference lunch was an introduction to Japanese custom and efficiency. As we filed in we were issued with a heated Japanese lunch in a box, complete with chopsticks and a packet of cold tea with a straw. None of the usual Western nonsense of getting in line for the buffet and waiting your turn for the salad tongs. All 400 of us were fed in well under an hour. I even had ample time to buy new batteries for my camera. Incidentally, I wonder how many British hotels could have sold me the special batteries I needed, or could have answered questions about batteries in a strange language? No problem at the Imperial, Tokyo - the basement has a gallery of expensive shops and I could have bought a television or a video camera or an evening dress as easily as a battery.

The day ended with a sumptuous reception. There are few hotels in the West that have facilities to rival those of The Imperial Hotel in Tokyo. The reception took place in an elegant ballroom with the largest floral centrepiece I have ever seen. A splendid, westernised, buffet was spread on four tables around this. Waiters hovered with trays of drinks - no expense was spared on this. Inevitably, though, we had a word from our sponsors. The most impressive part of this was the translation - not just the linguistic expertise but the translator's ability to remember several long sentences and complex arguments before the speaker stops and awaits a perfect rendition in Japanese. Towards the end of the proceedings all the speakers were led onto the platform and we were asked in turn to introduce ourselves to the audience and be thanked. Many photographs were taken!



I am not sure that it is fair to compare Database Tokyo with the International Online meeting in London, since Online '97 is pan-European, if not international, whereas Database Tokyo is specifically aimed at a Japanese audience. However, one cannot help drawing comparisons. Database Tokyo is seemingly much smaller (260 booths and 101 companies) but it attracts nearly 30,000 attendees over 3 days. Moreover, the attendees are not there to meet old friends and exchange gossip in the aisles. This is serious business.

On the other hand, some aspects of the exhibition booths are rather unusual from the American business point of view. Many Database Tokyo booths (but not, I hasten to add, KMK DigiTex's) have a group of young ladies identically dressed in costumes designed to expose a great deal of the lower part of their anatomy. These "booth bunnies" are outliers employed to smile at the passing punter and hand out literature. They, and the other outliers, are in fact particularly skilled at handing out literature and one is so taken by their friendly bows and smiles that one hesitates to refuse the literature and ends up even more weighed down than at London Online. The other unique feature of some Database Tokyo booths is the "show" or "oration". At certain specified hours a demonstration takes place in the booth and an orator with a microphone fixed to her head reads from a prepared text as the product for sale is shown. (I use the word "her" advisedly, not out of ungrammatical political correctness - the orator does tend to be female and under forty.) Well, when in Rome do as the Romans. The method seems to work well for the Japanese. Large numbers of people gathered around each booth at demonstration time. In contrast, the American-style Lexis-Nexis booth, with little Japanese on the back-drop, no booth bunnies, and no orator, seemed to attract very few visitors.

As I entered the hall a NiftyServe carrier bag was thrust into my hand by a charming young lady. I then wandered around bewildered for a while. Japanese online hosts are not necessarily familiar to Europeans and Americans. Also, a product may appear on more than one booth, yet I could only locate a few booths with scientific information systems on show. Fortunately, within an hour of my arrival I was helped enormously by an unknown gentleman with a French accent who offered me an English version of the exhibitors list and profile, printed off a European Union Web site. He apologised for any peculiarities in the machine translation but who was I to complain?

The exhibition was divided into 4 areas: the general database zone (by far the biggest), the database related systems and services zone, the mapping database zone and the patent database zone. There seemed to be a higher proportion of GIS and mapping applications than at other exhibitions. Machine translation was also (not surprisingly) better represented than at general online shows in the West. Some companies were offering machine translation as part of another service. Other companies specialise in machine translation. For example, Logo Vista Corporation was showing its "E to J" English to Japanese translation software, which comes in different versions for Windows, Macintosh or Internet. The most noticeable feature of the whole exhibition was the impact of the Internet and intranets. The Japanese may have been slower to embrace the Internet than the Americans but they are certainly making up for that now. Several companies were showing interesting news services but I will concentrate here on scientific and technical information.

Digital Ware have a commercial Web site, Netscience, selling more than 3000 scientific software products. They use a "One to One" marketing theory, matching software to the buyer's requirements. An easy ordering procedure is offered. Netscience can also be contacted by email.

The Electronic Book Committee of Japan lists 230 titles available for the Sony Data Discman and quite a few of them were being tried out by visitors, adding their sound effects to the noise of the orators at other booths. It was interesting to see the dictionaries and other aids for European languages on show.

G-Search gives access to 160 Japanese databases and 850 overseas ones. They are agents for the Profound database.

Heiwa Information Center Co. Ltd. (HIC) was showing the drug database EDIS for intranet, which was also on the Super K. K. booth. They have a number of PreWise products, where "PRE" stands for "Patent Retrieval Expert". PreWise for World is for Derwent WPI. Also available are PreWise for Web and PreWise SDI Manager for intranet. The products PreWise Press/DB and EDIS use the HIC database engine. HIC's Happiness products (Future/Happiness Light for Internet and Happiness Base) use Japanese language processing technology.

The Japan Association for International Chemical Information and Japan Science and Technology Corporation (JST) Information Center for Science and Technology (JICST) had a joint booth showing SciFinder, STN Easy and other products of STN International. JICST Internet services are JOIS, STN Easy, NLM and JBBS. A Japanese to English machine translation service for STN was also on show.

The Japan Information Processing Service Co. Ltd. Database Group and USACO Corporation shared a booth. There is an English version of USACO's Web site. The database group was showing JAPIC adverse drug reactions, JAPICDOC package insert drug information and other medical information, and DRUGBASE which includes ARIS. Questel.Orbit products, including Imagination, were also on show.

USACO imports and/or provides scientific and technical journals; books and back files; PC hardware, software and networking services; bibliographic, numeric and chemical databases; microform products; computer software; online services; management of international conferences; information and database consulting services; direct mail services; and exhibition services for publishers. USACO represents:

  • Academic Press
  • Aldrich Chemical Company
  • ASM (American Society for Microbiology)
  • Beilstein Information Systems, Inc.
  • The British Library Document Supply Centre
  • Bowker Saur (A Reed Elsevier Group Company)
  • BNA (The Bureau of National Affairs)
  • Chapman and Hall
  • Churchill Livingstone International
  • Counterpoint Publishing
  • EBSCO Publishing
  • Excerpta Medica (A Reed Elsevier Group Company)
  • Information Handling Services
  • ISI (Institute for Scientific Information)
  • Japana Centra Revuo Medicina
  • JICST (Japan Information Center of Science and Technology)
  • Kitasato Institute
  • OVID Technologies, Inc.
  • MDL Information Systems
  • Niles & Associates
  • Oxford University Press
  • Prous Science Publishers
  • SilverPlatter Information
  • Thomson Science and Professional
  • U-gate, and
  • UMI (University Microfilms International)

Together with handouts on their booth were photocopies of Fujitsu literature for Midas-Reagent and Oxford Molecular's RS3 Discovery. I found it interesting to see one company representing the interests of companies who most certainly would not share a booth in the West!

KMK DigiTex were showing DIALOG and DataStar services including DIALOG Select, DIALOG Web, DataStar Web and DIALOG World Reporter.

Kinokuniya Company Ltd. was showing Book/Web which offers 3.2 million titles (Japanese and foreign) on the Internet. There is a simple ordering system. This booth was also advertising OCLC First Search, SilverPlatter's Electronic Reference Library (ERL) and CD-ROMs from ADONIS, Bowker Saur, Chapman & Hall, KR OnDisc, ISI, SilverPlatter and UMI. The company also supports online information services: ASSIST, BIGLOBE, DIALINE, G-Search, JOIS and ELNET.

Maruzen Co. Ltd. have "Maruzen for book", a database of 1.5 million foreign and Japanese books which can be ordered online on the Internet. There is an English version of the Web site. They also offer Internet shopping. They represent various document delivery services, SilverPlatter ERL and Cambridge Scientific Abstracts.

The National Center for Science Information Systems, NACSIS, offers an electronic library service, NACSIS-ELS, which digitises the pages of the journals of Japanese academic societies and is on the Internet. There is an English language version of the Web site. They also run an interlibrary loan service, NACSIS-ILL.

Nichigai Associates Inc. offer publishing, electronic publishing, online information and consulting services. They have a long established database service, Nichigai ASSIST. Nichigai/Web (which is new) has simple interface menus for book information and online information, and a dictionary of computer and technical terms.

Nissho Iwai Corporation were showing IHS Health Products CD-ROMs to help users stay on top of the regulations:

  • The Food and Drug Library
  • The Food Regulation Library
  • The Medical Devices Library
  • The Healthcare Facilities Library
  • The European Union Pharmaceutical Library
  • The Medicare/Medicaid Library and
  • The Physician Medicare/Medicaid Library

Super K. K. and Yakugyo Jiho Co. Ltd. shared a booth showing EDIS, the Ethical Drugs Information Service for WWW, and Drugs in Japan (OTC Drugs). EDIS requires Netscape Navigator or Microsoft Internet Explorer, the Adobe Acrobat reader and something called "ChemNet". I asked to see ChemNet and discovered CambridgeSoft's ChemFinder, ChemOffice etc. on the menu. I visited the Web site (Japanese only) on returning to my office and found that the chemical structure icon could not be activated. EDIS offers drug product information, dosage photographs etc. on the Internet and intranets. There is also an SGML data preparation support system for pharmaceutical manufacturers.


I am grateful to Alastair Warr for helping me to install and configure the Chinese character software RichWin for Internet and helping me visit the various Japanese web sites mentioned above.

This page updated on 23rd July 1999