Frequently Asked Questions About
The Florida State University Method
for Obtaining "NRC-style" Publication and Citation Counts



Why are "year of publication" and "year of entry into the database" sometimes different, and why is the difference important?

The Institute for Scientific Information (ISI), which produces and maintains the Web of Science, enters publications into its database as it receives them, and not all scientific publications are actually available "on the newsstands," as it were, on their nominal publication dates. The December 1992 issue of a journal, for example, may not reach subscribers (and the ISI) until early in 1993. It will therefore be entered into the ISI database in 1993, but with a publication date of 1992.

The difference is important for two reasons: First, the 1995 NRC report was produced from a set of data truncated on the last day of 1992, so any late-1992 publications not actually entered into the database until 1993 were excluded. Second, the "Year selection" option on the "Full search" page of Web of Science selects by date of entry, NOT date of publication (so that it can also be used to select entries from "This week" or "Latest 4 weeks"). That's why, if you attempt to obtain a list of some author's 1992 publications by checking "Year selection" and "1992," then doing a "General Search" on the author's name, you may find a few 1991 entries among the results, and a few late-1992 entries may be missing. In some cases, individual entries may be two years late.
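The effect can be illustrated with a small sketch. The records and the field names "pub_year" and "entry_year" below are hypothetical (they are not Web of Science field names, and the data are invented); the sketch shows only that selecting by year of entry and selecting by year of publication can return different lists.

```python
# Hypothetical records: the field names and data are invented for illustration.
records = [
    {"title": "Paper A", "pub_year": 1991, "entry_year": 1992},  # late-1991 issue, entered in 1992
    {"title": "Paper B", "pub_year": 1992, "entry_year": 1992},
    {"title": "Paper C", "pub_year": 1992, "entry_year": 1993},  # December 1992 issue, entered in 1993
]


def select_by_entry_year(records, year):
    """Mimic a selection made by date of entry into the database."""
    return [r["title"] for r in records if r["entry_year"] == year]


def select_by_pub_year(records, year):
    """Select by the publication date that appears in each record."""
    return [r["title"] for r in records if r["pub_year"] == year]


by_entry = select_by_entry_year(records, 1992)  # includes a 1991 paper, misses Paper C
by_pub = select_by_pub_year(records, 1992)      # the list actually wanted
print(by_entry)
print(by_pub)
```

Here the entry-year selection picks up the 1991 paper and drops the late-1992 paper, which is the pattern described above.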


Why use the initial search page rather than "Cited Ref Search"?

Mainly, because "Cited Ref Search" provides no way to search for an author at a particular address (zip code), which you will need to do in step 4.

Secondarily, because two apparent "advantages" of using "Cited Ref Search" in fact produce incorrect results.

First, listing selected years under "Cited Year" is not helpful. If you enter "smith j" under "Cited Author" and "1994 or 1995 or 1996 or 1997" under "Cited Year," you will get a list of all of J. Smith's papers published in 1994-1997 that have ever been cited in any year. That is, you will miss countable publications that happen never to have been cited, and you will not be assured that the publications listed were cited in 1994-1997. Further, the totals in the "hits" column will always be total citations over all time (no entry you can make on any screen in Web of Science will ever restrict "Hits" or "Times cited" to citations made in particular years).

Second, even if you want total citations over all time (rather than NRC-style citations), check-marking more than one publication before you click "search" will yield an undercount. Consider the case in which some publication somewhere cites three papers by the author for whom you are counting citations. That situation should count as three citations (both by NRC standards and by ordinary "common sense" standards), but if you try to count all of your author's citations at once by check-marking all the boxes and clicking "search," it will count as only one citation (because Web of Science pools all the citations, eliminates "duplicates," then lists those that remain--no publication will appear more than once on the list, no matter how many of your author's papers it cites).
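The undercount can be seen in a short sketch. The data are invented, and the pooling step is only an assumption about how the deduplication behaves, based on the description above; it is not ISI's code.

```python
# Hypothetical data: each of the author's papers maps to the set of
# publications that cite it. "citing X" cites all three papers.
citing = {
    "author paper 1": {"citing X", "citing Y"},
    "author paper 2": {"citing X"},
    "author paper 3": {"citing X", "citing Z"},
}

# Counting each paper separately: "citing X" is counted three times,
# once for each of the author's papers that it cites.
separate_total = sum(len(citers) for citers in citing.values())

# Check-marking all the boxes at once: the citing publications are pooled
# and duplicates are removed, so "citing X" is counted only once.
pooled_total = len(set().union(*citing.values()))

print(separate_total, pooled_total)
```

The separate count is the one that matches NRC practice; the pooled count is always less than or equal to it.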


In my "chronological" search results, why are some entries listed out of date order?

Most chronological listings that Web of Science produces are ordered by date of entry into the database rather than by date of publication (as discussed under
Why are "year of publication" and "year of entry into the database" sometimes different, . . .), so entries added late to the database may appear among the entries for the following year.


Can one determine a publication's year of entry, as opposed to its year of publication?

Yes, you can use the "Year selection" option on the "Full search" page of Web of Science to deduce the year in which an entry was made. First, check "Year selection" and the entry's year of publication, then do a "General Search" on the author's name. If the entry turns up in the resulting list, then it was entered in its year of publication. If not, then repeat the search, checking instead adjacent years (one at a time) until you find the entry among the results. It will turn up in a search of the year in which it was entered.

You could, therefore, improve the match between your results and those NRC would obtain by checking the year of entry of every publication to make certain you counted only those NRC would see (that is, by excluding any entered after the NRC's database-truncation date). To match NRC's citation counts exactly, though, you would have to double-check the entry date of every publication citing any publication you counted, which is too time-consuming to be practical.
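The year-by-year procedure amounts to a simple loop, sketched below. The function "search_by_entry_year" is a hypothetical stand-in for the manual "Year selection" search (Web of Science offers no such function to call), and the two-year limit and the order in which adjacent years are tried are assumptions.

```python
def find_entry_year(title, pub_year, search_by_entry_year, max_offset=2):
    """Return the entry year under which `title` turns up, or None.

    `search_by_entry_year(year)` is a hypothetical stand-in for doing a
    "General Search" with only that year checked under "Year selection".
    """
    offsets = [0]  # try the year of publication first
    for k in range(1, max_offset + 1):
        offsets += [k, -k]  # then adjacent years, one at a time
    for offset in offsets:
        year = pub_year + offset
        if title in search_by_entry_year(year):
            return year
    return None


# Invented search results, keyed by year of entry.
fake_index = {1992: {"Paper B"}, 1993: {"Paper C"}}


def fake_search(year):
    return fake_index.get(year, set())


entry_b = find_entry_year("Paper B", 1992, fake_search)  # entered on time
entry_c = find_entry_year("Paper C", 1992, fake_search)  # entered a year late
entry_missing = find_entry_year("Paper D", 1992, fake_search)
print(entry_b, entry_c, entry_missing)
```

A publication whose deduced entry year falls after the truncation date would then be dropped from the count.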


Why select the full time range on the search page? If I'm counting publications and citations for 1994-1997, wouldn't it make sense to restrict my search to those four years at this stage?

No. Remember that the years you list on the Full Search page are dates of entry, NOT dates of publication. If you choose only 1994 through 1997, some 1993 entries will turn up among the results, and some 1997 entries will be missing. FSU has elected to count strictly by publication date, not by entry date, so you will get more accurate counts by sticking with the full time range here and judging each entry's eligibility on the basis of the publication date that appears in the entry. See also
Why are "year of publication" and "year of entry into the database" sometimes different, and why is the difference important?, above.


Why should I use the zip code in the address field?

Because that's how NRC distinguished among, for example, several authors named "J. Smith," all publishing on the same subject, but at different institutions. The zip code is short, is of standard size and configuration for all authors, and is the only element of each author's address that ties the author to a particular institution and is unlikely to vary from entry to entry in the database (the way, say, the name of the university might if it were abbreviated differently in different entries).


Must I always enter the zip code, even though the name is unique and this author's work cannot be confused with that of any other?

The NRC started by developing a list of individual authors, then went through the database of publications, attributing each publication to one (or more, in the case of multiple authorship) of those authors. Because so many authors had similar names, the authors were further identified by the zip codes of their current institutions. Each publication in the database was therefore attributed to, say, "J. Smith 32306" or "J. Smith 55512" or "Q. Uniquename 32310." Each author was viewed as affiliated with only a single institution (the one that included that author on its 1992 list of faculty), so outdated zip codes (e.g., those of institutions where the author once worked but did not work at the end of 1992) were regarded as incorrect zip codes.

The NRC eliminated from its database, and therefore did not count at all, any publication that could not be attributed to one of these author-zip combinations. That is, NRC ignored not just "J. Smith" papers with missing, incorrect, or outdated zip codes (because they could not be attributed to a particular J. Smith) but also "Q. Uniquename" papers with missing, incorrect, or outdated zip codes.

Therefore, if you are counting publications or citations for "Q. Uniquename," you may indeed omit the zip code from the search screen, BUT you must then verify that each entry your search produces does indeed bear Q. Uniquename's current zip code and eliminate from your count any that do not. I find it simpler just to enter the zip code on the search screen so that Web of Science will do that elimination step for me.
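The elimination step can be sketched as a filter on author-zip combinations. The entries and field names below are hypothetical, and the matching rule is only my reading of the NRC procedure described above, not NRC's actual code.

```python
# Hypothetical entries; the field names are illustrative, not Web of Science's.
entries = [
    {"title": "Paper 1", "authors": ["uniquename q"], "zip_codes": ["32310"]},
    {"title": "Paper 2", "authors": ["uniquename q"], "zip_codes": ["55512"]},  # outdated zip code
    {"title": "Paper 3", "authors": ["uniquename q", "smith j"], "zip_codes": []},  # missing zip code
]


def attributable(entry, author, zip_code):
    """True if the entry lists both the author and the author's current zip code."""
    return author in entry["authors"] and zip_code in entry["zip_codes"]


# Only entries bearing the current zip code are counted, even though
# the author's name is unique.
counted = [e["title"] for e in entries if attributable(e, "uniquename q", "32310")]
print(counted)
```

Entering the zip code on the search screen has the same effect as applying this filter by hand to the search results.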


Why do some entries appear that are not publications of the FSU author I searched for?

Each Web of Science entry contains zip codes for all its authors, and they are not "matched" to authors, so a search for publications by "smith j" at "32306" can, for example, turn up the occasional item by some non-FSU "smith j" who is a coauthor on a paper with an FSU faculty member--the entry will contain both his name and our zip code. As nearly as I can tell, NRC would in fact attribute such a publication to FSU's "smith j" so long as the two Smiths publish on similar topics.


Why can I count only entries for which "Document type" reads "article"?

Because that's how the NRC did the count. NRC excluded from its raw database, for example, review papers, notes, letters, book reviews, editorial material, and book chapters (and citations thereof). They included only "article," "proceedings paper," "art exhibit paper," "fiction, creative prose," "music score," and "poetry." Of those, only "article" (and perhaps "proceedings paper") is relevant to the sciences.


Why should I count each entry's citations separately? Isn't there a way to mark all the entries and have Web of Science count all the citations at once?

No, not accurately. See above under
Why use the initial search page rather than "Cited Ref Search"?


Why must I count the citations individually rather than just recording the total number?

Because, unless the total is zero, some of the citations are likely to be dated outside the 1994-1997 time window and should not be included in the count.


Why can I count only citations dated 1994-1997? Doesn't that eliminate virtually all citations of the 1997 papers?

Because NRC worked from a database truncated on the last day of 1992, they had no data for either publications or citations after that date. To be analogous, our data must stop at the end of 1997, for both publications and citations. Yes, this procedure eliminates virtually all citations of 1997 papers, but the same was true of 1992 papers in the NRC database.
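A minimal sketch of the window count follows. The lists of citing years are invented; in practice you would read the date of each citing publication from its Web of Science entry.

```python
WINDOW = range(1994, 1998)  # 1994 through 1997, inclusive


def count_window_citations(citing_years, window=WINDOW):
    """Count only the citations dated inside the time window."""
    return sum(1 for year in citing_years if year in window)


# Invented dates of the publications citing two hypothetical papers.
citing_years_1995_paper = [1995, 1996, 1997, 1998, 1999]
citing_years_1997_paper = [1998, 1998, 1999]

count_1995 = count_window_citations(citing_years_1995_paper)
count_1997 = count_window_citations(citing_years_1997_paper)
print(count_1995, count_1997)
```

The 1997 paper ends up with no countable citations even though it has been cited three times, just as 1992 papers did in the NRC database.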


Why should I lump self and nonself citations together?

The NRC did not distinguish between self and nonself citations (they had no ready means of doing so), so our counts should include both.


What happens to citations of a paper while it is "in press"? Or those that cite it incorrectly? Are they included in its citation total?

No. Although ISI does record citations of papers as "in press," those citations are not included in the published paper's eventual citation totals. Neither are citations that include typographical errors in journal title, volume, beginning page number, or date.

ISI currently has no way of identifying an erroneous or in-press citation with the paper it was intended to cite, so such citations appear in the "Cited Ref Search" results as citations of items not included in the database, just as citations of, e.g., books and book chapters do. Until ISI can devise a reliable way of reconciling such discrepancies (and they are working on the problem), these citations cannot be added to the totals of the papers they are intended for.

For NRC's purposes, erroneous and in-press citations are simply lost from the count, because they cannot figure in the citation numbers of any paper in the database.

To see examples of such "lost" citations, go to the "Cited Ref Search" page, enter "abele lg" in the "CITED AUTHOR" field, and click "Search." Of the first 10 entries, only one is highlighted--it is the only one representing a record in the Web of Science. The others are publications not included in the ISI database. Some are monographs; others are in journals not indexed at the time. A little study should convince you that the 10 entries in fact represent only 5 or 6 publications. Typically, the entry with the highest number of citations represents the correct citation form; the others differ from it in, e.g., date or page number. For example, the record "ABELE LG, AM NAT, 1979, 114, 559" has over 30 citations; it should have two more, but the one immediately after it has an error in the date, and the one after that shows no date, volume, or page--it is probably an "in press" citation.

If you have questions or comments, please contact Anne B. Thistle.